LlamaIndex Usage Guide
Introduction
LlamaIndex is a powerful open-source framework that helps developers build applications on top of large language models (LLMs). It provides tools and APIs for connecting LLMs to external data sources, similar in scope to LangChain.
Quick Start
Environment Setup
```shell
python -m venv LlamaIndex
source LlamaIndex/bin/activate
pip install llama-index
```
Base Dependencies
```shell
pip install \
  llama-index-core \
  llama-index-llms-openai \
  llama-index-embeddings-openai \
  llama-index-readers-file
```
Five Lines to Get Started
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("Your question here")
print(response)
```
Note: this uses OpenAI models by default, so the OPENAI_API_KEY environment variable must be set.
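One way to surface a missing key early is a small helper of our own (not part of LlamaIndex) that checks the environment before any index is built:

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the named API key from the environment, or fail with a clear error."""
    key = os.getenv(name)
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the example.")
    return key
```

Calling require_api_key() at the top of a script makes a missing key fail immediately with an actionable message, instead of surfacing later as an opaque authentication error.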
Model Configuration
1. LLM Configuration
Local Deployment
```python
import os
from llama_index.core import Settings
from llama_index.llms.openai_like import OpenAILike

# Point OpenAILike at a locally served OpenAI-compatible endpoint.
llm = OpenAILike(
    model="DeepSeek-R1-Distill-Qwen-1.5B",
    api_base="http://localhost:8000/v1",
    api_key="not-needed",  # local servers typically accept any placeholder key
    is_chat_model=True,
)
Settings.llm = llm

response = llm.complete("Hello World!")
print(response)
```
Cloud Platform (Example: Alibaba Bailian)
```python
import os
from llama_index.core import Settings
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="qwen-plus",
    api_base="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    is_chat_model=True,
)
Settings.llm = llm

response = llm.complete("Hello World!")
print(response)
```
2. Embedding Configuration
Local Deployment
```python
from llama_index.embeddings.textembed import TextEmbedEmbedding

# Connect to a locally hosted TextEmbed server.
embed = TextEmbedEmbedding(
    model_name="Qwen3-Embedding-8B",
    base_url="http://0.0.0.0:8000/v1",
    auth_token="TextEmbed",
)

embeddings = embed.get_text_embedding_batch(
    [
        "It is pouring rain here!",
        "India has a diverse cultural heritage.",
    ]
)
print(embeddings)
```
Cloud Platform (Example: DashScope)
```python
from llama_index.embeddings.dashscope import DashScopeEmbedding

embedder = DashScopeEmbedding(model_name="text-embedding-v2")
text_to_embedding = ["风急天高猿啸哀", "渚清沙白鸟飞回", "无边落木萧萧下", "不尽长江滚滚来"]

result_embeddings = embedder.get_text_embedding_batch(text_to_embedding)
for index, embedding in enumerate(result_embeddings):
    print("Dimension of embeddings: %s" % len(embedding))
    print(
        "Input: %s, embedding is: %s"
        % (text_to_embedding[index], embedding[:5])
    )
```
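The batch call above returns plain Python lists of floats, one vector per input string. As an illustration of how such vectors are compared (pure standard library, not LlamaIndex API), cosine similarity ranks which inputs are semantically closest:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embedding output.
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 1.0]
v3 = [0.0, 1.0, 0.0]
print(cosine_similarity(v1, v2))  # identical direction -> 1.0
print(cosine_similarity(v1, v3))  # orthogonal -> 0.0
```

Vector stores used by LlamaIndex apply exactly this kind of similarity scoring (often cosine or dot product) when retrieving the most relevant chunks.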
RAG Question Answering
Complete Example
```python
import os

import dashscope
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.dashscope import DashScopeEmbedding
from llama_index.llms.openai_like import OpenAILike

# Expects DASHSCOPE_API_KEY to already be exported in the environment.
dashscope.api_key = os.getenv("DASHSCOPE_API_KEY")

Settings.embed_model = DashScopeEmbedding(model_name="text-embedding-v2")
Settings.llm = OpenAILike(
    model="qwen-plus",
    api_base="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    is_chat_model=True,
)

documents = SimpleDirectoryReader(
    "./data",
    exclude=["text.txt"],
    exclude_hidden=True,
    recursive=True,
    required_exts=[".jsonl"],
).load_data()
print(f"Loaded {len(documents)} documents")

splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=200)
index = VectorStoreIndex.from_documents(
    documents, transformations=[splitter], show_progress=True
)

query_engine = index.as_query_engine(streaming=True, similarity_top_k=5)
response = query_engine.query("What are the generic competitive strategies?")
response.print_response_stream()
```
文档切片规则
| 切片类型 |
特点 |
适用场景 |
示例配置 |
| Token切片 |
按Token数量切分 |
小上下文模型 |
TokenTextSplitter(chunk_size=1024) |
| 句子切片 |
保持句子完整性(默认) |
通用场景 |
SentenceSplitter(chunk_size=512) |
| 句子窗口 |
包含上下文窗口 |
需要上下文关联的任务 |
SentenceWindowNodeParser(window_size=3) |
| 语义切片 |
按语义相关性切分 |
复杂语义分析 |
SemanticSplitterNodeParser() |
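To make chunk_size and chunk_overlap concrete, here is a minimal token-style splitting sketch in plain Python. Whitespace tokens stand in for a real tokenizer, and LlamaIndex's splitters do considerably more (sentence awareness, metadata); this only illustrates the sliding-window idea:

```python
def split_with_overlap(
    tokens: list[str], chunk_size: int, chunk_overlap: int
) -> list[list[str]]:
    """Slide a window of chunk_size tokens, stepping by chunk_size - chunk_overlap."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start : start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last window already covers the tail
    return chunks


tokens = "the quick brown fox jumps over the lazy dog".split()
for chunk in split_with_overlap(tokens, chunk_size=4, chunk_overlap=1):
    print(chunk)
```

The overlap means each chunk repeats the tail of the previous one, so a sentence cut at a boundary still appears intact in at least one chunk, which is why a non-zero chunk_overlap (200 in the RAG example above) usually improves retrieval quality.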
Best Practice Recommendations
- Choose a chunking strategy based on the model's context length
- Set similarity_top_k=3-5 at retrieval time to balance accuracy and efficiency
- When using semantic chunking, pair it with a postprocessor for better results
- In production, prefer cloud-platform deployment for stability
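The similarity_top_k and postprocessor advice can be sketched in plain Python: score every chunk, keep the top k, then drop anything below a similarity cutoff, mirroring what a SimilarityPostprocessor does (the chunks and scores here are invented for illustration):

```python
def retrieve_top_k(
    scored_chunks: list[tuple[str, float]], top_k: int, cutoff: float
) -> list[str]:
    """Keep the top_k highest-scoring chunks, then filter by a similarity cutoff."""
    ranked = sorted(scored_chunks, key=lambda item: item[1], reverse=True)
    return [text for text, score in ranked[:top_k] if score >= cutoff]


scored = [
    ("chunk about pricing strategy", 0.82),
    ("chunk about cost leadership", 0.91),
    ("unrelated chunk", 0.35),
    ("chunk about differentiation", 0.77),
]
print(retrieve_top_k(scored, top_k=3, cutoff=0.5))
```

Raising top_k recalls more candidate chunks at the cost of a longer prompt; the cutoff then protects the LLM from low-relevance context that survives the top-k selection.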
For more details, see the official LlamaIndex documentation.