LlamaIndex
LlamaIndex is a data framework for LLM applications. It provides tools to ingest, structure, and access private or domain-specific data for use with large language models. Originally known as GPT Index, it has evolved into a comprehensive framework focused on making it easy to connect LLMs with your data.
Why Choose LlamaIndex Over LangChain?
While LangChain is our core stack recommendation, LlamaIndex might be the better choice if:
- RAG-focused applications: Your primary use case is Retrieval-Augmented Generation
- Data ingestion priority: You need robust data loading and indexing capabilities
- Simpler architecture: You prefer a more focused framework over a general-purpose one
- Document-heavy workflows: You’re working primarily with documents and knowledge bases
Key Features
- Data Connectors: 100+ integrations for loading data from various sources
- Data Indexes: Efficient storage and retrieval structures for your data
- Query Engines: Natural language querying over your indexed data
- Chat Engines: Conversational interfaces over your data
- Observability: Built-in logging and debugging tools (see the example below)
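Observability is the one feature in this list the later examples don't otherwise show. A minimal sketch, assuming the built-in `simple` global handler, which prints each LLM prompt/response pair to stdout while you develop:

```python
from llama_index.core import set_global_handler

# Enable the built-in "simple" observability handler; every LLM call made by
# query engines, chat engines, etc. is logged to stdout for debugging.
set_global_handler("simple")
```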
Installation
```bash
pip install llama-index
```
For specific integrations:
```bash
pip install llama-index-readers-file
pip install llama-index-vector-stores-chroma
pip install llama-index-embeddings-openai
```
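The OpenAI-backed examples that follow read your credentials from the `OPENAI_API_KEY` environment variable. A minimal sketch, assuming you set the variable from Python (in real projects, export it in your shell or a `.env` file rather than hard-coding it):

```python
import os

# The OpenAI LLM and embedding integrations pick up the key from this
# environment variable; the value below is a placeholder.
os.environ["OPENAI_API_KEY"] = "sk-..."
```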
Quick Start
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.settings import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure settings
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
Settings.embed_model = OpenAIEmbedding()

# Load documents
documents = SimpleDirectoryReader("data").load_data()

# Create index
index = VectorStoreIndex.from_documents(documents)

# Query the index
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic of these documents?")
print(response)
```
Building a Document Q&A System
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.settings import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure global settings
Settings.llm = OpenAI(model="gpt-4", temperature=0)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

# Load and parse documents
documents = SimpleDirectoryReader("./docs").load_data()

# Configure text splitter
text_splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=20)
Settings.text_splitter = text_splitter

# Create index
index = VectorStoreIndex.from_documents(documents)

# Create query engine with custom settings
query_engine = index.as_query_engine(
    similarity_top_k=3,
    response_mode="compact"
)

# Query the system
while True:
    question = input("Ask a question (or 'quit' to exit): ")
    if question.lower() == 'quit':
        break
    response = query_engine.query(question)
    print(f"\nAnswer: {response}")
    print(f"Sources: {[node.metadata.get('file_name', 'Unknown') for node in response.source_nodes]}")
```
Advanced RAG with Custom Retrievers
```python
from llama_index.core import VectorStoreIndex, get_response_synthesizer
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.postprocessor import SimilarityPostprocessor

# Create index (assuming documents are already loaded)
index = VectorStoreIndex.from_documents(documents)

# Configure retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)

# Configure response synthesizer
response_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize",
)

# Add post-processing
postprocessor = SimilarityPostprocessor(similarity_cutoff=0.7)

# Assemble query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[postprocessor],
)

# Query with enhanced retrieval
response = query_engine.query("Explain the key concepts in detail")
```
Chat Engine for Conversational RAG
```python
from llama_index.core.memory import ChatMemoryBuffer

# Create chat engine from index
chat_engine = index.as_chat_engine(
    chat_mode="condense_question",
    memory=ChatMemoryBuffer.from_defaults(token_limit=3900),
    verbose=True
)

# Have a conversation
print("Chat with your documents (type 'quit' to exit)")
while True:
    user_input = input("You: ")
    if user_input.lower() == 'quit':
        break
    response = chat_engine.chat(user_input)
    print(f"Assistant: {response}")
```
LlamaIndex vs LangChain Comparison
| Feature | LlamaIndex | LangChain |
| --- | --- | --- |
| Primary Focus | Data indexing & RAG | General LLM applications |
| Learning Curve | Moderate | Moderate to High |
| RAG Capabilities | Excellent | Good |
| Agent Support | Limited | Extensive |
| Data Connectors | 100+ specialized | Many general-purpose |
| Query Engines | Highly optimized | Flexible chains |
| Documentation | Excellent | Good |
| Ecosystem | Growing | Large |
Data Connectors
LlamaIndex supports numerous data sources:
```python
from llama_index.core import VectorStoreIndex
from llama_index.readers.file import PDFReader
from llama_index.readers.web import SimpleWebPageReader
from llama_index.readers.database import DatabaseReader

# PDF documents
pdf_reader = PDFReader()
pdf_docs = pdf_reader.load_data("document.pdf")

# Web pages
web_reader = SimpleWebPageReader()
web_docs = web_reader.load_data(["https://example.com"])

# Database (sql_database is an existing SQLDatabase instance wrapping your connection)
db_reader = DatabaseReader(sql_database=sql_database)
db_docs = db_reader.load_data(query="SELECT * FROM articles")

# Combine all documents
all_docs = pdf_docs + web_docs + db_docs
index = VectorStoreIndex.from_documents(all_docs)
```
Custom Data Structures
```python
from llama_index.core import VectorStoreIndex, TreeIndex, KeywordTableIndex
from llama_index.core.composability import ComposableGraph

# Create different index types
vector_index = VectorStoreIndex.from_documents(documents)
tree_index = TreeIndex.from_documents(documents)
keyword_index = KeywordTableIndex.from_documents(documents)

# Create a composable graph
graph = ComposableGraph.from_indices(
    TreeIndex,
    [vector_index, tree_index, keyword_index],
    index_summaries=["Vector index", "Tree index", "Keyword index"]
)

# Query the composed graph
query_engine = graph.as_query_engine()
response = query_engine.query("Complex query requiring multiple approaches")
```
Integration with Vector Databases
```python
import chromadb
from llama_index.core import StorageContext, VectorStoreIndex, load_index_from_storage
from llama_index.vector_stores.chroma import ChromaVectorStore

# Initialize a persistent Chroma client so embeddings survive restarts
chroma_client = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = chroma_client.get_or_create_collection("quickstart")

# Create vector store and wire it into a storage context
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create index with persistent storage
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context
)

# Persist the remaining index metadata (docstore, index store)
index.storage_context.persist(persist_dir="./storage")

# Later, reattach the same vector store and load the index
storage_context = StorageContext.from_defaults(
    vector_store=vector_store,
    persist_dir="./storage"
)
loaded_index = load_index_from_storage(storage_context)
```
When to Choose LlamaIndex
Choose LlamaIndex if you:
- Are building RAG-focused applications
- Need robust data ingestion and indexing
- Want optimized document Q&A systems
- Prefer specialized tools over general frameworks
- Need excellent vector database integrations
- Want simpler architecture for data-centric apps
Choose LangChain if you:
- Need complex agent workflows
- Want extensive chain compositions
- Are building diverse LLM applications
- Need broader ecosystem support
- Want more flexibility in application architecture
Migration Between Frameworks
You can often use both frameworks together:
```python
# Use LlamaIndex for data indexing
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(documents)

# Use LangChain for agent workflows
from langchain.agents import initialize_agent
from langchain.tools import Tool
from langchain_openai import ChatOpenAI  # any LangChain-compatible LLM works here

def query_documents(query: str) -> str:
    """Answer a question against the LlamaIndex index."""
    query_engine = index.as_query_engine()
    return str(query_engine.query(query))

tools = [
    Tool(
        name="Document Search",
        func=query_documents,
        description="Search through indexed documents"
    )
]

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description")
```
Resources