LangChain RAG Application Template
A complete Retrieval-Augmented Generation (RAG) application template using LangChain, FastAPI, and vector databases
Overview
This LangChain RAG (Retrieval-Augmented Generation) template provides a complete foundation for building document-based question-answering systems. It combines the power of vector databases for semantic search with large language models for natural language generation.
The template includes document processing, vector storage, retrieval mechanisms, and a FastAPI interface for easy integration into web applications.
Features
Document Processing Pipeline
- Multi-format Support: PDF, Word, text files, and web scraping
- Intelligent Chunking: Smart text splitting with overlap for context preservation
- Metadata Extraction: Automatic extraction of document metadata
- Batch Processing: Efficient processing of large document collections
Vector Database Integration
- ChromaDB Integration: Local vector database for development
- Scalable Storage: Support for production vector databases (Pinecone, Weaviate)
- Semantic Search: Advanced similarity search with filtering
- Embedding Management: Flexible embedding model configuration
LangChain RAG Pipeline
- Retrieval Chain: Optimized document retrieval with relevance scoring
- Generation Chain: Context-aware response generation
- Memory Integration: Conversation history and context management
- Multi-step Reasoning: Support for complex queries requiring multiple retrievals
FastAPI REST API
- Type-safe Endpoints: Pydantic models for request/response validation
- Async Processing: Non-blocking operations for better performance
- Error Handling: Comprehensive error handling and logging
- Interactive Documentation: Automatic OpenAPI documentation
Quick Start
Prerequisites
- Python 3.8 or higher
- OpenAI API key (or alternative LLM provider)
- Git
- Docker (optional)
Installation
1. Clone the template

   ```bash
   git clone https://github.com/pragmatic-ai-stack/langchain-rag-template.git
   cd langchain-rag-template
   ```

2. Create a virtual environment

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```

4. Set up environment variables

   ```bash
   cp .env.example .env
   # Edit .env with your configuration
   ```

5. Initialize the vector database

   ```bash
   python scripts/setup_database.py
   ```

6. Ingest sample documents

   ```bash
   python scripts/ingest_documents.py --path data/sample_docs/
   ```

7. Run the application

   ```bash
   uvicorn app.main:app --reload
   ```
Visit http://localhost:8000/docs to see the interactive API documentation.
Configuration
The template uses environment variables for configuration. Copy .env.example to .env and customize:
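The exact variables are defined in `.env.example`; a typical configuration might include values like the following (variable names are illustrative, not the template's authoritative list):

```ini
OPENAI_API_KEY=sk-...
VECTOR_DB=chromadb
EMBEDDING_MODEL=text-embedding-3-small
CHUNK_SIZE=500
CHUNK_OVERLAP=50
```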
API Endpoints
Document Management
- `POST /documents/upload` - Upload and process documents
- `GET /documents/` - List processed documents
- `DELETE /documents/{doc_id}` - Remove a document from the index
RAG Operations
- `POST /query` - Ask questions about your documents
- `POST /chat` - Conversational interface with memory
- `GET /similar` - Find similar documents
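As a sketch of the shapes a `/query` call might exchange (field names here are illustrative; the generated OpenAPI docs at `/docs` are the authoritative schema), the request could look like:

```json
{"question": "What does the refund policy say?", "top_k": 4}
```

and the response might pair the generated answer with its retrieved sources:

```json
{"answer": "Refunds are issued within 30 days.", "sources": [{"doc_id": "policy-001", "score": 0.87}]}
```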
System Operations
- `GET /health` - Application health check
- `GET /stats` - System statistics and metrics
Core Components
Document Processor
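The template's processor lives in `app/rag/document_processor.py`. As a simplified stand-in for the overlap-chunking idea it implements (function name and default sizes below are assumptions, not the template's actual code):

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into fixed-size chunks that overlap, so context spanning
    a chunk boundary is preserved in both neighbors."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

In practice LangChain's `RecursiveCharacterTextSplitter` adds smarter boundary detection (splitting on paragraphs and sentences before falling back to characters), but the overlap mechanic is the same.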
Vector Store Operations
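A minimal sketch of the similarity search the vector store performs, in plain Python (production deployments delegate this to ChromaDB, Pinecone, or Weaviate; the `top_k` helper and toy index are illustrative):

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: List[float], index: Dict[str, List[float]], k: int = 3) -> List[str]:
    """Return the ids of the k documents whose embeddings are closest to the query."""
    scored = sorted(index.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```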
RAG Chain
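Conceptually, the chain retrieves context and then prompts the LLM with it. A simplified stand-in for that flow (the `retriever` and `llm` callables below are placeholders for the template's components, not its API):

```python
from typing import Callable, List


def build_prompt(question: str, contexts: List[str]) -> str:
    """Assemble a grounded prompt: numbered retrieved chunks first, then the question."""
    context_block = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question: str,
           retriever: Callable[[str, int], List[str]],
           llm: Callable[[str], str],
           k: int = 4) -> str:
    """Retrieve the top-k chunks for the question, then generate an answer from them."""
    contexts = retriever(question, k)
    return llm(build_prompt(question, contexts))
```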
Customization Guide
Adding New Document Types
1. Create a document loader in `app/rag/loaders/`
2. Register the loader in `document_processor.py`
3. Update the API to accept the new file type
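A new loader only needs to turn a file into text plus metadata. A hypothetical Markdown loader might look like this (the class name and return shape are assumptions, not the template's loader interface):

```python
from pathlib import Path


class MarkdownLoader:
    """Hypothetical loader: reads a Markdown file and returns text plus metadata."""

    suffixes = {".md", ".markdown"}

    def load(self, path: str) -> dict:
        p = Path(path)
        if p.suffix.lower() not in self.suffixes:
            raise ValueError(f"unsupported file type: {p.suffix}")
        return {
            "text": p.read_text(encoding="utf-8"),
            "metadata": {"source": str(p), "format": "markdown"},
        }
```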
Switching Vector Databases
1. Update configuration in `.env`
2. Implement a vector store adapter in `app/rag/vector_stores/`
3. Update dependency injection in `app/api/dependencies.py`
Custom LLM Integration
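One common pattern is a small provider-agnostic interface that each concrete backend implements; the names below are illustrative, not the template's API:

```python
from typing import Protocol


class LLMClient(Protocol):
    """Minimal interface a custom LLM backend could implement."""

    def generate(self, prompt: str) -> str: ...


class EchoLLM:
    """Toy backend, useful for tests and offline development."""

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


def make_llm(provider: str) -> LLMClient:
    """Dispatch on a provider name; real code would return OpenAI/Anthropic clients here."""
    if provider == "echo":
        return EchoLLM()
    raise ValueError(f"unknown provider: {provider}")
```

The rest of the pipeline then depends only on `LLMClient`, so swapping OpenAI for Anthropic or a local model is a one-line configuration change.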
Testing
Run the test suite (e.g. with `pytest`) from the project root; the suite lives in `tests/`.
Performance Optimization
Caching Strategies
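For example, repeated queries can reuse embeddings via an in-process cache; the `_embed` function below is a placeholder for a real embedding call (OpenAI, Sentence Transformers, etc.):

```python
from functools import lru_cache
from typing import Tuple


def _embed(text: str) -> Tuple[float, ...]:
    # Placeholder for a real embedding call; returns a tuple so results are hashable.
    return (float(len(text)), float(sum(map(ord, text)) % 1000))


@lru_cache(maxsize=4096)
def embed_cached(text: str) -> Tuple[float, ...]:
    """Memoize embeddings so identical texts are embedded only once."""
    return _embed(text)
```

For multi-process deployments, the same idea applies with a shared cache such as Redis keyed on a hash of the text.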
Batch Processing
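Grouping documents into fixed-size batches amortizes per-request overhead when embedding large collections. A generic helper:

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield fixed-size batches (the last may be smaller) for bulk embedding calls."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```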
Vector Database Optimization
- Tune `top_k` and similarity thresholds to balance recall against latency
- Prefer approximate-nearest-neighbor indexes (e.g. HNSW) for large collections
- Filter by metadata before similarity search to shrink the candidate set
Deployment
Docker Deployment
1. Build the image

   ```bash
   docker build -t langchain-rag-app .
   ```

2. Run with docker-compose

   ```bash
   docker-compose up -d
   ```
Production Considerations
- Load Balancing: Use multiple API instances behind a load balancer
- Database Scaling: Consider managed vector database services
- Monitoring: Implement comprehensive logging and metrics
- Security: Add authentication and rate limiting
- Caching: Implement Redis for response caching
Monitoring and Observability
Metrics Collection
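A lightweight way to collect per-operation latency is a timing decorator whose measurements can later be exported to a metrics backend such as Prometheus (the registry and names below are illustrative):

```python
import time
from collections import defaultdict
from functools import wraps
from typing import Dict, List

# In-process latency registry; a real setup would export these as histograms.
LATENCIES: Dict[str, List[float]] = defaultdict(list)


def timed(name: str):
    """Record wall-clock latency of each call under the given operation name."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                LATENCIES[name].append(time.perf_counter() - start)
        return wrapper
    return decorator
```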
Logging
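Structured (JSON) logs are easier for aggregators to parse than free-form text. A minimal stdlib-only setup (logger name and fields are illustrative):

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line for easy ingestion by log aggregators."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    logger = logging.getLogger("rag")
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger
```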
Support
- Documentation: Full Documentation
- GitHub Issues: Report bugs or request features
- Community: Join our Discord
License
This template is released under the MIT License. See LICENSE for details.
Project Structure
- `app/`: Main application directory
- `app/api/`: API route definitions
- `app/main.py`: FastAPI application entry point
- `app/models/`: Pydantic data models
- `app/rag/`: RAG-specific modules
- `app/rag/document_processor.py`: Document ingestion and processing
- `app/rag/generator.py`: Response generation with LLM
- `app/rag/retriever.py`: Document retrieval logic
- `app/rag/vector_store.py`: Vector database operations
- `data/`: Sample documents and datasets
- `docker-compose.yml`: Docker services configuration
- `requirements.txt`: Python dependencies
- `tests/`: Test suite
Customization Options
- Vector Database: Choose between ChromaDB, Pinecone, or Weaviate
- LLM Provider: Configure OpenAI, Anthropic, or local models
- Document Types: Support for PDF, Word, text, and web scraping
- Embedding Model: Choose an embedding model (OpenAI, Sentence Transformers, etc.)
- Authentication: Add API key authentication and user management
Deployment Options
Docker
Deploy using Docker containers
- Build Docker image
- Configure environment variables
- Run with docker-compose
- Set up reverse proxy
AWS
Deploy to AWS with ECS and RDS
- Create ECS cluster
- Set up RDS for metadata
- Configure S3 for document storage
- Deploy with CloudFormation
Google Cloud
Deploy to Google Cloud Run
- Build container image
- Push to Container Registry
- Deploy to Cloud Run
- Configure Cloud Storage