
Template Info

Category: AI
Difficulty: Intermediate
Setup Time: 20 minutes
License: MIT
Last Updated: 2024-01-15

Key Features

  • Document ingestion and processing pipeline
  • Vector database integration with ChromaDB
  • FastAPI REST API for RAG queries
  • LangChain integration with multiple LLM providers
  • Pydantic models for type-safe data handling
  • Comprehensive error handling and logging
  • Docker containerization
  • Async processing for better performance

Use Cases

  • Document Q&A systems
  • Knowledge base chatbots
  • Research assistance tools
  • Customer support automation
  • Content discovery platforms

LangChain RAG Application Template

A complete Retrieval-Augmented Generation (RAG) application template using LangChain, FastAPI, and vector databases

Setup Time: 20 minutes
Difficulty: Intermediate
Components: 5 included

Overview

This LangChain RAG (Retrieval-Augmented Generation) template provides a complete foundation for building document-based question-answering systems. It combines the power of vector databases for semantic search with large language models for natural language generation.

The template includes document processing, vector storage, retrieval mechanisms, and a FastAPI interface for easy integration into web applications.

Features

Document Processing Pipeline

  • Multi-format Support: PDF, Word, text files, and web scraping
  • Intelligent Chunking: Smart text splitting with overlap for context preservation (see the sketch after this list)
  • Metadata Extraction: Automatic extraction of document metadata
  • Batch Processing: Efficient processing of large document collections
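
To make the chunking step concrete, here is a minimal sketch using LangChain's RecursiveCharacterTextSplitter with the CHUNK_SIZE and CHUNK_OVERLAP values from the Configuration section; the sample file path is illustrative, and the template's document_processor.py may wire this differently.

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Values mirror CHUNK_SIZE / CHUNK_OVERLAP from .env; the template's
# internal wiring may differ.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

with open("data/sample_docs/example.txt", encoding="utf-8") as f:  # path assumed
    chunks = splitter.split_text(f.read())
print(f"Produced {len(chunks)} chunks with ~200 characters of overlap")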

Vector Database Integration

  • ChromaDB Integration: Local vector database for development
  • Scalable Storage: Support for production vector databases (Pinecone, Weaviate)
  • Semantic Search: Advanced similarity search with filtering
  • Embedding Management: Flexible embedding model configuration
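
As a rough sketch of what the ChromaDB integration amounts to in plain LangChain (the template wraps this in its own VectorStore class, shown under Core Components below):

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# EMBEDDING_MODEL, COLLECTION_NAME, and VECTOR_DB_PATH come from .env.
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
store = Chroma(
    collection_name="documents",
    embedding_function=embeddings,
    persist_directory="./data/vector_db",
)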

LangChain RAG Pipeline

  • Retrieval Chain: Optimized document retrieval with relevance scoring
  • Generation Chain: Context-aware response generation
  • Memory Integration: Conversation history and context management
  • Multi-step Reasoning: Support for complex queries requiring multiple retrievals
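
The template's RAGChain (shown under Core Components) layers memory and multi-step retrieval on top, but the core retrieve-then-generate loop is roughly equivalent to a stock LangChain RetrievalQA chain:

from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

# `store` is the Chroma instance from the previous sketch.
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.7)
qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=store.as_retriever(search_kwargs={"k": 5}),
    return_source_documents=True,
)
result = qa({"query": "What is machine learning?"})
print(result["result"], result["source_documents"])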

FastAPI REST API

  • Type-safe Endpoints: Pydantic models for request/response validation
  • Async Processing: Non-blocking operations for better performance
  • Error Handling: Comprehensive error handling and logging
  • Interactive Documentation: Automatic OpenAPI documentation
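
A stripped-down endpoint in this style might look as follows; the model fields and handler name here are illustrative, not the template's actual definitions in app/api/routes.py:

from typing import List

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    question: str
    top_k: int = 5

class QueryResponse(BaseModel):
    answer: str
    sources: List[str]

@app.post("/query", response_model=QueryResponse)
async def query_documents(req: QueryRequest) -> QueryResponse:
    # A real handler would await the RAG chain; placeholder response here.
    return QueryResponse(answer="...", sources=[])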

Quick Start

Prerequisites

  • Python 3.8 or higher
  • OpenAI API key (or alternative LLM provider)
  • Git
  • Docker (optional)

Installation

  1. Clone the template

    git clone https://github.com/pragmatic-ai-stack/langchain-rag-template.git
    cd langchain-rag-template

  2. Create a virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate

  3. Install dependencies

    pip install -r requirements.txt

  4. Set up environment variables

    cp .env.example .env
    # Edit .env with your configuration

  5. Initialize the vector database

    python scripts/setup_database.py

  6. Ingest sample documents

    python scripts/ingest_documents.py --path data/sample_docs/

  7. Run the application

    uvicorn app.main:app --reload

Expected Output

INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [28720]
INFO:     Started server process [28722]
INFO:     Waiting for application startup.
INFO:     Vector database initialized with 150 documents
INFO:     Application startup complete.

Visit http://localhost:8000/docs to see the interactive API documentation.

Project Structure

langchain-rag-template/
├── app/
│   ├── __init__.py
│   ├── main.py                 # FastAPI application
│   ├── config.py              # Configuration settings
│   ├── api/
│   │   ├── __init__.py
│   │   ├── routes.py          # API endpoints
│   │   └── dependencies.py    # FastAPI dependencies
│   ├── models/
│   │   ├── __init__.py
│   │   ├── requests.py        # Request models
│   │   └── responses.py       # Response models
│   ├── rag/
│   │   ├── __init__.py
│   │   ├── document_processor.py  # Document processing
│   │   ├── vector_store.py        # Vector database operations
│   │   ├── retriever.py           # Document retrieval
│   │   ├── generator.py           # Response generation
│   │   └── chains.py              # LangChain chains
│   └── utils/
│       ├── __init__.py
│       ├── logging.py         # Logging configuration
│       └── exceptions.py      # Custom exceptions
├── data/
│   ├── sample_docs/           # Sample documents
│   └── processed/             # Processed document cache
├── tests/
│   ├── __init__.py
│   ├── test_api.py           # API tests
│   ├── test_rag.py           # RAG functionality tests
│   └── conftest.py           # Test configuration
├── scripts/
│   ├── setup_database.py     # Database initialization
│   ├── ingest_documents.py   # Document ingestion
│   └── benchmark.py          # Performance benchmarking
├── docker-compose.yml        # Docker services
├── Dockerfile               # Docker image
├── requirements.txt         # Python dependencies
├── .env.example            # Environment template
└── README.md               # Documentation

Configuration

The template uses environment variables for configuration. Copy .env.example to .env and customize:

# LLM Configuration
OPENAI_API_KEY=your-openai-api-key
LLM_MODEL=gpt-3.5-turbo
LLM_TEMPERATURE=0.7

# Embedding Configuration
EMBEDDING_MODEL=text-embedding-ada-002
EMBEDDING_PROVIDER=openai

# Vector Database
VECTOR_DB_TYPE=chromadb
VECTOR_DB_PATH=./data/vector_db
COLLECTION_NAME=documents

# API Configuration
API_HOST=0.0.0.0
API_PORT=8000
API_WORKERS=1

# Document Processing
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
MAX_DOCUMENTS=10000

# Retrieval Configuration
RETRIEVAL_K=5
SIMILARITY_THRESHOLD=0.7
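
How these variables are surfaced to the code is the job of app/config.py; a common pattern, sketched here as an assumption rather than the template's actual implementation, is a pydantic settings class that reads .env automatically:

# Hypothetical sketch of app/config.py (pydantic v1 style).
from pydantic import BaseSettings

class Settings(BaseSettings):
    openai_api_key: str
    llm_model: str = "gpt-3.5-turbo"
    llm_temperature: float = 0.7
    chunk_size: int = 1000
    chunk_overlap: int = 200
    retrieval_k: int = 5

    class Config:
        env_file = ".env"

settings = Settings()  # defaults above are overridden by .env / environment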

API Endpoints

Document Management

  • POST /documents/upload - Upload and process documents
  • GET /documents/ - List processed documents
  • DELETE /documents/{doc_id} - Remove document from index

RAG Operations

  • POST /query - Ask questions about your documents
  • POST /chat - Conversational interface with memory
  • GET /similar - Find similar documents

System Operations

  • GET /health - Application health check
  • GET /stats - System statistics and metrics
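
For instance, once the server is running you can exercise the query endpoint with a plain HTTP POST (the request body shape is assumed; consult /docs for the authoritative schema):

import requests

resp = requests.post(
    "http://localhost:8000/query",
    json={"question": "What does the sample corpus say about embeddings?"},
)
resp.raise_for_status()
print(resp.json())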

Core Components

Document Processor

from app.rag.document_processor import DocumentProcessor

processor = DocumentProcessor()

# Process a single document
result = await processor.process_document("path/to/document.pdf")

# Batch process multiple documents
results = await processor.process_directory("path/to/documents/")

Vector Store Operations

from app.rag.vector_store import VectorStore

vector_store = VectorStore()

# Add documents to the vector store
await vector_store.add_documents(documents)

# Search for similar documents
results = await vector_store.similarity_search(
    query="What is machine learning?",
    k=5,
    filter={"category": "AI"}
)

RAG Chain

from app.rag.chains import RAGChain

rag_chain = RAGChain()

# Ask a question
response = await rag_chain.query(
    question="How does neural network training work?",
    chat_history=[]
)

print(response.answer)
print(response.source_documents)

Customization Guide

Adding New Document Types

  1. Create a document loader in app/rag/loaders/
  2. Register the loader in document_processor.py
  3. Update the API to accept the new file type
# app/rag/loaders/custom_loader.py
from typing import List

from langchain.document_loaders.base import BaseLoader
from langchain.schema import Document

class CustomDocumentLoader(BaseLoader):
    def __init__(self, file_path: str):
        self.file_path = file_path

    def load(self) -> List[Document]:
        # Your custom loading logic; must return a list of Documents.
        with open(self.file_path, encoding="utf-8") as f:
            text = f.read()
        return [Document(page_content=text, metadata={"source": self.file_path})]

Switching Vector Databases

  1. Update configuration in .env
  2. Implement vector store adapter in app/rag/vector_stores/
  3. Update dependency injection in app/api/dependencies.py
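
The exact adapter surface is defined by the template; as an assumption, a minimal interface mirroring the VectorStore operations used under Core Components would look like:

# Hypothetical base class for app/rag/vector_stores/ adapters;
# method names mirror the VectorStore usage shown above.
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional

class BaseVectorStoreAdapter(ABC):
    @abstractmethod
    async def add_documents(self, documents: List[Any]) -> None:
        ...

    @abstractmethod
    async def similarity_search(
        self,
        query: str,
        k: int = 5,
        filter: Optional[Dict[str, Any]] = None,
    ) -> List[Any]:
        ...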

Custom LLM Integration

# app/rag/llms/custom_llm.py
from typing import List, Optional

from langchain.llms.base import LLM

class CustomLLM(LLM):
    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        # Your custom LLM integration; must return the completion text.
        # Placeholder echoes the prompt so the class runs as-is.
        return f"[custom-llm] {prompt}"

    @property
    def _llm_type(self) -> str:
        return "custom"

Testing

Run the test suite:

# Run all tests
pytest

# Run with coverage
pytest --cov=app

# Run specific test file
pytest tests/test_rag.py

# Run integration tests
pytest tests/integration/
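
As a starting point for your own tests, a minimal health-check test using FastAPI's TestClient could look like this (the endpoint comes from the API Endpoints section; the import path follows the uvicorn command above):

# tests/test_api.py-style sketch
from fastapi.testclient import TestClient

from app.main import app

client = TestClient(app)

def test_health_returns_ok():
    resp = client.get("/health")
    assert resp.status_code == 200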

Performance Optimization

Caching Strategies

# Enable response caching
ENABLE_RESPONSE_CACHE=true
CACHE_TTL=3600

# Enable embedding caching
ENABLE_EMBEDDING_CACHE=true
EMBEDDING_CACHE_SIZE=10000
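
If you extend the cache layer yourself, the simplest in-process equivalent of an embedding cache is an LRU wrapper around the embedding call; this sketch is an assumption about intent, not the template's implementation:

from functools import lru_cache
from typing import Tuple

def _embed(text: str) -> Tuple[float, ...]:
    # Stand-in for the configured embedding model call; returns a
    # tuple so cached values stay immutable.
    raise NotImplementedError

@lru_cache(maxsize=10_000)  # mirrors EMBEDDING_CACHE_SIZE
def embed_cached(text: str) -> Tuple[float, ...]:
    # str arguments are hashable, so lru_cache can memoize per text.
    return _embed(text)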

Batch Processing

# Process documents in batches
BATCH_SIZE=50
CONCURRENT_WORKERS=4

Vector Database Optimization

# Optimize vector search (HNSW index: M sets graph connectivity,
# ef sets the search-time candidate list size; higher values trade
# speed for recall)
VECTOR_INDEX_TYPE=hnsw
VECTOR_SEARCH_PARAMS={"ef": 128, "M": 16}

Deployment

Docker Deployment

  1. Build the image

    docker build -t langchain-rag-app .

  2. Run with docker-compose

    docker-compose up -d

Production Considerations

  • Load Balancing: Use multiple API instances behind a load balancer
  • Database Scaling: Consider managed vector database services
  • Monitoring: Implement comprehensive logging and metrics
  • Security: Add authentication and rate limiting
  • Caching: Implement Redis for response caching

Monitoring and Observability

Metrics Collection

Built-in metrics:

  • Query response time
  • Document processing time
  • Vector search performance
  • LLM token usage
  • Error rates

Logging

# Structured logging with context
import structlog

logger = structlog.get_logger()
logger.info(
    "Query processed",
    query=query,
    response_time=response_time,
    documents_retrieved=len(docs),
)

License

This template is released under the MIT License. See LICENSE for details.

Project Structure

app/: Main application directory
app/api/: API route definitions
app/main.py: FastAPI application entry point
app/models/: Pydantic data models
app/rag/: RAG-specific modules
app/rag/document_processor.py: Document ingestion and processing
app/rag/generator.py: Response generation with LLM
app/rag/retriever.py: Document retrieval logic
app/rag/vector_store.py: Vector database operations
data/: Sample documents and datasets
docker-compose.yml: Docker services configuration
requirements.txt: Python dependencies
tests/: Test suite

Customization Options

Vector Database

Choose between ChromaDB, Pinecone, or Weaviate

LLM Provider

Configure OpenAI, Anthropic, or local models

Document Types

Support for PDF, Word, text, and web scraping

Embedding Model

Choose embedding model (OpenAI, Sentence Transformers, etc.)

Authentication

Add API key authentication and user management

Deployment Options

Docker

Deploy using Docker containers

  1. Build Docker image
  2. Configure environment variables
  3. Run with docker-compose
  4. Set up reverse proxy

AWS

Deploy to AWS with ECS and RDS

  1. Create ECS cluster
  2. Set up RDS for metadata
  3. Configure S3 for document storage
  4. Deploy with CloudFormation

Google Cloud

Deploy to Google Cloud Run

  1. Build container image
  2. Push to Container Registry
  3. Deploy to Cloud Run
  4. Configure Cloud Storage