Tutorial Info

Difficulty
Beginner
Duration 30 minutes
Reading Time 11 min
Last Updated 2024-01-15

Mastering Data Validation with Pydantic

Learn how to use Pydantic for robust data validation, serialization, and API development

What You'll Learn

  • Create Pydantic models with proper validation
  • Use custom validators for business logic
  • Handle configuration with Pydantic Settings
  • Integrate Pydantic with FastAPI

Prerequisites

  • Basic Python knowledge
  • Understanding of Python type hints
  • Python 3.8+ installed

What You'll Build

A complete understanding of Pydantic for data validation and API development

Overview

Pydantic is a powerful data validation library that uses Python type hints to validate, serialize, and deserialize data. In this tutorial, you’ll learn how to leverage Pydantic for robust data handling in your applications.
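
As a minimal sketch of that validate/serialize/deserialize round trip (the `Point` model and its fields are made up for illustration, and Pydantic v2 is assumed), the type hints drive both parsing and output:

```python
from pydantic import BaseModel, ValidationError

# Hypothetical model, used only to illustrate the round trip
class Point(BaseModel):
    x: int
    y: int = 0

# Deserialize: in the default (lax) mode the numeric string "3" is coerced to int
point = Point.model_validate_json('{"x": "3"}')

# Serialize back to a dict and to a JSON string
as_dict = point.model_dump()
as_json = point.model_dump_json()
print(as_dict)
print(as_json)

# Validate: a value that cannot be coerced raises ValidationError
try:
    Point(x="three")
    failed = False
except ValidationError:
    failed = True
print(f"Rejected bad input: {failed}")
```

The same model class serves as parser, validator, and serializer, which is the pattern the rest of this tutorial builds on.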

Step 1: Basic Models and Validation

Let’s start with the fundamentals of creating Pydantic models.

Installation

pip install "pydantic>=2" pydantic-settings fastapi uvicorn requests

Basic Model Creation

Create basic_models.py:

from pydantic import BaseModel, Field, ValidationError
from typing import Optional, List
from datetime import datetime
from enum import Enum

# Define an enum for user roles
class UserRole(str, Enum):
    ADMIN = "admin"
    USER = "user"
    MODERATOR = "moderator"

# Basic user model
class User(BaseModel):
    id: int
    name: str = Field(..., min_length=1, max_length=100)
    email: str = Field(..., pattern=r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')  # Pydantic v2 renamed regex= to pattern=
    age: Optional[int] = Field(None, ge=0, le=150)
    role: UserRole = UserRole.USER
    is_active: bool = True
    created_at: datetime = Field(default_factory=datetime.now)
    tags: List[str] = []

# Test the model
if __name__ == "__main__":
    # Valid user data
    user_data = {
        "id": 1,
        "name": "John Doe",
        "email": "john.doe@example.com",
        "age": 30,
        "role": "admin",
        "tags": ["developer", "python"]
    }
    
    try:
        user = User(**user_data)
        print("✅ Valid user created:")
        print(user.model_dump_json(indent=2))
        print(f"User role: {user.role.value}")  # .value prints "admin" on every Python version
        print(f"Created at: {user.created_at}")
    except ValidationError as e:
        print("❌ Validation error:")
        print(e)
    
    # Invalid user data
    invalid_data = {
        "id": "not_an_int",  # Should be int
        "name": "",  # Too short
        "email": "invalid-email",  # Invalid format
        "age": -5,  # Negative age
        "role": "invalid_role"  # Invalid enum value
    }
    
    try:
        invalid_user = User(**invalid_data)
    except ValidationError as e:
        print("\n❌ Expected validation errors:")
        for error in e.errors():
            print(f"  - {error['loc'][0]}: {error['msg']}")

Expected Output

✅ Valid user created:
{
  "id": 1,
  "name": "John Doe",
  "email": "john.doe@example.com",
  "age": 30,
  "role": "admin",
  "is_active": true,
  "created_at": "2024-01-15T10:30:00.123456",
  "tags": [
    "developer",
    "python"
  ]
}
User role: admin
Created at: 2024-01-15 10:30:00.123456

❌ Expected validation errors:
  - id: Input should be a valid integer, unable to parse string as an integer
  - name: String should have at least 1 character
  - email: String should match pattern '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
  - age: Input should be greater than or equal to 0
  - role: Input should be 'admin', 'user' or 'moderator'
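
The error loop above prints only `error['loc'][0]`, which is enough for flat models. For nested models, `loc` is a tuple describing the full path to the failing field. The following sketch (the `Address` and `Customer` models are hypothetical) joins that path and also shows the machine-readable `type` key of each error:

```python
from pydantic import BaseModel, Field, ValidationError

# Hypothetical nested models for illustration
class Address(BaseModel):
    zip_code: str = Field(..., min_length=5)

class Customer(BaseModel):
    name: str
    address: Address

errors = []
try:
    Customer(name="Ada", address={"zip_code": "123"})
except ValidationError as e:
    errors = e.errors()

# Each entry is a dict; 'loc' is a tuple path, so join it for nested fields
for err in errors:
    path = ".".join(str(part) for part in err["loc"])
    print(f"{path} [{err['type']}]: {err['msg']}")
```

Branching on `type` rather than on the human-readable `msg` tends to be more stable when you need to react to specific failures in code.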

Step 2: Custom Validators and Advanced Features

Now let’s explore custom validators and more advanced Pydantic features.

Custom Validators

Create advanced_validation.py:

from pydantic import BaseModel, field_validator, model_validator
from typing import Optional, Dict, Any
import re
from datetime import date

# Pydantic v2 API: field_validator replaces validator,
# and model_validator replaces root_validator
class UserProfile(BaseModel):
    username: str
    password: str
    confirm_password: str
    email: str
    birth_date: Optional[date] = None
    phone: Optional[str] = None
    metadata: Dict[str, Any] = {}

    @field_validator('username')
    @classmethod
    def username_alphanumeric(cls, v):
        if not v.isalnum():
            raise ValueError('Username must be alphanumeric')
        if len(v) < 3:
            raise ValueError('Username must be at least 3 characters')
        return v.lower()  # Convert to lowercase

    @field_validator('password')
    @classmethod
    def password_strength(cls, v):
        if len(v) < 8:
            raise ValueError('Password must be at least 8 characters')
        if not re.search(r'[A-Z]', v):
            raise ValueError('Password must contain at least one uppercase letter')
        if not re.search(r'[a-z]', v):
            raise ValueError('Password must contain at least one lowercase letter')
        if not re.search(r'\d', v):
            raise ValueError('Password must contain at least one digit')
        return v

    @field_validator('email')
    @classmethod
    def email_valid(cls, v):
        pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
        if not re.match(pattern, v):
            raise ValueError('Invalid email format')
        return v.lower()

    @field_validator('birth_date')
    @classmethod
    def birth_date_not_future(cls, v):
        if v and v > date.today():
            raise ValueError('Birth date cannot be in the future')
        return v

    @field_validator('phone')
    @classmethod
    def phone_format(cls, v):
        if v is None:
            return v
        # Remove all non-digit characters
        digits = re.sub(r'\D', '', v)
        if len(digits) != 10:
            raise ValueError('Phone number must be 10 digits')
        # Format as (XXX) XXX-XXXX
        return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"

    # mode='after' validators run once every field has passed validation
    @model_validator(mode='after')
    def passwords_match(self):
        if self.password != self.confirm_password:
            raise ValueError('Passwords do not match')
        return self

    @model_validator(mode='after')
    def validate_age_metadata(self):
        if self.birth_date:
            today = date.today()
            # Subtract one year if this year's birthday has not happened yet
            had_birthday = (today.month, today.day) >= (self.birth_date.month, self.birth_date.day)
            self.metadata['calculated_age'] = today.year - self.birth_date.year - (0 if had_birthday else 1)
        return self

# Test the advanced validation
if __name__ == "__main__":
    # Valid profile
    profile_data = {
        "username": "JohnDoe123",
        "password": "SecurePass123",
        "confirm_password": "SecurePass123",
        "email": "john.doe@example.com",
        "birth_date": "1990-05-15",
        "phone": "555-123-4567",
        "metadata": {"preferences": {"theme": "dark"}}
    }
    
    try:
        profile = UserProfile(**profile_data)
        print("✅ Valid profile created:")
        print(f"Username: {profile.username}")
        print(f"Email: {profile.email}")
        print(f"Phone: {profile.phone}")
        print(f"Calculated age: {profile.metadata.get('calculated_age')}")
    except Exception as e:
        print(f"❌ Error: {e}")
    
    # Test password validation
    weak_password_data = {
        "username": "testuser",
        "password": "weak",
        "confirm_password": "weak",
        "email": "test.user@example.com"
    }
    
    try:
        weak_profile = UserProfile(**weak_password_data)
    except Exception as e:
        print(f"\n❌ Password validation error: {e}")
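
When the same rule applies to several fields or models, Pydantic v2 also lets you attach a validator to a type instead of a class. The sketch below reuses the phone-formatting rule from above as a reusable annotated type; the `Phone` alias, the `normalize_phone` helper, and the `Contact` model are names invented for this illustration.

```python
import re
from typing_extensions import Annotated  # typing.Annotated on Python 3.9+
from pydantic import AfterValidator, BaseModel

def normalize_phone(value: str) -> str:
    """Keep digits only and format them as (XXX) XXX-XXXX."""
    digits = re.sub(r"\D", "", value)
    if len(digits) != 10:
        raise ValueError("Phone number must be 10 digits")
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"

# Reusable annotated type: any field declared as Phone runs normalize_phone
# after the standard str validation
Phone = Annotated[str, AfterValidator(normalize_phone)]

# Hypothetical model with two fields sharing the same rule
class Contact(BaseModel):
    home: Phone
    work: Phone

contact = Contact(home="555-123-4567", work="555.987.6543")
print(contact.home)
print(contact.work)
```

This keeps the rule in one place, at the cost of moving it away from the model that uses it; decorator-based validators remain the better fit for logic specific to a single model.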

Model Configuration and Serialization

Create model_config.py:

from pydantic import BaseModel, ConfigDict, Field
from typing import Optional
from datetime import datetime

class Product(BaseModel):
    # Pydantic v2 uses model_config instead of an inner Config class
    model_config = ConfigDict(
        # Allow population by field name or alias
        # (v1 name: allow_population_by_field_name)
        populate_by_name=True,
        # Validate assignment (validate when setting attributes)
        validate_assignment=True,
        # Store enum values rather than enum members
        use_enum_values=True,
        # Extra information for the generated JSON schema (v1 name: schema_extra)
        json_schema_extra={
            "example": {
                "id": 1,
                "name": "Laptop",
                "price": 999.99,
                "description": "High-performance laptop"
            }
        },
    )
    # No custom JSON encoder is needed for datetime:
    # v2 serializes it to ISO 8601 by default

    id: int
    name: str
    price: float = Field(..., gt=0)
    description: Optional[str] = None
    created_at: datetime = Field(default_factory=datetime.now)
    _internal_id: str = "internal"  # Private attribute, excluded from validation and output
    
    def calculate_tax(self, rate: float = 0.08) -> float:
        """Calculate tax for the product"""
        return self.price * rate
    
    @property
    def price_with_tax(self) -> float:
        """Get price including default tax"""
        return self.price + self.calculate_tax()

# Test model configuration
if __name__ == "__main__":
    product = Product(
        id=1,
        name="Gaming Laptop",
        price=1299.99,
        description="High-end gaming laptop with RTX graphics"
    )
    
    print("Product details:")
    print(f"Name: {product.name}")
    print(f"Price: ${product.price:.2f}")
    print(f"Price with tax: ${product.price_with_tax:.2f}")
    print(f"Tax amount: ${product.calculate_tax():.2f}")
    
    # JSON serialization
    print("\nJSON representation:")
    print(product.model_dump_json(indent=2))
    
    # Test validation on assignment
    try:
        product.price = -100  # Should raise validation error
    except Exception as e:
        print(f"\n❌ Validation on assignment: {e}")
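
The model above defines no aliases, so the "populate by field name or alias" setting has no visible effect there. A small sketch of what it changes, assuming Pydantic v2 (where the setting is called `populate_by_name`) and a hypothetical `Item` model:

```python
from pydantic import BaseModel, ConfigDict, Field

# Hypothetical model: the external name is camelCase, the Python name is snake_case
class Item(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    item_id: int = Field(..., alias="itemId")
    name: str
    notes: str = ""

# Both spellings are accepted because populate_by_name is enabled
from_alias = Item(itemId=1, name="Laptop")
from_name = Item(item_id=1, name="Laptop")

# Serialization uses field names unless by_alias=True is passed
print(from_alias.model_dump())
print(from_alias.model_dump(by_alias=True))

# exclude drops fields from the output
print(from_alias.model_dump(exclude={"notes"}))
```

Without `populate_by_name=True`, only the alias (`itemId`) would be accepted as input for that field.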

Step 3: Settings Management and FastAPI Integration

Let’s explore Pydantic Settings for configuration management and integration with FastAPI.

Settings Management

Create settings_example.py:

from pydantic import Field, field_validator
# In Pydantic v2, BaseSettings lives in the separate pydantic-settings package
from pydantic_settings import BaseSettings, NoDecode, SettingsConfigDict
from typing import List
from typing_extensions import Annotated  # typing.Annotated on Python 3.9+
from enum import Enum

class Environment(str, Enum):
    DEVELOPMENT = "development"
    STAGING = "staging"
    PRODUCTION = "production"

class DatabaseSettings(BaseSettings):
    # extra="ignore" skips the non-database keys in the shared .env file
    model_config = SettingsConfigDict(
        env_file=".env", env_file_encoding="utf-8", extra="ignore"
    )

    # validation_alias replaces the v1 Field(env=...) argument
    host: str = Field(..., validation_alias='DB_HOST')
    port: int = Field(5432, validation_alias='DB_PORT')
    username: str = Field(..., validation_alias='DB_USER')
    password: str = Field(..., validation_alias='DB_PASSWORD')
    database: str = Field(..., validation_alias='DB_NAME')

    @property
    def url(self) -> str:
        return f"postgresql://{self.username}:{self.password}@{self.host}:{self.port}/{self.database}"

class AppSettings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore",  # the DB_* keys belong to DatabaseSettings
    )

    app_name: str = Field("My App", validation_alias='APP_NAME')
    debug: bool = Field(False, validation_alias='DEBUG')
    environment: Environment = Field(Environment.DEVELOPMENT, validation_alias='ENVIRONMENT')
    secret_key: str = Field(..., validation_alias='SECRET_KEY')
    # NoDecode (pydantic-settings 2.7+) turns off JSON parsing for this field,
    # so the validator below receives the raw comma-separated string
    allowed_hosts: Annotated[List[str], NoDecode] = Field(["localhost"], validation_alias='ALLOWED_HOSTS')
    api_rate_limit: int = Field(100, validation_alias='API_RATE_LIMIT')

    # Database settings, built when AppSettings is instantiated (not at import time)
    database: DatabaseSettings = Field(default_factory=DatabaseSettings)

    @field_validator('secret_key')
    @classmethod
    def secret_key_length(cls, v):
        if len(v) < 32:
            raise ValueError('Secret key must be at least 32 characters')
        return v

    @field_validator('allowed_hosts', mode='before')
    @classmethod
    def parse_allowed_hosts(cls, v):
        if isinstance(v, str):
            return [host.strip() for host in v.split(',')]
        return v

# Example .env content
env_content = """
APP_NAME=Pragmatic AI Stack API
DEBUG=true
ENVIRONMENT=development
SECRET_KEY=your-super-secret-key-that-is-at-least-32-characters-long
ALLOWED_HOSTS=localhost,127.0.0.1,api.example.com
API_RATE_LIMIT=1000

DB_HOST=localhost
DB_PORT=5432
DB_USER=myuser
DB_PASSWORD=mypassword
DB_NAME=myapp
"""

# Test settings
if __name__ == "__main__":
    # Save the example .env file (this overwrites any .env in the current directory)
    with open('.env', 'w') as f:
        f.write(env_content)

    try:
        settings = AppSettings()
        print("✅ Settings loaded successfully:")
        print(f"App Name: {settings.app_name}")
        print(f"Environment: {settings.environment.value}")
        print(f"Debug Mode: {settings.debug}")
        print(f"Allowed Hosts: {settings.allowed_hosts}")
        print(f"Database URL: {settings.database.url}")
        print(f"Rate Limit: {settings.api_rate_limit}")
    except Exception as e:
        print(f"❌ Settings error: {e}")

FastAPI Integration

Create fastapi_integration.py:

from fastapi import FastAPI, HTTPException, Query
from pydantic import BaseModel, ConfigDict, Field, field_validator
from typing import List, Optional
from datetime import datetime, timezone
from enum import Enum

# Pydantic models for API
class TaskStatus(str, Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    CANCELLED = "cancelled"

class TaskCreate(BaseModel):
    title: str = Field(..., min_length=1, max_length=200)
    description: Optional[str] = Field(None, max_length=1000)
    priority: int = Field(1, ge=1, le=5)
    due_date: Optional[datetime] = None

    @field_validator('due_date')
    @classmethod
    def due_date_not_past(cls, v):
        if v is None:
            return v
        # Aware and naive datetimes cannot be compared, so match the input's form
        now = datetime.now(timezone.utc) if v.tzinfo else datetime.now()
        if v < now:
            raise ValueError('Due date cannot be in the past')
        return v

class TaskUpdate(BaseModel):
    title: Optional[str] = Field(None, min_length=1, max_length=200)
    description: Optional[str] = Field(None, max_length=1000)
    priority: Optional[int] = Field(None, ge=1, le=5)
    status: Optional[TaskStatus] = None
    due_date: Optional[datetime] = None

class TaskResponse(BaseModel):
    # Pydantic v2 style configuration
    model_config = ConfigDict(from_attributes=True)  # For SQLAlchemy models

    id: int
    title: str
    description: Optional[str]
    priority: int
    status: TaskStatus
    due_date: Optional[datetime]
    created_at: datetime
    updated_at: datetime

# FastAPI app
app = FastAPI(
    title="Task Management API",
    description="A simple task management API using Pydantic for validation",
    version="1.0.0"
)

# In-memory storage (replace with database in real app)
tasks_db = []
task_counter = 1

@app.post("/tasks/", response_model=TaskResponse)
async def create_task(task: TaskCreate):
    global task_counter

    new_task = {
        "id": task_counter,
        "title": task.title,
        "description": task.description,
        "priority": task.priority,
        "status": TaskStatus.PENDING,
        "due_date": task.due_date,
        "created_at": datetime.now(),
        "updated_at": datetime.now()
    }

    tasks_db.append(new_task)
    task_counter += 1

    return TaskResponse(**new_task)

@app.get("/tasks/", response_model=List[TaskResponse])
async def list_tasks(
    status: Optional[TaskStatus] = None,
    # Query (not pydantic.Field) declares constraints on query parameters
    priority: Optional[int] = Query(None, ge=1, le=5)
):
    filtered_tasks = tasks_db
    
    if status:
        filtered_tasks = [t for t in filtered_tasks if t["status"] == status]
    
    if priority:
        filtered_tasks = [t for t in filtered_tasks if t["priority"] == priority]
    
    return [TaskResponse(**task) for task in filtered_tasks]

@app.get("/tasks/{task_id}", response_model=TaskResponse)
async def get_task(task_id: int):
    task = next((t for t in tasks_db if t["id"] == task_id), None)
    if not task:
        raise HTTPException(status_code=404, detail="Task not found")
    return TaskResponse(**task)

@app.put("/tasks/{task_id}", response_model=TaskResponse)
async def update_task(task_id: int, task_update: TaskUpdate):
    task = next((t for t in tasks_db if t["id"] == task_id), None)
    if not task:
        raise HTTPException(status_code=404, detail="Task not found")
    
    # Update only provided fields
    update_data = task_update.model_dump(exclude_unset=True)
    for field, value in update_data.items():
        task[field] = value
    
    task["updated_at"] = datetime.now()
    
    return TaskResponse(**task)

@app.delete("/tasks/{task_id}")
async def delete_task(task_id: int):
    global tasks_db
    task = next((t for t in tasks_db if t["id"] == task_id), None)
    if not task:
        raise HTTPException(status_code=404, detail="Task not found")
    
    tasks_db = [t for t in tasks_db if t["id"] != task_id]
    return {"message": "Task deleted successfully"}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
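
The update endpoint relies on `model_dump(exclude_unset=True)` to apply only the fields the client actually sent. The following standalone sketch shows why that matters; `TaskPatch` is a hypothetical model that mirrors the shape of `TaskUpdate`, and `stored` stands in for a record in `tasks_db`.

```python
from typing import Optional
from pydantic import BaseModel

# Hypothetical model mirroring the all-optional shape of TaskUpdate
class TaskPatch(BaseModel):
    title: Optional[str] = None
    priority: Optional[int] = None
    description: Optional[str] = None

stored = {"title": "Learn Pydantic", "priority": 3, "description": "Tutorial"}

# The client changes the priority and explicitly clears the description;
# title is not sent at all
patch = TaskPatch(priority=4, description=None)

full = patch.model_dump()                       # includes title=None
changes = patch.model_dump(exclude_unset=True)  # only the fields that were sent
print(full)
print(changes)

# Applying `changes` keeps the stored title; applying `full` would erase it
stored.update(changes)
print(stored)
```

An explicit `None` counts as "set", so a client can still clear a field on purpose while omitted fields stay untouched.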

Test the API

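Start the server in one terminal (python fastapi_integration.py), then create test_pydantic_api.py and run it from a second terminal: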

import requests
import json
from datetime import datetime, timedelta

BASE_URL = "http://localhost:8000"

def test_create_task():
    """Test creating a new task"""
    task_data = {
        "title": "Learn Pydantic",
        "description": "Complete the Pydantic tutorial",
        "priority": 3,
        "due_date": (datetime.now() + timedelta(days=7)).isoformat()
    }
    
    response = requests.post(f"{BASE_URL}/tasks/", json=task_data)
    print("Create Task Response:")
    print(json.dumps(response.json(), indent=2))
    return response.json()["id"]

def test_list_tasks():
    """Test listing tasks"""
    response = requests.get(f"{BASE_URL}/tasks/")
    print("\nList Tasks Response:")
    print(json.dumps(response.json(), indent=2))

def test_update_task(task_id):
    """Test updating a task"""
    update_data = {
        "status": "in_progress",
        "priority": 4
    }
    
    response = requests.put(f"{BASE_URL}/tasks/{task_id}", json=update_data)
    print(f"\nUpdate Task {task_id} Response:")
    print(json.dumps(response.json(), indent=2))

def test_validation_error():
    """Test validation error handling"""
    invalid_data = {
        "title": "",  # Too short
        "priority": 10,  # Too high
        "due_date": "2020-01-01T00:00:00"  # In the past
    }
    
    response = requests.post(f"{BASE_URL}/tasks/", json=invalid_data)
    print(f"\nValidation Error Response (Status {response.status_code}):")
    print(json.dumps(response.json(), indent=2))

if __name__ == "__main__":
    print("Testing Pydantic FastAPI Integration...")
    print("=" * 50)
    
    # Create a task
    task_id = test_create_task()
    
    # List tasks
    test_list_tasks()
    
    # Update task
    test_update_task(task_id)
    
    # Test validation errors
    test_validation_error()

Congratulations!

You’ve successfully learned how to use Pydantic for:

  • Data Validation: Creating robust models with automatic validation
  • Custom Validators: Implementing business logic validation
  • Settings Management: Managing application configuration
  • FastAPI Integration: Building type-safe APIs with automatic documentation

Pydantic’s combination of simplicity and power makes it an essential tool in the Pragmatic AI Stack for ensuring data integrity and API reliability.

Next Steps

  • Explore Pydantic’s advanced features like custom serializers (the v2 replacement for custom JSON encoders)
  • Learn about Pydantic’s integration with databases (SQLAlchemy)
  • Implement complex validation scenarios for your specific use cases
  • Build production-ready APIs with comprehensive error handling