Data validation is one of the most common problems in Python applications. You build an API, receive JSON data, and need to ensure it matches your expectations. Without proper validation, your application crashes with confusing errors or, worse, silently accepts bad data.
Pydantic solves this problem by letting you define data structures with Python type annotations. The library validates data automatically and provides clear error messages when something goes wrong. Pydantic v2, released in 2023, brought significant improvements: a rewritten core for faster validation, new strict validation modes, and better compatibility with type checkers.
This guide covers everything you need to know about Pydantic v2. You will learn about the new features, understand performance optimizations, and see practical patterns for production code.
Why Pydantic v2 Matters
Pydantic v2 represents a complete rewrite of the validation engine. The previous version served millions of developers, but performance limitations became apparent as applications grew. Large models with hundreds of fields took seconds to validate. The new version addresses these issues while adding requested features.
Performance improvements in v2:
Validation is roughly 5-50x faster than in v1 according to the Pydantic team, with the largest gains on complex models. The speedup comes mainly from pydantic-core, the new validation core written in Rust. For a typical FastAPI application with 50 validation models, initialization time dropped from 800ms to under 50ms.
Key new features:
Strict validation ensures data types match exactly, preventing unexpected type coercion. Configurable modes let you choose between “lax” (original behavior) and “strict” (no coercion) validation. The new discriminated unions feature makes conditional validation clearer. Better generic support means you can build reusable validation patterns without losing type information.
Setting Up Pydantic v2
Install Pydantic v2 with pip:
pip install "pydantic>=2.0"
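The quotes stop the shell from treating >= as an output redirect. The minimum supported Python version depends on the Pydantic release: the 2.0 line supported Python 3.7 and newer, and later 2.x releases have raised that floor, so check the release notes for the version you install. Newer Python versions also allow cleaner annotations, such as str | None on Python 3.10 and later.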
Create your first model:
from datetime import datetime
from typing import Optional

from pydantic import BaseModel


class User(BaseModel):
    id: int
    username: str
    email: str
    created_at: datetime
    bio: Optional[str] = None


# Validate data
user_data = {
    "id": 1,
    "username": "alice",
    "email": "alice@example.com",
    "created_at": "2024-01-15T10:30:00",
}
user = User.model_validate(user_data)
print(user.username)         # alice
print(user.created_at.year)  # 2024
The model_validate method accepts a dictionary and returns a validated model instance. In the default (lax) mode Pydantic converts compatible input types, so the ISO 8601 timestamp string becomes a datetime object. The bio field is absent from the input, so it takes its declared default of None.
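When the input arrives as raw JSON text, the same model can parse and validate it in one step. The sketch below repeats the User model so it runs on its own, and shows model_validate_json, model_dump, and what happens when required fields are missing. The sample payloads are made up for illustration.

```python
from datetime import datetime
from typing import Optional

from pydantic import BaseModel, ValidationError


class User(BaseModel):
    id: int
    username: str
    email: str
    created_at: datetime
    bio: Optional[str] = None


# Parse and validate a raw JSON string in one step
raw = (
    '{"id": 1, "username": "alice", '
    '"email": "alice@example.com", "created_at": "2024-01-15T10:30:00"}'
)
user = User.model_validate_json(raw)

# Convert the validated model back to plain Python data
dumped = user.model_dump()

# Missing required fields raise ValidationError up front
try:
    User.model_validate({"id": 1})
    missing = []
except ValidationError as exc:
    missing = [err["loc"][0] for err in exc.errors()]
```

Collecting the missing field names from exc.errors() is one way to report every problem at once instead of failing on the first.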
Defining Models with Type Hints
Pydantic uses Python’s type hint system to define validation rules. This approach keeps your code readable while providing powerful validation.
Basic Types
from ipaddress import IPv4Address
from typing import Dict, List, Set, Tuple
from uuid import UUID

from pydantic import BaseModel


class Product(BaseModel):
    product_id: UUID
    name: str
    price: float
    in_stock: bool
    tags: List[str]
    attributes: Dict[str, str]
    sizes: Set[str]
    coordinates: Tuple[float, float]
    server_ip: IPv4Address


# Valid example
product = Product(
    product_id="12345678-1234-1234-1234-123456789012",
    name="Widget",
    price=29.99,
    in_stock=True,
    tags=["sale", "featured"],
    attributes={"color": "blue", "size": "large"},
    sizes={"S", "M", "L"},
    coordinates=(40.7128, -74.0060),
    server_ip="192.168.1.1",
)
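It can help to see what the validated attributes actually contain. The following sketch uses a smaller, hypothetical Endpoint model (not part of the example above) to show that, in the default lax mode, string and list inputs end up as the declared UUID, set, tuple, and IPv4Address types.

```python
from ipaddress import IPv4Address
from typing import Set, Tuple
from uuid import UUID

from pydantic import BaseModel


class Endpoint(BaseModel):  # hypothetical model, for illustration only
    endpoint_id: UUID
    regions: Set[str]
    coordinates: Tuple[float, float]
    server_ip: IPv4Address


endpoint = Endpoint(
    endpoint_id="12345678-1234-1234-1234-123456789012",  # str -> UUID
    regions=["us", "eu", "us"],                          # list -> set, duplicate dropped
    coordinates=[40.7128, -74.0060],                     # list -> tuple
    server_ip="192.168.1.1",                             # str -> IPv4Address
)
```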
Optional and Default Values
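Annotate a field as Optional[...] when its value may be None, and give it a default when it may be omitted. In Pydantic v2 the two are independent: an Optional field with no default is still required.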
from typing import Optional

from pydantic import BaseModel, Field


class Profile(BaseModel):
    username: str
    display_name: Optional[str] = None
    avatar_url: str = Field(default="https://example.com/default.png")
    bio: str = Field(default="", max_length=500)
    followers: int = Field(default=0, ge=0)  # Greater than or equal to 0
    following: int = Field(default=0, ge=0)
The Field function lets you attach validation constraints directly to a field definition. Common arguments include default, min_length and max_length for strings, ge (greater than or equal), lt (less than), and pattern for regular-expression checks on strings (v1's regex argument was renamed to pattern in v2).
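As a quick illustration of these constraints failing, here is a minimal sketch with a hypothetical Coupon model. The field names and the code format are invented for the example. Each violated constraint produces its own entry in the error list.

```python
from pydantic import BaseModel, Field, ValidationError


class Coupon(BaseModel):  # hypothetical model, for illustration only
    code: str = Field(pattern=r"^[A-Z]{4}-\d{2}$")
    discount_percent: int = Field(ge=1, lt=100)
    note: str = Field(default="", max_length=20)


ok = Coupon(code="SAVE-10", discount_percent=10)

try:
    # Lowercase code, discount not below 100, note longer than 20 characters
    Coupon(code="save10", discount_percent=100, note="x" * 21)
    error_types = []
except ValidationError as exc:
    error_types = [err["type"] for err in exc.errors()]
```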
Custom Validators
Add custom validation logic with the @field_validator decorator:
import re

from pydantic import BaseModel, ValidationInfo, field_validator


class RegistrationForm(BaseModel):
    username: str
    email: str
    password: str
    confirm_password: str

    @field_validator("username")
    @classmethod
    def username_alphanumeric(cls, v: str) -> str:
        if not v.isalnum():
            raise ValueError("Username must be alphanumeric")
        if len(v) < 3 or len(v) > 20:
            raise ValueError("Username must be 3-20 characters")
        return v.lower()

    @field_validator("email")
    @classmethod
    def email_valid(cls, v: str) -> str:
        pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
        if not re.match(pattern, v):
            raise ValueError("Invalid email format")
        return v.lower()

    @field_validator("password")
    @classmethod
    def password_strength(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isupper() for c in v):
            raise ValueError("Password must contain uppercase letter")
        if not any(c.isdigit() for c in v):
            raise ValueError("Password must contain a number")
        return v

    @field_validator("confirm_password")
    @classmethod
    def passwords_match(cls, v: str, info: ValidationInfo) -> str:
        # info.data only holds earlier fields that passed validation, so
        # "password" is absent if it failed its own checks
        if "password" in info.data and v != info.data["password"]:
            raise ValueError("Passwords do not match")
        return v
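Field validators run after Pydantic's own type validation by default. The info parameter exposes the fields validated so far through info.data, which enables cross-field checks such as password confirmation. Fields are validated in definition order, so confirm_password must be declared after password, and a field that failed validation is missing from info.data.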
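For checks that involve several fields, a model-level validator is an alternative that does not depend on field order. The sketch below uses model_validator in "after" mode on a hypothetical PasswordChange model. The validator receives the fully built instance, so both fields are guaranteed to be present.

```python
from pydantic import BaseModel, ValidationError, model_validator


class PasswordChange(BaseModel):  # hypothetical model, for illustration only
    password: str
    confirm_password: str

    @model_validator(mode="after")
    def passwords_match(self) -> "PasswordChange":
        # Runs once all fields have been validated individually
        if self.password != self.confirm_password:
            raise ValueError("Passwords do not match")
        return self


form = PasswordChange(password="Secret123", confirm_password="Secret123")

try:
    PasswordChange(password="Secret123", confirm_password="other")
    message, location = None, None
except ValidationError as exc:
    message = exc.errors()[0]["msg"]
    location = exc.errors()[0]["loc"]
```

The error is attached to the model as a whole (an empty location) rather than to confirm_password, which may or may not suit how you report errors to users.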
Strict Validation Mode
Pydantic v2 introduces strict validation to prevent unexpected type coercion:
from pydantic import BaseModel, ConfigDict, ValidationError


class StrictUser(BaseModel):
    model_config = ConfigDict(strict=True)

    id: int
    name: str
    active: bool


# This works: every value already has the declared type
strict_user = StrictUser(id=1, name="alice", active=True)

# This fails in strict mode: "1" is not an int and 1 is not a bool
try:
    StrictUser(id="1", name="alice", active=1)
except ValidationError as e:
    print(e.errors())
In strict mode, Pydantic rejects values that need coercion. A string “1” for an integer field raises an error instead of being converted. This behavior is useful when you need predictable types from external sources.
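Strictness does not have to be fixed on the model. As a minimal sketch, model_validate also accepts a strict argument, so the same (hypothetical) Event model can be lax for one caller and strict for another.

```python
from pydantic import BaseModel, ValidationError


class Event(BaseModel):  # hypothetical model, for illustration only
    id: int
    active: bool


payload = {"id": "1", "active": "true"}

# Default lax mode: both strings are coerced
lax = Event.model_validate(payload)

# Strict for this call only: the same payload is rejected
try:
    Event.model_validate(payload, strict=True)
    strict_error_types = []
except ValidationError as exc:
    strict_error_types = [err["type"] for err in exc.errors()]
```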
Configuring Validation Modes
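Strictness can be set for a whole model with ConfigDict(strict=...) or for a single field with Field(strict=True). Separately, the mode argument of @field_validator controls whether your function runs before or after Pydantic's own validation: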
from pydantic import BaseModel, ConfigDict, field_validator


class User(BaseModel):
    model_config = ConfigDict(strict=False)  # Default: lax mode

    id: int
    email: str

    @field_validator("email", mode="after")
    @classmethod
    def validate_email(cls, v):
        # Use 'after' mode to run after type coercion
        if "@" not in v:
            raise ValueError("Invalid email")
        return v
The “after” validator mode runs after type coercion, giving you both type safety and custom validation logic.
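To contrast the options, the sketch below combines a field that is strict on its own with a "before" validator that normalises raw input. The Order model and its field names are hypothetical.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class Order(BaseModel):  # hypothetical model, for illustration only
    order_id: int = Field(strict=True)  # no coercion for this field
    quantity: int                       # lax: "3" is accepted
    sku: str

    @field_validator("sku", mode="before")
    @classmethod
    def normalise_sku(cls, v):
        # Runs on the raw input, before Pydantic checks the type
        return v.strip().upper() if isinstance(v, str) else v


order = Order(order_id=7, quantity="3", sku="  ab-12 ")

try:
    Order(order_id="7", quantity=3, sku="ab-12")
    failed_fields = []
except ValidationError as exc:
    failed_fields = [err["loc"][0] for err in exc.errors()]
```

A "before" validator sees whatever the caller passed in, so it should tolerate unexpected types, as the isinstance guard does here.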
Advanced Features in Pydantic v2
Discriminated Unions
When validating data that could match multiple models, discriminated unions make validation deterministic:
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter


class Dog(BaseModel):
    pet_type: Literal["dog"]
    name: str
    breed: str


class Cat(BaseModel):
    pet_type: Literal["cat"]
    name: str
    indoor: bool


# The discriminator names the field whose value selects the model
Pet = Annotated[Union[Dog, Cat], Field(discriminator="pet_type")]
pet_adapter = TypeAdapter(Pet)

dog_data = {"pet_type": "dog", "name": "Buddy", "breed": "Golden Retriever"}
cat_data = {"pet_type": "cat", "name": "Whiskers", "indoor": True}

# Pydantic reads 'pet_type' and validates against the matching model only
dog = pet_adapter.validate_python(dog_data)  # Dog instance
cat = pet_adapter.validate_python(cat_data)  # Cat instance
The discriminator field tells Pydantic which model to use, avoiding ambiguous validation and providing clear error messages.
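Discriminated unions are most often used as a field type inside another model. The sketch below is self-contained: the Household parent model is hypothetical, and each pet model declares pet_type as a Literal so Pydantic can dispatch on it. It also shows the single, specific error produced for an unknown tag.

```python
from typing import Annotated, List, Literal, Union

from pydantic import BaseModel, Field, ValidationError


class Dog(BaseModel):
    pet_type: Literal["dog"]
    name: str
    breed: str


class Cat(BaseModel):
    pet_type: Literal["cat"]
    name: str
    indoor: bool


Pet = Annotated[Union[Dog, Cat], Field(discriminator="pet_type")]


class Household(BaseModel):  # hypothetical parent model
    pets: List[Pet]


home = Household.model_validate(
    {
        "pets": [
            {"pet_type": "dog", "name": "Buddy", "breed": "Golden Retriever"},
            {"pet_type": "cat", "name": "Whiskers", "indoor": True},
        ]
    }
)

try:
    Household.model_validate({"pets": [{"pet_type": "fish", "name": "Bubbles"}]})
    bad_tag_error = None
except ValidationError as exc:
    bad_tag_error = exc.errors()[0]["type"]
```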
Generic Models
Build reusable validation patterns with generics:
from typing import Generic, List, TypeVar

from pydantic import BaseModel

T = TypeVar("T")


class PaginatedResponse(BaseModel, Generic[T]):
    items: List[T]
    total: int
    page: int
    page_size: int
    has_next: bool

    @property
    def total_pages(self) -> int:
        # Ceiling division using integers only
        return (self.total + self.page_size - 1) // self.page_size


# Minimal item models so the example runs on its own
class User(BaseModel):
    id: int
    username: str


class Product(BaseModel):
    name: str
    price: float


# Use with different item types
UserResponse = PaginatedResponse[User]
user_page = UserResponse(
    items=[User(id=1, username="alice"), User(id=2, username="bob")],
    total=100,
    page=1,
    page_size=10,
    has_next=True,
)

ProductResponse = PaginatedResponse[Product]
product_page = ProductResponse(
    items=[Product(name="Widget", price=29.99)],
    total=50,
    page=1,
    page_size=10,
    has_next=True,
)
Generic models preserve type information, making your code both reusable and type-safe.
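The type parameter is enforced at runtime, not only by the type checker. A minimal sketch with a trimmed-down, hypothetical Page model shows items being validated against the parameter, with the error location pointing at the offending list index.

```python
from typing import Generic, List, TypeVar

from pydantic import BaseModel, ValidationError

T = TypeVar("T")


class Page(BaseModel, Generic[T]):  # hypothetical, trimmed-down page model
    items: List[T]
    total: int


# Items are validated (and, in lax mode, coerced) as int
int_page = Page[int](items=["1", 2], total=2)

try:
    Page[int](items=["one"], total=1)
    bad_loc = None
except ValidationError as exc:
    bad_loc = exc.errors()[0]["loc"]
```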
ConfigDict
Pydantic v2 replaces the old Config class with ConfigDict:
from pydantic import BaseModel, ConfigDict


class User(BaseModel):
    model_config = ConfigDict(
        str_max_length=100,         # Max length for strings
        str_min_length=3,           # Min length for strings
        validate_assignment=True,   # Validate on assignment
        extra="forbid",             # Reject extra fields
        frozen=True,                # Make immutable
        populate_by_name=True,      # Accept both alias and field name
        alias_generator=str.lower,  # Generate aliases from field names
    )

    UserName: str  # Will have alias "username"
ConfigDict provides all configuration options in a dictionary format, making it easier to configure models dynamically.
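Two of these options are easy to observe directly. The sketch below uses a hypothetical Settings model to show extra="forbid" rejecting a misspelled key and frozen=True rejecting assignment after construction.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class Settings(BaseModel):  # hypothetical model, for illustration only
    model_config = ConfigDict(extra="forbid", frozen=True)

    host: str
    port: int = 8080


settings = Settings(host="localhost")

try:
    Settings(host="localhost", prot=9000)  # "prot" is a deliberate typo
    extra_error = None
except ValidationError as exc:
    extra_error = exc.errors()[0]["type"]

try:
    settings.port = 9000  # frozen models reject assignment
    frozen_error = None
except ValidationError as exc:
    frozen_error = exc.errors()[0]["type"]
```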
Performance Optimization
Model Caching
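Pydantic builds each model's validator once, when the class is defined, so the model itself needs no extra caching. To avoid repeating expensive work inside a validator, memoize the underlying pure function with functools.lru_cache from the standard library:
from functools import lru_cache

from pydantic import BaseModel, field_validator


@lru_cache(maxsize=1024)
def is_known_region(code: str) -> bool:
    # Stand-in for an expensive lookup (database, remote service, large file)
    return code in {"us-east-1", "eu-west-1"}


class Config(BaseModel):
    region: str

    @field_validator("region")
    @classmethod
    def validate_region(cls, v: str) -> str:
        # Repeated values are answered from the cache
        if not is_known_region(v):
            raise ValueError("Unknown region")
        return v
lru_cache stores one result per argument, so repeated values skip the expensive lookup. It requires hashable arguments and should only wrap functions without side effects.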
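A related habit is to build validators once and reuse them. The sketch below assumes a hypothetical Reading model and creates a single module-level TypeAdapter for a list of readings, instead of constructing one on every call. It also parses JSON text directly with validate_json rather than going through json.loads first.

```python
from typing import List

from pydantic import BaseModel, TypeAdapter


class Reading(BaseModel):  # hypothetical model, for illustration only
    sensor: str
    value: float


# Built once at import time and reused for every call
READINGS = TypeAdapter(List[Reading])


def parse_readings(raw_json: str) -> List[Reading]:
    # Parses and validates the JSON text in a single pass
    return READINGS.validate_json(raw_json)


readings = parse_readings('[{"sensor": "a", "value": 1.5}, {"sensor": "b", "value": 2}]')
```

How much this saves depends on the workload, so it is worth measuring before and after.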
Reusing Field Definitions
When the same field appears in multiple models, define it once and reuse:
from pydantic import BaseModel, Field

id_field = Field(description="Unique identifier", ge=1)
name_field = Field(max_length=100, min_length=1)


class User(BaseModel):
    id: int = id_field
    name: str = name_field


class Product(BaseModel):
    id: int = id_field
    product_name: str = name_field
Reusing field definitions ensures consistent validation across your codebase.
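Another way to share a definition is to bundle the type and its constraints into an Annotated alias, so the constraint travels with the type. The alias and model names below are hypothetical. This is a sketch of the pattern, not a required style.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError

# Hypothetical reusable aliases: type and constraints defined together
PositiveId = Annotated[int, Field(description="Unique identifier", ge=1)]
ShortName = Annotated[str, Field(min_length=1, max_length=100)]


class Customer(BaseModel):
    id: PositiveId
    name: ShortName


class Item(BaseModel):
    id: PositiveId
    item_name: ShortName


customer = Customer(id=1, name="Alice")

try:
    Item(id=0, item_name="")
    failed = []
except ValidationError as exc:
    failed = [(err["loc"][0], err["type"]) for err in exc.errors()]
```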
Integration with FastAPI
FastAPI supports Pydantic v2 (from FastAPI 0.100.0 onward) and uses your models for request validation:
from fastapi import FastAPI
from pydantic import BaseModel, EmailStr, ValidationInfo, field_validator

app = FastAPI()


class UserCreate(BaseModel):
    # EmailStr needs the email-validator package: pip install "pydantic[email]"
    email: EmailStr
    password: str
    confirm_password: str

    @field_validator("confirm_password")
    @classmethod
    def passwords_match(cls, v: str, info: ValidationInfo) -> str:
        # "password" is missing from info.data if it failed validation
        if "password" in info.data and v != info.data["password"]:
            raise ValueError("Passwords do not match")
        return v


@app.post("/users/")
async def create_user(user: UserCreate):
    # user is already validated
    return {"email": user.email, "status": "created"}
FastAPI automatically generates OpenAPI documentation from your Pydantic models, making API documentation automatic and accurate.
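FastAPI builds its OpenAPI document from the JSON Schema that Pydantic generates for each model. You can inspect that schema without running a server. The sketch below uses a plain str for email so it runs without the optional email-validator dependency, and the min_length on password is added only for illustration.

```python
from pydantic import BaseModel, Field


class UserCreate(BaseModel):
    email: str  # plain str here to avoid the optional email-validator dependency
    password: str = Field(min_length=8)  # constraint added for illustration
    confirm_password: str


# The same schema FastAPI would embed in its OpenAPI document
schema = UserCreate.model_json_schema()
required_fields = schema["required"]
password_schema = schema["properties"]["password"]
```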
Response Models
Control API response structures with response models:
from typing import List

from fastapi import FastAPI
from pydantic import BaseModel


class User(BaseModel):
    id: int
    username: str
    email: str
    password: str  # Stored internally, must not be returned


class UserOut(BaseModel):
    id: int
    username: str
    email: str


app = FastAPI()


@app.get("/users/", response_model=List[UserOut])
async def list_users():
    # FastAPI validates the return value against UserOut, which drops password
    return [{"id": 1, "username": "alice", "email": "a@b.com", "password": "secret"}]
The response model filters output to the fields it declares, so a sensitive field such as password is dropped only when the response model omits it.
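The filtering itself can be reproduced with Pydantic alone. In this sketch a hypothetical UserOut model declares only the public fields. Because unknown keys are ignored by default, validating a full record and dumping it again leaves the password behind.

```python
from pydantic import BaseModel


class UserOut(BaseModel):  # hypothetical output model without the password field
    id: int
    username: str
    email: str


record = {"id": 1, "username": "alice", "email": "a@b.com", "password": "secret"}

# Extra keys are ignored by default, so "password" never reaches the model
public = UserOut.model_validate(record).model_dump()
```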
Error Handling
Pydantic v2 provides detailed error information:
from pydantic import BaseModel, Field, ValidationError, field_validator


class User(BaseModel):
    id: int
    username: str = Field(min_length=3)
    email: str
    age: int = Field(ge=0, le=150)

    @field_validator("email")
    @classmethod
    def email_has_at_sign(cls, v: str) -> str:
        # A plain str field accepts any text, so check the format explicitly
        if "@" not in v:
            raise ValueError("Invalid email format")
        return v


try:
    User(
        id="not_an_int",  # Wrong type
        username="ab",    # Too short
        email="invalid",  # Invalid format
        age=-5,           # Below minimum
    )
except ValidationError as e:
    for error in e.errors():
        print(f"Field: {error['loc']}")
        print(f"Error: {error['msg']}")
        print(f"Type: {error['type']}")
Output:
Field: ('id',)
Error: Input should be a valid integer, unable to parse string as an integer
Type: int_parsing
Field: ('username',)
Error: String should have at least 3 characters
Type: string_too_short
Field: ('email',)
Error: Value error, Invalid email format
Type: value_error
Field: ('age',)
Error: Input should be greater than or equal to 0
Type: greater_than_equal
Each error includes the field location, human-readable message, and error type for programmatic handling.
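For an API response or a form, it is often convenient to group messages by field. The helper below, errors_by_field, is a hypothetical name for a small sketch of that idea, applied to an equally hypothetical Signup model. It relies only on ValidationError.errors() and error_count().

```python
from typing import Dict, List

from pydantic import BaseModel, Field, ValidationError


class Signup(BaseModel):  # hypothetical model, for illustration only
    username: str = Field(min_length=3)
    age: int = Field(ge=0, le=150)


def errors_by_field(exc: ValidationError) -> Dict[str, List[str]]:
    """Group error messages under a dotted field path."""
    grouped: Dict[str, List[str]] = {}
    for err in exc.errors():
        field = ".".join(str(part) for part in err["loc"])
        grouped.setdefault(field, []).append(err["msg"])
    return grouped


try:
    Signup(username="ab", age=200)
    problems, count = {}, 0
except ValidationError as exc:
    problems = errors_by_field(exc)
    count = exc.error_count()
```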
Custom Error Messages
Provide custom messages for validation errors:
from pydantic import BaseModel, ValidationError, field_validator


class Registration(BaseModel):
    username: str

    @field_validator("username")
    @classmethod
    def validate_username(cls, v: str) -> str:
        if len(v) < 3:
            raise ValueError("Username must be at least 3 characters")
        if not v[0].isalpha():
            raise ValueError("Username must start with a letter")
        return v


try:
    Registration(username="123")
except ValidationError as e:
    for error in e.errors():
        print(f"{error['loc']}: {error['msg']}")
Custom messages improve user experience when validation fails.
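When client code needs to branch on the kind of failure, a custom error type can be more useful than a message alone. One option is PydanticCustomError from pydantic_core, sketched below. The error type string and message template are made up for this example.

```python
from pydantic import BaseModel, ValidationError, field_validator
from pydantic_core import PydanticCustomError


class Registration(BaseModel):
    username: str

    @field_validator("username")
    @classmethod
    def validate_username(cls, v: str) -> str:
        if len(v) < 3:
            # Arguments: custom error type, message template, template context
            raise PydanticCustomError(
                "username_too_short",
                "Username needs at least {min_length} characters",
                {"min_length": 3},
            )
        return v


try:
    Registration(username="ab")
    first = None
except ValidationError as exc:
    first = exc.errors()[0]
```

Unlike a plain ValueError, the message is reported without the "Value error, " prefix, and the type field carries your own identifier.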
Best Practices
Keep Models Focused
Create separate models for different use cases rather than one model that does everything:
from datetime import datetime

from pydantic import BaseModel, EmailStr, Field


# Input model: minimal fields for creation
class UserCreate(BaseModel):
    email: EmailStr
    password: str
    username: str


# Output model: excludes sensitive fields
class UserResponse(BaseModel):
    id: int
    username: str
    email: str
    created_at: datetime


# Update model: all fields optional (the X | None syntax needs Python 3.10+)
class UserUpdate(BaseModel):
    username: str | None = None
    email: EmailStr | None = None
    bio: str | None = Field(default=None, max_length=500)
Separate models make your API clearer and prevent accidental data exposure.
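An all-optional update model raises a practical question: how to tell "not sent" apart from "explicitly set to None". A minimal sketch, using a reduced UserUpdate and a plain dict as a stand-in for stored data, is to dump only the fields the client actually provided with exclude_unset=True.

```python
from typing import Optional

from pydantic import BaseModel, Field


class UserUpdate(BaseModel):  # reduced version, for illustration only
    username: Optional[str] = None
    bio: Optional[str] = Field(default=None, max_length=500)


stored = {"username": "alice", "bio": "Hello"}  # stand-in for a database row

# The client clears the bio and does not mention the username
patch = UserUpdate(bio=None)
changes = patch.model_dump(exclude_unset=True)
stored.update(changes)
```

Without exclude_unset, the dump would also contain username=None and overwrite the stored name.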
Use Annotated for Complex Types
The Annotated type from Python 3.9+ provides cleaner field definitions:
from typing import Annotated

from pydantic import AfterValidator, BaseModel, Field


def validate_positive(v: int) -> int:
    if v <= 0:
        raise ValueError("Must be positive")
    return v


class Transaction(BaseModel):
    amount: Annotated[int, Field(gt=0), AfterValidator(validate_positive)]
    description: Annotated[str, Field(max_length=200)]
Annotated fields are more readable than nested validators for complex validation chains.
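A validation chain can also mix stages. The sketch below defines a hypothetical Slug type that first cleans the raw input with BeforeValidator and then checks the resulting string with AfterValidator. The helper functions and the Article model are invented for the example.

```python
from typing import Annotated

from pydantic import AfterValidator, BaseModel, BeforeValidator, ValidationError


def strip_whitespace(v):
    # Runs on the raw input, before type validation
    return v.strip() if isinstance(v, str) else v


def must_be_lowercase(v: str) -> str:
    # Runs on the validated str
    if v != v.lower():
        raise ValueError("Must be lowercase")
    return v


Slug = Annotated[str, BeforeValidator(strip_whitespace), AfterValidator(must_be_lowercase)]


class Article(BaseModel):  # hypothetical model, for illustration only
    slug: Slug


article = Article(slug="  hello-world ")

try:
    Article(slug="Hello")
    slug_error = None
except ValidationError as exc:
    slug_error = exc.errors()[0]["msg"]
```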
Document Your Models
Add docstrings to models and fields:
from pydantic import BaseModel, Field


class User(BaseModel):
    """
    Represents a user in the system.

    Attributes:
        id: Unique identifier for the user
        username: Display name, must be 3-20 characters
        email: Valid email address
        bio: Optional user biography
    """

    id: int
    username: str = Field(..., min_length=3, max_length=20)
    email: str
    bio: str | None = None
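The model docstring becomes the description in the JSON Schema that Pydantic generates, so it shows up in tools such as FastAPI's OpenAPI docs and helps other developers understand your models. Per-field descriptions are taken from Field(description=...) rather than from the Attributes section of the docstring.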
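To check what documentation tools will actually see, you can inspect the generated schema. This sketch uses a shortened User model with Field(description=...) on each field. The description texts are examples only.

```python
from typing import Optional

from pydantic import BaseModel, Field


class User(BaseModel):
    """Represents a user in the system."""

    id: int = Field(description="Unique identifier for the user")
    username: str = Field(min_length=3, max_length=20, description="Display name")
    bio: Optional[str] = Field(default=None, description="Optional user biography")


schema = User.model_json_schema()
model_description = schema["description"]
id_description = schema["properties"]["id"]["description"]
```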
Summary
Pydantic v2 improves data validation in Python. The library combines type safety with runtime validation, catching errors early while keeping code readable. Key takeaways from this guide:
Type hints define validation rules, keeping your code clean and type-safe. The strict validation mode prevents unexpected coercion when you need predictable types. Custom validators handle complex business logic beyond basic type checking. Performance optimizations in v2 make validation fast even for complex models.
Integration with FastAPI provides automatic documentation and request validation. Detailed error messages help users fix input problems quickly. Generic models and discriminated unions enable advanced validation patterns.
Start using Pydantic v2 in your next project.