Python developers face a common challenge: building APIs that handle thousands of concurrent requests without slowing down. Traditional synchronous frameworks struggle when traffic spikes, but FastAPI’s async capabilities change the game.
This guide shows you how to use FastAPI's asynchronous features to build APIs that can handle 15,000-20,000 requests per second in benchmarks (up to 10x faster than Flask). You'll learn practical async patterns, avoid common pitfalls, and optimize for production workloads.
Why FastAPI Async Matters
FastAPI adoption jumped 40% in 2025, reaching 38% of Python developers. The reason? Performance that scales.
When you make a database query or API call, synchronous code blocks the entire thread. Your server sits idle, waiting for a response. Async code releases the thread immediately, allowing it to handle other requests while waiting for I/O operations to complete.
Performance comparison:
- FastAPI: 15,000-20,000 requests/second
- Flask: 2,000-3,000 requests/second
- Typical response time: <60 ms (FastAPI) vs >200 ms (Flask)
The difference matters most when handling real-time data, WebSocket connections, or microservices that make multiple external calls.
Understanding Async/Await in FastAPI
FastAPI builds on Python’s async/await syntax and the ASGI standard. Here’s how it works.
Basic Async Endpoint
```python
from fastapi import FastAPI
import httpx

app = FastAPI()

@app.get("/users/{user_id}")
async def get_user(user_id: int):
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{user_id}")
        return response.json()
```
`async def` declares an asynchronous function. The `await` keyword pauses that coroutine until the HTTP request completes, but doesn't block other requests from being processed.
When to Use Async
Use async for I/O-bound operations:
- Database queries
- External API calls
- File operations
- Network requests
Don’t use async for CPU-intensive tasks like image processing or complex calculations. These block the event loop regardless of async syntax.
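The cost of blocking is easy to demonstrate with plain asyncio: a synchronous `time.sleep` stalls every coroutine on the loop, while `await asyncio.sleep` lets waits overlap. A minimal stdlib sketch (timings are approximate):

```python
import asyncio
import time

async def blocking_task():
    time.sleep(0.2)  # synchronous sleep: stalls the whole event loop

async def friendly_task():
    await asyncio.sleep(0.2)  # async sleep: yields control while waiting

async def timed(*coros):
    # Run the coroutines concurrently and measure wall-clock time
    start = time.perf_counter()
    await asyncio.gather(*coros)
    return time.perf_counter() - start

async def main():
    # The blocking tasks run back to back: roughly 0.4 s total
    blocking_time = await timed(blocking_task(), blocking_task())
    # The awaiting tasks overlap: roughly 0.2 s total
    async_time = await timed(friendly_task(), friendly_task())
    return blocking_time, async_time

blocking_time, async_time = asyncio.run(main())
print(f"blocking: {blocking_time:.2f}s  async: {async_time:.2f}s")
```

The same effect applies to any blocking call inside an `async def` endpoint: one slow handler delays every other request on the loop.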
Setting Up FastAPI for Async
Install FastAPI with async support:
```bash
pip install "fastapi[all]" "uvicorn[standard]"
```
This includes:
- Uvicorn: ASGI server that handles async requests
- Starlette: Async web framework FastAPI builds on
- Pydantic: Data validation with type hints
Choosing Async Database Drivers
Synchronous database drivers block the event loop. Use async alternatives:
PostgreSQL:
```python
from databases import Database

database = Database("postgresql://user:pass@localhost/db")

# Note: on_event is deprecated in newer FastAPI versions in favor of
# lifespan handlers, but it still works for simple setups.
@app.on_event("startup")
async def startup():
    await database.connect()

@app.get("/products")
async def list_products():
    query = "SELECT * FROM products LIMIT 10"
    results = await database.fetch_all(query)
    return results
```
MongoDB:
```python
from motor.motor_asyncio import AsyncIOMotorClient

client = AsyncIOMotorClient("mongodb://localhost:27017")
db = client.myapp

@app.get("/documents")
async def get_documents():
    cursor = db.collection.find().limit(10)
    documents = await cursor.to_list(length=10)
    return documents
```
Redis:
```python
import redis.asyncio as aioredis

# The standalone aioredis package is deprecated; its API now lives in
# redis-py as redis.asyncio. from_url creates the client lazily, so no
# await is needed at module level.
redis = aioredis.from_url("redis://localhost")

@app.get("/cache/{key}")
async def get_cached(key: str):
    value = await redis.get(key)
    return {"value": value}
```
Async Patterns and Best Practices
Concurrent Requests with asyncio.gather
When you need data from multiple sources, run requests concurrently:
```python
import asyncio
import httpx

@app.get("/dashboard")
async def get_dashboard():
    async with httpx.AsyncClient() as client:
        user_task = client.get("https://api.example.com/user")
        orders_task = client.get("https://api.example.com/orders")
        stats_task = client.get("https://api.example.com/stats")
        user, orders, stats = await asyncio.gather(
            user_task,
            orders_task,
            stats_task,
        )
    return {
        "user": user.json(),
        "orders": orders.json(),
        "stats": stats.json(),
    }
```
This pattern reduces total response time from 300ms (3 × 100ms sequential) to roughly 100ms, the duration of the slowest call.
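One caveat worth knowing: by default, if any task passed to `asyncio.gather` raises, the first exception propagates and the other results are lost. Passing `return_exceptions=True` collects failures alongside successes. A sketch using stdlib stand-ins for the HTTP calls (`fetch` is illustrative):

```python
import asyncio

async def fetch(name: str, delay: float, fail: bool = False) -> str:
    # Stand-in for an external HTTP call
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} failed")
    return f"{name} ok"

async def main():
    # return_exceptions=True puts exception objects in the result list
    # instead of raising on the first failure
    return await asyncio.gather(
        fetch("user", 0.05),
        fetch("orders", 0.05, fail=True),
        fetch("stats", 0.05),
        return_exceptions=True,
    )

results = asyncio.run(main())
print(results)
```

The endpoint can then decide per-source whether to degrade gracefully or return an error.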
Background Tasks
For operations that don’t need to complete before responding:
```python
from fastapi import BackgroundTasks

def send_email(email: str, message: str):
    # Email sending logic
    pass

@app.post("/signup")
async def signup(email: str, background_tasks: BackgroundTasks):
    # Create user account
    user = await create_user(email)
    # Send welcome email in background
    background_tasks.add_task(send_email, email, "Welcome!")
    return {"user_id": user.id}
```
The response returns first; FastAPI runs the queued task after the response has been sent.
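The same fire-and-return idea can be sketched with plain `asyncio.create_task` (FastAPI's `BackgroundTasks` adds lifecycle handling on top; the names below are illustrative):

```python
import asyncio

log = []

async def send_email(email: str, message: str) -> None:
    # Simulated slow side effect that should not delay the response
    await asyncio.sleep(0.05)
    log.append((email, message))

async def signup(email: str):
    # Schedule the email without awaiting it, then respond right away
    task = asyncio.create_task(send_email(email, "Welcome!"))
    return {"status": "created", "email": email}, task

async def main():
    response, task = await signup("user@example.com")
    sent_before = len(log)  # the background task has not run yet
    await task              # let the background work finish
    sent_after = len(log)
    return response, sent_before, sent_after

response, before, after = asyncio.run(main())
```

Unlike a bare `create_task`, `BackgroundTasks` guarantees the task runs after the response is delivered and works for both sync and async callables.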
Dependency Injection for Resources
Manage database connections and other resources cleanly:
```python
from fastapi import Depends
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

async def get_db() -> AsyncSession:
    async with async_session() as session:
        yield session

@app.get("/items")
async def list_items(db: AsyncSession = Depends(get_db)):
    # SQLAlchemy 2.x requires raw SQL strings to be wrapped in text()
    result = await db.execute(text("SELECT * FROM items"))
    return result.scalars().all()
```
FastAPI handles resource cleanup automatically when the request completes.
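Under the hood, a yield dependency behaves like an async context manager: everything before the `yield` is setup, everything after is teardown. A dependency-free sketch of that pattern (`FakeSession` and the query string are illustrative):

```python
import asyncio
from contextlib import asynccontextmanager

events = []

class FakeSession:
    async def execute(self, query: str) -> str:
        return f"rows for {query!r}"

@asynccontextmanager
async def get_db():
    # Setup runs before the request handler...
    session = FakeSession()
    events.append("opened")
    try:
        yield session
    finally:
        # ...and teardown runs after the response, even on errors
        events.append("closed")

async def list_items():
    async with get_db() as db:
        return await db.execute("SELECT * FROM items")

result = asyncio.run(list_items())
```

The `try`/`finally` around the `yield` is what makes cleanup reliable: the session closes whether the handler succeeds or raises.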
Performance Optimization Techniques
Connection Pooling
Reuse database connections instead of creating new ones:
```python
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
from sqlalchemy.orm import sessionmaker

engine = create_async_engine(
    "postgresql+asyncpg://user:pass@localhost/db",
    pool_size=20,
    max_overflow=0,
)
async_session = sessionmaker(
    engine, class_=AsyncSession, expire_on_commit=False
)
```
This configuration maintains 20 persistent connections, eliminating connection overhead.
Response Caching
Cache expensive operations with Redis:
```python
import json

from fastapi import Response

@app.get("/expensive-data")
async def get_expensive_data():
    # Check cache first
    cached = await redis.get("expensive_data")
    if cached:
        return Response(content=cached, media_type="application/json")
    # Compute if not cached
    data = await compute_expensive_data()
    # Cache for 5 minutes
    await redis.setex("expensive_data", 300, json.dumps(data))
    return data
```
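The same cache-aside pattern works with any store that supports expiry. Here is a dependency-free sketch that mimics Redis's `get`/`setex` in process (`TTLCache` and `compute_expensive_data` are illustrative):

```python
import time

class TTLCache:
    """Minimal in-process cache with per-key expiry (mimics Redis get/setex)."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

cache = TTLCache()
calls = 0

def compute_expensive_data():
    global calls
    calls += 1
    return {"total": 42}

def get_expensive_data():
    # Cache-aside: check the cache, compute on a miss, store with a TTL
    cached = cache.get("expensive_data")
    if cached is not None:
        return cached
    data = compute_expensive_data()
    cache.setex("expensive_data", 300, data)
    return data

first = get_expensive_data()   # miss: computes and stores
second = get_expensive_data()  # hit: served from the cache
```

An in-process cache like this is only valid for a single worker; with multiple Uvicorn workers, a shared store like Redis is what keeps the cache consistent.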
Rate Limiting
Protect your API from abuse:
```python
from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
# Return a 429 response instead of an unhandled exception when the limit is hit
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/api/data")
@limiter.limit("100/minute")
async def get_data(request: Request):  # slowapi requires the request argument
    return {"data": "value"}
```
This limits each IP to 100 requests per minute.
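For intuition, the core of a per-IP limit can be sketched as a sliding-window counter. slowapi's internals differ; `SlidingWindowLimiter` below is illustrative:

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each key (e.g. an IP)."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # key -> timestamps of recent calls

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        hits = self.hits[key]
        # Discard timestamps that have slid out of the window
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit: reject
        hits.append(now)
        return True

limiter = SlidingWindowLimiter(limit=3, window=60)
# Five requests from one IP at one-second intervals: the last two are rejected
results = [limiter.allow("1.2.3.4", now=float(i)) for i in range(5)]
```

Like the cache example, a per-process counter only works for one worker; production deployments usually back the counter with Redis so all workers share the same limits.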
Common Pitfalls to Avoid
Mixing Sync and Async Code
Don’t call synchronous blocking code in async functions:
```python
import asyncio
import time

# Bad: blocks the event loop
@app.get("/bad")
async def bad_endpoint():
    time.sleep(5)  # Blocks everything!
    return {"status": "done"}

# Good: use async sleep
@app.get("/good")
async def good_endpoint():
    await asyncio.sleep(5)  # Releases the event loop
    return {"status": "done"}
```
Forgetting await
Missing await returns a coroutine object instead of the result:
```python
# Wrong
result = database.fetch_all(query)  # Returns a coroutine object

# Correct
result = await database.fetch_all(query)  # Returns the data
```
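Python flags this mistake at runtime with a "coroutine ... was never awaited" warning. A quick stdlib demonstration of the difference:

```python
import asyncio

async def fetch_value() -> int:
    await asyncio.sleep(0)
    return 42

async def main():
    wrong = fetch_value()        # forgot await: this is a coroutine object
    right = await fetch_value()  # awaited: this is the actual result
    is_coroutine = asyncio.iscoroutine(wrong)
    wrong.close()  # close the unawaited coroutine to avoid a RuntimeWarning
    return is_coroutine, right

is_coroutine, right = asyncio.run(main())
```

Type checkers and `python -W error::RuntimeWarning` both catch this class of bug before it reaches production.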
CPU-Bound Operations
For heavy computation, use a process pool:
```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

executor = ProcessPoolExecutor()

def cpu_intensive_task(data: dict) -> dict:
    # Heavy computation runs in a separate process,
    # so it cannot block the event loop
    return {"processed": True, **data}

@app.post("/process")
async def process_data(data: dict):
    loop = asyncio.get_running_loop()  # preferred over get_event_loop() in coroutines
    result = await loop.run_in_executor(executor, cpu_intensive_task, data)
    return result
```
Testing Async Endpoints
Use pytest-asyncio for testing:
```python
import pytest
from httpx import ASGITransport, AsyncClient

@pytest.mark.asyncio
async def test_get_user():
    # Newer httpx versions use ASGITransport instead of the app= shortcut
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as client:
        response = await client.get("/users/1")
    assert response.status_code == 200
    assert response.json()["id"] == 1
```
Production Deployment
Run with Uvicorn and multiple workers:
```bash
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
```
For production, use Gunicorn with Uvicorn workers:
```bash
gunicorn main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
```
Monitor performance with middleware:
```python
import time

from fastapi import Request

@app.middleware("http")
async def add_process_time_header(request: Request, call_next):
    start_time = time.time()
    response = await call_next(request)
    process_time = time.time() - start_time
    response.headers["X-Process-Time"] = str(process_time)
    return response
```
Real-World Performance Gains
A production API handling user authentication saw these improvements after switching to async:
- Request throughput: 3,200 → 18,500 req/s (about a 478% increase, roughly 5.8×)
- P95 latency: 340ms → 45ms (87% reduction)
- Server costs: 8 instances → 2 instances (75% reduction)
The key was replacing synchronous database calls with asyncpg and implementing connection pooling.
Next Steps
Start with these async patterns in your FastAPI projects:
- Replace synchronous database drivers with async versions
- Use `asyncio.gather()` for concurrent external API calls
- Implement background tasks for non-critical operations
- Add connection pooling and caching for frequently accessed data
FastAPI’s async capabilities deliver performance that scales with your application. The patterns in this guide handle production workloads efficiently while keeping code readable and maintainable.
For more FastAPI tutorials, check out our guides on authentication and database integration.