Data Exposure in FastAPI

How Data Exposure Manifests in FastAPI

Data exposure in FastAPI APIs often stems from the framework's automatic serialization of Pydantic models and from its dependency injection system. By default, FastAPI returns every field of a model instance, which can inadvertently leak sensitive fields such as passwords, API keys, or internal identifiers.

A common vulnerability occurs when developers use Pydantic models with sensitive fields but forget to exclude them in responses. Consider this flawed implementation:

from fastapi import FastAPI
from pydantic import BaseModel

class User(BaseModel):
    id: int
    username: str
    password: str  # Never return passwords!
    email: str
    is_admin: bool

app = FastAPI()

def get_user_from_db(user_id: int) -> User:
    # Placeholder standing in for a real database lookup
    return User(id=user_id, username='alice', password='hunter2',
                email='alice@example.com', is_admin=False)

@app.get('/users/{user_id}')
async def get_user(user_id: int):
    user = get_user_from_db(user_id)  # Returns the full User model
    return user  # FastAPI serializes the entire model, including password

Because FastAPI serializes whatever the handler returns, the entire User model, including the password field, reaches the client. The risk is easy to overlook because dependency injection can make data flow less explicit than in frameworks where each response is assembled by hand.

Another FastAPI-specific pattern that leads to data exposure is improper use of response_model in endpoint definitions. Developers might define response models that include sensitive fields, or fail to use response_model_exclude to filter out confidential data.
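
One way this happens in practice is a response model that inherits from an internal model and silently carries its sensitive fields along. The sketch below illustrates the pitfall using Pydantic alone. It assumes Pydantic v2 (for `model_fields`), and the names `UserInDB`, `UserOutLeaky`, `UserOutSafe`, `SENSITIVE`, and `sensitive_fields` are hypothetical, chosen only for this illustration.

```python
from pydantic import BaseModel

# Hypothetical internal model that mirrors a database row
class UserInDB(BaseModel):
    id: int
    username: str
    email: str
    password_hash: str

# Leaky: subclassing the internal model inherits password_hash
class UserOutLeaky(UserInDB):
    display_name: str = ""

# Safer: the response model lists only the fields meant for clients
class UserOutSafe(BaseModel):
    id: int
    username: str
    email: str

# Illustrative deny-list; a real project would maintain its own
SENSITIVE = {"password", "password_hash", "api_key", "token", "secret"}

def sensitive_fields(model: type[BaseModel]) -> set[str]:
    """Return the declared field names that appear in the deny-list."""
    return set(model.model_fields) & SENSITIVE

print(sensitive_fields(UserOutLeaky))
print(sensitive_fields(UserOutSafe))
```

A check like this could run in a unit test over every model used as a response_model, so that an inherited field does not slip into a response unnoticed.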

FastAPI's background tasks and asynchronous endpoints can also create race conditions: when a task and a request handler share state, a client may read incomplete data or an intermediate state before validation and sanitization have finished.

Database queries are another common source of exposure. When using async database libraries such as asyncpg or databases, developers might accidentally return raw query results or ORM objects that contain more data than intended.
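
The difference between a wildcard query and an explicit column list can be shown with a small, self-contained sketch. It uses the standard-library sqlite3 module purely as a stand-in for asyncpg or databases; the table layout and the function names `fetch_user_leaky` and `fetch_user_safe` are assumptions made for this example.

```python
import sqlite3

# In-memory database standing in for the application's real store
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, "
    "email TEXT, password_hash TEXT)"
)
conn.execute(
    "INSERT INTO users VALUES (1, 'alice', 'alice@example.com', 'not-a-real-hash')"
)

def fetch_user_leaky(user_id: int) -> dict:
    # SELECT * returns every column, including password_hash
    row = conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchone()
    return dict(row)

def fetch_user_safe(user_id: int) -> dict:
    # An explicit column list returns only what the client should see
    row = conn.execute(
        "SELECT id, username, email FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return dict(row)

print(sorted(fetch_user_leaky(1)))
print(sorted(fetch_user_safe(1)))
```

If a handler returned the result of the first function directly, the hash would be serialized along with the rest of the row.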

FastAPI-Specific Detection

Detecting data exposure in FastAPI requires examining both the Pydantic models and the endpoint implementations. middleBrick's FastAPI-specific scanning identifies several critical patterns:

The scanner analyzes Pydantic models for sensitive field names (password, secret, api_key, token) and flags endpoints that return these models without proper filtering. It specifically checks for:

  • response_model definitions that include sensitive fields
  • Missing response_model_exclude parameters on endpoints returning user data
  • Database query results that might contain excessive columns
  • Async endpoint implementations where data flow is harder to track
  • Background task implementations that might expose intermediate states
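
The first of these checks can be approximated with a few lines of standard-library Python run against an OpenAPI document. This is only a rough sketch of the idea, not middleBrick's implementation: the `SENSITIVE_NAME` pattern, the function `find_sensitive_response_fields`, and the sample spec are all assumptions made for illustration, and only top-level `$ref` schemas are resolved.

```python
import re

# Illustrative heuristic for sensitive property names
SENSITIVE_NAME = re.compile(r"password|passwd|secret|api[_-]?key|token", re.IGNORECASE)

def find_sensitive_response_fields(spec: dict) -> list[tuple[str, str, str]]:
    """Return (method, path, field) for response properties with sensitive names."""
    findings = []
    schemas = spec.get("components", {}).get("schemas", {})
    for path, operations in spec.get("paths", {}).items():
        for method, op in operations.items():
            if not isinstance(op, dict):
                continue  # skip path-level keys such as "parameters"
            for response in op.get("responses", {}).values():
                for media in response.get("content", {}).values():
                    schema = media.get("schema", {})
                    ref = schema.get("$ref", "")
                    if ref.startswith("#/components/schemas/"):
                        schema = schemas.get(ref.rsplit("/", 1)[-1], {})
                    for field in schema.get("properties", {}):
                        if SENSITIVE_NAME.search(field):
                            findings.append((method.upper(), path, field))
    return findings

# Minimal hand-written spec in the shape FastAPI generates at /openapi.json
sample_spec = {
    "paths": {
        "/users/{user_id}": {
            "get": {
                "responses": {
                    "200": {
                        "content": {
                            "application/json": {
                                "schema": {"$ref": "#/components/schemas/User"}
                            }
                        }
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "User": {
                "properties": {"id": {}, "username": {}, "password": {}, "api_key": {}}
            }
        }
    },
}

for finding in find_sensitive_response_fields(sample_spec):
    print(finding)
```

Name matching like this is a heuristic: it will miss sensitive data stored under innocuous names and may flag harmless fields, so findings still need manual review.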

middleBrick's OpenAPI analysis is particularly effective for FastAPI because it parses the OpenAPI specs that FastAPI generates automatically. The scanner cross-references these specs with its runtime analysis to identify endpoints that claim to return certain data but actually expose more.
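
The cross-referencing idea can be sketched as a comparison between a documented response schema and an actual payload. The helper below is a hypothetical illustration (the name `undocumented_fields` and the sample data are assumptions, not part of middleBrick); it reports any key in the payload that the schema does not declare, descending into nested objects.

```python
def undocumented_fields(schema: dict, payload: dict, prefix: str = "") -> list[str]:
    """List dotted paths of payload keys that the schema does not declare."""
    documented = schema.get("properties", {})
    extras = []
    for key, value in payload.items():
        path = f"{prefix}{key}"
        if key not in documented:
            extras.append(path)
        elif isinstance(value, dict):
            # Recurse into nested objects using the nested schema
            extras.extend(undocumented_fields(documented[key], value, prefix=f"{path}."))
    return extras

# Schema as it might appear in the spec, and a response that exposes more
documented_schema = {
    "properties": {
        "id": {},
        "profile": {"properties": {"bio": {}}},
    }
}
actual_payload = {
    "id": 1,
    "password": "hunter2",
    "profile": {"bio": "hi", "ssn": "000-00-0000"},
}

print(undocumented_fields(documented_schema, actual_payload))
```

Any path reported here marks a spot where the running API returns more than its own documentation promises.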

For LLM/AI security in FastAPI applications, middleBrick detects system prompt leakage by scanning for common LLM format patterns in API responses. Asynchronous endpoints that stream LLM output deserve particular attention, because partial responses can reach the client before the full output has been checked.

The scanner also checks FastAPI's dependency injection patterns, ensuring that injected dependencies don't inadvertently expose sensitive configuration or database connections through their return values.

middleBrick's Property Authorization check is especially relevant for FastAPI, as it verifies that model properties are properly protected and not exposed through multiple access paths.

FastAPI-Specific Remediation

FastAPI provides several native mechanisms to prevent data exposure. The most effective approach is using response_model with proper field exclusion:

from fastapi import FastAPI
from pydantic import BaseModel

# create_user_in_db and get_user_from_db are application-specific helpers (not shown)

class UserCreate(BaseModel):
    username: str
    password: str  # Only for creation
    email: str

class UserRead(BaseModel):
    id: int
    username: str
    email: str
    is_admin: bool

app = FastAPI()

@app.post('/users/', response_model=UserRead)
async def create_user(user: UserCreate):
    # UserCreate contains password, but response uses UserRead
    created_user = create_user_in_db(user)
    return created_user

@app.get('/users/{user_id}', response_model=UserRead)
async def get_user(user_id: int):
    user = get_user_from_db(user_id)
    return user

This pattern separates creation models from read models, ensuring sensitive fields never appear in responses.
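
Roughly speaking, declaring a response_model means the returned data is validated against that model before it is serialized, so fields the model does not declare are dropped. The sketch below reproduces that filtering step with Pydantic alone, assuming Pydantic v2 (`model_validate`, `model_dump`); the `db_row` dictionary is made-up sample data.

```python
from pydantic import BaseModel

class UserRead(BaseModel):
    id: int
    username: str
    email: str
    is_admin: bool

# Made-up record, as a database layer might return it
db_row = {
    "id": 1,
    "username": "alice",
    "password": "hunter2",
    "email": "alice@example.com",
    "is_admin": False,
}

# Undeclared keys such as "password" are ignored by default during validation
safe = UserRead.model_validate(db_row).model_dump()
print(safe)
```

Because the password field is never declared on the read model, it cannot appear in the output even if the database layer starts returning additional columns.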

For more granular control, use response_model_exclude:

from fastapi import FastAPI
from pydantic import BaseModel

# get_user_from_db is an application-specific helper (not shown)

class User(BaseModel):
    id: int
    username: str
    password: str
    email: str
    is_admin: bool

app = FastAPI()

@app.get('/users/{user_id}', response_model=User, response_model_exclude={'password', 'is_admin'})
async def get_user(user_id: int):
    user = get_user_from_db(user_id)
    return user
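
The effect of the exclusion set can be previewed outside a running application. This minimal sketch assumes Pydantic v2 and applies the same set to `model_dump`; the user values are invented for the example.

```python
from pydantic import BaseModel

class User(BaseModel):
    id: int
    username: str
    password: str
    email: str
    is_admin: bool

user = User(
    id=1,
    username="alice",
    password="hunter2",
    email="alice@example.com",
    is_admin=True,
)

# Same set as passed to response_model_exclude in the endpoint above
public_view = user.model_dump(exclude={"password", "is_admin"})
print(public_view)
```

One design consideration: the FastAPI documentation notes that the generated OpenAPI schema still describes the complete model when response_model_exclude is used, so separate read models are usually the clearer choice when the excluded fields should not be advertised at all.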

FastAPI's dependency injection system can be secured by writing dependencies that return only the fields a handler needs, rather than full objects containing sensitive data:

from fastapi import Depends, FastAPI, HTTPException
from sqlalchemy.ext.asyncio import AsyncSession
from database import get_async_db  # application-specific module (not shown)

app = FastAPI()

def get_current_user(db: AsyncSession = Depends(get_async_db)):
    # Returns only necessary user info, not the entire model
    user = get_authenticated_user()  # application-specific helper (not shown)
    if not user:
        raise HTTPException(status_code=401)
    return {'user_id': user.id, 'username': user.username}

@app.get('/protected')
async def protected_endpoint(current_user: dict = Depends(get_current_user)):
    # current_user contains only safe fields
    return {'message': 'Access granted', 'user': current_user}

For database queries, use explicit column selection to avoid exposing unnecessary data:

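from databases import Database
from fastapi import Depends, HTTPException
from sqlalchemy import select

# `app`, the `users` SQLAlchemy Table, and get_database are defined elsewhere in the application
@app.get('/users/{user_id}')
async def get_user(user_id: int, db: Database = Depends(get_database)):
    # SQLAlchemy 1.4+ syntax: columns are passed positionally, not as a list
    query = select(users.c.id, users.c.username, users.c.email).where(users.c.id == user_id)
    result = await db.fetch_one(query)
    if result is None:
        raise HTTPException(status_code=404)
    return result  # Only the selected columns are returned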

middleBrick's CLI tool can verify these fixes:

middlebrick scan https://api.example.com --fastapi

This command runs FastAPI-specific checks and provides detailed reports on any remaining data exposure vulnerabilities.

Related CWEs: dataExposure

CWE ID   Name                                                                Severity
CWE-200  Exposure of Sensitive Information                                   HIGH
CWE-209  Error Information Disclosure                                        MEDIUM
CWE-213  Exposure of Sensitive Information Due to Incompatible Policies      HIGH
CWE-215  Insertion of Sensitive Information Into Debugging Code              MEDIUM
CWE-312  Cleartext Storage of Sensitive Information                          HIGH
CWE-359  Exposure of Private Personal Information (PII)                      HIGH
CWE-522  Insufficiently Protected Credentials                                CRITICAL
CWE-532  Insertion of Sensitive Information into Log File                    MEDIUM
CWE-538  Insertion of Sensitive Information into Externally-Accessible File  HIGH
CWE-540  Inclusion of Sensitive Information in Source Code                   HIGH

Frequently Asked Questions

How does FastAPI's automatic serialization create data exposure risks?
FastAPI automatically serializes Pydantic model instances returned from endpoints, including all fields regardless of sensitivity. If you return a User model containing password or API key fields, those values are included in the JSON response without any additional code. The framework's convenience becomes a security liability when developers don't explicitly exclude sensitive fields using response_model_exclude or separate read/write models.

Can middleBrick detect data exposure in FastAPI applications without access to the source code?
Yes, middleBrick performs black-box scanning that analyzes runtime behavior and API responses. It examines the OpenAPI spec generated by FastAPI, tests endpoints with various inputs, and analyzes actual response payloads to identify sensitive data exposure. The scanner doesn't need source code access or credentials; it works by interacting with the running API and analyzing the responses for patterns that indicate data exposure.