Severity: HIGH | Tags: excessive data exposure, fastapi, api keys

Excessive Data Exposure in FastAPI with API Keys

Excessive Data Exposure in FastAPI with API Keys: how this specific combination creates or exposes the vulnerability

Excessive Data Exposure occurs when an API returns more information than necessary for a given operation, such as full database rows, internal identifiers, or sensitive metadata. In FastAPI applications that rely on API keys for access control, this risk is amplified when key validation is incomplete or when responses are not explicitly filtered. A common pattern is a middleware or dependency that checks only for the presence of an API key and does not enforce field-level authorization, so an attacker who has obtained or guessed a valid key can harvest usernames, email addresses, role names, or internal IDs that should remain restricted.

Consider a FastAPI endpoint that lists user profiles. If the endpoint queries the database without scoping results to the requesting user and returns the complete ORM model, the response may expose IDs, password hashes, email addresses, and timestamps for all users. An API key intended to permit read-only access to public data can thus become a credential that enables mass data extraction when combined with missing row-level filters. This is a classic case of BOLA (Broken Object Level Authorization) compounded by excessive data exposure: the key is valid, the endpoint is reachable, but authorization boundaries are not enforced at the data layer.
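
The difference between an unscoped and a scoped query can be sketched with the standard library alone. This is a minimal illustration, not a FastAPI handler: the `users` table layout, the `KEY_TENANTS` mapping, and both function names are hypothetical and stand in for whatever data layer the application actually uses.

```python
import sqlite3

# Hypothetical mapping from API key to the tenant that key may read.
KEY_TENANTS = {"key-acme": "acme", "key-globex": "globex"}


def make_db() -> sqlite3.Connection:
    """Build a small in-memory table to query against."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, tenant TEXT, "
        "username TEXT, email TEXT, password_hash TEXT)"
    )
    conn.executemany(
        "INSERT INTO users (tenant, username, email, password_hash) "
        "VALUES (?, ?, ?, ?)",
        [
            ("acme", "alice", "alice@example.com", "hash-a"),
            ("acme", "bob", "bob@example.com", "hash-b"),
            ("globex", "carol", "carol@example.com", "hash-c"),
        ],
    )
    return conn


def list_users_unscoped(conn: sqlite3.Connection) -> list:
    # Vulnerable: every row and every column, regardless of who is asking.
    cur = conn.execute("SELECT * FROM users ORDER BY id")
    cols = [c[0] for c in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]


def list_users_scoped(conn: sqlite3.Connection, api_key: str) -> list:
    # Safer: explicit columns, and rows limited to the key's tenant.
    tenant = KEY_TENANTS.get(api_key)
    if tenant is None:
        raise PermissionError("unknown API key")
    cur = conn.execute(
        "SELECT id, username FROM users WHERE tenant = ? ORDER BY id",
        (tenant,),
    )
    return [{"id": row[0], "username": row[1]} for row in cur.fetchall()]
```

With the same valid key, the first function hands back password hashes for every tenant, while the second returns only two columns for the rows that key is bound to.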

Another scenario involves debug or verbose fields included in responses. Pydantic models in FastAPI that serialize entire database records may inadvertently include fields such as internal_state, last_login_ip, or raw pointers to storage objects. If an API key is accepted from untrusted clients without restricting which fields are returned, an attacker can chain a valid key with output scanning to harvest sensitive attributes. This intersects with LLM/AI Security when response content is later consumed by language models; exposed PII or internal code snippets can leak into model outputs, increasing the blast radius of a single key compromise.
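
One way to keep such fields out of responses is an allow-list applied before serialization. The sketch below is framework-agnostic and uses plain dictionaries; `PUBLIC_FIELDS` and `to_public` are hypothetical names, and in a FastAPI application a restrictive response model would normally play this role.

```python
# Hypothetical allow-list: the only attributes a client may ever receive.
PUBLIC_FIELDS = ("id", "username")


def to_public(record: dict) -> dict:
    """Copy only allow-listed fields; anything else on the record stays hidden."""
    return {name: record[name] for name in PUBLIC_FIELDS if name in record}


record = {
    "id": 7,
    "username": "alice",
    "internal_state": "pending_review",
    "last_login_ip": "203.0.113.10",
}
print(to_public(record))  # {'id': 7, 'username': 'alice'}
```

Because the list names what may leave the service rather than what may not, a column added to the record later stays private until someone deliberately exposes it.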

The interaction with OpenAPI/Swagger spec analysis is important here. middleBrick scans unauthenticated attack surfaces and cross-references spec definitions with runtime findings. If the spec documents a /users/{user_id} endpoint that returns a full user object but the implementation does not enforce scope checks per key, the scan can flag the endpoint as high risk for excessive data exposure. The presence of API keys does not automatically imply proper scoping; without explicit authorization on each query, keys become a broad credential that can widen data exposure beyond intended boundaries.
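
The idea of comparing documented fields against what an endpoint actually returns can be illustrated with a few lines of Python. This is only a sketch of the concept and is not how middleBrick is implemented; `undocumented_fields`, the `user_schema` fragment, and the sample body are all made up for illustration, and only top-level keys are checked.

```python
import json


def undocumented_fields(schema: dict, response_body: str) -> set:
    """Return top-level keys in a JSON object response that the schema omits."""
    documented = set(schema.get("properties", {}))
    actual = set(json.loads(response_body))
    return actual - documented


# Hypothetical schema fragment for the documented user object.
user_schema = {
    "type": "object",
    "properties": {"id": {"type": "integer"}, "username": {"type": "string"}},
}
body = '{"id": 1, "username": "alice", "password_hash": "x", "internal_state": "y"}'
print(sorted(undocumented_fields(user_schema, body)))
# ['internal_state', 'password_hash']
```

A non-empty result means the implementation returns more than the spec promises, which is the mismatch described above.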

API Key-Specific Remediation in FastAPI: concrete code fixes

Remediation centers on enforcing strict authorization and minimizing data exposure per request. Use dependency injection to validate the API key and bind it to a scoped identity, then filter query results to return only necessary fields. Avoid returning full ORM models directly; instead, define Pydantic response models that exclude sensitive attributes. Combine this with explicit row-level filtering so that a key only permits access to data the client is explicitly allowed to see.

Example: secure key validation with scopes and filtered response models.

from typing import Optional

from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import APIKeyHeader
from pydantic import BaseModel

app = FastAPI()

# Simulated key store with scopes
KEY_STORE = {
    "public_key": {"scopes": ["read:public"]},
    "private_key": {"scopes": ["read:private", "read:public"]},
}

# Read the key from a request header (the X-API-Key name is an assumption).
# auto_error=False lets the dependency below return its own 401.
api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)

class UserPublic(BaseModel):
    id: int
    username: str
    # email is omitted: public callers do not need it

class UserPrivate(BaseModel):
    id: int
    username: str
    email: str  # returned only to keys holding read:private
    # intentionally omit sensitive fields like password_hash, internal_state

def get_key_scopes(api_key: Optional[str] = Depends(api_key_header)):
    # Missing header: the caller is not authenticated at all
    if not api_key:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="API key required",
        )
    key_data = KEY_STORE.get(api_key)
    if not key_data:
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="Invalid API key",
        )
    return key_data["scopes"]

def require_scope(required: str):
    # Dependency factory: each route names the scope it needs
    def inner(scopes: list = Depends(get_key_scopes)):
        if required not in scopes:
            raise HTTPException(
                status_code=status.HTTP_403_FORBIDDEN,
                detail=f"Missing scope: {required}",
            )
        return scopes
    return inner

@app.get("/users/me", response_model=UserPrivate)
async def read_me(scopes: list = Depends(require_scope("read:private"))):
    # In practice, fetch user from request context or identity derived from key
    # Filter to safe fields only
    return UserPrivate(id=1, username="alice", email="alice@example.com")

@app.get("/users/{user_id}", response_model=UserPublic)
async def read_user(
    user_id: int,
    scopes: list = Depends(require_scope("read:public")),
):
    # Apply row-level filtering: ensure user_id is within allowed scope for this key
    # For example, restrict to user_id matching tenant or group associated with the key
    # This is a placeholder for actual authorization logic
    if user_id < 0:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND)
    return UserPublic(id=user_id, username="alice")

In this example, each endpoint declares an explicit scope requirement via require_scope, and response models strip out sensitive fields. The API key is mapped to scopes, enabling least-privilege access. For endpoints that should never expose certain fields, simply omit them from the Pydantic model; this reduces the chance of accidental leakage even if the ORM model contains more data.

Additional remediation practices include auditing query results to ensure no raw SELECT * is used, applying tenant or user filters at the database level, and using middleware to redact or mask fields when full models must be returned temporarily. middleBrick can validate these controls by scanning endpoints against the spec and runtime behavior, highlighting mismatches where responses include fields not documented or where keys lack scope-based constraints.
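
As a rough illustration of the redaction and masking step mentioned above, the sketch below drops known-sensitive keys and masks email addresses in a record before it is returned. The `SENSITIVE` set and both function names are hypothetical, and the code operates on plain dictionaries rather than hooking into any particular middleware API.

```python
# Hypothetical deny-list of keys that must never leave the service.
SENSITIVE = {"password_hash", "internal_state"}


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the domain; hide the rest."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"  # not a recognizable address, so hide it completely
    return f"{local[0]}***@{domain}"


def redact(record: dict) -> dict:
    """Drop sensitive keys and mask the email field, leaving other values as-is."""
    out = {}
    for key, value in record.items():
        if key in SENSITIVE:
            continue
        out[key] = mask_email(value) if key == "email" else value
    return out


print(redact({"id": 1, "email": "alice@example.com", "password_hash": "x"}))
# {'id': 1, 'email': 'a***@example.com'}
```

A deny-list like this only covers fields someone remembered to list, so it fits the temporary case described above; an explicit response model remains the stronger control.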

Related CWEs: Property Authorization

CWE ID | Name | Severity
CWE-915 | Mass Assignment | HIGH

Frequently Asked Questions

Does using API keys alone prevent excessive data exposure in FastAPI?
No. API keys provide authentication (identifying who is making the request) but do not enforce authorization (what they are allowed to see). Without explicit row-level filtering and field-level controls, a valid key can still retrieve excessive data, so keys must be paired with scoped permissions and minimal response models.
How can I verify my FastAPI endpoints are not exposing excess data?
Define strict Pydantic response models that include only necessary fields, enforce scope-based dependencies for API keys, and validate with automated scans. Tools like middleBrick can cross-reference your OpenAPI spec against runtime responses to detect mismatches and highlight endpoints where models return more data than documented.