Excessive Data Exposure in FastAPI with Firestore
Excessive Data Exposure in FastAPI with Firestore: how this specific combination creates or exposes the vulnerability
Excessive Data Exposure occurs when an API returns more data than an operation needs, often including sensitive fields that should remain restricted. In a FastAPI application backed by Google Cloud Firestore, this commonly happens when query endpoints return entire documents instead of projecting specific fields. Firestore documents can contain nested maps, authentication tokens, internal metadata, or personally identifiable information (PII). If the endpoint does not explicitly limit which fields are returned, clients receive the full document, including secrets such as api_keys, internal_ids, or role flags.
FastAPI’s response models can propagate every Firestore field when developers map query results directly to Pydantic models that mirror the document structure. When a GET or LIST route uses a model that includes every possible key, the API surface expands. For example, a user profile endpoint might return administrative flags, password reset tokens, or internal status fields that should only be visible to privileged services. Attackers making unauthenticated or low-privilege requests can harvest these fields to enable horizontal privilege escalation, social engineering, or further API abuse.
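As a minimal sketch of the difference, the snippet below uses a plain dict to stand in for a Firestore user document. The field names, the PUBLIC_FIELDS allowlist, and the to_public helper are hypothetical and exist only for illustration; nothing here calls FastAPI or Firestore.

```python
# Hypothetical stand-in for doc.to_dict() on a Firestore user document.
user_doc = {
    "user_id": "u_123",
    "email": "ada@example.com",
    "display_name": "Ada",
    "is_admin": False,
    "password_reset_token": "example-token",
    "api_key": "example-key",
}

# Assumed allowlist: the only fields this endpoint is meant to expose.
PUBLIC_FIELDS = ("user_id", "email", "display_name")


def to_public(doc: dict) -> dict:
    """Copy only allowlisted keys; every other key is dropped by default."""
    return {key: doc[key] for key in PUBLIC_FIELDS if key in doc}


# Mirrors the stored document: every field leaves the API.
leaky_response = dict(user_doc)

# Only the three intended fields leave the API.
safe_response = to_public(user_doc)

# Fields that the mirrored response exposes and the allowlisted one does not.
print(sorted(set(leaky_response) - set(safe_response)))
```

Because the helper copies named keys rather than deleting known-bad ones, a field added to the document later stays private until someone deliberately adds it to the allowlist.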
The combination of FastAPI’s automatic schema generation and Firestore’s schemaless nature amplifies the risk. Because Firestore does not enforce a fixed schema at write time, documents within a collection can diverge in their fields. A FastAPI route that deserializes documents into a generic dict or a loosely defined Pydantic model may expose fields that appear only in a subset of records but are nonetheless sensitive. This inconsistency can lead to intermittent exposures that are difficult to detect without runtime scanning. Additionally, when Firestore references are embedded in responses (e.g., Cloud Storage paths or callable URLs), those references can disclose internal resource topology.
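One way to surface this kind of drift is to audit sampled documents against the fields an endpoint expects. The sketch below is an assumption-laden illustration: unexpected_fields and the sample data are hypothetical, and the documents are plain dicts standing in for Firestore snapshots converted with to_dict().

```python
from collections import Counter


def unexpected_fields(docs, allowed):
    """Count how many documents carry each field outside the allowlist."""
    counts = Counter()
    for doc in docs:
        for key in doc:
            if key not in allowed:
                counts[key] += 1
    return dict(counts)


# Hypothetical sample: three documents from one collection with diverging fields.
sample_docs = [
    {"user_id": "u1", "email": "a@example.com"},
    {"user_id": "u2", "email": "b@example.com", "legacy_token": "example"},
    {"user_id": "u3", "email": "c@example.com", "internal_status": "flagged"},
]

report = unexpected_fields(sample_docs, allowed={"user_id", "email"})
print(report)
```

A field that shows up in only a handful of documents is exactly the kind that a route returning raw dicts would leak intermittently.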
In the OWASP API Security Top 10, Excessive Data Exposure was listed as API3:2019 and, in the 2023 edition, was merged with Mass Assignment into API3:2023 Broken Object Property Level Authorization. It frequently overlaps with API8:2023 Security Misconfiguration, where unnecessary exposure results from insecure default behaviors or incomplete filtering. Unlike authenticated scenarios where access controls might mitigate exposure, unauthenticated or weakly authenticated endpoints in FastAPI with Firestore are particularly vulnerable. Attackers can probe these endpoints to map data structures and identify high-value fields that can be leveraged in chained attacks such as BOLA/IDOR or privilege escalation.
middleBrick’s LLM/AI Security checks are designed to detect scenarios where AI-driven probes could infer sensitive fields through iterative queries. By running active prompt injection tests and output scanning, the platform can highlight endpoints that leak information beyond intended boundaries. For teams using the CLI (middlebrick scan
Firestore-Specific Remediation in FastAPI: concrete code fixes
To remediate Excessive Data Exposure in FastAPI with Firestore, explicitly control which fields are retrieved and returned. Use projection to limit fields at the query level and define strict Pydantic response models that exclude sensitive keys. This ensures that even if Firestore documents contain extra fields, the API surface remains minimal and intentional.
1. Use select() to limit returned fields
Firestore’s select method reduces data transfer and prevents unintended field exposure. In FastAPI, apply the projection before converting documents to dictionaries.
from typing import List

from fastapi import FastAPI
from google.cloud import firestore
from pydantic import BaseModel

app = FastAPI()
db = firestore.Client()


class UserPublic(BaseModel):
    user_id: str
    email: str
    display_name: str


@app.get("/users/public", response_model=List[UserPublic])
def list_public_users():
    # Project at the query level so other fields are never fetched.
    docs = db.collection("users").select(["user_id", "email", "display_name"]).stream()
    return [doc.to_dict() for doc in docs]
2. Avoid returning document references and internal metadata
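Firestore snapshots carry metadata such as the document path (__name__), create_time, and update_time, and documents may also store references to other documents. Any of these can leak if copied into a response. Project only the fields you need, or strip internal keys in the response layer.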
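import re

from fastapi import HTTPException


def sanitize_doc(data: dict) -> dict:
    # Remove internal metadata: keys starting with "__" or named exactly
    # create_time / update_time (the alternation is grouped so both anchors apply).
    return {k: v for k, v in data.items() if not re.match(r"^(__.*|create_time|update_time)$", k)}


@app.get("/users/safe")
def get_user_safe(user_id: str):
    # DocumentReference has no select(); pass field_paths to get() to project.
    doc = db.collection("users").document(user_id).get(field_paths=["email", "display_name"])
    if not doc.exists:
        raise HTTPException(status_code=404, detail="not found")
    return sanitize_doc(doc.to_dict())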
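The sanitizer above only inspects top-level keys. As a hedged sketch, the same idea can be applied recursively so that metadata-like keys inside nested maps and arrays are dropped too. The BLOCKED_KEYS set and strip_internal helper are hypothetical names, and plain dicts and lists stand in for Firestore map and array values.

```python
# Assumed blocklist of exact key names treated as internal metadata.
BLOCKED_KEYS = {"create_time", "update_time"}


def strip_internal(value):
    """Recursively drop "__"-prefixed and blocklisted keys from maps and arrays."""
    if isinstance(value, dict):
        return {
            key: strip_internal(item)
            for key, item in value.items()
            if not key.startswith("__") and key not in BLOCKED_KEYS
        }
    if isinstance(value, list):
        return [strip_internal(item) for item in value]
    # Scalars (strings, numbers, booleans, None) pass through unchanged.
    return value


# Hypothetical document with metadata-like keys at several depths.
raw = {
    "email": "a@example.com",
    "__name__": "projects/example/databases/(default)/documents/users/u1",
    "create_time": 1,
    "profile": {
        "update_time": 2,
        "city": "Oslo",
        "history": [{"__internal": 1, "note": "ok"}],
    },
}

clean = strip_internal(raw)
print(clean)
```

A blocklist like this is a backstop rather than a primary control: it only removes keys someone thought to list, so it works best alongside projection and a strict response model.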
3. Validate nested maps and arrays
Firestore allows nested objects. Define nested Pydantic models to ensure only intended subfields are exposed. This prevents leakage of deeply nested sensitive data.
from typing import Optional

from fastapi import HTTPException
from pydantic import Field


class Address(BaseModel):
    city: str
    country: str


class UserDetailed(BaseModel):
    user_id: str
    email: str
    address: Optional[Address] = Field(default=None)


@app.get("/users/detailed/{user_id}", response_model=UserDetailed)
def get_user_detailed(user_id: str):
    doc = db.collection("users").document(user_id).get()
    if not doc.exists:
        # An error dict would fail UserDetailed validation, so raise a 404 instead.
        raise HTTPException(status_code=404, detail="not found")
    data = doc.to_dict()
    # Pass only the expected keys; the response model then drops any
    # extra subfields nested inside address.
    return {
        "user_id": data.get("user_id"),
        "email": data.get("email"),
        "address": data.get("address"),
    }
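Nested allowlisting can also be expressed with dotted field paths, similar in spirit to the paths Firestore accepts for projection. The following is a minimal sketch under stated assumptions: project and _lookup are hypothetical helpers, documents are plain dicts, and the paths are assumed not to overlap (for example, not both address and address.city).

```python
_MISSING = object()


def _lookup(doc, parts):
    """Walk nested dicts along parts; return _MISSING if any step is absent."""
    current = doc
    for part in parts:
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def project(doc: dict, field_paths) -> dict:
    """Build a new dict containing only the listed dotted paths."""
    result = {}
    for path in field_paths:
        parts = path.split(".")
        value = _lookup(doc, parts)
        if value is _MISSING:
            continue  # absent fields are skipped rather than filled in
        target = result
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        target[parts[-1]] = value
    return result


# Hypothetical document with sensitive data at the top level and inside address.
stored = {
    "user_id": "u1",
    "email": "a@example.com",
    "ssn": "example",
    "address": {"city": "Paris", "country": "FR", "geo": {"lat": 1}},
}

shaped = project(stored, ["user_id", "address.city", "address.country", "phone"])
print(shaped)
```

Note that a path pointing at a whole map copies that map as-is, so listing leaf paths is the safer habit when a nested map may hold sensitive subfields.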
4. Enforce read permissions at the route level
Even with projection, enforce authorization in the route itself. Server-side client libraries authenticate through IAM and bypass Firestore security rules, so FastAPI should not rely on those rules; implement explicit role checks that mirror the access policy the rules express for client apps.
from fastapi import Depends, HTTPException


def require_scope(required: str):
    # get_current_user is the application's own auth dependency (not shown here).
    def checker(user=Depends(get_current_user)):
        if required not in user.get("roles", []):
            raise HTTPException(status_code=403, detail="insufficient scope")
        return user  # hand the authenticated user to the route

    return checker


@app.get("/admin/fields")
def admin_fields(admin=Depends(require_scope("admin"))):
    docs = db.collection("users").select(["email", "role", "status"]).stream()
    return [doc.to_dict() for doc in docs]
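Role checks and projection can be tied together by deriving the field allowlist from the caller's roles. This is a sketch only: FIELDS_BY_ROLE, visible_fields, and shape_for are hypothetical names, the role names are assumptions, and plain dicts stand in for Firestore documents.

```python
# Assumed mapping from role name to the fields that role may read.
FIELDS_BY_ROLE = {
    "admin": {"email", "display_name", "role", "status"},
    "user": {"email", "display_name"},
}


def visible_fields(roles):
    """Union of the allowlists for every known role; unknown roles add nothing."""
    allowed = set()
    for role in roles:
        allowed |= FIELDS_BY_ROLE.get(role, set())
    return allowed


def shape_for(roles, doc: dict) -> dict:
    """Return only the fields the caller's roles permit."""
    allowed = visible_fields(roles)
    return {key: value for key, value in doc.items() if key in allowed}


# Hypothetical stored document.
record = {
    "email": "a@example.com",
    "display_name": "Ada",
    "role": "editor",
    "status": "active",
    "api_key": "example-key",
}

print(shape_for(["user"], record))
print(shape_for(["admin"], record))
```

The result of visible_fields could also be passed to a query-level projection so that fields a caller may not see are never fetched in the first place.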
By combining field projection, strict response models, and response sanitization, FastAPI services can integrate with Firestore while minimizing the attack surface. These practices reduce the likelihood of Excessive Data Exposure and align with compliance mappings to OWASP API Top 10, PCI-DSS, and SOC2. For ongoing assurance, teams on the Pro plan can enable continuous monitoring to detect regressions in field exposure across deployed endpoints.
Related CWEs: Property Authorization
| CWE ID | Name | Severity |
|---|---|---|
| CWE-915 | Mass Assignment | HIGH |