PII Leakage in FastAPI with Firestore
PII Leakage in FastAPI with Firestore: how this specific combination creates or exposes the vulnerability
When a FastAPI service uses Google Cloud Firestore as its persistence layer, PII leakage commonly arises from overly broad Firestore read permissions, insufficient server-side field filtering, and direct exposure of raw Firestore documents through API responses. FastAPI endpoint handlers often deserialize Firestore documents into Pydantic models or plain dictionaries and return them to clients. If a handler returns the entire document, fields such as email, phone number, government ID, or internal identifiers are exposed unintentionally.
Firestore security rules are not a substitute for application-layer controls in this context. Rules grant or deny read access to whole documents; they cannot mask or redact individual fields for a given client role. In addition, the server client libraries that a FastAPI backend typically uses authenticate through IAM and bypass security rules entirely. A FastAPI route that performs a read such as doc_ref.get() and returns the full snapshot therefore discloses every field in the document, PII included, regardless of how the rules are written.
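Because per-role field masking has to happen in application code, one way to approach it is a role-to-fields allowlist applied to the dictionary returned by to_dict(). The sketch below is illustrative only: the role names, the field names, and the helper `filter_for_role` are assumptions, not part of FastAPI or Firestore.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical roles and field names, chosen only for illustration.
ROLE_FIELDS: Dict[str, FrozenSet[str]] = {
    "public": frozenset({"display_name", "role"}),
    "support": frozenset({"display_name", "role", "email"}),
    "admin": frozenset({"display_name", "role", "email", "phone"}),
}


def filter_for_role(data: Dict[str, Any], client_role: str) -> Dict[str, Any]:
    """Return only the fields the given client role is allowed to see.

    Unknown roles fall back to the most restrictive ("public") view.
    """
    allowed = ROLE_FIELDS.get(client_role, ROLE_FIELDS["public"])
    return {key: value for key, value in data.items() if key in allowed}
```

A handler could call `filter_for_role(doc.to_dict(), role)` just before returning. Since this is an allowlist, a field that no role lists (for example a newly added government ID) is never returned.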
The combination introduces risk patterns such as:
- Returning complete Firestore documents in JSON responses without removing or hashing sensitive fields.
- Caching full documents in memory or writing them to logs during the FastAPI request lifecycle, increasing the window for accidental exposure.
- Using Firestore query cursors or snapshots in server-side pagination that inadvertently include sensitive fields across multiple requests.
- Deserializing Firestore data into Pydantic models that do not omit sensitive fields by default, allowing PII to flow through serialization.
An attacker who obtains a valid access token, or who finds an unauthenticated endpoint, can enumerate these routes and harvest emails, names, or phone numbers. Unlike a relational database with a fixed schema, Firestore is schemaless, which makes it harder to track which fields are considered sensitive and increases the likelihood of accidental PII inclusion.
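Since a schemaless store gives no central place to mark fields as sensitive, one option is to keep that list in code and scan response payloads for it in tests. The following is a minimal sketch under that assumption; the `SENSITIVE_KEYS` registry and the `find_sensitive_paths` helper are hypothetical names, and the key list would need to match the real data model.

```python
from typing import Any, FrozenSet, List

# Assumed registry of sensitive field names; adapt it to the actual data model.
SENSITIVE_KEYS: FrozenSet[str] = frozenset(
    {"email", "phone", "ssn", "passport", "government_id"}
)


def find_sensitive_paths(
    payload: Any,
    sensitive: FrozenSet[str] = SENSITIVE_KEYS,
    prefix: str = "",
) -> List[str]:
    """Return the dotted paths of sensitive keys found anywhere in a payload."""
    found: List[str] = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            # Key names are compared case-insensitively.
            if str(key).lower() in sensitive:
                found.append(path)
            found.extend(find_sensitive_paths(value, sensitive, path))
    elif isinstance(payload, (list, tuple)):
        for index, item in enumerate(payload):
            found.extend(find_sensitive_paths(item, sensitive, f"{prefix}[{index}]"))
    return found
```

In an endpoint test, asserting that `find_sensitive_paths(response.json())` is empty would flag a route that starts returning a registered field, including one nested inside a map or array.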
Firestore-Specific Remediation in FastAPI: concrete code fixes
Remediation focuses on minimizing the data returned by FastAPI routes, enforcing field-level filtering before serialization, and ensuring Firestore queries and document reads exclude PII. Below are concrete examples tailored for FastAPI with Firestore.
1. Selective field projection with to_dict() and explicit includes
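Instead of returning the entire document snapshot, explicitly pick safe fields. Firestore's to_dict() returns a Python dict; build the response from an allowlist of safe keys rather than deleting known sensitive ones, so that fields added to the document later are not exposed by default.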
from fastapi import FastAPI, HTTPException
from google.cloud import firestore

app = FastAPI()
db = firestore.Client()


@app.get("/users/{user_id}")
def get_user_safe(user_id: str):
    doc_ref = db.collection("users").document(user_id)
    doc = doc_ref.get()
    if not doc.exists:
        raise HTTPException(status_code=404, detail="User not found")
    data = doc.to_dict()
    created_at = data.get("created_at")
    # Build the response from an allowlist so PII never reaches the client
    safe_data = {
        "user_id": doc.id,
        "display_name": data.get("display_name"),
        "role": data.get("role"),
        # Avoid returning the literal string "None" when the field is missing
        "created_at": str(created_at) if created_at is not None else None,
    }
    return safe_data
2. Using a Pydantic response model that exposes only intended fields
Define a Pydantic model that only exposes intended fields and use it to serialize Firestore data, ensuring PII fields are never included by default.
from typing import Optional

from fastapi import FastAPI, HTTPException
from google.cloud import firestore
from pydantic import BaseModel

app = FastAPI()
db = firestore.Client()


class UserPublic(BaseModel):
    # Only the fields declared here can appear in the response
    user_id: str
    display_name: Optional[str] = None
    role: Optional[str] = None
    created_at: Optional[str] = None


def build_user_public(doc):
    data = doc.to_dict()
    created_at = data.get("created_at")
    return UserPublic(
        user_id=doc.id,
        display_name=data.get("display_name"),
        role=data.get("role"),
        # Keep a missing timestamp as None instead of the string "None"
        created_at=str(created_at) if created_at is not None else None,
    )


@app.get("/users/{user_id}", response_model=UserPublic)
def get_user_public(user_id: str):
    doc_ref = db.collection("users").document(user_id)
    doc = doc_ref.get()
    if not doc.exists:
        raise HTTPException(status_code=404, detail="User not found")
    return build_user_public(doc)
3. Server-side field selection in queries
When possible, structure queries to limit the fields returned. The Firestore server client libraries support field projection through select() on a query, although the web and mobile SDKs do not. Where projection is not practical, filter fields in application code after the read, as in the example that follows. For sensitive collections, prefer smaller documents and move highly sensitive fields to a separate, restricted collection with stricter access controls.
from fastapi import FastAPI, HTTPException
from google.cloud import firestore

app = FastAPI()
db = firestore.Client()

# Denylist of sensitive keys; new sensitive fields must be added here
SENSITIVE_FIELDS = {"ssn", "passport", "private_notes"}


def get_restricted_user_fields(user_id: str):
    doc_ref = db.collection("profiles").document(user_id)
    doc = doc_ref.get()
    if not doc.exists:
        return {}
    # Only include non-sensitive fields
    safe = {k: v for k, v in doc.to_dict().items() if k not in SENSITIVE_FIELDS}
    safe["user_id"] = doc.id
    return safe


@app.get("/profiles/{user_id}")
def read_profile(user_id: str):
    data = get_restricted_user_fields(user_id)
    if not data:
        raise HTTPException(status_code=404, detail="Profile not found")
    return data
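For list endpoints, the Python client's select() method on a collection or query asks Firestore to return only the named fields, so sensitive values never leave the database for that request. The sketch below assumes a "users" collection and a hypothetical `PUBLIC_FIELDS` allowlist; the function takes the client as a parameter, so `db` is expected to be a google.cloud.firestore.Client created elsewhere.

```python
from typing import Any, Dict, List

# Hypothetical allowlist of non-sensitive field names.
PUBLIC_FIELDS = ["display_name", "role", "created_at"]


def list_public_users(db: Any, limit: int = 20) -> List[Dict[str, Any]]:
    """Fetch users with only the allowlisted fields projected by Firestore.

    `db` is expected to be a google.cloud.firestore.Client instance.
    """
    query = db.collection("users").select(PUBLIC_FIELDS).limit(limit)
    results: List[Dict[str, Any]] = []
    for snapshot in query.stream():
        row = snapshot.to_dict() or {}
        # The document ID is not part of the projected data, so add it explicitly.
        row["user_id"] = snapshot.id
        results.append(row)
    return results
```

Projection reduces what is read into the process, but it is still worth passing the result through a response model, since the allowlist and the API contract can drift apart.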
4. Secure Firestore rules and operational practices
Use Firestore rules to restrict which documents and collections can be read, but do not rely on rules alone to hide PII. Combine rules with server-side field removal as shown above. Avoid logging full documents or query snapshots in application logs, rotate service account keys, and restrict service account permissions to follow the principle of least privilege.
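To keep full documents out of logs, one approach is to mask sensitive values before anything is handed to the logger. This is a minimal sketch using the standard logging module; the key set `SENSITIVE_LOG_KEYS`, the logger name, and both helper functions are assumptions made for illustration.

```python
import logging
from typing import Any, Dict

# Assumed set of sensitive field names; adapt it to the actual data model.
SENSITIVE_LOG_KEYS = frozenset({"email", "phone", "ssn", "passport", "private_notes"})

logger = logging.getLogger("app.firestore")


def redact_for_log(data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a document dict with sensitive values masked."""
    redacted: Dict[str, Any] = {}
    for key, value in data.items():
        if key in SENSITIVE_LOG_KEYS:
            redacted[key] = "[REDACTED]"
        elif isinstance(value, dict):
            # Recurse into nested maps, which Firestore documents often contain.
            redacted[key] = redact_for_log(value)
        else:
            redacted[key] = value
    return redacted


def log_document_read(doc_id: str, data: Dict[str, Any]) -> None:
    # Log the masked copy, never the raw document.
    logger.info("read document %s: %s", doc_id, redact_for_log(data))
```

The original dictionary is left unchanged, so the same data can still be filtered separately for the API response.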
Related CWEs (data exposure)
| CWE ID | Name | Severity |
|---|---|---|
| CWE-200 | Exposure of Sensitive Information to an Unauthorized Actor | HIGH |
| CWE-209 | Generation of Error Message Containing Sensitive Information | MEDIUM |
| CWE-213 | Exposure of Sensitive Information Due to Incompatible Policies | HIGH |
| CWE-215 | Insertion of Sensitive Information Into Debugging Code | MEDIUM |
| CWE-312 | Cleartext Storage of Sensitive Information | HIGH |
| CWE-359 | Exposure of Private Personal Information to an Unauthorized Actor | HIGH |
| CWE-522 | Insufficiently Protected Credentials | CRITICAL |
| CWE-532 | Insertion of Sensitive Information into Log File | MEDIUM |
| CWE-538 | Insertion of Sensitive Information into Externally-Accessible File or Directory | HIGH |
| CWE-540 | Inclusion of Sensitive Information in Source Code | HIGH |