HIGH | PII leakage | FastAPI | Firestore

PII Leakage in FastAPI with Firestore

PII Leakage in FastAPI with Firestore: how this specific combination creates or exposes the vulnerability

When a FastAPI service uses Google Cloud Firestore as a persistence layer, PII leakage commonly arises from overly broad Firestore read permissions, insufficient server-side field filtering, and direct exposure of raw Firestore documents through API responses. FastAPI endpoint handlers often convert Firestore documents into Pydantic models or plain dictionaries and return them to clients. If a handler returns entire documents, fields such as email addresses, phone numbers, government IDs, or internal identifiers are exposed unintentionally.

Firestore security rules are not a substitute for application-layer controls in this context. Rules allow or deny reads of whole documents; they cannot mask or redact individual fields for a given client role. In addition, the server client libraries that a FastAPI backend typically uses authenticate with a service account and bypass security rules entirely. Therefore, a FastAPI route that performs a read such as doc_ref.get() and returns the full snapshot discloses every field in the document, however strict the rules are for direct client access.
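Since per-role masking has to live in application code, one way to picture it is a mapping from client role to the fields that role may see. The sketch below is a minimal illustration, not a prescribed design; the role names, the field names, and the helper view_for_role are all hypothetical.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical roles and field names, chosen only for illustration.
FIELDS_BY_ROLE: Dict[str, FrozenSet[str]] = {
    "anonymous": frozenset({"display_name"}),
    "member": frozenset({"display_name", "role", "created_at"}),
    "admin": frozenset({"display_name", "role", "created_at", "email"}),
}


def view_for_role(doc_data: Dict[str, Any], role: str) -> Dict[str, Any]:
    """Return only the fields that the given role is allowed to see.

    Unknown roles fall back to the most restrictive view, so a typo in
    a role name fails closed instead of exposing more data.
    """
    allowed = FIELDS_BY_ROLE.get(role, FIELDS_BY_ROLE["anonymous"])
    return {key: value for key, value in doc_data.items() if key in allowed}


# Stand-in for the dict returned by a snapshot's to_dict().
stored = {
    "display_name": "Ada",
    "role": "editor",
    "email": "ada@example.com",
    "ssn": "000-00-0000",
}
member_view = view_for_role(stored, "member")
```

A field that appears in no role's set, such as ssn here, can never be returned, whichever role is requested.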

The combination introduces risk patterns such as:

Returning complete Firestore documents in JSON responses without removing or hashing sensitive fields.
Caching full documents in memory, or writing them to logs, during the FastAPI request lifecycle, which widens the window for accidental exposure.
Using Firestore query cursors or snapshots in server-side pagination that inadvertently carry sensitive fields across multiple requests.
Deserializing Firestore data into Pydantic models that declare sensitive fields, so PII flows through serialization unless it is explicitly excluded.

An attacker who obtains a valid access token, or who finds an unauthenticated endpoint, can enumerate these routes and harvest emails, names, or phone numbers. Unlike a relational schema accessed through an ORM, Firestore is schemaless: nothing forces a field to be declared before it is written, so it is harder to track which fields are sensitive and easier to include PII by accident.
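One way to compensate for the missing schema is to keep an explicit classification of field names and flag anything that has not been classified yet. The following is a minimal sketch under that assumption; the sets PUBLIC_FIELDS and SENSITIVE_FIELDS and the helper unclassified_fields are hypothetical names, and the input is assumed to be a sample of dicts obtained from to_dict().

```python
from typing import Any, Dict, Iterable, Set

# Hypothetical classification; a real service would maintain its own.
PUBLIC_FIELDS = frozenset({"display_name", "role", "created_at"})
SENSITIVE_FIELDS = frozenset({"email", "phone", "ssn"})


def unclassified_fields(documents: Iterable[Dict[str, Any]]) -> Set[str]:
    """Return top-level field names that are neither public nor sensitive.

    A non-empty result means some document carries a field nobody has
    reviewed, which is the situation a schemaless store makes easy to miss.
    """
    seen: Set[str] = set()
    for document in documents:
        seen.update(document.keys())
    return seen - PUBLIC_FIELDS - SENSITIVE_FIELDS


sample = [
    {"display_name": "Ada", "email": "ada@example.com"},
    {"display_name": "Lin", "backup_email": "lin@example.com"},
]
needs_review = unclassified_fields(sample)
```

Such a check could run in a test suite or a periodic job against a sample of documents; it only inspects top-level keys, so nested maps would need a recursive variant.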

Firestore-Specific Remediation in FastAPI: concrete code fixes

Remediation focuses on minimizing the data returned by FastAPI routes, enforcing field-level filtering before serialization, and ensuring Firestore queries and document reads exclude PII. Below are concrete, working examples tailored for FastAPI with Firestore.

1. Selective field projection with to_dict() and explicit includes

Instead of returning the entire document snapshot, explicitly pick safe fields. The snapshot's to_dict() method returns a Python dict; copy only the fields you intend to expose into the response, so that any field added to the document later stays private by default.

from fastapi import FastAPI, HTTPException
from google.cloud import firestore

app = FastAPI()
db = firestore.Client()

@app.get("/users/{user_id}")
def get_user_safe(user_id: str):
    doc_ref = db.collection("users").document(user_id)
    doc = doc_ref.get()
    if not doc.exists:
        raise HTTPException(status_code=404, detail="User not found")
    data = doc.to_dict()
    created_at = data.get("created_at")
    # Allowlist: copy only non-sensitive fields into the response
    safe_data = {
        "user_id": doc.id,
        "display_name": data.get("display_name"),
        "role": data.get("role"),
        # Avoid returning the string "None" when the field is missing
        "created_at": str(created_at) if created_at is not None else None,
    }
    return safe_data

2. Using a Pydantic response model as an allowlist

Define a Pydantic model that only exposes intended fields and use it to serialize Firestore data, ensuring PII fields are never included by default.

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Optional
from google.cloud import firestore

app = FastAPI()
db = firestore.Client()

# Only the fields declared here can appear in the response
class UserPublic(BaseModel):
    user_id: str
    display_name: Optional[str] = None
    role: Optional[str] = None
    created_at: Optional[str] = None

def build_user_public(doc):
    data = doc.to_dict()
    created_at = data.get("created_at")
    return UserPublic(
        user_id=doc.id,
        display_name=data.get("display_name"),
        role=data.get("role"),
        # Avoid storing the string "None" when the field is missing
        created_at=str(created_at) if created_at is not None else None,
    )

# response_model makes FastAPI serialize only the UserPublic fields
@app.get("/users/{user_id}", response_model=UserPublic)
def get_user_public(user_id: str):
    doc_ref = db.collection("users").document(user_id)
    doc = doc_ref.get()
    if not doc.exists:
        raise HTTPException(status_code=404, detail="User not found")
    return build_user_public(doc)

3. Server-side field selection in queries

When possible, have the query itself limit the fields returned. The Firestore server client libraries, including the Python client, support field projection through select() on queries and the field_paths argument of DocumentReference.get(); the web and mobile SDKs do not offer an equivalent and return whole documents. The example below instead filters with a denylist after the read, which is simpler but fails open: any sensitive field missing from the denylist is returned. For sensitive collections, prefer smaller documents and move highly sensitive fields to a separate, restricted collection with stricter access controls.

from fastapi import FastAPI, HTTPException
from google.cloud import firestore

app = FastAPI()
db = firestore.Client()

# Denylist of fields that must never leave the service
SENSITIVE_FIELDS = {"ssn", "passport", "private_notes"}

def get_restricted_user_fields(user_id: str):
    doc_ref = db.collection("profiles").document(user_id)
    doc = doc_ref.get()
    if not doc.exists:
        return {}
    # Drop denylisted fields; any field not listed above is still returned
    safe = {k: v for k, v in doc.to_dict().items() if k not in SENSITIVE_FIELDS}
    safe["user_id"] = doc.id
    return safe

@app.get("/profiles/{user_id}")
def read_profile(user_id: str):
    data = get_restricted_user_fields(user_id)
    if not data:
        raise HTTPException(status_code=404, detail="Profile not found")
    return data
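The Python server client also lets the read itself carry a projection, so that fields outside the allowlist never reach the FastAPI process. The sketch below assumes db is a google.cloud.firestore.Client passed in by the caller; the collection name, the PUBLIC_FIELDS list, and both helper functions are hypothetical. It relies on select() on a collection or query and on the field_paths argument of DocumentReference.get().

```python
from typing import Any, Dict, List, Optional

# Hypothetical allowlist of fields that are safe to expose.
PUBLIC_FIELDS = ["display_name", "role", "created_at"]


def list_public_users(db: Any, limit: int = 20) -> List[Dict[str, Any]]:
    """List users, asking Firestore to return only the public fields.

    `db` is assumed to be a google.cloud.firestore.Client instance.
    """
    query = db.collection("users").select(PUBLIC_FIELDS).limit(limit)
    return [
        {"user_id": snapshot.id, **(snapshot.to_dict() or {})}
        for snapshot in query.stream()
    ]


def get_public_user(db: Any, user_id: str) -> Optional[Dict[str, Any]]:
    """Read one user document with a field mask, or None if it is missing."""
    snapshot = db.collection("users").document(user_id).get(
        field_paths=PUBLIC_FIELDS
    )
    if not snapshot.exists:
        return None
    return {"user_id": snapshot.id, **(snapshot.to_dict() or {})}
```

A route would call these with the module-level client, for example get_public_user(db, user_id). Projection reduces what is transferred and held in memory, but a response model or explicit allowlist on the way out remains a useful second layer.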

4. Secure Firestore rules and operational practices

Use Firestore rules to restrict direct client read access to documents and collections, but do not rely on rules alone to hide PII. Combine rules with server-side field removal as shown above. Avoid logging full documents or query snapshots in application logs, and apply the principle of least privilege to the service account: grant it only the roles it needs and rotate any exported keys.
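For the logging advice, one possible safeguard is a standard-library logging filter that redacts known sensitive keys from structured log arguments. This is a minimal sketch; SENSITIVE_KEYS, redact, and RedactPIIFilter are hypothetical names, and the filter only sees dicts and sequences passed as logging arguments, not PII already interpolated into the message string.

```python
import logging
from typing import Any

# Hypothetical set of keys to mask; adjust to the service's own data.
SENSITIVE_KEYS = frozenset({"email", "phone", "ssn", "passport", "private_notes"})


def redact(value: Any) -> Any:
    """Recursively replace the values of sensitive keys with a marker."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [redact(item) for item in value]
    return value


class RedactPIIFilter(logging.Filter):
    """Redact sensitive keys in log record arguments before formatting."""

    def filter(self, record: logging.LogRecord) -> bool:
        # logging stores a single non-empty dict argument directly in args.
        if isinstance(record.args, dict):
            record.args = redact(record.args)
        elif isinstance(record.args, tuple):
            record.args = tuple(redact(arg) for arg in record.args)
        return True  # never drop the record, only rewrite its arguments
```

Attaching the filter to a handler, for example handler.addFilter(RedactPIIFilter()), applies it to every record that handler emits, including those from other loggers.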

Related CWEs: Data Exposure

CWE ID	Name	Severity
CWE-200	Exposure of Sensitive Information	HIGH
CWE-209	Generation of Error Message Containing Sensitive Information	MEDIUM
CWE-213	Exposure of Sensitive Information Due to Incompatible Policies	HIGH
CWE-215	Insertion of Sensitive Information Into Debugging Code	MEDIUM
CWE-312	Cleartext Storage of Sensitive Information	HIGH
CWE-359	Exposure of Private Personal Information (PII)	HIGH
CWE-522	Insufficiently Protected Credentials	CRITICAL
CWE-532	Insertion of Sensitive Information into Log File	MEDIUM
CWE-538	Insertion of Sensitive Information into Externally-Accessible File	HIGH
CWE-540	Inclusion of Sensitive Information in Source Code	HIGH

Frequently Asked Questions

Can Firestore security rules alone protect PII in FastAPI responses?

No. Rules control direct client access at the database level, but they do not prevent an authorized FastAPI service from returning full documents containing PII. Always filter sensitive fields in application code before sending data to clients.

How can I prevent PII leakage when using Firestore pagination in FastAPI?

When paginating, ensure each page-building function strips PII fields from Firestore documents before serialization. Avoid passing full document snapshots through multiple layers of middleware or caching, and return projected, safe structures rather than raw snapshots.
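A page builder along these lines might look like the sketch below. It assumes the caller has already run a query for page_size + 1 snapshots, each exposing .id and .to_dict() as Firestore snapshots do; PUBLIC_FIELDS, to_public, and build_page are hypothetical names. The cursor handed to the client is a document ID only, so no document data travels with it; how that ID is turned back into a query start point is left out here.

```python
from typing import Any, Dict, Iterable, Optional

# Hypothetical allowlist of fields that may appear in a page item.
PUBLIC_FIELDS = ("display_name", "role")


def to_public(snapshot: Any) -> Dict[str, Any]:
    """Project one snapshot onto the allowlisted fields plus its ID."""
    data = snapshot.to_dict() or {}
    public = {key: data[key] for key in PUBLIC_FIELDS if key in data}
    return {"user_id": snapshot.id, **public}


def build_page(snapshots: Iterable[Any], page_size: int) -> Dict[str, Any]:
    """Build a response page from up to page_size + 1 snapshots.

    The extra snapshot, if present, only signals that another page
    exists; it is never serialized into the response.
    """
    fetched = list(snapshots)
    page = fetched[:page_size]
    has_more = len(fetched) > page_size
    next_cursor: Optional[str] = page[-1].id if has_more and page else None
    return {"items": [to_public(s) for s in page], "next_cursor": next_cursor}
```

Because projection happens inside build_page, nothing downstream of it (middleware, caches, response serialization) ever holds a raw snapshot.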
