
Injection Flaws in FastAPI with Firestore

Injection Flaws in FastAPI with Firestore: how this specific combination creates or exposes the vulnerability

Injection flaws occur when untrusted data is concatenated into commands or queries without proper validation or parameterization. In a FastAPI service that uses Google Cloud Firestore, the risk arises when request parameters, headers, or body content flow directly into Firestore queries, paths, or document IDs. Firestore’s SDK for Python takes collection names, document IDs, field paths, and operators as ordinary string arguments, so passing raw user input into collection(), document(), or where(), or assembling those arguments with string formatting, lets the caller decide which data a query touches.

For example, passing a path parameter straight into document() lets the caller choose which document is read, and building paths by string concatenation additionally exposes the document hierarchy to manipulation, potentially allowing an attacker to read or write to unexpected locations. Consider this unsafe pattern:

from fastapi import FastAPI
import google.cloud.firestore

app = FastAPI()
db = google.cloud.firestore.Client()

@app.get("/users/{user_input}")
def read_user(user_input: str):
    # UNSAFE: unvalidated user input used directly as the document ID
    doc_ref = db.collection('users').document(user_input)
    snapshot = doc_ref.get()
    return {'data': snapshot.to_dict() if snapshot.exists else {}}

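An attacker might try a value such as users/../../../secrets/read to cross logical boundaries. Firestore paths have no `..` traversal semantics, and FastAPI's default path converter does not match values containing a slash, so this exact payload does not climb the hierarchy the way a filesystem path would. The exposure is still real: any syntactically valid ID is accepted, so a caller can request another user's document, and if the same pattern is applied to a query parameter or body field, a value containing slashes may be interpreted as a nested path into subcollections. The result is unintended data access, effectively an information exposure or BOLA/IDOR pattern facilitated by injection-style input handling.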

Another vector arises from dynamic construction of field filters. If a field name or operator is derived from user input without strict allowlisting, an attacker may manipulate query semantics. For instance, the following endpoint permits an attacker to control the field and value used in a where clause through query parameters:

@app.get("/search")
def search_items(field: str, value: str):
    # UNSAFE: field name from user input
    results = db.collection('items').where(field, '==', value).stream()
    return [{'id': doc.id, **doc.to_dict()} for doc in results]

This can support BFLA/Privilege Escalation attacks: if the field parameter permits filtering on sensitive attributes such as is_admin or deleted_at, an attacker can enumerate privileged or soft-deleted records. It also intersects with Property Authorization issues when field-level permissions are not enforced server-side. The injection here is not SQL-style but logical: attacker-controlled keys decide which dataset is returned.

LLM/AI Security checks are relevant when endpoints that interact with Firestore also feed LLM tooling. If a query handler embeds user input or retrieved Firestore documents into prompts that are sent to an LLM, the model's output can disclose system prompt content or sensitive fields from those documents. Output scanning becomes essential to ensure that documents retrieved from Firestore do not inadvertently expose API keys or PII when used in AI workflows.
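One way to apply the output-scanning idea above is to redact documents before they are placed in a prompt. The following is a minimal sketch, not a complete PII or secret detector: the regular expressions, the `BLOCKED_FIELDS` names, and the `redact_document` helper are all illustrative assumptions rather than part of Firestore or any scanning product.

```python
import re
from typing import Any, Dict

# Illustrative patterns only; real deployments need patterns tuned to their own data.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
KEY_LIKE_RE = re.compile(r"[A-Za-z0-9_-]{32,}")  # long token-like strings

# Hypothetical field names that should never reach a prompt.
BLOCKED_FIELDS = {"api_key", "password_hash", "token"}


def redact_document(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a document dict with blocked fields dropped and
    email-like or key-like substrings in string values replaced."""
    redacted: Dict[str, Any] = {}
    for key, value in doc.items():
        if key in BLOCKED_FIELDS:
            continue  # drop the field entirely
        if isinstance(value, dict):
            redacted[key] = redact_document(value)  # recurse into nested maps
        elif isinstance(value, str):
            value = EMAIL_RE.sub("[REDACTED_EMAIL]", value)
            value = KEY_LIKE_RE.sub("[REDACTED_SECRET]", value)
            redacted[key] = value
        else:
            redacted[key] = value  # numbers, booleans, etc. pass through
    return redacted
```

The helper would be applied to `snapshot.to_dict()` before the result is interpolated into a prompt. Lists are passed through unchanged in this sketch, so array fields would need their own handling.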

Finally, the interplay with Inventory Management and Unsafe Consumption checks highlights that Firestore query features (such as collection group queries) and index configurations can be abused if input is not validated. Attackers may probe for misconfigured composite indexes or deliberately trigger errors whose messages reveal schema details, aiding further exploitation.
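To limit what error probing can reveal, the detailed exception can stay in server logs while the client receives only a generic message. This is a minimal sketch under that assumption; `run_query_safely`, the logger name, and the response shape are hypothetical, and a real service would likely catch narrower exception types than `Exception`.

```python
import logging
import uuid
from typing import Any, Callable, Dict

logger = logging.getLogger("firestore_api")  # hypothetical logger name


def run_query_safely(query_fn: Callable[[], Any]) -> Dict[str, Any]:
    """Run a query callable; on failure, log the details server-side and
    return a generic error carrying only a correlation ID."""
    try:
        return {"ok": True, "data": query_fn()}
    except Exception:
        # The exception text (which may mention fields or indexes) is logged,
        # not returned to the caller.
        error_id = uuid.uuid4().hex
        logger.exception("Query failed (error_id=%s)", error_id)
        return {
            "ok": False,
            "error": "Query could not be completed",
            "error_id": error_id,
        }
```

The correlation ID lets an operator find the full stack trace in the logs without exposing field or index names in the HTTP response.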

Firestore-Specific Remediation in FastAPI: concrete code fixes

Remediation centers on strict input validation, allowlisting, and passing user input to the Firestore SDK only as values, never as part of a path, field name, or operator built by string formatting. Always treat user input as opaque data, never as code or structure. Below are concrete, safe patterns for FastAPI with Firestore.

1. Document ID access with strict validation

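Do not use raw user input directly as a document ID. Validate it against a strict pattern (e.g., alphanumeric with a length limit), or map it to a known set of identifiers such as UUIDs. Format validation does not replace authorization: the endpoint must still confirm that the caller is allowed to read the requested document.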

import re
from fastapi import FastAPI, HTTPException
import google.cloud.firestore

app = FastAPI()
db = google.cloud.firestore.Client()

USER_ID_PATTERN = re.compile(r'^[a-zA-Z0-9_-]{1,64}$')

@app.get("/users/{user_id}")
def read_user(user_id: str):
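    # fullmatch() also rejects a trailing newline, which match() with `$` would accept
    if not USER_ID_PATTERN.fullmatch(user_id):
        raise HTTPException(status_code=400, detail='Invalid user identifier')
    # Safe: the ID has been validated against the allowlist pattern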
    doc_ref = db.collection('users').document(user_id)
    snapshot = doc_ref.get()
    if not snapshot.exists:
        raise HTTPException(status_code=404, detail='User not found')
    return {'id': snapshot.id, **snapshot.to_dict()}
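Pattern validation constrains the shape of an identifier but does not show that the caller may read that document. The sketch below adds an ownership check. It assumes a hypothetical `authenticated_uid` that the authentication layer has already verified (for example from a session or token) and a hypothetical rule that users may read only their own document; `resolve_user_doc_id` and `AccessDenied` are illustrative names.

```python
import re

USER_ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]{1,64}")


class AccessDenied(Exception):
    """Raised when the caller may not access the requested document."""


def resolve_user_doc_id(requested_id: str, authenticated_uid: str) -> str:
    """Return a document ID that is both well-formed and owned by the caller.

    `authenticated_uid` is assumed to come from an already verified
    credential, never from the request path or query string.
    """
    if not USER_ID_PATTERN.fullmatch(requested_id):
        raise ValueError("Invalid user identifier")
    if requested_id != authenticated_uid:
        raise AccessDenied("Caller does not own this document")
    return requested_id
```

In an endpoint, `ValueError` would typically map to a 400 response and `AccessDenied` to a 403 (or a 404, to avoid confirming that the document exists) before any call to `db.collection('users').document(...)`.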

2. Safe querying with allowlisted fields

Maintain a server-side allowlist of permitted field names and map user-supplied values to these known fields. Never pass raw field names into where().

ALLOWED_SEARCH_FIELDS = {'email', 'status', 'created_at'}

@app.get("/items")
def list_items(field: str, value: str):
    if field not in ALLOWED_SEARCH_FIELDS:
        raise HTTPException(status_code=400, detail='Invalid search field')
    collection_ref = db.collection('items')
    query = collection_ref.where(field, '==', value)
    results = query.stream()
    # Iterate the streamed results rather than the Query object
    return [{'id': doc.id, **doc.to_dict()} for doc in results]
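The same allowlisting idea can be extended to operators and value types, since a caller who controls the operator or sends a string where a timestamp is expected can also change query semantics. This is a minimal sketch; the `SEARCH_FIELDS` mapping, the operator set, and `build_filter` are hypothetical, and it assumes `created_at` is sent as an ISO 8601 string.

```python
from datetime import datetime
from typing import Any, Callable, Dict, Tuple

# Hypothetical mapping: public parameter name -> (Firestore field, value parser).
SEARCH_FIELDS: Dict[str, Tuple[str, Callable[[str], Any]]] = {
    "email": ("email", str),
    "status": ("status", str),
    "created_at": ("created_at", datetime.fromisoformat),
}

# Only comparison operators this endpoint is meant to support.
ALLOWED_OPERATORS = {"==", "<", "<=", ">", ">="}


def build_filter(field: str, op: str, raw_value: str) -> Tuple[str, str, Any]:
    """Return a (field, operator, value) triple built only from allowlisted parts."""
    if field not in SEARCH_FIELDS:
        raise ValueError("Invalid search field")
    if op not in ALLOWED_OPERATORS:
        raise ValueError("Invalid operator")
    firestore_field, parse = SEARCH_FIELDS[field]
    try:
        value = parse(raw_value)  # coerce to the type stored in Firestore
    except ValueError as exc:
        raise ValueError("Invalid value for field") from exc
    return firestore_field, op, value
```

The returned triple can then be passed to `where()` in the endpoint, so the field path and operator always come from server-side constants and only the parsed value originates from the request.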

3. Avoid dynamic collection names

Do not allow user input to dictate collection names. If multi-tenancy is required, map tenant identifiers to predefined collections using an allowlist or a secure mapping table.

TENANT_COLLECTIONS = {'tenant_a': 'tenant_a_data', 'tenant_b': 'tenant_b_data'}

@app.get("/tenant/{tenant_key}/logs")
def get_tenant_logs(tenant_key: str):
    if tenant_key not in TENANT_COLLECTIONS:
        raise HTTPException(status_code=400, detail='Invalid tenant')
    collection_name = TENANT_COLLECTIONS[tenant_key]
    logs = db.collection(collection_name).order_by('timestamp', direction=google.cloud.firestore.Query.DESCENDING).stream()
    return [{'id': doc.id, **doc.to_dict()} for doc in logs]

4. Use parameterized queries and avoid string interpolation

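The Firestore Python SDK does not support query templates in the same way as SQL, but its query methods already take the field path, operator, and value as separate arguments. You avoid injection by keeping field paths and collection references as server-side constants and passing user input only as the value argument, never formatted into a field path or collection name.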

@app.get("/secure/logs")
def get_secure_logs(start: int, end: int):
    # Safe: numeric bounds used with server-side field names
    logs = (db.collection('audit')
            .where('timestamp', '>=', start)
            .where('timestamp', '<=', end)
            .stream())
    return [{'id': doc.id, **doc.to_dict()} for doc in logs]
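Typed numeric parameters still benefit from range checks, because an unbounded window lets a caller pull an entire collection. The sketch below assumes timestamps are stored as epoch seconds; `MAX_WINDOW_SECONDS`, `MAX_RESULTS`, and `validate_window` are hypothetical names and limits chosen for illustration.

```python
from typing import Tuple

MAX_WINDOW_SECONDS = 7 * 24 * 3600  # hypothetical cap: one week
MAX_RESULTS = 500  # hypothetical page-size ceiling


def validate_window(start: int, end: int, limit: int = 100) -> Tuple[int, int, int]:
    """Check that a requested time window and result limit are within bounds."""
    if start < 0 or end < 0:
        raise ValueError("Timestamps must be non-negative")
    if start > end:
        raise ValueError("start must not be after end")
    if end - start > MAX_WINDOW_SECONDS:
        raise ValueError("Requested window is too large")
    if not 1 <= limit <= MAX_RESULTS:
        raise ValueError("limit out of range")
    return start, end, limit
```

In the endpoint, a `ValueError` would map to a 400 response, and the validated `limit` would be applied with the query's `limit()` method before calling `stream()`.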

By combining input validation, allowlisting, and disciplined use of Firestore’s API, you mitigate injection risks while preserving the flexibility of querying structured data. These practices align with findings from Authentication, BOLA/IDOR, and Property Authorization checks that middleBrick reports when such vulnerabilities are detected.

Frequently Asked Questions

Can Firestore injection bypass authentication in FastAPI?
Not authentication itself, but it can bypass authorization: if user input is used to construct collection or document references without validation, an attacker may access data belonging to other users (BOLA/IDOR). Always validate and map identifiers server-side.

Does middleBrick test for Firestore injection patterns in FastAPI APIs?
middleBrick runs 12 security checks in parallel, including BOLA/IDOR and Property Authorization, which can detect insecure use of Firestore where user input influences queries or paths.