Path Traversal in Flask with Firestore
Path Traversal in Flask with Firestore — how this specific combination creates or exposes the vulnerability
Path Traversal occurs when user-controlled input is used to construct file system paths without proper validation, allowing an attacker to access files outside the intended directory. In a Flask application that integrates with Google Cloud Firestore, the risk typically arises not from Firestore itself—because Firestore does not expose raw file paths—but from how application code uses Firestore document IDs or fields to build local file system paths, temporary storage keys, or URLs that are later used by other components.
Consider a Flask route that retrieves a user profile document from Firestore and uses a field such as avatar_path to build a local filesystem path for further processing. If the document ID or a field value comes from an unvalidated query parameter, an attacker can supply sequences like ../../../etc/passwd. Even though Firestore enforces its own access controls, the vulnerability exists in the translation between Firestore data and local resource access within the application environment.
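To make the failure mode concrete, here is a minimal sketch of the unvalidated join described above. The names BASE_DIR, naive_path, and escapes_base are hypothetical and chosen only for illustration. The sketch uses posixpath rather than os.path so that it behaves the same on any platform.

```python
import posixpath

BASE_DIR = "/safe/base/assets"  # hypothetical asset root, not from a real deployment


def naive_path(field_value: str) -> str:
    """Join a Firestore field value onto BASE_DIR with no validation at all."""
    return posixpath.normpath(posixpath.join(BASE_DIR, field_value))


def escapes_base(field_value: str) -> bool:
    """Return True when the joined path lands outside BASE_DIR."""
    resolved = naive_path(field_value)
    return posixpath.commonpath([resolved, BASE_DIR]) != BASE_DIR


if __name__ == "__main__":
    for value in ("avatars/alice.png", "../../../etc/passwd", "/etc/passwd"):
        print(value, "->", naive_path(value), "escapes:", escapes_base(value))
```

An absolute value such as /etc/passwd escapes just as easily as a dotted one, because the join discards the base directory when a later component is absolute.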
Another scenario involves using Firestore-derived identifiers as part of generated signed URLs or storage keys. If a value such as user_123/../../backup/config.json is accepted without normalization and used to construct a Cloud Storage path, the traversal may manifest in the object key space rather than the local filesystem. Firestore document IDs themselves cannot contain forward slashes, so in practice such values more often come from string fields in which developers store paths. Those fields can be abused in the same way if they are concatenated into local paths or external URIs without sanitization.
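One way to keep such values out of the object key space is to reject any name that carries empty, "." or ".." segments before it is appended to a prefix. The sketch below assumes a hypothetical helper named build_object_key. It does not call any Cloud Storage API, and whether a downstream component would actually resolve dotted segments depends on that component.

```python
def build_object_key(prefix: str, name: str) -> str:
    """Append name to prefix, refusing path-like tricks in name.

    Hypothetical helper for illustration; it only builds a string.
    """
    segments = name.split("/")
    # Empty segments cover "", leading "/", trailing "/" and "//".
    if "\\" in name or any(seg in ("", ".", "..") for seg in segments):
        raise ValueError(f"unsafe object name: {name!r}")
    return f"{prefix.strip('/')}/{name}"


if __name__ == "__main__":
    print(build_object_key("avatars", "user_123/avatar.png"))
    try:
        build_object_key("avatars", "user_123/../../backup/config.json")
    except ValueError as exc:
        print("rejected:", exc)
```

Rejecting the input outright, rather than normalizing it, means the stored key is always exactly what the caller supplied.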
Middleware or logging components that inspect request paths or Firestore query parameters can also inadvertently expose directory traversal attempts. For example, a naive audit log that concatenates the Flask request path with a Firestore document ID to produce a trace identifier might create a malformed path that reveals internal directory structures when inspected or parsed by other tooling. Because the middleBrick scanner performs unauthenticated black-box testing, it can detect directory traversal indicators such as unusual sequences of dots and slashes in responses or error messages that reference filesystem locations.
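A trace identifier can avoid this problem by percent-encoding the document ID so that it can never contribute path separators. The following is a rough sketch only: trace_identifier and the "#" delimiter are assumptions made for this example, not part of Flask, Firestore, or middleBrick.

```python
from urllib.parse import quote


def trace_identifier(request_path: str, doc_id: str) -> str:
    """Combine a request path and a document ID into one log-safe token.

    safe="" makes quote() encode "/" as %2F, so the ID stays a single
    opaque segment instead of extending the path.
    """
    encoded_id = quote(doc_id, safe="")
    return f"{request_path.rstrip('/')}#{encoded_id}"


if __name__ == "__main__":
    print(trace_identifier("/profile/avatar", "user_123"))
    print(trace_identifier("/profile/avatar", "../../etc/passwd"))
```

Tooling that later parses the log sees the attacker-supplied value as data after the delimiter, not as additional path components.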
In the context of middleBrick’s LLM/AI Security checks, path traversal attempts may be part of prompt injection or data exfiltration probes if user input reaches prompts or external systems that interact with file-based resources. The scanner’s parallel security checks—such as Input Validation and Property Authorization—help identify whether path-like inputs are properly constrained and whether authorization checks are applied before any file system or external reference resolution occurs.
Firestore-Specific Remediation in Flask — concrete code fixes
To mitigate Path Traversal when working with Flask and Firestore, always validate and sanitize any user input that influences local or external resource identifiers. Do not rely on Firestore access rules alone to protect filesystem or storage paths, because those rules govern Firestore access, not local resource handling.
1. Validate and normalize paths before filesystem use
Use secure path joining and canonicalization to ensure user input never escapes the intended directory. The following Flask route demonstrates safe handling when a Firestore document field is used to reference a local asset:
import os

from flask import Flask, request, jsonify
from google.cloud import firestore
from werkzeug.utils import safe_join

app = Flask(__name__)
db = firestore.Client()
BASE_DIR = "/safe/base/assets"


@app.route("/asset")
def get_asset():
    user_file = request.args.get("file", "")
    # Reject obviously malicious inputs early
    if ".." in user_file or "/" in user_file.replace("\\", "/"):
        return jsonify({"error": "invalid file parameter"}), 400
    # Use safe_join to prevent directory traversal
    safe_path = safe_join(BASE_DIR, user_file)
    if safe_path is None:
        return jsonify({"error": "path traversal attempt detected"}), 400
    # Ensure resolved path is still within BASE_DIR (realpath follows symlinks)
    real_base = os.path.realpath(BASE_DIR)
    real_path = os.path.realpath(safe_path)
    if os.path.commonpath([real_path, real_base]) != real_base:
        return jsonify({"error": "path outside allowed directory"}), 400
    # Proceed with file operations
    return jsonify({"path": safe_path})
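The containment check can also be isolated in a small function so that it can be unit-tested without Flask or Firestore. The sketch below is one possible formulation using pathlib. The name resolve_inside is hypothetical, and Path.is_relative_to requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_file: str) -> Path:
    """Resolve user_file under base_dir, raising ValueError if it escapes.

    resolve() collapses ".." segments and follows symlinks, so the
    comparison is made on canonical paths.
    """
    base = Path(base_dir).resolve()
    candidate = (base / user_file).resolve()
    if not candidate.is_relative_to(base):  # Python 3.9+
        raise ValueError("path outside allowed directory")
    return candidate


if __name__ == "__main__":
    print(resolve_inside("/safe/base/assets", "logo.png"))
    try:
        resolve_inside("/safe/base/assets", "../../etc/passwd")
    except ValueError as exc:
        print("rejected:", exc)
```

A route handler could call this helper and translate the ValueError into a 400 response, keeping the HTTP concerns separate from the path logic.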
2. Sanitize Firestore document fields used in external references
If a Firestore document contains a field used to construct URIs or storage keys, validate and normalize the field value before use:
from urllib.parse import urljoin

from google.cloud import firestore
from flask import Flask, request, jsonify  # request is needed to read doc_id

app = Flask(__name__)
db = firestore.Client()


@app.route("/profile/avatar")
def get_avatar_url():
    doc_id = request.args.get("doc_id", "")
    # Basic allowlist validation for document ID format
    # (letters, digits, "_" and "-"; an empty string is rejected)
    if not doc_id.replace("_", "").replace("-", "").isalnum():
        return jsonify({"error": "invalid document identifier"}), 400
    doc_ref = db.collection("profiles").document(doc_id)
    doc = doc_ref.get()
    if not doc.exists:
        return jsonify({"error": "not found"}), 404
    data = doc.to_dict()
    # Use a base URL and urljoin to avoid string concatenation issues
    base_url = "https://storage.example.com/avatars/"
    safe_url = urljoin(base_url, doc_id)
    return jsonify({"avatar_url": safe_url})
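The route above builds the URL from the validated document ID. When the value comes from a stored field instead, the same idea applies to the field itself. The sketch below assumes a hypothetical field holding a bare file name. The helper avatar_url_from_field, the extension list, and the length limit are illustrative choices, not requirements of Firestore.

```python
import re
from urllib.parse import quote, urljoin

AVATAR_BASE_URL = "https://storage.example.com/avatars/"

# Assumed policy: a bare file name, no separators, a short list of extensions.
_FILENAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}\.(png|jpg|jpeg|webp)")


def avatar_url_from_field(field_value):
    """Return a URL for a stored file name, or None if the value is unsafe."""
    if not isinstance(field_value, str):
        return None
    if not _FILENAME_RE.fullmatch(field_value):
        return None
    # quote() is redundant for names that pass the regex, but keeps the
    # function safe if the pattern is loosened later.
    return urljoin(AVATAR_BASE_URL, quote(field_value, safe=""))


if __name__ == "__main__":
    print(avatar_url_from_field("alice.png"))
    print(avatar_url_from_field("../../backup/config.json"))
```

Returning None rather than raising lets the caller decide whether a bad stored value should produce a 404, a default avatar, or a logged data-quality warning.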
3. Secure Firestore query parameter usage
When Firestore query parameters influence downstream resource selection, enforce strict allowlists and avoid direct concatenation into filesystem or storage paths:
from google.cloud import firestore
from flask import Flask, request, jsonify

app = Flask(__name__)
db = firestore.Client()


@app.route("/search")
def search():
    collection_name = request.args.get("collection", "")
    # Allowlist permitted collections to prevent injection into path-like identifiers
    allowed = {"users", "products", "logs"}
    if collection_name not in allowed:
        return jsonify({"error": "invalid collection"}), 400
    docs = db.collection(collection_name).limit(10).stream()
    results = [{"id": doc.id, **doc.to_dict()} for doc in docs]
    return jsonify({"results": results})
4. Apply defense in depth with runtime monitoring
Combine input validation, strict allowlists, and runtime security scanning. middleBrick’s CLI can be integrated into development workflows to detect path traversal indicators and input validation weaknesses during testing, while the Web Dashboard helps track findings over time.
Related CWEs: Input Validation
| CWE ID | Name | Severity |
|---|---|---|
| CWE-20 | Improper Input Validation | HIGH |
| CWE-22 | Path Traversal | HIGH |
| CWE-74 | Injection | CRITICAL |
| CWE-77 | Command Injection | CRITICAL |
| CWE-78 | OS Command Injection | CRITICAL |
| CWE-79 | Cross-site Scripting (XSS) | HIGH |
| CWE-89 | SQL Injection | CRITICAL |
| CWE-90 | LDAP Injection | HIGH |
| CWE-91 | XML Injection | HIGH |
| CWE-94 | Code Injection | CRITICAL |