Cache Poisoning in Flask (Python)
Cache poisoning in a Flask application using Python occurs when an attacker manipulates cached responses so that malicious or incorrect data is served to users. Flask does not provide application-level caching itself; caching is typically implemented via extensions such as Flask-Caching or integrations with external systems like Redis or Memcached. If cache keys are derived from attacker-controlled inputs without proper validation, an attacker can inject a crafted request that results in a poisoned cache entry being stored and subsequently served to other users.
Consider a Flask route that caches user profile lookups using a query parameter without sanitizing or normalizing the input:
from flask import Flask, request
from flask_caching import Cache

app = Flask(__name__)
cache = Cache(config={'CACHE_TYPE': 'SimpleCache'})
cache.init_app(app)

@app.route('/profile')
def profile():
    user_id = request.args.get('user_id', '')
    # Unsafe: attacker-controlled input used directly as cache key
    cached = cache.get(f'profile:{user_id}')
    if cached is not None:
        return cached
    # Simulated expensive operation
    data = {'user_id': user_id, 'email': f'{user_id}@example.com'}
    cache.set(f'profile:{user_id}', data, timeout=60)
    return data
If user_id is not validated, an attacker can send values that contain key delimiters or encoded path characters, such as ..%2Fprofile%3Fuser_id%3Dadmin (which Werkzeug decodes to ../profile?user_id=admin before the view sees it), and cause entries to be stored under keys the application never intended. If several routes or services share a cache namespace, or the cache backend is distributed, those entries can collide with keys built elsewhere and be served to other users or services. The cached value also echoes the raw input. If that value is later written into response headers, or used to build other cache keys or downstream lookups, the cache effectively becomes a persistence mechanism for malicious content.
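The shared-namespace collision can be illustrated with a minimal sketch. A plain dict stands in for the shared cache backend, and the second, tenant-scoped key scheme (tenant_profile_key) is a hypothetical example of another service writing to the same namespace; neither is part of the application above.

```python
# Hypothetical stand-in for a shared cache backend (for example one Redis
# database used by several services); a dict keeps the sketch self-contained.
shared_cache = {}


def unsafe_profile_key(user_id):
    # Mirrors the unsafe route: raw input is interpolated into the key.
    return f"profile:{user_id}"


def tenant_profile_key(tenant, user_id):
    # Hypothetical second service that stores tenant-scoped profiles
    # in the same namespace, using ':' as its delimiter.
    return f"profile:{tenant}:{user_id}"


# The attacker calls the unvalidated endpoint with a delimiter in user_id.
attacker_input = "acme:alice"
shared_cache[unsafe_profile_key(attacker_input)] = {
    "user_id": attacker_input,
    "email": f"{attacker_input}@example.com",
}

# The tenant-scoped service later looks up alice in tenant "acme"
# and receives the entry written by the attacker's request.
poisoned = shared_cache.get(tenant_profile_key("acme", "alice"))
print(poisoned)
```

Rejecting the ':' character in user_id, or giving each service its own key prefix, would be enough to break this particular collision.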
Another scenario involves caching rendered templates or fragments that include user-specific or tenant-specific data without incorporating tenant or user context into the cache key. A Flask app using subdomain-based tenant routing may inadvertently serve cached content across tenants if cache keys omit the hostname or tenant identifier. Additionally, if the application normalizes inputs inconsistently (for example, lowercasing some parameters but not others), cache poisoning can arise from case-sensitivity mismatches leading to duplicate or ambiguous cache entries.
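As a rough sketch of both problems, the following compares a key that omits the tenant with one that includes a normalized hostname. The function names and hostnames are illustrative, and lowercasing parameter names and values is an assumption that only holds if the application treats them as case-insensitive.

```python
def naive_key(path, params):
    # Omits the hostname, so all tenants share one entry per path and query.
    return f"{path}?{sorted(params.items())}"


def tenant_aware_key(host, path, params):
    # Hostnames are case-insensitive, so lowercase the tenant part.
    tenant = host.lower()
    # Assumption: this application treats parameter names and values as
    # case-insensitive; the same rule must apply on every read and write path.
    normalized = sorted((k.lower(), v.lower()) for k, v in params.items())
    return f"{tenant}|{path}|{normalized}"


acme = ("acme.example.com", "/dashboard", {"view": "summary"})
globex = ("globex.example.com", "/dashboard", {"view": "summary"})

# Without tenant context, both tenants map to the same cache entry.
print(naive_key(*acme[1:]) == naive_key(*globex[1:]))

# With tenant context, the entries are separate.
print(tenant_aware_key(*acme) == tenant_aware_key(*globex))

# Case variants of the same request collapse to a single entry.
print(
    tenant_aware_key("ACME.example.com", "/dashboard", {"View": "Summary"})
    == tenant_aware_key(*acme)
)
```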
Unlike request smuggling or server-side request forgery, cache poisoning specifically concerns the integrity of stored responses. The risk is amplified when cached data is used for authorization or privacy-sensitive decisions, such as caching responses that should be unique per user or per role. Because Flask-Caching can be configured with a variety of backends, developers must ensure that cache keys incorporate sufficient context and that input validation is applied before any caching operation.
Python-Specific Remediation in Flask
To remediate cache poisoning in Flask, make cache keys deterministic, scoped to the tenant and user they belong to, and derived only from validated, normalized inputs. Never interpolate raw query parameters into a key. The examples below demonstrate these practices.
1. Validate and scope cache keys with tenant and user context
from flask import Flask, request, g
from flask_caching import Cache
import hashlib

app = Flask(__name__)
cache = Cache(config={'CACHE_TYPE': 'SimpleCache'})
cache.init_app(app)

def get_cache_key(prefix, **kwargs):
    # Sort parameter names so the same context always yields the same key
    parts = [prefix]
    for k in sorted(kwargs.keys()):
        parts.append(f'{k}={kwargs[k]}')
    key = '|'.join(parts)
    # Hash the joined context to get a fixed-length key
    return hashlib.sha256(key.encode('utf-8')).hexdigest()

@app.route('/profile')
def profile():
    user_id = request.args.get('user_id', '')
    # Validate input: allow only ASCII letters and digits
    # (str.isalnum() on its own also accepts non-ASCII alphanumerics)
    if not (user_id.isascii() and user_id.isalnum()):
        return {'error': 'invalid user_id'}, 400
    tenant_id = g.tenant_id  # Assume tenant set earlier in request lifecycle
    cache_key = get_cache_key('profile', user_id=user_id, tenant=tenant_id)
    cached = cache.get(cache_key)
    if cached is not None:
        return cached
    data = {'user_id': user_id, 'email': f'{user_id}@example.com'}
    cache.set(cache_key, data, timeout=60)
    return data
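The helper sorts parameter names so that argument order cannot change the key, then hashes the joined context into a fixed-length value. Because the parts are joined with | and =, the key is only collision-free while no value can contain those characters; the alphanumeric check on user_id guarantees this for user input, and tenant_id must come from a trusted or equally validated source. Tenant context is included to avoid cross-tenant cache pollution.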
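The sketch below shows why delimiter-joined keys depend on validated values, and one possible alternative that serializes the context as JSON before hashing. The names joined_key and canonical_key are illustrative; joined_key reproduces the join-and-hash construction of the helper above, and canonical_key is a swapped-in variant, not something the helper already does.

```python
import hashlib
import json


def joined_key(prefix, **kwargs):
    # Same construction as the helper above: 'k=v' parts joined with '|'.
    parts = [prefix] + [f"{k}={kwargs[k]}" for k in sorted(kwargs)]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def canonical_key(prefix, **kwargs):
    # JSON quoting and escaping keep field boundaries unambiguous,
    # even if a value contains '|' or '='.
    payload = json.dumps(
        [prefix, sorted(kwargs.items())],
        separators=(",", ":"),
        ensure_ascii=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# An unvalidated value containing the delimiters collides with a
# legitimate two-field context when parts are simply joined.
print(
    joined_key("profile", tenant="acme", user_id="bob")
    == joined_key("profile", tenant="acme|user_id=bob")
)

# The JSON-based encoding keeps the two contexts distinct.
print(
    canonical_key("profile", tenant="acme", user_id="bob")
    == canonical_key("profile", tenant="acme|user_id=bob")
)
```

Input validation remains the primary defense; an unambiguous encoding is a second layer for values, such as tenant identifiers, that are not checked at the same place.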
2. Normalize and avoid raw user input in cache operations
from flask import Flask, request
from flask_caching import Cache

app = Flask(__name__)
cache = Cache(config={'CACHE_TYPE': 'SimpleCache'})
cache.init_app(app)

def normalize_user_id(raw):
    # Strip surrounding whitespace, then reject (rather than truncate)
    # anything too long: truncation would map distinct ids to one key
    cleaned = raw.strip()
    # Allow only ASCII letters and digits
    if len(cleaned) > 64 or not (cleaned.isascii() and cleaned.isalnum()):
        raise ValueError('invalid user_id')
    return cleaned

@app.route('/profile')
def profile():
    raw = request.args.get('user_id', '')
    try:
        user_id = normalize_user_id(raw)
    except ValueError:
        return {'error': 'invalid user_id'}, 400
    # Use normalized value only
    cache_key = f'profile:v2:{user_id}'
    cached = cache.get(cache_key)
    if cached is not None:
        return cached
    data = {'user_id': user_id, 'email': f'{user_id}@example.com'}
    cache.set(cache_key, data, timeout=60)
    return data
This approach enforces strict normalization before any cache interaction, reducing the risk of encoding-based poisoning and ensuring consistent cache keys across requests.
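One caveat when validating with str.isalnum(): it returns True for non-ASCII letters and digits, so visually similar Unicode characters can pass the check and produce distinct keys. A minimal sketch of a stricter, ASCII-only allow-list follows; the pattern, the 64-character limit, and the name is_valid_user_id are assumptions chosen for illustration.

```python
import re

# Assumed policy: 1 to 64 ASCII letters or digits.
_USER_ID_RE = re.compile(r"[A-Za-z0-9]{1,64}")


def is_valid_user_id(value):
    # fullmatch requires the whole string to match the allow-list.
    return _USER_ID_RE.fullmatch(value) is not None


fullwidth_digits = "\uff11\uff12\uff13"  # full-width forms of 1, 2, 3

# str.isalnum() accepts the full-width digits ...
print(fullwidth_digits.isalnum())
# ... while the ASCII allow-list rejects them.
print(is_valid_user_id(fullwidth_digits))

# Ordinary identifiers pass; delimiters, path characters and
# empty or oversized values do not.
for candidate in ("alice42", "../admin", "acme:alice", "", "a" * 65):
    print(repr(candidate[:12]), is_valid_user_id(candidate))
```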
For applications using template fragments, apply the same discipline: pass only validated, normalized identifiers into fragment keys and avoid embedding raw user strings in cached template output. Combine these practices with regular security testing, for example scanning with a tool such as middleBrick, to surface misconfigurations in caching behavior and related attack surfaces.
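A minimal sketch of that discipline for fragments, using only the standard library: fragment_key and render_greeting are hypothetical helpers, not part of Flask or Flask-Caching, and html.escape stands in for whatever escaping the template engine applies.

```python
import html
import re

# Assumed identifier policy for tenant and user slugs.
_ID_RE = re.compile(r"[a-z0-9_-]{1,64}")


def fragment_key(fragment_name, *identifiers):
    """Build a fragment cache key from identifiers that pass the allow-list."""
    for ident in identifiers:
        if _ID_RE.fullmatch(ident) is None:
            raise ValueError(f"invalid identifier: {ident!r}")
    return ":".join(("fragment", fragment_name, *identifiers))


def render_greeting(display_name):
    # Escape user-supplied text before it enters output that will be cached,
    # so a stored fragment cannot carry markup to later readers.
    return f"<p>Welcome, {html.escape(display_name)}</p>"


print(fragment_key("sidebar", "acme", "alice"))
print(render_greeting("<script>alert(1)</script>"))

try:
    fragment_key("sidebar", "acme:alice")
except ValueError as exc:
    print("rejected:", exc)
```

Scoping the key by both tenant and user keeps one user's rendered fragment from being served to another, at the cost of a lower cache hit rate.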