Memory Leak in Flask with Bearer Tokens
How this specific combination creates or exposes the vulnerability
A memory leak in a Flask application that uses Bearer tokens can occur when token handling, validation, or caching retains objects in memory beyond their intended lifetime. In a typical Flask route, developers parse the Authorization header, validate the token (e.g., via introspection or a local check), and then pass user-specific data derived from the token into long-lived structures or global caches. If these structures are appended to on every request without cleanup, or if live references from globals, closures, or registered callbacks keep objects reachable (Python's cyclic garbage collector reclaims reference cycles, but only once the objects are unreachable), the process heap grows over time. This pattern is especially risky when tokens carry user context (scopes, roles, tenant IDs) that is stored in global dictionaries keyed by token identifiers or session-like values.
For example, consider a naive implementation that caches decoded token claims in a module-level dictionary to avoid repeated validation:
from flask import Flask, request, jsonify
import jwt

app = Flask(__name__)
claims_cache = {}  # module-level cache, lives for the lifetime of the worker

@app.route('/api/data')
def get_data():
    auth = request.headers.get('Authorization', '')
    if not auth.startswith('Bearer '):
        return jsonify({'error': 'missing bearer token'}), 401
    token = auth.split(' ', 1)[1]
    if token in claims_cache:
        claims = claims_cache[token]
    else:
        # Note: disabling signature verification is itself insecure; it is
        # shown here only to keep the leak pattern minimal.
        claims = jwt.decode(token, options={'verify_signature': False})
        claims_cache[token] = claims  # unbounded growth: tokens never evicted
    return jsonify({'data': 'sensitive', 'claims': claims})
In this example, each unique Bearer token is stored forever in claims_cache. Within a single worker process the cache is shared across all requests (and threads, under a threaded or async worker model) and grows without bound as clients rotate tokens; with multiple WSGI workers, each worker accumulates its own copy, multiplying total memory use. The leak drives steadily rising memory usage and increases garbage-collection pressure, which in turn raises latency. Because the scan is unauthenticated, middleBrick’s checks for Input Validation and Unsafe Consumption can flag missing token-bound cleanup and missing size limits on caches, while the LLM/AI Security checks ensure no token handling logic leaks into prompts or outputs.
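The growth is straightforward to observe. The stdlib-only sketch below simulates the unbounded cache under rotating tokens and measures heap growth with tracemalloc; claims_cache and handle_request are stand-ins for the Flask handler above, not part of any real API:

```python
import tracemalloc

claims_cache = {}  # stand-in for the module-level cache in the handler above

def handle_request(token):
    # Each unique token adds an entry that is never evicted.
    claims_cache[token] = {'sub': 'user-1', 'scope': 'read', 'raw': token}

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()
for i in range(10_000):
    handle_request(f'token-{i}')  # clients rotating tokens on every request
after, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f'entries: {len(claims_cache)}, heap growth: {after - before} bytes')
```

Running the same loop against a bounded cache (see the remediation section below) keeps both numbers flat instead of linear in the number of distinct tokens.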
Additionally, if token validation calls external introspection endpoints and responses are retained (for logging or debugging), response bodies containing PII or secrets may remain in memory or logs, increasing data exposure risk. middleBrick’s Data Exposure and Encryption checks help detect whether token-related data is inadvertently retained in outputs or logs, and its Inventory Management checks can identify missing resource limits for token caches.
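One low-cost mitigation for that exposure is to redact introspection payloads before they reach logs or debug output, rather than retaining the raw response. A minimal sketch, where the key list and helper name are illustrative only:

```python
SENSITIVE_KEYS = {'access_token', 'refresh_token', 'email', 'username'}

def redact(payload: dict) -> dict:
    """Return a log-safe copy; the original response should not be retained."""
    return {k: ('<redacted>' if k in SENSITIVE_KEYS else v)
            for k, v in payload.items()}

introspection = {'active': True, 'scope': 'read', 'email': 'user@example.com'}
print(redact(introspection))  # safe to log; 'email' value is masked
```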
Bearer Tokens-Specific Remediation in Flask — concrete code fixes
To mitigate memory leaks when using Bearer tokens in Flask, avoid unbounded caching and ensure timely release of references. Prefer stateless validation where possible, and if caching is required, use bounded, time-bound stores with automatic eviction.
Below is a revised, safer implementation that removes the persistent cache and validates the token on each request without retaining sensitive data:
from flask import Flask, request, jsonify
import jwt

app = Flask(__name__)
SECRET_KEY = 'replace-with-your-signing-key'  # load from configuration, not source

@app.route('/api/data')
def get_data():
    auth = request.headers.get('Authorization', '')
    if not auth.startswith('Bearer '):
        return jsonify({'error': 'missing bearer token'}), 401
    token = auth.split(' ', 1)[1]
    # Stateless validation with signature verification; no caching of claims
    try:
        claims = jwt.decode(token, SECRET_KEY, algorithms=['HS256'])
    except jwt.InvalidTokenError:
        return jsonify({'error': 'invalid token'}), 401
    # Use claims for authorization without storing them globally
    return jsonify({'data': 'sensitive', 'scope': claims.get('scope')})
If you must cache claims to reduce validation latency, use a bounded in-memory store with a size limit and, ideally, time-based eviction. An LRU cache caps the number of retained entries:
from flask import Flask, request, jsonify
from functools import lru_cache
import jwt

app = Flask(__name__)
SECRET_KEY = 'replace-with-your-signing-key'  # load from configuration, not source

@lru_cache(maxsize=1024)  # bounded cache, evicts least recently used entries
def _cached_decode(token):
    # The token string is the entire cache key; only hashable arguments may
    # reach an lru_cache-decorated function, so options dicts stay inside.
    return jwt.decode(token, SECRET_KEY, algorithms=['HS256'])

@app.route('/api/data')
def get_data():
    auth = request.headers.get('Authorization', '')
    if not auth.startswith('Bearer '):
        return jsonify({'error': 'missing bearer token'}), 401
    token = auth.split(' ', 1)[1]
    try:
        claims = _cached_decode(token)
    except jwt.InvalidTokenError:
        return jsonify({'error': 'invalid token'}), 401
    return jsonify({'data': 'sensitive', 'scope': claims.get('scope')})
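An LRU cache caps entry count but never expires entries by age, so decoded claims for a long-rotated or revoked token can linger for the worker's lifetime. If time-based eviction matters, a small stdlib-only store combining a size cap with a TTL is one option; the class and parameter names below are illustrative, and third-party libraries such as cachetools offer the same idea off the shelf:

```python
import time
from collections import OrderedDict

class TTLCache:
    """Minimal bounded cache: evicts by insertion order past maxsize,
    and drops entries older than ttl seconds on lookup."""

    def __init__(self, maxsize=1024, ttl=300.0):
        self.maxsize, self.ttl = maxsize, ttl
        self._store = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        expires_at, value = item
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: release the reference now
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)
        self._store.move_to_end(key)
        while len(self._store) > self.maxsize:
            self._store.popitem(last=False)  # drop the oldest entry

cache = TTLCache(maxsize=2, ttl=300.0)
cache.set('token-1', {'scope': 'read'})
print(cache.get('token-1'))
```

In the Flask handler above, such a store would replace the lru_cache wrapper: look the token up, decode on a miss, and set with a TTL no longer than the token's own lifetime.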
Additional remediation steps include limiting the size of any request-scoped objects, ensuring that logs do not include full tokens or sensitive claims, and using middleware to release references promptly. middleBrick’s LLM/AI Security checks verify that token handling logic does not leak into prompts or outputs, while its Input Validation and Property Authorization checks ensure scopes and permissions are enforced per request rather than relying on cached state.
For teams using the middleBrick CLI, running middlebrick scan <url> against your Flask endpoints can surface these memory and token handling patterns as part of the standardized security checks, providing prioritized findings and remediation guidance mapped to frameworks such as OWASP API Top 10.