Memory Leak in FastAPI with JWT Tokens
Memory Leak in FastAPI with JWT Tokens — how this specific combination creates or exposes the vulnerability
A memory leak in a FastAPI application that uses JWT tokens typically occurs when token payloads or cryptographic verification artifacts are retained in memory beyond their intended lifetime. FastAPI is commonly paired with libraries such as PyJWT or python-jose for JWT decoding and validation. If developers store decoded token payloads, temporary objects, or cryptographic contexts in global structures, caches, or long-lived sessions without cleanup, memory usage grows over time under sustained load.
Consider an endpoint that decodes a JWT and attaches the full decoded payload to the request state for downstream handlers to use:
```python
import jwt
from fastapi import FastAPI, Request

app = FastAPI()

SECRET = "super-secret-key"

@app.middleware("http")
async def attach_token_payload(request: Request, call_next):
    auth = request.headers.get("authorization")
    if auth and auth.startswith("Bearer "):
        token = auth.split(" ")[1]
        # Potentially large payload retained on request.state
        decoded = jwt.decode(token, SECRET, algorithms=["HS256"])
        request.state.token_payload = decoded  # retained for request lifetime
    response = await call_next(request)
    return response
```
If `decoded` contains large claims (such as extensive scopes, roles, or custom metadata) and the application holds references to it in global caches or module-level structures, the garbage collector cannot reclaim those objects while the references remain. Over repeated requests, this leads to a steady increase in resident set size (RSS). In a security scanning context, middleBrick may flag this pattern under BFLA/Privilege Escalation and Unsafe Consumption checks, because unbounded retention of authorization artifacts can degrade service stability and amplify the impact of token-related abuse.
The issue is exacerbated when token validation logic creates temporary objects per request and those objects are referenced indirectly (e.g., through closures or callback registries). For example, attaching decoded tokens to a global dictionary keyed by subject or session ID without expiration or eviction turns token handling into an unbounded cache, which shows up in the Inventory Management and Data Exposure checks as a risk to stability and confidentiality.
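The unbounded-cache anti-pattern described above can be sketched in a few lines. This is a minimal, self-contained illustration; the names `DECODED_TOKEN_CACHE` and `remember_payload` are hypothetical, not from any real codebase:

```python
# Hypothetical anti-pattern: a module-level cache of decoded payloads
# keyed by subject, with no size limit and no expiration. Every new
# subject adds an entry that is never evicted.
DECODED_TOKEN_CACHE = {}

def remember_payload(decoded: dict) -> None:
    # Each distinct "sub" claim leaves a permanent entry behind.
    DECODED_TOKEN_CACHE[decoded["sub"]] = decoded

# Simulate sustained load from many distinct subjects.
for i in range(10_000):
    remember_payload({"sub": f"user-{i}", "scopes": ["read"] * 50})

# The cache only ever grows; nothing here shrinks it.
print(len(DECODED_TOKEN_CACHE))
```

Under sustained traffic from many distinct users, a structure like this grows linearly with the number of subjects seen, which is exactly the steady RSS increase described above.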
Additionally, if the JWT library or app code retains references to cryptographic keys or verification contexts across requests, memory usage can increase further. The scanning tool tests unauthenticated attack surfaces and looks for indicators such as missing cleanup and large retained payloads. Remediation guidance focuses on minimizing what is retained, ensuring short lifetimes for sensitive objects, and using structured limits on caches.
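One way to surface this kind of retention locally is Python's standard `tracemalloc` module, which compares heap snapshots taken before and after a burst of simulated requests. A minimal sketch, where `leaky_store` is a stand-in for an unbounded application structure:

```python
import tracemalloc

# Stand-in for an unbounded structure holding decoded token data.
leaky_store = []

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Simulate a burst of requests that each retain a payload.
for i in range(1_000):
    leaky_store.append({"sub": f"user-{i}", "scopes": ["read"] * 20})

after = tracemalloc.take_snapshot()
# Group allocation differences by source line; the top entries point
# at the code retaining the most new memory.
stats = after.compare_to(before, "lineno")

for stat in stats[:3]:
    print(stat)

tracemalloc.stop()
```

Running a comparison like this before and after load makes "missing cleanup and large retained payloads" visible as a source line whose allocation delta keeps growing.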
JWT Token-Specific Remediation in FastAPI — concrete code fixes
To prevent memory retention issues, keep token handling short-lived and avoid attaching large payloads to long-lived scopes. Use local variables and limit the scope of decoded claims. If you need to pass user identity downstream, extract only the minimal required fields (e.g., subject) and avoid caching entire payloads.
Here is a safer pattern that decodes the JWT, extracts only necessary fields, and does not retain references:
```python
import jwt
from fastapi import FastAPI, HTTPException, Request
from fastapi.responses import JSONResponse

app = FastAPI()

SECRET = "super-secret-key"

@app.middleware("http")
async def lightweight_token_validation(request: Request, call_next):
    auth = request.headers.get("authorization")
    if auth and auth.startswith("Bearer "):
        token = auth.split(" ", 1)[1]
        try:
            decoded = jwt.decode(token, SECRET, algorithms=["HS256"])
            # Keep only the minimal required data
            user_id = decoded.get("sub")
            if not user_id:
                # HTTPException raised inside middleware is not handled by
                # FastAPI's exception handlers, so return a response directly.
                return JSONResponse(status_code=401, content={"detail": "Invalid token claims"})
            # Pass minimal data via request state, not the full payload
            request.state.user_id = user_id
        except jwt.ExpiredSignatureError:
            return JSONResponse(status_code=401, content={"detail": "Token expired"})
        except jwt.InvalidTokenError:
            return JSONResponse(status_code=401, content={"detail": "Invalid token"})
    response = await call_next(request)
    return response
```
For applications that must inspect scopes or roles, extract only required strings and avoid storing the entire dictionary. If you use dependency injection, ensure dependencies do not implicitly cache decoded tokens:
```python
from fastapi import Depends, HTTPException, Request

def get_current_user(request: Request):
    # Return only the minimal identifier, not the full payload
    user_id = getattr(request.state, "user_id", None)
    if user_id is None:
        raise HTTPException(status_code=401, detail="Not authenticated")
    return {"user_id": user_id}
```
If you maintain an in-memory cache for rate limiting or session tracking, enforce strict size limits and time-based eviction. For example, use a fixed-size dictionary or a library that provides LRU eviction to prevent unbounded growth:
```python
from collections import OrderedDict

# Simple bounded cache example
class BoundedCache:
    def __init__(self, max_size: int = 1000):
        self._data = OrderedDict()
        self._max_size = max_size

    def get(self, key: str):
        return self._data.get(key)

    def set(self, key: str, value):
        self._data[key] = value
        if len(self._data) > self._max_size:
            # Evict the oldest entry once the size limit is exceeded
            self._data.popitem(last=False)

cache = BoundedCache(max_size=1000)
```
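The size bound above can be combined with the time-based eviction mentioned earlier. A minimal sketch, assuming entries are dropped lazily on access; `TTLCache` and its parameters are illustrative, not from a specific library:

```python
import time
from collections import OrderedDict

class TTLCache:
    """Bounded cache whose entries also expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300.0, max_size: int = 1000):
        # Maps key -> (stored_at, value); oldest entries sit at the front.
        self._data = OrderedDict()
        self._ttl = ttl_seconds
        self._max_size = max_size

    def _evict_expired(self) -> None:
        now = time.monotonic()
        while self._data:
            key, (stored_at, _) = next(iter(self._data.items()))
            if now - stored_at <= self._ttl:
                break  # front entry still fresh; the rest are newer
            self._data.popitem(last=False)

    def get(self, key: str):
        self._evict_expired()
        entry = self._data.get(key)
        return entry[1] if entry is not None else None

    def set(self, key: str, value) -> None:
        self._evict_expired()
        self._data[key] = (time.monotonic(), value)
        # Keep insertion order meaningful when a key is overwritten.
        self._data.move_to_end(key)
        if len(self._data) > self._max_size:
            self._data.popitem(last=False)

# Tiny limits so both eviction paths are easy to observe.
ttl_cache = TTLCache(ttl_seconds=0.05, max_size=2)
ttl_cache.set("a", 1)
ttl_cache.set("b", 2)
ttl_cache.set("c", 3)        # size bound evicts "a"
print(ttl_cache.get("a"))    # None (evicted by size bound)
time.sleep(0.1)
print(ttl_cache.get("b"))    # None (expired)
```

Evicting lazily on access keeps the implementation simple; a background sweep task is an alternative when entries must be freed even without further traffic.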
By minimizing retained data, validating tokens without keeping full payloads in global structures, and bounding any caches, you reduce the likelihood of memory growth that middleBrick identifies under Unsafe Consumption and BFLA/Privilege Escalation checks. This approach aligns with secure token handling while preserving application functionality.