LLM Data Leakage with JWT Tokens
How LLM Data Leakage Manifests in JWT Tokens
Large language model (LLM) endpoints often return raw text that includes data they were not intended to expose. When an LLM is integrated with an API that processes JSON Web Tokens (JWT), the model can inadvertently echo token values in its output. This happens in several common code paths:
- Prompt reflection: A developer builds a prompt that includes user‑supplied data (e.g., "Explain this JWT: {token}") without sanitising the token. If the LLM repeats the prompt verbatim, the JWT appears in the response (sketched below).
- Error messages: When token validation fails, some libraries throw exceptions that contain the token string (e.g., "Invalid token: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."). If the exception is caught and returned to the LLM as part of the response, the token leaks.
- Logging and debugging: Debug endpoints that dump request headers or payloads may be hooked into an LLM‑powered chat interface. The LLM can then be asked to "show the last request" and will output the Authorization header bearing the JWT.
These patterns map directly to the Data Exposure and LLM/AI Security checks that middleBrick performs. The scanner looks for token‑like strings (the JWT regex pattern) in unauthenticated responses and flags them as potential leakage.
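As a minimal illustration of the first pattern, a vulnerable handler might look like the sketch below. Flask and a placeholder `llm.generate` call are assumed here, mirroring the remediation examples later in this section; the endpoint name is hypothetical.

```python
from flask import request, jsonify

def explain_token():
    # Vulnerable sketch: the raw Authorization header is interpolated into the prompt,
    # so the model can simply repeat the JWT back in its reply.
    auth = request.headers.get('Authorization', '')  # e.g. "Bearer eyJhbGci..."
    prompt = f"Explain this JWT: {auth}"             # token becomes model input
    reply = llm.generate(prompt)                     # placeholder LLM call
    return jsonify({'reply': reply})                 # reply may echo the token verbatim
```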
JWT Token-Specific Detection
Detecting JWT leakage via an LLM requires scanning the unauthenticated attack surface for responses that contain a JWT where the model’s output is under attacker control. middleBrick’s LLM/AI Security module runs five active probes; one of them searches for token exfiltration.
When you submit a URL, middleBrick:
- Identifies endpoints that accept user input and forward it to an LLM (detected via typical patterns such as `/chat`, `/generate`, or custom LLM wrappers).
- Injects a series of prompts designed to cause the model to echo or re‑format supplied data, including a base64‑encoded JWT.
- Scans the returned content for the JWT signature pattern (`^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$`) and for any token in the `Authorization: Bearer` format (sketched below).
- Correlates findings with the Data Exposure check: if a JWT appears in a response that is not supposed to contain secrets, the finding is marked as high severity.
Because the scan is black‑box and requires no credentials, it works even when the LLM endpoint is behind a gateway or API‑management layer.
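The matching step itself is straightforward to reproduce. The sketch below is a minimal illustration in Python, using an unanchored variant of the pattern so it can scan whole response bodies; the `contains_jwt` helper is illustrative and is not middleBrick's internal implementation.

```python
import base64
import re

# A JWT is three base64url segments separated by dots.
JWT_PATTERN = re.compile(r'[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+')

def contains_jwt(response_body: str) -> bool:
    """Return True if the response body contains a JWT-shaped string."""
    for candidate in JWT_PATTERN.findall(response_body):
        header = candidate.split('.', 1)[0]
        padded = header + '=' * (-len(header) % 4)
        try:
            # A real JWT header decodes to a JSON object, so it starts with '{' —
            # a cheap sanity check that filters out arbitrary dotted strings.
            if base64.urlsafe_b64decode(padded).startswith(b'{'):
                return True
        except Exception:
            continue
    return False
```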
Example of a finding that middleBrick might report:
{
  "category": "LLM/AI Security",
  "severity": "high",
  "title": "Potential JWT token leakage via LLM output",
  "description": "The response to prompt \"Show me the token you received\" contained a string matching the JWT pattern: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9....",
  "remediation": "Ensure user‑supplied data is never placed directly into LLM prompts without sanitisation. Validate and strip Authorization headers before passing requests to the model."
}
JWT Token-Specific Remediation
Fixing JWT leakage in LLM‑enabled APIs involves removing the token from any data that reaches the model and ensuring error handling does not expose the token. Below are language‑specific examples using widely adopted JWT libraries.
Node.js (jsonwebtoken)
const jwt = require('jsonwebtoken');

function safeChatEndpoint(req, res) {
  // Extract token from Authorization header
  const auth = req.headers.authorization;
  let token = null;
  if (auth && auth.startsWith('Bearer ')) {
    token = auth.slice(7); // remove 'Bearer '
  }

  // Verify token – if invalid, return a generic error
  let payload;
  try {
    payload = jwt.verify(token, process.env.JWT_SECRET);
  } catch (err) {
    // Do NOT include the token in the error message
    return res.status(401).json({ error: 'Invalid authentication' });
  }

  // Build a prompt that excludes the raw token
  const userInput = req.body.message || '';
  const prompt = `You are a helpful assistant. User said: "${userInput}"`;

  // Call the LLM (pseudo‑call)
  llm.generate(prompt).then(result => {
    res.json({ reply: result });
  });
}
Python (PyJWT)
import os

import jwt
from flask import request, jsonify

SECRET = os.getenv('JWT_SECRET')

def safe_chat():
    auth = request.headers.get('Authorization', '')
    token = None
    if auth.startswith('Bearer '):
        token = auth[7:]  # strip 'Bearer '
    try:
        payload = jwt.decode(token, SECRET, algorithms=['HS256'])
    except jwt.PyJWTError:
        # Generic failure – token never appears in response
        return jsonify({'error': 'Invalid authentication'}), 401
    user_msg = request.json.get('message', '')
    # Prompt without the token
    prompt = f'User said: "{user_msg}"'
    llm_response = llm.generate(prompt)  # placeholder for LLM call
    return jsonify({'reply': llm_response})
Key remediation points:
- Never concatenate raw JWTs or Authorization headers into LLM prompts (see the redaction sketch after this list).
- Catch validation exceptions and return generic messages.
- If debugging endpoints are needed, protect them with authentication and ensure they are not reachable by the LLM.
- Use middleBrick’s scan to verify that the fix removes JWT strings from unauthenticated LLM responses before deploying to production.
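As a concrete illustration of the first point, user‑supplied text can also be scrubbed of token‑shaped strings before prompt construction. The sketch below assumes a hypothetical `redact_jwts` helper and an illustrative regex; neither is part of PyJWT or any specific library.

```python
import re

# Illustrative pattern: three base64url segments with minimum lengths to
# reduce false positives on ordinary dotted strings.
JWT_PATTERN = re.compile(r'[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]+')

def redact_jwts(text: str) -> str:
    """Replace anything that looks like a JWT with a fixed placeholder."""
    return JWT_PATTERN.sub('[REDACTED_TOKEN]', text)

# Usage inside the safe_chat handler shown above:
#   user_msg = redact_jwts(request.json.get('message', ''))
#   prompt = f'User said: "{user_msg}"'
```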
Related CWEs (llmSecurity category):
| CWE ID | Name | Severity |
|---|---|---|
| CWE-754 | Improper Check for Unusual or Exceptional Conditions | MEDIUM |