Llm Data Leakage in Flask with Bearer Tokens
Llm Data Leakage in Flask with Bearer Tokens — how this specific combination creates or exposes the vulnerability
LLM Data Leakage occurs when an application unintentionally exposes sensitive data in responses generated by or processed through language models. In a Flask application that uses Bearer Tokens for authentication, this risk is compounded because tokens and other secrets may appear in request and response payloads, logs, or error messages. When an LLM endpoint is reachable from the same environment as your Flask service, the model may receive credentials embedded in prompts, headers, or user input, leading to inadvertent disclosure in model outputs.
Consider a Flask route that forwards user queries to an LLM while attaching the incoming Authorization header as context:
from flask import Flask, request, jsonify
import requests
app = Flask(__name__)
@app.route('/ask')
def ask():
auth = request.headers.get('Authorization')
user_prompt = request.json.get('prompt', '') if request.is_json else ''
llm_prompt = f'User: {user_prompt}\nAuth: {auth}'
resp = requests.post('https://api.example.com/llm/completions',
json={'prompt': llm_prompt},
headers={'Authorization': 'Bearer not-a-token'})
return jsonify({'reply': resp.json().get('text', '')})
If the Authorization header contains a Bearer token (e.g., Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...), the token text is included verbatim in the prompt sent to the LLM. Even if the LLM does not echo the token, the exposure occurs because the model may log or cache the input, and because the token traverses internal networks between Flask and the LLM. An attacker who can influence the prompt through user input may coax the model into reproducing the token via an injection or data exfiltration test pattern, revealing credentials in model responses.
This situation is especially risky when the LLM endpoint is unauthenticated or weakly guarded, as it becomes a potential channel for unauthorized data extraction. The combination of Flask routing authenticated requests and an LLM that processes raw HTTP traffic means that tokens can leak through error messages, verbose logging, or overly verbose completions. For example, if the LLM is prompted with instructions to 'repeat back the authentication header,' a vulnerable deployment might return the token directly. Because Bearer tokens are often long-lived credentials, any leakage can lead to replay attacks, privilege escalation, or lateral movement within your infrastructure.
middleBrick’s LLM/AI Security checks detect this class of issue by probing for system prompt leakage, active prompt injection (including data exfiltration attempts), and output scanning for API keys and PII. The scanner reviews your OpenAPI specification, resolves $ref definitions, and correlates runtime behavior with spec expectations to identify unauthenticated LLM endpoints and patterns where authentication data enters the prompt chain. These checks complement the standard security scans for Authentication, Input Validation, and Data Exposure, ensuring that token leakage paths are surfaced even when they involve downstream AI services.
Remediation focuses on ensuring that credentials never form part of LLM prompts, that inputs are strictly validated, and that outputs are inspected before use. In Flask, this means sanitizing user data, removing or masking tokens before constructing prompts, and applying strict allowlists on input fields. Continuous monitoring and automated scans can catch regressions as endpoints evolve.
Bearer Tokens-Specific Remediation in Flask — concrete code fixes
To prevent LLM Data Leakage when using Bearer Tokens in Flask, redesign the interaction so that tokens are never included in prompts or forwarded to LLM endpoints. Instead, validate and authorize the request in Flask, then forward only safe, non-sensitive data to the LLM. Below are concrete, secure patterns you can adopt.
1. Strip or mask tokens before prompt construction
Do not concatenate raw Authorization headers into prompts. If you need to use identity context, derive a safe representation (e.g., user ID) from verified claims rather than passing the token itself.
from flask import Flask, request, jsonify
import jwt
app = Flask(__name__)
# Use a secret key appropriate for your app; keep it out of source control
JWT_SECRET = 'super-secret-key-change-me'
def get_user_id_from_token(auth_header):
if not auth_header or not auth_header.startswith('Bearer '):
return None
token = auth_header.split(' ', 1)[1]
try:
decoded = jwt.decode(token, JWT_SECRET, algorithms=['HS256'])
return decoded.get('sub')
except jwt.InvalidTokenError:
return None
@app.route('/ask')
def ask():
auth_header = request.headers.get('Authorization')
user_id = get_user_id_from_token(auth_header)
if user_id is None:
return jsonify({'error': 'Unauthorized'}), 401
user_prompt = request.json.get('prompt', '') if request.is_json else ''
# Safe: only non-sensitive context is forwarded
llm_prompt = f'User {user_id}: {user_prompt}'
resp = requests.post('https://api.example.com/llm/completions',
json={'prompt': llm_prompt})
return jsonify({'reply': resp.json().get('text', '')})
2. Use environment variables or secure configuration for LLM credentials
Keep LLM API keys and endpoint URLs out of request flow entirely. Load them from environment variables and reference them server-side only.
import os
import requests
from flask import Flask, request, jsonify
app = Flask(__name__)
LLM_API_KEY = os.environ.get('LLM_API_KEY')
LLM_ENDPOINT = os.environ.get('LLM_ENDPOINT', 'https://api.example.com/llm/completions')
@app.route('/ask')
def ask():
user_prompt = request.json.get('prompt', '') if request.is_json else ''
resp = requests.post(LLM_ENDPOINT,
json={'prompt': user_prompt},
headers={'Authorization': f'Bearer {LLM_API_KEY}'})
return jsonify({'reply': resp.json().get('text', '')})
3. Validate and sanitize inputs rigorously
Apply strict validation to user-supplied data to prevent prompt injection and ensure that only expected formats reach the LLM.
from flask import Flask, request, jsonify
import re
app = Flask(__name__)
@app.route('/ask')
def ask():
user_prompt = request.json.get('prompt', '') if request.is_json else ''
if not isinstance(user_prompt, str) or len(user_prompt) > 500:
return jsonify({'error': 'Invalid prompt'}), 400
# Basic injection mitigation: disallow newlines and suspicious patterns
if re.search(r'(?i)(auth|token|bearer|secret|key)\s*[:=]', user_prompt):
return jsonify({'error': 'Suspicious content'}), 400
resp = requests.post('https://api.example.com/llm/completions',
json={'prompt': user_prompt})
return jsonify({'reply': resp.json().get('text', '')})
These patterns reduce the likelihood that tokens or sensitive instructions reach the LLM. Combine them with scanning tools that include LLM/AI Security checks to continuously verify that no credentials appear in prompts or outputs.
Related CWEs: llmSecurity
| CWE ID | Name | Severity |
|---|---|---|
| CWE-754 | Improper Check for Unusual or Exceptional Conditions | MEDIUM |