Llm Data Leakage in Express with Bearer Tokens
Llm Data Leakage in Express with Bearer Tokens — how this specific combination creates or exposes the vulnerability
When an Express API uses Bearer token authentication and also exposes endpoints that return or process large language model (LLM) responses, the combination can unintentionally leak sensitive information. Bearer tokens are typically passed in the Authorization header (Authorization: Bearer <token>) and are often used to gate access to user data or privileged operations. If an LLM endpoint echoes back prompt content, system instructions, or internal reasoning in its response, a client that includes a valid Bearer token may receive data that should be restricted to a different user or context. This occurs because the LLM runtime does not inherently enforce user-level boundaries; it processes prompts and configurations as provided by the caller. Without explicit checks, an attacker who can influence the prompt or observe LLM output could capture tokens, PII, or operational details embedded in responses.
In Express, this risk is amplified when middleware that reads the Authorization header is not consistently applied to LLM routes, or when route handlers forward user input directly to the LLM without sanitization. For example, if a developer builds an endpoint like /api/chat that sends user messages to an LLM and returns the raw completion, any leakage of the Bearer token in prompt instructions or in the model’s output becomes accessible to the caller. An LLM Security check in middleBrick would flag this under System Prompt Leakage and Output Scanning, detecting patterns such as token-like strings or configuration details in responses. The scanner also tests for Unauthenticated LLM Endpoint exposure, where an endpoint intended for authenticated use might inadvertently allow requests without proper Bearer validation, increasing the chance that sensitive data appears in outputs accessible to unauthenticated actors.
Another vector involves tooling and function calls. If an Express route sets up the LLM with tools or function_call settings that include sensitive metadata, and the LLM returns those tool names or parameters in its output, a client with a Bearer token can harvest internal implementation details. middleBrick’s LLM/AI Security checks specifically look for Excessive Agency patterns, such as repeated tool_calls or LangChain agent configurations, and scan outputs for API keys, PII, or executable code. Because Bearer tokens are often used as identifiers or session handles, seeing them in LLM responses indicates a breakdown in output filtering and access control, which an attacker can exploit for further intrusion.
To understand the impact, consider an Express route that passes user input to an LLM while also logging or echoing system instructions. If the system message contains instructions like "Use the bearer token from the user context to call downstream services," and the LLM repeats that in its output, the token is effectively leaked. middleBrick’s Active Prompt Injection testing performs sequential probes, including Data Exfiltration and Cost Exploitation, to see whether crafted prompts can trick the model into revealing such material. Even when the Express app uses middleware to set CORS or basic logging, if the LLM response is not scrubbed for sensitive content, the Bearer token and related data can leave the application boundary in an uncontrolled way.
Bearer Tokens-Specific Remediation in Express — concrete code fixes
Remediation focuses on strict separation of authentication, controlled data flow to LLMs, and output validation. Never forward raw Authorization headers or token values to the LLM, and ensure that token handling logic is not reflected in prompts or responses. Below are concrete Express patterns that reduce the risk of LLM data leakage when Bearer tokens are used.
1. Validate Bearer token before LLM calls
Use middleware to verify the token and attach only safe, non-sensitive claims to the request. Do not pass the raw token to the LLM.
import express from 'express';
import jwt from 'jsonwebtoken';
const app = express();
app.use(express.json());
function authenticateToken(req, res, next) {
const authHeader = req.headers['authorization'];
const token = authHeader && authHeader.split(' ')[1];
if (!token) return res.sendStatus(401);
jwt.verify(token, process.env.JWT_SECRET, (err, user) => {
if (err) return res.sendStatus(403);
req.user = { id: user.id, role: user.role };
next();
});
}
app.post('/api/chat', authenticateToken, (req, res) => {
const { message } = req.body;
// req.user contains safe claims, not the raw token
res.json({ userRole: req.user.role, echo: message.substring(0, 100) });
});
2. Sanitize LLM prompts and exclude tokens
Ensure that system and user prompts do not include token values or sensitive context. Use strict input validation and avoid echoing headers or internal variables into the prompt.
import { Configuration, OpenAIApi } from 'openai';
const openai = new OpenAIApi(new Configuration({ apiKey: process.env.OPENAI_API_KEY }));
app.post('/api/chat', authenticateToken, async (req, res) => {
const userMessage = req.body.message;
const systemPrompt = 'You are a helpful assistant. Do not repeat tokens or internal identifiers.';
try {
const completion = await openai.createChatCompletion({
model: 'gpt-4o-mini',
messages: [
{ role: 'system', content: systemPrompt },
{ role: 'user', content: userMessage }
],
// Never include req.headers.authorization in messages or tool calls
});
res.json({ reply: completion.data.choices[0].message.content });
} catch (err) {
res.status(500).json({ error: 'LLM request failed' });
}
});
3. Control LLM output and disable dangerous features
Disable tool calls and function_call modes that might expose internal structure, and implement output scanning to reject responses containing token-like patterns.
app.post('/api/chat', authenticateToken, async (req, res) => {
const userMessage = req.body.message;
const completion = await openai.createChatCompletion({
model: 'gpt-4o-mini',
messages: [{ role: 'user', content: userMessage }],
tools: [], // Explicitly empty to prevent tool leakage
tool_choice: 'none',
response_format: { type: 'text' }
});
const reply = completion.data.choices[0].message.content;
const tokenLike = /(eyJ[a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+)/;
if (tokenLike.test(reply)) {
return res.status(400).json({ error: 'Response contains potential token leakage' });
}
res.json({ reply });
});
4. Use middleBrick to validate your setup
Run middleBrick scans to detect System Prompt Leakage, Excessive Agency, and Output Scanning findings. The scanner checks whether your endpoints expose tokens or instructions in responses and whether unauthenticated access to LLM endpoints is possible. Use the CLI to integrate checks into development: middlebrick scan <url>, or add the GitHub Action to fail builds if risk scores drop below your threshold.
| Check | Why it matters for Bearer tokens | Remediation focus |
|---|---|---|
| System Prompt Leakage | Prompts may embed token handling logic | Strip token references from system messages |
| Output Scanning | LLM responses might contain tokens | Regex filter for JWT patterns in replies |
| Unauthenticated LLM Endpoint | Endpoints may be callable without token validation | Enforce middleware on all LLM routes |
| Excessive Agency | Tool calls may expose internal token usage | Disable unused tools and inspect tool outputs |
Related CWEs: llmSecurity
| CWE ID | Name | Severity |
|---|---|---|
| CWE-754 | Improper Check for Unusual or Exceptional Conditions | MEDIUM |