
XPath Injection in Express with Bearer Tokens

XPath Injection in Express with Bearer Tokens: how this specific combination creates or exposes the vulnerability

XPath Injection is a server-side injection class where untrusted data is concatenated into an XPath expression without proper escaping or parameterization, leading to unauthorized data access or bypass of access controls. In Express applications that use XPath—typically via an XML parsing library or a custom XML-based API handler—and rely on Bearer Tokens for authentication, the combination can expose two distinct but related risks.

First, Bearer Tokens are usually passed in the Authorization header (e.g., Authorization: Bearer <token>) and may be used indirectly in server-side logic such as logging, audit trails, or token introspection. If these values are later embedded into XPath expressions, for example to filter user-specific XML data by token-derived identifiers, unsanitized input can alter query semantics. An attacker who can partially influence a token-derived parameter might craft input that changes the selected nodes, leading to Insecure Direct Object References (IDOR), also known as Broken Object Level Authorization (BOLA).

Second, if the Express route itself builds XPath expressions by concatenating request-controlled inputs (such as query parameters or body fields) alongside token-derived values, an attacker can inject malicious predicates. A typical vulnerable pattern looks like const query = "//user[token='" + token + "' and name='" + req.query.name + "']"; where token is extracted from the Bearer header. An attacker supplying name=' or '1'='1 turns the predicate into token='...' and name='' or '1'='1'. Because and binds more tightly than or in XPath, the predicate is true for every user node, which bypasses the intended filter and exposes other users' XML resources. This maps to the OWASP API Security Top 10 category API1: Broken Object Level Authorization and can be surfaced by middleBrick as a BOLA/IDOR finding with severity High.
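
To make the effect on the query string concrete, the following sketch builds the expression the same way as the vulnerable pattern and prints what an XPath engine would receive. The function name buildVulnerableQuery and the sample token are illustrative assumptions, and no XPath engine is invoked here.

```javascript
// Illustration only: mirrors the vulnerable concatenation pattern.
// buildVulnerableQuery is a hypothetical name, not part of any library.
function buildVulnerableQuery(token, name) {
  return "//user[token='" + token + "' and name='" + name + "']";
}

const benign = buildVulnerableQuery('abc123', 'alice');
const injected = buildVulnerableQuery('abc123', "' or '1'='1");

console.log(benign);
// //user[token='abc123' and name='alice']
console.log(injected);
// //user[token='abc123' and name='' or '1'='1']
```

The injected value closes the name literal early and appends a comparison that is always true, so the selection no longer depends on the token.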

Because many JavaScript XPath libraries offer no equivalent of parameterized queries, the onus is on the developer to sanitize and validate inputs. middleBrick’s scans include Input Validation and Property Authorization checks that specifically test for such injection paths by probing endpoints with crafted payloads, including attempts to escape string contexts or manipulate token-derived filters.

Even when Bearer Tokens are validated centrally, an Express app that reuses token metadata in XPath queries remains vulnerable if those values are not treated as untrusted. For instance, extracting a username from a decoded JWT and embedding it directly into an XPath selection without normalization can enable privilege escalation if an attacker manages to control or predict parts of the token payload. This is why runtime testing that cross-references spec definitions with actual requests—such as the parallel checks in middleBrick—is crucial to detect subtle leakage paths that static analysis might miss.
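
One way to treat token metadata as untrusted is to check its type and shape after signature verification and before it reaches any query builder. The sketch below assumes the verified payload is available as a plain object with a sub claim; the function name subjectFromClaims and the allowed pattern are assumptions to adapt to your own identifier format.

```javascript
// Hypothetical helper: extracts a subject identifier from an already
// verified JWT payload and rejects anything outside a narrow allowlist.
const SUBJECT_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

function subjectFromClaims(claims) {
  const sub = claims && claims.sub;
  // Claims are JSON values, so they may be numbers, arrays, or objects.
  if (typeof sub !== 'string' || !SUBJECT_PATTERN.test(sub)) {
    return null;
  }
  return sub;
}

console.log(subjectFromClaims({ sub: 'user123' }));       // user123
console.log(subjectFromClaims({ sub: "x' or '1'='1" }));  // null
console.log(subjectFromClaims({ sub: ['user123'] }));     // null
```

Returning null rather than a cleaned-up value keeps the decision simple: a claim that does not match the expected shape is rejected, not repaired.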

In summary, the Express + Bearer Token + XPath combination becomes risky when token-derived or request-derived data is concatenated into XPath expressions without strict input validation, canonicalization, and separation of authentication from query construction. The vulnerability is not in Bearer Tokens themselves but in how their associated data is used in XPath construction, so all inputs, including those derived from trusted headers, should be treated as potentially hostile.

Bearer Token-Specific Remediation in Express: concrete code fixes

To mitigate XPath Injection in Express when using Bearer Tokens, avoid concatenating any token-derived or user-supplied data directly into XPath strings. Instead, use defensive coding patterns, strict validation, and isolation of authentication data from business logic that queries XML structures.

1. Never concatenate token or user data into XPath

Do not build XPath by string interpolation with values derived from the Authorization header or token payload. For example, avoid:

// Avoid: concatenating token and user input into XPath
const token = extractBearerToken(req); // e.g., 'abc123'
const username = req.query.username;
const xpath = "//user[token='" + token + "' and name='" + username + "']"; // Vulnerable
const result = xmlDoc.evaluate(xpath, xmlDoc, null, XPathResult.ANY_TYPE, null);

2. Use allowlisting and strict validation for user inputs

Validate and sanitize all inputs used in XML queries. For identifiers like usernames or IDs, use allowlists and reject unexpected characters.

// Validate username: allow only alphanumeric and underscore
const username = req.query.username;
if (!/^[A-Za-z0-9_]+$/.test(username)) {
  return res.status(400).send('Invalid username');
}
// token should not be used in XPath at all; use session mapping instead
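
The same check can be packaged as a small reusable function that also guards the value's type and length. This is a sketch under assumptions: the name validateUsername and the 32-character limit are illustrative, and the type check matters because query parsers can deliver arrays or objects instead of strings.

```javascript
// Hypothetical helper: returns the validated string or null.
function validateUsername(value) {
  // Query values can arrive as arrays or objects (for example from
  // repeated keys), and RegExp.test would coerce those to strings.
  if (typeof value !== 'string') return null;
  if (value.length === 0 || value.length > 32) return null;
  return /^[A-Za-z0-9_]+$/.test(value) ? value : null;
}

console.log(validateUsername('alice_01'));     // alice_01
console.log(validateUsername("' or '1'='1"));  // null
console.log(validateUsername(['alice']));      // null
```

In a route handler, a null result would map to the 400 response shown above.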

3. Isolate authentication data from query construction

Keep Bearer Token validation separate from data retrieval. Authenticate the request, resolve the user identity via a secure mapping (e.g., a session store or database keyed by token), and then use parameterized or safely constructed queries that do not include the raw token.

// Express route example: authenticate first, then fetch data safely
app.get('/profile', (req, res) => {
  const authHeader = req.headers.authorization;
  if (!authHeader || !authHeader.startsWith('Bearer ')) {
    return res.status(401).send('Unauthorized');
  }
  const token = authHeader.substring(7);
  // Verify token via your auth provider (pseudo-code)
  const userId = verifyTokenAndGetUserId(token); // e.g., returns 'user123'
  if (!userId) {
    return res.status(401).send('Invalid token');
  }
  // Prefer a non-XPath lookup where possible. If XPath is required, note that
  // escapeXPathLiteral returns the value already wrapped in quotes:
  const safeXPath = `//user[@id=${escapeXPathLiteral(userId)}]`;
  // findUserXml is a placeholder for your XML lookup (or a compiled query if the library supports one)
  const result = findUserXml(safeXPath);
  res.json(result);
});

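// Utility to render a value as a quoted XPath 1.0 string literal.
// XPath 1.0 has no escape sequences inside string literals, so a value that
// contains both quote types has to be assembled with concat().
function escapeXPathLiteral(literal) {
  const value = String(literal);
  if (!value.includes("'")) {
    return "'" + value + "'";
  }
  if (!value.includes('"')) {
    return '"' + value + '"';
  }
  // Both quote types present: split on single quotes and rejoin them as "'"
  return "concat('" + value.split("'").join("', \"'\", '") + "')";
}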
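
The route above extracts the token with startsWith and substring(7), which accepts any characters after the prefix. A stricter option is to accept only the token character set defined for Bearer credentials in RFC 6750. The helper below is a sketch; the name parseBearerToken is an assumption and not part of Express.

```javascript
// Hypothetical helper: returns the token string, or null when the header
// is missing or malformed. The character set follows the b64token
// grammar in RFC 6750; the scheme name is matched case-insensitively.
const BEARER_PATTERN = /^Bearer +([A-Za-z0-9\-._~+\/]+=*)$/i;

function parseBearerToken(headerValue) {
  if (typeof headerValue !== 'string') return null;
  const match = BEARER_PATTERN.exec(headerValue);
  return match ? match[1] : null;
}

console.log(parseBearerToken('Bearer abc.def.ghi'));   // abc.def.ghi
console.log(parseBearerToken("Bearer x' or '1'='1"));  // null
console.log(parseBearerToken('Basic dXNlcjpwYXNz'));   // null
```

Rejecting malformed headers at this point means quote characters never reach token verification or any later lookup.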

4. Prefer non-XPath mechanisms when possible

If your use case allows, avoid XPath for filtering user-specific data. Use JSON-based APIs or structured XML parsing with explicit path traversal that does not rely on string evaluation. This removes injection risk entirely and aligns with modern API security practices.
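
As a rough illustration of explicit traversal, the sketch below assumes the XML has already been parsed into plain objects (the users array and its field names are made up for this example) and selects a record by strict comparison rather than by evaluating a query string.

```javascript
// Assumed shape: records already parsed from XML into plain objects.
const users = [
  { id: 'user123', name: 'alice' },
  { id: 'user456', name: 'bob' },
];

// The identifier is compared as data, so quote characters in it have
// no special meaning and cannot alter the selection logic.
function findUserById(records, userId) {
  if (typeof userId !== 'string') return null;
  return records.find((record) => record.id === userId) || null;
}

console.log(findUserById(users, 'user123'));      // { id: 'user123', name: 'alice' }
console.log(findUserById(users, "' or '1'='1"));  // null
```

The injection payload is simply an identifier that matches no record.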

5. Complementary measures

  • Apply strict Content-Type and schema validation on incoming XML, and disable external entity resolution in the parser, to guard against XML external entity (XXE) injection, which can compound injection issues.
  • Log authentication failures and token usage anomalies without including raw tokens or sensitive XML data in logs.
  • Use the middleBrick CLI to scan your Express endpoints regularly: middlebrick scan <url>, and integrate the GitHub Action to fail builds if new XPath-related findings appear.
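
For the logging point in the list above, one option is to record a short fingerprint of the token instead of the token itself. The sketch uses Node's built-in crypto module; the function name tokenFingerprint, the log format, and the 12-character truncation are assumptions.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derives a short, non-reversible identifier so log
// entries can be correlated without storing the raw token.
function tokenFingerprint(token) {
  return crypto.createHash('sha256').update(token).digest('hex').slice(0, 12);
}

const fp = tokenFingerprint('abc123');
console.log(`auth failure token_fp=${fp}`);
```

The same token always yields the same fingerprint, which is enough to spot repeated failures from one credential.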

By ensuring Bearer Token handling remains separate from data query construction and by validating and escaping all inputs, you reduce the attack surface for XPath Injection in Express services.

Frequently Asked Questions

Can Bearer Tokens ever safely be used in XPath expressions?
Avoid embedding Bearer Tokens or any token-derived values directly into XPath expressions. If token metadata must be used, resolve the user identity server-side via a secure mapping and keep tokens out of query construction entirely.

How does middleBrick detect XPath Injection risks in Express APIs?
middleBrick runs parallel security checks including Input Validation and Property Authorization. It probes endpoints with crafted payloads designed to manipulate string contexts and bypass filters, reporting findings such as BOLA/IDOR with severity and remediation guidance.