XPath Injection in Express
How XPath Injection Manifests in Express: attack patterns and the Express code paths where they appear
XPath Injection occurs when untrusted input is concatenated into an XPath expression without proper escaping or parameterization, allowing attackers to alter the logic of the query. While XPath is common in XML processing, in an Express.js application it typically appears when the server parses XML payloads (for example, SOAP services or legacy integrations) and builds XPath selectors using string concatenation or template literals.
Express-specific attack pattern: an endpoint accepts an XML document or an identifier and evaluates XPath with a library such as xpath (paired with xmldom) or libxmljs. If the code does not validate or escape the input, an attacker can inject predicates or path segments to bypass authorization or extract arbitrary nodes. For example, consider an Express route that retrieves a user’s profile from an XML store:
const express = require('express');
const xpath = require('xpath');
const { DOMParser } = require('xmldom');

const app = express();
app.use(express.json()); // populate req.body from JSON requests

app.post('/profile', (req, res) => {
  const userId = req.body.userId;
  const xml = getXmlData(); // Assume this returns an XML document as a string
  const doc = new DOMParser().parseFromString(xml, 'text/xml');
  // Vulnerable: user input is concatenated directly into the expression
  const expr = "//user[id='" + userId + "']";
  const nodes = xpath.select(expr, doc);
  res.json(nodes.map(n => ({
    id: n.getElementsByTagName('id')[0].textContent,
    name: n.getElementsByTagName('name')[0].textContent
  })));
});
An attacker can supply userId as ' or 1=1 or ', turning the expression into //user[id='' or 1=1 or '']. Because 1=1 is always true, the predicate matches every user node and the route returns all profiles. The shorter payload ' or '1'='1 has the same effect, producing //user[id='' or '1'='1']. An attacker can also inject extra path steps or close the predicate early to defeat hierarchy checks. In services using XPath for RBAC decisions (e.g., selecting role nodes from an XML policy document), injection can elevate privileges or leak sensitive data.
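To make the effect concrete, the following sketch reproduces only the string-building step from the route above (no XML parsing is involved) and prints the expression produced by a benign and a malicious userId. The helper name buildUserExpr is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper that mirrors the vulnerable concatenation in the route above.
function buildUserExpr(userId) {
  return "//user[id='" + userId + "']";
}

const benign = buildUserExpr('42');
const injected = buildUserExpr("' or 1=1 or '");

console.log(benign);   // //user[id='42']
console.log(injected); // //user[id='' or 1=1 or '']
```

The injected value never needs to be valid data: it only has to close the opening quote and leave the expression syntactically well formed.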
Another Express-specific scenario involves XPath used in middleware that validates XML-based tokens or configuration files. If the token’s identifier is user-controlled and embedded into an XPath query without escaping, an attacker can bypass validation or cause denial-of-service by selecting excessive nodes. Because Express often chains middleware and routes, a vulnerable XPath usage in one handler can impact authentication or authorization checks elsewhere in the pipeline.
Express-Specific Detection: how to identify this issue, including scanning with middleBrick
Detection starts with code review: search for XPath construction patterns that concatenate request parameters. In Express applications, look for usage of xpath.select, evaluate, or DOM methods where strings are built with + or template literals containing user input. Key indicators include missing escaping functions and direct interpolation of req.body, req.query, or req.params into expressions like //tag[id='VALUE'].
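As a rough aid to that review, the sketch below flags source lines where a string that looks like an XPath expression is combined with a variable through + or a template placeholder. Both regular expressions and the function name findSuspiciousXPath are assumptions made for this example. The check is a heuristic that will produce false positives (for instance, concatenated URL paths) and miss expressions built across several lines.

```javascript
// Heuristic patterns (illustrative assumptions, not an exhaustive rule set).
// A string literal that starts with /, // or ./ followed by a name, * or @.
const XPATH_STRING = /["'`]\s*\.?\/{1,2}[\w*@]/;
// String concatenation next to a quote, or a template literal placeholder.
const DYNAMIC = /["'`]\s*\+|\+\s*["'`]|\$\{/;

// Returns the 1-based line numbers and text of lines matching both patterns.
function findSuspiciousXPath(source) {
  return source
    .split('\n')
    .map((text, index) => ({ line: index + 1, text: text.trim() }))
    .filter(({ text }) => XPATH_STRING.test(text) && DYNAMIC.test(text));
}

const sample = [
  "const expr = \"//user[id='\" + userId + \"']\";",
  "const all = xpath.select('//user', doc);",
  "const t = xpath.select(`//user[id='${req.query.id}']`, doc);",
].join('\n');

const findings = findSuspiciousXPath(sample);
console.log(findings.map(f => f.line)); // [ 1, 3 ]
```

Line 2 of the sample is not reported because its expression is a constant; only expressions that mix literal XPath with runtime values need a closer look.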
Automated scanning with middleBrick helps surface these issues without source code access. middleBrick runs black-box tests that probe endpoints accepting XML or expecting XPath-like identifiers. By injecting payloads such as ' or 1=1 or ' and observing whether the response differs in size, timing, or content (e.g., returning multiple user records or error messages revealing path traversal), middleBrick can detect logical deviations consistent with injection. The scanner also checks for missing input validation and improper handling of XML external entities, which often coexist with XPath misuse.
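The differential idea behind such black-box tests can be sketched in a few lines. This is not middleBrick's implementation, only a minimal illustration of comparing a baseline response with responses to tautology payloads. The send parameter is a hypothetical caller-supplied async function that submits a userId to the endpoint under test and resolves to { status, body }, where body is the parsed JSON array.

```javascript
// Tautology payloads taken from the attack description above.
const PAYLOADS = ["' or 1=1 or '", "' or '1'='1"];

// `send` is a hypothetical async function: (userId) => Promise<{ status, body }>.
async function probeXPathInjection(send, benignId) {
  const baseline = await send(benignId);
  const baselineCount = Array.isArray(baseline.body) ? baseline.body.length : 0;
  const findings = [];
  for (const payload of PAYLOADS) {
    const probe = await send(payload);
    const probeCount = Array.isArray(probe.body) ? probe.body.length : 0;
    // More records for a tautology than for a valid id suggests the predicate was altered.
    if (probe.status === 200 && probeCount > baselineCount) {
      findings.push({ payload, baselineCount, probeCount });
    }
  }
  return findings;
}
```

An empty result does not prove the endpoint is safe. It only means these particular payloads did not change the record count, so probes like this should only be run against systems you are authorized to test and alongside code review.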
Because middleBrick supports OpenAPI/Swagger analysis, it cross-references spec definitions with runtime findings. If your Express API defines an endpoint that accepts an XML payload but does not explicitly describe constraints on identifiers, middleBrick’s 12 parallel checks (including Input Validation, Property Authorization, and Unsafe Consumption) will highlight inconsistencies and potential injection surfaces. The report includes severity-ranked findings and remediation guidance, allowing developers to quickly pinpoint vulnerable routes and understand the associated risks.
Express-Specific Remediation: code fixes using Express's native features and libraries
Remediation focuses on eliminating string concatenation in XPath construction. Prefer a library that supports parameterized XPath, or switch to a safer querying approach. The xpath package, for example, can parse an expression once and bind values to XPath variables (such as $userId) at evaluation time, so user input is never parsed as part of the expression. The most robust defense is still to avoid dynamic XPath built from untrusted input; if you must include identifiers, combine strict allow-lists with correct literal escaping.
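A minimal sketch of variable binding follows, assuming the xpath package's parse() evaluator, whose select() accepts a node and a variables map. The function name findUsersById and the choice to pass the library in as an argument (so the function can be exercised without an XML document) are illustrative assumptions, not part of the original route.

```javascript
// The expression text is a constant; the user-supplied value is bound as $userId.
const USER_BY_ID = '//user[id=$userId]';

// `xpathLib` is expected to be the `xpath` npm package; `doc` is a parsed XML document.
function findUsersById(xpathLib, doc, userId) {
  const evaluator = xpathLib.parse(USER_BY_ID); // could be parsed once and cached
  return evaluator.select({
    node: doc,
    // Bound as a string value, so quotes in the input cannot change the query structure.
    variables: { userId: String(userId) },
  });
}

// Usage inside a route (assumes `doc` was parsed beforehand):
//   const nodes = findUsersById(require('xpath'), doc, req.body.userId);
```

With this shape, a payload such as ' or 1=1 or ' is compared against the id element as an ordinary string and simply matches nothing.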
Example of a safer pattern that combines input validation with literal escaping:
const express = require('express');
const xpath = require('xpath');
const { DOMParser } = require('xmldom');
const { body, validationResult } = require('express-validator');

const app = express();
app.use(express.json()); // populate req.body from JSON requests

// XPath 1.0 has no escape character, so build a literal the input cannot break out of
const escapeXpathLiteral = (s) => {
  if (!s.includes("'")) return "'" + s + "'";
  if (!s.includes('"')) return '"' + s + '"';
  // Both quote types present: rejoin the apostrophe-free parts with a quoted apostrophe
  return "concat('" + s.split("'").join("', \"'\", '") + "')";
};

app.post('/profile', [
  body('userId').isInt({ min: 1 }), // allow-list: positive integers only
], (req, res) => {
  const errors = validationResult(req);
  if (!errors.isEmpty()) {
    return res.status(400).json({ errors: errors.array() });
  }
  const userId = req.body.userId;
  const xml = getXmlData();
  const doc = new DOMParser().parseFromString(xml, 'text/xml');
  const safeExpr = "//user[id=" + escapeXpathLiteral(String(userId)) + "]";
  const nodes = xpath.select(safeExpr, doc);
  res.json(nodes.map(n => ({
    id: n.getElementsByTagName('id')[0].textContent,
    name: n.getElementsByTagName('name')[0].textContent
  })));
});
Additional measures include enforcing a strict content-type policy for XML, disabling external entity resolution and limiting entity expansion to prevent XML external entity (XXE) and entity-expansion attacks, and employing allow-lists for expected identifier formats. Where feasible, consider replacing XPath with JSON-based APIs or server-side filtering to avoid XML-level injection risks entirely. middleBrick’s continuous monitoring (Pro plan) can help detect regressions by scanning on a configurable schedule, and the GitHub Action can fail builds if a new endpoint introduces unsafe XPath usage.
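The content-type policy and identifier allow-list can be expressed as small Express-style middleware. The sketch below is one possible shape: the function names, the 1 to 10 digit id pattern, and the idea of reading the id from req.params.userId are all assumptions for illustration. It relies only on Express's req.is() content-type check.

```javascript
// Allow-list for identifiers: 1 to 10 digits (an illustrative choice).
const ID_PATTERN = /^[0-9]{1,10}$/;

// Rejects requests whose body is not declared as XML.
function requireXml(req, res, next) {
  // req.is() returns a falsy value when the Content-Type does not match.
  if (!req.is('application/xml') && !req.is('text/xml')) {
    return res.status(415).json({ error: 'XML content type required' });
  }
  next();
}

// Rejects identifiers that do not match the allow-list before any XPath is built.
function requireValidUserId(req, res, next) {
  const id = String(req.params.userId ?? '');
  if (!ID_PATTERN.test(id)) {
    return res.status(400).json({ error: 'Invalid user id' });
  }
  next();
}

// Usage (hypothetical route):
//   app.post('/users/:userId/profile', requireXml, requireValidUserId, handler);
```

Placing these checks ahead of the handler means malformed identifiers are rejected with a 400 before they can reach any query-building code.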