Unicode Normalization in Express with Basic Auth
Unicode Normalization in Express with Basic Auth — how this specific combination creates or exposes the vulnerability
Unicode normalization attacks exploit the fact that the same logical string can have several different code point (and therefore byte) representations. In Express, when Basic Auth credentials are extracted from the Authorization header, developers often compare them directly against a database or in-memory store without normalizing them first. An attacker can then supply a canonically equivalent spelling of an identity that one layer treats as the same account and another treats as different, leading to authentication bypass or privilege confusion. For example, the character "é" can be represented as the single code point U+00E9 or as the decomposed sequence "e" + U+0301. If the server stores the normalized form but compares it against raw, unnormalized input, the two forms are treated as distinct credentials even though they render identically.
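The two representations of "é" described above can be checked directly in Node.js. This is a minimal sketch; the variable names are illustrative only.

```javascript
// Two encodings of the same visible character "é"
const composed = '\u00E9';     // single code point U+00E9
const decomposed = 'e\u0301';  // "e" followed by combining acute accent U+0301

// Plain equality compares UTF-16 code units, so the strings differ
const rawEqual = composed === decomposed;

// After normalizing both sides to the same form, they match
const nfcEqual = composed.normalize('NFC') === decomposed.normalize('NFC');

console.log(rawEqual, nfcEqual);                 // false true
console.log(composed.length, decomposed.length); // 1 2
```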
Basic Auth transmits credentials as the base64 encoding of username:password. Because base64 preserves the underlying bytes exactly, whichever Unicode form the client used for the username or password reaches the server unchanged. An attacker can craft a header such as Authorization: Basic am9zZcyBOnB3, which decodes to jose + U+0301 followed by :pw, so the username contains a decomposed "é". If the server decodes, splits on :, and compares without normalization, this username does not match a stored NFC "josé". If another layer (for example a database collation, a cache, or a downstream service) does normalize, the same input can resolve to the existing account. Either inconsistency produces erratic behavior that can be leveraged in authentication bypass or enumeration attacks.
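To see how the Unicode form survives the base64 round trip, the sketch below encodes the same username in composed and decomposed form and decodes both headers again. The helpers encodeBasic and decodeBasic and the sample credentials are hypothetical, introduced only for illustration.

```javascript
// Hypothetical helper: build a Basic Authorization header value
function encodeBasic(username, password) {
  return 'Basic ' + Buffer.from(`${username}:${password}`, 'utf8').toString('base64');
}

// Hypothetical helper: decode a Basic Authorization header value
function decodeBasic(header) {
  const decoded = Buffer.from(header.slice('Basic '.length), 'base64').toString('utf8');
  const separator = decoded.indexOf(':'); // split on the first colon only
  return {
    username: decoded.slice(0, separator),
    password: decoded.slice(separator + 1),
  };
}

const composedHeader = encodeBasic('jos\u00E9', 'pw');    // precomposed "é"
const decomposedHeader = encodeBasic('jose\u0301', 'pw'); // "e" + U+0301

const userA = decodeBasic(composedHeader).username;
const userB = decodeBasic(decomposedHeader).username;

console.log(composedHeader === decomposedHeader);               // false: different bytes on the wire
console.log(userA === userB);                                   // false: the difference survives decoding
console.log(userA.normalize('NFC') === userB.normalize('NFC')); // true: equal once normalized
```

The headers differ on the wire even though both usernames render as "josé", which is why the comparison has to happen after normalization rather than on the decoded bytes.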
Express middleware that parses headers manually or uses simple string operations is especially prone to these inconsistencies. Since normalization is not applied by default in JavaScript string comparisons, two visually identical strings may not be equal in memory. This discrepancy can be exposed through authentication routes that do not explicitly normalize both sides before comparison. Attackers can use this to test multiple equivalence forms—such as NFC, NFD, NFKC, NFKD—to discover which variant the server accepts, potentially bypassing controls that rely on exact string matching.
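The four normalization forms mentioned above can produce different strings for the same input. The following sketch, using an arbitrary sample string chosen for illustration, prints the result of each form so the differences are visible.

```javascript
// Sample with a compatibility ligature (U+FB01 "ﬁ") and a decomposed accent (e + U+0301)
const input = '\uFB01ance\u0301';

const results = {};
for (const form of ['NFC', 'NFD', 'NFKC', 'NFKD']) {
  const out = input.normalize(form);
  results[form] = out;
  // List the code points so visually identical outputs can be told apart
  const codePoints = [...out]
    .map((ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'))
    .join(' ');
  console.log(form, out.length, codePoints);
}
```

NFC and NFD only compose or decompose accents, while NFKC and NFKD also fold compatibility characters, so the ligature becomes the two letters "fi". For usernames, that folding can merge identities that were previously distinct, so the form should be a deliberate choice applied everywhere.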
The interaction with other security checks in middleBrick is relevant here. The scanner’s Input Validation checks examine how headers and credentials are parsed, and Unicode handling is part of that surface. The Authentication check evaluates whether the endpoint reliably verifies identity, and inconsistent normalization can lead to false negatives where weak comparison logic is not detected as a flaw during manual review. By normalizing inputs consistently and using constant-time comparisons, you reduce the risk of bypass via canonicalization attacks.
Basic Auth-Specific Remediation in Express — concrete code fixes
To secure Basic Auth in Express, normalize usernames and passwords before storage and before comparison. Use the built-in String.prototype.normalize(), or a library such as unorm on runtimes that lack it, so that canonically equivalent strings compare equal. Always compare credentials using a constant-time function to reduce exposure to timing attacks. Below is a minimal implementation for an Express route that uses HTTP Basic Authentication.
const crypto = require('crypto');
const express = require('express');

const app = express();

// NFC normalization via the built-in String.prototype.normalize();
// unorm.nfc(str) is an equivalent polyfill for runtimes that lack it.
// Choose one form and be consistent.
function normalize(str) {
  return str.normalize('NFC');
}

// Example user store with pre-normalized credentials.
// Plaintext passwords keep the example short; store salted hashes in production.
const users = new Map();
// users.set(normalize('alice'), normalize('p@ssw0rd'));

// Constant-time comparison to mitigate timing attacks. Hashing both values
// first gives crypto.timingSafeEqual the equal-length buffers it requires,
// so the length of the stored password is not leaked either.
function timingSafeEqual(a, b) {
  const hashA = crypto.createHash('sha256').update(a, 'utf8').digest();
  const hashB = crypto.createHash('sha256').update(b, 'utf8').digest();
  return crypto.timingSafeEqual(hashA, hashB);
}

function basicAuth(req, res, next) {
  const header = req.headers.authorization;
  if (!header || !header.startsWith('Basic ')) {
    res.set('WWW-Authenticate', 'Basic realm="api"');
    return res.status(401).send('Authorization required');
  }

  const base64 = header.slice('Basic '.length).trim();
  const decoded = Buffer.from(base64, 'base64').toString('utf8');

  // Split on the first colon only: the password may itself contain ':'
  const separator = decoded.indexOf(':');
  if (separator === -1) {
    return res.status(400).send('Invalid credentials format');
  }

  // Normalize both parts to the same form used when the credentials were stored
  const normalizedUser = normalize(decoded.slice(0, separator));
  const normalizedPass = normalize(decoded.slice(separator + 1));

  const storedPass = users.get(normalizedUser);
  if (storedPass === undefined || !timingSafeEqual(storedPass, normalizedPass)) {
    res.set('WWW-Authenticate', 'Basic realm="api"');
    return res.status(401).send('Invalid credentials');
  }

  req.user = normalizedUser;
  next();
}

app.get('/api/secure', basicAuth, (req, res) => {
  res.json({ message: `Authenticated as ${req.user}` });
});

app.listen(3000, () => console.log('Server running on port 3000'));
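As a usage illustration, the sketch below isolates the normalize-then-compare logic from Express so it can be exercised on its own. The register and verify helpers and the sample credentials are hypothetical, and plaintext storage is used only to keep the example short.

```javascript
const crypto = require('crypto');

// One normalization form, applied at storage time and at comparison time
const normalize = (str) => str.normalize('NFC');

const users = new Map();

// Hypothetical helper: store credentials in normalized form
function register(username, password) {
  users.set(normalize(username), normalize(password));
}

// Hypothetical helper: normalize the input, then compare in constant time
function verify(username, password) {
  const stored = users.get(normalize(username));
  if (stored === undefined) return false;
  const a = crypto.createHash('sha256').update(stored, 'utf8').digest();
  const b = crypto.createHash('sha256').update(normalize(password), 'utf8').digest();
  return crypto.timingSafeEqual(a, b);
}

// Registered with precomposed characters
register('jos\u00E9', 'caf\u00E9');

// Logging in with decomposed characters still succeeds
const sameIdentity = verify('jose\u0301', 'cafe\u0301');
// A genuinely different password is still rejected
const wrongPassword = verify('jose\u0301', 'cafe');

console.log(sameIdentity, wrongPassword); // true false
```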
Key takeaways: always normalize both the input and stored values using the same Unicode form (NFC or NFD), and avoid naive equality checks. Combine this with transport-layer encryption (HTTPS) since Basic Auth credentials are base64-encoded, not encrypted. middleBrick’s Authentication and Input Validation checks can help identify endpoints where normalization or comparison logic is inconsistent, and the CLI allows you to scan endpoints from the terminal with middlebrick scan <url> to detect such issues early.