Unicode Normalization in Express with HMAC Signatures
Unicode Normalization in Express with HMAC Signatures: how this specific combination creates or exposes the vulnerability
Unicode normalization attacks exploit how different Unicode representations can produce identical-looking strings but distinct byte sequences. In Express applications that use HMAC signatures for request integrity, this becomes a security boundary issue: an attacker can supply a specially crafted payload that normalizes to the same logical value as the original signed data, yet bypasses signature verification.
Consider an Express route that validates an HMAC signature over a query parameter or header. If the application normalizes input only after signature verification, or skips normalization entirely, an attacker can exploit the gap between a composed character (e.g., é as the single code point U+00E9) and its canonically equivalent decomposed form (e followed by U+0301, the combining acute accent). The two forms render identically but have different UTF-8 byte representations, so they produce different HMAC outputs. When the HMAC covers raw bytes, a signature issued for one form fails for the other, which can reject legitimate requests. Conversely, if the server verifies the HMAC over the raw bytes and normalizes afterwards, two distinct signed inputs collapse to the same value downstream, so a signature that is valid for one spelling can end up authorizing an action on another. Inconsistent normalization across libraries makes both outcomes harder to predict.
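The byte-level difference can be reproduced in a few lines. The following is a minimal sketch using Node's built-in crypto module; the key and the sample string are assumptions chosen only for illustration.

```javascript
const crypto = require('crypto');

// Hypothetical key, for illustration only.
const key = 'demo-key';

const hmacHex = (value) =>
  crypto.createHmac('sha256', key).update(value, 'utf8').digest('hex');

const composed = 'caf\u00e9';    // é as the single code point U+00E9
const decomposed = 'cafe\u0301'; // e followed by U+0301 combining acute accent

// The two strings render identically but differ at the byte level.
const composedBytes = Buffer.from(composed, 'utf8');     // 5 bytes
const decomposedBytes = Buffer.from(decomposed, 'utf8'); // 6 bytes

// Different bytes give different signatures.
const rawSignaturesMatch = hmacHex(composed) === hmacHex(decomposed);

// After NFC normalization both inputs are the same string,
// so the signatures agree.
const normalizedSignaturesMatch =
  hmacHex(composed.normalize('NFC')) === hmacHex(decomposed.normalize('NFC'));

console.log({
  composedLength: composedBytes.length,
  decomposedLength: decomposedBytes.length,
  rawSignaturesMatch,
  normalizedSignaturesMatch,
});
```

The plain string comparison of hex digests is used here only to show the mismatch; a real verifier should compare signatures in constant time.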
The vulnerability is specific to the interaction of three elements:
- Express middleware that reads raw request inputs (query, headers, body) without canonicalizing before signing or verification.
- HMAC-based integrity schemes where the signature is computed over the exact byte sequence presented by the client, without normalization.
- Runtime behavior of libraries that may apply normalization inconsistently—for example, one library NFC-normalizing a string before comparison while another leaves the string untouched.
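How these elements combine can be illustrated with a deliberately flawed verifier. This is a sketch under stated assumptions: the key, the `records` store, and the `flawedLookup` function are hypothetical, and the scenario assumes the attacker can obtain a legitimate signature for a spelling they control.

```javascript
const crypto = require('crypto');

// Hypothetical key, for illustration only.
const key = 'demo-key';

const sign = (value) =>
  crypto.createHmac('sha256', key).update(value, 'utf8').digest();

// Hypothetical data store keyed by the NFC form of each identifier.
const records = new Map([['jos\u00e9', { owner: 'victim' }]]);

// Flawed verifier: checks the HMAC over the raw bytes,
// then normalizes only afterwards.
function flawedLookup(rawId, signature) {
  const expected = sign(rawId);
  if (
    signature.length !== expected.length ||
    !crypto.timingSafeEqual(signature, expected)
  ) {
    return null;
  }
  // Normalizing after verification maps distinct signed inputs
  // onto the same record.
  return records.get(rawId.normalize('NFC')) || null;
}

// The attacker holds a valid signature for the decomposed spelling only.
const attackerId = 'jose\u0301';
const attackerSignature = sign(attackerId);

const result = flawedLookup(attackerId, attackerSignature);
console.log(result);
```

The signature was never issued for the composed identifier, yet the lookup resolves to its record because normalization happens on the far side of the integrity check.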
Real-world impact aligns with common web risks like signature bypass and parameter tampering, similar in class to issues cataloged in the OWASP API Security Top 10. An attacker can forge requests that appear to carry a valid HMAC, potentially escalating trust in tampered parameters. This is especially relevant when HMACs are used to secure webhook payloads, API query parameters (e.g., ?id=123&signature=...), or tokens passed in headers.
To detect this class of issue, scanners should compare normalized representations before and after signature operations and verify that normalization occurs consistently and prior to any cryptographic comparison. Discrepancies between expected and actual normalized forms under the same HMAC key indicate a potential bypass path.
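A check of this kind can be sketched as a small probe. It assumes the test has access to the signing key, as in an application's own test suite; `probeVerifier`, `verifyRaw`, and `verifyNormalized` are hypothetical names, and the two verifiers stand in for whatever verification function is under test.

```javascript
const crypto = require('crypto');

// Hypothetical key, for illustration only.
const KEY = 'demo-key';

const hmac = (value) =>
  crypto.createHmac('sha256', KEY).update(value, 'utf8').digest();

// Constant-time check of a signature against the HMAC of a value.
const matches = (value, signature) => {
  const expected = hmac(value);
  return (
    signature.length === expected.length &&
    crypto.timingSafeEqual(expected, signature)
  );
};

// Two hypothetical verifiers under test.
const verifyRaw = (value, signature) => matches(value, signature);
const verifyNormalized = (value, signature) =>
  matches(value.normalize('NFC'), signature);

// Signs the NFC form, then presents both the NFC and NFD spellings.
// A verifier that normalizes before verification accepts both.
function probeVerifier(verify, value) {
  const nfc = value.normalize('NFC');
  const nfd = value.normalize('NFD');
  const signature = hmac(nfc);
  const acceptsNfc = verify(nfc, signature);
  const acceptsNfd = verify(nfd, signature);
  return { acceptsNfc, acceptsNfd, consistent: acceptsNfc === acceptsNfd };
}

console.log(probeVerifier(verifyRaw, 'caf\u00e9'));
console.log(probeVerifier(verifyNormalized, 'caf\u00e9'));
```

The probe is only informative for values whose NFC and NFD forms differ; pure ASCII input yields identical spellings and always reports a consistent result.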
HMAC Signatures-Specific Remediation in Express: concrete code fixes
Remediation centers on ensuring canonical input before HMAC computation and verification, and enforcing normalization as early as possible in the request lifecycle.
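Use a single, stable normalization form (NFC is common) on all user-controlled strings before including them in the HMAC scope. JavaScript strings have a built-in normalize method (String.prototype.normalize, part of ES2015) that accepts 'NFC', 'NFD', 'NFKC', or 'NFKD', so standard Node.js builds need no extra dependency. The unorm package is a polyfill intended for older runtimes that lack this method. Avoid the compatibility forms (NFKC, NFKD) unless compatibility folding is intended, because they rewrite more characters than the canonical forms do (for example, the ligature ﬁ becomes fi).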
const express = require('express');
const crypto = require('crypto');

const app = express();
// Placeholder only; load the real key from the environment or a secret store
const HMAC_KEY = 'your-256-bit-secret';

function normalize(str) {
  // NFC normalization; choose a single form for your application
  return str.normalize('NFC');
}

function computeHmac(payload) {
  const normalized = normalize(payload);
  return crypto.createHmac('sha256', HMAC_KEY).update(normalized, 'utf8').digest();
}

app.get('/resource', (req, res) => {
  // Repeated parameters parse as arrays, so accept plain strings only
  const rawId = typeof req.query.id === 'string' ? req.query.id : '';
  const providedSignature =
    typeof req.query.signature === 'string' ? req.query.signature : '';

  const expected = computeHmac(rawId);
  const provided = Buffer.from(providedSignature, 'hex');

  // Constant-time comparison to avoid timing attacks; timingSafeEqual
  // throws on buffers of different length, so check the length first
  const isValid =
    provided.length === expected.length &&
    crypto.timingSafeEqual(expected, provided);

  if (!isValid) {
    return res.status(401).send('Invalid signature');
  }

  // At this point, use normalizedId for further processing
  const normalizedId = normalize(rawId);
  res.json({ id: normalizedId, valid: true });
});

app.listen(3000, () => console.log('Server running on port 3000'));
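To enforce normalization earlier in the request lifecycle, the same step can be moved into middleware so that every route sees canonical values. The sketch below is one possible shape, not a definitive implementation: `normalizeDeep`, `normalizeQuery`, and the `normalizedQuery` property are hypothetical names. The result is stored on a separate property rather than written back to `req.query`, since reassigning `req.query` may not be possible in every Express version.

```javascript
// Recursively applies NFC to every string (keys and values)
// in a parsed query object.
function normalizeDeep(value) {
  if (typeof value === 'string') return value.normalize('NFC');
  if (Array.isArray(value)) return value.map(normalizeDeep);
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [
        k.normalize('NFC'),
        normalizeDeep(v),
      ])
    );
  }
  return value;
}

// Express-style middleware; `normalizedQuery` is a hypothetical property.
function normalizeQuery(req, res, next) {
  req.normalizedQuery = normalizeDeep(req.query);
  next();
}

// Registration, assuming an existing Express `app`:
// app.use(normalizeQuery);
```

Route handlers would then read from `req.normalizedQuery` for both HMAC verification and later processing, so that the signed value and the processed value cannot diverge.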
Key practices:
- Normalize before signing: Apply normalization to the exact string that will be included in the HMAC input.
- Normalize before verification: Ensure the verifier uses the same normalization routine on the incoming data before recomputing the HMAC.
- Use constant-time comparison (e.g., crypto.timingSafeEqual) to prevent timing side-channels.
- Document and enforce a single Unicode form across all services and libraries to avoid inconsistencies.
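One way to keep signer and verifier aligned is to route both through a single canonicalization function. This is a minimal sketch, assuming both sides can share a module; the key and the function names `canonical`, `sign`, and `verify` are hypothetical.

```javascript
const crypto = require('crypto');

// Hypothetical key for illustration; load a real key from configuration.
const KEY = 'demo-key';

// Single source of truth for canonicalization,
// shared by the signer and the verifier.
function canonical(value) {
  return value.normalize('NFC');
}

function sign(value) {
  return crypto
    .createHmac('sha256', KEY)
    .update(canonical(value), 'utf8')
    .digest('hex');
}

function verify(value, signatureHex) {
  const expected = Buffer.from(sign(value), 'hex');
  const provided = Buffer.from(String(signatureHex), 'hex');
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  return (
    provided.length === expected.length &&
    crypto.timingSafeEqual(expected, provided)
  );
}

// A signature issued for the composed spelling also verifies
// for the decomposed spelling of the same text.
const signature = sign('caf\u00e9');
console.log(verify('cafe\u0301', signature));
```

Because both paths call `canonical`, changing the chosen Unicode form later requires editing one function rather than auditing every call site.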
If you use the middleBrick CLI (middlebrick scan <url>), the scanner checks for inconsistent normalization relative to signature operations and flags cases where normalization occurs after signature verification or is omitted entirely. Teams on the Pro plan can enable continuous monitoring so that future changes to signing logic are automatically assessed.