Unicode Normalization in Axum with JWT Tokens
Unicode Normalization in Axum with JWT Tokens — how this specific combination creates or exposes the vulnerability
Unicode normalization inconsistencies between JWT token handling and Axum routing or parameter extraction can create authorization bypass or injection-like conditions. In Axum, route extractors such as Path, Query, and Json bind directly to incoming request components. If a JWT is passed via an Authorization header and subsequently validated using a library that does not enforce a canonical normalization form, an attacker can exploit differences in how equivalent Unicode strings are represented.
For example, the same identifier can appear as a precomposed character (é, U+00E9) or as a decomposed sequence (e followed by a combining acute accent, U+0301). If the JWT library normalizes to NFC while Axum or an upstream middleware leaves strings in NFD (or vice versa), the canonicalization mismatch means two canonically equivalent strings can map to distinct internal representations. This can cause token binding errors where a token issued for one scope or subject is incorrectly accepted as valid for another due to mismatched comparison logic.
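A minimal sketch using the unicode-normalization crate illustrates the mismatch: the two encodings are byte-distinct and only compare equal once both sides are normalized to the same form.
use unicode_normalization::UnicodeNormalization;

fn main() {
    let precomposed = "caf\u{00E9}"; // "café" with the precomposed é
    let decomposed = "cafe\u{0301}"; // "café" as "e" plus a combining acute accent
    assert_ne!(precomposed, decomposed); // byte-distinct, visually identical
    let nfc_a: String = precomposed.nfc().collect();
    let nfc_b: String = decomposed.nfc().collect();
    assert_eq!(nfc_a, nfc_b); // equal once both are NFC-normalized
}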
Additionally, HTTP parameter pollution can occur when query or path parameters are normalized differently from the JWT claims extracted from the token. If authorization checks compare a normalized route parameter against a non-normalized claim or audience string, an attacker may supply visually identical but distinct Unicode inputs to reach unintended endpoints or gain elevated permissions. Such issues map to OWASP API Top 10 authentication and authorization flaws and can be surfaced by middleBrick scans, which test the unauthenticated attack surface and include checks for Input Validation and Authentication.
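For illustration, any such comparison should normalize both sides before testing equality; a hypothetical helper (the name and signature are ours, not from any library) might look like:
use unicode_normalization::UnicodeNormalization;

// Compare a route/query parameter against a JWT claim value after
// normalizing both sides to NFC, so canonically equivalent strings match.
fn same_identity(route_param: &str, claim_value: &str) -> bool {
    let param: String = route_param.nfc().collect();
    let claim: String = claim_value.nfc().collect();
    param == claim
}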
Because middleBrick runs 12 security checks in parallel—including Authentication, Input Validation, and Property Authorization—using the tool can help detect inconsistencies in how JWT tokens are handled across normalized versus non-normalized inputs. The scanner provides prioritized findings with severity and remediation guidance without requiring credentials, making it practical to uncover these subtle identity-mapping issues early in development.
JWT Token-Specific Remediation in Axum — concrete code fixes
To mitigate Unicode normalization issues with JWT tokens in Axum, enforce a single normalization form before any comparison or claim processing. Apply normalization at the boundary where the token string is first handled, and ensure all libraries, routes, and validation logic operate on the same canonical representation.
Example: normalize the token before verification and normalize claim values used in authorization decisions.
use axum::{
    async_trait,
    extract::{FromRequest, Request},
    http::StatusCode,
};
use jsonwebtoken::{decode, encode, Algorithm, DecodingKey, EncodingKey, Header, Validation};
use serde::{Deserialize, Serialize};
use unicode_normalization::UnicodeNormalization;

#[derive(Debug, Serialize, Deserialize)]
struct Claims {
    sub: String,
    role: String,
    exp: usize,
}
// Normalize incoming token to NFC before validation
fn normalize_and_verify(
    token: &str,
    key: &DecodingKey,
) -> Result<jsonwebtoken::TokenData<Claims>, jsonwebtoken::errors::Error> {
    let normalized: String = token.nfc().collect();
    let mut validation = Validation::new(Algorithm::HS256);
    validation.validate_exp = true;
    decode::<Claims>(&normalized, key, &validation)
}
// Custom extractor that ensures consistent normalization
pub struct NormalizedToken(pub Claims);
#[async_trait]
impl<S> FromRequest<S> for NormalizedToken
where
    S: Send + Sync,
{
    type Rejection = (StatusCode, String);

    async fn from_request(req: Request, _state: &S) -> Result<Self, Self::Rejection> {
        let auth_header = req
            .headers()
            .get("authorization")
            .ok_or((StatusCode::UNAUTHORIZED, "Missing authorization header".to_string()))?
            .to_str()
            .map_err(|_| (StatusCode::BAD_REQUEST, "Invalid header encoding".to_string()))?;
        let token = auth_header
            .strip_prefix("Bearer ")
            .ok_or((StatusCode::BAD_REQUEST, "Invalid bearer format".to_string()))?;
        // Placeholder key for the example; load the real secret from configuration
        let key = DecodingKey::from_secret(&[0u8; 32]);
        match normalize_and_verify(token, &key) {
            Ok(data) => Ok(NormalizedToken(data.claims)),
            Err(e) => Err((StatusCode::UNAUTHORIZED, format!("Invalid token: {:?}", e))),
        }
    }
}
// Usage in a route
async fn admin_route(NormalizedToken(claims): NormalizedToken) -> String {
    format!("Welcome, role: {}", claims.role) // role claim must also be normalized when set
}
// When issuing tokens, normalize identifiers used in claims (e.g., sub, username) to NFC
fn issue_token() -> String {
    let normalized_sub: String = "userÉ".nfc().collect();
    let claims = Claims {
        sub: normalized_sub,
        role: "admin".to_string(),
        exp: (chrono::Utc::now() + chrono::Duration::hours(1)).timestamp() as usize,
    };
    encode(
        &Header::default(),
        &claims,
        &EncodingKey::from_secret(&[0u8; 32]),
    )
    .expect("Failed to encode")
}
Key practices:
- Normalize all JWT strings to NFC (or consistent NFD) before verification and comparison.
- Normalize claim values (e.g., subject, username, roles) at issuance time so that comparisons remain consistent.
- Apply the same normalization in any authorization logic that compares claim values against route or query parameters (see the sketch after this list).
- Use crates like unicode-normalization to perform deterministic NFC/NFD transformations in Rust.
- Validate that no unchecked parameters influence JWT interpretation after normalization.
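As a sketch of the route-parameter point above (hypothetical handler and names, reusing the NormalizedToken extractor defined earlier), normalize the Path parameter before comparing it to the already-normalized sub claim:
use axum::{extract::Path, http::StatusCode};
use unicode_normalization::UnicodeNormalization;

async fn user_profile(
    Path(user_id): Path<String>,
    NormalizedToken(claims): NormalizedToken,
) -> Result<String, (StatusCode, String)> {
    // Normalize the route parameter to NFC so the comparison uses the
    // same canonical form as the NFC-normalized claim.
    let user_id: String = user_id.nfc().collect();
    if user_id != claims.sub {
        return Err((StatusCode::FORBIDDEN, "Token subject does not match requested user".to_string()));
    }
    Ok(format!("Profile for {}", claims.sub))
}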
These steps reduce the risk of bypass due to Unicode representation differences and align with secure handling of identity tokens in Axum applications.