
Unicode Normalization in ASP.NET with Bearer Tokens

Unicode Normalization in ASP.NET with Bearer Tokens — how this specific combination creates or exposes the vulnerability

Unicode normalization is a text processing operation that ensures equivalent sequences of characters are represented in a consistent binary form. In ASP.NET, incoming HTTP requests pass through model binding and authentication pipelines where bearer token strings are read from headers (typically Authorization: Bearer <token>). If the framework or application code does not normalize these token strings before comparison or cryptographic validation, an attacker can exploit canonicalization differences to bypass authentication or cause logical errors.

One common scenario involves token comparison. Many developers compare bearer tokens using simple string equality (e.g., string.Equals(token, expected, StringComparison.Ordinal)). If the client sends a token that contains Unicode characters in a composed form (e.g., using precomposed code points) while the stored expected token uses a decomposed form (or vice versa), the strings will not match even though they are visually and semantically identical. This mismatch can lead to authentication failures or, worse, if normalization is applied inconsistently across validation steps, it can be leveraged to make an invalid token appear valid.
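The mismatch is easy to reproduce in isolation. The sketch below uses an invented token value containing "é": once as the precomposed code point U+00E9 and once as "e" followed by the combining acute accent U+0301. Ordinal equality fails on the raw strings and succeeds only after both sides are normalized to the same form.

```csharp
using System;
using System.Text;

class NormalizationMismatchDemo
{
    static void Main()
    {
        // Hypothetical token values: visually identical, different code points.
        var composed   = "tok-caf\u00E9";   // 'é' as a single precomposed code point
        var decomposed = "tok-cafe\u0301";  // 'e' followed by a combining acute accent

        // Ordinal comparison sees two different binary sequences.
        Console.WriteLine(string.Equals(composed, decomposed, StringComparison.Ordinal)); // False

        // After normalizing both to NFC, the sequences are byte-identical.
        var a = composed.Normalize(NormalizationForm.FormC);
        var b = decomposed.Normalize(NormalizationForm.FormC);
        Console.WriteLine(string.Equals(a, b, StringComparison.Ordinal)); // True
    }
}
```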

Another vector arises when bearer tokens are used as dictionary keys, identifiers in claims, or inputs to operations like hashing. Inconsistent normalization can produce different hash values or dictionary lookups for what an application believes to be the same token. Attackers may craft tokens that exploit these differences to escalate privileges, impersonate users, or evade rate limiting or logging mechanisms. For example, consider a token that includes a non-ASCII character normalized differently depending on the API endpoint’s handling. An unauthenticated endpoint might accept the token after normalization, while an authenticated route applies a different normalization form, resulting in a bypass where one check passes but another fails.
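The lookup and hashing divergence can be shown with a small standalone sketch (the session store and token values are invented for illustration): a dictionary keyed by the precomposed encoding misses the decomposed variant, and SHA-256 digests of the two encodings differ, so any record keyed by hash would treat them as unrelated tokens.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

class LookupDivergenceDemo
{
    static void Main()
    {
        // Hypothetical session store keyed by raw (unnormalized) token strings.
        var sessions = new Dictionary<string, string>
        {
            ["caf\u00E9-token"] = "user-42"  // stored with precomposed 'é'
        };

        // The same logical token arrives in decomposed form: the lookup misses.
        var incoming = "cafe\u0301-token";
        Console.WriteLine(sessions.ContainsKey(incoming)); // False

        // The two encodings also hash differently, so rate-limit or audit
        // records keyed by hash would see two unrelated tokens.
        var h1 = SHA256.HashData(Encoding.UTF8.GetBytes("caf\u00E9-token"));
        var h2 = SHA256.HashData(Encoding.UTF8.GetBytes(incoming));
        Console.WriteLine(Convert.ToHexString(h1) == Convert.ToHexString(h2)); // False

        // Normalizing the incoming key first restores the expected lookup.
        Console.WriteLine(sessions.ContainsKey(incoming.Normalize(NormalizationForm.FormC))); // True
    }
}
```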

OWASP API Security Top 10 highlights injection and validation flaws; improper handling of Unicode is a subtle form of injection where the attacker manipulates encoding rather than syntax. The risk is especially pronounced in APIs that accept bearer tokens in query parameters or custom headers, where normalization may be applied at different layers (web server, framework, middleware). Because these issues are about string equivalence rather than syntax errors, they often evade simple pattern-based validation and require explicit normalization to resolve.

ASP.NET provides built-in mechanisms to mitigate these risks. By normalizing strings to a canonical form—typically NFC or NFD—before any comparison, storage, or cryptographic operation, developers ensure consistent behavior regardless of how the client encodes the token. This is critical for bearer tokens, which must be validated with high precision to prevent unauthorized access.

Bearer-Token-Specific Remediation in ASP.NET — concrete code fixes

To secure bearer token handling in ASP.NET, apply Unicode normalization at the earliest point where the token is read, and maintain that normalized form throughout all validation and comparison logic. Below are concrete code examples that demonstrate robust remediation.

Normalize on Ingestion

When extracting the bearer token from the Authorization header, normalize it immediately. This ensures downstream logic always works with a consistent representation.

using System;
using System.Text;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Authentication.JwtBearer;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddAuthentication(JwtBearerDefaults.AuthenticationScheme)
    .AddJwtBearer(options =>
    {
        options.Events = new JwtBearerEvents
        {
            OnMessageReceived = context =>
            {
                if (context.Request.Headers.TryGetValue("Authorization", out var authHeader))
                {
                    var headerValue = authHeader.ToString();
                    if (headerValue.StartsWith("Bearer ", StringComparison.OrdinalIgnoreCase))
                    {
                        // Extract the raw token and apply Unicode normalization
                        // before the JWT handler sees it.
                        var token = headerValue["Bearer ".Length..].Trim();
                        context.Token = NormalizeToken(token);
                    }
                }
                return Task.CompletedTask;
            },
            OnTokenValidated = context =>
            {
                // TokenValidatedContext exposes the parsed SecurityToken, not the
                // raw string, so re-read and normalize the raw header value here.
                var headerValue = context.Request.Headers.Authorization.ToString();
                var rawToken = headerValue.StartsWith("Bearer ", StringComparison.OrdinalIgnoreCase)
                    ? headerValue["Bearer ".Length..].Trim()
                    : headerValue;
                var normalizedToken = NormalizeToken(rawToken);
                // Illustrative placeholder value; in practice compare against a
                // stored token that was normalized with the same form.
                const string expectedIssuerToken = "e99a18c428cb38d5f260853678922e03"; // pre-normalized
                if (!string.Equals(normalizedToken, expectedIssuerToken, StringComparison.Ordinal))
                {
                    context.Fail("Invalid token after normalization");
                }
                return Task.CompletedTask;
            }
        };
    });

string NormalizeToken(string token)
{
    // Use FormC (NFC) for canonical composition; choose based on your security policy
    return token.Normalize(NormalizationForm.FormC);
}

var app = builder.Build();
app.UseAuthentication();
app.UseAuthorization();
app.Run();

Consistent Comparison and Storage

Ensure that any comparison, hashing, or storage uses the same normalization form. For example, if you store normalized tokens in a database, always normalize before lookup.

using System;
using System.Security.Cryptography;
using System.Text;

public class TokenService
{
    // Pin one canonical form (NFC here) and use it everywhere.
    private const NormalizationForm CanonicalForm = NormalizationForm.FormC;

    public string Normalize(string token) => token.Normalize(CanonicalForm);

    public bool Validate(string incoming, string stored)
    {
        // Normalize both sides with the same form, then compare in constant
        // time so the check does not leak how many bytes matched.
        var a = Encoding.UTF8.GetBytes(Normalize(incoming));
        var b = Encoding.UTF8.GetBytes(Normalize(stored));
        return CryptographicOperations.FixedTimeEquals(a, b);
    }

    public byte[] ComputeHash(string token)
    {
        var normalized = Normalize(token);
        using var sha256 = SHA256.Create();
        return sha256.ComputeHash(Encoding.UTF8.GetBytes(normalized));
    }
}

By normalizing tokens at ingestion and enforcing normalization in all comparisons, you eliminate canonicalization discrepancies that attackers could exploit. This approach aligns with secure coding practices and reduces the risk of authentication bypass related to Unicode handling.

Frequently Asked Questions

Why is Unicode normalization important for bearer tokens in ASP.NET?
Unicode normalization ensures that visually identical token strings with different binary representations are treated as equal. Without normalization, attackers can craft tokens that bypass authentication due to inconsistent comparison logic.

Which normalization form should I use for bearer tokens?
Use NFC (FormC) for canonical composition unless your security policy requires NFD. Apply the same form consistently across extraction, storage, and comparison to prevent canonicalization vulnerabilities.
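To see the two forms side by side, the standalone sketch below normalizes a precomposed "é": NFD expands it into a base letter plus combining mark, NFC keeps it as one code point, and both forms are idempotent, so applying the chosen form at every layer is always safe.

```csharp
using System;
using System.Text;

class FormChoiceDemo
{
    static void Main()
    {
        var s = "caf\u00E9"; // precomposed 'é' (U+00E9)

        var nfc = s.Normalize(NormalizationForm.FormC);
        var nfd = s.Normalize(NormalizationForm.FormD);

        Console.WriteLine(nfc.Length); // 4: 'é' remains a single code unit
        Console.WriteLine(nfd.Length); // 5: 'e' plus combining acute (U+0301)

        // Normalization is idempotent: reapplying the same form is a no-op,
        // so normalizing at multiple layers with one form never corrupts data.
        Console.WriteLine(nfc.Normalize(NormalizationForm.FormC) == nfc); // True
        Console.WriteLine(nfd.Normalize(NormalizationForm.FormD) == nfd); // True
    }
}
```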