Regex DoS in Django with HMAC Signatures
Regex DoS in Django with HMAC Signatures: how this specific combination creates or exposes the vulnerability
A Regular Expression Denial of Service (Regex DoS, also written ReDoS) occurs when a pattern exhibits catastrophic backtracking, typically triggered by nested or overlapping quantifiers applied to untrusted input. In Django, this risk can arise during HMAC signature validation when signature extraction and verification rely on regular expressions that are not carefully constrained. Consider a view that parses an Authorization header with a pattern such as `Signature\s+(.+)` to capture the signature. That pattern is unanchored and its capture group is unbounded, so whatever the client sends flows into the checks that follow. A single character class with one quantifier, such as `[A-Za-z0-9\-_=\/]+`, is matched in roughly linear time on its own. The danger appears when such a class is wrapped in another quantifier or placed next to an overlapping one without atomic grouping or possessive quantifiers: a malformed input can then force the regex engine to explore an exponential number of paths.
When combined with HMAC signature workflows, the exposure occurs because the signature is typically a long, base64-like string that may include characters such as `+`, `/`, and `=`. A permissive regex that allows these characters but lacks strict boundaries can cause the engine to backtrack extensively when presented with crafted payloads containing many repeated or ambiguous characters. For instance, a pattern like `^(?:[A-Za-z0-9\/+=]+){1,100}$` nests one quantifier inside another. Applied to a long, repetitive string that ultimately fails to match (for example, a run of valid characters followed by one disallowed character), it can consume significant CPU because the engine retries many ways of dividing the same characters between the inner and outer repetition. This becomes a vector for resource exhaustion in an API gateway or Django view that processes untrusted requests before validating the HMAC.
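To give a rough sense of scale, the sketch below counts how many ways a run of `n` valid characters can be divided into between 1 and 100 non-empty groups, which is the space a backtracking engine may have to search before it can report a failed match for the nested pattern above. The helper name `split_count` is hypothetical, and the figure is an estimate of the search space rather than a measurement of any particular engine.

```python
from math import comb


def split_count(n: int, max_groups: int = 100) -> int:
    """Count the ways to cut n characters into 1..max_groups non-empty runs."""
    # Choosing k runs means choosing k - 1 cut points among the n - 1 gaps.
    return sum(comb(n - 1, k - 1) for k in range(1, min(n, max_groups) + 1))


if __name__ == "__main__":
    for n in (10, 20, 30, 40):
        print(n, split_count(n))
```

For inputs of up to 100 characters the count doubles with every extra character, so a few dozen bytes of attacker-controlled input are enough to make a failing match expensive.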
In a real-world scenario, an attacker could send requests with specially designed signature-like values that trigger pathological backtracking in middleware or view logic, even if the ultimate HMAC verification fails. Because a middleBrick scan covers such attack surfaces, including input validation and unsafe consumption patterns, it can identify these regex-related risks during an unauthenticated test. The scanner examines how signature extraction patterns behave under malformed input and flags expressions that can lead to excessive engine work. Proper remediation involves avoiding regex for parsing structured tokens when simpler string operations suffice, and ensuring that any regex still in use cannot backtrack pathologically: use atomic groups where the engine supports them, enforce strict length limits, and avoid nested quantifiers on untrusted data.
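If a regex must stay in the validation path, one possible shape is a single character class with a bounded quantifier, checked only after a cheap length test. This is a minimal sketch rather than a recommended standard: the names `MAX_SIGNATURE_LENGTH`, `SIGNATURE_RE`, and `looks_like_base64_signature` are hypothetical, and the 512-character cap is an assumption to tune against the digest size in use.

```python
import re

MAX_SIGNATURE_LENGTH = 512  # assumed cap; adjust to the expected digest size

# One character class with one bounded quantifier, then optional padding.
# The class excludes '=', so the two parts never compete for the same characters.
SIGNATURE_RE = re.compile(r"[A-Za-z0-9+/]{1,510}={0,2}")


def looks_like_base64_signature(value: str) -> bool:
    """Return True if value has the shape of a padded base64 string."""
    # Reject oversized input before the regex engine ever sees it.
    if len(value) > MAX_SIGNATURE_LENGTH:
        return False
    # fullmatch anchors both ends, so no stray prefix or suffix is accepted.
    return SIGNATURE_RE.fullmatch(value) is not None
```

Because there are no nested or adjacent overlapping quantifiers, a failing input costs at most one pass of backtracking over the string instead of an exponential search.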
HMAC Signatures-Specific Remediation in Django: concrete code fixes
To mitigate Regex DoS risks in Django when working with HMAC signatures, replace fragile regex parsing with deterministic string operations and strict validation. Use constant-time comparison functions to avoid timing attacks, and ensure signature extraction does not rely on backtracking-prone patterns.
Example of a vulnerable approach using regex for signature extraction:
```python
import re

from django.http import HttpRequest


def extract_signature_regex(request: HttpRequest) -> str | None:
    # Vulnerable: unanchored search with an unbounded capture that hands
    # unvalidated input to any later, backtracking-prone checks
    match = re.search(r'Signature\s+(.+)', request.META.get('HTTP_AUTHORIZATION', ''))
    if match:
        return match.group(1)
    return None
```
Safer alternative using string partitioning and strict validation:
```python
import base64
import hashlib
import hmac
from typing import Optional

from django.conf import settings
from django.http import HttpRequest, HttpResponse, HttpResponseBadRequest
from django.views.decorators.http import require_POST

# Characters permitted in a standard base64-encoded signature
ALLOWED_SIGNATURE_CHARS = frozenset(
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789/+='
)
MAX_SIGNATURE_LENGTH = 4096


def safe_hmac_digest(key: bytes, message: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()


def extract_signature_safe(auth_header: str) -> Optional[str]:
    # Avoid regex; plain string operations run in linear time
    if not auth_header.startswith('Signature '):
        return None
    # Split only on the first space so the header yields exactly two parts
    parts = auth_header.split(' ', 1)
    if len(parts) != 2:
        return None
    signature = parts[1].strip()
    # Enforce expected length and character set to prevent abuse
    if not signature or len(signature) > MAX_SIGNATURE_LENGTH:
        return None
    if not all(c in ALLOWED_SIGNATURE_CHARS for c in signature):
        return None
    return signature


@require_POST
def verify_hmac_view(request: HttpRequest) -> HttpResponse:
    auth = request.META.get('HTTP_AUTHORIZATION', '')
    signature = extract_signature_safe(auth)
    if signature is None:
        return HttpResponseBadRequest('Invalid authorization header')
    # Example: compute expected signature using request body and a shared secret
    secret = settings.SECRET_KEY.encode('utf-8')
    body = request.body
    expected = safe_hmac_digest(secret, body)
    expected_b64 = base64.b64encode(expected).decode('utf-8')
    # Both values are ASCII-only strings at this point, as compare_digest requires
    if not hmac.compare_digest(signature, expected_b64):
        return HttpResponseBadRequest('Invalid signature')
    # Proceed with business logic
    return HttpResponse('OK')
```

Key practices include:
- Avoid regex for token extraction; use `str.startswith` and `str.split` with a limit.
- Validate character set and length before using the signature in any comparison or cryptographic operation.
- Use `hmac.compare_digest` for constant-time comparison to prevent timing leaks.
- If OpenAPI specs are used, ensure path and header patterns do not employ open-ended quantifiers; document expected signature formats explicitly.
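These practices can be exercised outside a Django request cycle. The sketch below is a framework-free round trip that signs a body and then verifies the resulting header with only string operations and `hmac.compare_digest`. The names `SCHEME`, `MAX_HEADER_LENGTH`, `sign_body`, and `verify_header` are hypothetical, and the `Signature ` prefix and 4096-character cap are assumptions carried over from the example above.

```python
import base64
import hashlib
import hmac

SCHEME = "Signature "      # assumed header prefix
MAX_HEADER_LENGTH = 4096   # assumed cap on the whole header value


def sign_body(secret: bytes, body: bytes) -> str:
    """Build the Authorization header value for a request body."""
    digest = hmac.new(secret, body, hashlib.sha256).digest()
    return SCHEME + base64.b64encode(digest).decode("ascii")


def verify_header(header: str, secret: bytes, body: bytes) -> bool:
    """Check a header value without any regex."""
    if len(header) > MAX_HEADER_LENGTH or not header.startswith(SCHEME):
        return False
    presented = header[len(SCHEME):]
    # compare_digest only accepts str arguments when they are ASCII-only.
    if not presented.isascii():
        return False
    expected = sign_body(secret, body)[len(SCHEME):]
    return hmac.compare_digest(presented, expected)
```

A caller would send `sign_body(secret, body)` as the header value, and the server side would apply `verify_header` to the received header and raw body.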
Related CWEs: inputValidation
| CWE ID | Name | Severity |
|---|---|---|
| CWE-20 | Improper Input Validation | HIGH |
| CWE-22 | Path Traversal | HIGH |
| CWE-74 | Injection | CRITICAL |
| CWE-77 | Command Injection | CRITICAL |
| CWE-78 | OS Command Injection | CRITICAL |
| CWE-79 | Cross-site Scripting (XSS) | HIGH |
| CWE-89 | SQL Injection | CRITICAL |
| CWE-90 | LDAP Injection | HIGH |
| CWE-91 | XML Injection | HIGH |
| CWE-94 | Code Injection | CRITICAL |
Frequently Asked Questions
How can I test my Django HMAC signature regex for catastrophic backtracking?
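Time the pattern against inputs such as `A+/=` repeated thousands of characters, ideally ending in a character the pattern rejects so that the match fails, and measure execution time. If parsing time grows non-linearly, the regex is vulnerable. Replace regex with string operations or use the regex library with atomic groups `(?>...)`.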
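The measurement described in this answer could be scripted along the following lines. This is a small sketch, not a benchmark suite: the helper name `time_failing_match` is hypothetical, and the sizes are kept deliberately small because a vulnerable pattern can take a very long time on larger inputs.

```python
import re
import time


def time_failing_match(pattern: str, sizes: list[int]) -> list[tuple[int, float]]:
    """Time a pattern against growing inputs that end in a rejected character."""
    compiled = re.compile(pattern)
    results = []
    for n in sizes:
        # Base64-like filler followed by '!' so the overall match must fail.
        payload = "A+/=" * (n // 4) + "!"
        start = time.perf_counter()
        compiled.match(payload)
        results.append((n, time.perf_counter() - start))
    return results


if __name__ == "__main__":
    nested = r"^(?:[A-Za-z0-9/+=]+){1,100}$"  # nested quantifiers
    flat = r"^[A-Za-z0-9/+=]+$"               # single quantifier
    for name, pattern in (("nested", nested), ("flat", flat)):
        for n, seconds in time_failing_match(pattern, [8, 12, 16, 20]):
            print(f"{name:6} n={n:2} {seconds:.6f}s")
```

If the reported time for a pattern multiplies with each small increase in `n` while the single-quantifier pattern stays flat, that pattern is a candidate for replacement.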