Regex DoS in APIs
What is Regex DoS?
Regular expression denial of service (Regex DoS or ReDoS) is a vulnerability where an attacker can cause a server to hang or crash by submitting specially crafted input that is matched against a vulnerable regex pattern. This happens when a backtracking regex engine encounters input that forces it into super-linear, in the worst case exponential, matching time.
Consider this common regex pattern used for email validation:
    ^([a-zA-Z0-9._%+-]+)@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
This pattern appears harmless, but if an attacker submits input like:
    [email protected]
A backtracking regex engine may take seconds or minutes to process hostile input, depending on the pattern's complexity and the engine's implementation. In a web API context, this can tie up server resources, potentially leading to a denial of service.
The vulnerability stems from how regex engines handle certain patterns. When a regex contains nested quantifiers or ambiguous patterns, the engine may need to explore many possible match paths. With malicious input, this exploration can grow exponentially, consuming CPU cycles and blocking the thread.
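To make that growth concrete, the sketch below counts the ways a nested pattern such as (a+)+ can split a run of n 'a' characters into groups. The helper name countSplits is hypothetical, and the function only models the size of the search space; it does not reproduce how any particular engine is implemented. On input that ultimately fails to match (for example, a run of 'a' followed by '!'), a backtracking engine may have to try roughly this many splits before giving up.

```javascript
// Count the ways a run of n 'a' characters can be split into one or more
// non-empty groups, i.e. the candidate matches for the nested pattern (a+)+.
function countSplits(n) {
  if (n === 0) return 1; // nothing left to split: one complete way
  let total = 0;
  // Let the first group take 1..n characters, then split the remainder.
  for (let first = 1; first <= n; first++) {
    total += countSplits(n - first);
  }
  return total;
}

for (const n of [4, 8, 16]) {
  console.log(`n=${n}: ${countSplits(n)} possible splits`);
}
```

The count doubles with every extra character (2^(n-1) splits for n characters), which is why a few dozen characters of hostile input can be enough to stall a vulnerable pattern.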
How Regex DoS Affects APIs
In API endpoints, regex patterns are commonly used for input validation, authentication, and data processing. When these patterns are vulnerable to ReDoS, an attacker can exploit them without authentication or special privileges.
Common API scenarios include:
- Authentication endpoints validating usernames or tokens
- Search APIs processing query parameters
- File upload APIs validating filenames
- Configuration APIs processing user-defined patterns
The impact can be severe. A single malicious request can consume 100% of a CPU core for seconds or minutes. If an attacker sends multiple such requests concurrently, they can exhaust all available CPU resources, making the API unresponsive to legitimate users.
Consider a login API that validates JWTs with a regex pattern. An attacker can submit crafted tokens that drive the regex into catastrophic backtracking, consuming server resources and delaying or blocking other authentication attempts. Legitimate users may then see login timeouts, or the service may degrade completely.
The vulnerability is particularly dangerous because input validation often runs before the caller is authenticated, so any anonymous client can potentially exploit it. That makes ReDoS a low-effort, high-impact attack vector against APIs.
How to Detect Regex DoS
Detecting ReDoS vulnerabilities requires analyzing regex patterns for problematic constructs and testing them with malicious input. Key indicators include:
- Nested quantifiers like (a+)+ or (b*)*
- Overlapping patterns that create ambiguity
- Unbounded repetition with alternations
- Complex patterns with multiple backtracking points
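The first indicator in the list above can be approximated with a very crude source-level check. The sketch below is an assumption-laden heuristic, not a real analyzer: the names NESTED_QUANTIFIER and hasNestedQuantifier are hypothetical, and it only looks for a single-level group that contains + or * and is itself followed by a quantifier. It ignores nested groups, escapes, and character classes, so it can both miss vulnerable patterns and raise false alarms.

```javascript
// Crude heuristic: flag a parenthesised group that contains + or * and is
// itself followed by +, * or {. Nested groups and escapes are not handled.
const NESTED_QUANTIFIER = /\([^()]*[+*][^()]*\)[+*{]/;

function hasNestedQuantifier(pattern) {
  // Accept either a RegExp object or a pattern string.
  const source = pattern instanceof RegExp ? pattern.source : String(pattern);
  return NESTED_QUANTIFIER.test(source);
}

console.log(hasNestedQuantifier(/^(a+)+$/));      // true
console.log(hasNestedQuantifier('(b*)*'));        // true
console.log(hasNestedQuantifier(/^[a]+$/));       // false
console.log(hasNestedQuantifier(/^(a{1,100})$/)); // false
```

A check like this is best treated as a first-pass filter in code review; anything it flags still needs to be confirmed by timing the pattern against hostile input.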
Static analysis tools can flag suspicious patterns, but dynamic testing is essential. This involves feeding crafted input to regex patterns and measuring execution time. If a pattern takes significantly longer than expected with certain inputs, it may be vulnerable.
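A minimal sketch of such a dynamic test follows, using only Node.js built-ins. The helper names timeMatch, probe, and hostile are hypothetical, and the input sizes are deliberately small so the probe itself cannot hang; absolute timings depend on the machine and engine, so the useful signal is how quickly the time grows as the input gets longer.

```javascript
// Time a single regex test in milliseconds using a monotonic clock.
function timeMatch(regex, input) {
  const start = process.hrtime.bigint();
  const matched = regex.test(input);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched, elapsedMs };
}

// Probe a pattern with inputs of increasing size built by makeInput(n).
function probe(regex, makeInput, sizes) {
  return sizes.map((n) => ({ n, ...timeMatch(regex, makeInput(n)) }));
}

// A run of 'a' followed by a character that forces the match to fail.
const hostile = (n) => 'a'.repeat(n) + '!';

// Sizes are kept small on purpose: for the nested pattern, each extra
// character roughly doubles the work in a backtracking engine.
console.table(probe(/^(a+)+$/, hostile, [12, 16, 20]));
console.table(probe(/^a+$/, hostile, [12, 16, 20]));
```

If the elapsed time for the first pattern climbs steeply between sizes while the second stays flat, the first pattern deserves a rewrite or a runtime guard.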
middleBrick's approach to detecting ReDoS vulnerabilities includes runtime analysis of API endpoints. The scanner submits crafted input to regex patterns found in API validation logic, measuring execution time and resource consumption. It specifically tests for exponential backtracking scenarios that could lead to denial of service.
The scanner also analyzes OpenAPI specifications to identify regex patterns used in parameter validation, schema definitions, and security requirements. By cross-referencing these patterns with runtime behavior, middleBrick can identify APIs vulnerable to ReDoS attacks without requiring source code access.
For LLM/AI security, middleBrick includes additional checks for regex patterns that might be used in prompt processing or output filtering, as these can also be vulnerable to denial of service attacks when processing malicious input.
Prevention & Remediation
Preventing ReDoS requires both careful regex design and runtime safeguards. Here are concrete strategies:
1. Use safer regex patterns:
    // Vulnerable pattern (exponential backtracking possible)
    // (a+)+ matches one or more 'a', repeated one or more times
    // This can cause catastrophic backtracking
    const vulnerable = /^([a]+)+$/;
    // Safer alternatives
    const safer = /^[a]+$/; // Simple repetition, no nesting
    const safer2 = /^(a{1,100})$/; // Bounded repetition, no ambiguity
2. Implement time limits:
    // Safe regex matching
    // NOTE: 'regex-match-indeterminate' and its match()/TimeoutError exports are
    // illustrative; substitute whichever timeout-capable matcher you actually use.
    import { match, TimeoutError } from 'regex-match-indeterminate';
    const input = req.body.username;
    const pattern = /^([a-zA-Z0-9._%+-]+)@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
    try {
      const result = match(pattern, input, { timeout: 100 }); // 100ms timeout
      if (!result) {
        return res.status(400).json({ error: 'Invalid input' });
      }
    } catch (e) {
      if (e instanceof TimeoutError) {
        return res.status(500).json({ error: 'Server error processing request' });
      }
      throw e; // Do not swallow unrelated errors
    }
3. Use regex engines with safeguards:
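    // Node.js with timeout option
    // NOTE: 'safe-regex-engine' is an illustrative package name; the built-in
    // RegExp has no timeout option, so the import is renamed to avoid shadowing it.
    const { RegExp: SafeRegExp } = require('safe-regex-engine');
    const safePattern = new SafeRegExp('^([a]+)+$', { timeout: 500 }); // 500ms timeout
4. Input length limits: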
    app.post('/api/login', (req, res) => {
      const { token } = req.body;
      // Reject missing or non-string tokens before reading .length
      if (typeof token !== 'string' || token.length > 1000) { // Arbitrary but reasonable limit
        return res.status(400).json({ error: 'Invalid or overlong input' });
      }
      // Continue processing with regex validation
    });
5. Regular expression complexity analysis:
    // NOTE: 'regex-analyzer' and isPotentiallyUnsafe() are illustrative names
    // for a static regex checker.
    import { isPotentiallyUnsafe } from 'regex-analyzer';
    const pattern = /^([a]+)+$/;
    if (isPotentiallyUnsafe(pattern)) {
      console.warn('This regex pattern may be vulnerable to ReDoS');
    }
The most effective approach combines multiple strategies: use safe patterns, implement timeouts, validate input length, and monitor API performance for unusual CPU spikes that might indicate exploitation attempts.
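As one way to combine a length limit with a time limit without third-party packages, the sketch below runs the match through Node's built-in vm module, whose timeout option interrupts long-running scripts. The names guardedTest and MAX_INPUT_LENGTH are hypothetical, the cap of 1000 characters and the default of 100 ms are assumptions to tune per field, and returning null for "rejected or timed out" is just one possible convention.

```javascript
const vm = require('node:vm');

const MAX_INPUT_LENGTH = 1000; // assumed cap; tune per field

// Returns true/false for a completed match, or null when the input is
// rejected up front or the match runs longer than timeoutMs.
function guardedTest(regex, input, timeoutMs = 100) {
  if (typeof input !== 'string' || input.length > MAX_INPUT_LENGTH) {
    return null;
  }
  try {
    // Run the match inside a vm script so the timeout can interrupt it.
    return vm.runInNewContext('regex.test(input)', { regex, input }, { timeout: timeoutMs });
  } catch (err) {
    if (err && err.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT') {
      return null; // matching was interrupted
    }
    throw err; // unrelated failure: let the caller see it
  }
}

console.log(guardedTest(/^[a]+$/, 'aaa'));                      // completes normally
console.log(guardedTest(/^(a+)+$/, 'a'.repeat(40) + '!', 50));  // interrupted by the timeout
```

Creating a fresh context for every call adds overhead, so on hot paths a worker thread or a non-backtracking engine such as RE2 may be a better fit; the point here is only that the request handler regains control after a bounded delay.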
Real-World Impact
Regex DoS vulnerabilities have caused real-world service disruptions. In 2019, a popular JavaScript validation library was found vulnerable to ReDoS in its email validation function. Attackers could submit crafted email addresses that drove the validation function into catastrophic backtracking, consuming significant CPU resources.
Cloud services have also been affected. A 2021 incident involved a cloud provider's API gateway that used vulnerable regex patterns for request validation. Attackers discovered that certain crafted input could cause the gateway to become unresponsive, affecting thousands of customer applications. The incident required emergency patching and rate limiting to mitigate.
Open source projects frequently discover and fix ReDoS vulnerabilities. The Node.js ecosystem has seen multiple instances where popular packages contained vulnerable regex patterns. For example, CVE-2021-23315 affected a widely-used URL parsing library, where crafted URLs could cause denial of service in applications using the library for request validation.
Financial services APIs have been targeted with ReDoS attacks used as a smokescreen for other attacks. By causing service degradation through regex vulnerabilities, attackers can distract security teams while attempting other exploits. This multi-vector approach makes ReDoS particularly dangerous in production environments.
The OWASP API Security Top 10 covers this class of issue under resource exhaustion (API4: Lack of Resources & Rate Limiting in the 2019 edition, Unrestricted Resource Consumption in the 2023 edition), and input validation and fuzzing are the practices that surface ReDoS before attackers do. Organizations are increasingly including regex security in their API security testing, recognizing that even simple validation logic can become a critical vulnerability.