Regex DoS in Cassandra
How Regex DoS Manifests in Cassandra
In Cassandra, regex denial-of-service (ReDoS) vulnerabilities often arise in application code that processes user input through regular expressions before constructing CQL queries. Attackers exploit pathological regex patterns to cause excessive CPU consumption, leading to service degradation. For example, a REST endpoint accepting a user-supplied 'filter' parameter might apply a regex like ^(a+)+$ to validate input format. When given a crafted string such as aaaaaaaaaaaaaaaaaaaa!, the regex engine undergoes catastrophic backtracking, consuming disproportionate resources.
This becomes particularly dangerous in Cassandra-backed services where input validation occurs prior to query execution. Consider a user profile lookup service that validates email addresses using a complex regex before querying the users_by_email table. An attacker could supply an email-like string designed to trigger worst-case regex performance, tying up application threads while legitimate requests queue up. Unlike SQL injection, this attack doesn't require database access—it targets the application layer's processing logic.
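To make the cost of ^(a+)+$ concrete, the following is a minimal sketch of a hand-rolled backtracking matcher for that one pattern, with a counter for how many group attempts it makes. It is an illustration only: BacktrackingDemo and its methods are hypothetical names, and the code does not reproduce the internals of java.util.regex, which may apply optimizations that a naive backtracker lacks.

```java
final class BacktrackingDemo {
    private static long steps;

    // Naive backtracking for ^(a+)+$ from position pos:
    // try the longest run of 'a' first, then progressively shorter runs.
    static boolean matchGroups(String s, int pos) {
        int run = 0;
        while (pos + run < s.length() && s.charAt(pos + run) == 'a') {
            run++;
        }
        for (int k = run; k >= 1; k--) {
            steps++;                                   // one attempt at an (a+) group
            int next = pos + k;
            if (next == s.length()) return true;       // '$' matches at end of input
            if (matchGroups(s, next)) return true;     // try a further (a+) group
        }
        return false;                                  // every split failed: backtrack
    }

    // Counts group attempts for the input "a" * n followed by "!".
    static long countSteps(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append('a');
        sb.append('!');
        steps = 0;
        matchGroups(sb.toString(), 0);
        return steps;
    }
}
```

In this simplified model, each extra 'a' before the trailing '!' roughly doubles the number of attempts, because the matcher tries every way of splitting the run of 'a' characters into groups before it can report failure.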
Risk factors in Cassandra-backed services stem mainly from high-concurrency read/write patterns: when many requests share a bounded pool of application threads, a handful of stalled validations can exhaust the pool. Cassandra features such as lightweight transactions (LWT), materialized views, and secondary indexes do not evaluate application regexes themselves; requests that rely on them are affected only because they queue behind the same stalled application threads. The weakness class is cataloged as CWE-1333 (Inefficient Regular Expression Complexity) and applies to the Java-based backends common in Cassandra deployments.
Cassandra-Specific Detection
Detecting ReDoS in Cassandra-integrated applications requires scanning for user-controlled data flowing into regex evaluation functions. middleBrick identifies these vectors during its unauthenticated black-box scan by analyzing request parameters, headers, and body content for patterns indicative of dangerous regex usage. For instance, if a GET /api/users?search= endpoint reflects user input in validation logic, middleBrick probes it with regex-attack payloads designed to expose exponential backtracking behavior.
The scanner looks for telltale signs: response times that grow far faster than payload length, signs of CPU saturation such as 503 Service Unavailable responses, or anomalous behavior when payloads target nested quantifiers such as (a+)+ or (.*)*, overlapping alternation chains (a|ab|abc|...), or unbounded repetition (a+) over attacker-controlled input. middleBrick's LLM/AI security module also checks for regex patterns in prompt handling if the Cassandra-backed service includes AI components, though ReDoS detection focuses on general input validation paths.
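The "response time disproportionate to payload length" signal can be sketched as a simple ratio check over timing samples. This is a hypothetical illustration, not middleBrick's actual detection logic: the GrowthCheck class, the looksSuperLinear method, and the factor threshold are all assumptions made for the example.

```java
final class GrowthCheck {
    /**
     * Returns true if, for any consecutive pair of samples, the response time
     * grew more than `factor` times faster than the payload length did.
     * lengths[i] is the payload length and millis[i] the observed latency.
     */
    static boolean looksSuperLinear(int[] lengths, double[] millis, double factor) {
        if (lengths.length != millis.length) {
            throw new IllegalArgumentException("Sample arrays must have equal length");
        }
        for (int i = 1; i < lengths.length; i++) {
            if (lengths[i - 1] <= 0 || millis[i - 1] <= 0) {
                continue; // skip baselines that cannot form a ratio
            }
            double lengthRatio = (double) lengths[i] / lengths[i - 1];
            double timeRatio = millis[i] / millis[i - 1];
            if (timeRatio > factor * lengthRatio) {
                return true;
            }
        }
        return false;
    }
}
```

In practice such a check would need repeated measurements and a noise allowance, since network jitter alone can produce large latency ratios on small samples.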
Code example showing vulnerable Java code typical in Cassandra services:

    // Vulnerable: unbounded user input is fed to String.matches(), which recompiles
    // the pattern on every call and applies no length cap or time limit.
    // Returns the raw driver Row; mapping to a User object is omitted.
    public Row getUserByFilter(String filter) {
        if (!filter.matches("^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}(?:\\.[a-zA-Z]{2,})?$")) {
            throw new IllegalArgumentException("Invalid filter format");
        }
        // Proceed to Cassandra query using filter
        String cql = "SELECT * FROM users WHERE filter = ?";
        return session.execute(cql, filter).one();
    }

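Because the scan is black-box, middleBrick does not inspect this code directly. It flags the endpoint when the filter parameter shows the timing behavior expected of user input reaching regex evaluation (here, matches()) without a length cap, timeout, or complexity limit.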
Cassandra-Specific Remediation
Fixing ReDoS in Cassandra applications requires modifying input validation logic to prevent pathological regex evaluation—never by altering Cassandra configuration, as the flaw resides in the application layer. Effective strategies include: using timeout-enabled regex engines, simplifying patterns to avoid nested quantifiers, or replacing regex with deterministic checks where possible.
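As a small sketch of the "simplify the pattern" strategy, the nested quantifier in ^(a+)+$ can be flattened to ^a+$, or written with a possessive quantifier (a++) so the engine never gives back characters it has consumed. Both forms accept the same strings as the original. The SaferPatterns class and its agree helper are hypothetical names introduced for this example.

```java
import java.util.regex.Pattern;

final class SaferPatterns {
    // Original pattern with a nested quantifier.
    static final Pattern NESTED = Pattern.compile("^(a+)+$");
    // Equivalent pattern with the nesting removed.
    static final Pattern FLAT = Pattern.compile("^a+$");
    // Equivalent pattern using a possessive quantifier (no backtracking into a+).
    static final Pattern POSSESSIVE = Pattern.compile("^a++$");

    // True if both rewrites give the same verdict as the original pattern.
    // Only call this with short inputs, since NESTED may backtrack heavily.
    static boolean agree(String input) {
        boolean expected = NESTED.matcher(input).matches();
        return FLAT.matcher(input).matches() == expected
            && POSSESSIVE.matcher(input).matches() == expected;
    }
}
```

Possessive quantifiers and atomic groups change matching semantics in general, so a rewrite like this should be checked against representative valid and invalid inputs before it replaces the original pattern.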
For email validation (a common ReDoS vector), replace complex regex with length-checked, stepwise validation:

    // Remediated: stepwise validation runs in linear time and cannot backtrack.
    // Returns the raw driver Row; mapping to a User object is omitted.
    public Row getUserByFilter(String filter) {
        if (filter == null || filter.length() > 254) {
            throw new IllegalArgumentException("Invalid email length");
        }
        int atIndex = filter.indexOf('@');
        if (atIndex <= 0) {
            throw new IllegalArgumentException("Missing @");
        }
        String domain = filter.substring(atIndex + 1);
        if (domain.indexOf('.') <= 0) {
            throw new IllegalArgumentException("Invalid domain");
        }
        // Proceed to Cassandra query
        String cql = "SELECT * FROM users WHERE email = ?";
        return session.execute(cql, filter).one();
    }

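When regex is unavoidable, note that java.util.regex has no built-in match timeout: Matcher.matches() takes no arguments. A common workaround is to cap the input length and wrap the input in a CharSequence that checks a deadline on every charAt() call:

    import java.util.regex.Pattern;

    // Remediated: length cap plus a deadline enforced through the input wrapper
    private static final Pattern EMAIL_PATTERN = Pattern.compile(
        "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$");

    // The regex engine reads input through charAt(), so the deadline is checked throughout matching.
    private static final class DeadlineInput implements CharSequence {
        private final CharSequence s;
        private final long deadlineNanos;
        DeadlineInput(CharSequence s, long deadlineNanos) { this.s = s; this.deadlineNanos = deadlineNanos; }
        @Override public char charAt(int i) {
            if (System.nanoTime() - deadlineNanos > 0) throw new IllegalStateException("Regex timeout");
            return s.charAt(i);
        }
        @Override public int length() { return s.length(); }
        @Override public CharSequence subSequence(int a, int b) {
            return new DeadlineInput(s.subSequence(a, b), deadlineNanos);
        }
        @Override public String toString() { return s.toString(); }
    }

    // Returns the raw driver Row; mapping to a User object is omitted.
    public Row getUserByFilter(String filter) {
        if (filter == null || filter.length() > 254) {
            throw new IllegalArgumentException("Invalid email length");
        }
        long deadline = System.nanoTime() + 1_000_000_000L; // 1-second budget
        try {
            if (!EMAIL_PATTERN.matcher(new DeadlineInput(filter, deadline)).matches()) {
                throw new IllegalArgumentException("Invalid email format");
            }
        } catch (IllegalStateException e) {
            throw new IllegalArgumentException("Validation timeout");
        }
        // Proceed to Cassandra query
        String cql = "SELECT * FROM users WHERE email = ?";
        return session.execute(cql, filter).one();
    }
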
After remediation, validate fixes using middleBrick by rescanning the endpoint—it will confirm absence of ReDoS indicators through reduced response variance under attack payloads. This approach preserves Cassandra's performance characteristics while eliminating the application-layer attack vector.
Related CWEs: Input Validation
| CWE ID | Name | Severity |
|---|---|---|
| CWE-20 | Improper Input Validation | HIGH |
| CWE-22 | Path Traversal | HIGH |
| CWE-74 | Injection | CRITICAL |
| CWE-77 | Command Injection | CRITICAL |
| CWE-78 | OS Command Injection | CRITICAL |
| CWE-79 | Cross-site Scripting (XSS) | HIGH |
| CWE-89 | SQL Injection | CRITICAL |
| CWE-90 | LDAP Injection | HIGH |
| CWE-91 | XML Injection | HIGH |
| CWE-94 | Code Injection | CRITICAL |