Rate Limiting Bypass in Cassandra
How Rate Limiting Bypass Manifests in Cassandra
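Out of the box, Apache Cassandra does not enforce per‑client request‑rate limits at the CQL protocol level. (Releases from 4.1 onward ship an optional node‑wide native transport rate limiter, `native_transport_rate_limiting_enabled`, but it is disabled by default and does not distinguish one client from another.) The database is designed for high write throughput and relies on the application or an intermediary gateway to throttle incoming traffic. When an application assumes that a per‑connection limit or a simple IP‑based counter will protect the cluster, attackers can bypass that protection by exploiting the way Cassandra clients acquire and reuse connections.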
Typical bypass patterns include:
- Connection pool exhaustion: An application that builds a new `Session` for each request, instead of reusing a shared pooled session, opens fresh connections to the cluster on every call. By issuing many parallel requests, an attacker can overwhelm the node despite a per‑IP request counter.
- Prepared statement reuse: Cassandra caches prepared statements on the node. If an application prepares a statement once and then executes it repeatedly with different bound values, every execution carries the same statement ID. A control that watches for many distinct statements will not notice an attacker who varies only the bound values, so naive per‑statement counters are evaded.
- Batch statement splitting: A BATCH that contains many mutations is processed as a single unit on the coordinator. By splitting a large batch into many small batches sent over different connections, an attacker can stay under a per‑batch size limit while delivering the same total write volume.
- Token‑aware routing dispersion: The Java driver’s token‑aware policy routes requests to the replica that owns the partition key. By choosing partition keys that map to different tokens across the cluster, an attacker can spread load evenly, preventing any single node from hitting a local rate limit.
These techniques do not require database credentials or special privileges; they only need the ability to trigger CQL requests, which is exactly what a publicly exposed API that proxies to Cassandra provides.
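Several of the patterns above succeed because the limiter counts the wrong unit (connections, statements, or batches) rather than the work a client causes. The following is a minimal sketch of charging a per‑client budget in mutations, so that one 100‑row batch and ten 10‑row batches cost the same. The `WriteBudget` class, its fixed‑window algorithm, and the assumption that the API layer can attribute each request to a client ID and count its mutations are all illustrative, not taken from any driver or from Cassandra itself.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical per-client write budget: charges mutations, not requests or connections. */
final class WriteBudget {
    /** Usage for one client inside the current fixed window. */
    private static final class Window {
        long startMillis;
        int used;
    }

    private final int maxMutationsPerWindow;
    private final long windowMillis;
    private final LongSupplier clockMillis; // injectable so the logic can be tested without sleeping
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    WriteBudget(int maxMutationsPerWindow, long windowMillis, LongSupplier clockMillis) {
        this.maxMutationsPerWindow = maxMutationsPerWindow;
        this.windowMillis = windowMillis;
        this.clockMillis = clockMillis;
    }

    /** Returns true if the client may send {@code mutations} more writes in the current window. */
    boolean tryCharge(String clientId, int mutations) {
        if (mutations <= 0) {
            throw new IllegalArgumentException("mutations must be positive");
        }
        long now = clockMillis.getAsLong();
        Window w = windows.computeIfAbsent(clientId, id -> new Window());
        synchronized (w) {
            if (now - w.startMillis >= windowMillis) {
                // The previous window has expired: start a new one with an empty count.
                w.startMillis = now;
                w.used = 0;
            }
            if (w.used + mutations > maxMutationsPerWindow) {
                return false; // over budget, regardless of how the writes were split
            }
            w.used += mutations;
            return true;
        }
    }
}
```

In a handler, `tryCharge(clientId, batchSize)` would be called before any statement is bound, with `System::currentTimeMillis` as the clock. Because the key is the client ID, opening more connections or spreading partition keys across tokens does not buy extra budget.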
Cassandra-Specific Detection
Detecting a missing or bypassable rate limit in a Cassandra‑facing API involves observing whether the service returns the expected 429 Too Many Requests response when request volume exceeds a reasonable threshold, and whether any throttling headers (e.g., Retry-After) are present. Because Cassandra itself does not generate these responses, the check must be performed at the API layer.
middleBrick includes a dedicated rate‑limiting test as one of its 12 parallel checks. When a URL is submitted, the scanner:
- Sends a burst of HTTP requests (e.g., 20 requests in 200 ms) to the target endpoint.
- Records the status code and latency of each response.
- If the majority of responses are `200 OK` (or another success code) and no `429` or throttling header is observed, the test flags a potential rate‑limiting bypass.
- The result is reported in the dashboard with a severity rating and a short remediation note.
Because the test works at the HTTP layer, it is agnostic to whether the backend uses Cassandra, PostgreSQL, or any other store. However, when the scanner knows the endpoint proxies to Cassandra (e.g., from OpenAPI spec analysis or from observed error messages), it can add context such as “Cassandra‑backed endpoint shows no rate‑limiting enforcement”.
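The decision step of such a burst test can be expressed as a small function over the recorded responses. This is an illustrative sketch only, not middleBrick's implementation: the `BurstProbe` class, the `Sample` type, and the reading of "majority" as "more than half" are assumptions based on the steps listed above.

```java
import java.util.List;

/** Hypothetical classifier for the responses collected during a burst test. */
final class BurstProbe {
    /** One recorded response: the HTTP status and whether a Retry-After header was present. */
    static final class Sample {
        final int status;
        final boolean hasRetryAfter;

        Sample(int status, boolean hasRetryAfter) {
            this.status = status;
            this.hasRetryAfter = hasRetryAfter;
        }
    }

    /** Returns true when no throttling signal was seen and more than half the responses succeeded. */
    static boolean looksUnthrottled(List<Sample> samples) {
        if (samples.isEmpty()) {
            return false; // nothing observed, nothing to conclude
        }
        int successes = 0;
        for (Sample s : samples) {
            if (s.status == 429 || s.hasRetryAfter) {
                return false; // the endpoint signalled throttling at least once
            }
            if (s.status >= 200 && s.status < 300) {
                successes++;
            }
        }
        return successes * 2 > samples.size();
    }
}
```

A single `429` is treated as evidence that some limit exists; whether that limit can be bypassed by rotating keys or IP addresses is a separate question this check does not answer.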
Example CLI usage that would trigger this check:
middlebrick scan https://api.example.com/cassandra-proxy
The output includes a JSON field like:
{
  "checks": [
    {
      "name": "Rate Limiting",
      "passed": false,
      "findings": [
        {
          "severity": "medium",
          "description": "No 429 responses observed during burst test; possible bypass via connection pooling."
        }
      ]
    }
  ]
}
Cassandra-Specific Remediation
Since Cassandra does not throttle individual clients on its own, the fix must be applied where the API receives HTTP requests, before a CQL session is obtained from the driver. A common and effective approach is to apply a token‑bucket or leaky‑bucket rate limiter per client identifier (e.g., API key, IP address, or authenticated user).
The following Java example uses Resilience4j’s RateLimiter to protect a resource that acquires a Cassandra session from a shared pool and executes a prepared statement.
import io.github.resilience4j.ratelimiter.RateLimiter;
import io.github.resilience4j.ratelimiter.RateLimiterConfig;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import java.time.Duration;
import java.util.concurrent.ConcurrentHashMap;

public class CassandraApiHandler {
    // One rate limiter per API key (could also be per IP)
    private final ConcurrentHashMap<String, RateLimiter> limiters = new ConcurrentHashMap<>();
    // A single session and prepared statement shared by all requests
    private final CqlSession session;
    private final PreparedStatement insertStmt;

    public CassandraApiHandler(String contactPoint) {
        this.session = CqlSession.builder()
            .addContactPoint(contactPoint)
            .build();
        this.insertStmt = session.prepare(
            "INSERT INTO events (id, ts, payload) VALUES (?, ?, ?)"
        );
    }

    private RateLimiter getLimiter(String apiKey) {
        return limiters.computeIfAbsent(apiKey, k -> {
            RateLimiterConfig config = RateLimiterConfig.custom()
                .limitForPeriod(10) // 10 requests
                .limitRefreshPeriod(Duration.ofSeconds(1)) // per second
                .timeoutDuration(Duration.ofMillis(500)) // max wait for a permit
                .build();
            return RateLimiter.of(k + "-limiter", config);
        });
    }

    public void handleEvent(String apiKey, String id, long ts, String payload) {
        RateLimiter limiter = getLimiter(apiKey);
        // Wait at most timeoutDuration for a permit; false means none became available
        if (!limiter.acquirePermission()) {
            // Respond with 429 (implementation depends on your web framework)
            throw new RuntimeException("Too Many Requests");
        }
        // Proceed with Cassandra operation
        session.execute(insertStmt.bind(id, ts, payload));
    }
}
Key points in the example:
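- The rate limiter is keyed by a stable client identifier (the API key), so changing the source IP address does not reset the counter for that key. Each new key does get its own limiter, so key issuance must itself be controlled, and keys should be validated before a limiter is created for them; otherwise the map of limiters can grow without bound.
- The limiter check runs before any CQL traffic is sent to Cassandra. With the configured `timeoutDuration` of 500 ms the check may wait up to half a second for a permit; set the timeout to `Duration.ZERO` for a strictly non‑blocking check. If no permit is granted, the handler returns a `429 Too Many Requests` response.
- The Cassandra session and prepared statement are reused, eliminating the connection‑pool exhaustion vector.
- If you prefer not to depend on Resilience4j, a Guava `RateLimiter` can fill the same role. A `java.util.concurrent.Semaphore` caps the number of concurrent in‑flight requests rather than the request rate, so it complements a rate limiter but does not replace one.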
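For readers who want to see what a dependency‑free limiter involves, here is a minimal token‑bucket sketch using only the JDK. The `TokenBucket` class and its constructor parameters are hypothetical names chosen for illustration; a maintained library remains the safer choice in production.

```java
import java.util.function.LongSupplier;

/** Minimal token bucket; a sketch, not a drop-in replacement for a maintained library. */
final class TokenBucket {
    private final long capacity;
    private final double refillPerNano;
    private final LongSupplier nanoClock; // injectable so tests can control time
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start full so an initial burst up to capacity is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Non-blocking: takes one token if available, otherwise returns false immediately. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        // Add the tokens earned since the last call, never exceeding capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
        if (tokens < 1.0) {
            return false;
        }
        tokens -= 1.0;
        return true;
    }
}
```

One bucket per client identifier, for example `new TokenBucket(10, 10.0, System::nanoTime)` stored in a map keyed by API key, gives roughly the same policy as the Resilience4j configuration shown earlier, with a strictly non‑blocking check.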
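In addition to API‑level throttling, you can cap the bandwidth Cassandra uses for inter‑node streaming. In cassandra.yaml set:
stream_throughput_outbound_megabits_per_sec: 200
inter_dc_stream_throughput_outbound_megabits_per_sec: 100
These values cap the bandwidth used for streaming operations (e.g., repair, bootstrap). They do not throttle client writes; they keep background streaming from competing with client traffic for network capacity when the cluster is already absorbing a surge of writes. In Cassandra 4.1 and later the settings are named `stream_throughput_outbound` and `inter_dc_stream_throughput_outbound` and take a value with a unit suffix, so check the cassandra.yaml shipped with your version.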
By combining a strict API‑level rate limiter with responsible driver usage (session reuse, prepared statement caching) and, where appropriate, Cassandra’s internal traffic throttling, you close the most common bypass vectors that attackers exploit when targeting Cassandra‑backed services.
Related CWEs: resourceConsumption
| CWE ID | Name | Severity |
|---|---|---|
| CWE-400 | Uncontrolled Resource Consumption | HIGH |
| CWE-770 | Allocation of Resources Without Limits or Throttling | MEDIUM |
| CWE-799 | Improper Control of Interaction Frequency | MEDIUM |
| CWE-835 | Loop with Unreachable Exit Condition ('Infinite Loop') | HIGH |
| CWE-1050 | Excessive Platform Resource Consumption within a Loop | MEDIUM |