Memory Leak in Cassandra
How Memory Leaks Manifest in Cassandra
A memory leak in a Cassandra backend typically appears as unbounded consumption of heap or off-heap memory during query execution. Common patterns include:
- Returning large result sets without a LIMIT clause, causing the coordinator to buffer entire rows in memory.
- Using ALLOW FILTERING on tables without a proper primary key, which forces full table scans and can exceed heap limits.
- Executing batches with overly large INSERT or UPDATE counts, leading to excessive write buffers.
- Improper use of counters or wide rows that grow without bound, especially when client drivers accumulate results.
- Long-running aggregations or GROUP BY queries that materialize intermediate data structures.
These issues map to CWE-400 (Uncontrolled Resource Consumption) and can be triggered by unauthenticated API calls that expose internal storage engines.
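The driver-side accumulation pattern above ("client drivers accumulate results") can be sketched in plain Java. The row iterator here is a stand-in for a real driver result set, not the Cassandra driver API; the point is the difference between draining everything into memory and capping what the client holds:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.IntStream;

public class ResultBuffering {
    // Stand-in for a driver's row iterator; a real one streams rows
    // from the coordinator page by page.
    static Iterator<String> fakeRows(int n) {
        return IntStream.range(0, n).mapToObj(i -> "row-" + i).iterator();
    }

    // Anti-pattern: drain the whole result set into memory
    // (analogous to calling ResultSet.all() on an unbounded query).
    static List<String> fetchAll(Iterator<String> rows) {
        List<String> all = new ArrayList<>();
        rows.forEachRemaining(all::add);
        return all; // grows without bound for large tables
    }

    // Safer: cap how many rows the client will ever hold at once.
    static List<String> fetchCapped(Iterator<String> rows, int limit) {
        List<String> page = new ArrayList<>(limit);
        while (rows.hasNext() && page.size() < limit) {
            page.add(rows.next());
        }
        return page;
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(fakeRows(1_000_000)).size());
        System.out.println(fetchCapped(fakeRows(1_000_000), 100).size());
    }
}
```

The first call buffers a million rows; the second never holds more than 100, which is the behavior a LIMIT clause or driver page size enforces for real queries.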
Cassandra‑Specific Detection
middleBrick detects potential memory‑leak risks by analyzing the unauthenticated API surface. The scanner:
- Parses CQL statements in request bodies and query parameters for missing LIMIT, unbounded SELECT *, or ALLOW FILTERING usage.
- Inspects request headers and JSON payloads for large batch_size values or excessive IN clause lengths.
- Monitors response size and timing; responses larger than a configurable threshold (e.g., >10 MB) trigger a high-severity finding.
- Cross‑references OpenAPI definitions to flag endpoints that accept unconstrained parameters.
Example of a detected pattern:
GET /api/v1/users?status=active&page=0&pageSize=0
middleBrick flags pageSize=0 as an effectively unbounded page size and suggests enforcing a maximum page size.
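A minimal server-side guard for this pattern clamps the client-supplied page size before any CQL query is built. This is a sketch; the method name and the cap of 100 are illustrative assumptions, not middleBrick APIs:

```java
public class PageSizeGuard {
    static final int MAX_PAGE_SIZE = 100; // hypothetical service-wide cap

    // Clamp a client-supplied pageSize: zero or negative values (which
    // naive handlers often treat as "unlimited") fall back to the cap,
    // as do values above it.
    static int clampPageSize(int requested) {
        if (requested <= 0 || requested > MAX_PAGE_SIZE) {
            return MAX_PAGE_SIZE;
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(clampPageSize(0));    // pageSize=0 no longer means "everything"
        System.out.println(clampPageSize(25));
        System.out.println(clampPageSize(5000));
    }
}
```

With this guard in place, the request shown above would execute with a page size of 100 instead of an unbounded scan.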
In practice, the scanner will raise a finding when it observes any of the following:
- Requests that reference tables with high cardinality without pagination.
- Batch statements exceeding 100 statements.
- Responses that contain more than 10,000 rows without a LIMIT.
These indicators are mapped to OWASP API Security Top 10 API4:2023 – Unrestricted Resource Consumption (formerly API4:2019 – Lack of Resources & Rate Limiting) when they lead to resource exhaustion.
Cassandra‑Specific Remediation
Remediation focuses on constraining memory usage at the query and driver level. Recommended steps include:
- Always append a LIMIT clause to queries that could return large result sets.
- Replace SELECT * with explicit column lists to avoid fetching unnecessary data.
- Remove ALLOW FILTERING when a proper secondary index exists; otherwise, redesign the data model.
- Set driver-level limits such as the page size (fetch size) and request timeout so the client never buffers beyond safe bounds.
- Cap batch sizes; a typical safe maximum is 100 statements per batch.
- Use prepared statements with bound variables to avoid dynamic query parsing that can lead to unbounded allocations.
Code example – adding a limit in the Java driver (DataStax driver 4.x style):
PreparedStatement stmt = session.prepare(
    "SELECT tenant_id, user_id, name FROM users WHERE tenant_id = ? LIMIT 100");
BoundStatement bound = stmt.bind(tenantId)
    .setConsistencyLevel(DefaultConsistencyLevel.ONE)
    .setPageSize(100); // rows fetched per page; the LIMIT above caps the total
ResultSet rs = session.execute(bound);
Example – restricting batch size in CQL:
BEGIN BATCH
INSERT INTO events (event_id, event_date) VALUES ('e1', '2024-01-01');
INSERT INTO events (event_id, event_date) VALUES ('e2', '2024-01-02');
-- Do not exceed 100 statements per batch
APPLY BATCH;
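Application code can enforce the 100-statement cap by chunking a long statement list before building batches. This is a plain-Java sketch; the statement strings are placeholders, and each sublist would become one BEGIN BATCH … APPLY BATCH block:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchChunker {
    static final int MAX_BATCH_STATEMENTS = 100;

    // Split an arbitrarily long list of statements into sublists of at most
    // MAX_BATCH_STATEMENTS; each sublist is executed as a single batch.
    static List<List<String>> chunk(List<String> statements) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < statements.size(); i += MAX_BATCH_STATEMENTS) {
            int end = Math.min(i + MAX_BATCH_STATEMENTS, statements.size());
            batches.add(statements.subList(i, end));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> stmts = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            stmts.add("INSERT INTO events (event_id) VALUES ('e" + i + "')");
        }
        List<List<String>> batches = chunk(stmts);
        System.out.println(batches.size());        // 3 batches: 100 + 100 + 50
        System.out.println(batches.get(2).size()); // 50
    }
}
```

Chunking keeps each batch within the coordinator's write-buffer comfort zone instead of sending one oversized batch that can blow past batch_size_fail_threshold.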
Finally, monitor memory usage in production with tools such as nodetool info and JMX metrics, and set alerts when heap usage approaches 80% of the configured maximum heap.
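A lightweight in-process complement to external monitoring is sampling heap usage through the JDK's standard MemoryMXBean. The 80% threshold below mirrors the alerting advice above; wiring the check into an actual alerting pipeline is left out:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapAlert {
    static final double ALERT_THRESHOLD = 0.80; // alert at 80% of max heap

    // Return true when heap usage crosses the alert threshold.
    static boolean heapPressure(MemoryUsage heap) {
        if (heap.getMax() <= 0) return false; // max heap can be undefined (-1)
        return (double) heap.getUsed() / heap.getMax() >= ALERT_THRESHOLD;
    }

    public static void main(String[] args) {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        System.out.println("heap used/max: " + heap.getUsed() + "/" + heap.getMax());
        // Synthetic reading (init=0, used=90, committed=100, max=100):
        // 90/100 is above the 80% threshold, so this prints true.
        System.out.println(heapPressure(new MemoryUsage(0, 90, 100, 100)));
    }
}
```

The same check can run on a scheduled executor and emit a warning log or metric, giving early notice before the JVM reaches OutOfMemoryError territory.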
Frequently Asked Questions
How can I tell if a Cassandra query is likely to cause a memory leak?
Look for missing LIMIT clauses, use of SELECT *, ALLOW FILTERING on large tables, or batch statements with more than 100 commands. middleBrick will flag these patterns during an unauthenticated scan and assign a high-severity finding.
What configuration changes prevent memory exhaustion in Cassandra APIs?
Set driver-level limits such as the page size, enforce LIMIT on all public endpoints, avoid unbounded IN clauses, and cap batch sizes. Additionally, restrict access to tables with high cardinality and use explicit column projections instead of SELECT *.