PII Leakage in CockroachDB
How PII Leakage Manifests in CockroachDB
In a CockroachDB‑backed API, personally identifiable information (PII) such as email addresses, phone numbers, or government IDs is often leaked when the database layer returns more columns than the API consumer needs. Because CockroachDB speaks the PostgreSQL wire protocol, many developers use generic SELECT * statements or ORM methods that load entire rows. An attacker who can reach an unauthenticated endpoint (or a low‑privilege user) can simply request the resource and receive the full row, exposing columns that were never meant to be visible.
Typical vulnerable patterns include:
- A Go handler that uses the pgx driver and runs db.Query(ctx, "SELECT * FROM users"), then marshals the result directly into JSON.
- A Node.js/Express route that uses the pg package with client.query('SELECT * FROM customers') and sends result.rows to the client.
- An ORM such as GORM or Hibernate configured to fetch all fields by default, e.g., DB.Find(&users) where the struct maps to every column in the table.
- CockroachDB changefeeds or CDC pipelines that stream the full row to downstream services without field-level filtering.
These patterns correspond to API3:2019 "Excessive Data Exposure" in the OWASP API Security Top 10, which the 2023 edition folds into API3:2023 "Broken Object Property Level Authorization". Real-world incidents, such as the accidental exposure of customer emails in a SaaS platform, often trace back to a missing column allowlist in the data access layer.
CockroachDB‑Specific Detection
middleBrick performs unauthenticated, black‑box scanning of the API surface. When it encounters an endpoint that returns JSON, it inspects the payload for data elements that match common PII patterns (email, phone, SSN, etc.) and compares them against the endpoint’s documented contract (if an OpenAPI spec is supplied). If the response contains fields that are not declared in the spec—or if the spec itself allows additionalProperties: true without restriction—middleBrick flags a "Data Exposure" finding with severity High.
For example, scanning a CockroachDB‑powered endpoint with the CLI:
middlebrick scan 'https://api.example.com/v1/users'
might produce a JSON excerpt like:
{
"findings": [
{
"id": "API4-EXPOSURE-001",
"name": "Excessive Data Exposure",
"severity": "high",
"description": "Response includes columns 'email', 'phone_number', and 'ssn' that are not documented in the OpenAPI spec.",
"remediation": "Limit the SELECT list to only required columns or create a view that excludes PII."
}
]
}
Because middleBrick does not need agents, credentials, or source code, it can detect this issue in staging or production environments simply by providing the public URL. The scanner’s 12 parallel checks include the "Data Exposure" module, which actively probes for over‑fetching and cross‑references any supplied OpenAPI/Swagger spec (versions 2.0, 3.0, 3.1) with the actual runtime response.
CockroachDB‑Specific Remediation
Fixing PII leakage in a CockroachDB‑backed service involves ensuring that the database layer returns only the data the API is authorized to expose. The following CockroachDB‑native techniques are effective:
- Explicit column selection – Replace SELECT * with a list of needed columns. In Go:
rows, err := db.QueryContext(ctx, "SELECT id, username, email FROM users WHERE id = $1", userID)
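- Omitting sensitive columns by name – SELECT * EXCLUDE (...) is syntax from engines such as DuckDB and Snowflake; neither PostgreSQL nor CockroachDB supports it, so list the columns you need and leave the sensitive ones out:
SELECT id, username, email, created_at FROM customers WHERE region = 'us-east';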
- Creating a security view that hides PII and granting access only to that view:
CREATE VIEW vw_public_customers AS
SELECT id, username, email, created_at
FROM customers;
GRANT SELECT ON TABLE vw_public_customers TO webapp_role;
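- Column-level privileges via roles – PostgreSQL accepts a column list in GRANT, but CockroachDB grants privileges at the table and view level, so check the GRANT documentation for your version before relying on this. Where it is unsupported, use the view approach above. The PostgreSQL form is:
GRANT SELECT (id, username, email) ON TABLE customers TO readonly_role;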
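Explicit column selection from the list above can also be enforced in the data access layer. The following Go sketch is an assumption-laden illustration: exposableColumns, selectList, and the customers column names are hypothetical. It builds the SELECT list only from an allowlist and rejects anything else.

```go
package main

import (
	"fmt"
	"strings"
)

// exposableColumns is a hypothetical per-table allowlist of columns the API may return.
var exposableColumns = map[string][]string{
	"customers": {"id", "username", "email", "created_at"},
}

// selectList builds an explicit SELECT for a table. Every identifier in the
// result has been matched against the allowlist, so raw input never reaches SQL.
func selectList(table string, requested []string) (string, error) {
	allowed, ok := exposableColumns[table]
	if !ok {
		return "", fmt.Errorf("table %q has no allowlist", table)
	}
	allowedSet := make(map[string]bool, len(allowed))
	for _, c := range allowed {
		allowedSet[c] = true
	}
	cols := make([]string, 0, len(requested))
	for _, c := range requested {
		if !allowedSet[c] {
			return "", fmt.Errorf("column %q is not exposable", c)
		}
		cols = append(cols, c)
	}
	if len(cols) == 0 {
		cols = allowed // default to the full allowlist, never to *
	}
	return "SELECT " + strings.Join(cols, ", ") + " FROM " + table, nil
}

func main() {
	q, err := selectList("customers", []string{"id", "email"})
	fmt.Println(q, err)

	_, err = selectList("customers", []string{"ssn"})
	fmt.Println(err)
}
```

Filter values such as the customer ID should still be passed as query parameters ($1, $2, ...) rather than concatenated into the string.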
At the API layer, apply defensive serialization: after fetching rows, strip or mask any unexpected fields before sending the response. For instance, in Node.js:
const safeRows = rows.map(r => ({
id: r.id,
username: r.username,
email: r.email
}));
res.json(safeRows);
Finally, enforce least‑privilege database users: the application should connect with a role that only has permission to the view or the column‑restricted table, preventing accidental over‑fetching even if the query string is altered.
Related CWEs: Data Exposure
| CWE ID | Name | Severity |
|---|---|---|
| CWE-200 | Exposure of Sensitive Information to an Unauthorized Actor | HIGH |
| CWE-209 | Generation of Error Message Containing Sensitive Information | MEDIUM |
| CWE-213 | Exposure of Sensitive Information Due to Incompatible Policies | HIGH |
| CWE-215 | Insertion of Sensitive Information Into Debugging Code | MEDIUM |
| CWE-312 | Cleartext Storage of Sensitive Information | HIGH |
| CWE-359 | Exposure of Private Personal Information to an Unauthorized Actor | HIGH |
| CWE-522 | Insufficiently Protected Credentials | CRITICAL |
| CWE-532 | Insertion of Sensitive Information into Log File | MEDIUM |
| CWE-538 | Insertion of Sensitive Information into Externally-Accessible File or Directory | HIGH |
| CWE-540 | Inclusion of Sensitive Information in Source Code | HIGH |
Frequently Asked Questions
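Does middleBrick need any credentials or agents to scan my CockroachDB API?
No. middleBrick performs unauthenticated, black-box scanning and needs no agents, credentials, or source code; you supply only the public URL of the endpoint.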
How can I verify that a view I created in CockroachDB is actually preventing PII exposure?
Query the view directly: SELECT * FROM vw_public_customers; If the result set omits columns such as ssn, phone_number, or email (depending on what you excluded), the view is working. You can also scan the endpoint with middleBrick; the scanner will report no excessive data exposure if the view limits the returned fields.