CWE-120 in APIs
- CWE ID: CWE-120
- Category: Input Validation
- Severity: CRITICAL
- Short Name: Buffer Copy
What is CWE-120?
CWE-120, or Buffer Copy without Checking Size of Input, is a classic memory corruption vulnerability that occurs when data is copied into a fixed-size buffer without verifying that the destination buffer is large enough to hold the source data. This weakness allows attackers to overwrite adjacent memory, potentially leading to arbitrary code execution, data corruption, or application crashes.
The vulnerability manifests when an application uses functions like strcpy(), memcpy(), or similar copy operations without proper bounds checking. If the source data exceeds the destination buffer size, the excess data overflows into neighboring memory locations, corrupting the application's state.
According to MITRE's CWE database, this weakness has been responsible for numerous high-severity vulnerabilities across software systems. The fundamental issue is the mismatch between untrusted input size and fixed buffer allocation, creating an opportunity for memory corruption attacks.
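The overflow mechanics can be illustrated with a small simulation. JavaScript bounds-checks its own typed arrays, so this is only a model: the 16-byte "memory", the 8-byte name buffer, and the adjacent flag byte are assumptions made up for the example, and uncheckedCopy merely mimics what strcpy() does in C.

```javascript
// Simulated process memory: bytes 0-7 are a fixed-size name buffer,
// byte 8 is an adjacent "isAdmin" flag. The layout is illustrative only.
const NAME_OFFSET = 0;
const NAME_SIZE = 8;
const FLAG_OFFSET = NAME_OFFSET + NAME_SIZE;

function newMemory() {
  return new Uint8Array(16); // zero-filled, so the flag starts at 0
}

// Mimics strcpy(): copies every source byte and never consults NAME_SIZE.
function uncheckedCopy(memory, input) {
  const src = Buffer.from(input, "latin1");
  for (let i = 0; i < src.length; i++) {
    memory[NAME_OFFSET + i] = src[i];
  }
}

// Checked variant: rejects input that does not fit (one byte is reserved
// for a terminator, as a C string would need).
function checkedCopy(memory, input) {
  const src = Buffer.from(input, "latin1");
  if (src.length > NAME_SIZE - 1) {
    throw new RangeError(`input is ${src.length} bytes; buffer holds ${NAME_SIZE - 1}`);
  }
  memory.set(src, NAME_OFFSET);
}

const mem = newMemory();
uncheckedCopy(mem, "AAAAAAAA\x01"); // 9 bytes into an 8-byte buffer
console.log(mem[FLAG_OFFSET]);      // prints 1: the ninth byte landed on the flag
```

In real C code the overwritten neighbor might be a return address or a function pointer rather than a flag, which is how a crash-level bug can become code execution.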
CWE-120 in API Contexts
While CWE-120 is traditionally associated with low-level languages like C and C++, it can manifest in API contexts in several ways that API developers should understand:
- Binary Protocol Handling: APIs that process binary protocols (gRPC, Thrift, custom binary formats) may use unsafe buffer operations when parsing request bodies. A malicious client could craft oversized binary payloads to trigger buffer overflows.
- Memory-Unsafe Libraries: Node.js addons, Python C extensions, or other native modules used by APIs can contain buffer overflow vulnerabilities that affect the entire application.
- Deserialization Attacks: APIs that deserialize untrusted data using unsafe libraries may be vulnerable to buffer overflows during object reconstruction.
- Header Processing: APIs that parse custom headers or binary metadata without proper validation can be exploited through oversized header values.
Even in memory-safe languages like Python, Java, or JavaScript, the risk persists when these languages interface with native code such as C extensions or native addons. Unsafe operations like eval() or unsafe deserialization do not overflow buffers themselves, but they can give an attacker more ways to reach that native code. As a result, an API endpoint exposed to CWE-120 could allow attackers to execute arbitrary code on the server, potentially compromising the entire system.
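For binary protocol and header handling, the usual defense is to never trust a length field supplied by the client. The sketch below assumes a hypothetical framing format (a 4-byte big-endian length prefix followed by the payload) and an arbitrary 64 KiB cap; neither comes from a specific protocol.

```javascript
const HEADER_BYTES = 4;            // assumed: 4-byte big-endian length prefix
const MAX_FRAME_BYTES = 64 * 1024; // assumed upper bound for a single frame

// Builds a frame in the assumed format; used here only to produce test input.
function encodeFrame(payload) {
  const header = Buffer.alloc(HEADER_BYTES);
  header.writeUInt32BE(payload.length, 0);
  return Buffer.concat([header, payload]);
}

// Validates the declared length against both a hard cap and the bytes
// actually received before handing out a view of the payload.
function parseFrame(buf) {
  if (!Buffer.isBuffer(buf) || buf.length < HEADER_BYTES) {
    throw new RangeError("frame shorter than its length header");
  }
  const declared = buf.readUInt32BE(0);
  if (declared > MAX_FRAME_BYTES) {
    throw new RangeError(`declared length ${declared} exceeds ${MAX_FRAME_BYTES}`);
  }
  if (declared > buf.length - HEADER_BYTES) {
    throw new RangeError("declared length exceeds bytes actually received");
  }
  return buf.subarray(HEADER_BYTES, HEADER_BYTES + declared);
}

const frame = encodeFrame(Buffer.from("hello"));
console.log(parseFrame(frame).toString()); // prints "hello"
```

Doing both checks before any copy matters most when the validated payload is later passed to a native module, since that is where an unchecked length becomes a memory-safety problem.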
Detection
Detecting CWE-120 requires a multi-layered approach combining static analysis, dynamic testing, and runtime monitoring:
- Static Code Analysis: Tools like Coverity, SonarQube, or ESLint plugins can identify unsafe buffer operations and missing bounds checks in your codebase.
- Dynamic Application Security Testing (DAST): Tools like middleBrick perform black-box scanning of API endpoints, testing for buffer overflow vulnerabilities by sending oversized payloads to observe application behavior. middleBrick's Input Validation check specifically tests how APIs handle boundary conditions and oversized inputs.
- Fuzz Testing: Automated fuzzers generate random, oversized, or malformed inputs to discover buffer overflow conditions that might not be caught by manual testing.
- Runtime Monitoring: Application Performance Monitoring (APM) tools can detect anomalous memory usage patterns that might indicate buffer overflow exploitation attempts.
middleBrick's scanning approach is particularly valuable for API developers because it tests the actual running API without requiring source code access. The scanner sends boundary-testing payloads to each endpoint and analyzes the responses for signs of buffer overflow vulnerabilities, memory corruption, or application instability.
For example, middleBrick would test an API endpoint by sending payloads that exceed typical buffer sizes by 10%, 100%, and 1000%, then analyzing the response codes, error messages, and timing to identify potential vulnerabilities. The scanner's Inventory Management check also helps identify endpoints that might be processing binary data without proper validation.
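The boundary-testing idea can be sketched in a few lines. This is not middleBrick's implementation; boundaryPayloads and the 256-byte nominal size are hypothetical, chosen only to show what "exceeding a buffer size by 10%, 100%, and 1000%" looks like in practice.

```javascript
// Overshoot percentages taken from the description above.
const OVERSHOOT_PERCENTAGES = [10, 100, 1000];

// Hypothetical helper: returns one payload per overshoot percentage,
// each longer than nominalBytes by that percentage (rounded up).
function boundaryPayloads(nominalBytes, fill = "A") {
  return OVERSHOOT_PERCENTAGES.map((percent) => {
    const size = nominalBytes + Math.ceil((nominalBytes * percent) / 100);
    return { percent, size, payload: fill.repeat(size) };
  });
}

for (const { percent, size } of boundaryPayloads(256)) {
  console.log(`+${percent}%: ${size} bytes`);
}
```

A test harness would send each payload to the endpoint under test and record the status code, error body, and response time, looking for crashes, 5xx responses, or unusual latency.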
Remediation
Fixing CWE-120 vulnerabilities requires eliminating unsafe buffer operations and implementing proper input validation. Here are code-level fixes for common scenarios:
Safe String Handling in C/C++
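// Vulnerable code - CWE-120 present
#include <string.h>

char buffer[256];
strcpy(buffer, userInput); // No bounds checking: overflows when userInput holds 256 or more characters

// Secure replacement (note: long input is silently truncated)
char buffer[256];
strncpy(buffer, userInput, sizeof(buffer) - 1);
buffer[sizeof(buffer) - 1] = '\0'; // Ensure null termination

// Or use a safer alternative where available (BSD, macOS, glibc 2.38 and later)
strlcpy(buffer, userInput, sizeof(buffer));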
Input Validation in Memory-Safe Languages
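// Node.js API endpoint - vulnerable
app.post('/upload', (req, res) => {
  const data = req.body.data; // No size validation
  processData(data);
});

// Secure version (assumes data is expected to be a string)
app.post('/upload', (req, res) => {
  const data = req.body.data;
  // Reject missing or non-string values before reading .length
  if (typeof data !== 'string') {
    return res.status(400).json({ error: 'data must be a string' });
  }
  if (data.length > MAX_ALLOWED_SIZE) {
    return res.status(413).json({
      error: 'Payload too large'
    });
  }
  processData(data);
});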
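One subtlety in the check above is that a string's length counts UTF-16 code units, not bytes, so a limit meant to protect a byte buffer downstream can be undershot. The following sketch measures the encoded size instead; checkUploadData and the 1024-byte limit are hypothetical names and values for illustration.

```javascript
const MAX_ALLOWED_BYTES = 1024; // assumed limit for illustration

// Returns null when the value is acceptable, or a { status, error } object
// describing why it was rejected.
function checkUploadData(data, maxBytes = MAX_ALLOWED_BYTES) {
  if (typeof data !== "string") {
    return { status: 400, error: "data must be a string" };
  }
  const bytes = Buffer.byteLength(data, "utf8");
  if (bytes > maxBytes) {
    return { status: 413, error: `Payload too large (${bytes} bytes)` };
  }
  return null;
}

const sample = "\u00e9".repeat(600);     // 600 two-byte characters
console.log(sample.length);              // prints 600
console.log(checkUploadData(sample));    // rejected: 1200 bytes as UTF-8
```

A handler-level check like this runs only after the body has been parsed. In Express, the built-in body parsers such as express.json() also accept a limit option, which rejects oversized bodies before they reach the handler.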
Binary Protocol Safety
// Unsafe deserialization
function deserializeBinary(data) {
  const obj = binaryReader.read(data);
  return obj; // No validation
}

// Secure approach
function deserializeBinary(data) {
  if (data.length > MAX_BINARY_SIZE) {
    throw new Error('Binary data exceeds maximum size');
  }
  const obj = binaryReader.read(data);
  validateObjectStructure(obj); // Schema validation
  return obj;
}
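The validateObjectStructure call above is left undefined in the example. One possible shape for it is sketched below; the schema (a name string and an items array of integers) and both limits are invented for illustration and would be replaced by whatever the real message format requires.

```javascript
// Hypothetical schema limits; adjust to the actual message format.
const MAX_NAME_CHARS = 64;
const MAX_ITEMS = 100;

// Throws if the deserialized object does not match the expected shape
// or if any variable-length field exceeds its limit.
function validateObjectStructure(obj) {
  if (obj === null || typeof obj !== "object" || Array.isArray(obj)) {
    throw new TypeError("expected a plain object");
  }
  if (typeof obj.name !== "string" || obj.name.length > MAX_NAME_CHARS) {
    throw new RangeError(`name must be a string of at most ${MAX_NAME_CHARS} characters`);
  }
  if (!Array.isArray(obj.items) || obj.items.length > MAX_ITEMS) {
    throw new RangeError(`items must be an array of at most ${MAX_ITEMS} entries`);
  }
  for (const item of obj.items) {
    if (!Number.isInteger(item)) {
      throw new TypeError("items must contain only integers");
    }
  }
  return obj;
}

console.log(validateObjectStructure({ name: "report", items: [1, 2, 3] }).name);
```

Bounding every variable-length field, not just the overall payload, keeps a small message from expanding into an oversized structure further down the pipeline.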
General Best Practices
- Always validate input sizes before processing any data, regardless of source
- Use safe library functions that include bounds checking (e.g., strncpy instead of strcpy)
- Implement size limits at API boundaries (request body size, header size, etc.)
- Use memory-safe languages when possible, but remain vigilant about unsafe operations
- Apply principle of least privilege - limit what code can do if compromised
- Keep dependencies updated - many buffer overflow vulnerabilities are patched in library updates
middleBrick's Inventory Management check helps identify endpoints that might be processing binary data without proper validation, while the Input Validation check tests boundary conditions that could reveal buffer overflow vulnerabilities. The scanner provides specific findings with severity levels and remediation guidance, helping developers understand exactly which endpoints need attention and how to fix them.
Frequently Asked Questions
Can buffer overflows occur in memory-safe languages like Python or JavaScript?
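Yes, indirectly: the risk arises when these languages interface with native code such as Node.js addons or Python C extensions. The key is that the memory corruption happens in the native layer, but the API written in the high-level language can still be the attack vector.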