CWE-502 in APIs
What is CWE-502?
CWE-502: Deserialization of Untrusted Data describes a vulnerability where an application deserializes data from an untrusted source without proper validation or integrity checks. This weakness allows attackers to craft malicious serialized objects that, when deserialized, can execute arbitrary code, cause denial of service, or bypass authentication mechanisms.
The core issue is that deserialization reconstructs objects in memory, and in many runtimes the serialized data itself decides which classes get instantiated. If an attacker controls that data, they can instantiate classes with dangerous side effects, trigger deserialization hooks (such as Java's readObject(), PHP's __wakeup(), or Python's __reduce__()) that perform unintended operations, or cause memory corruption in parsers written in memory-unsafe languages.
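To make the mechanism concrete, here is a minimal Java sketch of code running as a side effect of deserialization. NoisyPayload and HookDemo are hypothetical names invented for this illustration, and the "side effect" is just a boolean flag standing in for something dangerous.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class HookDemo {
    // Serializes an object with Java native serialization.
    static byte[] serialize(Object obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deserializes without any filter, as a vulnerable endpoint might.
    static Object unsafeDeserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = serialize(new NoisyPayload());
        NoisyPayload.hookRan = false;
        unsafeDeserialize(wire);
        System.out.println("hook ran during deserialization: " + NoisyPayload.hookRan);
    }
}

// Hypothetical class with a deserialization hook that has a side effect.
class NoisyPayload implements Serializable {
    private static final long serialVersionUID = 1L;
    static boolean hookRan = false; // stands in for a dangerous side effect

    // Called by ObjectInputStream while the object is being rebuilt.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        hookRan = true; // runs before the caller ever receives the object
    }
}
```

Running main prints "hook ran during deserialization: true". The caller never invoked anything on the payload; the hook ran inside readObject(), which is why validating the object after deserialization is too late.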
Common attack vectors include:
- Remote procedure calls accepting serialized objects
- Session storage using serialized objects
- API endpoints accepting binary or JSON payloads for object reconstruction
- Database fields containing serialized data
The severity of CWE-502 varies by language and framework. In Java, Python, and PHP, deserialization vulnerabilities can lead to remote code execution. In JavaScript/TypeScript environments, while direct code execution is less common, attackers can still trigger application crashes, data corruption, or logic bypasses.
CWE-502 in API Contexts
APIs face unique deserialization risks because they inherently accept data from untrusted sources. Several API-specific scenarios commonly lead to CWE-502 vulnerabilities:
Binary Protocol Endpoints - Some APIs use binary protocols like Protocol Buffers, MessagePack, or custom binary formats. While these formats offer performance benefits, they can introduce deserialization risks if the parser contains vulnerabilities or if the API accepts arbitrary binary data without validation.
Object Reconstruction Endpoints - APIs that accept serialized objects for state reconstruction are particularly vulnerable. For example, an endpoint that accepts a "UserSession" object might deserialize it directly, allowing an attacker to craft a malicious session object that executes code during construction.
GraphQL APIs - GraphQL implementations sometimes use JSON serialization for query execution contexts. An attacker could potentially manipulate the serialized execution context to alter query behavior or trigger unintended operations.
Event-Driven Architectures - APIs processing serialized events from message queues or event streams may deserialize untrusted data. If event producers are compromised or if the serialization format allows arbitrary object types, this creates a deserialization attack surface.
Configuration-as-Code APIs - APIs that accept serialized configuration objects (YAML, JSON, or custom formats) can be vulnerable if the deserialization process instantiates classes based on configuration data. This pattern appears in infrastructure-as-code tools and configuration management APIs.
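One common way to avoid instantiating classes named by incoming data is to map a small set of known type names to factories. The sketch below assumes a hypothetical Handler interface with two implementations; none of these names come from a real framework.

```java
import java.util.Map;
import java.util.function.Supplier;

class TypeRegistryDemo {
    // Hypothetical handler types used only for this illustration.
    interface Handler { String name(); }
    static final class EmailHandler implements Handler {
        public String name() { return "email"; }
    }
    static final class WebhookHandler implements Handler {
        public String name() { return "webhook"; }
    }

    // Allowlist: only these type names can ever be instantiated.
    private static final Map<String, Supplier<Handler>> REGISTRY = Map.of(
            "email", EmailHandler::new,
            "webhook", WebhookHandler::new);

    // Builds a handler from a type name taken from untrusted configuration.
    static Handler create(String typeName) {
        Supplier<Handler> factory = (typeName == null) ? null : REGISTRY.get(typeName);
        if (factory == null) {
            throw new IllegalArgumentException("Unsupported handler type: " + typeName);
        }
        return factory.get();
    }

    public static void main(String[] args) {
        System.out.println(create("email").name());
        try {
            create("java.lang.Runtime"); // never passed to Class.forName
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The design choice is that the untrusted string is only ever used as a map key, so an attacker cannot steer the application toward a class the developer did not list.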
Real-world examples include CVE-2017-5941 (the node-serialize package for Node.js, whose unserialize() function can be made to execute attacker-supplied JavaScript) and various deserialization vulnerabilities in Java applications, including Spring-based ones, that allow attackers to execute arbitrary code by sending crafted serialized objects.
Detection
Detecting CWE-502 requires both static analysis and dynamic testing approaches. Here's how to identify deserialization vulnerabilities in your APIs:
Static Code Analysis - Review your codebase for deserialization functions and their input sources. Look for patterns like:
ObjectInputStream.readObject() // Java
pickle.loads() // Python
unserialize() // PHP
JSON.parse() with reviver functions // JavaScript
Identify whether these functions receive data from external sources (HTTP requests, message queues, files) without validation.
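A first pass of this review can be automated with a simple text search. The following is a rough sketch, not a substitute for a real static analyzer: SinkScanner is a hypothetical helper, its patterns cover only three of the calls listed above, and it will also flag matches inside comments or string literals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SinkScanner {
    // Coarse patterns for well-known deserialization calls.
    private static final List<Pattern> SINKS = List.of(
            Pattern.compile("\\.readObject\\s*\\("),        // Java ObjectInputStream
            Pattern.compile("\\bpickle\\.loads?\\s*\\("),   // Python pickle
            Pattern.compile("\\bunserialize\\s*\\("));      // PHP unserialize

    // Returns the 1-based numbers of lines that contain a known sink.
    static List<Integer> findSinkLines(List<String> sourceLines) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < sourceLines.size(); i++) {
            for (Pattern sink : SINKS) {
                if (sink.matcher(sourceLines.get(i)).find()) {
                    hits.add(i + 1);
                    break; // report each line once
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "Object o = in.readObject();",
                "String s = \"ok\";",
                "$u = unserialize($_POST['d']);");
        System.out.println(findSinkLines(sample));
    }
}
```

Each hit still needs manual triage: the question that matters is whether the argument can be traced back to an HTTP request, message queue, or file that an attacker can influence.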
Dynamic Scanning - Use automated tools to probe your API endpoints for deserialization vulnerabilities. middleBrick's black-box scanning approach tests for deserialization weaknesses by:
- Sending serialized payloads to endpoints that accept JSON, XML, or binary data
- Checking for deserialization-related error messages that reveal framework details
- Testing for unsafe deserialization by sending crafted payloads that trigger different code paths
- Analyzing API responses for signs of successful deserialization attacks
middleBrick Scanning - middleBrick's 12-point security scan includes deserialization testing as part of its Input Validation category. The scanner:
- Automatically identifies endpoints that might perform deserialization
- Tests for common deserialization frameworks and their known vulnerabilities
- Checks for unsafe object reconstruction patterns
- Provides a security score (0-100) with specific findings about deserialization risks
- Offers remediation guidance with severity levels (Critical/High/Medium/Low)
Manual Testing Techniques - Security researchers use these techniques to test for deserialization vulnerabilities:
1. Modify JSON payloads to include unexpected object types
2. Send serialized objects with modified constructor parameters
3. Test with known deserialization gadget chains (if framework is identified)
4. Check for differences in error messages between valid and invalid input
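Before applying these techniques, a tester usually needs to recognize which serialization format a parameter carries. The sketch below guesses the format from well-known leading bytes: Java serialization streams start with 0xAC 0xED 0x00 0x05 ("rO0AB" when Base64-encoded), Python pickle protocol 2 and later starts with 0x80 followed by the protocol number, and PHP serialized objects, arrays, and strings start with a prefix such as O:8:. FormatSniffer is a hypothetical helper and the heuristics are deliberately coarse.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Pattern;

class FormatSniffer {
    // PHP serialize() output for objects, arrays, and strings, e.g. O:8:"stdClass":0:{}
    private static final Pattern PHP_PREFIX = Pattern.compile("^[Oas]:\\d+:");

    // Guesses the serialization format from the first bytes of a payload.
    static String guessFormat(byte[] data) {
        if (data.length >= 4
                && (data[0] & 0xFF) == 0xAC && (data[1] & 0xFF) == 0xED
                && data[2] == 0x00 && data[3] == 0x05) {
            return "java-serialization";
        }
        if (data.length >= 2 && (data[0] & 0xFF) == 0x80 && data[1] >= 2 && data[1] <= 5) {
            return "python-pickle";
        }
        String text = new String(data, StandardCharsets.ISO_8859_1);
        if (PHP_PREFIX.matcher(text).find()) {
            return "php-serialize";
        }
        return "unknown";
    }

    // Tries the raw value first, then its Base64-decoded form.
    static String guessParameterFormat(String value) {
        String raw = guessFormat(value.getBytes(StandardCharsets.ISO_8859_1));
        if (!raw.equals("unknown")) {
            return raw;
        }
        try {
            return guessFormat(Base64.getDecoder().decode(value));
        } catch (IllegalArgumentException notBase64) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(guessParameterFormat("rO0ABQ=="));
        System.out.println(guessParameterFormat("O:8:\"stdClass\":0:{}"));
        System.out.println(guessParameterFormat("{\"userId\":1}"));
    }
}
```

A match only suggests where to look; it does not prove that the server deserializes the value unsafely.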
Framework-Specific Testing - Different frameworks have different deserialization behaviors:
- Java Spring Boot: Test for gadget chains reachable through libraries on the application's classpath
- Node.js: Test for deserialization libraries that evaluate code (eval(), Function(), or the vm module)
- PHP: Test for unserialize() with user-controlled data
- Python: Test for pickle.loads() with external data
Remediation
Fixing CWE-502 requires a defense-in-depth approach. Here are proven remediation strategies with code examples:
1. Input Validation and Whitelisting - Never deserialize arbitrary objects. Validate input type and structure before deserialization:
// Insecure - accepts any serialized object
Object deserialized = serializer.deserialize(input);

// Secure - validates input format first
// (sessionSerializer, containsUnsafeCharacters, and InvalidFormatException are application-specific)
public UserSession deserializeSession(String input) {
    if (input == null || !isValidSessionFormat(input)) {
        throw new InvalidFormatException();
    }
    return sessionSerializer.deserialize(input);
}

private boolean isValidSessionFormat(String input) {
    // Coarse pre-check of the expected JSON shape: field names, order, and numeric types
    return input.matches("\\{\"userId\":\\d+,\"timestamp\":\\d+,\"data\":\\{.*\\}\\}")
            && !containsUnsafeCharacters(input);
}
2. Use Safe Serialization Formats - Prefer formats that don't allow arbitrary object instantiation:
// Instead of Java serialization, use JSON with strict typing (Gson shown here)
public UserSession fromJson(String json) {
    JsonObject obj = JsonParser.parseString(json).getAsJsonObject();
    // Require every field up front so a missing "data" cannot cause a NullPointerException
    if (!obj.has("userId") || !obj.has("timestamp") || !obj.has("data")) {
        throw new InvalidFormatException();
    }
    // The typed getters throw if a field has the wrong JSON type
    return new UserSession(
        obj.get("userId").getAsInt(),
        obj.get("timestamp").getAsLong(),
        obj.get("data").getAsJsonObject()
    );
}
3. Implement Deserialization Filters - In Java 9+, use deserialization filtering to restrict allowed classes:
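import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;

public class SafeDeserialization {
    // Allow only classes in the com.example package; reject everything else.
    // JDK classes your objects contain (for example java.lang.Integer) must be added explicitly.
    private static final ObjectInputFilter FILTER = info -> {
        Class<?> cls = info.serialClass();
        if (cls == null) {
            // Not a class check (for example a depth or reference-count check)
            return ObjectInputFilter.Status.UNDECIDED;
        }
        return cls.getName().startsWith("com.example.")
                ? ObjectInputFilter.Status.ALLOWED
                : ObjectInputFilter.Status.REJECTED;
    };

    public static Object safeDeserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            ois.setObjectInputFilter(FILTER); // must be set before the first readObject()
            return ois.readObject();
        }
    }
}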
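As a runnable illustration of per-stream filtering, the sketch below serializes two objects and shows that only the allowlisted class survives deserialization. Session, Unexpected, and FilterDemo are hypothetical names made up for this example; it assumes Java 9 or later and no JVM-wide serialization filter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class FilterDemo {
    // Hypothetical DTOs used only for this demonstration.
    static final class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        final long userId;
        Session(long userId) { this.userId = userId; }
    }
    static final class Unexpected implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Accepts exactly one class and rejects every other class.
    static final ObjectInputFilter SESSION_ONLY = info -> {
        Class<?> cls = info.serialClass();
        if (cls == null) {
            return ObjectInputFilter.Status.UNDECIDED; // not a class check
        }
        return cls == Session.class
                ? ObjectInputFilter.Status.ALLOWED
                : ObjectInputFilter.Status.REJECTED;
    };

    static byte[] serialize(Object obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if the filtered stream accepts the payload, false if the filter rejects it.
    static boolean isAccepted(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(SESSION_ONLY);
            in.readObject();
            return true;
        } catch (InvalidClassException rejectedByFilter) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Session accepted: " + isAccepted(serialize(new Session(42L))));
        System.out.println("Unexpected accepted: " + isAccepted(serialize(new Unexpected())));
    }
}
```

The first line reports true and the second false: the rejected payload fails with an InvalidClassException before an Unexpected instance is ever created.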
4. Use Language-Specific Safe Alternatives - Avoid dangerous functions:
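# Python - avoid pickle, use safer alternatives
import json
from dataclasses import dataclass

@dataclass
class UserSession:
    user_id: int
    timestamp: int
    data: dict

def safe_deserialize(json_str: str) -> UserSession:
    try:
        session = UserSession(**json.loads(json_str))
    except (json.JSONDecodeError, TypeError) as e:
        raise ValueError("Invalid session data") from e
    # Dataclasses do not enforce annotations at runtime, so check types explicitly
    if not (isinstance(session.user_id, int)
            and isinstance(session.timestamp, int)
            and isinstance(session.data, dict)):
        raise ValueError("Invalid session data")
    return session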
5. Implement Integrity Checks - Add cryptographic signatures to serialized data:
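import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.SecretKey;

public class SignedSerializer {
    private final Mac mac; // Mac is not thread-safe; do not share an instance across threads

    public SignedSerializer(SecretKey key) throws GeneralSecurityException {
        this.mac = Mac.getInstance("HmacSHA256");
        this.mac.init(key);
    }

    public String serializeWithSignature(Object obj) throws Exception {
        byte[] data = serialize(obj);
        byte[] signature = mac.doFinal(data);
        return Base64.getEncoder().encodeToString(data) + ":" +
                Base64.getEncoder().encodeToString(signature);
    }

    public Object deserializeWithVerification(String signedData) throws Exception {
        String[] parts = signedData.split(":");
        if (parts.length != 2) throw new InvalidFormatException();
        byte[] data = Base64.getDecoder().decode(parts[0]);
        byte[] signature = Base64.getDecoder().decode(parts[1]);
        // Constant-time comparison; verify before any deserialization happens
        if (!MessageDigest.isEqual(mac.doFinal(data), signature)) {
            throw new InvalidSignatureException();
        }
        return deserialize(data);
    }

    // serialize(), deserialize(), InvalidFormatException, and InvalidSignatureException
    // are application-specific and not shown here.
}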
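The sign-then-verify idea can be tried in isolation with the JDK's HMAC support. This is a minimal sketch under stated assumptions: SignedPayloadDemo is a hypothetical helper, the payload is a plain string rather than a serialized object, and the hard-coded key is for demonstration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class SignedPayloadDemo {
    // Computes HMAC-SHA256 over the data with the given key.
    private static byte[] hmac(byte[] key, byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Produces "base64(payload):base64(signature)".
    static String sign(byte[] key, String payload) {
        byte[] data = payload.getBytes(StandardCharsets.UTF_8);
        Base64.Encoder encoder = Base64.getEncoder();
        return encoder.encodeToString(data) + ":" + encoder.encodeToString(hmac(key, data));
    }

    // Returns true only if the signature matches the payload under this key.
    static boolean isAuthentic(byte[] key, String signed) {
        String[] parts = signed.split(":");
        if (parts.length != 2) {
            return false;
        }
        try {
            byte[] data = Base64.getDecoder().decode(parts[0]);
            byte[] signature = Base64.getDecoder().decode(parts[1]);
            return MessageDigest.isEqual(hmac(key, data), signature); // constant-time compare
        } catch (IllegalArgumentException malformedBase64) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] key = "demo-key-not-for-production".getBytes(StandardCharsets.UTF_8);
        String signed = sign(key, "{\"userId\":7}");
        String forged = sign(key, "{\"userId\":1}").split(":")[0] + ":" + signed.split(":")[1];
        System.out.println("original authentic: " + isAuthentic(key, signed));
        System.out.println("forged authentic: " + isAuthentic(key, forged));
    }
}
```

A signature only proves the data came from a holder of the key. If the key leaks, or if the application signs data an attacker supplied, the signed payload is no safer to deserialize than an unsigned one.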
6. API Gateway Protection - Add deserialization protection at the API gateway level:
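// API Gateway middleware for deserialization protection (javax.servlet API with Gson)
import java.io.IOException;
import java.util.stream.Collectors;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.google.gson.JsonParseException;
import com.google.gson.JsonParser;

public class DeserializationProtectionFilter implements Filter {
    private static final int MAX_BINARY_SIZE = 1024 * 1024; // example limit (1 MiB); tune per API

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletResponse httpResponse = (HttpServletResponse) response;
        String contentType = request.getContentType() == null ? "" : request.getContentType();

        // Check content type and size limits (getContentLength() is -1 when the length is unknown)
        if (contentType.contains("application/octet-stream")
                && request.getContentLength() > MAX_BINARY_SIZE) {
            httpResponse.sendError(413, "Payload too large");
            return;
        }

        // Validate JSON structure before processing
        if (contentType.contains("application/json")) {
            String json = request.getReader().lines().collect(Collectors.joining());
            try {
                JsonParser.parseString(json); // Basic validation
            } catch (JsonParseException e) {
                httpResponse.sendError(400, "Invalid JSON format");
                return;
            }
            // Reading the body consumes it. CachedBodyRequest is a hypothetical
            // HttpServletRequestWrapper (not shown) that replays the buffered body downstream.
            request = new CachedBodyRequest((HttpServletRequest) request, json);
        }
        chain.doFilter(request, response);
    }
}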
7. Runtime Monitoring - Implement monitoring for deserialization-related anomalies:
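// Monitor for unusual deserialization patterns
import java.util.Set;
import java.util.logging.Logger;

public class DeserializationMonitor {
    private static final Logger log = Logger.getLogger(DeserializationMonitor.class.getName());

    private static final Set<String> ALLOWED_CLASSES = Set.of(
        "com.example.UserSession",
        "com.example.ApiRequest"
    );

    public void monitorDeserialization(String className) {
        if (!ALLOWED_CLASSES.contains(className)) {
            log.warning("Deserialization of unexpected class: " + className);
            alertSecurityTeam(className); // application-specific alerting hook (not shown)
        }
    }
}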