XML Bomb Attack
How an XML Bomb Works
An XML bomb is a denial-of-service attack that exploits XML parsers' ability to handle entity references. The attack leverages XML's Document Type Definition (DTD) feature to create exponentially expanding data structures that consume vast amounts of memory and processing power.
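The core mechanism uses nested entity definitions. An attacker defines a chain of entities in which each one references the previous entity several times, so the fully expanded text grows exponentially with the nesting depth. XML does not allow an entity to reference itself, so the growth comes from the chain, not from true recursion. For example: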
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY a "1234567890">
<!ENTITY b "&a;&a;&a;&a;&a;">
<!ENTITY c "&b;&b;&b;&b;&b;">
<!ENTITY d "&c;&c;&c;&c;&c;">
<!ENTITY e "&d;&d;&d;&d;&d;">
]>
<root>&e;</root>
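Each level multiplies the size by 5, the number of references per entity. Entity 'a' is 10 characters long, so 'e' expands to 10 × 5^4 = 6,250 characters, or about 6 KB. The growth compounds quickly with more references and more levels: with 10 references per level and 9 levels above a 10-character base, the expansion reaches 10^10 characters, roughly 10 GB of data, from less than a kilobyte of input.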
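The size of an expansion can be checked by counting references. The sketch below is illustrative only: `expanded_length` is a hypothetical helper, not part of any XML library, and it assumes every level references the entity below it the same number of times.

```python
def expanded_length(base_len: int, refs_per_level: int, levels: int) -> int:
    """Length of the fully expanded top-level entity.

    base_len: characters in the innermost entity (entity 'a' above)
    refs_per_level: how many times each entity references the one below it
    levels: number of entities stacked on top of the innermost one
    """
    return base_len * refs_per_level ** levels


# The document above: 'a' is 10 characters; b, c, d, e each hold 5 references.
small = expanded_length(base_len=10, refs_per_level=5, levels=4)

# A hypothetical larger bomb: 10 references per level, 9 levels above 'a'.
large = expanded_length(base_len=10, refs_per_level=10, levels=9)

print(small)  # 6250
print(large)  # 10000000000
```

The input grows linearly with the number of levels while the output grows exponentially, which is why the payload stays small.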
The attack exploits XML's entity substitution mechanism. When a parser encounters &e;, it expands the reference and every entity nested inside it. Parsers that disable DTD processing or cap entity expansion are not vulnerable to this pattern, but many APIs still accept XML with DTDs enabled.
XML Bomb Against APIs
APIs become vulnerable to XML bombs when they accept XML payloads and parse them without proper safeguards. Common attack vectors include:
- SOAP APIs: Legacy SOAP services often use XML extensively and may have lax parsing configurations
- File upload endpoints: APIs that accept XML documents for processing (reports, configurations, etc.)
- Configuration APIs: Services that accept XML for system configuration or data import
- Document processing APIs: Services that parse XML for content extraction or transformation
The attack works by sending a crafted XML payload that appears legitimate but contains nested entities. When the API parses the request, the XML parser expands these entities, consuming server resources. A single 1KB request can trigger gigabytes of memory allocation.
Real-world impact includes:
- Memory exhaustion: The server runs out of available RAM, causing crashes or slowdowns
- CPU exhaustion: Recursive expansion consumes processing cycles, making the service unresponsive
- Service disruption: The attack can take down entire API endpoints or services
Attackers often combine XML bombs with other techniques. For instance, they might use the attack to create a denial-of-service condition, then exploit the distracted state to launch additional attacks on other vulnerabilities.
Detection & Prevention
Preventing XML bomb attacks requires multiple layers of defense. The most effective approach combines configuration hardening with runtime monitoring.
Parser Configuration: Modern XML parsers offer security features that should be enabled:
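// Java - SAXParserFactory with security features
SAXParserFactory factory = SAXParserFactory.newInstance();
// Reject DOCTYPE declarations entirely; without a DTD no entities can be defined
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// Enable secure processing, which applies the parser's entity expansion limits
factory.setFeature(javax.xml.XMLConstants.FEATURE_SECURE_PROCESSING, true);
// Also block external entities and external DTD loading (XXE)
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);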
// Python - lxml with security settings
from lxml import etree
parser = etree.XMLParser(resolve_entities=False, load_dtd=False)
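Where lxml is not available, a similar "no DTD" policy can be sketched with the standard library's expat bindings. This is a minimal sketch, not a hardened parser: `DoctypeForbidden` and `parse_rejecting_doctype` are hypothetical names, and the function only checks the document, it does not build a tree.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def parse_rejecting_doctype(data: bytes) -> None:
    """Parse `data`, aborting as soon as a DOCTYPE declaration is seen."""
    parser = xml.parsers.expat.ParserCreate()

    def forbid_doctype(name, system_id, public_id, has_internal_subset):
        # Raising here makes Parse() stop and propagate the exception.
        raise DoctypeForbidden(f"DOCTYPE not allowed (root name: {name!r})")

    parser.StartDoctypeDeclHandler = forbid_doctype
    parser.Parse(data, True)


try:
    parse_rejecting_doctype(b'<!DOCTYPE root [<!ENTITY a "x">]><root>&a;</root>')
except DoctypeForbidden as exc:
    print("rejected:", exc)
```

Rejecting the DOCTYPE outright is stricter than limiting expansion, so it suits APIs that never need DTDs.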
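Input Validation: Enforce size limits and strict schema validation. Reject XML documents that exceed a reasonable size threshold (1 MB is ample for most APIs). Schema validation alone does not stop an XML bomb, because entities are expanded during parsing, before the schema is applied, so also reject any document that contains a DTD declaration.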
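A size cap can be applied before the body ever reaches the parser. The following is a minimal sketch: `MAX_XML_BYTES`, `PayloadTooLarge`, and `read_limited` are hypothetical names, and the 1 MB default is an assumption to tune per endpoint. It also assumes a buffered binary stream whose `read(n)` returns `n` bytes unless the stream ends first.

```python
import io

MAX_XML_BYTES = 1_000_000  # assumed limit; tune per endpoint


class PayloadTooLarge(ValueError):
    """Raised when a request body exceeds the configured limit."""


def read_limited(stream, max_bytes: int = MAX_XML_BYTES) -> bytes:
    """Read at most `max_bytes` from a buffered binary stream, or raise."""
    # Ask for one extra byte so an oversized body is detected
    # without reading the whole thing into memory.
    data = stream.read(max_bytes + 1)
    if len(data) > max_bytes:
        raise PayloadTooLarge(f"XML body exceeds {max_bytes} bytes")
    return data


body = read_limited(io.BytesIO(b"<root>ok</root>"))
print(len(body))  # 15
```

A size cap on its own does not stop an entity bomb, since the malicious payload is small; it only bounds the cost of honest but oversized documents.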
Rate Limiting: Apply per-client rate limits to XML processing endpoints. This prevents attackers from overwhelming your service with multiple simultaneous requests.
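One common way to implement per-client limits is a token bucket. The sketch below is an in-memory illustration under stated assumptions: `TokenBucketLimiter` is a hypothetical class, clients are identified by an opaque string such as an API key, and a production deployment would usually keep this state in a shared store.

```python
import time


class TokenBucketLimiter:
    """Per-client token bucket: `rate` requests/second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        # client_id -> (tokens remaining, time of last update)
        self._state = {}

    def allow(self, client_id: str) -> bool:
        """Return True and consume a token if the client may proceed."""
        now = self.clock()
        tokens, last = self._state.get(client_id, (float(self.capacity), now))
        # Refill according to elapsed time, capped at the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._state[client_id] = (tokens - 1, now)
            return True
        self._state[client_id] = (tokens, now)
        return False


limiter = TokenBucketLimiter(rate=1.0, capacity=2)
print(limiter.allow("client-a"))  # True: the bucket starts full
```

Rate limiting reduces how often a client can trigger expensive parsing, but a single bomb can still do damage, so it complements parser hardening and does not replace it.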
Resource Quotas: Set hard limits on memory and processing time for XML parsing operations. Terminate parsing that exceeds thresholds.
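An application-level budget can approximate these quotas by counting expanded text as the parser emits it. This is a sketch under assumptions, not a true memory limit: `ParseBudgetExceeded` and `count_text_with_budget` are hypothetical names, only element text is counted (not attribute values), and the deadline is checked only when text arrives. Hard memory and CPU limits would need OS-level or container-level controls, which are not shown.

```python
import time
import xml.parsers.expat


class ParseBudgetExceeded(RuntimeError):
    """Raised when parsing exceeds the character or time budget."""


def count_text_with_budget(data: bytes, max_chars: int, max_seconds: float) -> int:
    """Parse `data` and return the total length of its expanded text.

    Aborts if the expanded character data exceeds `max_chars` or if
    parsing runs longer than `max_seconds`.
    """
    parser = xml.parsers.expat.ParserCreate()
    deadline = time.monotonic() + max_seconds
    total = 0

    def on_text(chunk: str) -> None:
        nonlocal total
        total += len(chunk)
        if total > max_chars:
            raise ParseBudgetExceeded(f"expanded text exceeds {max_chars} characters")
        if time.monotonic() > deadline:
            raise ParseBudgetExceeded(f"parsing exceeded {max_seconds} seconds")

    # Entity references in element content are expanded by expat and
    # delivered to this handler as ordinary character data.
    parser.CharacterDataHandler = on_text
    parser.Parse(data, True)
    return total


print(count_text_with_budget(b"<root>hello</root>", max_chars=100, max_seconds=5.0))  # 5
```

Because the handler raises partway through expansion, the parse stops before the full expanded text has been produced.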
middleBrick API Security Scanning can detect XML bomb vulnerabilities in your APIs. The scanner tests for DTD processing vulnerabilities and entity expansion issues, providing specific findings about whether your API accepts potentially dangerous XML constructs. middleBrick's black-box scanning approach tests the actual runtime behavior without requiring credentials or access to source code.
Monitoring & Alerting: Set up monitoring for unusual memory usage patterns or processing spikes. Alert when XML processing times exceed normal thresholds.
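Timing each parse is a simple starting point for this kind of alerting. The sketch below assumes the standard `logging` module feeds whatever alerting system is in use; `watch_parse_time`, the logger name, and the endpoint label are hypothetical.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("xml_parsing")


@contextmanager
def watch_parse_time(endpoint: str, warn_after_seconds: float, clock=time.monotonic):
    """Log a warning when the wrapped block runs longer than the threshold."""
    start = clock()
    try:
        yield
    finally:
        # Runs whether the block finished normally or raised.
        elapsed = clock() - start
        if elapsed > warn_after_seconds:
            logger.warning(
                "slow XML parse on %s: %.3fs (threshold %.3fs)",
                endpoint, elapsed, warn_after_seconds,
            )


with watch_parse_time("/import", warn_after_seconds=0.5):
    pass  # parse the request body here
```

The threshold is best derived from observed parse times for normal traffic on each endpoint.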
Defense in Depth: Combine these technical controls with network-level protections like API gateways that can inspect and block suspicious XML payloads before they reach your application servers.