MEDIUM Input Validation

CRLF Injection in APIs

What is CRLF Injection?

CRLF injection (Carriage Return Line Feed injection) is a web security vulnerability that occurs when an attacker can manipulate the line breaks in HTTP headers or other protocol data. The attack exploits the way protocols like HTTP use CRLF sequences (\r\n) to separate different parts of a message.

In APIs, CRLF injection typically happens when user input is included in HTTP headers without proper validation or encoding. An attacker can inject additional CRLF sequences to add new headers, split existing headers, or even inject entirely new HTTP requests or responses.

The vulnerability exists because HTTP headers are structured as key-value pairs separated by CRLF sequences. If user input isn't properly sanitized, an attacker can break out of the intended header field and inject malicious content. This can lead to HTTP response splitting, header injection, or protocol smuggling attacks.

For example, if an API reflects a user-controlled value in a Location header without validation, an attacker could inject \r\n sequences to add arbitrary headers like Set-Cookie or manipulate the response structure entirely.
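To make the mechanics concrete, here is a minimal sketch of a deliberately vulnerable handler that builds a raw response by string concatenation. The function name `build_redirect_response` and the example URL are hypothetical and not taken from any real framework; most modern frameworks would refuse to emit such a header.

```python
def build_redirect_response(next_url: str) -> str:
    """Naively build a raw 302 response (deliberately vulnerable)."""
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: {next_url}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )


# Attacker-controlled value as it looks after %0D%0A has been URL-decoded
payload = "https://example.com/\r\nSet-Cookie: session=attacker"

response = build_redirect_response(payload)

# Split the header block into individual lines, skipping the status line
header_lines = response.split("\r\n\r\n", 1)[0].split("\r\n")[1:]
for line in header_lines:
    print(line)
```

A client parsing this response sees three header lines instead of two, because the injected CRLF ends the Location header and starts a new Set-Cookie header.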

How CRLF Injection Affects APIs

CRLF injection in APIs can have several serious consequences depending on the context and implementation:

HTTP Response Splitting: Attackers can inject additional HTTP headers or even entire response bodies, leading to cache poisoning, cross-site scripting (XSS), or session fixation attacks.
Header Injection: Malicious headers like Set-Cookie, Location, or Content-Type can be injected to manipulate client behavior or steal session data.
Protocol Smuggling: In some cases, CRLF injection can be used to confuse intermediaries like proxies or load balancers, causing them to misinterpret the protocol boundaries.
Log Injection: If API logs include user input, CRLF injection can corrupt log files, making them difficult to parse or hiding malicious activity.
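The log injection case can be illustrated with a short sketch. The helper `neutralize_for_log` and the log format are assumptions made for this example only; the point is that escaping CR and LF keeps one request on one log line.

```python
def neutralize_for_log(value: str) -> str:
    """Escape CR and LF so a single input always stays on one log line."""
    return value.replace("\r", "\\r").replace("\n", "\\n")


# Attacker-supplied header value containing a forged second log entry
user_agent = "curl/8.0\n2024-01-01 00:00:00 INFO login ok user=admin"

unsafe_entry = f"WARN bad request ua={user_agent}"
safe_entry = f"WARN bad request ua={neutralize_for_log(user_agent)}"

print(len(unsafe_entry.splitlines()))  # 2: the forged entry looks like a real log line
print(len(safe_entry.splitlines()))    # 1: the input stays inside its own entry
```

Structured logging (for example, JSON with escaped strings) achieves the same effect without a custom helper.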

A common scenario involves APIs that use user input in redirect URLs or error messages. Consider an API that returns a Location header for redirects: if the redirect URL comes from user input and isn't validated, an attacker could craft a URL containing \r\n sequences to inject additional headers.

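Another example is APIs that include user data in custom headers. If an API reflects a user ID in an X-User header without validation, an attacker could append \r\n followed by a header of their choosing, such as Set-Cookie. Injecting two consecutive CRLF sequences ends the header block early, so anything that follows, for example <script>alert(1)</script>, is treated as part of the response body.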

How to Detect CRLF Injection

Detecting CRLF injection requires both manual testing and automated scanning. Here are the key approaches:

Input Validation Testing: Test API endpoints that accept user input in headers, URLs, or parameters by submitting payloads containing %0D%0A (URL-encoded CRLF) or actual \r\n sequences.
Header Reflection Analysis: Identify endpoints that reflect user input in HTTP headers and test for injection vulnerabilities.
Automated Scanning: Tools like middleBrick scan APIs for CRLF injection by testing common injection points and analyzing responses for signs of successful injection.

middleBrick specifically tests for CRLF injection by submitting payloads containing CRLF sequences to various API endpoints and analyzing the responses. The scanner checks for:

HTTP response splitting by looking for unexpected headers or response structure changes
Header injection by verifying if malicious headers were added to the response
Protocol anomalies that might indicate successful injection

The scanner also examines API specifications to identify endpoints that might be vulnerable based on their parameter usage and header manipulation patterns.
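A manual version of this kind of check can be sketched in a few lines. This is not middleBrick's implementation; the canary header name, the payload variants, and both function names are assumptions chosen for illustration. The idea is to try to inject a recognizable header and then look for it as a header of its own in the response.

```python
from urllib.parse import quote

CANARY_NAME = "X-Crlf-Canary"   # hypothetical marker header
CANARY_VALUE = "injected"


def build_payloads(base_value: str) -> list[str]:
    """Return URL-encoded test values that try to append a canary header."""
    injected = f"{CANARY_NAME}: {CANARY_VALUE}"
    raw_variants = [
        f"{base_value}\r\n{injected}",  # full CRLF
        f"{base_value}\n{injected}",    # bare LF, tolerated by some parsers
        f"{base_value}\r{injected}",    # bare CR
    ]
    return [quote(v, safe="") for v in raw_variants]


def is_injected(response_headers: dict[str, str]) -> bool:
    """True only if the canary appears as its own response header."""
    for name, value in response_headers.items():
        if name.lower() == CANARY_NAME.lower() and value.strip() == CANARY_VALUE:
            return True
    return False
```

Each payload would be sent as a parameter value to the endpoint under test, and the parsed response headers passed to `is_injected`. A canary that only shows up inside another header's value (for example, percent-encoded in Location) does not count as a successful injection.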

Prevention & Remediation

Preventing CRLF injection requires a defense-in-depth approach with multiple layers of protection:

Input Validation and Sanitization

The most effective prevention is strict input validation. Never trust user input that will be included in HTTP headers or protocol data. Implement validation that:

Whitelists allowed characters instead of blacklisting dangerous ones
Validates URLs against a strict pattern (scheme, domain, path)
Rejects, encodes, or removes CR (\r) and LF (\n) characters in user input, including when either character appears on its own
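The allowlist approach can be sketched as follows. The policy shown here (printable ASCII only, at most 256 characters) and the function name `validate_header_value` are assumptions for illustration; a real API would tighten the pattern to what each field actually needs.

```python
import re

# Assumed policy: printable ASCII only (no control characters), 1 to 256 chars
_SAFE_HEADER_VALUE = re.compile(r"[\x20-\x7E]{1,256}")


def validate_header_value(value: str) -> str:
    """Return value unchanged if it matches the allowlist, else raise ValueError."""
    if not _SAFE_HEADER_VALUE.fullmatch(value):
        raise ValueError("header value contains disallowed characters")
    return value
```

Because the pattern lists what is permitted rather than what is forbidden, CR, LF, NUL, and other control characters are all rejected without being named individually.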

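Here's a Node.js example of input validation using Express:

const express = require('express');
const app = express();

function validateRedirectUrl(url) {
  // Reject non-strings (repeated query parameters are parsed as arrays)
  if (typeof url !== 'string') {
    throw new Error('Invalid URL format');
  }
  // Reject CR and LF explicitly instead of relying on the pattern alone
  if (/[\r\n]/.test(url)) {
    throw new Error('Invalid URL format');
  }
  // Require a scheme and a host, and disallow any whitespace
  const urlPattern = /^(https?|ftp):\/\/[^\s/$.?#].[^\s]*$/i;
  if (!urlPattern.test(url)) {
    throw new Error('Invalid URL format');
  }
  return url;
}

// Usage
app.get('/api/redirect', (req, res) => {
  try {
    const validatedUrl = validateRedirectUrl(req.query.next);
    res.redirect(validatedUrl);
  } catch (error) {
    res.status(400).json({ error: 'Invalid redirect URL' });
  }
});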

Header Construction Best Practices

When constructing HTTP headers, use safe APIs that automatically handle encoding:

Use framework-provided header setting methods instead of manual string concatenation
Never concatenate user input directly into header values
Use context-aware encoding for different data types
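As one illustration of a library API doing this work, the sketch below uses `http.client` from the Python standard library, which builds outgoing requests rather than responses. On current CPython versions, `putheader` is expected to raise `ValueError` for values containing a CR or LF that would start a new line. The host name and the helper `try_set_header` are hypothetical, and other libraries may behave differently, so this is a demonstration of the principle rather than a guarantee.

```python
import http.client

# No network traffic happens here: putrequest() and putheader() only buffer
# data. The connection would be opened by endheaders(), which is never called.
conn = http.client.HTTPConnection("example.invalid")
conn.putrequest("GET", "/")


def try_set_header(value: str) -> bool:
    """Return True if http.client accepted the header value."""
    try:
        conn.putheader("X-User", value)
        return True
    except ValueError:
        return False


print(try_set_header("alice"))
print(try_set_header("alice\r\nSet-Cookie: session=attacker"))
```

Manual string concatenation has no such check, which is why the same value would pass straight through into the message.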

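Python example with Flask:

from flask import Flask, redirect, request, abort
app = Flask(__name__)

@app.route('/api/redirect')
def safe_redirect():
    next_url = request.args.get('next', '')

    # Reject CR and LF explicitly; a prefix check alone would accept them
    if '\r' in next_url or '\n' in next_url:
        abort(400, 'Invalid redirect URL')

    # Only allow absolute http(s) URLs
    if not next_url.startswith(('http://', 'https://')):
        abort(400, 'Invalid redirect URL')

    # Use the framework's redirect helper instead of setting Location by hand
    return redirect(next_url)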
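A scheme prefix check still lets a redirect point at any host. A stricter variant, sketched here with only the standard library, also rejects control characters and compares the host against an allowlist. The `ALLOWED_HOSTS` set and the function name `is_safe_redirect` are hypothetical.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"example.com", "app.example.com"}  # hypothetical allowlist


def is_safe_redirect(url: str) -> bool:
    """Accept only http(s) URLs to allowlisted hosts with no control characters."""
    # Check control characters first: some URL parsers silently strip
    # CR, LF, and tab before parsing, which would hide them from later checks
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in url):
        return False
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed input such as an unterminated IPv6 literal
    return parts.scheme in {"http", "https"} and parts.hostname in ALLOWED_HOSTS
```

A route handler could call `is_safe_redirect(next_url)` and return a 400 response when it is false, before passing the URL to the framework's redirect helper.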

Security Headers and Configuration

Implement additional security measures:

Set Content-Security-Policy headers to mitigate XSS impact
Use X-Content-Type-Options: nosniff to prevent MIME-type confusion
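Keep servers, frameworks, and proxies up to date and configured to reject header values that contain CR or LF characters, so response splitting is blocked at the server level even if application validation is missed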

Real-World Impact

CRLF injection has been responsible for several notable security incidents. While specific API-focused CVEs are less common than web application cases, the underlying vulnerability pattern affects both domains.

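HTTP response splitting has been reported in web frameworks and servers for many years, and it is particularly dangerous in applications that reflect user input in Location headers for redirects. The same flaw also appears on the client side: CVE-2019-9740, for example, describes CRLF injection in Python's urllib, where an attacker who controlled the URL passed to urlopen could inject additional headers into the outgoing request.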

More recently, API security researchers have discovered CRLF injection in various API gateways and middleware. For example, some API management platforms that allowed custom header manipulation were found vulnerable to header injection attacks that could be chained with other vulnerabilities.

The impact of CRLF injection extends beyond immediate security concerns. Attackers can use these vulnerabilities for:

Session fixation attacks by injecting Set-Cookie headers
Cache poisoning to distribute malicious content through CDN caches
Cross-site scripting when combined with other injection vectors
Information disclosure through header manipulation

middleBrick's scanning methodology helps identify these vulnerabilities before they can be exploited in production environments. By testing APIs with CRLF injection payloads and analyzing the responses, the scanner can detect potential vulnerabilities that might be missed during manual testing.

Frequently Asked Questions

What's the difference between CRLF injection and HTTP response splitting?

HTTP response splitting is a specific CRLF injection attack in which the injected sequences end the current response early, so the remaining attacker-supplied data is interpreted as a second response. CRLF injection is the broader vulnerability category: it covers any place where injected CR or LF characters change how a protocol message, header, or log entry is parsed.

Can CRLF injection be exploited in HTTPS APIs?

Yes, HTTPS encrypts the transport layer but doesn't prevent CRLF injection at the application layer. The vulnerability exists in how the application processes and constructs headers, regardless of whether the connection is encrypted. HTTPS protects against network eavesdropping but not against application-level injection vulnerabilities.

How does middleBrick detect CRLF injection vulnerabilities?

middleBrick tests APIs by submitting payloads containing CRLF sequences (%0D%0A or \r\n) to various endpoints and analyzing the responses. The scanner looks for signs of successful injection such as unexpected headers, response structure changes, or protocol anomalies. It also examines API specifications to identify endpoints that might be vulnerable based on their parameter usage patterns.