
API Rate Abuse with Bearer Tokens

How API Rate Abuse Manifests with Bearer Tokens

Bearer tokens are commonly used to statelessly authenticate API calls (e.g., Authorization: Bearer <jwt>). When an endpoint that validates a bearer token lacks proper rate limiting, attackers can abuse the token validation path in several ways:

  • Token guessing / brute‑force: An attacker sends many requests with random or sequentially generated tokens. Each request triggers the server’s signature verification, consuming CPU and potentially causing a denial‑of‑service.
  • Token replay exhaustion: If a token is valid for a long window, an attacker can replay it thousands of times to spike traffic on a protected resource (e.g., GET /api/orders). Without throttling, the backend may be overwhelmed.
  • Credential stuffing via token endpoints: Some APIs expose a token introspection or refresh endpoint that also accepts bearer tokens. Missing rate limits there enable attackers to test large volumes of stolen tokens, leading to account takeover or credential leakage.
  • Resource‑specific amplification: Certain endpoints perform expensive operations after token validation (e.g., database joins, report generation). By flooding the validation step, an attacker amplifies the cost per request, turning a cheap token guess into a heavy load.

These patterns map directly to OWASP API Security Top 10 2023 – API4: Unrestricted Resource Consumption (known in the 2019 edition as Lack of Resources & Rate Limiting).

Code example – missing rate limit on a Bearer‑token‑protected route (Node.js/Express):

const express = require('express');
const jwt = require('jsonwebtoken');
const app = express();

app.get('/api/profile', (req, res) => {
  const auth = req.headers.authorization;
  if (!auth || !auth.startsWith('Bearer ')) {
    return res.status(401).json({error: 'Missing token'});
  }
  const token = auth.slice(7);
  try {
    const decoded = jwt.verify(token, process.env.JWT_SECRET);
    // expensive operation – e.g., fetch user profile from DB
    res.json({userId: decoded.sub});
  } catch (err) {
    res.status(401).json({error: 'Invalid token'});
  }
});

app.listen(3000);

Notice there is no check limiting how many times /api/profile can be called per token or per IP, enabling the abuse patterns described above.

Bearer-Token-Specific Detection

middleBrick discovers missing or ineffective rate limits on bearer‑token endpoints by performing unauthenticated, black‑box probes that focus on the token validation path. The scanner:

  • Sends a series of requests with random Bearer tokens (e.g., Authorization: Bearer <random‑base64>) to each discovered endpoint.
  • Measures response times and status codes. If the API returns 200 OK or 401 Unauthorized for a high volume of distinct tokens without ever returning 429 Too Many Requests or a Retry‑After header, the scanner flags a potential rate‑limiting gap.
  • Checks for the presence of standard rate‑limit response headers (X-RateLimit-Limit, X-RateLimit-Remaining, Retry‑After) and notes when they are absent.
  • Optionally, when an OpenAPI/Swagger spec is supplied, middleBrick cross‑references the security sections that require oauth2 or http bearer schemes with the observed runtime behavior.
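The flagging heuristic implied by the steps above can be sketched as follows. This is a simplified illustration, not middleBrick's actual implementation: the function name and response shape are assumptions, header names are assumed to be lower-cased (as fetch-style clients expose them), and a real scanner would also weigh response timing.

```javascript
// Given the status codes and headers observed across many probes with
// distinct random bearer tokens, decide whether the endpoint appears to
// lack rate limiting.
function looksUnthrottled(responses) {
  const sawThrottle = responses.some(r =>
    r.status === 429 ||
    'retry-after' in r.headers ||
    'x-ratelimit-limit' in r.headers
  );
  // Only meaningful once the sample is large enough to have tripped
  // a reasonable limit
  return responses.length >= 100 && !sawThrottle;
}
```

A long run of 401s with no 429 and no rate-limit headers is the signal: the server is doing verification work on every request without ever pushing back.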

Because the scan is agentless and requires only a URL, developers can run it locally or in CI:

# Install the CLI (npm)
npm i -g middlebrick

# Scan a target API
middlebrick scan https://api.example.com

# Example output excerpt
Scan ID: abc123
Overall Score: D (42/100)
Category: Rate Limiting – Findings:
- Endpoint GET /api/profile missing rate‑limit headers (no 429 observed after 150 requests with random Bearer tokens).
- Severity: Medium
- Remediation: Add per‑token or per‑IP rate limiting middleware.

This detection method works without any credentials, agents, or configuration changes, aligning with middleBrick’s “no‑setup” promise.

Bearer-Token-Specific Remediation

Fixing rate‑abuse issues for bearer‑token APIs involves adding throttling logic that respects the token’s semantics. Effective strategies include:

  • Per‑token limits: Track request counts using the token’s unique identifier (e.g., the jti claim in a JWT) or the raw token string. This prevents a single token from being abused while allowing legitimate users to make bursts of requests.
  • Per‑IP limits as a fallback: Combine per‑token with per‑IP limits to mitigate token‑stuffing attacks where many different tokens originate from the same address.
  • Use a distributed store: For horizontally scaled services, employ Redis or a similar fast store to share counters across instances.
  • Leverage framework middleware: Many libraries provide ready‑made rate‑limiters that can be scoped to custom keys (e.g., token sub or jti).
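The per-token plus per-IP combination can be sketched as a single key function (a hedged sketch; the function name and key prefixes are illustrative):

```javascript
// Prefer a per-token key; fall back to the client IP when no usable
// token is present, so token-stuffing floods from one address are
// still throttled.
function rateLimitKey(authHeader, clientIp) {
  if (authHeader && authHeader.startsWith('Bearer ')) {
    return `token:${authHeader.slice(7)}`;
  }
  return `ip:${clientIp}`;
}
```

The prefixes keep the two counter namespaces separate, so a raw token value can never collide with an IP address in the shared store.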

Remediation example – adding per‑JWT rate limiting with express-rate-limit and Redis store (Node.js):

const express = require('express');
const rateLimit = require('express-rate-limit');
const RedisStore = require('rate-limit-redis');
const jwt = require('jsonwebtoken');
const redis = require('redis');

const app = express();

const redisClient = redis.createClient({url: process.env.REDIS_URL});
redisClient.connect().catch(console.error);

// Limiter that uses the JWT 'jti' claim as the key
const jwtLimiter = rateLimit({
  store: new RedisStore({ sendCommand: (...args) => redisClient.sendCommand(args) }),
  windowMs: 60 * 1000, // 1 minute
  max: 30,               // max 30 requests per token per window
  keyGenerator: (req) => {
    const auth = req.headers.authorization;
    if (!auth || !auth.startsWith('Bearer ')) return 'anon';
    const token = auth.slice(7);
    try {
      const decoded = jwt.verify(token, process.env.JWT_SECRET);
      return decoded.jti || token; // fallback to raw token if no jti
    } catch (_) {
      return token; // still limit by token even if invalid
    }
  },
  handler: (req, res) => {
    res.status(429).json({error: 'Too many requests, please try again later'});
  },
  skipFailedRequests: false,   // count failed auth attempts
  skipSuccessfulRequests: false
});

// Apply limiter to all API routes
app.use(jwtLimiter);

app.get('/api/profile', (req, res) => {
  const auth = req.headers.authorization;
  if (!auth || !auth.startsWith('Bearer ')) {
    return res.status(401).json({error: 'Missing token'});
  }
  try {
    const decoded = jwt.verify(auth.slice(7), process.env.JWT_SECRET);
    res.json({userId: decoded.sub});
  } catch (err) {
    res.status(401).json({error: 'Invalid token'});
  }
});

app.listen(3000);

Key points:

  • The keyGenerator extracts the JWT’s jti (or the raw token) to create a unique counter per token.
  • Using rate-limit-redis ensures the counter is shared across multiple API instances.
  • The limiter counts both successful and failed authentication attempts, preventing attackers from bypassing limits by sending invalid tokens.
  • When the threshold is exceeded, the API returns 429 Too Many Requests with a JSON error body, which middleBrick will recognize as proper rate‑limiting behavior.
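For single-process services where a Redis deployment is overkill, the same fixed-window semantics can be implemented in memory. This is a hedged sketch, not a drop-in replacement: it holds counters in a Map, so it cannot be shared across horizontally scaled instances and never evicts stale keys.

```javascript
// Minimal fixed-window counter keyed by token id (jti or raw token).
// Single-process only; multi-instance deployments need a shared store
// such as Redis.
class FixedWindowLimiter {
  constructor(windowMs, max) {
    this.windowMs = windowMs;
    this.max = max;
    this.counters = new Map(); // key -> {windowStart, count}
  }

  // Returns true if the request is allowed, false if it should get a 429.
  allow(key, now = Date.now()) {
    const entry = this.counters.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.counters.set(key, {windowStart: now, count: 1});
      return true;
    }
    entry.count += 1;
    return entry.count <= this.max;
  }
}

// Same policy as the middleware config: 30 requests per token per minute
const limiter = new FixedWindowLimiter(60 * 1000, 30);
```

Counting invalid tokens too (by keying on the raw token string when verification fails) preserves the same anti-brute-force property as the middleware configuration.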

Similar patterns exist in other languages: django-ratelimit with a custom key based on request.META.get('HTTP_AUTHORIZATION'), or Bucket4j in Spring Boot keyed on request.getHeader("Authorization"). Implementing any of these mitigations will eliminate the bearer-token-specific rate-abuse vectors that middleBrick detects.

Frequently Asked Questions

Does middleBrick need a valid Bearer token to test rate limits on my API?
No. middleBrick performs unauthenticated, black‑box scanning. It sends requests with random or malformed Bearer tokens to discover whether the endpoint enforces rate limits regardless of token validity.
If I add a per‑token rate limiter, will legitimate users be blocked during normal bursts?
A well‑tuned limiter (e.g., 30 requests per minute per token) accommodates typical API usage while stopping abuse. Adjust the windowMs and max values based on your traffic patterns and monitor the dashboard for any false positives.