# API Scraping in Axum
## How API Scraping Manifests in Axum
API scraping is the systematic extraction of data from an API by iterating through predictable identifiers (like sequential integers) or by exploiting unrestricted list endpoints. In Axum applications, this vulnerability commonly stems from two patterns: (1) endpoints that accept user-controlled identifiers (e.g., path parameters) without verifying that the requester is authorized to access the specific resource, and (2) list endpoints that allow unrestricted pagination or lack rate limiting, enabling enumeration of all resources.
Axum's ergonomic extractors make it easy to capture path and query parameters, but developers sometimes forget to couple these with proper authorization. Consider this typical Axum handler for fetching a user by ID:
```rust
// Vulnerable: no authentication extractor and no ownership check.
async fn get_user(
    Path(id): Path<u32>,
    State(pool): State<PgPool>,
) -> Result<impl IntoResponse, (StatusCode, &'static str)> {
    let user = sqlx::query_as!(User, "SELECT * FROM users WHERE id = $1", id)
        .fetch_one(&pool)
        .await
        .map_err(|_| (StatusCode::NOT_FOUND, "User not found"))?;
    Ok((StatusCode::OK, Json(user)))
}
```

This endpoint is publicly accessible (no authentication extractor) and uses a sequential integer ID. An attacker can script requests to `/api/users/1`, `/api/users/2`, etc., harvesting every user record. Even if the endpoint requires authentication, a logged-in attacker can still scrape other users' data by changing the ID parameter unless the handler verifies that the authenticated user owns the requested ID or has admin privileges.
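The enumeration pattern is trivial to script. As a minimal sketch (the base URL is a hypothetical example), this is the sequence of target URLs an ID-enumeration scraper would generate before issuing one GET request per URL:

```rust
// Sketch: generate the sequential URLs an ID-enumeration scraper would request.
// The base URL below is a hypothetical example, not a real endpoint.
fn enumeration_targets(base: &str, start: u32, end: u32) -> Vec<String> {
    (start..=end).map(|id| format!("{base}/{id}")).collect()
}

fn main() {
    let targets = enumeration_targets("https://example.com/api/users", 1, 3);
    // An attacker would issue one GET request per generated URL
    // and collect every response body that comes back with HTTP 200.
    for url in &targets {
        println!("{url}");
    }
}
```

This is why "the IDs are not listed anywhere" is not a defense: with sequential integers, the full keyspace is the list.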
Similarly, list endpoints often expose too much data:
```rust
// Vulnerable: no maximum page size and no rate limiting.
async fn list_orders(
    Query(params): Query<ListParams>,
    State(pool): State<PgPool>,
) -> Result<impl IntoResponse, (StatusCode, &'static str)> {
    let limit = params.limit.unwrap_or(100);
    let offset = params.page.unwrap_or(1).saturating_sub(1) * limit;
    let orders = sqlx::query_as!(Order, "SELECT * FROM orders LIMIT $1 OFFSET $2", limit, offset)
        .fetch_all(&pool)
        .await
        .map_err(|_| (StatusCode::INTERNAL_SERVER_ERROR, "Database error"))?;
    Ok((StatusCode::OK, Json(orders)))
}
```

With no maximum page size and no rate limiting, an attacker can request `/api/orders?limit=100&page=1`, then `page=2`, and so on, eventually downloading every order. If the response includes sensitive fields (e.g., customer PII, payment tokens), this becomes a critical data exposure.
These patterns are exacerbated by Axum's default behavior of passing extracted parameters directly to business logic without automatic authorization checks. The framework provides the building blocks (extractors, middleware) but leaves security entirely to the developer.
## Axum-Specific Detection
Detecting API scraping vulnerabilities requires both static and dynamic analysis. In Axum codebases, look for:
- Handlers that accept `Path<T>` or `Query` parameters and use them directly in database queries without an authorization check (e.g., comparing `user.id` to the path `id`).
- List endpoints that accept `limit` and `page` (or `offset`) without enforcing a maximum `limit` or without rate limiting middleware.
- Routes that return entire database records (via `Json(user)`) without filtering out sensitive fields.
Dynamic scanning with middleBrick automates detection of these issues. When you submit an Axum API endpoint to middleBrick, it performs a black-box scan that includes:
- **BOLA/IDOR testing:** For each endpoint that accepts an identifier (e.g., `/api/users/:id`), middleBrick sends requests with a sequence of IDs (1, 2, 3, ...) and analyzes the responses. If unauthenticated requests return valid data (HTTP 200 with a non-empty body) for multiple IDs, it flags an unauthenticated IDOR vulnerability.
- **Rate limiting assessment:** middleBrick issues a burst of requests (e.g., 100 requests in 10 seconds) to list and detail endpoints. If the server does not respond with HTTP 429 (Too Many Requests) or other throttling signals, the lack of rate limiting is reported.
- **Data exposure scanning:** Response bodies are scanned for patterns indicating sensitive data (e.g., email addresses, credit card numbers, API keys). If found, middleBrick reports potential data exposure.
For example, scanning an Axum app with a vulnerable /api/users/:id endpoint might yield a finding like:
| Check | Severity | Endpoint | Evidence |
|---|---|---|---|
| BOLA/IDOR | High | GET /api/users/{id} | Unauthenticated access to user records for IDs 1-10 |
middleBrick's CLI makes it easy to integrate into your development workflow:
```bash
middlebrick scan https://your-axum-app.com/api
```

The scan completes in 5–15 seconds and returns a risk score (A–F) with actionable remediation steps tailored to your Axum stack.
## Axum-Specific Remediation
Fixing API scraping in Axum involves implementing proper authorization, rate limiting, and data filtering. Here are concrete steps with code examples.
### 1. Enforce Authorization on Resource Endpoints
For endpoints that return a single resource by ID, ensure that the requester is authorized to access that specific resource. In Axum, you can create an extractor that loads the authenticated user (e.g., from a session or JWT) and then compare the user's ID or roles to the requested resource ID.
```rust
use axum::{
    extract::{Path, State},
    http::StatusCode,
    response::IntoResponse,
    Json,
};
use serde::Serialize;

#[derive(Serialize)]
struct User {
    id: u32,
    username: String,
    email: String, // might be sensitive
}

// Assume we have an extractor that gives us the current user
async fn get_user(
    user: AuthenticatedUser, // custom extractor that returns the logged-in user
    Path(id): Path<u32>,
    State(pool): State<DbPool>,
) -> Result<impl IntoResponse, (StatusCode, &'static str)> {
    // Authorization check: a user can only access their own data unless they are an admin
    if user.id != id && !user.is_admin {
        return Err((StatusCode::FORBIDDEN, "Not authorized"));
    }
    let user_record = sqlx::query_as!(
        User,
        "SELECT id, username, email FROM users WHERE id = $1",
        id
    )
    .fetch_optional(&pool)
    .await
    .map_err(|_| (StatusCode::INTERNAL_SERVER_ERROR, "Database error"))?;
    match user_record {
        Some(user) => Ok((StatusCode::OK, Json(user))),
        None => Err((StatusCode::NOT_FOUND, "User not found")),
    }
}
```

### 2. Implement Rate Limiting
Use middleware to limit the request rate. Axum composes with tower middleware, so you can add a rate limiting layer. Note that tower's built-in `RateLimitLayer` enforces a single global limit shared by all clients (not a per-IP limit), and because the underlying service is not `Clone` and is fallible, it is typically combined with `BufferLayer` and axum's `HandleErrorLayer`:

```rust
use std::time::Duration;
use axum::{error_handling::HandleErrorLayer, BoxError};
use tower::{buffer::BufferLayer, limit::RateLimitLayer, ServiceBuilder};

let app = Router::new()
    .route("/api/users/:id", get(get_user))
    .route("/api/users", get(list_users))
    .layer(
        ServiceBuilder::new()
            .layer(HandleErrorLayer::new(|_: BoxError| async { StatusCode::TOO_MANY_REQUESTS }))
            .layer(BufferLayer::new(1024)) // makes the non-Clone rate limit service shareable
            .layer(RateLimitLayer::new(100, Duration::from_secs(60))), // 100 requests/minute, global
    );
```

For per-IP or per-user limiting (e.g., after authentication), you need a keyed limiter: custom middleware that extracts the client identity and applies a limit per key, or a dedicated crate such as `tower_governor`.
### 3. Restrict List Endpoints
Prevent enumeration by capping the page size and using cursor-based pagination instead of offset-based. Also, filter out sensitive fields from the response.
```rust
use serde_json::json;

#[derive(Deserialize)]
struct ListParams {
    cursor: Option<String>, // opaque cursor encoding the last seen ID
    limit: Option<usize>,
}

#[derive(Serialize, sqlx::FromRow)]
struct UserSummary {
    id: i64,
    username: String, // only non-sensitive fields are exposed
}

async fn list_users(
    Query(params): Query<ListParams>,
    State(pool): State<DbPool>,
) -> Result<impl IntoResponse, (StatusCode, &'static str)> {
    let limit = params.limit.unwrap_or(20).min(50); // enforce max 50 per page
    let last_id = params.cursor.as_deref().and_then(decode_cursor).unwrap_or(0);
    // Keyset (cursor) pagination: fetch one extra row to detect whether a next page exists
    let users: Vec<UserSummary> =
        sqlx::query_as("SELECT id, username FROM users WHERE id > $1 ORDER BY id LIMIT $2")
            .bind(last_id)
            .bind(limit as i64 + 1)
            .fetch_all(&pool)
            .await
            .map_err(|_| (StatusCode::INTERNAL_SERVER_ERROR, "Database error"))?;
    let has_next = users.len() > limit;
    let page = &users[..users.len().min(limit)];
    // The next cursor is the last ID of the returned page, not of the extra lookahead row
    let next_cursor = if has_next {
        page.last().map(|u| encode_cursor(u.id))
    } else {
        None
    };
    Ok((StatusCode::OK, Json(json!({ "users": page, "next_cursor": next_cursor }))))
}
```

Note: the cursor implementation above is simplified. In production, you would want a more robust cursor (e.g., based on a timestamp plus ID) and should ensure it is opaque and tamper-resistant.
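The `encode_cursor` / `decode_cursor` helpers can be as simple as an opaque string wrapping the last seen row ID. A minimal sketch using only the standard library (hex rather than the base64 mentioned above, to avoid an extra dependency; a production cursor would typically be base64-encoded and signed to prevent tampering):

```rust
// Sketch: opaque cursor helpers wrapping the last seen row ID.
// Hex keeps this dependency-free; a real cursor would be base64 and signed.
fn encode_cursor(last_id: i64) -> String {
    format!("{last_id:x}")
}

fn decode_cursor(cursor: &str) -> Option<i64> {
    // Malformed or tampered input yields None rather than a panic or an SQL error.
    i64::from_str_radix(cursor, 16).ok()
}

fn main() {
    let cursor = encode_cursor(42);
    assert_eq!(cursor, "2a");
    assert_eq!(decode_cursor(&cursor), Some(42));
    assert_eq!(decode_cursor("not-a-cursor"), None);
    println!("cursor for id 42: {cursor}");
}
```

Returning `Option` from the decoder matters: a scraper probing with garbage cursors should get an empty first page or a 400, never a 500 that leaks implementation details.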
### 4. Use UUIDs for Public Identifiers
If an endpoint must be publicly accessible (e.g., a public profile page), use UUIDs instead of sequential integers. This makes enumeration infeasible because the keyspace is too large to brute-force. In your database, store a UUID column and use it in the route:
```rust
async fn get_public_profile(
    Path(uuid): Path<Uuid>,
    State(pool): State<DbPool>,
) -> Result<impl IntoResponse, (StatusCode, &'static str)> {
    let profile = sqlx::query_as!(
        Profile,
        "SELECT id, username, bio FROM profiles WHERE uuid = $1",
        uuid
    )
    .fetch_optional(&pool)
    .await
    .map_err(|_| (StatusCode::INTERNAL_SERVER_ERROR, "Database error"))?;
    match profile {
        Some(p) => Ok((StatusCode::OK, Json(p))),
        None => Err((StatusCode::NOT_FOUND, "Profile not found")),
    }
}
```

Combine this with rate limiting to further reduce the risk of targeted guessing.
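The keyspace argument can be quantified. A sequential `u32` ID space holds about 4.3 billion values, all sweepable in order, while a random UUIDv4 carries 122 random bits (6 of its 128 bits are fixed by the version and variant fields), roughly 5.3 × 10^36 possibilities. A quick back-of-the-envelope check:

```rust
fn main() {
    // Sequential u32 IDs: the entire keyspace can be swept by a scraper.
    let u32_keyspace = 2f64.powi(32); // about 4.3e9
    // UUIDv4: 122 random bits (version/variant fields fix the other 6 bits).
    let uuid_keyspace = 2f64.powi(122); // about 5.3e36
    let ratio = uuid_keyspace / u32_keyspace;
    assert!(u32_keyspace < 5e9);
    assert!(uuid_keyspace > 5e36);
    println!("a UUIDv4 keyspace is ~{ratio:e} times larger than u32");
}
```

Even at millions of requests per second, enumerating a meaningful fraction of that space is computationally infeasible, which is why random identifiers defeat scraping-by-guessing while sequential ones invite it.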
By applying these Axum-specific fixes, you can prevent attackers from scraping your API's data. Regularly scan your Axum APIs with middleBrick to catch any regressions — especially after adding new endpoints or changing existing ones.