HIGH llm data leakageactixrust

Llm Data Leakage in Actix (Rust)

Llm Data Leakage in Actix with Rust — how this specific combination creates or exposes the vulnerability

When building Actix web services in Rust, developers often integrate LLM endpoints for tasks such as summarization, chat completions, or code generation. If route handlers, middleware, or response serialization logic inadvertently expose system prompts, user inputs, or internal metadata, these endpoints can leak sensitive information through LLM responses. Because Actix is asynchronous and strongly typed, data can flow through layers (extractors, guards, and responders) where improper handling may expose context that should remain private.

For example, consider an Actix handler that forwards user messages to an LLM without sanitizing inputs or isolating prompts. If the handler embeds debugging data, feature flags, or internal identifiers into the prompt, an attacker who can influence inputs may coax the model to repeat or reveal that data. Similarly, if the application reuses a single Actix app state containing LLM configuration (e.g., system prompts or API keys), a logic flaw such as Insecure Direct Object Reference (IDOR) or BOLA can allow one user to trigger requests that expose another user’s context or the model’s instructions.

middleBrick’s LLM/AI Security checks specifically probe for these risks by testing unauthenticated endpoints with sequential probes: system prompt extraction, instruction override, DAN jailbreak, data exfiltration, and cost exploitation. The scanner also checks whether Actix-hosted LLM endpoints are unintentionally exposed without authentication, and it examines outputs for PII, API keys, and executable code. These tests are valuable because they surface leakage that may not be obvious when reviewing Rust code, especially around serialization, header propagation, and error handling in Actix pipelines.

In an Actix service written in Rust, leakage can also arise from how responses are serialized back to the client. If an Actix HttpResponse or a custom struct includes fields such as prompt, session_id, or internal_flags, and those fields are inadvertently included in JSON output, an attacker may read them through crafted requests or error messages. Even with strict typing, failing to strip or redact sensitive context before sending data to an LLM or returning LLM responses to the client can violate confidentiality.

To illustrate, imagine an Actix route that builds a completion request by concatenating a static system prompt with user input and then returning the model’s raw reply. Without explicit scrubbing of the prompt and without validating that the user is authorized to access that prompt, a BOLA/IDOR-style issue may allow enumeration of prompts or extraction of instructions. middleBrick’s OpenAPI/Swagger analysis can help identify mismatches between declared routes and runtime behavior, highlighting where an LLM endpoint might be reachable without proper controls.

Rust-Specific Remediation in Actix — concrete code fixes

Remediation focuses on isolating sensitive data, tightening Actix extractors and state management, and ensuring that only sanitized data reaches LLM endpoints. Use dedicated configuration structs that are excluded from serialization, keep system prompts in environment variables or secure configuration stores, and avoid embedding runtime user identifiers in prompts or headers.

Below are concrete Actix patterns in Rust that reduce the risk of LLM data leakage.

1. Isolate prompts from user data

Define separate structures for internal prompt templates and runtime inputs. Do not serialize prompt templates in responses.

use actix_web::{web, HttpResponse, Responder};
use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
struct CompletionRequest {
    user_message: String,
}

// Internal only: not exposed to API responses
struct PromptContext {
    system_prompt: String,
}

async fn chat_handler(
    req: web::Json,
    data: web::Data<PromptContext>,
) -> impl Responder {
    let user_input = &req.user_message;
    // Build prompt without leaking context
    let prompt = format!("{}\nUser: {}", data.system_prompt, user_input);

    // Call LLM here using `prompt`
    let response = call_llm(&prompt).await;
    HttpResponse::Ok().json(serde_json::json!({ "reply": response }))
}

async fn call_llm(prompt: &str) -> String {
    // Placeholder: integrate with your LLM client
    format!("Echo: {}", prompt)
}

2. Secure Actix app state

Store sensitive configuration such as system prompts in environment variables and avoid sharing raw state across routes that may be subject to IDOR or BOLA.

use actix_web::App;
use std::env;

#[actix_web::main]
async fn main() -> std::io::Result<> {
    let system_prompt = env::var("SYSTEM_PROMPT")
        .unwrap_or_else(|_| "You are a helpful assistant.".to_string());

    let prompt_context = web::Data::new(PromptContext { system_prompt });

    actix_web::HttpServer::new(move || {
        App::new()
            .app_data(prompt_context.clone())
            .route("/chat", web::post().to(chat_handler))
    })
    .bind("127.0.0.1:8080")?
    .run()
    .await
}

3. Scrub outputs and avoid verbose errors

Ensure that error handlers and logging do not include prompt or model metadata. Use Actix’s default error handlers carefully and avoid exposing internal flags in HTTP headers or JSON bodies.

use actix_web::error::ResponseError;
use actix_web::HttpResponse;
use std::fmt;

#[derive(Debug)]
enum ApiError {
    Internal,
}

impl fmt::Display for ApiError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}", match self {
            ApiError::Internal => "internal error",
        })
    }
}

impl ResponseError for ApiError {
    fn error_response(&self) -> HttpResponse {
        HttpResponse::InternalServerError().json(serde_json::json!({ "error": self.to_string() }))
    }
}

4. Validate and rate-limit LLM endpoints

Apply Actix middleware for rate limiting and input validation to reduce abuse vectors that could be used to probe for leakage. This complements middleBrick’s checks for Rate Limiting and Input Validation by ensuring production services enforce bounds before requests reach the LLM integration.

use actix_web::middleware::Logger;
use actix_web::{web, App, HttpServer};

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    std::env::set_var("RUST_LOG", "actix_web=info");
    env_logger::init();

    HttpServer::new(|| {
        App::new()
            .wrap(Logger::default())
            .wrap(rate_limiter::RateLimiter::new(100)) // example crate
            .route("/chat", web::post().to(chat_handler))
    })
    .bind("127.0.0.1:8080")?
    .run()
    .await
}

Related CWEs: llmSecurity

CWE IDNameSeverity
CWE-754Improper Check for Unusual or Exceptional Conditions MEDIUM

Frequently Asked Questions

Can middleBrick detect LLM data leakage in Actix Rust services?
Yes. middleBrick scans unauthenticated attack surfaces and includes LLM/AI Security checks that test for system prompt leakage, prompt injection, and output exposure. It maps findings to frameworks like OWASP API Top 10 and provides remediation guidance, but it does not fix the service directly.
Does using OpenAPI specs with Actix guarantee no LLM data leakage?
No. OpenAPI/Swagger analysis helps identify route definitions and mismatches between spec and runtime behavior, but it cannot detect runtime leakage such as prompts inadvertently included in LLM responses. Secure coding practices and runtime testing are still required.