Hallucination Attacks in Actix with Firestore
Hallucination Attacks in Actix with Firestore — how this specific combination creates or exposes the vulnerability
Hallucination attacks in an Actix service that uses Firestore as a backend occur when an application returns fabricated or misleading information derived from Firestore documents. Because Firestore stores structured data that an Actix handler queries and then formats into responses, an attacker can manipulate input or exploit weak selection logic to cause the service to generate incorrect, inconsistent, or invented data.
In this stack, the vulnerability typically arises at the boundary between Firestore document retrieval and the Actix response construction. For example, an Actix handler might query a collection with an incomplete or attacker-controlled filter, receive a partial or empty set of documents, and then synthesize a response by filling missing fields with plausible but incorrect values. This synthesis can be intentional (as in prompt-injection-style attacks against an LLM-integrated handler) or accidental (due to missing validation or normalization logic).
Consider an Actix endpoint that retrieves user profile data from Firestore and returns a JSON summary. If the handler trusts client-supplied identifiers without strict validation, an attacker can provide identifiers that do not map to any document. Instead of returning a clear “not found,” the handler might combine whatever partial data exists with default or inferred data, producing a response that appears authoritative but is partially invented. Firestore’s flexible schema can exacerbate this: missing fields are not errors, so the handler may silently fill gaps with hallucinated content rather than enforcing required fields or schema constraints.
When Firestore security rules are misconfigured or bypassed (for example, through server-side admin access used insecurely in Actix), an attacker may be able to read broader datasets than intended. The Actix service might then attempt to construct a coherent narrative from this broader or noisy data, leading to leakage of unrelated information or generation of false relations across documents. This is especially risky when the handler aggregates multiple Firestore reads and merges them into a single response, as inconsistencies across documents can be smoothed over in a way that introduces false assertions.
Another vector involves document structure assumptions. Firestore allows nested maps and arrays, but if an Actix handler assumes a fixed shape without validating existence or types, it can misinterpret nulls or missing keys as indicators that data should be generated. For instance, a handler expecting a numeric field for “score” might substitute a computed or guessed value when the field is absent, producing a hallucinated score that seems legitimate to downstream consumers or LLM components.
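The difference between guessing and refusing can be shown with a minimal, standard-library-only sketch. `RawProfile`, `ScoreError`, and both functions are hypothetical names introduced for illustration and are not part of Actix or any Firestore client. `score: Option<f64>` stands in for a Firestore field that may be absent, and the 0 to 100 range is an assumed business rule.

```rust
/// Hypothetical stand-in for a decoded Firestore document in which the
/// `score` field may be absent (Firestore does not enforce a schema).
#[derive(Debug, Clone, PartialEq)]
pub struct RawProfile {
    pub user_id: String,
    pub score: Option<f64>,
}

/// Reasons the handler refuses to report a score.
#[derive(Debug, PartialEq)]
pub enum ScoreError {
    Missing,
    OutOfRange(f64),
}

/// Anti-pattern: a missing score is silently replaced by a guess, so the
/// caller cannot tell stored data from invented data.
pub fn score_with_guess(doc: &RawProfile, cohort_average: f64) -> f64 {
    doc.score.unwrap_or(cohort_average)
}

/// Safer: a missing or implausible score is an explicit error.
/// The 0.0..=100.0 range is an assumption made for this example.
pub fn score_strict(doc: &RawProfile) -> Result<f64, ScoreError> {
    match doc.score {
        None => Err(ScoreError::Missing),
        Some(s) if !(0.0..=100.0).contains(&s) => Err(ScoreError::OutOfRange(s)),
        Some(s) => Ok(s),
    }
}
```

With `score_strict`, a handler can map `ScoreError::Missing` to an explicit error response, so downstream consumers never receive a number that was not stored in Firestore.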
Compounding these risks, Actix services often integrate LLM components that consume Firestore-derived data. If the data fed to the LLM contains invented or inconsistent content, whether it originates in Firestore or in handler interpolation, the LLM can amplify these hallucinations in its outputs. Prompt injection in this context may try to make an LLM-driven handler skip Firestore reads or fabricate document references, producing responses that appear to be grounded in Firestore but are not.
Firestore-Specific Remediation in Actix — concrete code fixes
Remediation focuses on strict validation, explicit schema enforcement, and defensive handling of missing or partial Firestore data within Actix handlers. Below are concrete patterns and code examples for Actix with Firestore in Rust.
1. Validate input and enforce required fields
Do not trust client-supplied identifiers or query parameters. Use strong types and validation before issuing Firestore reads.
use actix_web::{web, HttpResponse};
use firestore::*;
use serde::{Deserialize, Serialize};
use validator::Validate; // provides the `Validate` derive and trait

#[derive(Deserialize, Validate)]
struct ProfileRequest {
    #[validate(length(min = 1))]
    user_id: String,
}

async fn get_profile(
    body: web::Json<ProfileRequest>,
    db: web::Data<FirestoreDb>,
) -> HttpResponse {
    // Input validation ensures user_id is non-empty before querying.
    if let Err(e) = body.validate() {
        return HttpResponse::BadRequest().json(e.to_string());
    }
    let doc_path = format!("users/{}", body.user_id);
let result: Result
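A non-empty check alone still lets a value such as `a/b/c` change which document a path like `users/{id}` addresses. The following is a minimal sketch of a stricter check using only the standard library. `UserId`, `IdError`, the 128-byte limit, and the allowed alphabet are assumptions chosen for illustration; they are deliberately narrower than Firestore's own document ID rules, which forbid `/`, the IDs `.` and `..`, and IDs of the form `__.*__`.

```rust
/// Hypothetical newtype: a user identifier that is safe to place in a
/// Firestore document path such as `users/{id}`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UserId(String);

#[derive(Debug, PartialEq, Eq)]
pub enum IdError {
    Empty,
    TooLong,
    InvalidChar(char),
    Reserved,
}

impl UserId {
    /// Assumed application limit; Firestore itself allows longer IDs.
    pub const MAX_LEN: usize = 128;

    /// Accepts only ASCII letters, digits, `-` and `_`, which rules out
    /// path separators and dot-only IDs.
    pub fn parse(raw: &str) -> Result<Self, IdError> {
        if raw.is_empty() {
            return Err(IdError::Empty);
        }
        if raw.len() > Self::MAX_LEN {
            return Err(IdError::TooLong);
        }
        if let Some(c) = raw
            .chars()
            .find(|c| !(c.is_ascii_alphanumeric() || *c == '-' || *c == '_'))
        {
            return Err(IdError::InvalidChar(c));
        }
        // Firestore reserves IDs that start and end with a double underscore.
        if raw.starts_with("__") && raw.ends_with("__") {
            return Err(IdError::Reserved);
        }
        Ok(UserId(raw.to_string()))
    }

    /// Builds the document path only from an already validated ID.
    pub fn doc_path(&self) -> String {
        format!("users/{}", self.0)
    }
}
```

Because `doc_path` is only reachable through a successfully parsed `UserId`, a handler written against this type cannot build a path from raw client input by accident.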
2. Use strongly-typed Firestore documents and reject partial data
Define explicit structs that mirror Firestore documents and require all mandatory fields. Do not fill missing fields with defaults; return errors when required data is absent.
use actix_web::{web, HttpResponse};
use firestore::*;
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
struct Profile {
    user_id: String,
    email: String,
    #[serde(rename = "profileComplete")]
    profile_complete: bool,
    // Do not add optional inferred fields here.
}

async fn get_profile_strict(db: web::Data<FirestoreDb>, user_id: String) -> HttpResponse {
    let doc_path = format!("users/{}", user_id);
    // `get` here stands for a typed read returning Result<Option<Profile>, _>;
    // the exact method name depends on the Firestore client in use.
    match db.get::<Profile>(&doc_path).await {
        Ok(Some(profile)) if profile.profile_complete => HttpResponse::Ok().json(profile),
        Ok(Some(_)) => HttpResponse::BadRequest().json("Profile incomplete"),
        Ok(None) => HttpResponse::NotFound().json("Profile not found"),
        Err(e) => HttpResponse::InternalServerError().json(e.to_string()),
    }
}
3. Avoid merging or inferring across multiple Firestore reads
If you must aggregate, ensure each read is validated and that missing documents produce explicit errors rather than synthesized data.
// `User`, `Settings`, and `UserData` are assumed to be defined elsewhere as
// serde-deserializable structs; the type names are illustrative.
async fn get_user_with_settings(
    db: web::Data<FirestoreDb>,
    user_id: String,
) -> Result<UserData, String> {
    let user_doc = format!("users/{}", user_id);
    let settings_doc = format!("settings/{}", user_id);
    let user: Option<User> = db.get(&user_doc).await.map_err(|e| e.to_string())?;
    let settings: Option<Settings> = db.get(&settings_doc).await.map_err(|e| e.to_string())?;
    match (user, settings) {
        (Some(u), Some(s)) => Ok(UserData { user: u, settings: s }),
        _ => Err("Missing user or settings document".to_string()),
    }
}
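A single string error does not tell the caller which document was missing, and it does not catch two documents that exist but belong to different users. The sketch below separates the merge step from the reads so it can be tested without Firestore. All type and function names (`User`, `Settings`, `UserData`, `AggregateError`, `combine`) and the `theme` field are hypothetical, and the sketch assumes both documents carry a `user_id` field.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct User {
    pub user_id: String,
    pub email: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Settings {
    pub user_id: String,
    pub theme: String,
}

#[derive(Debug, PartialEq)]
pub struct UserData {
    pub user: User,
    pub settings: Settings,
}

/// Explicit reasons for refusing to build an aggregate response.
#[derive(Debug, PartialEq)]
pub enum AggregateError {
    MissingUser,
    MissingSettings,
    OwnerMismatch { user: String, settings: String },
}

/// Merges two already-fetched documents, failing instead of synthesizing
/// a value when either is absent or when they describe different users.
pub fn combine(
    user: Option<User>,
    settings: Option<Settings>,
) -> Result<UserData, AggregateError> {
    let user = user.ok_or(AggregateError::MissingUser)?;
    let settings = settings.ok_or(AggregateError::MissingSettings)?;
    if user.user_id != settings.user_id {
        return Err(AggregateError::OwnerMismatch {
            user: user.user_id,
            settings: settings.user_id,
        });
    }
    Ok(UserData { user, settings })
}
```

A handler can then map `MissingUser` and `MissingSettings` to a not-found response and treat `OwnerMismatch` as a data integrity error worth logging.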
4. Harden against LLM-assisted hallucination when Firestore data is fed to models
When constructing prompts from Firestore data, include explicit instructions to reject fabrication and to cite only retrieved fields. Validate model outputs against the original Firestore values where possible.
async fn build_llm_prompt(db: web::Data<FirestoreDb>, user_id: String) -> Result<String, String> {
    let doc_path = format!("users/{}", user_id);
    let profile: Profile = db
        .get(&doc_path)
        .await
        .map_err(|e| e.to_string())?
        .ok_or_else(|| "Profile not found".to_string())?;
    // Include only retrieved fields in the prompt; do not add inferred content.
    // Label each value by its real field name: user_id is not a display name.
    let prompt = format!(
        "User profile: user_id={}, email={}. Do not invent additional attributes.",
        profile.user_id, profile.email
    );
    Ok(prompt)
}
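Prompt instructions alone do not guarantee that the model stays within the retrieved data, so the output can also be checked against the Firestore values. The following is a minimal sketch of one such check for a single field. The function name `unsupported_emails` and the whitespace-based tokenization are assumptions made for illustration; a production check would likely need a more careful parser and would cover more fields.

```rust
/// Returns every e-mail-like token in `model_output` that does not match the
/// e-mail retrieved from Firestore. An empty result means no unsupported
/// address was found; it does not prove the rest of the text is accurate.
pub fn unsupported_emails(model_output: &str, known_email: &str) -> Vec<String> {
    model_output
        .split_whitespace()
        // Strip surrounding punctuation such as "(", ")", "," and ".".
        .map(|tok| tok.trim_matches(|c: char| !(c.is_alphanumeric() || c == '@')))
        // Treat any remaining token containing '@' as an address candidate.
        .filter(|tok| tok.contains('@'))
        // Keep only candidates that differ from the stored value.
        .filter(|tok| !tok.eq_ignore_ascii_case(known_email))
        .map(|tok| tok.to_string())
        .collect()
}
```

A handler could reject or regenerate the model response whenever the returned list is non-empty, rather than forwarding an address that never appeared in the user's document.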
5. Enforce Firestore security rules and use least-privilege service accounts
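Ensure Firestore security rules restrict reads to authorized paths for any client that talks to Firestore directly. Server-side client libraries authenticated with a service account bypass security rules and are governed by IAM instead, so an Actix service should run under a service account with minimal permissions and enforce per-user authorization in its own handlers. This limits the impact of misconfigurations or compromised handlers.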
6. Schema and consistency checks
Implement periodic validation that Firestore documents conform to expected shapes. Reject documents or responses that contain unexpected nulls or type mismatches instead of filling gaps with invented data.
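A shape check of this kind can be sketched with the standard library alone. `FieldValue`, `FieldType`, `Violation`, and `check_document` are hypothetical names, and the map-based document is a simplified stand-in for whatever representation a Firestore client returns; nested maps and arrays are left out to keep the example short.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a Firestore field value.
#[derive(Debug, Clone, PartialEq)]
pub enum FieldValue {
    Null,
    Bool(bool),
    Number(f64),
    Text(String),
}

/// The types a schema entry may require.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FieldType {
    Bool,
    Number,
    Text,
}

/// One way in which a document deviates from the expected shape.
#[derive(Debug, PartialEq)]
pub enum Violation {
    Missing(&'static str),
    Null(&'static str),
    WrongType { field: &'static str, expected: FieldType },
}

/// Reports every required field that is absent, null, or of the wrong type.
/// Violations are listed in schema order; an empty list means the document
/// conforms to the schema.
pub fn check_document(
    doc: &HashMap<String, FieldValue>,
    schema: &[(&'static str, FieldType)],
) -> Vec<Violation> {
    let mut violations = Vec::new();
    for &(name, expected) in schema {
        match (doc.get(name), expected) {
            (None, _) => violations.push(Violation::Missing(name)),
            (Some(FieldValue::Null), _) => violations.push(Violation::Null(name)),
            (Some(FieldValue::Bool(_)), FieldType::Bool)
            | (Some(FieldValue::Number(_)), FieldType::Number)
            | (Some(FieldValue::Text(_)), FieldType::Text) => {}
            (Some(_), _) => violations.push(Violation::WrongType { field: name, expected }),
        }
    }
    violations
}
```

The same function can back both a periodic audit job and a per-request guard: any non-empty result is a reason to reject the document instead of filling the gap.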
| Remediation Aspect | Action | Outcome |
|---|---|---|
| Input validation | Validate identifiers before queries | Reject malformed or path-altering identifiers before they reach Firestore |
| Type safety | Use strongly-typed structs; require mandatory fields | Avoid silent null interpretation |
| Aggregation safety | Fail if any read is missing instead of synthesizing | No fabricated cross-document relations |
| LLM prompt construction | Include explicit anti-hallucination instructions; cite retrieved fields | Reduce model invention from partial data |
| Security rules | Least-privilege access; server-side reads limited to service accounts | Minimize over-read risks |
Related CWEs: llmSecurity
| CWE ID | Name | Severity |
|---|---|---|
| CWE-754 | Improper Check for Unusual or Exceptional Conditions | MEDIUM |