Severity: HIGH

LLM Data Leakage in AdonisJS with Firestore

LLM Data Leakage in AdonisJS with Firestore — how this specific combination creates or exposes the vulnerability

When building AI-enabled features in AdonisJS that interact with Google Cloud Firestore, developers may inadvertently expose sensitive data through LLM endpoints. This occurs when application code passes raw Firestore documents, including potentially sensitive fields such as internal IDs, timestamps, or user-specific metadata, into prompts or LLM tool calls without proper sanitization.

AdonisJS does not inherently expose Firestore data to LLMs; the risk arises from developer patterns where query results are forwarded to LLM clients or used to construct dynamic prompts. For example, retrieving a user document from Firestore and directly including its contents in a system prompt can lead to system prompt leakage, where internal instructions or context are exposed to the LLM or downstream observers.

LLM-specific checks in middleBrick detect patterns where Firestore query results containing sensitive keys are used in LLM invocations. Real-world attack patterns include using overly permissive Firestore security rules that allow an unauthenticated API endpoint to read documents containing private metadata, which then become part of LLM input. This can facilitate prompt injection attempts where injected text appears to originate from the data itself, or enable cost exploitation if token usage is tied to uncontrolled data volumes.

Because Firestore documents often contain nested objects and arrays, developers may inadvertently pass entire document structures into LLM functions. This increases the risk of output scanning failures if LLM responses contain PII or API keys derived from the input data. middleBrick’s LLM/AI Security checks specifically look for these conditions by correlating runtime input sources like Firestore query results with active prompt injection probes, including system prompt extraction and data exfiltration tests.

In an AdonisJS application, an insecure implementation might look like a route handler that fetches a user profile and sends it to an LLM without field filtering. If the Firestore document includes an apiKey or internalRole field, these could be exposed through LLM outputs or used in excessive agency scenarios where tool calls are generated based on uncontrolled input. Proper remediation involves strict input validation, field-level filtering before prompt construction, and ensuring that only necessary, sanitized data reaches the LLM layer.
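
For illustration only, the insecure pattern described above might look like the sketch below. The llmClient module and its complete() method are hypothetical placeholders for whatever LLM SDK the application uses; only the Firestore calls reflect the real client library.

// ILLUSTRATIVE ANTI-PATTERN: do not copy into production code
const { Firestore } = require('@google-cloud/firestore');
const firestore = new Firestore();

// Hypothetical wrapper around the LLM SDK in use
const llmClient = require('../services/llmClient');

class AssistantController {
  // AdonisJS-style controller method receiving the HTTP context
  async assist({ params, request }) {
    const userDoc = await firestore.collection('users').doc(params.userId).get();

    // The entire raw document (including fields like apiKey or internalRole)
    // is interpolated into the system prompt with no field filtering
    const systemPrompt = `You are assisting this user: ${JSON.stringify(userDoc.data())}`;

    const answer = await llmClient.complete({
      system: systemPrompt,
      user: request.input('question'),
    });

    return { answer };
  }
}

module.exports = AssistantController;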

Firestore-Specific Remediation in AdonisJS — concrete code fixes

To mitigate LLM data leakage when using Firestore in AdonisJS, apply strict data selection and sanitization before passing information to LLM endpoints. Always retrieve only the fields required for the immediate operation and remove or mask sensitive metadata.

// Safe Firestore document retrieval in Adonisjs
const { Firestore } = require('@google-cloud/firestore');
const firestore = new Firestore();

async function getPublicUserProfile(userId) {
  const userDoc = await firestore
    .collection('users')
    .doc(userId)
    .get();

  if (!userDoc.exists) {
    throw new Error('User not found');
  }

  const data = userDoc.data();
  // Explicitly allow-listed fields to prevent leakage
  return {
    displayName: data.displayName,
    email: data.email,
    avatarUrl: data.avatarUrl,
  };
}

Use the filtered document when constructing prompts to ensure that no internal or sensitive fields such as apiKey, internalRole, or createdAt are included in LLM input.

// Constructing a safe prompt for LLM usage
const userProfile = await getPublicUserProfile('user_123');
const prompt = `You are assisting user ${userProfile.displayName} with email ${userProfile.email}. Provide general guidance only.`;

When using Firestore arrays or nested objects, explicitly traverse and sanitize contents rather than passing raw data structures to the LLM. Validate and limit token usage by calculating approximate input sizes before sending to the LLM endpoint.

// Example of handling Firestore map fields safely: keep only allow-listed
// 'public_' keys and recurse into nested maps instead of copying them wholesale
function sanitizeNestedData(data) {
  const safeObject = {};
  for (const [key, value] of Object.entries(data)) {
    if (!key.startsWith('public_')) {
      continue; // drop anything not explicitly marked as public
    }
    if (typeof value === 'string') {
      safeObject[key] = value;
    } else if (value && typeof value === 'object' && !Array.isArray(value)) {
      // Recurse into nested maps so inner fields are filtered too
      safeObject[key] = sanitizeNestedData(value);
    }
  }
  return safeObject;
}
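
To illustrate the input-size check mentioned above, the sketch below uses a rough characters-per-token heuristic; the 4-characters-per-token ratio and the MAX_PROMPT_TOKENS value are assumptions for the example, not limits from any particular model or provider.

// Rough input-size guard applied before sending a prompt to the LLM
const MAX_PROMPT_TOKENS = 2000; // example budget; adjust for the model in use

function estimateTokens(text) {
  // Coarse heuristic: roughly 4 characters per token for English text
  return Math.ceil(text.length / 4);
}

function assertWithinTokenBudget(promptText) {
  const estimated = estimateTokens(promptText);
  if (estimated > MAX_PROMPT_TOKENS) {
    throw new Error(`Prompt too large: ~${estimated} tokens exceeds budget of ${MAX_PROMPT_TOKENS}`);
  }
  return promptText;
}

// Usage: sanitize first, then size-check the final prompt string
// const safeData = sanitizeNestedData(userDoc.data());
// const prompt = assertWithinTokenBudget(`Context: ${JSON.stringify(safeData)}`);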

In AdonisJS services that integrate with LLMs, centralize data preparation logic to ensure consistent filtering across endpoints. Combine this with environment-controlled rules to dynamically adjust which fields are permitted based on deployment context.
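
One way to centralize that logic is sketched below; the service name, the default field list, and the LLM_ALLOWED_PROFILE_FIELDS environment variable are assumptions for the example rather than AdonisJS or Firestore conventions.

// app/Services/LlmDataPreparationService.js (hypothetical central service)
class LlmDataPreparationService {
  constructor() {
    // The allow-list can be tightened per environment, for example
    // LLM_ALLOWED_PROFILE_FIELDS="displayName,avatarUrl" in production
    const fromEnv = process.env.LLM_ALLOWED_PROFILE_FIELDS;
    this.allowedProfileFields = fromEnv
      ? fromEnv.split(',').map((field) => field.trim())
      : ['displayName', 'email', 'avatarUrl'];
  }

  // Reduce any Firestore document to the allow-listed string fields only
  prepareProfileForLlm(documentData) {
    const prepared = {};
    for (const field of this.allowedProfileFields) {
      if (typeof documentData[field] === 'string') {
        prepared[field] = documentData[field];
      }
    }
    return prepared;
  }
}

module.exports = new LlmDataPreparationService();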

For applications using middleBrick Pro, continuous monitoring can help identify risky patterns where Firestore documents are exposed to LLM endpoints, while the CLI and GitHub Action integrations enable automated checks in development and deployment pipelines.

Related CWEs

CWE ID: CWE-754
Name: Improper Check for Unusual or Exceptional Conditions
Severity: MEDIUM

Frequently Asked Questions

How can I verify that sensitive Firestore fields are not reaching the LLM in my AdonisJS app?
Review your route handlers and service layers to confirm that only explicitly allowed fields are used when constructing LLM prompts. Implement logging for prompt inputs and validate using middleBrick’s LLM/AI Security checks, which include active prompt injection tests and output scanning for PII or secrets.
Do Firestore’s built-in security rules fully protect against LLM data leakage in AdonisJS?
Firestore security rules control database access, but they do not prevent application-level code from forwarding data to LLMs. You must enforce field-level filtering in your AdonisJS code to ensure that sensitive document fields are not included in prompts, regardless of Firestore rules.