LLM chat endpoint security

What middleBrick covers

  • Black-box LLM endpoint probing without code access
  • Adversarial prompt tests across three scan tiers
  • Authentication support for tokens and keys
  • Mapping findings to OWASP API Top 10 (2023)
  • Header allowlist for controlled request forwarding
  • Prioritized risk scoring with remediation guidance

Threat model for LLM chat endpoints

LLM chat endpoints expose prompts, context, and configuration through a model-facing API surface. Attackers probe these routes to extract system instructions, jailbreak guardrails, coerce unintended behavior, or harvest sensitive data from responses. Because prompts travel in headers, body fields, or metadata, misconfigured endpoints can leak credentials or internal logic. Black-box scanning validates what an attacker can observe without source code or internal network access, testing only the runtime behavior of the endpoint.

How black-box scanning evaluates LLM chat security

The scanner interacts with chat completions endpoints using read-only methods and text-only POST payloads, avoiding destructive or intrusive tests. It sends adversarial probes across three scan tiers to map risks spanning prompt injection, data exfiltration, and token abuse. Findings are mapped to the OWASP API Top 10 (2023) and highlight issues such as system prompt extraction, instruction override attempts, jailbreak patterns, and PII leakage in responses.
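The response-analysis step described above can be sketched as a small classifier that matches model output against known leak indicators. The pattern strings and category names below are illustrative assumptions, not the scanner's actual signatures:

```python
import re

# Illustrative leak indicators only; the scanner's real signature set is not public.
LEAK_PATTERNS = {
    "system_prompt_extraction": re.compile(
        r"(?i)(my system prompt|my instructions are|you are a helpful assistant)"),
    "instruction_override": re.compile(
        r"(?i)(ignoring previous instructions|developer mode enabled)"),
}

def classify_response(text: str) -> list[str]:
    """Return the finding categories whose indicators appear in a model response."""
    return [name for name, pattern in LEAK_PATTERNS.items() if pattern.search(text)]
```

Because the probes themselves are text-only POST payloads, the interesting logic lives entirely in this response-side matching: a probe "succeeds" only when the reply reveals something it should not.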

Covered LLM security categories

Detection focuses on input handling, model behavior, and output safety. The scanner checks for missing or weak authentication on chat routes, excessive data exposure in responses, and insecure handling of URL-accepting parameters that could enable SSRF. It also identifies missing versioning and server fingerprinting around the chat API, alongside risks such as debug endpoints or dangerous HTTP methods that expand the attack surface.
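Excessive data exposure checks of the kind listed above typically scan response bodies for sensitive patterns. A minimal sketch, assuming a simple regex rule set (production scanners use far broader detectors):

```python
import re

# Two illustrative PII/secret detectors; real rule sets cover many more categories.
PII_RULES = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

def find_pii(response_body: str) -> dict[str, list[str]]:
    """Map each sensitive-data category to the matches found in a response body."""
    hits = {name: pattern.findall(response_body) for name, pattern in PII_RULES.items()}
    return {name: matches for name, matches in hits.items() if matches}
```

Running this over every chat completion collected during a scan flags responses that echo credentials or personal data back to the client.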

Authenticated scanning requirements

Higher-tier scans support authenticated chat sessions using Bearer tokens, API keys, Basic auth, or cookies. Domain verification is enforced through DNS TXT records or an HTTP well-known file to ensure only the domain owner can submit credentials. The scanner forwards a restricted allowlist of headers, including Authorization, X-API-Key, Cookie, and X-Custom-* headers, limiting exposure of internal infrastructure.
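A header allowlist like the one described can be sketched as a filter over outgoing request headers, assuming case-insensitive matching and glob-style handling of the X-Custom-* wildcard:

```python
from fnmatch import fnmatch

# Allowlist from the docs above; X-Custom-* is treated as a glob pattern.
FORWARDABLE = ("Authorization", "X-API-Key", "Cookie", "X-Custom-*")

def filter_headers(headers: dict[str, str]) -> dict[str, str]:
    """Keep only headers that match the forwarding allowlist (case-insensitive)."""
    return {
        name: value for name, value in headers.items()
        if any(fnmatch(name.lower(), pattern.lower()) for pattern in FORWARDABLE)
    }
```

Anything outside the allowlist, such as internal routing or tracing headers, is dropped before the scanner forwards the request, which is what limits exposure of internal infrastructure.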

Remediation guidance and limitations

The scanner detects and reports findings with prioritized risk scores and remediation guidance, but it does not patch, block, or fix runtime behavior. LLM-specific business logic vulnerabilities require domain expertise and cannot be automatically detected. Organizations should combine scanner output with manual review and red-team exercises for high-stakes architectures.
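Prioritized risk scoring of the kind mentioned above can be sketched as a severity lookup with OWASP tags. The weights and category-to-tag mapping below are hypothetical, chosen only to illustrate the shape of the output:

```python
from dataclasses import dataclass

# Hypothetical severity weights and OWASP API Top 10 (2023) tags for illustration.
SEVERITY = {"missing_auth": 9.5, "pii_leakage": 9.0, "system_prompt_extraction": 8.0}
OWASP_TAG = {"missing_auth": "API2:2023 Broken Authentication"}

@dataclass
class Finding:
    category: str
    score: float
    owasp: str

def prioritize(categories: list[str]) -> list[Finding]:
    """Turn detected categories into findings sorted highest risk first."""
    findings = [
        Finding(c, SEVERITY.get(c, 1.0), OWASP_TAG.get(c, "unmapped"))
        for c in categories
    ]
    return sorted(findings, key=lambda f: f.score, reverse=True)
```

The ordering is what makes the report actionable: remediation effort goes to the top of the list first, while low-weight findings wait for manual review.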

Frequently Asked Questions

Can this scanner test authenticated chat APIs?
Yes. Provide Bearer tokens, API keys, Basic auth, or cookies through the dashboard or CLI. Authentication requires domain verification to confirm ownership before scanning protected endpoints.
What LLM-specific risks are detected?
The scanner identifies system prompt extraction, jailbreak and DAN attempts, instruction override, data exfiltration probes, token smuggling, and prompt injection variants across Quick, Standard, and Deep scan tiers.
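The tier names (Quick, Standard, Deep) come from the answer above; which probe categories run at each tier is an assumption in this sketch, shown only to illustrate how tiers widen coverage:

```python
# Tier names are from the docs; the per-tier probe assignment is illustrative.
TIER_PROBES = {
    "Quick": ["prompt_injection"],
    "Standard": ["prompt_injection", "system_prompt_extraction", "jailbreak"],
    "Deep": ["prompt_injection", "system_prompt_extraction", "jailbreak",
             "instruction_override", "data_exfiltration", "token_smuggling"],
}

def probes_for(tier: str) -> list[str]:
    """Return the probe categories for a scan tier, rejecting unknown tier names."""
    try:
        return TIER_PROBES[tier]
    except KeyError:
        raise ValueError(f"unknown scan tier: {tier!r}") from None
```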
Does scanning affect the model or chat completions behavior?
No. Testing is read-only and uses text-only POST payloads. The scanner never sends destructive payloads or alters model state, ensuring operational safety during assessment.
How are findings mapped to compliance frameworks?
Findings map directly to OWASP API Top 10 (2023). The tool supports alignment with SOC 2 Type II and PCI-DSS 4.0 by surfacing evidence relevant to those control sets.