Multi-turn manipulation audit
What middleBrick covers
- Multi-turn adversarial probe execution across three scan tiers
- OpenAPI 3.0/3.1 and Swagger 2.0 parsing with recursive $ref resolution
- Detection of prompt injection, jailbreak, and PII extraction patterns
- Identification of encoding bypasses and token smuggling attempts
- Cross-reference runtime behavior against API specification definitions
- Programmatic access via CLI, API client, and MCP Server
What multi-turn manipulation audit means
Multi-turn manipulation audit examines how an API-driven LLM agent behaves across a chain of requests. Instead of a single prompt, the test follows a workflow where earlier responses influence later instructions, tool usage, and data exposure. The goal is to detect whether an agent can be steered into executing unintended actions, exfiltrating information, or bypassing guardrails over multiple conversational turns.
Risks of skipping this workflow
Without a structured multi-turn audit, teams miss chained prompt-injection paths that only appear after several interactions. An agent may appear safe in isolation but can be coaxed into roleplay jailbreaks, instruction override, or PII extraction after following a sequence of plausible requests. This exposes surface to indirect prompt injection, token smuggling, and tool-abuse that single-turn tests do not surface, increasing the chance of real-world compromise through conversational interfaces.
A practical audit workflow
Begin with reconnaissance to identify endpoints that accept LLM inputs or generate agentic behavior. Run a baseline quick scan to establish a score, then progress to Standard and Deep tiers that increase chain length and probe complexity. Track findings across turns, correlate indicators such as repeated jailbreak patterns or encoded exfiltration, and document which turns trigger unsafe outcomes. Use the results to tighten system instructions, constrain tool permissions, and add turn-level monitoring.
middlebrick scan https://api.example.com/agent --tier deep --output jsonCoverage provided by middleBrick
middleBrick maps findings to OWASP API Top 10 (2023) and supports audit evidence for SOC 2 Type II and PCI-DSS 4.0. It runs 18 adversarial probes across three scan tiers focused on system prompt extraction, instruction override, DAN and roleplay jailbreaks, data exfiltration, cost exploitation, encoding bypasses, translation-embedded injection, few-shot poisoning, markdown injection, multi-turn manipulation, indirect prompt injection, token smuggling, tool-abuse, nested instruction injection, and PII extraction. The scanner parses OpenAPI 3.0, 3.1, and Swagger 2.0 with recursive $ref resolution and cross-references spec definitions against runtime behavior to highlight undefined security schemes and deprecated operations.
Operational considerations and limitations
middleBrick is a scanner and does not fix, patch, or remediate findings. It does not perform active SQL injection or command injection testing, and it does not detect business logic vulnerabilities that require domain context. Blind SSRF that relies on out-of-band infrastructure is out of scope. The tool surfaces findings and remediation guidance; validation and prioritization still require human expertise aligned with your application architecture and threat model.