ARP Spoofing in Together Ai
How ARP Spoofing Manifests in Together Ai
Together Ai provides model inference over HTTPS endpoints such as https://api.together.xyz/v1/completions. Clients authenticate with a Bearer token in the Authorization header and send JSON payloads that contain prompts, model parameters, and sometimes sensitive data. When an attacker is on the same Layer‑2 network (e.g., shared Wi‑Fi, a corporate LAN, or a compromised container host), they can use ARP spoofing to convince nearby devices that the attacker’s MAC address belongs to the default gateway, so that traffic bound for the Together Ai API passes through the attacker’s machine.
Once the ARP tables are poisoned, traffic destined for the Together Ai API endpoint is routed through the attacker’s machine. Because that traffic is TLS‑encrypted, the attacker sees only ciphertext unless the TLS layer is also defeated, for example when the client disables certificate verification, trusts a rogue CA, or can be downgraded to plain HTTP. When that happens, the attacker can:
- Perform a classic man‑in‑the‑middle (MITM) attack and capture the Authorization header, thereby stealing the API key.
- Replay or modify requests: inject malicious prompts that cause the model to emit disallowed content, exfiltrate data, or trigger costly model invocations (the latter is flagged by middleBrick’s LLM/AI security check for cost exploitation).
- Alter responses: embed executable code or PII in the model output, which the client may then execute or store, leading to secondary compromise.
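Because the Together Ai API is typically accessed via standard HTTP libraries without additional network‑level protections, TLS certificate validation is usually the only barrier between the client and an on‑path attacker. If that validation is disabled or bypassed, a successful ARP spoof gives the attacker full visibility into the request/response stream and the ability to act as a trusted intermediary.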
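Whether a Python client would reject an impersonated endpoint comes down largely to its TLS context. The following is a minimal sketch, using only the standard library, of a check that a context verifies both the certificate chain and the hostname. The helper name `resists_interception` is hypothetical and introduced here for illustration.

```python
import ssl


def resists_interception(ctx: ssl.SSLContext) -> bool:
    """Return True if the context verifies the server certificate and hostname."""
    return ctx.verify_mode == ssl.CERT_REQUIRED and ctx.check_hostname


# Python's default client context enforces both checks.
strict_ctx = ssl.create_default_context()

# A context configured the way "disable verification" shortcuts behave.
lax_ctx = ssl.create_default_context()
lax_ctx.check_hostname = False       # must be disabled before verify_mode is relaxed
lax_ctx.verify_mode = ssl.CERT_NONE  # accepts any certificate, including an attacker's

print(resists_interception(strict_ctx))  # True
print(resists_interception(lax_ctx))     # False
```

A client built on the lax context would complete a handshake with whatever host answers, which is exactly the condition an ARP-spoofing attacker needs.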
Together Ai-Specific Detection
Detecting ARP spoofing directly from an API scan is outside middleBrick’s scope, but the scanner can reveal the conditions that make an API vulnerable to MITM attacks and highlight successful exploitation attempts.
When you run a middleBrick scan against a Together Ai endpoint, the following checks are particularly relevant:
- Encryption – verifies that TLS is enforced, checks for weak cipher suites, and flags missing HSTS headers. Absence of strong TLS makes traffic easier to intercept once ARP is poisoned.
- Data Exposure – scans responses for leaked API keys, tokens, or internal IPs that could aid an attacker in refining a MITM attack.
- LLM/AI Security – runs active prompt‑injection probes; if an attacker can inject via a compromised channel, these probes will detect successful instruction override or data exfiltration attempts.
- Inventory Management – lists exposed endpoints; any undocumented or internal‑only routes appearing in the response may indicate a misconfigured proxy that could be leveraged after ARP spoof.
Example CLI usage:
middlebrick scan https://api.together.xyz/v1/completions
The output (JSON) might include:
{
  "overall_score": 42,
  "letter_grade": "F",
  "categories": {
    "Encryption": {"score": 30, "findings": [{"severity": "high", "description": "Missing HSTS header"}]},
    "LLM/AI Security": {"score": 55, "findings": [{"severity": "medium", "description": "Prompt injection possible via instruction override"}]}
  }
}
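A report shaped like the example above can be triaged with a few lines of Python. This is a sketch only: the field names are taken from the sample output, the real schema may differ, and the threshold of 60 and the helper names are assumptions made for illustration.

```python
import json

# Sample report copied from the example output; a real scan may use a different schema.
SAMPLE_REPORT = """
{
  "overall_score": 42,
  "letter_grade": "F",
  "categories": {
    "Encryption": {"score": 30, "findings": [{"severity": "high", "description": "Missing HSTS header"}]},
    "LLM/AI Security": {"score": 55, "findings": [{"severity": "medium", "description": "Prompt injection possible via instruction override"}]}
  }
}
"""


def weak_categories(report: dict, threshold: int = 60) -> list[str]:
    """Names of categories scoring below the (assumed) threshold, sorted."""
    categories = report.get("categories", {})
    return sorted(name for name, cat in categories.items() if cat.get("score", 0) < threshold)


def high_severity_findings(report: dict) -> list[tuple[str, str]]:
    """(category, description) pairs for every finding marked 'high'."""
    results = []
    for name, cat in report.get("categories", {}).items():
        for finding in cat.get("findings", []):
            if finding.get("severity") == "high":
                results.append((name, finding.get("description", "")))
    return results


report = json.loads(SAMPLE_REPORT)
print(weak_categories(report))
print(high_severity_findings(report))
```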
A low Encryption score or findings related to exposed credentials should prompt a network‑level review for ARP anomalies (e.g., using arp -a on Linux or Get-NetNeighbor on Windows) and deployment of ARP‑monitoring tools such as arpwatch or XArp.
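One common sign of ARP poisoning is a single unicast MAC address answering for several IP addresses, typically the gateway plus the attacker's own host. The sketch below parses text in the style of `arp -a` output and reports such MACs. The regular expressions, the `shared_macs` helper, and the sample table are illustrative assumptions. A match is a prompt for investigation rather than proof, since proxy ARP and multi-address interfaces can produce the same pattern legitimately.

```python
import re
from collections import defaultdict

IP_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3})\b")
MAC_RE = re.compile(r"\b((?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2})\b")


def shared_macs(arp_output: str) -> dict[str, list[str]]:
    """Map each unicast MAC that appears with more than one IP to those IPs."""
    ips_by_mac = defaultdict(set)
    for line in arp_output.splitlines():
        ip, mac = IP_RE.search(line), MAC_RE.search(line)
        if not (ip and mac):
            continue  # header or interface line without a full entry
        normalized = mac.group(1).lower().replace("-", ":")
        if int(normalized[:2], 16) & 1:
            continue  # broadcast/multicast MACs legitimately map to many IPs
        ips_by_mac[normalized].add(ip.group(1))
    return {m: sorted(ips) for m, ips in ips_by_mac.items() if len(ips) > 1}


# Hypothetical table in which the gateway (.1) and another host share a MAC.
SAMPLE = """\
? (192.168.1.1) at aa:bb:cc:dd:ee:ff [ether] on eth0
? (192.168.1.23) at aa:bb:cc:dd:ee:ff [ether] on eth0
? (192.168.1.40) at 12:34:56:78:9a:bc [ether] on eth0
"""

print(shared_macs(SAMPLE))
```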
Together Ai-Specific Remediation
Mitigating ARP spoofing requires protecting the communication channel between the client and the Together Ai API. The following controls are effective, but some depend on features that your Together Ai plan or network setup may not offer, so confirm availability before relying on them.
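- Enforce Mutual TLS (mTLS) – mTLS only works when the server side requests client certificates, which typically means a gateway or private endpoint in front of the API. Where that is available, configure the client to present a client certificate and to strictly validate the server’s certificate. Server validation is what stops an ARP‑spoofing attacker from impersonating the endpoint; the client certificate additionally prevents the attacker from opening an authenticated session of their own, because they lack the matching private key.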
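- Enable HSTS and Certificate Pinning – HSTS is set by the server and honored by browsers, so it protects browser‑based clients against downgrade to plain HTTP. For programmatic clients, pin the expected certificate or public key and refuse connections that do not match the pin.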
- Restrict Access via IP Allowlist or Private Networking – For customers who need network isolation, private connectivity options such as VPC peering, AWS PrivateLink, or Azure Private Endpoint may be available; confirm with Together Ai which of these apply to your account. Routing API traffic over a private link keeps it off shared LAN segments where ARP poisoning is possible, and an IP allowlist limits where a stolen key can be used from.
- Rotate and Scope API Keys – Use short‑lived tokens or OAuth‑style scopes so that a stolen key has limited utility and expires quickly.
- Monitor Usage for Anomalies – Set up alerts in the Together Ai dashboard for sudden spikes in token consumption or calls to unexpected models; these can indicate that a compromised key is being abused after an MITM attack.
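The pinning control above can be sketched with the standard library alone. This example pins the SHA-256 fingerprint of the whole leaf certificate, which is simpler than public-key (SPKI) pinning but has to be updated on every certificate rotation. The helper names are hypothetical, and any pin value you use must be obtained over a trusted channel.

```python
import hashlib
import hmac
import socket
import ssl


def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned_hex: str) -> bool:
    """Compare a certificate against a pinned fingerprint (colons and case ignored)."""
    expected = pinned_hex.lower().replace(":", "")
    return hmac.compare_digest(cert_fingerprint(der_cert), expected)


def fetch_server_cert(host: str, port: int = 443, timeout: float = 5.0) -> bytes:
    """Return the server's leaf certificate (DER) after normal chain and hostname checks."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


# Usage sketch (requires network access; PINNED_SHA256 is a placeholder you must supply):
#   der = fetch_server_cert("api.together.xyz")
#   if not matches_pin(der, PINNED_SHA256):
#       raise ssl.SSLError("certificate does not match pinned fingerprint")
```

Pinning is applied on top of normal validation here, not instead of it, so a mismatch aborts the connection even when the presented certificate chains to a trusted CA.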
Code example – Python client using raw requests with mTLS (this assumes an endpoint that requests client certificates, such as a gateway or private endpoint you have set up):
import os
import requests

# Paths to client certificate and key (PEM format)
CLIENT_CERT = os.getenv('TOGETHER_CLIENT_CERT', '/etc/together/client.crt')
CLIENT_KEY = os.getenv('TOGETHER_CLIENT_KEY', '/etc/together/client.key')

# CA bundle used to validate the server certificate
CA_BUNDLE = os.getenv('TOGETHER_CA_BUNDLE', '/etc/together/ca.crt')

# Optional: use a private endpoint if you have set one up via VPC peering
BASE_URL = os.getenv('TOGETHER_AI_BASE_URL', 'https://api.together.xyz')

# Fail fast with a KeyError if the API key is not set
headers = {
    'Authorization': f'Bearer {os.environ["TOGETHER_API_KEY"]}',
    'Content-Type': 'application/json'
}

payload = {
    "model": "togethercomputer/llama-2-7b-chat",
    "prompt": "Explain quantum entanglement in simple terms.",
    "max_tokens": 150
}

response = requests.post(
    f'{BASE_URL}/v1/completions',
    json=payload,
    headers=headers,
    cert=(CLIENT_CERT, CLIENT_KEY),  # client cert for mTLS
    verify=CA_BUNDLE,                # server cert validation
    timeout=30                       # avoid hanging on a stalled connection
)
response.raise_for_status()  # surface HTTP errors instead of parsing an error body
print(response.json())
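If you prefer the Together Ai SDK, pass the API key and base URL through its constructor. Client‑certificate and CA‑bundle options are not confirmed constructor parameters, so check the SDK documentation for your version before relying on it for mTLS:
import os
from together import Together

# NOTE: cert= and verify= are not confirmed arguments of the SDK constructor;
# consult the SDK documentation for how your version configures TLS.
# base_url may need to include the API version path, depending on the SDK version.
client = Together(
    api_key=os.getenv('TOGETHER_API_KEY'),
    base_url=os.getenv('TOGETHER_AI_BASE_URL')
)

result = client.completions.create(
    model='togethercomputer/llama-2-7b-chat',
    prompt='Explain quantum entanglement in simple terms.',
    max_tokens=150
)
print(result)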
By combining strict certificate validation (with mTLS where the endpoint supports it), private networking, and vigilant key management, you remove most of the practical benefit an attacker gains from ARP spoofing against Together Ai services.