Api Rate Abuse in Aws Bedrock
How Api Rate Abuse Manifests in Aws Bedrock
Api rate abuse in Aws Bedrock occurs when attackers exploit the lack of proper rate limiting on Bedrock API endpoints, consuming disproportionate resources through legitimate API calls. This manifests in several specific ways within the Bedrock ecosystem.
# Vulnerable Bedrock API call without rate limiting
import boto3
from botocore.exceptions import ClientError
def generate_text_with_bauble(prompt):
bedrock = boto3.client('bedrock-runtime')
# No rate limiting - attacker can call this repeatedly
response = bedrock.invoke_model(
body={
'messages': [{'role': 'user', 'content': prompt}]
},
modelId='amazon.bauble-instruct-1'
)
return response['content']['text']
# Attack scenario: automated script calling this function thousands of times
for i in range(10000):
generate_text_with_bauble("Repeated prompt to abuse rate limits")
The above code demonstrates a critical vulnerability - there's no throttling mechanism to prevent abuse. An attacker can programmatically invoke Bedrock models thousands of times, potentially exhausting API quotas, incurring massive costs, or degrading service for legitimate users.
Specific Bedrock rate abuse patterns include:
- Prompt flooding: Rapid-fire identical or slightly varied prompts to Bauble or Claude models
- Token exhaustion: Crafting prompts designed to maximize token usage per request
- Sequential model abuse: Cycling through multiple Bedrock models to bypass per-model limits
- Concurrent session abuse: Opening numerous parallel sessions to overwhelm Bedrock's concurrency limits
The financial impact is particularly severe with Bedrock's token-based pricing. A single Bedrock model invocation might cost $0.0025 per 1K tokens. Without rate limiting, an attacker could generate $250 worth of tokens in seconds by making 100,000 rapid requests.
Aws Bedrock-Specific Detection
Detecting rate abuse in Aws Bedrock requires monitoring both API Gateway metrics and Bedrock-specific usage patterns. middleBrick's Bedrock-specific scanning identifies these vulnerabilities through black-box testing of your Bedrock endpoints.
# Scan Bedrock API for rate abuse vulnerabilities
middlebrick scan https://api.bedrock.amazonaws.com --profile=bedrock
middleBrick tests for Bedrock-specific rate abuse by:
- Testing unauthenticated access to Bedrock model endpoints
- Checking for missing rate limiting headers (X-RateLimit-Limit, X-RateLimit-Remaining)
- Analyzing API Gateway configurations for Bedrock integrations
- Verifying IAM policy restrictions on Bedrock model invocations
Key detection indicators in Bedrock:
| Indicator | Bedrock-Specific Pattern | Severity |
|---|---|---|
| Missing rate limiting | InvokeModel API accepts unlimited requests | Critical |
| Open IAM policies | Allow * for bedrock:InvokeModel | High |
| Exposed model IDs | Bedrock model IDs visible in client-side code | Medium |
| No usage quotas | Service quotas not configured for Bedrock | High |
CloudWatch metrics to monitor for Bedrock rate abuse:
# Monitor Bedrock for suspicious patterns
import boto3
from datetime import datetime, timedelta
def detect_bedrock_abuse():
cloudwatch = boto3.client('cloudwatch')
# Check for abnormal invocation patterns
response = cloudwatch.get_metric_statistics(
Namespace='AWS/Bedrock',
MetricName='InvokeModelRequests',
Dimensions=[{'Name': 'ModelId', 'Value': 'amazon.bauble-instruct-1'}],
StartTime=datetime.utcnow() - timedelta(minutes=5),
EndTime=datetime.utcnow(),
Period=300,
Statistics=['Sum', 'SampleCount']
)
if response['Datapoints'] and response['Datapoints'][0]['SampleCount'] > 100:
print(f"Potential rate abuse detected: {response['Datapoints'][0]['SampleCount']} requests")
Aws Bedrock-Specific Remediation
Remediating rate abuse in Aws Bedrock requires implementing controls at multiple layers - API Gateway, IAM policies, and application logic. Here are Bedrock-specific fixes using native AWS features.
# Secure Bedrock wrapper with rate limiting
import boto3
import time
from ratelimit import limits, sleep_and_retry
# 100 requests per minute per user
ONE_MINUTE = 60
RATE_LIMIT = 100
@sleep_and_retry
@limits(calls=RATE_LIMIT, period=ONE_MINUTE)
def invoke_bedrock_model(prompt, model_id='amazon.bauble-instruct-1'):
bedrock = boto3.client('bedrock-runtime')
try:
response = bedrock.invoke_model(
body={
'messages': [{'role': 'user', 'content': prompt}]
},
modelId=model_id
)
return response['content']['text']
except bedrock.exceptions.ProvisionedThroughputExceededException:
time.sleep(1)
return invoke_bedrock_model(prompt, model_id)
# Usage with proper rate limiting
result = invoke_bedrock_model("Hello, Bedrock!")
API Gateway configuration for Bedrock rate limiting:
# SAM template with Bedrock rate limiting
AWSTemplateFormatVersion: '2010-09-09'
Resources:
BedrockApi:
Type: AWS::Serverless::Api
Properties:
DefinitionUri: openapi.yaml
EndpointConfiguration: REGIONAL
MethodSettings:
- HttpMethod: POST
ResourcePath: /{proxy+}
ThrottlingBurstLimit: 100
ThrottlingRateLimit: 10
BedrockIntegration:
Type: AWS::Serverless::Function
Properties:
Handler: index.handler
Runtime: python3.9
Policies:
- Version: '2012-10-17'
Statement:
- Effect: Allow
Action:
- bedrock:InvokeModel
Resource: arn:aws:bedrock:us-east-1:123456789012:model/*
Condition:
# Custom condition to enforce rate limiting
ForAnyValue:StringEquals:
aws:userid: !Ref WhitelistedUsers
Additional Bedrock-specific protections:
- IAM policy restrictions: Limit which IAM roles can invoke specific Bedrock models
- Service quotas: Configure default service quotas for Bedrock model invocations
- Token counting: Implement token counting middleware to track cost per request
- Geographic restrictions: Use WAF to block requests from unexpected regions
Monitoring and alerting setup:
# Set up CloudWatch alarms for Bedrock abuse
import boto3
from datetime import datetime, timedelta
def setup_bedrock_alarms():
cloudwatch = boto3.client('cloudwatch')
# Alarm for excessive requests
cloudwatch.put_metric_alarm(
AlarmName='Bedrock_HighRequestRate',
ComparisonOperator='GreaterThanThreshold',
EvaluationPeriods=1,
MetricName='InvokeModelRequests',
Namespace='AWS/Bedrock',
Period=300, # 5 minutes
Statistic='Sum',
Threshold=500, # Threshold for alert
AlarmActions=['arn:aws:sns:us-east-1:123456789012:BedrockAlarms']
)