NoSQL Injection in Flask with DynamoDB
NoSQL Injection in Flask with DynamoDB: how this specific combination creates or exposes the vulnerability
NoSQL injection in a Flask application that uses Amazon DynamoDB typically occurs when application code builds DynamoDB expressions from untrusted input. DynamoDB's core API does not accept a SQL-style query string; instead, requests carry condition expressions, filter expressions, and key-condition expressions, which are short strings that the service parses. If these expressions are built by string concatenation, or if user-controlled data is passed directly into API parameters, an attacker can change the logical structure of the request.
In Flask, a common pattern is to read query parameters (e.g., via request.args.get) and use them directly in DynamoDB calls such as query or scan. For example, consider filtering items by a user-provided `user_id`:
table.query(KeyConditionExpression=Key('user_id').eq(request.args.get('user_id')))
This form is safe against expression injection because the AWS SDK binds the value as an expression attribute value rather than splicing it into the expression text. However, if the application instead builds a filter expression as a string and passes it to scan, or constructs key conditions dynamically, injection becomes possible:
user_input = request.args.get('filter')
response = table.scan(FilterExpression=user_input)
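An attacker could supply attribute_exists(id), which is true for every item that has an id attribute, so the scan returns the whole table instead of the subset the application intended. DynamoDB expressions do not accept inline literals such as 'admin', so a comparison has to reference placeholders; where the application also lets the request influence those placeholders, an attacker with knowledge of attribute names (or one who enumerates them) can attempt to bypass authorization or extract data. DynamoDB’s support for expression attribute names and values mitigates this when used correctly, but the protection breaks down if the application lets user input choose the attribute behind a name placeholder or splices input into the expression text around the placeholders.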
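To make the contrast concrete, here is a minimal sketch of a parameterized filter in which user input only ever travels as a bound value. The helper name `build_filter_kwargs` and the `ALLOWED_ATTRIBUTES` set are hypothetical; the dictionary keys are the `FilterExpression`, `ExpressionAttributeNames`, and `ExpressionAttributeValues` parameters accepted by boto3's `Table.scan`.

```python
# Hypothetical helper: builds keyword arguments for a boto3 Table.scan() call.
# The expression text is a constant; user input appears only as a bound value.

ALLOWED_ATTRIBUTES = {"category", "in_stock"}  # assumed schema, adjust to yours


def build_filter_kwargs(attribute, value):
    """Return scan kwargs for `attribute = value` using placeholders."""
    if attribute not in ALLOWED_ATTRIBUTES:
        raise ValueError(f"attribute not allowed: {attribute!r}")
    return {
        "FilterExpression": "#attr = :val",
        "ExpressionAttributeNames": {"#attr": attribute},
        "ExpressionAttributeValues": {":val": value},
    }


if __name__ == "__main__":
    # Expression syntax inside the value is just a string to compare against.
    hostile = "x OR attribute_exists(id)"
    print(build_filter_kwargs("category", hostile))
```

With this shape, a call such as `table.scan(**build_filter_kwargs(attr, val))` sends the hostile string as data to compare, not as expression syntax to evaluate.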
Another vector involves key-condition expressions in queries. A DynamoDB query must specify the partition key with an equality condition; however, if an application lets user input influence which key is queried (for instance, by using it to decide the partition key name), an attacker may force queries against unintended partitions or induce expensive scans that lead to denial of service. When request parameters also decide whether the code issues a query or falls back to a scan, misuse can expose a far broader dataset than intended.
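One way to limit the cost of a hostile request is to cap how much each call may read. The sketch below parses a user-supplied page size and clamps it; `page_limit` and `MAX_PAGE_SIZE` are hypothetical names, and the ceiling of 100 is an arbitrary assumption.

```python
MAX_PAGE_SIZE = 100  # assumed ceiling; tune to your workload


def page_limit(raw_limit, default=25):
    """Parse a user-supplied page size and clamp it to 1..MAX_PAGE_SIZE."""
    try:
        requested = int(raw_limit)
    except (TypeError, ValueError):
        # Missing or non-numeric input falls back to the default page size.
        return default
    return max(1, min(requested, MAX_PAGE_SIZE))
```

The result can be passed as the `Limit` parameter of `table.query` or `table.scan`, which bounds the number of items DynamoDB evaluates in a single request.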
The 12 security checks in middleBrick operate in parallel and include specific checks for Input Validation and Unsafe Consumption. These checks are designed to detect patterns where untrusted data reaches DynamoDB API calls without proper sanitization, normalization, or strict schema enforcement. The scanner examines the unauthenticated attack surface of your Flask endpoints, including how DynamoDB parameters are derived from request data, and flags findings with severity and remediation guidance.
Additionally, if your Flask application exposes an OpenAPI specification that references DynamoDB patterns, middleBrick’s OpenAPI/Swagger spec analysis resolves $ref definitions and cross-references spec definitions with runtime findings. This helps identify mismatches between documented behavior and actual implementation, which is especially useful when DynamoDB-specific request structures are involved.
DynamoDB-Specific Remediation in Flask: concrete code fixes
Remediation centers on strict input validation, using DynamoDB’s built-in safeguards, and avoiding the concatenation of user input into expression strings. Always prefer parameterized expressions with placeholder attribute names and values, and enforce allowlists on known-safe values.
1. Use KeyConditionExpression with strict key validation:
from flask import Flask, request
import boto3
from boto3.dynamodb.conditions import Key

app = Flask(__name__)
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Items')

@app.route('/items')
def get_items():
    user_id = request.args.get('user_id')
    # Validate the format expected for the partition key
    # (request.args.get returns None when the parameter is missing)
    if not isinstance(user_id, str) or not user_id.isalnum():
        return {'error': 'Invalid user_id'}, 400
    # The SDK binds user_id as a value; it never becomes expression text
    response = table.query(
        KeyConditionExpression=Key('user_id').eq(user_id)
    )
    return {'items': response.get('Items', [])}
This ensures the value is of the expected type and structure before it reaches DynamoDB, preventing injection through malformed keys.
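Note that `str.isalnum()` accepts any Unicode letter or digit and places no bound on length. If the partition key is known to be short ASCII, a stricter check may be preferable. The following is a sketch under that assumption; the pattern, the 64-character limit, and the name `is_valid_user_id` are illustrative, not taken from any particular schema.

```python
import re

# Assumed key format: 1 to 64 ASCII letters or digits. Adjust to your schema.
USER_ID_PATTERN = re.compile(r"[A-Za-z0-9]{1,64}")


def is_valid_user_id(candidate):
    """Return True only for a bounded-length ASCII alphanumeric string."""
    return (
        isinstance(candidate, str)
        and USER_ID_PATTERN.fullmatch(candidate) is not None
    )
```

Using `fullmatch` rather than `match` means the entire string must fit the pattern, so trailing newlines or extra characters are rejected.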
2. Use ExpressionAttributeNames and ExpressionAttributeValues for dynamic attributes:
from flask import Flask, request
import boto3
from boto3.dynamodb.conditions import Attr

app = Flask(__name__)
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Products')

# Unsafe: passing user input directly as the FilterExpression string
# FilterExpression = request.args.get('filter')  # DO NOT DO THIS

# Safe: build the condition with Attr so names and values are parameterized
def list_electronics():
    attribute_name = 'category'
    attribute_value = 'electronics'
    response = table.scan(
        FilterExpression=Attr(attribute_name).eq(attribute_value)
    )
    return response.get('Items', [])

# For fully dynamic attributes, validate against an allowlist
allowed_attributes = {'category', 'price', 'in_stock'}

# The route path and function name are illustrative
@app.route('/products')
def search_products():
    user_attr = request.args.get('attr')
    user_val = request.args.get('val')
    if user_attr not in allowed_attributes:
        return {'error': 'Invalid attribute'}, 400
    # user_val arrives as a string; it is bound as a value, never parsed
    response = table.scan(
        FilterExpression=Attr(user_attr).eq(user_val)
    )
    return {'items': response.get('Items', [])}
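The Attr condition builder generates the ExpressionAttributeNames and ExpressionAttributeValues placeholders on your behalf, so user input never becomes part of the expression text. By using Attr and providing attribute names as literals (or after strict allowlist checks), you prevent attackers from injecting logical operators or malformed expressions.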
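Query-string values always arrive as strings, so comparing them against Number or Boolean attributes will not match unless they are converted first. The sketch below pairs each allowlisted attribute with a parser; the attribute types, `parse_bool`, `ATTRIBUTE_PARSERS`, and `coerce_filter` are assumptions for illustration. It uses `Decimal` because boto3 represents DynamoDB numbers as `Decimal` rather than `float`.

```python
from decimal import Decimal, InvalidOperation


def parse_bool(text):
    """Accept only the literal strings 'true' and 'false'."""
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError("expected 'true' or 'false'")


# Assumed attribute types for a hypothetical Products table.
ATTRIBUTE_PARSERS = {
    "category": str,
    "price": Decimal,
    "in_stock": parse_bool,
}


def coerce_filter(attr, raw_value):
    """Return (attr, typed_value), or raise ValueError for bad input."""
    parser = ATTRIBUTE_PARSERS.get(attr)
    if parser is None or raw_value is None:
        raise ValueError("unsupported attribute or missing value")
    try:
        value = parser(raw_value)
    except InvalidOperation as exc:
        # Decimal signals unparsable text with InvalidOperation, not ValueError.
        raise ValueError("value is not a valid number") from exc
    if isinstance(value, Decimal) and not value.is_finite():
        # Reject NaN and Infinity, which are not valid DynamoDB numbers.
        raise ValueError("value must be a finite number")
    return attr, value
```

A route could call `coerce_filter(user_attr, user_val)` inside a `try` block, return a 400 response on `ValueError`, and pass the typed value to `Attr(attr).eq(value)`.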
3. Avoid dynamic key condition construction; if necessary, map user input to known keys:
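# Assumes app, request, table, and Key are set up as in the first example;
# the route path and function name are illustrative.
allowed_keys = {'user_id', 'product_id'}

@app.route('/lookup')
def lookup():
    requested_key = request.args.get('key_type')
    id_value = request.args.get('id_value')
    if requested_key not in allowed_keys:
        return {'error': 'Unsupported key type'}, 400
    if not id_value:
        return {'error': 'Missing id_value'}, 400
    if requested_key == 'user_id':
        response = table.query(
            KeyConditionExpression=Key('user_id').eq(id_value)
        )
    else:
        response = table.query(
            KeyConditionExpression=Key('product_id').eq(id_value)
        )
    return {'items': response.get('Items', [])}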
This pattern ensures that the structure of the key condition remains predictable and prevents injection via the key name.
middleBrick’s LLM/AI Security checks are unique in detecting prompt injection attempts, system prompt leakage, and output risks. While these checks do not directly analyze DynamoDB code, they help secure any AI-assisted development workflows that might generate or modify such API integrations. The CLI tool (middlebrick scan <url>) can be used to validate endpoint behavior, and the GitHub Action can enforce score thresholds in CI/CD pipelines to prevent insecure changes from reaching production.