Format String in Flask with DynamoDB
Format String in Flask with DynamoDB: how this specific combination creates or exposes the vulnerability
A format string vulnerability occurs when user-controlled input is used as the template of a string formatting operation rather than as one of its arguments. In a Flask application using Amazon DynamoDB, this typically arises when log messages, error responses, or API request parameters are built with Python’s % operator or .format() and the template itself contains data taken from the request or from a DynamoDB response. F-strings are affected only indirectly: the f-string literal is fixed in the source code, but its result becomes dangerous if it is later reused as a template.
Consider a Flask route that fetches a user record from DynamoDB and includes a user-supplied locale parameter in a status message:
import boto3
from flask import Flask, request

app = Flask(__name__)
ddb = boto3.client('dynamodb', region_name='us-east-1')

@app.route('/user')
def get_user():
    user_id = request.args.get('id')
    locale = request.args.get('locale', 'en')
    # Fetch user from DynamoDB
    resp = ddb.get_item(
        TableName='users',
        Key={'user_id': {'S': user_id}}
    )
    user = resp.get('Item', {})
    username = user.get('username', {'S': 'unknown'}).get('S', 'unknown')
    # Risky: request and database values flow into a formatted message.
    # This line is safe only because the template is a literal; if `message`
    # is later reused as a template, any specifiers in `locale` are interpreted.
    message = 'Welcome %s (lang: %s)' % (username, locale)
    return {'message': message}
If an attacker controls the locale parameter and the application later uses the resulting message as a format template (for example, a log line or API response that is interpolated a second time), they can inject format specifiers such as %s, %x, or {0.__class__}. In Python this leads to unhandled exceptions (a TypeError for missing arguments, a ValueError for unsupported conversions such as %n) or, with str.format, to disclosure of object attributes. The memory writes and code execution associated with %n in C do not apply to Python's formatting. The DynamoDB item itself may also contain user-controlled fields, such as username, that expand the attack surface if they reach a template.
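To make the Python behaviour concrete, here is a minimal sketch contrasting a user-controlled template with user data passed as an argument. The `Config` class, its `SECRET` value, and the function names are hypothetical and exist only for illustration; they are not part of the application above.

```python
class Config:
    """Hypothetical object holding a value that should never reach a response."""
    SECRET = "s3cr3t-api-key"

    def __init__(self, name):
        self.name = name


def render_unsafe_percent(user_template, username):
    # Anti-pattern: attacker-controlled text is used as the % template.
    return user_template % (username,)


def render_unsafe_format(user_template, cfg):
    # Anti-pattern: attacker-controlled text is used as the str.format template.
    return user_template.format(cfg)


def render_safe(username, locale):
    # The template is a literal; request data is only ever an argument.
    return "Welcome %s (lang: %s)" % (username, locale)


cfg = Config("alice")

# More specifiers than arguments makes the % operator raise TypeError.
try:
    render_unsafe_percent("%s %s", "alice")
except TypeError as exc:
    print("crash:", exc)

# Python has no %n conversion; it raises ValueError instead of writing memory.
try:
    render_unsafe_percent("%n", "alice")
except ValueError as exc:
    print("crash:", exc)

# str.format lets the template read attributes of its arguments.
print(render_unsafe_format("{0.SECRET}", cfg))

# Specifiers inside an argument are copied verbatim, not interpreted.
print(render_safe("alice", "%s%x"))
```

In a Flask route, the two exceptions would typically surface as HTTP 500 responses unless handled, while the `str.format` case silently returns data the caller should not see.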
Another scenario involves constructing DynamoDB table names, attribute names, or filter expressions using string formatting with unchecked input. For example:
@app.route('/query')
def unsafe_query():
    table = request.args.get('table')
    # Vulnerable: an unvalidated request value chooses the table name
    response = ddb.scan(TableName='%s_users' % table)
    return {'items': response.get('Items', [])}
Here, an attacker chooses which table is scanned: a table value of admin would read admin_users if such a table exists. DynamoDB does not execute SQL, so a payload such as '; DROP TABLE users; -- has no effect on the database itself, but the same unvalidated value can still cause harm if it reaches log messages or error output. Embedded newlines or format specifiers can corrupt log data, bypass log parsers, or trigger downstream vulnerabilities in instrumentation and monitoring tools that consume these logs.
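One way to limit the logging risk described here is to neutralize line breaks in untrusted values and to pass them to the logger as arguments instead of building the message by hand. The sketch below is an illustration under assumptions: `neutralize_for_log`, `log_table_request`, the `max_len` limit, and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("query")


def neutralize_for_log(value, max_len=200):
    """Hypothetical helper: make untrusted text safe to embed in one log line."""
    # Truncate first so an oversized parameter cannot flood the log.
    text = str(value)[:max_len]
    # Escape CR/LF so the value cannot start a forged log entry.
    return text.replace("\r", "\\r").replace("\n", "\\n")


def log_table_request(table):
    # The template is a literal; the logging module applies % formatting
    # itself and treats the value purely as an argument.
    logger.info("scan requested for table prefix %s", neutralize_for_log(table))
```

Because the value is an argument, specifiers such as `%s` inside it are written to the log as plain text.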
DynamoDB-Specific Remediation in Flask: concrete code fixes
To mitigate format string issues in a Flask + DynamoDB application, never let request data or DynamoDB attribute values become the format template for logging, response messages, or DynamoDB parameters. Keep templates as literals, pass dynamic values as arguments, and use explicit parameterization and structured data handling.
1. Safe message formatting without user-controlled format specifiers
Use simple concatenation or JSON-safe construction for user-facing messages:
import json
from flask import Flask, request
import boto3

app = Flask(__name__)
ddb = boto3.client('dynamodb', region_name='us-east-1')

@app.route('/user-safe')
def get_user_safe():
    user_id = request.args.get('id')
    locale = request.args.get('locale', 'en')
    resp = ddb.get_item(
        TableName='users',
        Key={'user_id': {'S': user_id}}
    )
    user = resp.get('Item', {})
    username = user.get('username', {'S': 'unknown'}).get('S', 'unknown')
    # Safe: concatenation never interprets format specifiers in the values
    message = 'Welcome ' + username + ' (lang: ' + locale + ')'
    # Alternatively, use json.dumps for structured logging
    app.logger.info(json.dumps({'event': 'welcome', 'username': username, 'locale': locale}))
    return {'message': message}
2. Parameterized DynamoDB calls without string interpolation
Never build table names or key values from unvalidated input. Use static or whitelisted values:
ALLOWED_TABLES = {'profiles', 'settings'}

@app.route('/query-safe')
def query_safe():
    table = request.args.get('table')
    if table not in ALLOWED_TABLES:
        return {'error': 'invalid table'}, 400
    # Safe: table name validated against a whitelist
    response = ddb.scan(TableName=table + '_users')
    return {'items': response.get('Items', [])}
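A variant of the same idea avoids building the table name at all by mapping each accepted request value to a fixed, fully spelled-out table name. This is a sketch only: `TABLE_BY_KEY` and `resolve_table` are hypothetical names, and the table names simply mirror the `_users` suffix used in the example above.

```python
# Hypothetical mapping from public request values to real table names.
TABLE_BY_KEY = {
    "profiles": "profiles_users",
    "settings": "settings_users",
}


def resolve_table(requested):
    """Return the table name for a request value, or None if it is not allowed."""
    if not isinstance(requested, str):
        return None
    # Request data is used only as a lookup key and never becomes part of the name.
    return TABLE_BY_KEY.get(requested)
```

A route could call `resolve_table(request.args.get('table'))` and return a 400 response when the result is `None`, so the string passed to `TableName` always comes from application code.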
3. Structured logging and output encoding
When generating logs or API responses, keep data separate from control logic and encode output based on context (e.g., HTML, JSON):
import logging

logger = logging.getLogger(__name__)

@app.route('/item')
def get_item():
    item_id = request.args.get('item_id')
    resp = ddb.get_item(
        TableName='items',
        Key={'id': {'S': item_id}}
    )
    item = resp.get('Item', {})
    # Structured log entry; no interpolation of raw item fields into a format string
    logger.info('item_fetch', extra={'item_id': item_id, 'has_item': bool(item)})
    return {'item': item}
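Fields passed through `extra` are attached to the log record but are not printed by the default formatter, so they need a formatter that emits them. The following is a minimal sketch of one possible approach using only the standard library; `JsonFormatter`, `build_logger`, and the output field names are hypothetical choices, not part of the application above.

```python
import json
import logging

# Attribute names present on every LogRecord; anything else came from `extra`.
_STANDARD_ATTRS = set(
    logging.LogRecord("", 0, "", 0, "", (), None).__dict__
) | {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: emit each record as one JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Copy fields supplied through `extra`; json.dumps escapes quotes
        # and newlines, so values cannot break out of their field.
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        return json.dumps(payload, default=str)


def build_logger(name="item_service"):
    # Attach the JSON formatter to a stream handler on a named logger.
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

With this in place, a call such as `logger.info('item_fetch', extra={'item_id': item_id})` produces a single JSON line in which `item_id` is an escaped string value, whatever characters it contains.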
These practices reduce the risk that data from DynamoDB or client input is misused in format string operations, helping to prevent information leakage and instability in the Flask application.