
Zip Slip in Flask with DynamoDB

Zip Slip in Flask with DynamoDB: how this specific combination creates or exposes the vulnerability

Zip Slip is a path traversal vulnerability that occurs when an application builds extraction paths by joining archive entry names, which the archive's author controls, directly onto a base extraction directory. In a Flask application that interacts with Amazon DynamoDB, the risk comes not from DynamoDB itself but from unsafe file handling before or after data is stored or retrieved. If a Flask endpoint accepts file uploads or archive downloads, uses values from client requests to build file paths, and then extracts archives without validating path components, an attacker can craft entries like ../../../etc/passwd to escape the intended directory.

With DynamoDB in the picture, the exposure surface is the way application code maps user input to primary keys or attribute values that later appear in archive names or storage references. For example, if a developer uses a user-controlled value as part of a filename that is later archived and extracted, or includes a DynamoDB primary key in an archive path without validation, Zip Slip can lead to arbitrary file writes or reads outside the intended directory. Because a middleBrick scan tests the unauthenticated attack surface, including input validation and file handling, it can flag unsafe patterns where DynamoDB identifiers or request parameters influence local filesystem paths in Flask routes.
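To make the escape concrete, the following minimal sketch shows how an unchecked join resolves a traversal entry outside the base directory, and how a containment check catches it. The directory `/srv/app/extracted` and the helper names `naive_target` and `escapes_base` are hypothetical and chosen only for illustration; the printed values assume a POSIX filesystem.

```python
import os

BASE_DIR = "/srv/app/extracted"  # hypothetical extraction directory


def naive_target(base, entry_name):
    # Vulnerable pattern: the archive entry name is joined to the base unchecked
    return os.path.normpath(os.path.join(base, entry_name))


def escapes_base(base, entry_name):
    # True when the resolved target is not located under the base directory
    base_abs = os.path.abspath(base)
    target = os.path.abspath(os.path.join(base_abs, entry_name))
    return os.path.commonpath([base_abs, target]) != base_abs


print(naive_target(BASE_DIR, "../../../etc/passwd"))  # on POSIX: /etc/passwd
print(escapes_base(BASE_DIR, "../../../etc/passwd"))  # True
print(escapes_base(BASE_DIR, "reports/q1.csv"))       # False
```

The same check applies whether the entry name comes from an uploaded archive, a request parameter, or an attribute read back from DynamoDB.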

DynamoDB-Specific Remediation in Flask: concrete code fixes

To mitigate Zip Slip in Flask when working with DynamoDB, ensure that any user input used to construct file paths, archive entries, or keys is sanitized and confined to a safe directory. Use strict path normalization and reject entries that attempt directory traversal. The code below shows Flask routes that store, retrieve, and extract files while referencing DynamoDB, demonstrating safe practices.

import os
from flask import Flask, request, jsonify, send_file
import boto3
from werkzeug.utils import secure_filename

app = Flask(__name__)

# Initialize DynamoDB client
client = boto3.client('dynamodb', region_name='us-east-1')
TABLE_NAME = os.getenv('TABLE_NAME', 'secure_files')

# Ensure the target base directory exists
UPLOAD_BASE = '/safe/extract/dir'
os.makedirs(UPLOAD_BASE, exist_ok=True)

def is_within_directory(directory, target):
    # Normalize and check that target remains inside directory
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory

@app.route('/file', methods=['POST'])
def handle_file():
    # Validate and sanitize inputs
    file = request.files.get('file')
    if not file:
        return jsonify({'error': 'missing file'}), 400

    # Use secure_filename; add an extension allowlist here if the application needs one
    filename = secure_filename(file.filename)
    if not filename:
        return jsonify({'error': 'invalid filename'}), 400

    # Build target path safely; do not use user input directly in paths
    target_path = os.path.join(UPLOAD_BASE, filename)
    if not is_within_directory(UPLOAD_BASE, target_path):
        return jsonify({'error': 'invalid path'}), 400

    # Save file to safe location
    file.save(target_path)

    # Store metadata in DynamoDB using a sanitized key, not raw user input
    item = {
        'file_id': {'S': filename},  # or use a UUID
        'original_name': {'S': filename},
        'size': {'N': str(os.path.getsize(target_path))}
    }
    client.put_item(TableName=TABLE_NAME, Item=item)

    return jsonify({'status': 'stored', 'file_id': filename}), 201

@app.route('/file/<file_id>', methods=['GET'])
def get_file(file_id):
    # Retrieve metadata from DynamoDB
    resp = client.get_item(TableName=TABLE_NAME, Key={'file_id': {'S': file_id}})
    item = resp.get('Item')
    if not item:
        return jsonify({'error': 'not found'}), 404

    safe_name = item['original_name']['S']
    # Ensure safe_name is still valid before joining
    safe_name = secure_filename(safe_name)
    target_path = os.path.join(UPLOAD_BASE, safe_name)
    if not is_within_directory(UPLOAD_BASE, target_path):
        return jsonify({'error': 'invalid reference'}), 400

    if not os.path.exists(target_path):
        return jsonify({'error': 'file missing'}), 404

    return send_file(target_path, as_attachment=True, download_name=safe_name)

@app.route('/extract', methods=['POST'])
def extract_archive():
    archive = request.files.get('archive')
    if not archive:
        return jsonify({'error': 'missing archive'}), 400

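    # secure_filename can return an empty string; reject that before saving
    archive_name = secure_filename(archive.filename or '')
    if not archive_name:
        return jsonify({'error': 'invalid archive name'}), 400
    tmp_path = os.path.join(UPLOAD_BASE, archive_name)
    archive.save(tmp_path)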

    import zipfile
    try:
        with zipfile.ZipFile(tmp_path, 'r') as zf:
            members = zf.infolist()
            # Validate every entry before extracting anything, so an unsafe
            # entry cannot leave a partially extracted archive behind
            for member in members:
                # Reject paths that attempt traversal regardless of platform
                member_path = os.path.normpath(member.filename)
                if member_path.startswith('..') or not is_within_directory(UPLOAD_BASE, os.path.join(UPLOAD_BASE, member_path)):
                    return jsonify({'error': 'unsafe archive entry'}), 400
            for member in members:
                zf.extract(member, path=UPLOAD_BASE)
    except Exception:
        return jsonify({'error': 'extraction failed'}), 400
    finally:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)

    return jsonify({'status': 'extracted'}), 200

Key remediation points specific to the Flask + DynamoDB context:

  • Do not use raw DynamoDB attribute values (such as primary keys or filenames stored in the table) to construct filesystem paths without normalization and allowlist validation.
  • Apply secure_filename or a strict allowlist for filenames derived from user input or DynamoDB attributes before path joins.
  • Always verify extracted archive members with a path traversal check (e.g., is_within_directory) rather than relying on the archive library’s default behavior.
  • Keep user-controlled data separate from filesystem decisions; use opaque identifiers (e.g., UUIDs) as keys in DynamoDB, and map them to safe filenames server-side.
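The archive-member check in the points above can also be packaged as a standalone helper. The sketch below is one possible shape, not a definitive implementation: `safe_extract` is a hypothetical name, and it assumes that resolving symlinks with `os.path.realpath` and rejecting the whole archive on the first unsafe entry is the desired policy.

```python
import os
import zipfile


def safe_extract(zip_source, dest_dir):
    """Extract an archive into dest_dir only if every member stays inside it.

    zip_source may be a filesystem path or a seekable binary file object.
    """
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(zip_source) as zf:
        members = zf.infolist()
        # Pass 1: validate every entry before anything is written to disk
        for member in members:
            target = os.path.realpath(os.path.join(dest_root, member.filename))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe archive entry: {member.filename!r}")
        # Pass 2: extract only after the whole archive has passed validation
        for member in members:
            zf.extract(member, path=dest_root)
    return [m.filename for m in members]
```

A Flask route could call this helper inside a try block and translate `ValueError` into a 400 response. Validating in a separate pass means a rejected archive leaves no files behind.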

Frequently Asked Questions

How does middleBrick detect Zip Slip risks in Flask APIs that use DynamoDB?
middleBrick performs black-box scans without credentials, sending crafted requests that include path traversal sequences in parameters and archive entries. It analyzes input validation controls and how application logic maps DynamoDB identifiers or request data into local filesystem paths, flagging unsafe patterns where user-influenced values can escape intended directories.
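For checking your own endpoint in a test suite, an archive containing a traversal entry can be built in memory. This is only an illustrative sketch and does not represent how middleBrick constructs its probes; the helper name `build_traversal_zip` and the entry name `../../zip_slip_probe.txt` are hypothetical.

```python
import io
import zipfile


def build_traversal_zip(entry_name="../../zip_slip_probe.txt", payload=b"probe"):
    # Build a small zip in memory: one benign entry plus one traversal entry
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("readme.txt", b"benign entry")
        zf.writestr(entry_name, payload)
    return buf.getvalue()


archive_bytes = build_traversal_zip()
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
    print(zf.namelist())
```

Posting these bytes as the `archive` form field to the `/extract` route with Flask's test client should produce a 400 response if the member validation is working, and no file should appear outside the extraction directory.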

Can DynamoDB primary keys be safely used in archive or filename generation?
Yes, if they are opaque identifiers (e.g., UUIDs) and not directly concatenated into paths. When storing or retrieving files in Flask, map DynamoDB keys to sanitized filenames server-side using allowlists and secure normalization, avoiding any direct use of user-supplied names or attributes that could contain traversal sequences.
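A minimal sketch of that mapping, assuming a server-generated UUID serves as both the DynamoDB key and the on-disk name, while the client-supplied name is kept only as display metadata. The helper names `new_file_record` and `path_for` are hypothetical; the item layout mirrors the attribute format used in the routes above.

```python
import os
import uuid

STORAGE_DIR = "/safe/extract/dir"  # same base directory as the example routes


def new_file_record(original_name, size):
    # The key is generated server-side, so it never contains client input
    file_id = uuid.uuid4().hex
    item = {
        "file_id": {"S": file_id},
        "original_name": {"S": original_name},  # display metadata only
        "size": {"N": str(size)},
    }
    return file_id, item


def path_for(file_id):
    # Allowlist: accept exactly 32 lowercase hex characters, nothing else
    if len(file_id) != 32 or any(c not in "0123456789abcdef" for c in file_id):
        raise ValueError("invalid file_id")
    return os.path.join(STORAGE_DIR, file_id)
```

With this layout the stored `original_name` is never joined into a filesystem path, so a traversal sequence in it has no effect on where the file is written or read.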