Handling Large Blobs — Cross-Cutting Pattern
The Problem
Large files — images, videos, documents — break every assumption built for small request/response cycles. Uploads fail mid-stream on flaky mobile connections. Processing a 2GB video into six renditions is not a synchronous API call. Serving a popular asset to millions of users is not a database read. Staff engineers recognize that "upload a file" is actually three separate systems with different failure modes, latency budgets, and scaling characteristics.
Playbooks That Use This Pattern
- Blob Storage & Media Pipeline — Upload, processing, storage lifecycle
- File Sync & Cloud Storage — Chunked sync, delta transfer
- Collaborative Editing — Document storage and versioning
- CDN & Edge Caching — Large asset delivery
- Chat & Messaging — Media message handling
The Core Tradeoff
| Strategy | What Works | What Breaks | Who Pays |
|---|---|---|---|
| Direct upload (through API) | Simple; one hop | Blocks API servers; timeout risk on large files; memory pressure | Your API fleet |
| Presigned URL upload | Blobs never touch your servers; scales with object storage | Client must handle signing flow; CORS configuration | Object storage provider |
| Chunked / resumable upload | Survives network interruptions; progress tracking | Client complexity; server-side reassembly and ordering | Client + storage layer |
| Streaming upload | Low memory; works for unbounded input | Harder to retry; backpressure management | Both sides |
Staff Default Position
"Upload reliability is a client-side problem." The API server's job is to issue a presigned URL and record metadata — it should never buffer a blob.
Staff default for any system handling files:
- Upload: Presigned URLs to object storage. Chunked/resumable for anything >10MB.
- Process: Async pipeline triggered by storage event (S3 notification, GCS Pub/Sub). Thumbnails, transcoding, virus scanning — none of this belongs in the request path.
- Store: Object storage (S3, GCS, Azure Blob) with lifecycle policies. Metadata in your database; bytes in the bucket.
- Serve: CDN with origin shielding. Cache warming for predictably popular content.
Upload, processing, and serving are three different systems with three different SLAs. Conflating them is how you get API servers OOM-killed by a 4GB video upload.
When to Deviate
- Small, trusted blobs (<1MB): Direct upload through API is fine. Avatars, config files — the complexity of presigned URLs may not be worth it.
- Regulatory / compliance constraints: Some industries require blobs to transit through your servers for inspection before reaching storage. Accept the cost; add streaming + size limits.
- Real-time collaborative editing: Binary blobs may need to be chunked at the application layer (not HTTP layer) for delta sync and conflict resolution.
- Edge-origin latency matters: For latency-sensitive workloads (live streaming ingest), upload directly to edge PoPs rather than a central region.
Common Interview Mistakes
| What Candidates Say | What Interviewers Hear | What Staff Engineers Say |
|---|---|---|
| "The client uploads to our API, then we store it in S3" | "They'll OOM their API servers under load" | "Presigned URL straight to S3. Our API only handles metadata." |
| "We'll process the image synchronously before returning 200" | "They've never handled a 500MB file" | "Upload triggers an async pipeline. Client polls or subscribes for completion." |
| "We'll store files in the database" | "They haven't thought about cost or throughput" | "Metadata in Postgres. Bytes in object storage. Always." |
| "We'll serve files directly from S3" | "They haven't considered latency or cost at scale" | "CDN in front with origin shielding. S3 is the origin, not the serving layer." |
| "Resumable upload adds too much complexity" | "They've never dealt with mobile networks" | "Anything over 10MB needs resumability. Upload reliability is a client-side problem." |
Implementation Deep Dive
1. Presigned URL Upload — Zero-Copy Ingestion
The Staff default for any file upload: the client uploads directly to object storage. Your API server never touches the bytes. This is "zero-copy ingestion" — the blob travels from the client to S3 in one hop.
Presigned URL Flow
# Step 1: Client requests upload URL from API
function requestUpload(userId, filename, contentType, fileSize):
    # Validate before generating URL
    if fileSize > MAX_FILE_SIZE:
        return { error: "file_too_large", maxSize: MAX_FILE_SIZE }
    if contentType not in ALLOWED_TYPES:
        return { error: "invalid_content_type" }

    # Generate unique storage key
    blobId = generateUUID()
    storageKey = f"uploads/{userId}/{blobId}/{filename}"

    # Create presigned PUT URL (valid for 15 minutes)
    presignedUrl = s3.generatePresignedUrl(
        method = "PUT",
        bucket = UPLOAD_BUCKET,
        key = storageKey,
        contentType = contentType,
        conditions = [
            # Enforce size at S3 level. Note: content-length-range is a
            # presigned POST policy condition; a plain presigned PUT cannot
            # cap size, so also re-check object size in the S3 event handler.
            ["content-length-range", 0, MAX_FILE_SIZE],
        ],
        expiresIn = 900  # 15 minutes
    )

    # Record pending upload in database
    db.execute("""
        INSERT INTO uploads (blob_id, user_id, storage_key, content_type, status, created_at)
        VALUES (?, ?, ?, ?, 'pending', now())
    """, blobId, userId, storageKey, contentType)

    return {
        "blobId": blobId,
        "uploadUrl": presignedUrl,
        "expiresAt": now() + 900
    }

# Step 2: Client uploads directly to S3 using the presigned URL
# (No server involvement — client → S3 directly)

# Step 3: S3 event notification triggers processing
function onS3Upload(event):
    storageKey = event.Records[0].s3.object.key
    upload = db.query("SELECT * FROM uploads WHERE storage_key = ?", storageKey)
    if not upload:
        # Orphaned upload — clean up
        s3.deleteObject(UPLOAD_BUCKET, storageKey)
        return

    db.execute("UPDATE uploads SET status = 'processing' WHERE blob_id = ?", upload.blobId)

    # Trigger async processing pipeline
    processingQueue.enqueue({
        "blobId": upload.blobId,
        "storageKey": storageKey,
        "contentType": upload.contentType,
        "userId": upload.userId
    })
2. Chunked Resumable Upload — The Mobile-First Protocol
For files over 10MB on mobile networks, a single PUT request is unreliable. Chunked resumable upload breaks the file into parts that can be uploaded independently and retried individually.
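The chunk arithmetic can be sketched as follows. This is an illustrative helper, not part of the protocol below; the 5 MiB constant matches S3's minimum multipart part size (every part except the last must be at least 5 MiB).

```python
from math import ceil

CHUNK_SIZE = 5 * 1024 * 1024  # 5 MiB — S3's minimum multipart part size

def plan_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return (partNumber, start, end) tuples; end is exclusive.

    Every part except the last is exactly chunk_size bytes, so each
    part can be uploaded and retried independently.
    """
    total_parts = ceil(file_size / chunk_size)
    return [
        (i + 1, i * chunk_size, min((i + 1) * chunk_size, file_size))
        for i in range(total_parts)
    ]

# A 12 MiB file splits into two full 5 MiB parts plus a 2 MiB tail.
parts = plan_chunks(12 * 1024 * 1024)
```

A failure at 95% of a 200MB upload then costs one re-uploaded part, not 200MB.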
Resumable Upload Protocol
# Client-side: chunk and upload
function uploadFile(file, uploadUrl):
    CHUNK_SIZE = 5 * 1024 * 1024  # 5MB per chunk (S3 minimum part size)
    totalChunks = ceil(file.size / CHUNK_SIZE)

    # Initiate multipart upload
    uploadId = api.post("/uploads/initiate", {
        filename: file.name,
        contentType: file.type,
        totalSize: file.size,
        totalChunks: totalChunks
    }).uploadId

    completedParts = []
    for i in range(totalChunks):
        chunk = file.slice(i * CHUNK_SIZE, (i + 1) * CHUNK_SIZE)
        partNumber = i + 1

        # Retry each chunk independently
        for attempt in range(MAX_RETRIES):
            try:
                response = api.put(
                    f"/uploads/{uploadId}/parts/{partNumber}",
                    body = chunk,
                    headers = { "Content-MD5": md5(chunk) }
                )
                completedParts.append({
                    "partNumber": partNumber,
                    "etag": response.etag
                })
                onProgress(partNumber / totalChunks)
                break
            except NetworkError:
                if attempt == MAX_RETRIES - 1:
                    raise UploadFailed(f"Part {partNumber} failed after {MAX_RETRIES} retries")
                sleep(exponentialBackoff(attempt))

    # Complete the upload
    api.post(f"/uploads/{uploadId}/complete", { parts: completedParts })

# Server-side: manage multipart upload via S3
function initiate(request):
    storageKey = buildStorageKey(request.userId, request.filename)
    response = s3.createMultipartUpload(
        bucket = UPLOAD_BUCKET,
        key = storageKey,
        contentType = request.contentType
    )
    db.execute("""
        INSERT INTO multipart_uploads (s3_upload_id, storage_key, status)
        VALUES (?, ?, 'active')
    """, response.uploadId, storageKey)
    return { uploadId: response.uploadId }

function uploadPart(uploadId, partNumber, body, contentMD5):
    # Look up the key recorded at initiate
    storageKey = db.queryValue(
        "SELECT storage_key FROM multipart_uploads WHERE s3_upload_id = ?", uploadId)
    response = s3.uploadPart(
        bucket = UPLOAD_BUCKET,
        key = storageKey,
        uploadId = uploadId,
        partNumber = partNumber,
        body = body,
        contentMD5 = contentMD5  # S3 validates integrity
    )
    return { etag: response.etag }

function complete(uploadId, parts):
    storageKey = db.queryValue(
        "SELECT storage_key FROM multipart_uploads WHERE s3_upload_id = ?", uploadId)
    s3.completeMultipartUpload(
        bucket = UPLOAD_BUCKET,
        key = storageKey,
        uploadId = uploadId,
        parts = parts
    )
    db.execute("UPDATE multipart_uploads SET status = 'completed' WHERE s3_upload_id = ?", uploadId)
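The client loop above calls an exponentialBackoff(attempt) helper it leaves undefined. One reasonable way to implement it is "full jitter": sleep a uniform random amount up to an exponentially growing cap, so thousands of mobile clients that lose connectivity at the same moment do not all retry in lockstep. A sketch, with illustrative constants:

```python
import random

BASE_DELAY = 0.5   # seconds; first retry waits at most this long
MAX_DELAY = 30.0   # cap so a high attempt count never means a multi-minute sleep

def exponential_backoff(attempt):
    """Full-jitter backoff: uniform in [0, min(MAX_DELAY, BASE * 2^attempt)]."""
    cap = min(MAX_DELAY, BASE_DELAY * (2 ** attempt))
    return random.uniform(0, cap)
```

Jitter matters more than the exact base: synchronized retries after a shared network blip are themselves a thundering herd.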
3. Async Processing Pipeline — Event-Driven Transformation
Blobs are uploaded, then processed: thumbnail generation, video transcoding, virus scanning, content moderation. None of this belongs in the request path.
Processing Pipeline
# Processing queue consumer
function processBlob(job):
    blob = s3.getObject(UPLOAD_BUCKET, job.storageKey)

    if job.contentType.startswith("image/"):
        results = imageProcessor.process(blob, [
            { "name": "thumbnail", "width": 200, "height": 200, "format": "webp" },
            { "name": "medium", "width": 800, "height": 600, "format": "webp" },
            { "name": "original", "width": null, "height": null, "format": "webp" },
        ])
    elif job.contentType.startswith("video/"):
        results = videoTranscoder.transcode(blob, [
            { "name": "720p", "width": 1280, "height": 720, "codec": "h264", "bitrate": "2500k" },
            { "name": "480p", "width": 854, "height": 480, "codec": "h264", "bitrate": "1000k" },
            { "name": "thumb", "frame": "00:00:01", "format": "webp" },
        ])

    # Store processed variants
    for variant in results:
        variantKey = f"processed/{job.blobId}/{variant.name}.{variant.format}"
        s3.putObject(PROCESSED_BUCKET, variantKey, variant.data)

    # Update database with variant URLs
    db.execute("""
        UPDATE uploads SET status = 'ready', variants = ?, processed_at = now()
        WHERE blob_id = ?
    """, serialize([v.url for v in results]), job.blobId)

    # Notify client
    notificationService.send(job.userId, {
        "type": "upload_complete",
        "blobId": job.blobId,
        "variants": [v.url for v in results]
    })
4. CDN Serving with Origin Shielding
Serving popular assets directly from S3 is expensive ($0.09/GB) and slow (single-region origin). A CDN with origin shielding reduces both cost and latency.
CDN Configuration
# CloudFront distribution config
Distribution:
  Origins:
    - Id: processed-bucket
      DomainName: processed-bucket.s3.amazonaws.com
      S3OriginConfig:
        OriginAccessIdentity: OAI-12345  # S3 only accessible via CDN
  CacheBehaviors:
    - PathPattern: "/media/*"
      ViewerProtocolPolicy: redirect-to-https
      CachePolicyId: "MediaCachePolicy"  # TTL: 30 days
      OriginRequestPolicyId: "S3Origin"
      Compress: true  # Gzip/Brotli for text-based formats

# Origin shielding: one regional cache between CDN PoPs and S3
OriginShield:
  Enabled: true
  OriginShieldRegion: us-east-1  # Closest to S3 bucket
Why origin shielding: Without it, a cache miss at any of 400+ CloudFront PoPs hits S3 directly; a viral image missing at 400 PoPs causes 400 S3 GetObject requests. With origin shielding, a single regional cache sits between all PoPs and S3, and misses from any PoP are served by the shield, so each asset costs at most one origin fetch no matter how many PoPs request it simultaneously.
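The request collapsing a shield performs can be modeled in a few lines: concurrent cache misses for the same key wait on a single in-flight origin fetch instead of each hitting the origin. This is a toy sketch of the idea, not CloudFront's implementation; all names are illustrative.

```python
import threading

class OriginShield:
    """Per-key single-flight cache: N concurrent misses -> 1 origin fetch."""
    def __init__(self, origin_fetch):
        self._origin_fetch = origin_fetch
        self._cache = {}
        self._locks = {}
        self._guard = threading.Lock()

    def get(self, key):
        with self._guard:
            if key in self._cache:
                return self._cache[key]
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:  # only one thread per key reaches the origin
            if key not in self._cache:
                self._cache[key] = self._origin_fetch(key)
            return self._cache[key]
```

With 32 PoPs simultaneously missing on the same viral image, the origin sees exactly one fetch; the other 31 requests block briefly on the per-key lock and are served from the shield's cache.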
Architecture Diagram
Upload path (steps 1-3): Client gets a presigned URL from the API, then uploads directly to S3. The API server never touches the blob — zero memory pressure.
Processing path (steps 4-5): S3 event triggers processing workers. Workers read raw blob, generate variants, store in processed bucket. Fully async and retryable.
Delivery path (step 6): Users fetch variants from CDN. Origin shield protects S3 from stampede. 30-day TTL means popular content is served from edge with zero origin hits.
Failure Scenarios
1. Orphaned Multipart Uploads — Silent Storage Cost Growth
Timeline: Clients start multipart uploads for large files. Some uploads are abandoned mid-stream (app closed, network lost, user cancels). The incomplete multipart upload parts remain in S3 indefinitely. Over 6 months, abandoned parts accumulate to 15TB of storage — invisible in standard S3 metrics.
Blast radius: $345/month in wasted storage (at $0.023/GB), growing linearly. No user impact, but the cost is hidden because S3 metrics show bucket size including incomplete multipart parts.
Detection: Use aws s3api list-multipart-uploads --bucket UPLOAD_BUCKET to enumerate incomplete uploads. Monitor for uploads older than 24 hours.
Recovery:
- Immediate: run AbortMultipartUpload for all uploads older than 24 hours
- Permanent: add an S3 lifecycle rule to automatically abort incomplete multipart uploads after 7 days
# S3 lifecycle rule — auto-cleanup abandoned uploads
{
  "Rules": [{
    "ID": "AbortIncompleteMultipartUploads",
    "Status": "Enabled",
    "AbortIncompleteMultipartUpload": {
      "DaysAfterInitiation": 7
    }
  }]
}
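The immediate cleanup is a filter over the list-multipart-uploads output. A sketch, assuming the listing has already been parsed into dicts whose field names match the real S3 response ("Key", "UploadId", "Initiated"); the 24-hour threshold is the one used above.

```python
from datetime import datetime, timedelta, timezone

def stale_uploads(uploads, now=None, max_age=timedelta(hours=24)):
    """Return the incomplete multipart uploads old enough to abort."""
    now = now or datetime.now(timezone.utc)
    return [u for u in uploads if now - u["Initiated"] > max_age]
```

Each returned entry's (Key, UploadId) pair is what AbortMultipartUpload needs; the lifecycle rule then keeps the problem from recurring.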
2. Processing Pipeline Poison Message — Transcoding Crash Loop
Timeline: A user uploads a corrupted 4GB video file. The video transcoder crashes on the file (invalid codec header). The processing queue redelivers the message; the transcoder crashes again. Because no dead-letter policy was configured, the message keeps returning to the queue after every failure, and the crash-retry loop consumes 100% of transcoding capacity.
Blast radius: All video processing is blocked. New uploads queue behind the poison message. The processing backlog grows by ~100 videos/hour. Users see "processing" status for hours.
Detection: Processing queue age metric spikes. Transcoder restart count increases. Processing throughput drops to zero.
Recovery:
- Immediate: identify the poison message and manually move it to a dead-letter queue
- Short-term: configure dead-letter queue policy — after 3 failed attempts, route to DLQ for manual investigation
- Long-term: add pre-processing validation — check file header, codec, and estimated duration before sending to the transcoder. Reject obviously corrupt files at upload time with a user-friendly error
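The dead-letter policy in the short-term fix amounts to a per-message attempt counter with a quarantine path. A minimal in-memory sketch (queue and handler names are illustrative; a real broker such as SQS tracks the receive count for you):

```python
from collections import deque

MAX_ATTEMPTS = 3  # after this many failed deliveries, quarantine the message

def consume(queue, dlq, handler):
    """Drain queue; poison messages land in dlq instead of looping forever."""
    while queue:
        msg = queue.popleft()
        try:
            handler(msg["body"])
        except Exception:
            msg["attempts"] = msg.get("attempts", 0) + 1
            if msg["attempts"] >= MAX_ATTEMPTS:
                dlq.append(msg)    # quarantine for manual investigation
            else:
                queue.append(msg)  # redeliver and try again
```

The key property: one corrupt file costs MAX_ATTEMPTS transcode attempts, then gets out of the way of every healthy message behind it.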
3. CDN Cache Invalidation Delay — Stale Content Served After Deletion
Timeline: A user deletes a profile photo (privacy request). The API deletes the object from S3 and issues a CDN cache invalidation. CloudFront invalidation takes 5-10 minutes to propagate to all 400+ PoPs. During the propagation window, the deleted photo is still served from CDN edge caches. The user reports a privacy violation.
Blast radius: One user's deleted content is accessible for up to 10 minutes after deletion. For GDPR-relevant content, this may constitute a compliance violation.
Detection: CDN invalidation propagation monitoring. User complaint. Compliance audit log comparing deletion timestamps with CDN cache hit timestamps.
Recovery:
- Immediate: CDN invalidation is already in flight — wait for propagation
- Architecture change: serve private/deletable content through a signed URL with short TTL (15 minutes) instead of a long-lived cache key. When the object is deleted from S3, the signed URL expires naturally
- Alternatively: use a versioned URL scheme (/media/v3/photo.jpg). On update or delete, the old version URL is never reused, so stale cache entries point to a non-existent object and return 404
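The short-TTL signed URL works because the link carries its own expiry, authenticated by an HMAC over the path and timestamp. This sketch illustrates the mechanism only; it is not S3's actual SigV4 scheme, and the secret, URL layout, and 15-minute TTL are assumptions from the recovery step above.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET = b"rotate-me-regularly"  # illustrative signing key

def sign_url(path, ttl_seconds=900, now=None):
    """Return path?expires=...&sig=..., valid for ttl_seconds."""
    expires = int((now if now is not None else time.time()) + ttl_seconds)
    sig = hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires, 'sig': sig})}"

def verify(path, expires, sig, now=None):
    """Reject expired or tampered links in constant time."""
    if (now if now is not None else time.time()) > expires:
        return False
    expected = hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)
```

Once the object is deleted, nothing needs invalidating: any copy of the link dies on its own within the TTL.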
Staff Interview Application
How to Introduce This Pattern
Name the three systems explicitly. This tells the interviewer you understand that "upload a file" is not one problem — it is three.
When NOT to Use This Pattern
- Small files (<1MB): Direct upload through the API is fine for avatars, config files, and thumbnails. Presigned URLs add client complexity that isn't justified for small payloads.
- Server-side validation required before storage: If you must scan the file (virus, content moderation) before it touches object storage, a presigned URL flow doesn't work — the blob must pass through your server. Use streaming upload with size limits.
- Private, ephemeral files: Temporary files that expire in minutes (chat image previews, one-time links) don't need CDN caching or lifecycle management. Serve directly from S3 with a short-lived signed URL.
Follow-Up Questions to Anticipate
| Interviewer Asks | What They Are Testing | How to Respond |
|---|---|---|
| "How do you handle upload failure at 95%?" | Reliability engineering | "Chunked resumable upload. The client retries only the failed chunk, not the entire file. For a 200MB upload at 95%, that is retrying 10MB instead of 200MB." |
| "How do you prevent malicious uploads?" | Security thinking | "Three layers: presigned URL conditions enforce size and content-type at S3. Processing pipeline runs virus scanning. Content moderation flags inappropriate images before they are served." |
| "What about storage cost at scale?" | Cost awareness | "Lifecycle policies: Standard for 30 days, then Infrequent Access, then Glacier for archival. Deduplication by content hash saves 20-40% on user-generated content. Chargeback to teams that store the most." |
| "How do you handle concurrent uploads to the same file?" | Concurrency reasoning | "Each upload gets a unique blob ID — no conflicts. If the business needs 'replace existing file' semantics, it is a new upload with a database pointer update, not an S3 overwrite." |
| "Why not store files in the database?" | Architecture fundamentals | "Databases are optimized for structured queries, not blob storage. A 1GB file in PostgreSQL bloats WAL, backup, and replication. Object storage is purpose-built: $0.023/GB/month, 11 nines durability, unlimited scale." |
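The dedup-by-content-hash answer in the cost row can be made concrete. A toy sketch, assuming blobs are keyed by the SHA-256 of their bytes so identical uploads (the same meme posted ten thousand times) cost storage once; the class and its fields are illustrative stand-ins for a bucket plus a reference-count table.

```python
import hashlib

class DedupStore:
    """Content-addressed blob store: identical bytes are stored once."""
    def __init__(self):
        self._blobs = {}  # content hash -> bytes
        self._refs = {}   # content hash -> reference count

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._blobs:
            self._blobs[digest] = data  # only new content costs storage
        self._refs[digest] = self._refs.get(digest, 0) + 1
        return digest  # callers store this hash as their blob pointer

    def stored_bytes(self):
        return sum(len(b) for b in self._blobs.values())
```

Reference counting matters for deletion: a blob is only eligible for removal when its last referencing upload is deleted, which is why dedup pairs naturally with metadata in the database and bytes in the bucket.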