Handling Large Blobs — Cross-Cutting Pattern
The Problem
Large files — images, videos, documents — break every assumption built for small request/response cycles. Uploads fail mid-stream on flaky mobile connections. Processing a 2GB video into six renditions is not a synchronous API call. Serving a popular asset to millions of users is not a database read. Staff engineers recognize that "upload a file" is actually three separate systems with different failure modes, latency budgets, and scaling characteristics.
Playbooks That Use This Pattern
- Blob Storage & Media Pipeline — Upload, processing, storage lifecycle
- File Sync & Cloud Storage — Chunked sync, delta transfer
- Collaborative Editing — Document storage and versioning
- CDN & Edge Caching — Large asset delivery
- Chat & Messaging — Media message handling
The Core Tradeoff
| Strategy | What Works | What Breaks | Who Pays |
|---|---|---|---|
| Direct upload (through API) | Simple; one hop | Blocks API servers; timeout risk on large files; memory pressure | Your API fleet |
| Presigned URL upload | Blobs never touch your servers; scales with object storage | Client must handle signing flow; CORS configuration | Object storage provider |
| Chunked / resumable upload | Survives network interruptions; progress tracking | Client complexity; server-side reassembly and ordering | Client + storage layer |
| Streaming upload | Low memory; works for unbounded input | Harder to retry; backpressure management | Both sides |
Staff Default Position
"Upload reliability is a client-side problem." The API server's job is to issue a presigned URL and record metadata — it should never buffer a blob.
Staff default for any system handling files:
- Upload: Presigned URLs to object storage. Chunked/resumable for anything >10MB.
- Process: Async pipeline triggered by storage event (S3 notification, GCS Pub/Sub). Thumbnails, transcoding, virus scanning — none of this belongs in the request path.
- Store: Object storage (S3, GCS, Azure Blob) with lifecycle policies. Metadata in your database; bytes in the bucket.
- Serve: CDN with origin shielding. Cache warming for predictably popular content.
Upload, processing, and serving are three different systems with three different SLAs. Conflating them is how you get API servers OOM-killed by a 4GB video upload.
When to Deviate
- Small, trusted blobs (<1MB): Direct upload through API is fine. Avatars, config files — the complexity of presigned URLs may not be worth it.
- Regulatory / compliance constraints: Some industries require blobs to transit through your servers for inspection before reaching storage. Accept the cost; add streaming + size limits.
- Real-time collaborative editing: Binary blobs may need to be chunked at the application layer (not HTTP layer) for delta sync and conflict resolution.
- Edge-origin latency matters: For latency-sensitive workloads (live streaming ingest), upload directly to edge PoPs rather than a central region.
Common Interview Mistakes
| What Candidates Say | What Interviewers Hear | What Staff Engineers Say |
|---|---|---|
| "The client uploads to our API, then we store it in S3" | "They'll OOM their API servers under load" | "Presigned URL straight to S3. Our API only handles metadata." |
| "We'll process the image synchronously before returning 200" | "They've never handled a 500MB file" | "Upload triggers an async pipeline. Client polls or subscribes for completion." |
| "We'll store files in the database" | "They haven't thought about cost or throughput" | "Metadata in Postgres. Bytes in object storage. Always." |
| "We'll serve files directly from S3" | "They haven't considered latency or cost at scale" | "CDN in front with origin shielding. S3 is the origin, not the serving layer." |
| "Resumable upload adds too much complexity" | "They've never dealt with mobile networks" | "Anything over 10MB needs resumability. Upload reliability is a client-side problem." |
Implementation Deep Dive
1. Presigned URL Upload — Zero-Copy Ingestion
The Staff default for any file upload: the client uploads directly to object storage. Your API server never touches the bytes. This is "zero-copy ingestion" — the blob travels from the client to S3 in one hop.
Presigned URL Flow
# Step 1: Client requests upload URL from API
function requestUpload(userId, filename, contentType, fileSize):
    # Validate before generating URL
    if fileSize > MAX_FILE_SIZE:
        return { error: "file_too_large", maxSize: MAX_FILE_SIZE }
    if contentType not in ALLOWED_TYPES:
        return { error: "invalid_content_type" }

    # Generate unique storage key
    blobId = generateUUID()
    storageKey = f"uploads/{userId}/{blobId}/{filename}"

    # Create presigned PUT URL (valid for 15 minutes)
    presignedUrl = s3.generatePresignedUrl(
        method = "PUT",
        bucket = UPLOAD_BUCKET,
        key = storageKey,
        contentType = contentType,
        conditions = [
            # Enforce size at S3 level. Note: content-length-range is a
            # presigned POST policy condition; a plain presigned PUT cannot
            # cap size, so also re-check object size in the S3 event handler.
            ["content-length-range", 0, MAX_FILE_SIZE],
        ],
        expiresIn = 900  # 15 minutes
    )

    # Record pending upload in database
    db.execute("""
        INSERT INTO uploads (blob_id, user_id, storage_key, content_type, status, created_at)
        VALUES (?, ?, ?, ?, 'pending', now())
    """, blobId, userId, storageKey, contentType)

    return {
        "blobId": blobId,
        "uploadUrl": presignedUrl,
        "expiresAt": now() + 900
    }

# Step 2: Client uploads directly to S3 using the presigned URL
# (No server involvement — client → S3 directly)

# Step 3: S3 event notification triggers processing
function onS3Upload(event):
    storageKey = event.Records[0].s3.object.key
    upload = db.query("SELECT * FROM uploads WHERE storage_key = ?", storageKey)
    if not upload:
        # Orphaned upload — clean up
        s3.deleteObject(UPLOAD_BUCKET, storageKey)
        return

    db.execute("UPDATE uploads SET status = 'processing' WHERE blob_id = ?", upload.blobId)

    # Trigger async processing pipeline
    processingQueue.enqueue({
        "blobId": upload.blobId,
        "storageKey": storageKey,
        "contentType": upload.contentType,
        "userId": upload.userId
    })
2. Chunked Resumable Upload — The Mobile-First Protocol
For files over 10MB on mobile networks, a single PUT request is unreliable. Chunked resumable upload breaks the file into parts that can be uploaded independently and retried individually.
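The chunk arithmetic can be sketched as follows. This is an illustrative helper, not part of the protocol below; the 5 MiB constant matches S3's minimum multipart part size (every part except the last must be at least 5 MiB).

```python
from math import ceil

CHUNK_SIZE = 5 * 1024 * 1024  # 5 MiB — S3's minimum multipart part size

def plan_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return (partNumber, start, end) tuples; end is exclusive.

    Every part except the last is exactly chunk_size bytes, so each
    part can be uploaded and retried independently.
    """
    total_parts = ceil(file_size / chunk_size)
    return [
        (i + 1, i * chunk_size, min((i + 1) * chunk_size, file_size))
        for i in range(total_parts)
    ]

# A 12 MiB file splits into two full 5 MiB parts plus a 2 MiB tail.
parts = plan_chunks(12 * 1024 * 1024)
```

A failure at 95% of a 200MB upload then costs one re-uploaded part, not 200MB.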
Resumable Upload Protocol
# Client-side: chunk and upload
function uploadFile(file, uploadUrl):
    CHUNK_SIZE = 5 * 1024 * 1024  # 5MB per chunk (S3 minimum part size)
    totalChunks = ceil(file.size / CHUNK_SIZE)

    # Initiate multipart upload
    uploadId = api.post("/uploads/initiate", {
        filename: file.name,
        contentType: file.type,
        totalSize: file.size,
        totalChunks: totalChunks
    }).uploadId

    completedParts = []
    for i in range(totalChunks):
        chunk = file.slice(i * CHUNK_SIZE, (i + 1) * CHUNK_SIZE)
        partNumber = i + 1

        # Retry each chunk independently
        for attempt in range(MAX_RETRIES):
            try:
                response = api.put(
                    f"/uploads/{uploadId}/parts/{partNumber}",
                    body = chunk,
                    headers = { "Content-MD5": md5(chunk) }
                )
                completedParts.append({
                    "partNumber": partNumber,
                    "etag": response.etag
                })
                onProgress(partNumber / totalChunks)
                break
            except NetworkError:
                if attempt == MAX_RETRIES - 1:
                    raise UploadFailed(f"Part {partNumber} failed after {MAX_RETRIES} retries")
                sleep(exponentialBackoff(attempt))

    # Complete the upload
    api.post(f"/uploads/{uploadId}/complete", { parts: completedParts })

# Server-side: manage multipart upload via S3
function initiate(request):
    storageKey = buildStorageKey(request.userId, request.filename)
    response = s3.createMultipartUpload(
        bucket = UPLOAD_BUCKET,
        key = storageKey,
        contentType = request.contentType
    )
    db.execute("""
        INSERT INTO multipart_uploads (s3_upload_id, storage_key, status)
        VALUES (?, ?, 'active')
    """, response.uploadId, storageKey)
    return { uploadId: response.uploadId }

function uploadPart(uploadId, partNumber, body, contentMD5):
    # Look up the key recorded at initiate
    storageKey = db.queryValue(
        "SELECT storage_key FROM multipart_uploads WHERE s3_upload_id = ?", uploadId)
    response = s3.uploadPart(
        bucket = UPLOAD_BUCKET,
        key = storageKey,
        uploadId = uploadId,
        partNumber = partNumber,
        body = body,
        contentMD5 = contentMD5  # S3 validates integrity
    )
    return { etag: response.etag }

function complete(uploadId, parts):
    storageKey = db.queryValue(
        "SELECT storage_key FROM multipart_uploads WHERE s3_upload_id = ?", uploadId)
    s3.completeMultipartUpload(
        bucket = UPLOAD_BUCKET,
        key = storageKey,
        uploadId = uploadId,
        parts = parts
    )
    db.execute("UPDATE multipart_uploads SET status = 'completed' WHERE s3_upload_id = ?", uploadId)
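The client loop above calls an exponentialBackoff(attempt) helper it leaves undefined. One reasonable way to implement it is "full jitter": sleep a uniform random amount up to an exponentially growing cap, so thousands of mobile clients that lose connectivity at the same moment do not all retry in lockstep. A sketch, with illustrative constants:

```python
import random

BASE_DELAY = 0.5   # seconds; first retry waits at most this long
MAX_DELAY = 30.0   # cap so a high attempt count never means a multi-minute sleep

def exponential_backoff(attempt):
    """Full-jitter backoff: uniform in [0, min(MAX_DELAY, BASE * 2^attempt)]."""
    cap = min(MAX_DELAY, BASE_DELAY * (2 ** attempt))
    return random.uniform(0, cap)
```

Jitter matters more than the exact base: synchronized retries after a shared network blip are themselves a thundering herd.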
3. Async Processing Pipeline — Event-Driven Transformation
Blobs are uploaded, then processed: thumbnail generation, video transcoding, virus scanning, content moderation. None of this belongs in the request path.
Processing Pipeline
# Processing queue consumer
function processBlob(job):
    blob = s3.getObject(UPLOAD_BUCKET, job.storageKey)

    if job.contentType.startswith("image/"):
        results = imageProcessor.process(blob, [
            { "name": "thumbnail", "width": 200, "height": 200, "format": "webp" },
            { "name": "medium", "width": 800, "height": 600, "format": "webp" },
            { "name": "original", "width": null, "height": null, "format": "webp" },
        ])
    elif job.contentType.startswith("video/"):
        results = videoTranscoder.transcode(blob, [
            { "name": "720p", "width": 1280, "height": 720, "codec": "h264", "bitrate": "2500k" },
            { "name": "480p", "width": 854, "height": 480, "codec": "h264", "bitrate": "1000k" },
            { "name": "thumb", "frame": "00:00:01", "format": "webp" },
        ])

    # Store processed variants
    for variant in results:
        variantKey = f"processed/{job.blobId}/{variant.name}.{variant.format}"
        s3.putObject(PROCESSED_BUCKET, variantKey, variant.data)

    # Update database with variant URLs
    db.execute("""
        UPDATE uploads SET status = 'ready', variants = ?, processed_at = now()
        WHERE blob_id = ?
    """, serialize([v.url for v in results]), job.blobId)

    # Notify client
    notificationService.send(job.userId, {
        "type": "upload_complete",
        "blobId": job.blobId,
        "variants": [v.url for v in results]
    })
4. CDN Serving with Origin Shielding
Serving popular assets directly from S3 is expensive ($0.09/GB) and slow (single-region origin). A CDN with origin shielding reduces both cost and latency.
CDN Configuration
# CloudFront distribution config
Distribution:
  Origins:
    - Id: processed-bucket
      DomainName: processed-bucket.s3.amazonaws.com
      S3OriginConfig:
        OriginAccessIdentity: OAI-12345  # S3 only accessible via CDN
  CacheBehaviors:
    - PathPattern: "/media/*"
      ViewerProtocolPolicy: redirect-to-https
      CachePolicyId: "MediaCachePolicy"  # TTL: 30 days
      OriginRequestPolicyId: "S3Origin"
      Compress: true  # Gzip/Brotli for text-based formats

# Origin shielding: one regional cache between CDN PoPs and S3
OriginShield:
  Enabled: true
  OriginShieldRegion: us-east-1  # Closest to S3 bucket
Why origin shielding: Without it, a cache miss at any of 400+ CloudFront PoPs hits S3 directly; a viral image missing at 400 PoPs causes 400 S3 GetObject requests. With origin shielding, a single regional cache sits between all PoPs and S3, and misses from any PoP are served by the shield, so each asset costs at most one origin fetch no matter how many PoPs request it simultaneously.
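The request collapsing a shield performs can be modeled in a few lines: concurrent cache misses for the same key wait on a single in-flight origin fetch instead of each hitting the origin. This is a toy sketch of the idea, not CloudFront's implementation; all names are illustrative.

```python
import threading

class OriginShield:
    """Per-key single-flight cache: N concurrent misses -> 1 origin fetch."""
    def __init__(self, origin_fetch):
        self._origin_fetch = origin_fetch
        self._cache = {}
        self._locks = {}
        self._guard = threading.Lock()

    def get(self, key):
        with self._guard:
            if key in self._cache:
                return self._cache[key]
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:  # only one thread per key reaches the origin
            if key not in self._cache:
                self._cache[key] = self._origin_fetch(key)
            return self._cache[key]
```

With 32 PoPs simultaneously missing on the same viral image, the origin sees exactly one fetch; the other 31 requests block briefly on the per-key lock and are served from the shield's cache.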
Architecture Diagram
Upload path (steps 1-3): Client gets a presigned URL from the API, then uploads directly to S3. The API server never touches the blob — zero memory pressure.
Processing path (steps 4-5): S3 event triggers processing workers. Workers read raw blob, generate variants, store in processed bucket. Fully async and retryable.
Delivery path (step 6): Users fetch variants from CDN. Origin shield protects S3 from stampede. 30-day TTL means popular content is served from edge with zero origin hits.
Failure Scenarios
1. Orphaned Multipart Uploads — Silent Storage Cost Growth
Timeline: Clients start multipart uploads for large files. Some uploads are abandoned mid-stream (app closed, network lost, user cancels). The incomplete multipart upload parts remain in S3 indefinitely. Over 6 months, abandoned parts accumulate to 15TB of storage — invisible in standard S3 metrics.
Blast radius: $345/month in wasted storage (at $0.023/GB), growing linearly. No user impact, but the cost is hidden because S3 metrics show bucket size including incomplete multipart parts.
Detection: Use aws s3api list-multipart-uploads --bucket UPLOAD_BUCKET to enumerate incomplete uploads. Monitor for uploads older than 24 hours.
Recovery:
- Immediate: run AbortMultipartUpload for all uploads older than 24 hours
- Permanent: add an S3 lifecycle rule to automatically abort incomplete multipart uploads after 7 days
# S3 lifecycle rule — auto-cleanup abandoned uploads
{
  "Rules": [{
    "ID": "AbortIncompleteMultipartUploads",
    "Status": "Enabled",
    "AbortIncompleteMultipartUpload": {
      "DaysAfterInitiation": 7
    }
  }]
}
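The immediate cleanup is a filter over the list-multipart-uploads output. A sketch, assuming the listing has already been parsed into dicts whose field names match the real S3 response ("Key", "UploadId", "Initiated"); the 24-hour threshold is the one used above.

```python
from datetime import datetime, timedelta, timezone

def stale_uploads(uploads, now=None, max_age=timedelta(hours=24)):
    """Return the incomplete multipart uploads old enough to abort."""
    now = now or datetime.now(timezone.utc)
    return [u for u in uploads if now - u["Initiated"] > max_age]
```

Each returned entry's (Key, UploadId) pair is what AbortMultipartUpload needs; the lifecycle rule then keeps the problem from recurring.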
2. Processing Pipeline Poison Message — Transcoding Crash Loop
Timeline: A user uploads a corrupted 4GB video file. The video transcoder crashes on the file (invalid codec header). The processing queue redelivers the message; the transcoder crashes again. Because no dead-letter policy was configured, the message keeps returning to the queue after every failure, and the crash-retry loop consumes 100% of transcoding capacity.
Blast radius: All video processing is blocked. New uploads queue behind the poison message. The processing backlog grows by ~100 videos/hour. Users see "processing" status for hours.
Detection: Processing queue age metric spikes. Transcoder restart count increases. Processing throughput drops to zero.
Recovery:
- Immediate: identify the poison message and manually move it to a dead-letter queue
- Short-term: configure dead-letter queue policy — after 3 failed attempts, route to DLQ for manual investigation
- Long-term: add pre-processing validation — check file header, codec, and estimated duration before sending to the transcoder. Reject obviously corrupt files at upload time with a user-friendly error
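The dead-letter policy in the short-term fix amounts to a per-message attempt counter with a quarantine path. A minimal in-memory sketch (queue and handler names are illustrative; a real broker such as SQS tracks the receive count for you):

```python
from collections import deque

MAX_ATTEMPTS = 3  # after this many failed deliveries, quarantine the message

def consume(queue, dlq, handler):
    """Drain queue; poison messages land in dlq instead of looping forever."""
    while queue:
        msg = queue.popleft()
        try:
            handler(msg["body"])
        except Exception:
            msg["attempts"] = msg.get("attempts", 0) + 1
            if msg["attempts"] >= MAX_ATTEMPTS:
                dlq.append(msg)    # quarantine for manual investigation
            else:
                queue.append(msg)  # redeliver and try again
```

The key property: one corrupt file costs MAX_ATTEMPTS transcode attempts, then gets out of the way of every healthy message behind it.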
3. CDN Cache Invalidation Delay — Stale Content Served After Deletion
Timeline: A user deletes a profile photo (privacy request). The API deletes the object from S3 and issues a CDN cache invalidation. CloudFront invalidation takes 5-10 minutes to propagate to all 400+ PoPs. During the propagation window, the deleted photo is still served from CDN edge caches. The user reports a privacy violation.
Blast radius: One user's deleted content is accessible for up to 10 minutes after deletion. For GDPR-relevant content, this may constitute a compliance violation.
Detection: CDN invalidation propagation monitoring. User complaint. Compliance audit log comparing deletion timestamps with CDN cache hit timestamps.
Recovery:
- Immediate: CDN invalidation is already in flight — wait for propagation
- Architecture change: serve private/deletable content through a signed URL with short TTL (15 minutes) instead of a long-lived cache key. When the object is deleted from S3, the signed URL expires naturally
- Alternatively: use a versioned URL scheme (/media/v3/photo.jpg). On update or delete, the old version URL is never reused, so stale cache entries point to a non-existent object and return 404
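The short-TTL signed URL works because the link carries its own expiry, authenticated by an HMAC over the path and timestamp. This sketch illustrates the mechanism only; it is not S3's actual SigV4 scheme, and the secret, URL layout, and 15-minute TTL are assumptions from the recovery step above.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET = b"rotate-me-regularly"  # illustrative signing key

def sign_url(path, ttl_seconds=900, now=None):
    """Return path?expires=...&sig=..., valid for ttl_seconds."""
    expires = int((now if now is not None else time.time()) + ttl_seconds)
    sig = hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires, 'sig': sig})}"

def verify(path, expires, sig, now=None):
    """Reject expired or tampered links in constant time."""
    if (now if now is not None else time.time()) > expires:
        return False
    expected = hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)
```

Once the object is deleted, nothing needs invalidating: any copy of the link dies on its own within the TTL.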
Staff Interview Application
How to Introduce This Pattern
Name the three systems explicitly. This tells the interviewer you understand that "upload a file" is not one problem — it is three.
When NOT to Use This Pattern
- Small files (<1MB): Direct upload through the API is fine for avatars, config files, and thumbnails. Presigned URLs add client complexity that isn't justified for small payloads.
- Server-side validation required before storage: If you must scan the file (virus, content moderation) before it touches object storage, a presigned URL flow doesn't work — the blob must pass through your server. Use streaming upload with size limits.
- Private, ephemeral files: Temporary files that expire in minutes (chat image previews, one-time links) don't need CDN caching or lifecycle management. Serve directly from S3 with a short-lived signed URL.
Follow-Up Questions to Anticipate
| Interviewer Asks | What They Are Testing | How to Respond |
|---|---|---|
| "How do you handle upload failure at 95%?" | Reliability engineering | "Chunked resumable upload. The client retries only the failed chunk, not the entire file. For a 200MB upload at 95%, that is retrying 10MB instead of 200MB." |
| "How do you prevent malicious uploads?" | Security thinking | "Three layers: presigned URL conditions enforce size and content-type at S3. Processing pipeline runs virus scanning. Content moderation flags inappropriate images before they are served." |
| "What about storage cost at scale?" | Cost awareness | "Lifecycle policies: Standard for 30 days, then Infrequent Access, then Glacier for archival. Deduplication by content hash saves 20-40% on user-generated content. Chargeback to teams that store the most." |
| "How do you handle concurrent uploads to the same file?" | Concurrency reasoning | "Each upload gets a unique blob ID — no conflicts. If the business needs 'replace existing file' semantics, it is a new upload with a database pointer update, not an S3 overwrite." |
| "Why not store files in the database?" | Architecture fundamentals | "Databases are optimized for structured queries, not blob storage. A 1GB file in PostgreSQL bloats WAL, backup, and replication. Object storage is purpose-built: $0.023/GB/month, 11 nines durability, unlimited scale." |
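The dedup-by-content-hash answer in the cost row can be made concrete. A toy sketch, assuming blobs are keyed by the SHA-256 of their bytes so identical uploads (the same meme posted ten thousand times) cost storage once; the class and its fields are illustrative stand-ins for a bucket plus a reference-count table.

```python
import hashlib

class DedupStore:
    """Content-addressed blob store: identical bytes are stored once."""
    def __init__(self):
        self._blobs = {}  # content hash -> bytes
        self._refs = {}   # content hash -> reference count

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._blobs:
            self._blobs[digest] = data  # only new content costs storage
        self._refs[digest] = self._refs.get(digest, 0) + 1
        return digest  # callers store this hash as their blob pointer

    def stored_bytes(self):
        return sum(len(b) for b in self._blobs.values())
```

Reference counting matters for deletion: a blob is only eligible for removal when its last referencing upload is deleted, which is why dedup pairs naturally with metadata in the database and bytes in the bucket.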