StaffSignal
Foundation — Quick Reference

Numbers to Know

The 20 latency, throughput, and storage numbers you must recall in any system design interview. Order-of-magnitude reasoning, not memorization.

The 60-Second Version

  • System design interviews test whether you can size a system without a calculator. Wrong orders of magnitude signal you have not operated production infrastructure.
  • Interviewers do not expect exact figures. They expect you to stay within 2x of reality. Being 10x off on latency or throughput raises immediate credibility concerns.
  • Memorize order of magnitude, not decimal places. L1 cache is nanoseconds, disk seek is milliseconds, cross-region is tens to hundreds of milliseconds.
  • Numbers anchor every capacity plan, every SLA discussion, and every sharding decision. They are not trivia — they are the language of trade-off conversations.
  • Staff candidates connect numbers to architectural choices: "At 100K QPS we need horizontal scaling; at 1K QPS a single Postgres instance is fine."
  • Round aggressively. Use powers of 10. State your assumptions out loud. This is what interviewers actually evaluate.

Staff vs Senior: How Numbers Change the Conversation

| Number | Senior Engineers Say | Staff Engineers Say |
| --- | --- | --- |
| 99th percentile latency | "We should optimize the p99" | "p99 at 500ms means 1% of our 10M daily users hit this — that's 100K frustrated sessions. Is that acceptable for checkout vs. search?" |
| Throughput (QPS) | "We need to handle 50K QPS" | "50K QPS average means 150-500K peak. A single Postgres instance tops out at 50K reads/s — we need a caching layer, not more replicas" |
| Storage cost | "We'll store everything in S3" | "100TB across 3 replicas with indexes is 500TB actual. At $0.023/GB that's $11.5K/month — do we need 7-year retention or can we tier to Glacier after 90 days?" |
| Network bandwidth | "We have 10Gbps links" | "10Gbps theoretical is ~7Gbps goodput after overhead. Our 5TB/day outbound needs 460Mbps sustained — one link covers the average, but a 3-10x peak means 1.4-4.6Gbps, so we size for bursts, not the average" |
| Failure rate | "We target 99.9% availability" | "99.9% = 43 minutes downtime/month. With 3 dependencies each at 99.95%, our composite availability is 99.85% — we need circuit breakers and fallbacks to close the 0.05% gap" |
| Cache hit ratio | "Our cache hit rate is 95%" | "95% hit rate at 100K QPS means 5K cache misses/second hitting the database. If the DB handles 10K reads/s, we're at 50% capacity from misses alone — a drop to 90% hit rate doubles DB load instantly" |
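
The Staff-side arithmetic in this table is mechanical enough to script. A minimal sketch of the availability and cache-miss math, reusing the figures above (the QPS and dependency counts are illustrative):

```python
# Back-of-envelope availability and cache-miss math (illustrative figures).

def downtime_minutes_per_month(availability: float) -> float:
    """Allowed downtime in a 30-day month for a given availability target."""
    return (1 - availability) * 30 * 24 * 60

def composite_availability(dependencies: list[float]) -> float:
    """Serial dependencies multiply: every one must be up for a request to succeed."""
    total = 1.0
    for a in dependencies:
        total *= a
    return total

print(f"99.9% target     -> {downtime_minutes_per_month(0.999):.0f} min/month downtime budget")
print(f"3 deps at 99.95% -> {composite_availability([0.9995] * 3):.2%} composite availability")

# Cache misses that reach the database: a small hit-rate drop doubles the miss traffic.
qps = 100_000
for hit_rate in (0.95, 0.90):
    misses = qps * (1 - hit_rate)
    print(f"{hit_rate:.0%} hit rate at {qps:,} QPS -> {misses:,.0f} misses/s hitting the DB")
```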

Latency Numbers

| Operation | Latency | Order of Magnitude |
| --- | --- | --- |
| L1 cache reference | 1 ns | nanoseconds |
| L2 cache reference | 4 ns | nanoseconds |
| Main memory reference | 100 ns | nanoseconds |
| SSD random read | 100 us | microseconds |
| SSD sequential read (1 MB) | 1 ms | milliseconds |
| Network round-trip, same AZ | 0.5 ms | milliseconds |
| Network round-trip, same region | 1-2 ms | milliseconds |
| Disk seek (HDD) | 10 ms | milliseconds |
| Network round-trip, cross-region | 50-150 ms | tens to hundreds of milliseconds |
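
These latencies compose along a request path. A rough sketch of a hypothetical read path (one same-AZ hop to a cache on a hit; a same-region hop plus a few SSD reads on a miss), using only the figures from the table:

```python
# Latency budget for one read request, in microseconds.
# Figures come from the table above; the request path itself is a hypothetical example.

US = 1
MS = 1_000 * US

SAME_AZ_RTT = 500 * US          # client/service or service/cache in the same AZ
SAME_REGION_RTT = 1_500 * US    # service to a database in another AZ (1-2 ms)
SSD_RANDOM_READ = 100 * US      # one random read on the database's SSD
CROSS_REGION_RTT = 100 * MS     # only paid if the request leaves the region

cache_hit = SAME_AZ_RTT + SAME_AZ_RTT                           # hop to service, hop to cache
cache_miss = cache_hit + SAME_REGION_RTT + 3 * SSD_RANDOM_READ  # plus DB trip and 3 reads

print(f"cache hit : {cache_hit / MS:.1f} ms")
print(f"cache miss: {cache_miss / MS:.1f} ms")
print(f"one cross-region hop alone: {CROSS_REGION_RTT / MS:.0f} ms")
```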

Throughput Numbers

| System | Throughput | Notes |
| --- | --- | --- |
| Single web server | ~10K req/s | CPU-bound; I/O-bound workloads vary |
| Redis (single thread) | ~100K ops/s | ~500K with pipelining |
| Kafka (single broker) | ~1M msgs/s | Scales with partitions and brokers |
| Postgres | ~10K writes/s, ~50K reads/s | Assumes tuned config, SSDs |
| MySQL | ~15K writes/s | InnoDB, commodity hardware |
| 1 Gbps network link | ~120 MB/s | Practical ceiling after overhead |
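
One way to use these ceilings is to check a workload against them and let the result pick the architecture, as in the throughput row of the Staff-vs-Senior comparison above. A sketch with a made-up 100K QPS read workload:

```python
# Does a read workload fit a single Postgres instance, or does it need a cache?
# The 50K reads/s ceiling is the rough figure from the table; the workload is hypothetical.

POSTGRES_READS_PER_SEC = 50_000

def db_reads_at_peak(avg_qps: float, peak_multiplier: float, cache_hit_rate: float) -> float:
    """Reads per second that actually reach the database at peak traffic."""
    return avg_qps * peak_multiplier * (1 - cache_hit_rate)

avg_qps = 100_000
for hit_rate in (0.0, 0.90, 0.95):
    load = db_reads_at_peak(avg_qps, peak_multiplier=3, cache_hit_rate=hit_rate)
    verdict = "fits one instance" if load <= POSTGRES_READS_PER_SEC else "needs a cache or replicas"
    print(f"hit rate {hit_rate:.0%}: {load:,.0f} reads/s at peak -> {verdict}")
```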

Storage & Scale Numbers

| Calculation | Result | Rule of Thumb |
| --- | --- | --- |
| 1M users x 1 KB each | 1 GB | Fits in RAM on a single machine |
| 1B events/day x 100 bytes | 100 GB/day, ~36 TB/year | Plan for compression + retention policy |
| 500M tweets/day x 300 bytes | 150 GB/day | ~55 TB/year raw, before indexes or replicas |
| Seconds in a day | ~86,400 (~100K) | Use 100K for quick QPS math |
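
The same multiplications extend to yearly storage once replication and index overhead are included (the 3x replicas and ~40% index overhead below are the rules of thumb this guide uses, not measured values):

```python
# Raw event volume -> yearly storage, including replicas and index overhead.

def yearly_storage_tb(events_per_day: float, bytes_per_event: float,
                      replicas: int = 3, index_overhead: float = 0.4) -> float:
    """3 replicas and ~40% index overhead are rules of thumb, not measured values."""
    raw_gb_per_day = events_per_day * bytes_per_event / 1e9
    raw_tb_per_year = raw_gb_per_day * 365 / 1_000
    return raw_tb_per_year * replicas * (1 + index_overhead)

# 1B events/day at 100 bytes: ~36 TB/year raw, roughly 150 TB/year once stored for real.
print(f"raw only          : {yearly_storage_tb(1e9, 100, replicas=1, index_overhead=0.0):.0f} TB/year")
print(f"replicas + indexes: {yearly_storage_tb(1e9, 100):.0f} TB/year")
```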

Back-of-Envelope Reasoning

Example 1 — URL shortener write QPS. 100M new URLs/day. 100M / 100K seconds = ~1K writes/s. A single Postgres instance handles this comfortably.

Example 2 — Chat message storage. 1B messages/day, 200 bytes average. 200 GB/day raw. Over one year: ~73 TB. You need sharding and a retention strategy.

Example 3 — Image service bandwidth. 10M images served/day, 500 KB average. 5 TB/day outbound, or roughly 58 MB/s sustained. That is about half of one 1 Gbps link on average, but a 3-10x peak would saturate a single link; a small CDN handles it.
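
All three examples reduce to a few multiplications. A sketch that reproduces them with the rounded constants from this guide:

```python
# Reproduces the three back-of-envelope examples above.
SECONDS_PER_DAY = 100_000      # 86,400 rounded up for quick math
LINK_MB_PER_S = 120            # practical ceiling of a 1 Gbps link

# 1. URL shortener: 100M new URLs/day -> write QPS.
print(f"write QPS : {100e6 / SECONDS_PER_DAY:,.0f}")

# 2. Chat storage: 1B messages/day x 200 bytes -> GB/day and TB/year.
gb_per_day = 1e9 * 200 / 1e9
print(f"chat store: {gb_per_day:.0f} GB/day, {gb_per_day * 365 / 1_000:.0f} TB/year")

# 3. Image service: 10M images/day x 500 KB -> sustained MB/s vs. one 1 Gbps link.
mb_per_s = 10e6 * 500 / 1_000 / 86_400
print(f"images    : {mb_per_s:.0f} MB/s sustained ({mb_per_s / LINK_MB_PER_S:.0%} of one link)")
```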

Common Interview Traps

  • Confusing latency units. Mixing up microseconds and milliseconds changes your architecture. SSD random read is 100 us, not 100 ms.
  • Ignoring replication and indexing overhead. Raw data size is never the full storage cost. Multiply by 3x for replicas, add 30-50% for indexes.
  • Forgetting to convert units consistently. Always normalize to the same time horizon (per second, per day, per year) before comparing.
  • Over-precision. Saying "we need 11,574 QPS" instead of "roughly 12K QPS" signals inexperience with real estimation.

Latency Hierarchy

[Diagram: latency hierarchy, from nanoseconds (CPU caches) through microseconds (SSD) and milliseconds (disk, same-region network) to tens or hundreds of milliseconds (cross-region network).]

Quick Conversion Table

| From | To | Rule | Example |
| --- | --- | --- | --- |
| Daily volume | QPS | ÷ 86,400 (use 100K) | 1M/day ≈ 10 QPS |
| QPS | Daily volume | × 86,400 (use 100K) | 100 QPS ≈ 10M/day |
| GB/day | MB/s | ÷ 86,400 × 1,000 | 100 GB/day ≈ 1.2 MB/s |
| Users | Concurrent users | × 0.01 to 0.10 | 10M users → 100K-1M concurrent |
| Monthly active | Daily active | × 0.30 to 0.50 | 100M MAU → 30-50M DAU |
| Average QPS | Peak QPS | × 3 to 10 | Average 1K QPS → peak 3K-10K |
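
These conversions are one-liners. A sketch of them as helpers, using the table's rounded constants; the example inputs are arbitrary:

```python
# Quick-conversion helpers built from the rounded constants in the table above.
SECONDS_PER_DAY = 100_000        # ~86,400, rounded for mental math

def daily_to_qps(per_day: float) -> float:
    return per_day / SECONDS_PER_DAY

def gb_per_day_to_mb_per_s(gb_per_day: float) -> float:
    return gb_per_day * 1_000 / 86_400

def dau_from_mau(mau: float, ratio: float = 0.4) -> float:
    return mau * ratio           # 0.30-0.50 depending on the product

def peak_from_average(avg_qps: float, multiplier: float = 3) -> float:
    return avg_qps * multiplier  # 3-10x depending on the workload

print(f"1M requests/day -> {daily_to_qps(1e6):,.0f} QPS")
print(f"100 GB/day      -> {gb_per_day_to_mb_per_s(100):.1f} MB/s")
print(f"100M MAU        -> {dau_from_mau(100e6) / 1e6:.0f}M DAU")
print(f"1K QPS average  -> {peak_from_average(1_000):,.0f} QPS peak")
```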

Practice Prompt

Estimate the read QPS and bandwidth for a social feed service with 500M daily active users, each loading their feed 8 times per day, 20 posts per load, and ~2 KB per post. Can a single database serve the reads?

Staff-Caliber Answer Shape
  1. Total feed loads/day: 500M × 8 = 4B feed loads
  2. Total post reads/day: 4B × 20 = 80B post reads
  3. QPS: 80B / 100K seconds ≈ 800K QPS (peak: 2-3M QPS)
  4. Bandwidth: 80B × 2 KB = 160 TB/day ≈ 1.8 GB/s sustained
  5. Can a single DB handle it? No. Postgres handles ~50K reads/s. We need at least 16 read replicas for average load and 40+ for peak. This is a caching problem — a 95% cache hit rate reduces DB load to 40K QPS, within single-instance range.

The Staff move: Don't just compute the number. Follow through to the architectural implication: this volume demands a caching layer, not just database scaling.
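
The five steps above are easy to replay in code. A minimal sketch under the same assumptions (500M DAU, 8 feed loads/day, 20 posts per load, 2 KB per post, 95% cache hit rate):

```python
# Replays the feed-sizing steps above under the same assumptions.
DAU = 500e6
FEED_LOADS_PER_USER = 8
POSTS_PER_LOAD = 20
BYTES_PER_POST = 2_000
SECONDS_PER_DAY = 100_000
POSTGRES_READS_PER_SEC = 50_000
CACHE_HIT_RATE = 0.95

post_reads_per_day = DAU * FEED_LOADS_PER_USER * POSTS_PER_LOAD
avg_qps = post_reads_per_day / SECONDS_PER_DAY
bandwidth_gb_per_s = post_reads_per_day * BYTES_PER_POST / 1e9 / 86_400
db_qps_with_cache = avg_qps * (1 - CACHE_HIT_RATE)

print(f"post reads/day    : {post_reads_per_day:,.0f}")
print(f"average QPS       : {avg_qps:,.0f} (peak roughly 3-10x)")
print(f"bandwidth         : {bandwidth_gb_per_s:.1f} GB/s sustained")
print(f"replicas, no cache: {avg_qps / POSTGRES_READS_PER_SEC:.0f} at average load")
print(f"DB QPS with cache : {db_qps_with_cache:,.0f} (within a single instance)")
```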

Common Scale Anchors

Use these as sanity checks when estimating:

| System | Known Scale | Useful As |
| --- | --- | --- |
| Twitter/X | ~500M tweets/day | High-write social benchmark |
| Google Search | ~8.5B queries/day (~100K QPS) | Read-heavy search benchmark |
| Uber | ~20M rides/day, ~5M driver location updates/second | Real-time location at scale |
| Stripe | Millions of transactions/day | Payment processing benchmark |
| WhatsApp | ~100B messages/day | Messaging throughput ceiling |
| YouTube | ~500 hours uploaded/minute, ~1B hours watched/day | Media storage + bandwidth |

Additional Traps

  • Forgetting peak-to-average ratio. Average QPS is useless for capacity planning. You provision for peak, which is 3-10x average depending on the workload.
  • Treating storage as free. "We'll just store everything" ignores that 100 TB of hot data across 3 replicas with indexes is 500+ TB of actual storage cost.
  • Ignoring write amplification. One user action (post a tweet) can generate 10+ writes: the tweet itself, timeline fan-out, index updates, notification triggers, analytics events.
  • Confusing network throughput with goodput. Protocol overhead, retransmissions, and encryption reduce usable throughput to ~70% of theoretical maximum.
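
Write amplification is the easiest of these to underestimate. A hypothetical fan-out sketch; the per-action write counts are illustrative assumptions, not measured values:

```python
# Hypothetical write amplification for a single user action (posting a message).
WRITES_PER_ACTION = {
    "primary row": 1,
    "timeline fan-out (cached follower lists)": 10,
    "search index update": 1,
    "notification trigger": 1,
    "analytics event": 1,
}

actions_per_day = 500e6                    # e.g. the tweet-scale anchor above
amplification = sum(WRITES_PER_ACTION.values())
effective_write_qps = actions_per_day * amplification / 100_000

print(f"amplification factor: {amplification}x")
print(f"effective write QPS : {effective_write_qps:,.0f} from {actions_per_day:,.0f} actions/day")
```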

Where This Appears