Why This Matters
System design interviews are estimation conversations. Every capacity plan, every SLA negotiation, every sharding decision starts with a number. Not a precise number — an order-of-magnitude number that tells you whether your architecture is in the right ballpark or off by a factor of 100.
Interviewers do not care whether an SSD random read takes 100 microseconds or 150 microseconds. They care whether you think it takes 100 milliseconds — because that changes your entire design. A candidate who places a network hop at 0.5ms (correct) designs a microservice chain differently from one who places it at 50ms (wrong). The numbers are not trivia. They are the language of trade-off conversations.
Staff candidates do something that senior candidates rarely do: they connect numbers to decisions. "At 100K QPS, a single Postgres instance can serve reads from its buffer pool. At 500K QPS, we need a caching layer. At 5M QPS, we need sharded caches." The numbers narrow the design space. Without them, you're guessing — and the interviewer knows it.
This guide teaches you not just what the numbers are, but how to use them. By the end, you should be able to take any "design X for Y users" prompt and produce a capacity estimate in under two minutes — the same estimate that a Staff engineer would produce on a whiteboard.
The 60-Second Version
- System design interviews test whether you can size a system without a calculator. Wrong orders of magnitude signal you have not operated production infrastructure.
- Interviewers do not expect exact figures. They expect you to stay within 2x of reality. Being 10x off on latency or throughput raises immediate credibility concerns.
- Memorize order of magnitude, not decimal places. L1 cache is nanoseconds, disk seek is milliseconds, cross-region is tens to hundreds of milliseconds.
- Numbers anchor every capacity plan, every SLA discussion, and every sharding decision. They are not trivia — they are the language of trade-off conversations.
- Staff candidates connect numbers to architectural choices: "At 100K QPS we need horizontal scaling; at 1K QPS a single Postgres instance is fine."
- Round aggressively. Use powers of 10. State your assumptions out loud. This is what interviewers actually evaluate.
- Seconds in a day: ~100K. This single conversion is the most-used tool in estimation. 1M events/day = ~10 QPS.
How Estimation Works
The Core Skill
Back-of-envelope estimation is a three-step process:
- Start with a known anchor. A real-world number you're confident about — daily active users, average payload size, a throughput ceiling from the numbers table below.
- Chain multiplications. Walk from the anchor to the number you need. Each step should involve a single, justifiable assumption.
- Sanity check against a known system. Does your result make sense compared to systems at similar scale?
The entire process takes 60–90 seconds in an interview. The interviewer is evaluating three things: whether your anchors are in the right ballpark, whether your chain of reasoning is logical, and whether you catch your own mistakes before they have to point them out.
The Two Conversions You'll Use Every Time
Daily volume to QPS:
QPS = daily_volume ÷ 86,400
≈ daily_volume ÷ 100,000 (round for speed)
1M events/day ÷ 100K = 10 QPS
100M events/day ÷ 100K = 1,000 QPS
1B events/day ÷ 100K = 10,000 QPS
Storage per year:
annual_storage = daily_events × avg_size × 365
1M events/day × 1 KB = 1 GB/day → ~365 GB/year
1B events/day × 100 bytes = 100 GB/day → ~36 TB/year
These two conversions handle 80% of estimation questions. Practice them until they're automatic.
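As a sanity check, the two conversions can be sketched in a few lines of Python. The constants (100K seconds/day, 365 days/year) are the deliberately rounded rules of thumb above, not exact values.

```python
# Illustrative sketch of the two core conversions, using the rounded
# constants from the rules of thumb above (~100K seconds/day).

def daily_to_qps(daily_volume: float) -> float:
    """Average QPS via the ~100K seconds/day shortcut."""
    return daily_volume / 100_000

def annual_storage_bytes(daily_events: float, avg_size_bytes: float) -> float:
    """Raw annual storage, before replication and index overhead."""
    return daily_events * avg_size_bytes * 365

print(daily_to_qps(1_000_000))                 # 10.0 QPS for 1M events/day
print(annual_storage_bytes(1e9, 100) / 1e12)   # 36.5 TB/year for 1B x 100 B
```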
Peak vs. Average
Average QPS is useless for capacity planning. You provision for peak load, which is typically 3–10x average depending on the workload type:
| Workload | Peak / Average Ratio | Why |
|---|---|---|
| Social media feeds | 3–5x | Morning/evening usage spikes |
| E-commerce | 5–10x | Flash sales, Black Friday |
| Gaming | 2–3x | Predictable session-based usage |
| Financial trading | 10–50x | Market open, news events |
| SaaS / B2B | 2–3x | Business hours concentration |
Staff rule: Always state both. "Average 1K QPS, peak 5K QPS" tells the interviewer you understand capacity planning. "1K QPS" alone tells them you don't.
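The "state both" habit is easy to mechanize. A hypothetical helper (the peak ratios are taken from the table above; pick the one matching your workload):

```python
# Illustrative helper: always report average AND peak QPS together.
# Ratios are assumptions drawn from the Peak/Average table above.
PEAK_RATIO = {"social": 5, "ecommerce": 10, "gaming": 3, "trading": 50, "saas": 3}

def capacity_line(daily_volume: float, workload: str) -> str:
    avg_qps = daily_volume / 100_000            # seconds/day shortcut
    peak_qps = avg_qps * PEAK_RATIO[workload]   # provision for this number
    return f"average {avg_qps:,.0f} QPS, peak {peak_qps:,.0f} QPS"

print(capacity_line(100_000_000, "ecommerce"))  # average 1,000 QPS, peak 10,000 QPS
```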
The Numbers
Latency Numbers
These are the numbers that shape every architecture decision. Memorize the order-of-magnitude column — the exact numbers are for reference.
| Operation | Latency | Order of Magnitude | Design Implication |
|---|---|---|---|
| L1 cache reference | 1 ns | nanoseconds | CPU-bound work is measured here |
| L2 cache reference | 4 ns | nanoseconds | Still "free" — no design concern |
| Main memory reference | 100 ns | nanoseconds | In-process caches operate at this speed |
| SSD random read | 100 μs | microseconds | 10,000 reads/sec per SSD |
| SSD sequential read (1 MB) | 1 ms | milliseconds | Sequential is 10x faster than random |
| Network round-trip, same AZ | 0.5 ms | sub-millisecond | Baseline for microservice call chains |
| Network round-trip, same region | 1–2 ms | milliseconds | Cost of synchronous cross-AZ replication |
| Disk seek (HDD) | 10 ms | milliseconds | 100 seeks/sec — this is why SSDs won |
| Network round-trip, cross-region | 50–150 ms | tens of milliseconds | Hard floor for global architectures |
The key insight: There is a 1000x gap between memory (100ns) and network (0.5ms), and another 100x gap between same-region (1ms) and cross-region (100ms). These gaps are where architectural decisions live. When someone says "add a cache," they're exploiting the memory-network gap. When someone says "deploy to multiple regions," they're paying the same-region-to-cross-region tax.
Throughput Numbers
| System | Throughput | Notes | Design Implication |
|---|---|---|---|
| Single web server | ~10K req/s | CPU-bound; I/O-bound varies | Fine for most internal services |
| Redis (single thread) | ~100K ops/s | ~500K with pipelining | One instance handles most applications |
| Memcached (multithreaded) | ~500K ops/s | Pure key-value, no data structures | Choose over Redis for raw throughput |
| Kafka (per partition) | ~1M msgs/s | Throughput scales linearly with partitions | Add partitions, not clusters |
| Postgres | ~10K writes/s, ~50K reads/s | Tuned config, SSDs | The ceiling before you need sharding |
| MySQL (InnoDB) | ~15K writes/s | Commodity hardware | Similar to Postgres for write ceiling |
| 1 Gbps network link | ~120 MB/s | Practical ceiling after protocol overhead | Plan for 70% utilization |
| 10 Gbps network link | ~1.2 GB/s | Standard data center interconnect | Rarely the bottleneck |
Storage & Scale Numbers
| Calculation | Result | Rule of Thumb |
|---|---|---|
| 1M users × 1 KB each | 1 GB | Fits in RAM on a single machine |
| 10M users × 10 KB each | 100 GB | Fits on one SSD, queryable from one Postgres |
| 1B events/day × 100 bytes | 100 GB/day, ~36 TB/year | Plan for compression + retention policy |
| 500M tweets/day × 300 bytes | 150 GB/day, ~55 TB/year | Before indexes or replicas |
| Seconds in a day | ~86,400 (~100K) | The most-used conversion in estimation |
| Seconds in a month | ~2.6M (~2.5M) | For monthly volume calculations |
| Seconds in a year | ~31.5M (~30M) | For annual projections |
Availability Nines
Every SLA conversation uses the "nines" shorthand. Know these cold — and know how to compute composite availability across dependencies.
| Availability | Annual Downtime | Monthly Downtime | Common Tier |
|---|---|---|---|
| 99% (two nines) | 3.65 days | 7.3 hours | Batch systems, internal tools |
| 99.9% (three nines) | 8.76 hours | 43 minutes | Most SaaS, business apps |
| 99.95% | 4.38 hours | 21.6 minutes | Standard cloud SLA (EC2, RDS) |
| 99.99% (four nines) | 52.6 minutes | 4.3 minutes | Payment systems, core infra |
| 99.999% (five nines) | 5.26 minutes | 26 seconds | Telecom, emergency services |
Composite availability: If your service depends on N independent services in series, the composite availability is the product. Three dependencies at 99.95% each: 0.9995³ = 99.85%, not 99.95%.
Parallel redundancy: Two instances of a 99.9% service in active-active: unavailability = 0.001² = 0.000001 → 99.9999%. Redundancy is how you buy nines.
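Both formulas are one-liners. A quick sketch of the series and parallel math above:

```python
# Composite availability, as described above: multiply availabilities
# for series dependencies, multiply UNavailabilities for redundancy.

def series_availability(*avail: float) -> float:
    """Dependencies in series: the product of their availabilities."""
    p = 1.0
    for a in avail:
        p *= a
    return p

def parallel_availability(avail: float, replicas: int) -> float:
    """Active-active redundancy: 1 minus the product of unavailabilities."""
    return 1 - (1 - avail) ** replicas

def annual_downtime_hours(avail: float) -> float:
    return (1 - avail) * 365 * 24

print(round(series_availability(0.9995, 0.9995, 0.9995), 4))  # 0.9985
print(parallel_availability(0.999, 2))                        # 0.999999
print(round(annual_downtime_hours(0.999), 2))                 # 8.76
```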
Quick Conversion Table
| Conversion | Rule | Example | Example / Note |
|---|---|---|---|
| Daily volume → QPS | ÷ 100K | 1M/day ≈ 10 QPS | 500M/day ≈ 5K QPS |
| QPS → daily volume | × 100K | 100 QPS ≈ 10M/day | 50K QPS ≈ 5B/day |
| GB/day → MB/s | ÷ 86,400 × 1,000 | 100 GB/day ≈ 1.2 MB/s | 1 TB/day ≈ 12 MB/s |
| Users → concurrent | × 0.01 to 0.10 | 10M users → 100K–1M concurrent | Depends on session length |
| MAU → DAU | × 0.30 to 0.50 | 100M MAU → 30–50M DAU | Highly product-dependent |
| Average → peak | × 3 to 10 | Average 1K QPS → peak 3K–10K | E-commerce is toward 10x |
| Raw storage → actual | × 3 (replicas) × 1.4 (indexes) | 100 TB raw → ~420 TB actual | Include replication and indexing |
Back-of-Envelope Reasoning
The value of these numbers is in chaining them to reach architectural conclusions. Here are the reasoning patterns you'll use in interviews.
Pattern 1: Volume → QPS → Database Fit
Question: "A URL shortener handles 100M new URLs per day. Can a single database handle the writes?"
100M writes/day ÷ 100K seconds/day = 1,000 writes/sec
Postgres handles ~10K writes/sec
→ A single Postgres instance handles this comfortably (10% capacity)
→ No sharding needed for writes at this scale
Pattern 2: Volume → Storage → Retention Strategy
Question: "A chat app handles 1B messages per day, average 200 bytes each. How much storage per year?"
1B × 200 bytes = 200 GB/day raw
× 365 = 73 TB/year raw
× 3 (replicas) = 219 TB/year
× 1.3 (indexes) = ~285 TB/year actual
→ This needs sharding and a retention strategy
→ Hot data (last 30 days): 200 GB × 30 × 3 = 18 TB — fits in a sharded cluster
→ Cold data: tier to object storage after 90 days
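Pattern 2's chain is just multiplication. A hedged sketch using the walkthrough's assumed multipliers (3x replication, 1.3x index overhead):

```python
# Pattern 2 as code. The multipliers are the assumptions from the
# walkthrough above (3 replicas, 1.3x for indexes), not universal constants.

def annual_storage_tb(events_per_day: float, bytes_per_event: float,
                      replicas: int = 3, index_overhead: float = 1.3) -> float:
    raw_gb_per_day = events_per_day * bytes_per_event / 1e9
    return raw_gb_per_day * 365 * replicas * index_overhead / 1000  # in TB

print(round(annual_storage_tb(1e9, 200)))  # ~285 TB/year actual
```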
Pattern 3: Users → Bandwidth → CDN Need
Question: "An image service serves 10M images per day, average 500 KB each. Do we need a CDN?"
10M × 500 KB = 5 TB/day outbound
5 TB / 86,400 seconds = ~58 MB/s sustained
At 120 MB/s per 1 Gbps link → 50% of one link's capacity
Peak at 5x average → 290 MB/s → saturates 2+ links
→ CDN absorbs 80%+ of this traffic at the edge
→ Origin sees ~12 MB/s (20% miss rate), easily handled
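The same chain in code, under Pattern 3's assumptions (decimal KB, 5x peak, 80% CDN hit rate):

```python
# Pattern 3 as code: daily bytes -> sustained MB/s, then peak and
# origin load after the CDN. 5x peak and 80% hit rate are the
# walkthrough's assumptions, not fixed constants.

def sustained_mb_per_s(daily_bytes: float) -> float:
    return daily_bytes / 1e6 / 86_400

daily = 10_000_000 * 500_000      # 10M images x 500 KB (decimal)
avg = sustained_mb_per_s(daily)   # ~58 MB/s sustained
peak = avg * 5                    # ~289 MB/s at 5x average
origin = avg * 0.20               # ~12 MB/s after 80% CDN hit rate
print(round(avg), round(peak), round(origin))
```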
Pattern 4: Availability → Downtime → Dependency Budget
Question: "We target 99.9% availability. We have 3 critical dependencies."
99.9% = 8.76 hours downtime/year = 43 minutes/month
If each dependency is 99.95%:
Composite = 0.9995³ = 0.9985 = 99.85%
= 13 hours downtime/year — exceeds budget by 4+ hours
→ Need circuit breakers, fallbacks, and graceful degradation
→ Or: accept 99.85% and set SLO accordingly
Pattern 5: Write Amplification → True Cost
Question: "A social post generates how many actual writes?"
1 user post = 1 write to posts table
+ 1 write to user timeline
+ N writes for follower fan-out (avg 500 followers)
+ 1 write to search index
+ 1 write to notification queue
+ 1 write to analytics stream
= ~505 actual writes per user action
At 500M posts/day:
500M × 505 = ~250B write operations/day
250B ÷ 100K = 2.5M writes/sec across all systems
→ This is why fan-out is the Staff-level conversation, not the post write
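Pattern 5 reduces to summing a fan-out table. A sketch with the example's assumed breakdown (500 average followers, one write each to search, notifications, and analytics):

```python
# Pattern 5 as code: count the fan-out before sizing the write path.
# The per-action breakdown is the assumed example above, not a universal model.
WRITES_PER_POST = {
    "posts_table": 1, "author_timeline": 1, "follower_fanout": 500,
    "search_index": 1, "notifications": 1, "analytics": 1,
}

def writes_per_second(actions_per_day: float) -> float:
    amplification = sum(WRITES_PER_POST.values())     # 505x here
    return actions_per_day * amplification / 100_000  # seconds/day shortcut

print(f"{writes_per_second(500_000_000):,.0f} writes/sec")  # 2,525,000 writes/sec
```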
Visual Guide
(Diagrams: Latency Hierarchy; Estimation Decision Tree.)
Common Scale Anchors
Use these as sanity checks. If your estimate for a similar system is 10x higher or lower than a known real-world system, re-examine your assumptions.
| System | Known Scale | Useful As |
|---|---|---|
| Twitter/X | ~500M tweets/day, ~300K QPS peak | High-write social benchmark |
| Google Search | ~8.5B queries/day, ~100K QPS | Read-heavy search benchmark |
| Uber | ~20M rides/day, 5M location updates/sec | Real-time location at scale |
| WhatsApp | ~100B messages/day | Messaging throughput ceiling |
| YouTube | ~500 hours uploaded/min, ~1B hours watched/day | Media storage + bandwidth |
| Stripe | Millions of transactions/day | Payment processing benchmark |
| Netflix | ~200M subscribers, ~15% concurrent in peak | Streaming bandwidth reference |
| Instagram | ~2B+ MAU, ~100M photos uploaded/day | Media upload + storage reference |
How This Shows Up in Interviews
Scenario 1: "Estimate the storage for a new feature"
The interviewer is testing your estimation chain, not the final number. Do not say "it'll be a lot of data." Say: "100M users × 5 posts/day × 1 KB average = 500 GB/day raw. With 3x replication and 1.3x for indexes, that's ~2 TB/day actual. Over a year: ~730 TB. We need sharding, and a retention policy that tiers cold data to object storage after 90 days brings the hot dataset to under 200 TB." Show the chain, state assumptions aloud, and end with an architectural conclusion.
Scenario 2: "Design a notification system for 500M users" (Full Walkthrough)
This is a classic estimation-first question. Here is how a Staff engineer works through it before touching any architecture:
Step 1 — Anchor on user behavior. "Let me start with user activity. 500M DAU. Each user triggers roughly 20 notification-worthy events per day — likes, comments, follows, messages. That's 10B events/day."
Step 2 — Convert to throughput. "10B events/day ÷ 100K seconds/day = 100K events/sec average. Peak at 5x during morning/evening: 500K events/sec. This is well beyond a single database — we're in distributed queue territory. Kafka handles ~1M msgs/sec per partition, so a single partition could absorb even peak load on paper; in practice we'd use 10–20 partitions for headroom and parallel consumers."
Step 3 — Estimate notification fan-out. "Not every event generates a notification. Maybe 30% result in a push notification (3B/day), and each push payload is ~500 bytes. That's 1.5 TB/day of notification payload. With 90-day retention and 3x replication, the notification store holds 1.5 TB × 90 × 3 = 405 TB."
Step 4 — Check the delivery path. "3B push notifications/day = 30K/sec average, 150K/sec peak. APNS and FCM have rate limits per connection — typically 1K-5K/sec. We need 30-150 concurrent connections to the push providers. This is a connection pool problem, not a throughput problem."
Step 5 — Sanity check. "WhatsApp handles 100B messages/day. Our 10B events/day is 10x smaller. Instagram has 2B+ users generating a similar notification volume. Our numbers are in the right ballpark."
Why this is a Staff answer: Every number leads to an architectural conclusion. 100K events/sec → Kafka. 3B push/day → connection pooling. 405 TB → retention policy. The candidate never said a number without immediately connecting it to a design decision.
Scenario 3: "Can a single machine handle this?"
Do not say "probably not at scale." Say: "Let me check three axes. Reads: 50K QPS — Postgres handles 50K reads/s, so we're at the ceiling. One more feature doubles it past capacity. Writes: 2K QPS — well within the 10K writes/s ceiling. Storage: 200 GB — fits on one SSD. So reads are the bottleneck. Two read replicas double our read capacity to 100K, which buys us headroom. No sharding needed yet." Always check all three axes — reads, writes, storage — and identify which one forces the scaling decision.
Scenario 4: "What's the cost of this design?"
Cloud cost estimation is increasingly asked at Staff+ interviews. The quick rules:
| Resource | Approximate Cost | Quick Math |
|---|---|---|
| Compute (EC2/GCE) | ~$0.05/hr per vCPU | $35/month per core |
| SSD storage (EBS/PD) | ~$0.10/GB/month | $100/TB/month |
| Object storage (S3/GCS) | ~$0.023/GB/month | $23/TB/month |
| Data transfer (egress) | ~$0.09/GB | $90/TB — often the surprise cost |
| Redis (managed) | ~$0.06/GB/hour | ~$43/GB/month |
"100TB raw across 3 replicas with 1.4x index overhead is ~420TB actual. At $0.10/GB that's ~$42K/month for storage alone — do we need 7-year retention or can we tier to object storage after 90 days?"
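A rough cost sketch using the table's approximate rates and the conversion table's 3x/1.4x storage multipliers (illustrative list prices only — real pricing varies by provider, region, and tier):

```python
# Rough monthly storage cost. Rates are the approximate figures from
# the cost table above; multipliers (3 replicas, 1.4x indexes) come
# from the conversion table. All values are illustrative assumptions.
RATES = {"ssd_gb_month": 0.10, "s3_gb_month": 0.023, "egress_gb": 0.09}

def monthly_storage_cost(raw_tb: float, replicas: int = 3,
                         index_overhead: float = 1.4,
                         rate: float = RATES["ssd_gb_month"]) -> float:
    actual_gb = raw_tb * 1000 * replicas * index_overhead
    return actual_gb * rate

print(f"${monthly_storage_cost(100):,.0f}/month")  # 100 TB raw -> $42,000/month on SSD
```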
In the Wild
Google: The Jeff Dean Numbers
The most famous estimation reference in system design interviews traces back to Jeff Dean's 2009 presentation "Numbers Everyone Should Know." These numbers — L1 cache at 0.5ns, memory at 100ns, SSD at 100μs, network at 150ms cross-continent — became the canonical set that interviewers expect candidates to know. Google's entire infrastructure philosophy is built on these gaps: Bigtable exploits the sequential-vs-random disk gap, Spanner exploits the same-region-vs-cross-region gap with TrueTime, and the move to SSD across all GFS storage in the 2010s was driven by the 100x latency improvement over spinning disk.
The Staff-level insight: These numbers haven't changed dramatically in 15 years. Memory got faster, SSDs got cheaper, but the ratios between tiers remain roughly the same. The 1000x gap between memory and network is as real in 2026 as it was in 2009. This is why the estimation skill is durable — you're learning ratios, not absolute values.
Slack: Message Volume Estimation in Practice
Slack processes roughly 1.5 billion messages per day across all workspaces. Their engineering team publicly shared how they reason about capacity: average message size is ~1 KB (including metadata), giving ~1.5 TB/day of raw message data. But the actual storage cost is dominated by read amplification — every message is read by every channel member, and popular channels might have 1,000 members. The read fan-out turns 1.5B writes into ~150B reads/day, or roughly 1.7M reads/sec. This is why Slack moved from a single MySQL cluster to a sharded architecture — not because write volume exceeded capacity, but because read amplification at scale demanded it.
The Staff-level insight: The naive estimate (1.5B messages × 1 KB = 1.5 TB/day) suggests a manageable problem. The Staff estimate accounts for read amplification: every message is read N times where N is the channel size. The writes are easy; the reads determine the architecture.
Uber: Real-Time Location at 5M Updates/Second
Uber processes 5 million driver location updates per second at peak. Each update is roughly 100 bytes (lat/lng, timestamp, driver ID, trip ID). That's 500 MB/sec of raw location data, or ~43 TB/day. But the interesting estimation challenge is on the read side: every active rider is querying for nearby drivers multiple times per minute. With 20M rides/day and an average ride-matching window of 2 minutes with 10 queries, that's ~200M read queries/day for proximity matching alone — about 2,300 QPS. The location writes dwarf the reads by roughly 2,000x, which is why Uber uses an in-memory spatial index (not a database) for real-time matching and batches the writes to persistent storage asynchronously.
The Staff-level insight: The write-to-read ratio flips the normal assumption. Most systems are read-heavy. Uber's location system is write-heavy by three orders of magnitude, which demands a fundamentally different architecture — in-memory write-optimized stores instead of read-optimized databases.
Staff Calibration
The sections below are calibration tools for Staff-level interviews. If you already understand estimation mechanics, start here to sharpen the framing that separates L5 from L6 answers.
What Staff Engineers Say (That Seniors Don't)
| Number | Senior Engineers Say | Staff Engineers Say |
|---|---|---|
| 99th percentile latency | "We should optimize the p99" | "p99 at 500ms means 1% of our 10M daily users hit this — that's 100K frustrated sessions. Is that acceptable for checkout vs. search?" |
| Throughput (QPS) | "We need to handle 50K QPS" | "50K QPS average means 150–500K peak. A single Postgres instance tops out at 50K reads/s — we need a caching layer, not more replicas" |
| Storage cost | "We'll store everything in S3" | "100TB raw across 3 replicas with indexes is ~420TB actual. At $0.023/GB that's ~$9.7K/month — do we need 7-year retention or can we tier to Glacier after 90 days?" |
| Network bandwidth | "We have 10Gbps links" | "10Gbps theoretical is ~7Gbps goodput after overhead. Our 5TB/day outbound needs 460Mbps sustained — one link handles it, but during peak we'll saturate at 3x average" |
| Failure rate | "We target 99.9% availability" | "99.9% = 43 minutes downtime/month. With 3 dependencies each at 99.95%, our composite availability is 99.85% — we need circuit breakers and fallbacks to close the 0.05% gap" |
| Cache hit ratio | "Our cache hit rate is 95%" | "95% hit rate at 100K QPS means 5K cache misses/second hitting the database. If DB handles 10K reads/s, we're at 50% capacity from misses alone — a cache failure doubles DB load instantly" |
Common Interview Traps
- Confusing latency units. Mixing up microseconds and milliseconds changes your architecture. SSD random read is 100 μs, not 100 ms. State your units explicitly.
- Ignoring replication and indexing overhead. Raw data size is never the full storage cost. Multiply by 3x for replicas, add 30–50% for indexes. This is the difference between "100 TB" and "420 TB."
- Forgetting to convert units consistently. Always normalize to the same time horizon (per second, per day, per year) before comparing. Mixing daily and per-second numbers in the same chain guarantees errors.
- Over-precision. Saying "we need 11,574 QPS" instead of "roughly 12K QPS" signals inexperience with real estimation. Round aggressively — the goal is ballpark, not precision.
- Forgetting peak-to-average ratio. Average QPS is useless for capacity planning. You provision for peak, which is 3–10x average depending on the workload.
- Treating storage as free. "We'll just store everything" ignores that 100 TB of hot data across 3 replicas with indexes is ~420 TB of actual storage cost.
- Ignoring write amplification. One user action (post a tweet) can generate 10+ writes: the tweet itself, timeline fan-out, index updates, notification triggers, analytics events.
- Confusing network throughput with goodput. Protocol overhead, retransmissions, and encryption reduce usable throughput to ~70% of theoretical maximum.
Practice Drill
Prompt: A social feed serves 500M DAU. Each user loads their feed ~8 times/day, and each load reads ~20 posts at ~2 KB per post. Estimate read QPS and bandwidth, and decide whether a single database can serve it.
Staff-Caliber Answer Shape
- Total feed loads/day: 500M × 8 = 4B feed loads
- Total post reads/day: 4B × 20 = 80B post reads
- QPS: 80B / 100K seconds ≈ 800K QPS (peak: 2–3M QPS)
- Bandwidth: 80B × 2 KB = 160 TB/day ≈ 1.8 GB/s sustained
- Can a single DB handle it? No. Postgres handles ~50K reads/s. We need at least 16 read replicas for average load and 40+ for peak. This is a caching problem — a 95% cache hit rate reduces DB load to 40K QPS, within single-instance range.
The Staff move: Don't just compute the number. Follow through to the architectural implication: this volume demands a caching layer, not just database scaling.
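The full drill chain, end to end, with the bullets' assumptions (8 feed loads/user/day, 20 posts per load, 2 KB per post, 95% cache hit rate):

```python
# The drill chain as code. All inputs are the assumed figures from
# the answer shape above.
feed_loads = 500e6 * 8                 # 4B feed loads/day
post_reads = feed_loads * 20           # 80B post reads/day
avg_qps = post_reads / 100_000         # ~800K QPS average
db_qps_after_cache = avg_qps * 0.05    # ~40K QPS at 95% cache hit rate
bandwidth_gb_s = post_reads * 2_000 / 1e9 / 86_400  # ~1.85 GB/s sustained

print(int(avg_qps), int(db_qps_after_cache), round(bandwidth_gb_s, 2))
```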
Where This Appears
These playbooks apply estimation skills to complete system design problems with full Staff-level walkthroughs, evaluator-grade rubrics, and practice drills.
- Capacity Planning — Structured framework for translating business requirements into infrastructure numbers, with worked examples of estimating compute, storage, and bandwidth for real systems
- Rate Limiting — Per-client and per-endpoint rate calculation, token bucket sizing, and the math behind distributed rate limiting across multiple nodes
- Load Balancer — Throughput-based routing decisions, connection pool sizing, and why the throughput ceiling of a single server drives your load balancing strategy
- Database Sharding — When single-instance ceilings force a sharding decision, how to estimate shard count from throughput and storage projections, and rebalancing math
Related Technologies: Redis · PostgreSQL · Kafka · Cassandra · Elasticsearch