Sharding & Partitioning — Staff Interview Quick Reference
The 60-Second Version
- Hash-based partitioning gives even distribution but destroys key ordering, so range queries degrade to scatter-gather. Use it for point lookups (user profiles, session stores).
- Range-based partitioning supports range queries naturally but creates hot partitions on sequential keys (timestamps, auto-increment IDs).
- Shard key selection is a multi-year commitment. Changing it later means full data migration: dual-write, backfill, verify, cutover. This is not a refactor; it is a project.
- Cross-shard queries use scatter-gather, which fans out to every shard and merges results. Design your data model so the dominant access pattern hits a single shard.
- Hot keys break the sharding contract. One viral user or popular product concentrates load on one shard. Mitigate with compound keys, virtual shards, or request-level caching.
- The real question is never "how many shards?" It is "what does the access pattern look like in three years?"
- Re-sharding is an operational migration, not a code change. Plan for it from day one by using logical shards mapped to physical nodes.
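A minimal sketch of that last point, with illustrative shard counts and node names (128 logical shards on 4 physical nodes). The key idea: the key-to-logical-shard hash never changes; only the logical-to-physical map does.

```python
import hashlib

NUM_LOGICAL_SHARDS = 128  # fixed for the life of the system

# Logical -> physical assignment lives in config. Rebalancing means editing
# this map and copying the affected logical shards, not re-hashing keys.
SHARD_MAP = {ls: f"db-node-{ls % 4}" for ls in range(NUM_LOGICAL_SHARDS)}

def logical_shard(key: str) -> int:
    """Stable hash of the shard key -> logical shard id."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_LOGICAL_SHARDS

def physical_node(key: str) -> str:
    """Route a key to whichever node currently owns its logical shard."""
    return SHARD_MAP[logical_shard(key)]

print(physical_node("user:42"))  # a point lookup touches exactly one node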
What Staff Engineers Say (That Seniors Don't)
| Concept | Senior Response | Staff Response |
|---|---|---|
| Shard key choice | "Pick the highest-cardinality column" | "Optimize for query isolation first. Cardinality matters, but cross-shard fan-out kills p99 latency." |
| Hot partitions | "Add more shards" | "More shards don't fix skew. Introduce virtual shards or split the hot key with a suffix so load disperses without resharding." (sketched after this table) |
| Re-sharding | "We'll reshard when we need to" | "We provision 2-4x logical shards from day one and rebalance by remapping, not migrating." |
| Cross-shard joins | "Use a distributed query layer" | "Scatter-gather is O(shards). Denormalize or co-locate related entities on the same shard key to avoid it entirely." |
| Range vs. hash | "Range for time-series, hash for everything else" | "Range works until write volume skews to the latest partition. Combine range with salting or use a two-level scheme." |
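The hot-key mitigation in the table (splitting the hot key with a suffix, i.e. salting) can be sketched roughly like this; the bucket count and key format are illustrative:

```python
import random

SALT_BUCKETS = 8  # how many sub-keys one hot key is split into

def write_key(hot_key: str) -> str:
    """Writes append a random suffix, spreading one hot key over SALT_BUCKETS partitions."""
    return f"{hot_key}#{random.randrange(SALT_BUCKETS)}"

def read_keys(hot_key: str) -> list[str]:
    """Reads fan out to every sub-key and merge, since the write suffix was random."""
    return [f"{hot_key}#{i}" for i in range(SALT_BUCKETS)]
```

The trade-off: writes for the hot key disperse, but every read of that key now fans out to SALT_BUCKETS partitions, so reserve salting for keys that are genuinely hot.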
The Numbers That Matter
- Single PostgreSQL instance: ~10K write TPS before saturation
- Single MySQL instance: ~15K write TPS under typical workloads
- Sweet spot per shard: 100-500 GB of data; beyond 500 GB, backup, restore, and replication lag become operational risks
- Scatter-gather penalty: latency is bounded by the slowest shard, so p99 degrades as shard count grows
- Logical-to-physical ratio: start with 4-8x logical shards per physical node to allow rebalancing without data movement
Common Interview Traps
- Jumping to sharding before exhausting simpler options. Vertical scaling, read replicas, connection pooling, and better indexing solve most problems below 10K write TPS. State that threshold before proposing shards.
- Ignoring the re-sharding story. If your design cannot add capacity without downtime, the interviewer will press on it. Always mention logical shards and live migration.
- Choosing shard key based on write path alone. The read path usually dominates. A shard key that distributes writes perfectly but forces scatter-gather on every read is the wrong key.
- Treating sharding as only a database concern. Caches, queues, and event streams all need consistent partitioning. Mismatched partition strategies between layers cause data inconsistency.
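One way to avoid the mismatched-layers trap from the last bullet is a single deterministic routing function shared by every layer; a rough sketch, with illustrative bucket counts:

```python
import zlib

def bucket(entity_key: str, num_buckets: int) -> int:
    """The same deterministic hash for every layer, so one entity always maps consistently."""
    return zlib.crc32(entity_key.encode()) % num_buckets

DB_SHARDS = 32
STREAM_PARTITIONS = 32  # keep counts aligned across layers where ordering matters

db_shard = bucket("user:42", DB_SHARDS)
stream_partition = bucket("user:42", STREAM_PARTITIONS)
cache_slot = bucket("user:42", 1024)
```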
Sharding Strategy Selection
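A rough decision helper that condenses the trade-offs above into code; the inputs and return strings are deliberately simplified, and a real decision also weighs re-sharding cost and operational maturity.

```python
def pick_strategy(point_lookups: bool, range_scans: bool, sequential_keys: bool) -> str:
    """Condensed selection logic; see the trade-offs in the sections above."""
    if point_lookups and not range_scans:
        return "hash partitioning on the entity key"
    if range_scans and not point_lookups:
        if sequential_keys:
            return "range partitioning plus salting (or a two-level scheme) to spread the hot tail"
        return "range partitioning on the scan key"
    # Both patterns matter: no single shard key serves both well.
    return "two specialized stores (entity-sharded OLTP + range-partitioned OLAP) linked by CDC"
```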
Re-Sharding: The Operation Nobody Wants
| Phase | What Happens | Duration | Risk |
|---|---|---|---|
| 1. Dual-write | New writes go to both old and new shard layout (sketched after this table) | Days–weeks | Data divergence if writes conflict |
| 2. Backfill | Historical data copied to new layout | Hours–days | Must be idempotent; re-runnable on failure |
| 3. Verify | Checksums compared between old and new | Hours | Missing rows surface as silent data loss |
| 4. Cutover | Read path switches to new layout | Minutes | Rollback plan must exist; test it before cutover |
| 5. Cleanup | Old shard layout decommissioned | Days | Keep old layout for 1-2 weeks as safety net |
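A rough sketch of the dual-write phase, assuming duck-typed layout clients and an external divergence log (all names illustrative); a production version also needs reconciliation tooling:

```python
def dual_write(record: dict, old_layout, new_layout, divergence_log: list) -> None:
    """Phase 1: every write goes to both layouts. The old layout stays authoritative;
    failures against the new layout are recorded, not raised, so the backfill and
    verify phases can reconcile them later."""
    old_layout.write(record)       # source of truth until cutover
    try:
        new_layout.write(record)   # best-effort shadow write
    except Exception as exc:
        divergence_log.append((record.get("id"), str(exc)))
```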
Staff insight: The teams that re-shard smoothly are the ones who started with logical shards from day one. Re-mapping logical → physical is a config change. Re-hashing keys is a migration.
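Concretely, "re-mapping is a config change" can look like this (shard and node counts illustrative, continuing the routing sketch near the top of the page):

```python
# Day one: 128 logical shards spread over 4 physical nodes.
shard_map_v1 = {ls: f"db-node-{ls % 4}" for ls in range(128)}

# Capacity doubles: add 4 nodes and reassign. Keys never re-hash; only the
# logical -> physical map changes, and only the moved logical shards get copied.
shard_map_v2 = {ls: f"db-node-{ls % 8}" for ls in range(128)}

moved = [ls for ls in range(128) if shard_map_v1[ls] != shard_map_v2[ls]]
print(f"{len(moved)} of 128 logical shards move; the other {128 - len(moved)} stay put")
```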
Practice Prompt
Your events table is sharded by user_id across 32 shards. Per-user queries are fast, but analytics now needs date-range queries across all users. How do you evolve the design?
Staff-Caliber Answer Shape
- Acknowledge the conflict. user_id sharding optimizes for per-user queries. Date-range queries require scatter-gather across all 32 shards — that's 32 parallel queries with the result bounded by the slowest shard.
- Don't re-shard. Changing the shard key breaks the per-user path. Instead, add a secondary index layer: replicate events into a date-partitioned analytical store (ClickHouse, BigQuery, or a date-range-sharded read replica).
- CDC pipeline. Use change data capture to stream from the user_id-sharded OLTP store to the date-partitioned OLAP store. Latency: seconds to minutes, acceptable for analytics.
- Query routing. Per-user queries hit the OLTP shards. Date-range analytics hit the OLAP store. Each store is optimized for its access pattern.
The Staff move: Recognize that one sharding strategy cannot serve both access patterns. The answer is two specialized stores connected by a replication pipeline, not a compromise shard key.
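A rough sketch of that routing split, assuming integer user ids, modulo shard routing, and illustrative client methods (fetch_events and scan_events are placeholders, not real APIs):

```python
def run_query(query: dict, oltp_shards: list, olap_store):
    """Send each query to the store built for its access pattern."""
    if "user_id" in query:
        # Per-user path: exactly one user_id-sharded OLTP shard.
        shard = oltp_shards[query["user_id"] % len(oltp_shards)]
        return shard.fetch_events(user_id=query["user_id"])
    # Date-range analytics: the CDC-fed, date-partitioned OLAP store.
    return olap_store.scan_events(start=query["start"], end=query["end"])
```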
Additional Traps
- "We'll use a UUID as shard key." UUIDs distribute well but make debugging impossible — you can't tell which shard holds a specific user's data without a lookup.
- Ignoring cross-shard transaction cost. Any operation spanning multiple shards requires distributed transactions (2PC/Saga) or accepts eventual consistency. Name which one.
- Underestimating shard count growth. Starting with 4 shards and planning to double at each threshold means you need 3 re-sharding operations to reach 32. Start with 128 logical shards mapped to 4 physical nodes.