StaffSignal
Foundation — Quick Reference

Sharding & Partitioning

A shard key is a multi-year commitment. Re-sharding is a migration, not a refactor. "Just shard it" is the most expensive sentence in system design.

Sharding & Partitioning — Staff Interview Quick Reference

The 60-Second Version

  • Hash-based partitioning gives even distribution but makes range queries impossible. Use it for point lookups (user profiles, session stores).
  • Range-based partitioning supports range queries naturally but creates hot partitions on sequential keys (timestamps, auto-increment IDs).
  • Shard key selection is a multi-year commitment. Changing it later means full data migration: dual-write, backfill, verify, cutover. This is not a refactor; it is a project.
  • Cross-shard queries use scatter-gather, which fans out to every shard and merges results. Design your data model so the dominant access pattern hits a single shard.
  • Hot keys break the sharding contract. One viral user or popular product concentrates load on one shard. Mitigate with compound keys, virtual shards, or request-level caching.
  • The real question is never "how many shards?" It is "what does the access pattern look like in three years?"
  • Re-sharding is an operational migration, not a code change. Plan for it from day one by using logical shards mapped to physical nodes.
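The logical-shard idea from the last bullet can be sketched in a few lines. This is an illustrative sketch, not any particular system's implementation; the shard counts and the `PHYSICAL_OF` mapping are assumptions chosen for the example.

```python
import hashlib

NUM_LOGICAL = 128  # fixed for the life of the system; keys hash into these

# Logical -> physical mapping. Rebalancing edits this table and copies the
# affected logical shards; it never re-hashes keys.
PHYSICAL_OF = {l: l % 4 for l in range(NUM_LOGICAL)}  # 4 physical nodes to start

def logical_shard(key: str) -> int:
    """Stable hash so routing survives process restarts (unlike built-in hash())."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_LOGICAL

def physical_node(key: str) -> int:
    return PHYSICAL_OF[logical_shard(key)]
```

Adding capacity then means reassigning some logical shards to new nodes and copying only those shards; the key-to-logical-shard function never changes.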

What Staff Engineers Say (That Seniors Don't)

| Concept | Senior Response | Staff Response |
| --- | --- | --- |
| Shard key choice | "Pick the highest-cardinality column" | "Optimize for query isolation first. Cardinality matters, but cross-shard fan-out kills p99 latency." |
| Hot partitions | "Add more shards" | "More shards don't fix skew. Introduce virtual shards or split the hot key with a suffix so load disperses without resharding." |
| Re-sharding | "We'll reshard when we need to" | "We provision 2-4x logical shards from day one and rebalance by remapping, not migrating." |
| Cross-shard joins | "Use a distributed query layer" | "Scatter-gather is O(shards). Denormalize or co-locate related entities on the same shard key to avoid it entirely." |
| Range vs. hash | "Range for time-series, hash for everything else" | "Range works until write volume skews to the latest partition. Combine range with salting or use a two-level scheme." |
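The "combine range with salting" move from the last row can be sketched as a two-level key. The salt count and the `device_id` naming are assumptions for illustration.

```python
import hashlib
from datetime import datetime, timezone

NUM_SALTS = 8  # writes to the "latest" time range spread over 8 partitions

def salted_key(device_id: str, ts: datetime) -> tuple[int, str]:
    """Two-level key: salt bucket first, then the range-ordered timestamp.

    Writes for the current minute land on NUM_SALTS partitions instead of one.
    A time-range read fans out to NUM_SALTS partitions (a small fixed cost)
    rather than to every shard in the cluster.
    """
    salt = int(hashlib.md5(device_id.encode()).hexdigest(), 16) % NUM_SALTS
    return (salt, ts.isoformat())

key = salted_key("sensor-17", datetime(2024, 5, 1, tzinfo=timezone.utc))
```

The trade is explicit: each range read costs NUM_SALTS sub-queries, in exchange for removing the single hot tail partition on the write path.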

The Numbers That Matter

  • Single PostgreSQL instance: ~10K write TPS before saturation
  • Single MySQL instance: ~15K write TPS under typical workloads
  • Sweet spot per shard: 100 GB - 500 GB of data; beyond 500 GB, backup, restore, and replication lag become operational risks
  • Scatter-gather penalty: latency is bounded by the slowest shard, so p99 degrades as shard count grows
  • Logical-to-physical ratio: start with 4-8x logical shards per physical node to allow rebalancing without data movement
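The scatter-gather penalty above can be made concrete with a toy simulation: if each shard's response time is roughly Gaussian, the request completes only when the slowest shard answers, so tail latency rises with shard count. The distribution parameters here are arbitrary, chosen only to show the shape of the effect.

```python
import random

random.seed(7)

def request_latency_ms(num_shards: int) -> float:
    """A scatter-gather request finishes when the slowest shard responds."""
    return max(random.gauss(10, 3) for _ in range(num_shards))

def p99(samples: list[float]) -> float:
    return sorted(samples)[int(len(samples) * 0.99)]

for n in (1, 8, 32):
    latencies = [request_latency_ms(n) for _ in range(10_000)]
    print(f"{n:>2} shards: p99 = {p99(latencies):.1f} ms")
```

Even with identical per-shard behavior, the max-of-N effect alone pushes p99 up as the fan-out widens.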

Common Interview Traps

  • Jumping to sharding before exhausting vertical scaling. Read replicas, connection pooling, and better indexing solve most problems below 10K TPS. State the threshold before proposing shards.
  • Ignoring the re-sharding story. If your design cannot add capacity without downtime, the interviewer will press on it. Always mention logical shards and live migration.
  • Choosing shard key based on write path alone. The read path usually dominates. A shard key that distributes writes perfectly but forces scatter-gather on every read is the wrong key.
  • Treating sharding as only a database concern. Caches, queues, and event streams all need consistent partitioning. Mismatched partition strategies between layers cause data inconsistency.
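The last trap above suggests a simple discipline: derive every layer's placement from one shared partition function. A minimal sketch, with illustrative layer counts:

```python
import hashlib

def partition_of(key: str, buckets: int) -> int:
    """One partitioning function shared by every layer of the stack."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % buckets

# Illustrative counts; keeping them aligned (equal or integer multiples)
# keeps a given user's rows, cache entries, and events co-partitioned.
DB_SHARDS = CACHE_NODES = STREAM_PARTITIONS = 32

def db_shard(user_id: str) -> int:
    return partition_of(user_id, DB_SHARDS)

def cache_node(user_id: str) -> int:
    return partition_of(user_id, CACHE_NODES)
```

If the cache used a different hash than the database, a user's writes and cached reads would land on unrelated nodes, which is exactly the cross-layer inconsistency the bullet warns about.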

Sharding Strategy Selection

[Diagram not rendered: decision flow for selecting a sharding strategy]
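The decision logic this guide implies can be distilled into a small rule of thumb. This is a simplification assumed from the surrounding text, not an exhaustive decision procedure.

```python
def pick_strategy(range_queries: bool, sequential_keys: bool) -> str:
    """Illustrative strategy selection distilled from this guide."""
    if not range_queries:
        return "hash"             # even distribution; point lookups only
    if sequential_keys:
        return "range + salting"  # spread the hot latest partition
    return "range"                # natural fit, no sequential-key skew
```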

Re-Sharding: The Operation Nobody Wants

| Phase | What Happens | Duration | Risk |
| --- | --- | --- | --- |
| 1. Dual-write | New writes go to both old and new shard layout | Days–weeks | Data divergence if writes conflict |
| 2. Backfill | Historical data copied to new layout | Hours–days | Must be idempotent; re-runnable on failure |
| 3. Verify | Checksums compared between old and new | Hours | Missing rows surface as silent data loss |
| 4. Cutover | Read path switches to new layout | Minutes | Rollback plan must exist; test it before cutover |
| 5. Cleanup | Old shard layout decommissioned | Days | Keep old layout for 1-2 weeks as safety net |
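Why the backfill phase must be idempotent can be shown with in-memory stand-ins for the stores. This sketch uses plain dicts and a placeholder shard function; real systems would use keyed upserts against the new shard layout.

```python
def backfill(old_store: dict, new_stores: list[dict], shard_of) -> None:
    """Copy every row into the new layout; safe to re-run after a crash.

    Idempotency comes from keyed upserts: writing the same row twice
    converges to the same state, so a failed run is simply restarted
    from the beginning (or from a checkpoint) with no cleanup step.
    """
    for key, row in old_store.items():
        new_stores[shard_of(key)][key] = row  # upsert, never append

old = {"u1": {"name": "Ada"}, "u2": {"name": "Lin"}}
new = [dict() for _ in range(2)]
shard_of = lambda k: hash(k) % 2  # placeholder; production needs a stable hash

backfill(old, new, shard_of)
backfill(old, new, shard_of)  # re-run: no duplicates, same final state
```

An append-based copy would double rows on retry; the upsert makes "re-runnable on failure" a property of the write, not of careful bookkeeping.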

Staff insight: The teams that re-shard smoothly are the ones who started with logical shards from day one. Re-mapping logical → physical is a config change. Re-hashing keys is a migration.

Practice Prompt

Staff-Caliber Answer Shape
  1. Acknowledge the conflict. user_id sharding optimizes for per-user queries. Date-range queries require scatter-gather across all 32 shards — that's 32 parallel queries with the result bounded by the slowest shard.
  2. Don't re-shard. Changing the shard key breaks the per-user path. Instead, add a secondary index layer: replicate events into a date-partitioned analytical store (ClickHouse, BigQuery, or a date-range-sharded read replica).
  3. CDC pipeline. Use change data capture to stream from the user_id-sharded OLTP store to the date-partitioned OLAP store. Latency: seconds to minutes, acceptable for analytics.
  4. Query routing. Per-user queries hit the OLTP shards. Date-range analytics hit the OLAP store. Each store is optimized for its access pattern.

The Staff move: Recognize that one sharding strategy cannot serve both access patterns. The answer is two specialized stores connected by a replication pipeline, not a compromise shard key.
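The two-store routing described above can be sketched as a small dispatcher. The store names, the 32-shard count, and the query shape are taken from the answer above; the function itself is illustrative.

```python
def route(query: dict) -> str:
    """Route by access pattern, not by a compromise shard key.

    Per-user lookups carry a user_id and hit the user_id-sharded OLTP
    store; date-range analytics hit the date-partitioned OLAP store that
    the CDC pipeline keeps seconds-to-minutes behind.
    """
    if "user_id" in query:
        return f"oltp-shard-{hash(query['user_id']) % 32}"
    if "date_range" in query:
        return "olap-store"
    raise ValueError("query matches no supported access pattern")
```

Each path touches exactly one store, so neither access pattern ever pays the 32-way scatter-gather cost.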

Additional Traps

  • "We'll use a UUID as shard key." UUIDs distribute well but make debugging impossible — you can't tell which shard holds a specific user's data without a lookup.
  • Ignoring cross-shard transaction cost. Any operation spanning multiple shards requires distributed transactions (2PC/Saga) or accepts eventual consistency. Name which one.
  • Underestimating shard count growth. Starting with 4 shards and planning to double at each threshold means you need 3 re-sharding operations to reach 32. Start with 128 logical shards mapped to 4 physical nodes.
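The arithmetic behind the last bullet is worth seeing: with 128 logical shards, doubling physical capacity moves half the logical shards as a copy job, while the key-to-logical-shard hashing never changes. A minimal sketch, assuming a simple round-robin assignment:

```python
NUM_LOGICAL = 128

def assign(num_physical: int) -> dict[int, int]:
    """Round-robin logical shards across physical nodes."""
    return {l: l % num_physical for l in range(NUM_LOGICAL)}

before, after = assign(4), assign(8)  # double physical capacity: 4 -> 8 nodes
moved = sum(before[l] != after[l] for l in range(NUM_LOGICAL))
# Only the remapped logical shards (here, half of them) are copied to new
# nodes; routing of keys to logical shards is untouched, so no re-hash.
```

Contrast with starting at 4 shards keyed directly by `hash(key) % 4`: each doubling there re-hashes nearly every key, which is the full dual-write/backfill/verify/cutover migration described above.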

Where This Appears