Consistency Models & Partition Behavior — Staff Interview Quick Reference
The 60-Second Version
- CAP is not "choose 2 of 3." Partitions happen whether you like them or not. During a partition, you choose between availability and consistency. That is the only choice.
- The real question is what you do when there is no partition. PACELC extends CAP: without a partition, the daily tradeoff is latency vs. consistency, and that tradeoff runs on every single request.
- "Eventually consistent" without a bounded staleness number is not a design — it is a hope. Staff engineers quantify: "replicas converge within 200ms in-region."
- Strong consistency (linearizability) means every read sees the most recent completed write, as if the system held a single copy of the data. The cost is latency: quorum or consensus round trips on the critical path of reads and writes.
- Causal consistency preserves happens-before ordering and covers most social and collaboration use cases without the latency penalty of linearizability.
- Read-your-writes is the minimum bar for any user-facing system. Users must see their own mutations immediately; violating this feels like data loss.
- Monotonic reads guarantee users never go back in time. Achievable with sticky sessions or version vectors at the routing layer.
What Staff Engineers Say (That Seniors Don't)
| Concept | Senior Response | Staff Response |
|---|---|---|
| CAP theorem | "We pick AP or CP" | "Partitions are a given. The interesting design is the PACELC tradeoff we make on every non-partition request — latency vs. consistency." |
| Eventual consistency | "Replicas sync up eventually" | "Our async replicas lag 10–100ms in-region. We set a bounded staleness SLO of 200ms and alert when p99 exceeds it." |
| Quorum writes | "We write to a majority" | "W + R > N makes every read quorum overlap the latest write quorum, which gives us strong reads. With N=3, the standard choice is W=2, R=2. Dropping to R=1 gives up that overlap in exchange for lower tail latency: acceptable for the activity feed, not for payments." |
| Read-your-writes | "We can use sticky sessions" | "We route the writing user to the primary for 5s post-mutation, then fall back to replicas. Other users tolerate bounded staleness." |
| Cross-region replication | "We replicate across regions" | "100–500ms cross-region lag means we offer causal consistency globally and linearizability only within the home region. Region failover surfaces a staleness window we size into the RPO budget." |
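The read-your-writes row above can be sketched as a small routing rule. This is a minimal illustration, not a production router: `ReadRouter`, its method names, and the in-memory dictionary are hypothetical, and the 5-second window is the figure quoted in the table.

```python
import time

PIN_SECONDS = 5.0  # pin window quoted in the table; tune to observed replica lag


class ReadRouter:
    """Routes a user's reads to the primary for a short window after their write."""

    def __init__(self, pin_seconds=PIN_SECONDS, clock=time.monotonic):
        self.pin_seconds = pin_seconds
        self.clock = clock  # injectable so the window can be tested without sleeping
        self.last_write_at = {}  # user_id -> clock reading at that user's last write

    def record_write(self, user_id):
        """Call on every successful mutation by this user."""
        self.last_write_at[user_id] = self.clock()

    def target_for_read(self, user_id):
        """Return 'primary' inside the pin window, 'replica' otherwise."""
        last = self.last_write_at.get(user_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            return "primary"
        return "replica"
```

Only the writing user pays the primary round trip; every other user keeps reading from replicas under the bounded staleness SLO. In a real deployment the last-write timestamp would more likely travel in a session cookie or token than in process memory, which is an assumption this sketch does not model.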
The Numbers That Matter
- Same-region async replication lag: 10–100ms typical
- Cross-region async replication lag: 100–500ms typical
- Quorum formula: W + R > N (every read quorum overlaps every write quorum), the precondition for strong reads
- Classic quorum config: N=3, W=2, R=2
- Raft/Paxos commit latency: 1–2 RTTs to leader commit
- Bounded staleness SLO (typical): 200ms–1s depending on use case
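The quorum formula can be checked by brute force, which is a useful way to explain why it works. The sketch below uses hypothetical function names and models only quorum membership; it does not model concurrent writes, sloppy quorums, or read repair, so overlap here is a necessary condition for strong reads rather than a full proof of linearizability.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """True when W + R > N, i.e. any read quorum must share a replica with any write quorum."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must each be between 1 and n")
    return w + r > n


def overlap_by_enumeration(n, w, r):
    """Check the same property the slow way: try every write set against every read set."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


def tolerated_failures(n, w, r):
    """Replicas that can be down while both reads and writes still reach their quorum."""
    return n - max(w, r)
```

With N=3, W=2, R=2 both checks return `True` and one replica can fail. With N=3, W=2, R=1 both return `False`: a write acknowledged by replicas 0 and 1 is invisible to a read served by replica 2 alone.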
Common Interview Traps
- Treating CAP as a permanent architecture label. Systems make different consistency choices per operation, not one global setting. A payments path can be CP while a recommendations feed is AP within the same system.
- Ignoring the "no partition" case. Partitions are rare. Interviewers want to hear how you handle the latency-consistency tradeoff that runs on every normal request.
- Saying "eventually consistent" without quantifying staleness. Always attach a number. "How eventual?" is the follow-up that separates Staff answers from Senior ones.
- Forgetting read-your-writes. Proposing an async-replicated system without explaining how the writing user sees their own data is a red flag at Staff level.
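"Always attach a number" implies measuring it. The sketch below shows one way to turn replica lag samples into an SLO check, assuming lag is already collected in milliseconds; the function names and the 200ms default are illustrative, and the percentile uses the nearest-rank method.

```python
def p99(samples_ms):
    """Nearest-rank 99th percentile of a non-empty list of lag samples."""
    if not samples_ms:
        raise ValueError("need at least one lag sample")
    ordered = sorted(samples_ms)
    # ceil(0.99 * n) in integer arithmetic, as a 1-based rank
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def staleness_breached(samples_ms, slo_ms=200):
    """True when p99 replication lag exceeds the bounded staleness SLO."""
    return p99(samples_ms) > slo_ms
```

A single 900ms outlier among 100 samples does not trip the alert, but a second slow sample does, which is the behavior a p99-based SLO is meant to give.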
Consistency Spectrum
| Model | Guarantee | Typical Latency Cost | Use Case |
|---|---|---|---|
| Eventual | Replicas converge "eventually" | Lowest — async replication | Activity feeds, view counters, recommendations |
| Monotonic reads | Once you see version N, you never see N-1 | Low — sticky routing | User browsing history, dashboard stats |
| Read-your-writes | You see your own mutations immediately | Low-medium — route to primary post-write | Profile edits, settings changes, cart updates |
| Causal | If A caused B, everyone sees A before B | Medium — version vectors or hybrid clocks | Social comments, collaborative documents |
| Linearizable | Global total order, latest write always visible | Highest — quorum or consensus | Payments, inventory, account balances |
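The monotonic reads row can also be enforced without sticky routing, by having the session remember the highest version it has seen. The sketch below is one hypothetical shape for that idea: `Copy` and `Session` are illustrative names, versions are assumed to be a single increasing integer per item, and the primary is assumed to be at least as new as anything the session has already read.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    """One copy of an item, on the primary or a replica."""
    version: int
    value: str


class Session:
    """Tracks the newest version this user has seen, so reads never go backwards."""

    def __init__(self):
        self.min_version = 0

    def read(self, replicas, primary):
        # Accept the first replica that is at least as new as what we already saw.
        for replica in replicas:
            if replica.version >= self.min_version:
                self.min_version = replica.version
                return replica.value
        # Every replica is behind this session: pay the primary round trip.
        self.min_version = max(self.min_version, primary.version)
        return primary.value
```

A session that has read version 3 skips a replica still at version 2, while a brand-new session is free to read from it. Real systems typically carry a version vector or log position rather than one integer.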
Per-Operation Consistency
The Staff-level insight is that consistency is chosen per operation, not per system:
| Operation | Consistency Needed | Why |
|---|---|---|
| Check account balance | Linearizable | Showing a stale balance can lead to overdrafts |
| Display friend's post | Eventual (bounded) | 500ms staleness is invisible to users |
| Add item to cart | Read-your-writes | User must see their own cart immediately |
| Update shared document | Causal | Edits must respect happens-before ordering |
| Show "like" count | Eventual | Approximate counts are acceptable |
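The table above translates naturally into a per-operation policy map that a data-access layer could consult. This is a sketch under assumptions: the operation keys and the `consistency_for` helper are hypothetical, and falling back to the strictest level for unclassified operations is one possible design choice, not something the table prescribes.

```python
from enum import Enum


class Consistency(Enum):
    EVENTUAL = "eventual"
    READ_YOUR_WRITES = "read_your_writes"
    CAUSAL = "causal"
    LINEARIZABLE = "linearizable"


# One entry per row of the table; the keys are illustrative operation names.
POLICY = {
    "check_account_balance": Consistency.LINEARIZABLE,
    "display_friend_post": Consistency.EVENTUAL,
    "add_item_to_cart": Consistency.READ_YOUR_WRITES,
    "update_shared_document": Consistency.CAUSAL,
    "show_like_count": Consistency.EVENTUAL,
}


def consistency_for(operation):
    """Look up the level for an operation; unclassified operations fail safe to the strictest."""
    return POLICY.get(operation, Consistency.LINEARIZABLE)
```

Keeping the map explicit makes each relaxation a reviewed decision: an operation only gets weaker guarantees once someone has written down why that is acceptable.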
Practice Prompt
Staff-Caliber Answer Shape
- Accept the staleness for reads. Cross-region replication lag (100–500ms) means readers in Tokyo may see a stale inventory count. This is the PACELC tradeoff: during normal operation, we choose latency over consistency for reads.
- Enforce consistency on writes. The purchase operation routes to the inventory's home region (where the source of truth lives). This adds 150ms for a Tokyo user, but it prevents overselling.
- Optimistic UI with server validation. Show "in stock" based on the local replica. When the user clicks "buy," the request goes to the home region. If the inventory is gone, return a clear error: "Sorry, this item just sold out." This is graceful degradation, not a bug.
- Bounded staleness SLO. Set an SLO: replicas within 500ms of primary. Alert when lag exceeds this. Users see slightly stale counts, but the window is small enough that oversell conflicts are rare.
The Staff move: frame the answer as a product tradeoff. "We trade slightly stale product pages for sub-100ms global reads. The only strongly consistent operation is the purchase itself."
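The purchase path in this answer can be sketched as a check-and-decrement that runs only in the home region. Everything named here is hypothetical (`HomeRegionInventory`, `SoldOut`, the SKU strings), and a process-local lock stands in for whatever the real store provides, such as a conditional write or a transaction.

```python
import threading


class SoldOut(Exception):
    """Raised when the home region has no stock left for a purchase."""


class HomeRegionInventory:
    """Source of truth for stock; replicas elsewhere may lag behind it."""

    def __init__(self, stock):
        self._stock = dict(stock)
        self._lock = threading.Lock()  # stand-in for a conditional write or transaction

    def purchase(self, sku, qty=1):
        """Atomically check and decrement stock; return the remaining count."""
        with self._lock:
            available = self._stock.get(sku, 0)
            if available < qty:
                raise SoldOut("Sorry, this item just sold out.")
            self._stock[sku] = available - qty
            return self._stock[sku]
```

A stale replica can still show "in stock" for the last unit to two shoppers at once. The first `purchase` call succeeds and the second raises `SoldOut`, so the staleness costs one user a clear error message instead of producing an oversold order.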
Additional Traps
- Applying linearizability everywhere "to be safe." Linearizability at global scale means quorum writes across regions — 300ms+ per write. Most operations don't need it. Over-applying it destroys latency for no user-visible benefit.
- Confusing "consistent" with "correct." Eventual consistency is not inconsistency; it is a guarantee that replicas converge once updates stop arriving. The questions are how fast they converge and what readers see during the convergence window.
- Ignoring conflict resolution. If two regions accept conflicting writes during a partition, you need a resolution strategy: last-writer-wins (data loss risk), CRDTs (restricted operations), or manual merge (operational cost).
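Two of the resolution strategies in the last bullet are small enough to contrast directly. The sketch below is illustrative only: the tuple layout for last-writer-wins and the dictionary layout for a grow-only counter (one of the simplest CRDTs) are assumptions, and the region names are made up.

```python
def lww_merge(a, b):
    """Last-writer-wins for writes shaped as (timestamp, replica_id, value).

    The higher timestamp wins, replica_id breaks ties, and the losing write is discarded.
    """
    return max(a, b, key=lambda write: (write[0], write[1]))


def gcounter_merge(a, b):
    """Merge two grow-only counters, each a dict of replica_id -> local increment count.

    Taking the per-replica maximum is commutative, associative, and idempotent,
    so regions can merge in any order and still converge.
    """
    return {rid: max(a.get(rid, 0), b.get(rid, 0)) for rid in a.keys() | b.keys()}


def gcounter_value(counter):
    """The counter's value is the sum of every replica's increments."""
    return sum(counter.values())
```

With last-writer-wins, a write stamped 1003 in one region silently disappears when another region's write is stamped 1005, which is the data loss risk named above. The counter loses nothing, but only because it restricts the operation to increments.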