Dealing with Contention — Cross-Cutting Pattern
The Problem
Multiple writers targeting the same resource at the same time. Two users booking the last seat. Ten thousand carts decrementing the same inventory counter. Contention is not a bug — it's a design constraint. The question isn't whether it happens, but whether you prevent it (avoidance), detect and resolve it (optimistic), or serialize it (pessimistic).
Playbooks That Use This Pattern
- Reservation & Inventory Systems — Seat/room contention under concurrent booking
- Flash Sales & Virtual Queues — Extreme burst contention on limited inventory
- Order Matching & Exchanges — Order book contention at matching time
- Payment Processing — Balance contention during concurrent transactions
- Distributed Consensus — Leader election contention
The Core Tradeoff
| Strategy | What Works | What Breaks | Who Pays |
|---|---|---|---|
| Pessimistic locking | Simple correctness, easy to reason about | Throughput ceiling, deadlock risk, lock convoy | Every request waits — even non-conflicting ones |
| Optimistic concurrency (CAS) | High throughput under low contention | Retry storms when contention spikes | Losers retry — cost grows with conflict rate |
| Queue serialization | Clean ordering, no lost updates | Throughput bound by single consumer | Latency — every request waits in line |
| Reservation / hold pattern | Business-level TTL lock, graceful expiry | Complexity in hold cleanup, phantom inventory | Operational cost of expiry reconciliation |
| Sharded counters | Distributes write contention across N shards | Reads require aggregation, eventual consistency on totals | Read latency and complexity to get accurate counts |
Staff Default Position
Contention avoidance > contention resolution. Before picking a locking strategy, ask: can I reshape the data model so these writers don't collide?
Shard by user. Partition by region. Pre-allocate inventory blocks. Move from a single global counter to per-shard counters that reconcile asynchronously. If contention is truly unavoidable, serialize via queue rather than lock — queues give you backpressure and ordering; locks give you deadlocks and convoys.
When to Deviate
- Low-contention, high-value writes — Optimistic concurrency with CAS is the right call. Retry cost is negligible when conflicts are rare.
- Sub-millisecond latency requirements — Queue serialization adds latency. A sharded lock or CAS may be the only option that meets SLOs.
- Strict global ordering — If the business requires total order (e.g., exchange matching), a single serialization point is the honest answer. Acknowledge the throughput ceiling and design around it.
- User-facing holds — Reservation/hold pattern is the right abstraction when the business already thinks in terms of "hold this for me while I decide." Don't fight the domain model.
Common Interview Mistakes
| What Candidates Say | What Interviewers Hear | What Staff Engineers Say |
|---|---|---|
| "We'll use distributed locks" | "I haven't thought about failure modes" | "Distributed locks have partial failure semantics — I'd prefer CAS or queue serialization unless we need mutual exclusion across services" |
| "Optimistic locking solves this" | "I haven't considered high-contention scenarios" | "Optimistic concurrency works at low contention. Past a threshold, retry storms make it worse than serialization" |
| "Just use a database transaction" | "I'm hoping the database handles it" | "I need to name the isolation level and understand what anomalies it permits under this workload" |
| "We'll lock the row" | "Single-strategy thinking" | "Before locking, can I shard the data so these writes don't compete? Contention avoidance beats contention resolution" |
| "Redis SETNX for the lock" | "No TTL, no fencing, no plan for crashes" | "If we must lock, we need TTL, fencing tokens, and a clear owner-crash recovery path" |
Implementation Deep Dive
1. Optimistic Concurrency with CAS — Redis WATCH/MULTI/EXEC
Optimistic concurrency assumes conflicts are rare. Instead of acquiring a lock before the operation, you attempt the operation and detect if another writer modified the data between your read and write. If so, you retry. Redis provides this natively via WATCH.
Redis CAS Pattern
```
function decrementInventory(itemId):
    maxRetries = 5
    for attempt in range(maxRetries):
        redis.WATCH("inventory:" + itemId)       # Mark key for monitoring
        current = int(redis.GET("inventory:" + itemId))
        if current <= 0:
            redis.UNWATCH()
            return { success: false, reason: "out_of_stock" }
        # MULTI/EXEC is atomic only if the WATCH'd key is unchanged
        tx = redis.MULTI()
        tx.SET("inventory:" + itemId, current - 1)
        result = tx.EXEC()
        if result != null:                       # Transaction succeeded
            return { success: true, remaining: current - 1 }
        # result == null means another writer changed the key — retry
        metrics.increment("cas.retry", tags=["item:" + itemId])
    return { success: false, reason: "contention_exceeded" }
```
How it works: WATCH tells Redis to track a key. If any client modifies that key between WATCH and EXEC, the transaction aborts (returns null). The client retries with fresh data. This is lock-free — no other client is blocked.
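The same read-check-write-retry structure can be shown without Redis. Below is a minimal in-memory sketch of optimistic concurrency: a versioned value whose writes succeed only if the version observed at read time is still current. `VersionedValue` and its methods are our illustration, not any library's API; a failed `compare_and_set` plays the role of a null `EXEC` result.

```python
import threading

class VersionedValue:
    """In-memory stand-in for a WATCHed key: a write succeeds only if
    the version seen at read time is still the current version."""
    def __init__(self, value):
        self._lock = threading.Lock()
        self._value = value
        self._version = 0

    def read(self):
        with self._lock:
            return self._value, self._version

    def compare_and_set(self, expected_version, new_value):
        with self._lock:
            if self._version != expected_version:
                return False          # another writer got there first
            self._value = new_value
            self._version += 1
            return True

def decrement_inventory(item, max_retries=5):
    for _ in range(max_retries):
        current, version = item.read()
        if current <= 0:
            return {"success": False, "reason": "out_of_stock"}
        if item.compare_and_set(version, current - 1):
            return {"success": True, "remaining": current - 1}
        # CAS failed: loop re-reads fresh state, mirroring the Redis retry
    return {"success": False, "reason": "contention_exceeded"}
```

Note that no thread ever blocks on another: losers simply observe a stale version and retry, which is exactly the cost profile the tradeoff table above describes.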
When CAS Breaks Down
| Conflict Rate | Avg Retries/Request | Effective Throughput | Verdict |
|---|---|---|---|
| 1% | 1.01 | ~99% | CAS is ideal |
| 5% | 1.05 | ~95% | CAS works well |
| 10% | 1.11 | ~85% | Monitor closely |
| 20% | 1.25 | ~60% | Switch to serialization |
| 50% | 2.0 | ~30% | CAS is actively harmful |
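The retries column follows a geometric expectation: if each attempt independently conflicts with probability p, the expected number of attempts per request is 1/(1 - p). A quick check reproduces the table's numbers (the throughput column degrades faster than 1/attempts because, under real contention, retries themselves collide and add server load; that part is an empirical judgment, not this formula):

```python
def expected_attempts(conflict_rate: float) -> float:
    """Mean of a geometric distribution: expected attempts per request
    when each attempt independently conflicts with probability p."""
    return 1.0 / (1.0 - conflict_rate)

for p in (0.01, 0.05, 0.10, 0.20, 0.50):
    print(f"{p:.0%} conflict rate -> {expected_attempts(p):.2f} avg attempts")
```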
2. Pessimistic Locking — PostgreSQL Advisory Locks
When correctness requires mutual exclusion and you cannot tolerate retries, pessimistic locking is the honest choice. PostgreSQL advisory locks are application-level locks that don't interfere with row-level locking.
Advisory Lock Pattern
```
function processPayment(accountId, amount):
    lockKey = hashToInt64("payment:" + accountId)   # Advisory locks use bigint keys
    # Try to acquire the lock (non-blocking variant)
    acquired = db.query("SELECT pg_try_advisory_lock($1)", lockKey)
    if not acquired:
        return { success: false, reason: "concurrent_payment_in_progress" }
    try:
        balance = db.query("SELECT balance FROM accounts WHERE id = $1 FOR UPDATE", accountId)
        if balance < amount:
            return { success: false, reason: "insufficient_funds" }
        db.query("UPDATE accounts SET balance = balance - $1 WHERE id = $2", amount, accountId)
        db.query("INSERT INTO transactions (account_id, amount, type) VALUES ($1, $2, 'debit')",
                 accountId, amount)
        db.commit()
        return { success: true, newBalance: balance - amount }
    finally:
        db.query("SELECT pg_advisory_unlock($1)", lockKey)   # Always release
```
Why advisory locks over SELECT ... FOR UPDATE: Advisory locks span transactions. A FOR UPDATE lock releases at commit, but an advisory lock can be held across multiple queries, API calls, or even external service calls. Use FOR UPDATE for single-transaction correctness; use advisory locks when the critical section spans more than one transaction.
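Advisory lock keys are 64-bit integers, so string resource names must be hashed down to one. A sketch of the `hashToInt64` helper the pseudocode assumes (the name comes from the pseudocode; the SHA-256 choice is ours), folding a digest into PostgreSQL's signed bigint range:

```python
import hashlib

def hash_to_int64(name: str) -> int:
    """Map a resource name to a signed 64-bit key for pg_advisory_lock.

    PostgreSQL bigint is signed, so reinterpret the top bit as
    two's complement to land in [-2**63, 2**63 - 1].
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    unsigned = int.from_bytes(digest[:8], "big")
    return unsigned - 2**64 if unsigned >= 2**63 else unsigned
```

Collisions across distinct names are possible: two resources that hash to the same key serialize against each other. That is safe (over-locking, never under-locking) but can show up as unexplained waiting.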
Deadlock Prevention Rules
| Rule | Implementation | Cost of Violation |
|---|---|---|
| Lock ordering | Always acquire locks in ascending key order | Deadlock requiring timeout + retry |
| Lock timeout | Set lock_timeout = 5s at session level | Indefinite hang consuming a connection |
| Lock scope minimization | Hold lock only during the critical section, not the whole request | Throughput degradation proportional to non-critical work |
| Lock monitoring | Track pg_stat_activity for wait_event = 'advisory' | Silent throughput cliff with no alert |
3. Queue Serialization — Kafka Single-Partition Ordering
When contention is high and correctness requires ordering, serialization through a queue eliminates all concurrency on the contended resource. Every mutation goes through a single-partition topic; the consumer processes them one at a time.
Kafka Serialization Pattern
```
# Producer: route all mutations for a resource to the same partition
function submitOrderBookUpdate(orderId, symbol, action):
    producer.send(
        topic = "order-mutations",
        key   = symbol,    # All orders for AAPL go to one partition
        value = serialize({ orderId, symbol, action, timestamp: now() }),
        acks  = "all"
    )
    return { status: "accepted", position: "queued" }

# Consumer: single-threaded processing per partition
function orderMutationConsumer(partition):
    for record in consumer.poll():
        mutation = deserialize(record.value)
        # Process in strict order — no concurrent access to order book
        match mutation.action:
            case "place":
                orderBook.place(mutation)
            case "cancel":
                orderBook.cancel(mutation.orderId)
            case "match":
                orderBook.executeMatch(mutation)
        consumer.commitOffset(record)
```
Why this works: Kafka guarantees ordering within a partition. By routing all mutations for the same resource (e.g., stock symbol) to the same partition, you get total ordering without locks, without retries, and without CAS failures.
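The key-to-partition routing can be sketched as a stable hash. Kafka's default partitioner actually uses murmur2 over the key bytes; the MD5 stand-in below is our simplification with the one property the pattern relies on: equal keys always map to the same partition.

```python
import hashlib

NUM_PARTITIONS = 12  # illustrative topic size, not a Kafka default

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stable key -> partition mapping. Because the mapping is a pure
    function of the key, every mutation for one symbol lands on one
    partition, and per-partition ordering becomes total ordering
    for that symbol."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

One caveat worth naming in an interview: changing `num_partitions` remaps keys, so in-flight ordering guarantees only hold while the partition count is fixed.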
Throughput Ceiling and Escape Hatch
A single Kafka partition processes ~50K-100K messages/sec with a single consumer. That is the hard ceiling per contention domain.
To scale beyond this:
- Partition by sub-resource — instead of one partition for all AAPL orders, partition by price band: `AAPL:100-110`, `AAPL:110-120`. Cross-partition matching requires a coordination step.
- Batch processing — the consumer processes 100 mutations in a single database transaction, amortizing commit overhead.
- Accept eventual consistency — if strict ordering is not required for all operations (e.g., a cancel does not need to be ordered relative to an unrelated place), split into separate topics with different ordering guarantees.
4. Sharded Counters — Distributed Increment with Periodic Aggregation
When thousands of writes target a single counter (likes, views, inventory), shard the counter across N sub-counters. Each writer increments a random shard; reads aggregate across all shards.
Redis Sharded Counter
```
SHARD_COUNT = 16

function incrementCounter(entityId):
    shard = random(0, SHARD_COUNT - 1)
    key = "counter:" + entityId + ":shard:" + shard
    redis.INCR(key)

function getCounter(entityId):
    keys = ["counter:" + entityId + ":shard:" + i for i in range(SHARD_COUNT)]
    values = redis.MGET(keys)
    return sum(int(v or 0) for v in values)

# Periodic aggregation (background job every 30s)
function aggregateCounter(entityId):
    total = getCounter(entityId)
    redis.SET("counter:" + entityId + ":total", total)
    # Optionally reset shards after aggregation for bounded memory
```
The tradeoff: Writes hit random shards — no contention, O(1) per increment. Reads aggregate across N shards — O(N) per read, eventually consistent with the true count. The :total key provides a cached aggregate for fast reads, refreshed every 30 seconds.
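The same shape is easy to exercise in memory. This is a sketch, not the Redis implementation: `ShardedCounter` is our name, per-shard `threading.Lock`s stand in for Redis's atomic `INCR`, and writers pick a random shard exactly as above.

```python
import random
import threading

class ShardedCounter:
    """N independent sub-counters: writers touch one at random,
    readers sum all of them (the MGET + sum from the Redis version)."""
    def __init__(self, shard_count: int = 16):
        self._shards = [0] * shard_count
        self._locks = [threading.Lock() for _ in range(shard_count)]

    def increment(self, amount: int = 1) -> None:
        i = random.randrange(len(self._shards))
        with self._locks[i]:   # roughly 1/N the contention of one counter
            self._shards[i] += amount

    def total(self) -> int:
        # O(N) aggregate read; under concurrent writes this is a
        # point-in-time approximation, like the Redis version.
        return sum(self._shards)
```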
When to Aggregate vs. Read All Shards
| Use Case | Strategy | Latency | Consistency |
|---|---|---|---|
| Display "1.2M likes" on a post | Read :total (cached) | <1ms | ~30s stale |
| Check if inventory > 0 for checkout | Read all shards live | ~2ms for 16 shards | Real-time |
| Analytics dashboard counts | Read :total (cached) | <1ms | ~30s stale |
| Flash sale "X remaining" display | Read all shards live | ~2ms | Real-time |
Architecture Diagram
Selection logic: The application tier selects the contention strategy based on the resource type and measured conflict rate. CAS for low-contention key-value updates. Advisory locks for payment-critical mutual exclusion. Queue for order book mutations where ordering matters. Sharded counters for high-volume increment workloads.
Failure Scenarios
1. CAS Retry Storm During Flash Sale
Timeline: Flash sale begins at 12:00:00. 50K users attempt to decrement inventory of 1,000 items simultaneously. CAS conflict rate exceeds 80%. Each request retries 3-5 times before succeeding or exhausting retries. Effective load is 200K+ Redis operations in 10 seconds.
Blast radius: Redis CPU saturates at 90%. All CAS operations — not just inventory — experience elevated latency. Unrelated cache lookups sharing the same Redis cluster degrade. App server thread pools fill with retrying requests.
Detection: cas.retry rate per key spikes beyond configured threshold. Redis CPU utilization exceeds 80%. App server thread pool utilization exceeds 90%.
Recovery:
- Immediate: switch inventory operations from CAS to queue serialization (feature flag)
- Short-term: pre-shard inventory before the sale — allocate 100 units to each of 10 shards so writers do not collide
- Long-term: implement a virtual queue system that serializes access before reaching the inventory layer
2. Advisory Lock Leak After Application Crash
Timeline: An app server acquires pg_advisory_lock(42) for a payment operation, then crashes mid-transaction. If the PostgreSQL connection is truly severed, the session ends and the session-level lock is released automatically. But behind a connection pooler, the server-side session outlives the crashed client: the pooler hands the same session to the next request, the owning session never disconnects, and the lock is never released.
Blast radius: All subsequent payment operations for the same account wait indefinitely (if using blocking locks) or fail immediately (if using non-blocking). One crashed request blocks all future requests for that resource.
Detection: Query pg_locks WHERE locktype = 'advisory' and correlate with active PIDs. Alert when advisory lock hold time exceeds 30 seconds (normal payment processing completes in <5 seconds).
Recovery:
- Immediate: identify the leaked lock via `pg_locks`, then terminate the orphaned session with `pg_terminate_backend(pid)`
- Short-term: switch to transaction-level advisory locks (`pg_advisory_xact_lock`), which auto-release on transaction end — even if the application forgets to unlock
- Long-term: wrap all advisory lock usage in a library that enforces timeout and automatic cleanup
3. Queue Consumer Lag Creating Phantom Inventory
Timeline: Order mutations are serialized through Kafka. The consumer falls behind — lag grows to 50K messages. During the lag window, the database shows 200 units of inventory, but 180 of those have pending decrements in the queue. The website displays "200 available" while only 20 are actually available.
Blast radius: Customers purchase items that are already sold. The system must handle 180 over-commitments — refunds, apology emails, and trust damage.
Detection: Consumer group lag monitoring. Threshold alert when lag exceeds N seconds of processing time (not just message count — a 10K message lag at 100K messages/sec throughput is only 100ms, which is fine).
Recovery:
- Immediate: pause accepting new orders until consumer catches up
- Short-term: add a "pessimistic available" counter that is decremented at write time (producer side) — may undercount but never overcount
- Long-term: maintain a real-time shadow counter that is decremented synchronously at order acceptance time, independently of the queue. The queue handles order processing, not inventory accounting.
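The "pessimistic available" counter from the short-term fix can be sketched as follows: decrement synchronously at order acceptance, before the mutation ever enters the queue. The class and method names are ours for illustration. It may undercount (an order that later fails downstream holds a unit until reconciliation restores it) but it can never report more stock than actually remains, which is the property that prevents overselling.

```python
import threading

class PessimisticAvailability:
    """Decremented at accept time, on the producer side of the queue.
    Undercounts until reconciliation, but never overcounts."""
    def __init__(self, initial: int):
        self._available = initial
        self._lock = threading.Lock()

    def try_reserve(self) -> bool:
        with self._lock:
            if self._available <= 0:
                return False   # reject before enqueueing the mutation
            self._available -= 1
            return True

    def restore(self, units: int = 1) -> None:
        # Reconciliation path: return units for orders that failed downstream.
        with self._lock:
            self._available += units
```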
Staff Interview Application
How to Introduce This Pattern
Lead with data model first, concurrency control second. This signals that you think in systems, not in locks.
When NOT to Use This Pattern
- No concurrent writers exist: If updates are single-threaded (batch jobs, cron tasks), adding concurrency control is pure overhead.
- Idempotent operations: If the operation is naturally idempotent (SET, not INCREMENT), concurrent writes produce the same result regardless of ordering. Last-writer-wins may be acceptable.
- Read-heavy workload: If the contention is on reads (thundering herd on cache miss), this pattern does not apply — look at Scaling Reads instead.
- Single-digit QPS: If you have 5 writes/sec to the same resource, a simple `FOR UPDATE` row lock is the right answer. Don't over-engineer.
Follow-Up Questions to Anticipate
| Interviewer Asks | What They Are Testing | How to Respond |
|---|---|---|
| "What happens under high contention with CAS?" | Understanding of failure modes | "Retry storms. At 20%+ conflict rate, retries compound — each retry itself may conflict. I'd set a retry budget of 3 attempts and fail fast, or switch to queue serialization." |
| "How do you handle deadlocks?" | Practical lock management | "Lock ordering prevents deadlocks: always acquire locks in ascending key order. For detection, set lock_timeout so deadlocked transactions fail instead of hanging." |
| "How do sharded counters handle reads?" | Read-consistency tradeoffs | "Aggregate across all shards for real-time reads, or read a periodically-updated total for approximate reads. The choice depends on whether the consumer needs real-time accuracy or display-quality approximation." |
| "What is a fencing token?" | Distributed systems depth | "A monotonically increasing value assigned when a lock is acquired. The protected resource rejects operations with stale fencing tokens. This prevents split-brain where two processes both believe they hold the lock." |
| "Why not use distributed locks with Redis?" | Understanding of Redlock limitations | "Redis-based distributed locks (Redlock) have partial failure modes — clock skew, GC pauses, and network partitions can all cause two clients to hold the same lock. I'd prefer queue serialization for safety-critical operations." |
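The fencing-token answer in the table can be made concrete with a short sketch (our illustration, assuming a single in-process lock service): the lock service issues a strictly increasing token with each grant, and the protected resource rejects any write carrying a token older than the newest it has seen. A process that held the lock, paused through a GC or partition, and lost it cannot clobber state afterward.

```python
import itertools

class LockService:
    """Issues a fresh, strictly increasing fencing token per lock grant."""
    def __init__(self):
        self._tokens = itertools.count()

    def acquire(self) -> int:
        return next(self._tokens)

class FencedResource:
    """Rejects writes whose fencing token is older than the newest seen,
    so a stale ex-holder's delayed write is fenced off."""
    def __init__(self):
        self._highest_token = -1
        self.value = None

    def write(self, token: int, value) -> bool:
        if token < self._highest_token:
            return False          # stale holder: operation rejected
        self._highest_token = token
        self.value = value
        return True
```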