Why This Matters
Every distributed system is a network conversation. Your microservices, databases, caches, and users communicate over a network that introduces latency, packet loss, and bandwidth constraints. The architecture decisions you make — where to terminate TLS, how to pool connections, which protocol to use between services — are all network decisions. And yet, most system design candidates treat the network as an invisible wire that moves data instantly between boxes on their whiteboard.
Staff-level interviewers probe networking because it exposes a candidate's depth of operational experience. A senior engineer might say "we add a cache to reduce latency." A Staff engineer says "our latency budget is 200ms, and cross-region RTT alone consumes 150ms. Caching must happen at the edge, or we've blown the budget before application logic runs." That response demonstrates that the candidate understands latency budgets, the physics of network propagation, and the constraint that geography imposes on architecture.
This guide teaches you the networking concepts that actually appear in system design interviews: RTT and latency budgets, connection lifecycle costs, protocol selection, and the tail-latency amplification that kills microservice architectures. You will not need to explain TCP congestion control in detail — but you will need to explain why connection pooling matters and when HTTP/3 is the right choice.
The 60-Second Version
- RTT is the fundamental design constraint. Every network decision is a latency budget decision. Same-AZ: ~0.5ms. Same-region: ~1ms. Cross-continent: ~150ms. These numbers shape every architecture choice.
- DNS TTL is a staleness injection point. DNS-based load balancing means clients cache resolved IPs. A 60s TTL means up to 60s of traffic to a dead backend after failover.
- TLS has a connection tax. TLS 1.3 costs 1 RTT; TLS 1.2 costs 2 RTTs. Connection reuse (keep-alive, pooling) eliminates this cost for all subsequent requests.
- TCP slow start throttles new connections. The first ~14KB trickles through a fresh connection. Connection pooling amortizes this ramp-up across many requests.
- HTTP/2 multiplexing solves HTTP head-of-line blocking but not TCP's. A single dropped packet stalls all streams on that connection. HTTP/3 (QUIC) over UDP eliminates this.
- Connection lifecycle cost drives architecture. Where you terminate TLS, where you pool connections, and where you place proxies are the decisions that determine your tail latency.
- Tail-latency amplification is the microservice killer. P99 of N sequential service calls is worse than the p99 of any single call. The fix is architectural (parallelize, batch, eliminate), not operational.
How Networking Works in Distributed Systems
The Latency Budget Model
Every user-facing request has a latency budget — the maximum time the user will wait before the experience feels broken. For a web application, this is typically 200–500ms. For a real-time system, it might be 50ms. For batch processing, it might be 5 seconds.
The network consumes a portion of this budget before your code runs. Understanding how much it consumes is the first step in any architecture decision:
| Network Hop | Latency Cost | Budget Consumed (of 200ms) |
|---|---|---|
| Client → CDN edge | 5–20ms | 10% |
| CDN → origin (same region) | 1–2ms | 1% |
| Service → Service (same AZ) | 0.5ms | 0.25% |
| Service → Service (cross-AZ) | 1–2ms | 1% |
| Service → Database | 0.5–2ms | 1% |
| Cross-region replication | 50–150ms | 75% |
The key insight: A single cross-region hop can consume 75% of your latency budget. This is why multi-region architectures require local reads. It's not an optimization — it's a mathematical necessity.
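The table's arithmetic can be sketched directly. A minimal budget decomposition (the hop latencies below are illustrative midpoints of the ranges above, not measurements):

```python
# Decompose a 200ms latency budget across network hops.
# Hop latencies are illustrative midpoints of the table's ranges.
BUDGET_MS = 200

def budget_report(hops: dict, budget_ms: float = BUDGET_MS) -> dict:
    """Return the percentage of the budget each hop consumes."""
    return {name: round(100 * ms / budget_ms, 2) for name, ms in hops.items()}

same_region_path = {
    "client->cdn_edge": 10.0,
    "cdn->origin": 1.5,
    "service->service": 0.5,
    "service->db": 1.0,
}

report = budget_report(same_region_path)
network_total = sum(same_region_path.values())   # ~13ms: 6.5% of budget
cross_region_pct = 100 * 150 / BUDGET_MS         # a single 150ms hop: 75%
```

The point the numbers make: the entire same-region path costs single-digit percentages, while one cross-region hop consumes three quarters of the budget on its own.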
Connection Lifecycle
Every network connection has a startup cost that most candidates ignore. Understanding this cost explains why connection pooling is not an optimization — it's a requirement.
A fresh HTTPS connection requires:
- DNS resolution: 1–50ms (cached: <1ms, cold: 50ms+)
- TCP handshake: 1 RTT (SYN → SYN-ACK → ACK)
- TLS handshake: 1 RTT for TLS 1.3, 2 RTTs for TLS 1.2
- First data transfer: Limited by TCP slow start (~14KB initial window)
Total cost for a fresh cross-region connection (150ms RTT, TLS 1.3): 2 RTTs ≈ 300ms of handshakes before the request can even be sent, with slow start then throttling the response itself.
A reused connection (keep-alive or pooled): 0 RTTs overhead. Data flows immediately at the full congestion window.
```text
# Fresh connection (cross-region, 150ms RTT):
DNS:        0ms   (cached)
TCP:        150ms (1 RTT)
TLS 1.3:    150ms (1 RTT)
Slow start: 450ms (3 RTTs to ramp up for a 200 KB response)
Total:      750ms before the full response is delivered

# Pooled connection:
Overhead:   0ms
Data:       150ms (1 RTT for request/response)
Total:      150ms
```
This 5x difference is why connection pooling is mandatory for any latency-sensitive path.
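A minimal model of the same arithmetic, assuming TLS 1.3 and a congestion window that doubles from ~14 KB each round (as in the example above, the request is assumed to ride out as soon as the handshake completes, so the first response window costs no extra round trip):

```python
INITIAL_WINDOW_KB = 14  # ~10 segments x 1,460 bytes

def slow_start_rtts(response_kb: float, iw_kb: float = INITIAL_WINDOW_KB) -> int:
    """Extra RTTs beyond the first window needed to deliver response_kb,
    with the congestion window doubling each round."""
    sent, window, rtts = 0.0, iw_kb, 0
    while sent + window < response_kb:
        sent += window
        window *= 2
        rtts += 1
    return rtts

def fresh_connection_ms(rtt_ms: float, response_kb: float, tls_rtts: int = 1) -> float:
    """TCP handshake + TLS handshake + slow-start ramp (simplified model)."""
    return (1 + tls_rtts + slow_start_rtts(response_kb)) * rtt_ms

def pooled_connection_ms(rtt_ms: float) -> float:
    """Warm connection: full congestion window, one request/response RTT."""
    return rtt_ms

fresh = fresh_connection_ms(rtt_ms=150, response_kb=200)  # 750.0
pooled = pooled_connection_ms(rtt_ms=150)                 # 150.0
```

Swapping in `tls_rtts=2` shows the TLS 1.2 penalty: one more full RTT on every fresh connection.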
TCP Slow Start
A fresh TCP connection starts with an initial congestion window (IW) of 10 segments × 1,460 bytes = ~14 KB. After each RTT, the window roughly doubles:
| RTT # | Congestion Window | Cumulative Data |
|---|---|---|
| 0 | 14 KB | 14 KB |
| 1 | 28 KB | 42 KB |
| 2 | 56 KB | 98 KB |
| 3 | 112 KB | 210 KB |
| 4 | 224 KB | 434 KB |
Implication: A 200 KB API response on a fresh connection takes 3 RTTs just to deliver the payload — on top of the TLS handshake. On a cross-region connection (150ms RTT), that's 450ms of slow-start overhead alone. Connection pooling eliminates this entirely because the congestion window is already warm.
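The table above follows directly from the doubling rule; a quick sketch that reproduces it:

```python
def congestion_windows(rounds: int, iw_kb: int = 14) -> list:
    """Rows of (rtt_number, window_kb, cumulative_kb) under slow-start doubling."""
    rows, window, cumulative = [], iw_kb, 0
    for rtt in range(rounds):
        cumulative += window          # data delivered this round
        rows.append((rtt, window, cumulative))
        window *= 2                   # window doubles each RTT
    return rows

rows = congestion_windows(5)
# A 200 KB response needs the cumulative column to reach 200 KB: RTT #3.
```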
Head-of-Line Blocking
This concept is the key to understanding protocol selection:
HTTP/1.1: Each connection handles one request at a time. To parallelize, browsers open 6 connections per domain, and each connection pays its slow-start ramp-up independently.
HTTP/2: Multiplexes multiple request/response streams over a single TCP connection. One connection, many concurrent requests. But: if a single TCP packet is lost, all streams on that connection stall while TCP retransmits. This is TCP-level head-of-line blocking.
HTTP/3 (QUIC): Runs over UDP with its own reliability per stream. A lost packet only stalls the stream it belongs to. Other streams continue unimpeded. This is why HTTP/3 wins on lossy networks (mobile, Wi-Fi) — the tail-latency improvement comes from eliminating cross-stream blocking.
Protocol Selection
Choosing the right protocol at each layer of your architecture is a Staff-level decision that most candidates skip. Here's the decision framework:
| Boundary | Protocol | Why |
|---|---|---|
| Browser → API Gateway | HTTP/2 over TLS | Broad compatibility, multiplexing, TLS required |
| Mobile → API (lossy network) | HTTP/3 (QUIC) | Independent stream recovery, 0-RTT resumption |
| Service → Service (same region) | gRPC over HTTP/2 | Binary protobuf, streaming, strong typing, efficient |
| Service → Service (cross-region) | gRPC with retries + deadlines | High RTT demands efficient protocol + explicit timeout budgets |
| Real-time bidirectional | WebSocket over TLS | Persistent connection, low per-message overhead |
| Static assets | CDN + HTTP/2 | Edge caching eliminates origin RTT entirely |
| Event streaming | TCP (Kafka, NATS) | High throughput, binary protocol, persistent connections |
gRPC vs REST: The Staff-Level Framing
Do not say "gRPC is faster than REST." Say: "gRPC uses protobuf (binary, ~10x smaller than JSON, with schema enforcement) over HTTP/2 (multiplexed). It is the right choice for internal service-to-service communication where we control both ends and need type safety. REST is the right choice at organizational boundaries — external APIs, partner integrations — where discoverability, tooling, and human readability matter more than wire efficiency."
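The wire-size gap is easy to demonstrate. A sketch using Python's `struct` as a stand-in for a binary encoding (real protobuf uses varints and field tags, so exact sizes differ, but the shape of the comparison holds):

```python
import json
import struct

# A typical internal RPC payload: three 64-bit numeric fields.
record = {"user_id": 18446744, "timestamp_ms": 1700000000000, "balance_cents": 129900}

json_bytes = json.dumps(record).encode()              # field names repeat on every message
binary_bytes = struct.pack("<QQQ", *record.values())  # schema lives in code, not on the wire

# binary: 24 bytes; JSON: ~77 bytes for the same three fields
```

Multiply that per-message difference by billions of RPCs and the serialization choice becomes a capacity decision, which is exactly the gRPC argument.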
The Numbers That Matter
| Metric | Value | Design Implication |
|---|---|---|
| Same-AZ RTT | ~0.5ms | Baseline for microservice call chains; 10 sequential calls = 5ms |
| Same-region, cross-AZ RTT | ~1ms | Cost of synchronous replication; 3 AZs = 2ms for quorum write |
| Cross-continent RTT | ~150ms | Hard floor for global user-facing requests; must cache at edge |
| TLS 1.3 handshake | 1 RTT | Connection reuse eliminates this entirely |
| TLS 1.2 handshake | 2 RTTs | Upgrade to TLS 1.3 or use session resumption |
| TCP slow start initial window | ~14KB | First request on a fresh connection is bandwidth-limited |
| TCP ramp to full speed | ~4–5 RTTs | New connections are slow for the first ~100ms |
| 1 Gbps link | ~125 MB/s theoretical | Plan for ~70% sustained utilization: ~700 Mbps (~87 MB/s) usable |
| DNS resolution (cached) | <1ms | Negligible with warm resolver |
| DNS resolution (cold/miss) | 10–50ms | First request from new container pays this cost |
| WebSocket keepalive overhead | ~50 bytes/30s | Negligible per connection; at 1M connections = 1.6 MB/s |
Tail-Latency Amplification
This is the most important networking concept for microservice architectures, and the one most candidates miss.
When a user request involves N sequential service calls, the user-visible latency is the sum of all calls. But the p99 latency is worse than the sum of individual p99s because you're taking the worst case across multiple independent probabilistic events.
Example: 4 sequential services, each with p99 = 50ms.
```text
P(single call < 50ms) = 0.99
P(all 4 calls < 50ms) = 0.99⁴ = 0.961
→ p99 of the chain ≈ p96 of any single call
→ To keep chain p99 at 200ms, each service needs p99.75 < 50ms
```
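The same probability stacking generalizes to any chain length, assuming the calls are independent:

```python
def chain_percentile(per_call_p: float, n_calls: int) -> float:
    """P(every one of n independent sequential calls beats its latency threshold)."""
    return per_call_p ** n_calls

def required_per_call(target_p: float, n_calls: int) -> float:
    """Per-call percentile needed for the whole chain to hit target_p."""
    return target_p ** (1 / n_calls)

chain = chain_percentile(0.99, 4)      # ~0.961: the chain's "p99" behaves like ~p96
per_call = required_per_call(0.99, 4)  # ~0.9975: each service needs its p99.75 in budget
```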
The fix is architectural, not operational:
| Strategy | Effect |
|---|---|
| Parallelize calls | Max(latencies) instead of Sum(latencies) |
| Batch calls | 1 network hop instead of N |
| Eliminate calls | Cache locally, embed data, denormalize |
| Set explicit deadlines | Each service knows its budget; no unbounded waits |
| Hedged requests | Send to 2 backends, take first response; tail probability is squared (1% → 0.01%) |
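Hedging from the last row can be sketched with a thread pool. Here `call_backend` is a hypothetical stand-in for a real RPC client, and hedging is only safe for idempotent requests:

```python
import concurrent.futures as cf
import random
import time

def call_backend(name: str) -> str:
    """Hypothetical stand-in for a real RPC client call."""
    time.sleep(random.uniform(0.001, 0.005))  # simulated service latency
    return f"response-from-{name}"

def hedged_request(backends: list) -> str:
    """Fire the same idempotent request at every backend; first response wins.
    With two independent backends, P(slow response) drops from p to p**2."""
    pool = cf.ThreadPoolExecutor(max_workers=len(backends))
    futures = [pool.submit(call_backend, b) for b in backends]
    done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
    pool.shutdown(wait=False, cancel_futures=True)  # abandon the slower call
    return next(iter(done)).result()

result = hedged_request(["backend-a", "backend-b"])
```

Production hedging usually delays the second request slightly (send the hedge only if the first hasn't answered within, say, the p95) to avoid doubling backend load.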
Visual Guide
[Diagram: Connection Lifecycle]
[Diagram: DNS Failover Timeline]
How This Shows Up in Interviews
Scenario 1: "Your API latency is too high"
Do not start with code profiling. Say: "First, let me decompose the network path. How many hops? Our request goes client → CDN → LB → app → DB — that's 4 hops. Same-AZ at 0.5ms each: 2ms of network time. But if any hop is cross-region (150ms RTT), that single hop dominates the entire latency budget. Second, are connections pooled? A fresh TLS + TCP connection to the DB adds 2ms same-AZ but 300ms cross-region. Let me check connection reuse before looking at application code." The network topology is always the first diagnostic, not the last.
Scenario 2: "Design for users in multiple regions" (Full Walkthrough)
This tests whether you understand the physics of network propagation. Here's how a Staff engineer works through it:
Step 1 — Quantify the problem. "If our servers are in us-east-1 and a user in Tokyo makes a request, the cross-Pacific RTT is ~150ms. The TCP handshake costs 1 RTT, the TLS 1.3 handshake another, and the request/response a third. That's 450ms minimum before any application logic. Our latency budget is 200ms. The math doesn't work — we cannot serve Tokyo from us-east."
Step 2 — CDN for static content. "Static assets and cacheable API responses go through a CDN with edge PoPs in Tokyo. The user-to-edge RTT drops to 5–10ms. This handles 60–80% of requests — page loads, images, public API responses with cache headers."
Step 3 — Regional deployment for dynamic content. "For user-specific dynamic content (feed, notifications, dashboard), we deploy application servers and read replicas in ap-northeast-1 (Tokyo). Reads are local: user → Tokyo LB → Tokyo app → Tokyo read replica. RTT per hop: 0.5ms. Total: ~3ms for the network path."
Step 4 — The write path is the hard problem. "Writes must reach the primary database. If the primary is in us-east, every write pays 150ms cross-region RTT. Options: (a) async replication with eventual consistency — user sees their write locally, primary catches up within 1–2s. (b) Multi-master with conflict resolution — both regions accept writes, conflicts resolved by timestamp or application logic. (c) Route writes to the primary synchronously and accept 150ms write latency."
Step 5 — Choose based on the product. "For a social media feed, option (a) is correct. Users can tolerate 1–2s before their post is visible globally. For a payment system, option (c) is correct — we accept the write latency because we cannot tolerate conflicts on financial transactions."
Why this is a Staff answer: The candidate starts with physics (RTT), quantifies why the naive approach fails, layers solutions (CDN → regional reads → write strategy), and makes the final choice based on the product's consistency requirements, not a technical preference.
Scenario 3: "Our microservices have unpredictable latency spikes"
This tests tail-latency amplification. Say: "4 sequential services, each with p99 of 50ms. The chain p99 isn't 200ms — it's worse because you're rolling the dice 4 times. P(all 4 under 50ms) = 0.99⁴ = 96.1%, so our chain p99 is actually closer to the p96 of each service. First fix: can any calls be parallelized? Parallel calls are max-of instead of sum-of latencies. Second: set explicit per-hop deadline budgets. Third: are we hedging? Send the same request to two backends, take whichever responds first — that squares the tail probability, turning the 1% slow case into 0.01%."
Scenario 4: "How would you handle a DNS failover?"
This tests operational maturity around DNS TTLs. A 300s TTL means 5 minutes of degraded traffic post-failover. The Staff answer: short TTLs (30–60s) for critical endpoints, client-side retry with fallback IPs, health-check-based routing (Route 53 health checks), and acceptance that DNS failover has an inherent propagation delay — it is not instantaneous.
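Client-side retry with fallback IPs can be sketched at the socket level (the addresses below are illustrative placeholders; a real client would load them from config or a discovery service):

```python
import socket

# Illustrative placeholder endpoints: primary, then fallback.
ENDPOINTS = [("203.0.113.10", 443), ("198.51.100.20", 443)]

def connect_with_fallback(endpoints, timeout_s: float = 0.5) -> socket.socket:
    """Try each endpoint in order with a short timeout, so failover is
    bounded by the timeout rather than by DNS TTL propagation."""
    last_error = None
    for host, port in endpoints:
        try:
            return socket.create_connection((host, port), timeout=timeout_s)
        except OSError as exc:
            last_error = exc  # try the next endpoint
    raise ConnectionError("all endpoints failed") from last_error
```

With a 500ms connect timeout, the client reaches a healthy backend within ~0.5s of a failure, while pure DNS failover clients wait out the remaining TTL.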
In the Wild
Cloudflare: The Edge Network Architecture
Cloudflare operates 300+ edge PoPs worldwide, terminating TLS as close to the user as possible. Their key architectural insight: by terminating TLS at the edge, the expensive handshake (1 RTT) happens over a short path (user to nearest PoP, typically <20ms RTT). The connection from the edge to the origin uses persistent, pre-warmed connection pools over Cloudflare's private backbone — eliminating both TLS overhead and TCP slow start on the origin path.
The Staff-level insight: Cloudflare's architecture is a physical implementation of the connection lifecycle optimization. They've pushed TLS termination to where the RTT is smallest (the edge) and used connection pooling where RTTs are larger (edge to origin). The same principle applies at smaller scale: terminate TLS at your load balancer, not at each application server.
Google: gRPC and the Internal Network
Google built gRPC because their internal network carries tens of billions of RPCs per second across millions of machines. At this scale, the overhead of REST (JSON serialization at 1–5ms, verbose headers, no streaming) is architectural, not incidental. gRPC's protobuf encoding is ~10x smaller and ~10x faster to serialize than JSON. Multiplied by billions of RPCs, this translates to measurable reductions in CPU utilization and network bandwidth.
The Staff-level insight: Google didn't choose gRPC because it's "faster." They chose it because at 10B+ RPCs/sec, the serialization overhead of JSON becomes a significant fraction of their total compute cost. The protocol choice was a capacity planning decision, not a performance optimization. At smaller scale (<10K RPS), the difference is negligible and REST's tooling advantage dominates.
Netflix: The Open Connect CDN
Netflix built its own CDN, Open Connect, with embedded cache servers inside ISP networks worldwide. When a user in São Paulo watches a show, the video streams from a cache box physically inside their ISP's data center — RTT is effectively zero (sub-millisecond). Netflix pre-positions content overnight during off-peak hours, using available bandwidth to fill caches rather than competing with peak-hour traffic.
The Staff-level insight: Netflix's innovation isn't technical — it's operational. They solved the cross-region latency problem not by building faster networks but by moving the data to where the user already is. This is the ultimate expression of "compute at the edge": if the latency budget is too tight for any network architecture, eliminate the network entirely.
Staff Calibration
The sections below are calibration tools for Staff-level interviews. If you already understand networking mechanics, start here to sharpen the framing that separates L5 from L6 answers.
What Staff Engineers Say (That Seniors Don't)
| Concept | Senior Response | Staff Response |
|---|---|---|
| Latency | "We can add a cache to reduce latency" | "Our latency budget is 200ms. Cross-region RTT alone consumes 150ms, so caching must be at the edge or we fail budget before application logic runs" |
| DNS failover | "DNS will route to the healthy region" | "DNS TTL of 300s means 5 minutes of degraded traffic post-failover. We need client-side retry with a fallback IP, or we accept that SLA gap" |
| TLS | "We terminate TLS at the load balancer" | "We terminate TLS at the edge to pay the handshake cost once, then run plaintext inside the VPC to avoid re-encryption overhead per hop" |
| Connection reuse | "We use connection pooling" | "Each new TCP connection costs 1 RTT for handshake plus slow start. A warm pool of 50 connections per backend eliminates that for p99, but we size the pool to avoid file descriptor exhaustion" |
| HTTP/2 vs HTTP/3 | "HTTP/2 is faster because of multiplexing" | "HTTP/2 multiplexing helps, but a single TCP packet loss stalls every stream. For mobile or lossy networks, QUIC gives us independent stream recovery — that is where the real tail latency win lives" |
| Call chain latency | "Each service is fast, so the chain is fast" | "P99 of 4 sequential services is worse than individual p99s due to probability stacking. We parallelize independent calls and set per-hop deadline budgets to cap the chain p99" |
Common Interview Traps
- Ignoring RTT in call chain math. Five sequential microservice calls at 1ms each is 5ms, not "negligible." Cross-region, that same chain is 750ms and your design is broken.
- Treating DNS as instant and reliable. Candidates propose DNS failover without accounting for TTL propagation delay or client-side caching behavior.
- Proposing HTTP/2 as a silver bullet. Multiplexing helps, but TCP head-of-line blocking remains. Interviewers probe whether you understand the layer at which the problem actually lives.
- Forgetting connection lifecycle costs. Adding a new proxy hop means a new TLS termination and TCP slow start unless you explicitly design for connection pooling at that layer.
- Assuming same-region means low latency. Cross-AZ RTT (1ms) × a 10-hop microservice chain = 10ms of pure network overhead before any computation.
- Designing for bandwidth when latency is the constraint. Most microservice payloads are <10 KB. The bottleneck is RTT count, not throughput.
- Ignoring keepalive configuration. HTTP keepalive defaults vary by language and framework. A 5-second idle timeout means connections are frequently re-established under bursty traffic.
- Forgetting DNS resolution latency. Each DNS lookup can add 1–50ms depending on caching. In a fresh container, the first request pays full resolution cost.
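Keepalive and pool settings are framework-specific, but the shape of a connection pool is universal; a minimal sketch (a real pool also needs health checks, idle-timeout eviction, and backpressure when exhausted):

```python
import queue

class ConnectionPool:
    """Minimal fixed-size pool: pay the handshake once per slot,
    then reuse warm connections across many requests."""
    def __init__(self, factory, size: int):
        self._slots = queue.Queue(maxsize=size)
        for _ in range(size):
            self._slots.put(factory())  # handshake cost paid up front

    def acquire(self, timeout_s: float = 1.0):
        return self._slots.get(timeout=timeout_s)  # blocks when pool is exhausted

    def release(self, conn) -> None:
        self._slots.put(conn)

# Usage with a hypothetical factory that counts "handshakes":
handshakes = 0
def fake_connect():
    global handshakes
    handshakes += 1
    return object()  # stand-in for a TCP+TLS connection

pool = ConnectionPool(fake_connect, size=4)
for _ in range(100):       # 100 requests...
    conn = pool.acquire()
    pool.release(conn)     # ...reuse the same 4 warm connections
```

The pool size is the file-descriptor tradeoff from the calibration table: big enough to cover concurrency, small enough not to exhaust descriptors on the backend.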
Practice Drill
Drill: "Your endpoint's p99 latency is 800ms, and each request fans through 4 sequential service calls. Diagnose and fix it."
Staff-Caliber Answer Shape
- Decompose the 800ms. Instrument each hop: what's the RTT between each service pair? Are these same-AZ (0.5ms expected) or cross-AZ (1ms)? Is any leg cross-region?
- Check connection reuse. Are connections being pooled or re-established per request? Four fresh TLS 1.3 handshakes at 1ms each is 4ms — negligible. But four fresh connections at 150ms cross-region is 600ms just in handshakes.
- Measure serialization overhead. Is this JSON over REST (parsing cost) or protobuf over gRPC (binary, fast)? For large payloads, serialization can dominate.
- Look at the dependency graph. Can any of the 4 calls be parallelized? Sequential calls are additive latency; parallel calls are max-of-group latency.
- Check tail latency amplification. P99 of 4 sequential calls is worse than p99 of any single call. If each service has p99 of 200ms, the chain p99 is higher than 200ms due to probability stacking.
The Staff move: Don't start with code profiling. Start with the network topology and ask whether the call chain can be restructured (parallel, batched, or eliminated).
Where This Appears
These playbooks apply networking foundations to complete system design problems with full Staff-level walkthroughs, evaluator-grade rubrics, and practice drills.
- CDN & Edge Caching — Edge PoP architecture, TLS termination at the edge, cache key design, and the operational cost of purge propagation across a global edge network
- Load Balancer — L4 vs L7 load balancing, TCP connection termination, health checking, and why connection-aware routing outperforms round-robin under uneven payload sizes
- API Gateway — TLS termination strategy, connection pooling between gateway and backends, request timeout budgets, and protocol translation (REST → gRPC)
- Service Discovery — DNS-based discovery with TTL tradeoffs, client-side vs server-side discovery, health check propagation latency, and why DNS is not a real-time routing mechanism
- Chat & Messaging — Persistent WebSocket connections at scale, keepalive overhead at 1M+ connections, and the connection-per-user model vs multiplexed protocols