System Design Cheat Sheet
The whole series, compressed. Print it, or skim it the night before.
This page is deliberately plain — no animations, no demos — so it prints cleanly and reads fast.
Every row links back to the full concept if a definition has gone fuzzy.
The 9 Questions That Shape Every Design
Requirements · Scale · Data · APIs · Reliability · Performance · Consistency · Operations · Cost
Ask all nine, every time. The questions never change; the answers build wildly different systems.
The 5-Step Framework
- Clarify requirements — what's in, what's explicitly out
- Estimate scale: users × actions/day ÷ 86,400 ≈ avg/sec, then ×3–4 for peak (shortcut: per-day count ÷ 100,000 ≈ per-second rate)
- Start simple — 3–4 boxes; complexity must be earned
- Find bottlenecks: for every box, ask "what if X dies, gets 10× the traffic, or goes cold?"
- Name trade-offs — every technique's cost, said out loud
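The scale estimate in step 2 can be sketched as a few lines of arithmetic. This is a rough illustration only: the function name `estimate_qps` and the example figures (10 million users, 20 actions each per day) are assumptions, not numbers from the series.

```python
SECONDS_PER_DAY = 86_400


def estimate_qps(users: int, actions_per_user_per_day: float,
                 peak_factor: float = 3.0) -> tuple[float, float]:
    """Return (average, peak) requests per second for a back-of-envelope estimate."""
    avg = users * actions_per_user_per_day / SECONDS_PER_DAY
    return avg, avg * peak_factor


# Hypothetical workload: 10M users, 20 actions per user per day.
avg, peak = estimate_qps(users=10_000_000, actions_per_user_per_day=20)
shortcut = 10_000_000 * 20 / 100_000  # the "divide by 100,000" shortcut

print(f"avg ≈ {avg:,.0f}/s, peak ≈ {peak:,.0f}/s, shortcut ≈ {shortcut:,.0f}/s")
```

For these assumed inputs the exact average is about 2,315 requests per second and the shortcut gives 2,000, which is close enough for sizing a first design.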
The Honest Defaults
Reach for these unless a requirement forces otherwise — and be ready to say which requirement:
| Decision | Default | Move off it only when… |
|---|---|---|
| Architecture | Monolith | Team scaling or per-component scaling demands services |
| Database | SQL (Postgres) | A specific access pattern or scale forces NoSQL |
| API style | REST | Many varied clients (GraphQL) / real-time (WebSockets) / events (Webhooks) |
| Scaling order | Up → out → replicate → shard | Never skip ahead; sharding is the last resort |
| Consistency | Strong where money/safety, eventual elsewhere | — |
| Concept | The one-liner | The trap |
|---|---|---|
| Client–Server | One asks, one answers; the contract is the only fixed thing | Trusting client input or server uptime |
| IP | Every box in your diagram has an address; arrows = machines dialing IPs | Hardcoded IPs in configs and tests |
| DNS | Name → IP, cached at every hop, each with a TTL | "Propagation delay" = old caches expiring; stale /etc/hosts |
| Proxies | Forward shields clients; reverse shields servers (TLS, caching, LB) | App sees proxy's IP unless X-Forwarded-For is honored |
| Latency | Propagation + serialization + processing + queuing | Averages lie — measure p99, not p50 |
| HTTP/HTTPS | Stateless protocol; status codes are a contract; the TLS handshake adds round trips (one in TLS 1.3, two in TLS 1.2) | An endpoint that also answers plain HTTP |
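The latency row's warning about averages can be made concrete with a small percentile calculation. The sample data below is invented for illustration, and `percentile` is a simple nearest-rank helper written for this sketch rather than a library function.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in ms: 97 fast requests and 3 very slow ones.
latencies_ms = [20] * 97 + [2000] * 3

mean = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

print(f"mean={mean:.1f}ms p50={p50}ms p99={p99}ms")
```

With this made-up distribution the mean and the median both look healthy, while the p99 exposes the two-second requests that a small share of users actually hit.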
| Style | Use when | The cost |
|---|---|---|
| REST | Default — CRUD, cacheable, predictable | Over/under-fetching on rich screens |
| GraphQL | Many clients, each wanting different fields | Resolver complexity, no free HTTP caching, N+1 risk |
| WebSockets | Live two-way: chat, presence, dashboards | Stateful — scaling and reconnection are on you |
| Webhooks | Server→server "tell me when X happens" | Needs retries + idempotency keys + event log |
Verb contracts: GET never mutates · PUT/DELETE are idempotent · delivery is at-least-once, so processing must be idempotent (exactly-once in effect).
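One way to get exactly-once effects from at-least-once delivery is to deduplicate on an event ID. The sketch below is a minimal in-memory illustration: the class `WebhookReceiver`, the `event_id` field, and the balance update are all hypothetical, and the dictionary stands in for what would normally be a durable table with a unique constraint.

```python
import threading


class WebhookReceiver:
    """Applies each event at most once, even if the sender redelivers it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._results = {}  # event_id -> stored result (stand-in for a durable table)
        self.balance = 0

    def handle(self, event_id: str, amount: int) -> dict:
        # The lock makes "check, then apply, then record" one atomic step,
        # which closes the race between two concurrent deliveries.
        with self._lock:
            if event_id in self._results:
                return self._results[event_id]  # replay: same answer, no side effect
            self.balance += amount
            result = {"event_id": event_id, "applied": amount}
            self._results[event_id] = result
            return result


receiver = WebhookReceiver()
first = receiver.handle("evt_1", 50)
second = receiver.handle("evt_1", 50)  # redelivery of the same event
print(receiver.balance, first == second)
```

Returning the stored result on a replay keeps the response stable for the sender, so a retry is indistinguishable from the original call.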
| Concept | The one-liner | The trap |
|---|---|---|
| DB types | Relational / document / key-value / graph — pick by access pattern | Asking "which DB is best?" |
| SQL vs NoSQL | ACID + joins vs flexible + horizontal scale | "NoSQL is faster" — it just moves the work into your app |
| Indexing | B-tree, O(log n) reads; slower writes, more storage | Tiny test datasets hide missing indexes — read EXPLAIN |
| Vertical partitioning | Split a wide table by columns / access pattern | The multi-table write is no longer atomic for free |
| Caching | Cache-aside + Redis = the biggest read win | Invalidation, cold-start stampedes, thundering herds |
| Denormalization | Duplicate data to kill joins on read-heavy paths | Copies drift — never propose it without a sync plan |
| Blob storage | Files in S3, URL in the database | Orphans on either side of the seam; "private" URLs that aren't |
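The cache-aside pattern from the caching row can be sketched without any external service. Here a plain dictionary stands in for Redis, and `CacheAside`, `load_from_db`, and the TTL default are assumptions made for the example.

```python
import time


class CacheAside:
    """Read path: check the cache, fall back to the source on a miss, then populate."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value); a dict stands in for Redis

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]  # cache hit
        value = self._load(key)  # cache miss or expired entry: go to the source of truth
        self._entries[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key):
        """Call after a write so the next read reloads fresh data."""
        self._entries.pop(key, None)


# Hypothetical "database" and loader.
db = {"user:1": "Ada"}
cache = CacheAside(load_from_db=lambda key: db[key])

print(cache.get("user:1"))   # loaded from db, then cached
db["user:1"] = "Grace"
print(cache.get("user:1"))   # still the cached value: this is the invalidation trap
cache.invalidate("user:1")
print(cache.get("user:1"))   # reloaded after invalidation
```

This sketch does nothing about stampedes: if many requests miss at once, each one calls the loader, which is the cold-start problem the trap column names.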
| Concept | The one-liner | The trap |
|---|---|---|
| Vertical | Bigger box; zero code changes | Hard ceiling, single point of failure |
| Horizontal | More boxes behind a load balancer | Servers must be stateless (sessions → shared store) |
| Load balancer | Round robin / least-connections / weighted + health checks | The LB itself is a SPOF until it's redundant |
| Replication | Writes → primary, reads → replicas; failover via promotion | Lag; async failover can lose acked writes |
| Sharding | Split by rows to scale writes (horizontal partitioning) | Hot shards, cross-shard queries, `hash % N` reshuffling when N changes |
| Consistent hashing | Ring placement — adding a shard moves only ~1/N of keys | Skipping it and migrating everything under load |
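The difference between `hash % N` placement and a hash ring shows up when a shard is added. The sketch below compares the two on synthetic keys; the shard names, the 100 virtual nodes per shard, and the use of SHA-256 as the hash are illustrative assumptions rather than anything the series prescribes.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def mod_shard(key: str, shards: list) -> str:
    return shards[stable_hash(key) % len(shards)]


class HashRing:
    def __init__(self, shards, vnodes=100):
        # Each shard is placed at many points on the ring to even out its share.
        self._points = sorted(
            (stable_hash(f"{shard}#{i}"), shard) for shard in shards for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def lookup(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping at the end.
        i = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._points)
        return self._points[i][1]


def moved_fraction(before, after, keys) -> float:
    return sum(before(k) != after(k) for k in keys) / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
old_shards = ["s0", "s1", "s2", "s3"]
new_shards = old_shards + ["s4"]

mod_moved = moved_fraction(
    lambda k: mod_shard(k, old_shards), lambda k: mod_shard(k, new_shards), keys
)
ring_old, ring_new = HashRing(old_shards), HashRing(new_shards)
ring_moved = moved_fraction(ring_old.lookup, ring_new.lookup, keys)

print(f"mod N moved {mod_moved:.0%} of keys, ring moved {ring_moved:.0%}")
```

Going from 4 to 5 shards, modulo placement should relocate roughly four keys in five, while the ring should relocate roughly one in five, and every key the ring moves lands on the new shard.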
Distributed Systems
| Concept | The one-liner | The trap |
|---|---|---|
| CAP | During a partition: consistency or availability | Sorting databases into boxes — most are tunable |
| PACELC | No partition? You're still trading latency vs consistency | Forgetting that this everyday trade-off bites more often than partitions do |
| CDN | Pull-through edge caches near users | "Deployed but users see the old version" — cache busting |
| Idempotency | Same request N times = same effect; keys make retries safe | The concurrent-retry race in "check then insert" |
| Timeouts/retries | Every call gets a deadline; retry with backoff + jitter | Retry storms — 3 retries = 4× load at the worst moment |
| Circuit breaker | After enough failures, fail fast; probe half-open to recover | A breaker that never closes again |
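The circuit breaker row describes three states: closed, open, and half-open. A minimal sketch of that state machine follows; the class names, threshold, and timeout are assumptions, and this simplified version lets every call through while half-open rather than limiting it to a single probe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker fails fast instead of calling the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # timeout elapsed: allow a probe call
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("failing fast; dependency marked unhealthy")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed half-open probe, or too many failures while closed, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

The success path is what prevents the trap in the table: without the reset on a successful probe, the breaker would stay open forever.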
Architecture Patterns
| Concept | The one-liner | The trap |
|---|---|---|
| Microservices | Independent deploy/scale/teams; each owns its DB | The bugs move into the gaps — contract tests or bust |
| Message queues | Async decoupling + buffering; consumers drain at their pace | At-least-once delivery → consumers must be idempotent; poison messages |
| Rate limiting | Token bucket / sliding window at the gateway, counter in Redis | Per-instance counters = limit × number of servers |
| API gateway | One front door: auth, routing, limits, aggregation | One unprotected route leaks the whole backend |
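The token bucket named in the rate limiting row can be sketched in a few lines. This version keeps its state in process memory, so it is exactly the per-instance counter the trap column warns about; a shared store such as Redis would hold the state in a real gateway. The class name and parameters are assumptions for illustration.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, then a steady `refill_per_second` rate."""

    def __init__(self, capacity: float, refill_per_second: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start full
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False  # caller should answer 429 with a Retry-After header


bucket = TokenBucket(capacity=3, refill_per_second=1.0)
print([bucket.allow() for _ in range(4)])
```

Run behind two servers, each instance would hold its own bucket and the effective limit would double, which is why the counter belongs in a shared store.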
Numbers Worth Knowing Cold
| Number | Meaning |
|---|---|
| ~100ms | Feels instant; ~1s feels sluggish; 3s+ users leave |
| 86,400 ≈ 10⁵ | Seconds per day — per day ÷ 100,000 ≈ per second |
| 3–4× | Peak-to-average traffic multiplier for estimates |
| 100:1 | Typical read:write ratio for feeds/content systems |
| 62⁷ ≈ 3.5T | 7-char Base62 namespace (short codes, IDs) |
| p50 vs p99 | The median lies; the p99 writes the angry review |
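The Base62 figure in the table can be checked directly. The sketch below encodes integers with a 62-symbol alphabet; the alphabet ordering (digits, then lowercase, then uppercase) is an arbitrary choice made for this example, and real short-code services may order it differently.

```python
import string

# 10 digits + 26 lowercase + 26 uppercase = 62 symbols.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def base62_decode(code: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


namespace = 62 ** 7  # number of distinct codes of up to 7 characters
print(f"{namespace:,}")
print(base62_encode(namespace - 1), len(base62_encode(namespace)))
```

The largest 7-character code corresponds to 62⁷ − 1, and the next integer needs an eighth character, which is where the ~3.5 trillion capacity comes from.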
The QA Attack Checklist
The cross-cutting tests that find real system-level defects, in one list:
- Two nodes, not one — statefulness, sticky sessions, and per-instance counters all hide on a single node
- Write here, read there — replication lag and read-your-own-writes, made explicit
- Same request twice (then twice concurrently) — idempotency and its race window
- Kill it mid-flight — the primary under write load, a WebSocket server mid-session, an upload at 50%
- Slow, not dead — inject latency, not errors; watch threads pile up upstream
- Cold everything — empty cache under load, cold CDN region, queue replay after downtime
- The N+1th request: at rate limit boundaries, a clean 429 with Retry-After, never a 500
- The seams — DB row ↔ blob file, service ↔ service contracts, cache ↔ source of truth