Most candidates over-prepare for system design. They buy Alex Xu's book, grind a 40-video YouTube playlist, read 8 blog posts from Meta engineers on the Saga pattern, and walk into their interview bloated with trivia they can't recall under pressure.
The truth: about 14 concepts cover roughly 80% of what you'll be asked in a mid-senior-level system design round at Google, Meta, Amazon, Stripe, Airbnb, or any equivalent. Everything else is either (a) a recombination of these, or (b) so company-specific that you'd only encounter it if interviewing for that exact team.
This guide is the minimum viable reference — built around patterns interviewers actually score on, with trade-offs they actually probe, and a one-page cheat sheet at the bottom. Already comfortable with system design and want to fill in the coding round? Skim our 15 LeetCode patterns guide next.
The Interview Structure You Need to Memorize
Every system-design round — regardless of company — follows roughly the same rhythm. Your interviewer is mentally checking boxes in this order:
- Requirements clarification (5 min) — functional, non-functional, out-of-scope
- Capacity estimation (3–5 min) — QPS, storage, bandwidth, cache size
- High-level API / data model (5 min)
- High-level architecture (5–10 min) — components and their boundaries
- Deep dive on 1–2 components (10–15 min) — where most of the signal lives
- Bottleneck / scale / trade-off discussion (5–10 min) — the grade-A signal
If you burn 20 minutes on requirements and rush the deep dive, you fail even if you "knew" the answer. Pacing is a skill. Practice it with a stopwatch.
The 14 Core Concepts
In rough order of how often they come up:
1. Load Balancing (always)
Every non-trivial system needs it. You should be able to instantly describe:
- L4 (TCP) vs L7 (HTTP) — L7 gives you request-aware routing (e.g., by URL path or cookie); L4 is faster and simpler.
- Round-robin vs least-connections vs consistent-hash — consistent hash is your answer whenever sticky routing matters (cache affinity, session continuity).
- Client-side load balancers (e.g., gRPC) skip an extra hop; good for internal microservices.
2. Caching (always)
Interviewers will push you here. Know:
- Cache placement: client → CDN → reverse proxy → application memory → distributed (Redis/Memcached) → database
- Eviction policies: LRU (default), LFU (when access patterns are skewed), TTL-based (when freshness matters)
- Invalidation strategies: write-through, write-back, write-around, TTL. If asked "what's the hardest problem in CS?" — cache invalidation is one of the two correct answers.
- Cache stampede — when the cached key expires and 10,000 requests hit the DB simultaneously. Mitigate with request coalescing or staggered TTLs.
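Request coalescing is easier to remember once you've seen it in code. This is a minimal single-process sketch (the class name `SingleFlightCache` is made up for illustration): concurrent misses for the same key serialize on a per-key lock, so only the first caller runs the expensive loader and everyone else reads the cached result.

```python
import threading

class SingleFlightCache:
    """Toy request-coalescing cache: on a miss, one caller computes the
    value while concurrent callers for the same key wait for the result."""

    def __init__(self):
        self._values = {}                  # key -> cached value
        self._locks = {}                   # key -> per-key lock
        self._meta_lock = threading.Lock() # guards the locks dict

    def _lock_for(self, key):
        with self._meta_lock:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key, loader):
        if key in self._values:            # fast path: cache hit
            return self._values[key]
        with self._lock_for(key):          # coalesce: one loader per key
            if key not in self._values:    # re-check after acquiring the lock
                self._values[key] = loader()  # only one caller hits the DB
            return self._values[key]
```

In a distributed cache the same idea uses a short-lived Redis lock or a "serve stale while one worker refreshes" policy, but the shape of the answer is identical.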
3. Database Choice: SQL vs NoSQL
The answer is always "it depends" but interviewers want to hear the right factors:
| Use SQL when… | Use NoSQL when… |
|---|---|
| You need multi-row transactions | You have massive write throughput |
| Your schema is stable and relational | Your data is denormalized or document-shaped |
| You need complex JOINs and ad-hoc queries | Access patterns are known and narrow |
| Strong consistency is required | Eventual consistency is acceptable |
| Data volume < 10TB (ballpark) | Need to scale horizontally by design |
Default answer for most systems: Postgres (or MySQL) unless you have a specific reason to deviate. "Always NoSQL because Google uses it" is a fail signal.
4. Sharding / Partitioning
Three strategies, in order of when to use:
- Range-based: simple, but hot spots are deadly (e.g., sharding by user signup date = all new users hit the same shard)
- Hash-based: good distribution, but range queries become expensive or impossible
- Consistent hashing: what you want when nodes join/leave. Always mention virtual nodes to smooth the distribution.
5. Replication & Consistency
- Leader-follower (primary-replica): writes go to the leader, reads can be spread across followers. Read-your-own-writes consistency requires routing to the leader (or a sticky session).
- Multi-leader: conflict resolution is the hard part (last-write-wins loses data; CRDTs preserve it).
- Synchronous vs async replication: sync = no data loss but higher write latency; async = fast writes but possible data loss on failover.
6. CAP Theorem (and Why Interviewers Get It Slightly Wrong)
CAP says: under a network partition, you pick Consistency OR Availability. Partition tolerance isn't optional (networks fail), so the real trade-off is C vs A during a partition; when the network is healthy, you can have both.
- CP systems: Postgres (with synchronous replication), HBase, MongoDB (with majority write concern). They reject writes on the minority side to avoid split-brain.
- AP systems: DynamoDB, Cassandra, Riak. Accept writes on both sides of a partition; reconcile later.
The mature answer: "it depends on the specific operation" — some APIs on a single system can be CP, others AP.
7. Message Queues & Event-Driven Architecture
When you need decoupling, async processing, or ordered events. Know:
- Kafka: log-structured, pull-based consumers, retention for replay. Choose for event-sourcing and high-throughput.
- RabbitMQ / SQS: traditional queue semantics (a message is consumed once and deleted), good for task queues. Note that RabbitMQ pushes to consumers, while SQS consumers poll.
- Pub/Sub: one event, many consumers (fan-out).
- At-least-once vs exactly-once delivery — know that true exactly-once is only achievable with idempotency + dedup on the consumer.
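The idempotency-plus-dedup point above can be sketched in a few lines. This is a toy in-memory version with hypothetical names (`IdempotentConsumer`, `consume`), not any real client library's API; a production system would keep the seen-ID set in Redis with a TTL or enforce it with a database unique constraint.

```python
class IdempotentConsumer:
    """Toy consumer for an at-least-once queue: redelivered messages
    are detected by ID and skipped, so the side effect runs once per ID."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()  # in production: Redis set with TTL, or a DB unique key

    def consume(self, message_id, payload):
        if message_id in self.seen:   # duplicate delivery -> drop it
            return False
        self.handler(payload)         # the side effect (charge card, send email)
        self.seen.add(message_id)     # mark only after success
        return True
```

One caveat worth mentioning in the interview: because the ID is marked only after the handler succeeds, a crash between the two steps still reprocesses the message, which is exactly why the handler itself should be idempotent.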
8. CDNs
Mention them for anything user-facing. Static assets, images, video — all hit a CDN before your origin. Know cache busting (query string vs content-hashed filename), and that dynamic content can also be cached via signed URLs or at the edge.
9. Rate Limiting
Four algorithms in order of complexity/accuracy:
- Token bucket (default — handles bursts gracefully)
- Leaky bucket (smooths traffic to a constant rate)
- Fixed window (simple but has a boundary problem — up to 2x the limit can pass around a window edge)
- Sliding window log (most accurate, highest memory)
Distributed rate limiting needs Redis + atomic operations (INCR + EXPIRE) or a specialized service.
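The default choice, token bucket, fits on a whiteboard. This is a minimal single-node sketch (class and method names are illustrative); the distributed version moves the same state into Redis behind atomic operations.

```python
import time

class TokenBucket:
    """Toy token-bucket limiter: refills `rate` tokens/sec up to `capacity`,
    so bursts up to `capacity` pass while sustained traffic is capped at `rate`."""

    def __init__(self, rate, capacity):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # max burst size
        self.tokens = capacity      # start full
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # refill proportionally to elapsed time, clamped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1        # spend one token for this request
            return True
        return False                # bucket empty: reject or queue
```

A `TokenBucket(rate=100, capacity=500)` allows a 500-request burst, then settles to 100 req/s, which is usually the behavior you want for user-facing APIs.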
10. Consistent Hashing
This is the single concept interviewers most love to quiz on. You should be able to draw the ring on a whiteboard in 60 seconds. Key properties:
- Adding/removing a node only remaps ~1/N of the keys (vs nearly all keys for modulo hashing, where changing N remaps roughly (N−1)/N of them)
- Virtual nodes smooth out load imbalance when N is small
- Used in: DynamoDB, Cassandra, Memcached clients, CDN edge routing
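Since this is the concept interviewers most love to probe, it's worth having the ring in code as well as on the whiteboard. A minimal sketch (the class name `HashRing` and the `vnodes` default are illustrative): each physical node is hashed to many points on the ring, and a key is owned by the first vnode clockwise from its hash.

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring with virtual nodes: each physical node
    is placed at `vnodes` points so load stays even when N is small."""

    def __init__(self, nodes, vnodes=100):
        self.ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            for i in range(vnodes):
                self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        # any stable, well-distributed hash works; md5 keeps the demo simple
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get(self, key):
        """Walk clockwise from the key's hash to the next vnode on the ring."""
        idx = bisect.bisect(self.ring, (self._hash(key),)) % len(self.ring)
        return self.ring[idx][1]
```

The property to call out: `HashRing(["a", "b", "c"]).get("user:42")` is deterministic, and dropping node "b" only moves keys that were on "b" — keys on "a" and "c" stay put.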
11. Quorum-Based Systems
For any eventual-consistency system (DynamoDB, Cassandra), know:
R + W > N     // strong read consistency
W = N         // maximum durability (slow writes)
R = 1, W = 1  // fast but possibly stale
where N = replica count, R = read quorum, W = write quorum.
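The rules above reduce to simple inequalities, which makes them easy to sanity-check. A sketch (the function name is made up for illustration):

```python
def quorum_properties(n, r, w):
    """Classify an (N, R, W) replica configuration per the quorum rules above."""
    return {
        # read and write sets must overlap in at least one replica
        "strongly_consistent_reads": r + w > n,
        # a write can succeed with a replica down only if W < N
        "tolerates_replica_down_on_write": w < n,
        # a read can succeed with a replica down only if R < N
        "tolerates_replica_down_on_read": r < n,
    }
```

For example, the common Cassandra setting N=3, R=2, W=2 gives strong reads while still tolerating one replica down for both reads and writes, which is why it's the textbook default.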
12. Long Polling / WebSockets / Server-Sent Events
For real-time features (chat, notifications, live dashboards). Know when to reach for each:
- Long polling: simple, works behind most proxies; high latency and connection overhead
- SSE: server-to-client only, auto-reconnect, text-based — great for notifications
- WebSocket: full duplex, binary, low latency — chat, games, collaborative editing
13. Search (Elasticsearch / OpenSearch)
When the question involves full-text search, autocomplete, or faceted search, the right answer is Elasticsearch. Know that it's an inverted-index system, not a general database — you still need a system-of-record alongside.
14. Authentication & Authorization
Often skipped but increasingly probed for "real-world" questions:
- Session-based (stateful, cookie-in-Redis) vs JWT (stateless, self-contained)
- OAuth 2.0 flows: authorization code (web apps), PKCE (mobile), client credentials (service-to-service)
- RBAC vs ABAC: role-based is simpler, attribute-based is more flexible for multi-tenant
The Magic Phrases (Memorize These)
Using these phrases correctly is a clear senior-level signal. Interviewers scribble a +1 next to your name every time they hear one:
- "This has a read-to-write ratio of roughly 100:1, so I'm optimizing for reads."
- "I'd denormalize here to avoid joins at query time, at the cost of an extra write path."
- "We can use a materialized view for this query pattern."
- "For this, eventual consistency is acceptable, which lets us scale horizontally."
- "I'd put a bloom filter in front of the DB to avoid hitting it for keys that don't exist."
- "This is a classic thundering-herd problem; I'd solve it with request coalescing."
- "This operation should be idempotent, so we can safely retry."
- "I'd use a write-ahead log here for durability before the commit."
- "Given the hot partition risk, I'd shard on a composite key."
The 3 Things Junior Candidates Miss
If I had to name the three most common unforced errors I see when reviewing candidate transcripts:
1. Jumping to architecture before clarifying requirements
You'll be asked "design Twitter." If you start drawing boxes before asking "are we doing the Twitter-at-Jack-Dorsey scale or are we doing Twitter-for-a-startup-community?", you fail. The first 5 minutes of the interview are about forcing the interviewer to commit to a scope.
2. Skipping the back-of-the-envelope math
"We'll have millions of users" is meaningless. "Assuming 100M DAU, each tweeting an average of 3 times per day, that's 300M writes/day = 3,500 writes/sec average, 35,000 peak" is how you build credibility in 60 seconds. Memorize the powers of two and latency numbers every programmer should know.
3. Saying "I'd use Redis" instead of "I'd use Redis because…"
The answer is never the component; it's the trade-off. "Redis for the cache" is a half-signal. "Redis for the cache because we need single-digit-millisecond reads and TTL-based invalidation, and Redis-Cluster gives us horizontal scale when we outgrow 300GB of RAM per shard" is the full signal.
The One-Page Cheat Sheet
Print this. Tape it to your monitor. Glance at it during the interview.
REQUIREMENTS (5 min — force scope)
• Functional: "What does it DO?"
• Non-functional: latency, QPS, consistency, availability
• Scale: DAU, data volume, read:write ratio
CAPACITY ESTIMATION (3 min — build credibility)
• DAU × requests/user = QPS
• QPS × avg payload = bandwidth
• QPS × seconds × retention = storage
• Cache ≈ 20% of daily data (80/20 rule)
HIGH-LEVEL DESIGN (10 min — boxes + arrows)
Client → LB → API GW → Services → DB + Cache + Queue
CDN → Static Assets
WebSocket / SSE for real-time
DATA MODEL (5 min)
• Entities + relationships
• SQL for relational + transactions
• NoSQL for scale + known access patterns
• Denormalize when read-heavy
DEEP DIVE (15 min — pick ONE component)
• Walk through a user action end-to-end
• Name specific tech (Kafka, Redis, Postgres)
• Justify every choice with a trade-off
SCALE & BOTTLENECKS (10 min — senior signal)
• "What breaks at 10x traffic?"
• Sharding strategy
• Replication + consistency
• Rate limiting + backpressure
CLOSING (2 min)
• Recap the key trade-offs
• Mention what you'd do with more time
• Acknowledge what you're NOT optimizing for
Practice this cheat sheet on a real system-design question
CoPilot Interview's system design mode generates realistic prompts and walks you through structured answers — with complexity analysis and follow-up questions — so you build muscle memory for the actual interview rhythm.
Try System Design Mode Free →
FAQ
How many system design questions should I practice before interviewing?
Quality beats quantity. Do 10 questions with full 45-minute timers, self-recording, and review — that's worth more than 50 surface-level reads. A reasonable set: Twitter, WhatsApp, Uber, Dropbox, TinyURL, Instagram, Yelp, Netflix, YouTube, Rate Limiter. These ten cover every major pattern.
Do I need to memorize actual database internals (B-trees, LSM trees)?
For L4/L5 interviews: know the conceptual difference and when to reach for each (LSM for write-heavy like Cassandra; B-tree for read-heavy/range queries like Postgres). You don't need to implement one. For L6+: yes, be prepared to go deeper on one.
Should I use a specific cloud vendor's services in my answer (AWS S3, GCP Spanner)?
It's fine — and actually expected — for FAANG-adjacent interviews. The interviewer wants to know you've built real systems. But back it with a generic explanation: "I'd use S3 (an object store) for…" so the principle is clear.
What if I don't know a concept the interviewer asks about?
Say so, briefly and confidently. "I haven't worked with distributed consensus directly, but I know Raft is the modern go-to — leader election by term, log replication with quorum writes. Could you help me reason through the specific part you're asking about?" That's a stronger signal than fabrication.
How do I know I'm ready?
You can give a complete answer to a new question you've never seen, inside 45 minutes, without looking at notes, covering all 6 phases (requirements → scale). If you can do that three times in a row on random prompts, you're ready.