Patterns as Named Tradeoffs

Design patterns are not solutions you apply to problems. They are tradeoffs that someone named. Every pattern solves a specific category of problem by accepting a specific category of cost. Using a pattern without understanding the cost is cargo-culting. Understanding the cost and choosing deliberately is engineering.

This session is a reference card. Six patterns that appear frequently in system design interviews, each with a clear description, a use case, and a tradeoff. Bookmark this page. Return to it before interviews.

Patterns are not solutions. They are named tradeoffs. Knowing the pattern means knowing what you gain and what you pay. Everything else is trivia.

Pattern Reference Table

| Pattern | One-Line Description | When to Use | Main Tradeoff |
| Read Replica | Replicate a database to separate read traffic from write traffic | Read-heavy workloads where a single DB server cannot handle the read QPS | Replication lag means reads may return stale data (eventual consistency) |
| CQRS | Use separate models (and often separate stores) for reads and writes | Systems where read and write patterns differ fundamentally in shape or scale | Increased system complexity. Two models must be kept in sync. |
| Event Sourcing | Store state as a sequence of immutable events instead of current-state snapshots | Audit trails, undo/redo, temporal queries, financial systems | Event store grows unbounded. Rebuilding current state from events is expensive. |
| Saga | Coordinate multi-service transactions through a sequence of local transactions with compensating actions | Distributed transactions across microservices where 2PC is too slow or unavailable | No atomicity guarantee. Partial failures require compensating transactions (rollback logic). |
| Strangler Fig | Gradually replace a legacy system by routing traffic to new components one feature at a time | Migrating from monolith to microservices, or replacing any legacy system incrementally | Dual-system overhead during migration. Routing logic adds complexity. |
| Cell-Based Architecture | Partition the system into independent, self-contained cells that each serve a subset of users | Extreme reliability requirements. Blast radius containment. | Cross-cell operations are expensive. Data locality must be carefully managed. |

Pattern 1: Read Replica

The simplest scaling pattern. Your primary database handles all writes. One or more replica databases handle reads. The primary replicates data to replicas asynchronously (or synchronously, at the cost of write latency).

When to reach for it: your application is read-heavy (common: 90% reads, 10% writes), and the primary database is overloaded by read queries. Adding read replicas lets you scale reads horizontally without changing your application architecture significantly.

The cost: replication lag. With asynchronous replication, a write to the primary may take 10 ms to 500 ms (or more, under load) to appear on a replica. During that window, a read from the replica returns stale data, including to the user who just made the write. For many applications (social media feeds, product catalogs), this is acceptable. For others (bank balances, inventory counts), it is not.
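One common way to soften replication lag is to pin a user to the primary for a short window after they write, so they always see their own changes. The sketch below is a minimal illustration under stated assumptions: the class name `ReplicaAwareRouter`, the `pin_seconds` parameter, and the string labels `"primary"` and `"replica"` are hypothetical, and the pin window is an assumed upper bound on lag, not a measured one.

```python
import time


class ReplicaAwareRouter:
    """Sends writes to the primary and reads to a replica, except that a
    user who wrote recently is pinned to the primary for a short window."""

    def __init__(self, pin_seconds=1.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds  # assumed upper bound on replication lag
        self.clock = clock              # injectable so the sketch is testable
        self._last_write = {}           # user_id -> time of most recent write

    def route_write(self, user_id):
        # Every write goes to the primary; remember when this user wrote.
        self._last_write[user_id] = self.clock()
        return "primary"

    def route_read(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            # The replica may not have this user's write yet.
            return "primary"
        return "replica"
```

This trades some read load back onto the primary for read-your-writes behavior. Other users can still see stale data during the lag window, which is the tradeoff the pattern accepts.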

Pattern 2: CQRS (Command Query Responsibility Segregation)

CQRS takes the read replica idea further. Instead of replicating the same data model, you maintain two separate models: one optimized for writes (commands) and one optimized for reads (queries). The write model might be a normalized relational database. The read model might be a denormalized document store, a search index, or a materialized view.

flowchart LR
    subgraph "Write Side (Commands)"
        C["Client"] -->|"POST /orders"| WS["Write Service"]
        WS --> WDB["Write DB<br/>(normalized)"]
        WDB -->|"Events"| EB["Event Bus"]
    end
    subgraph "Read Side (Queries)"
        EB -->|"Project"| RP["Read Projector"]
        RP --> RDB["Read DB<br/>(denormalized)"]
        RDB --> RS["Read Service"]
        RS -->|"GET /orders"| C2["Client"]
    end
    style C fill:#222221,stroke:#c8a882,color:#ede9e3
    style WS fill:#222221,stroke:#6b8f71,color:#ede9e3
    style WDB fill:#191918,stroke:#c8a882,color:#ede9e3
    style EB fill:#191918,stroke:#c47a5a,color:#ede9e3
    style RP fill:#222221,stroke:#6b8f71,color:#ede9e3
    style RDB fill:#191918,stroke:#c8a882,color:#ede9e3
    style RS fill:#222221,stroke:#8a8478,color:#ede9e3
    style C2 fill:#222221,stroke:#8a8478,color:#ede9e3

The write side accepts commands, validates them, and stores the result in the write database. An event is emitted to an event bus. The read side consumes events, projects them into a read-optimized format, and stores that in the read database. Clients query the read side for data.

The cost: you now maintain two data stores, a projection process, and the event bus between them. If the projector fails or falls behind, the read model becomes stale. Debugging inconsistencies between the two models requires tracing through the event pipeline. This complexity is justified only when the read and write patterns are fundamentally different, for example, when writes are transactional and relational, but reads need full-text search across denormalized documents.
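The write side, event bus, projector, and read side can be sketched in memory to make the staleness window concrete. This is an illustrative toy, not a production layout: the class names, the `OrderPlaced` event shape, and the use of a `deque` as a stand-in for the event bus are all assumptions made for the example.

```python
from collections import deque


class OrderWriteModel:
    """Write side: validates commands and records normalized order rows."""

    def __init__(self, bus):
        self.orders = {}  # order_id -> {"customer": str, "total": int}
        self.bus = bus    # stand-in for the event bus

    def place_order(self, order_id, customer, total):
        if order_id in self.orders:
            raise ValueError(f"order {order_id} already exists")
        if total <= 0:
            raise ValueError("total must be positive")
        self.orders[order_id] = {"customer": customer, "total": total}
        # Emit an event describing what happened on the write side.
        self.bus.append({"type": "OrderPlaced", "order_id": order_id,
                         "customer": customer, "total": total})


class CustomerSummaryReadModel:
    """Read side: one denormalized row per customer, ready to serve."""

    def __init__(self):
        self.rows = {}  # customer -> {"orders": int, "spent": int}

    def project(self, event):
        if event["type"] == "OrderPlaced":
            row = self.rows.setdefault(event["customer"],
                                       {"orders": 0, "spent": 0})
            row["orders"] += 1
            row["spent"] += event["total"]

    def summary(self, customer):
        return dict(self.rows.get(customer, {"orders": 0, "spent": 0}))


def run_projector(bus, read_model):
    """Drain pending events into the read model; return how many were applied."""
    applied = 0
    while bus:
        read_model.project(bus.popleft())
        applied += 1
    return applied
```

Until `run_projector` runs, `summary` returns the old totals even though the write model already holds the new orders. That gap is the read-model staleness described above.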

Pattern 3: Event Sourcing

Traditional databases store current state. If a user changes their name from "Alice" to "Bob," the database overwrites "Alice" with "Bob." The history is lost.

Event Sourcing stores every change as an immutable event. Instead of storing "name = Bob," you store two events: "NameSet: Alice" and "NameChanged: Bob." The current state is derived by replaying all events in order.

flowchart TD
    E1["Event 1: AccountCreated<br/>balance = 0"] --> E2["Event 2: Deposited<br/>amount = 500"]
    E2 --> E3["Event 3: Withdrawn<br/>amount = 200"]
    E3 --> E4["Event 4: Deposited<br/>amount = 100"]
    E4 --> S["Current State:<br/>balance = 400"]
    style E1 fill:#222221,stroke:#c8a882,color:#ede9e3
    style E2 fill:#222221,stroke:#6b8f71,color:#ede9e3
    style E3 fill:#191918,stroke:#c47a5a,color:#ede9e3
    style E4 fill:#222221,stroke:#6b8f71,color:#ede9e3
    style S fill:#191918,stroke:#c8a882,color:#ede9e3

The advantages: complete audit trail, ability to reconstruct state at any point in time, natural fit for CQRS (events feed the read projector). Financial systems, healthcare records, and collaborative editors benefit greatly from event sourcing.

The cost: the event store grows without bound. After years, replaying millions of events to compute current state is impractical. You need snapshots (periodic state captures) to make rebuilds feasible. Schema evolution of events is also tricky: when you add a new field to an event, you must handle old events that lack that field.
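Replay and snapshots can be sketched with the account events from the diagram. The `Event` type, the `apply_event` and `replay` functions, and the `(version, balance)` snapshot tuple are hypothetical names chosen for this example. A real event store would persist both the events and the snapshots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: events are immutable once recorded
class Event:
    kind: str
    amount: int = 0


def apply_event(balance, event):
    """Fold a single event into the running balance."""
    if event.kind == "AccountCreated":
        return 0
    if event.kind == "Deposited":
        return balance + event.amount
    if event.kind == "Withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, snapshot=None):
    """Rebuild the balance from the log.

    snapshot is an optional (version, balance) pair: the state after the
    first `version` events. Only events after that point are replayed.
    """
    version, balance = snapshot if snapshot is not None else (0, 0)
    for event in events[version:]:
        balance = apply_event(balance, event)
    return balance


LOG = [
    Event("AccountCreated"),
    Event("Deposited", 500),
    Event("Withdrawn", 200),
    Event("Deposited", 100),
]
```

Calling `replay(LOG)` folds all four events and yields 400, matching the diagram. Taking a snapshot after three events, `(3, 300)`, and calling `replay(LOG, (3, 300))` reaches the same balance while replaying only the last event.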

Pattern 4: Saga

In a monolith, you wrap multiple operations in a database transaction. Either all succeed or all roll back. In a microservices architecture, a single business operation (e.g., "place an order") may span three services: Order Service, Payment Service, and Inventory Service. Each service owns its own database, so a single local database transaction cannot cover all three, and a distributed commit protocol such as two-phase commit (2PC) is often too slow or unavailable.

A Saga breaks the distributed transaction into a sequence of local transactions. Each service performs its local transaction and publishes an event. If any step fails, the Saga executes compensating transactions to undo the previous steps.

Example: Place Order Saga.

  1. Order Service creates the order (status: pending).
  2. Payment Service charges the customer.
  3. Inventory Service reserves stock.

If Payment fails, the compensating action is: Order Service cancels the order. If Inventory fails after payment succeeds, the compensating actions are: Payment Service refunds the charge, then Order Service cancels the order.

The cost: you must design and implement compensating transactions for every step. Some operations are difficult to compensate (you cannot "unsend" an email). Sagas provide eventual consistency, but not the atomicity or isolation of a single database transaction: while the Saga runs, intermediate states are visible to other requests (the order exists but is not yet paid). Your application must handle these intermediate states gracefully.
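The Place Order Saga can be sketched as a small orchestrator that runs steps in order and, on failure, runs the compensations of the completed steps in reverse. This is one possible shape (orchestration rather than event-driven choreography). The function names, the in-memory `state` dictionary, and the `in_stock` flag are hypothetical stand-ins for real service calls.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If a step raises, undo the already-completed steps in reverse order.
    Returns (succeeded, log).
    """
    log = []
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            log.append(f"{name} failed: {exc}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated {done_name}")
            return False, log
        completed.append((name, compensate))
        log.append(f"{name} ok")
    return True, log


def place_order(in_stock):
    """Toy Place Order Saga; each inner function stands in for a service call."""
    state = {"order": None, "charged": False, "reserved": False}

    def create_order():
        state["order"] = "pending"

    def cancel_order():
        state["order"] = "cancelled"

    def charge():
        state["charged"] = True

    def refund():
        state["charged"] = False

    def reserve():
        if not in_stock:
            raise RuntimeError("out of stock")
        state["reserved"] = True

    def release():
        state["reserved"] = False

    ok, log = run_saga([
        ("create_order", create_order, cancel_order),
        ("charge_payment", charge, refund),
        ("reserve_stock", reserve, release),
    ])
    return ok, log, state
```

With `place_order(in_stock=False)`, the reservation fails after payment succeeds, so the sketch refunds the charge and then cancels the order. The sketch assumes compensations never fail. A real Saga also needs retries and idempotent compensations to cover that case.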

Pattern 5: Strangler Fig

Named after the strangler fig tree that grows around a host tree and eventually replaces it, this pattern is for migrations. You place a routing layer (often an API gateway or reverse proxy) in front of the legacy system. For each feature you migrate, you route that traffic to the new system. The legacy system continues to serve everything else. Over time, more and more traffic goes to the new system until the legacy system handles nothing and can be decommissioned.

The cost: during migration, you operate two systems simultaneously. You need a routing layer that understands which features are migrated and which are not. Data may need to be synchronized between old and new systems. The migration can take months or years. But the alternative, rewriting the entire system and switching over in one shot, carries far higher risk.
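At its core, the routing layer is a lookup from request path to backend. The sketch below assumes features are migrated by URL path prefix. The prefixes and the `"new"` and `"legacy"` labels are hypothetical, and a real gateway would forward the request rather than return a label.

```python
def choose_backend(path, migrated_prefixes):
    """Return "new" if the path belongs to a migrated feature, else "legacy".

    A prefix matches the exact path or any sub-path beneath it, so
    "/orders" matches "/orders/42" but not "/ordersummary".
    """
    for prefix in migrated_prefixes:
        if path == prefix or path.startswith(prefix + "/"):
            return "new"
    return "legacy"


# Hypothetical migration state: only the orders feature has moved so far.
MIGRATED = ["/orders"]
```

Migrating another feature means appending its prefix to the list. When nothing routes to `"legacy"` any more, the old system can be decommissioned.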

Pattern 6: Cell-Based Architecture

Cell-Based Architecture partitions the entire system into independent, self-contained units called cells. Each cell serves a subset of users (typically assigned by user ID or region). Each cell has its own complete stack: load balancer, application servers, database, cache. Cells share nothing with one another; only a thin routing layer in front of them needs to know which cell a user belongs to.

The primary benefit is blast radius containment. If a cell fails (bad deployment, database corruption, overload), only the users assigned to that cell are affected. The other cells continue operating normally. This is how AWS structures some of its own services internally.

The cost: cross-cell operations are expensive and complex. If User A (in Cell 1) sends a message to User B (in Cell 3), the request must cross cell boundaries. Data must be carefully partitioned so that most operations stay within a single cell. You also pay the operational overhead of managing many identical but independent infrastructure stacks.
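Assigning users to cells by user ID can be sketched with a stable hash. The function names and the choice of SHA-256 are assumptions for illustration. Note that plain modulo hashing reassigns many users whenever the number of cells changes, so real deployments often keep an explicit user-to-cell mapping instead.

```python
import hashlib


def cell_for_user(user_id: str, num_cells: int) -> int:
    """Map a user ID to a cell index in [0, num_cells), deterministically."""
    if num_cells <= 0:
        raise ValueError("num_cells must be positive")
    # Use a cryptographic digest rather than Python's built-in hash(),
    # which is randomized per process for strings.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_cells


def is_cross_cell(sender: str, recipient: str, num_cells: int) -> bool:
    """True when an operation between two users must cross a cell boundary."""
    return cell_for_user(sender, num_cells) != cell_for_user(recipient, num_cells)
```

A router can call `is_cross_cell` to decide whether a request stays on the fast in-cell path or takes the more expensive cross-cell path.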

Choosing the Right Pattern

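In an interview, you will not use all six patterns in a single design. Most designs use one or two. The skill is knowing which pattern fits the problem at hand. As a quick decision guide, match the dominant problem to the pattern from the reference table:

  1. Reads overwhelm a single database server: Read Replica.
  2. Reads and writes differ fundamentally in shape or scale: CQRS.
  3. You need an audit trail, undo/redo, or temporal queries: Event Sourcing.
  4. One business operation spans several services and 2PC is too slow or unavailable: Saga.
  5. You are replacing a legacy system incrementally: Strangler Fig.
  6. You must contain the blast radius of failures: Cell-Based Architecture.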

Assignment

Create your own cheat sheet. For each of the six patterns in this session, write in your own words:

  1. A one-line description (no more than 15 words)
  2. When you would use it (one specific scenario)
  3. The main tradeoff (what you gain vs. what you pay)

Do not copy from the table above. Rewriting in your own words is how you internalize the material. If you cannot explain a pattern without looking at the table, you do not understand it yet.