Course → Module 3: Storage, Databases & Caching

Why Replicate?

A single database server is a single point of failure. If it goes down, your application goes down. Replication copies data from one server to others, providing two benefits: fault tolerance (if one server dies, another has the data) and read scaling (distribute read queries across multiple copies).

Database replication is the process of maintaining copies of the same data on multiple database servers. The source of writes is called the primary (or leader). The copies are called replicas (or followers). Replication can be synchronous or asynchronous, and can follow either a single-primary or a multi-primary topology.

Replication and sharding solve different problems. Sharding splits data to handle more writes and store more data. Replication copies data to handle more reads and survive failures. Most production databases use both: shard the data across multiple primaries, then replicate each shard to one or more replicas.

Primary-Replica Replication

The most common replication topology. All writes go to a single primary server. The primary streams changes (via a write-ahead log, binary log, or oplog) to one or more replica servers. Replicas apply these changes and serve read queries.

```mermaid
graph TD
    App["Application"] --> W["Writes"]
    App --> R["Reads"]
    W --> Primary[("Primary<br/>(read/write)")]
    Primary -->|"WAL stream"| R1[("Replica 1<br/>(read-only)")]
    Primary -->|"WAL stream"| R2[("Replica 2<br/>(read-only)")]
    Primary -->|"WAL stream"| R3[("Replica 3<br/>(read-only)")]
    R --> R1
    R --> R2
    R --> R3
```

This topology is straightforward. There is no write conflict because only one server accepts writes. Replicas are read-only, which means you can add replicas to scale read throughput linearly. Three replicas handle roughly three times the read queries of a single server.
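The read/write split described above is typically implemented in a small routing layer in front of the database driver. A minimal sketch, assuming hypothetical connection objects that expose an `execute` method:

```python
import itertools

class ReplicatedRouter:
    """Send all writes to the primary; spread reads across replicas.

    Hypothetical sketch: 'primary' and 'replicas' stand in for real
    driver connections, and 'execute' for whatever query API they expose.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin iterator over the read-only replicas.
        self._replicas = itertools.cycle(replicas)

    def write(self, query):
        # Only the primary accepts writes, so there are no write conflicts.
        return self.primary.execute(query)

    def read(self, query):
        # Each read goes to the next replica in round-robin order,
        # scaling read throughput roughly linearly with replica count.
        return next(self._replicas).execute(query)
```

Real drivers and proxies (e.g. connection poolers) offer more sophisticated routing, but the core idea is exactly this split.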

The downside is write scalability. All writes funnel through one server. If your write throughput exceeds what one machine can handle, primary-replica replication alone will not help. You need sharding for write scaling.

Failover is the other concern. If the primary dies, one of the replicas must be promoted to primary. This can be automated (most managed databases do this) but involves a brief window where writes are unavailable. The promoted replica may also be slightly behind the old primary, meaning some recent writes could be lost in asynchronous replication setups.

Multi-Primary Replication

Multi-primary replication (also called multi-leader or active-active) allows writes on multiple servers. Each primary replicates its changes to the others. This enables write availability across regions but introduces write conflicts that must be detected and resolved.

Multi-primary replication is used when you need to accept writes in multiple geographic regions. A user in Jakarta writes to the Jakarta primary. A user in Singapore writes to the Singapore primary. Each primary sends its changes to the other. Reads are served locally, and writes do not need to cross the ocean before being acknowledged.

The fundamental problem is conflicts. If user A updates their email on the Jakarta primary and user B updates the same email field on the Singapore primary at the same time, which write wins? Common conflict resolution strategies include last-write-wins (keep the write with the latest timestamp), application-defined merge logic, and conflict-free replicated data types (CRDTs).

```mermaid
graph LR
    subgraph "Primary-Replica"
        P1[("Primary")] -->|"one-way"| R1a[("Replica")]
        P1 -->|"one-way"| R1b[("Replica")]
    end
    subgraph "Multi-Primary"
        P2[("Primary A<br/>Jakarta")] <-->|"bi-directional"| P3[("Primary B<br/>Singapore")]
        P2 -->|"one-way"| R2[("Replica")]
        P3 -->|"one-way"| R3[("Replica")]
    end
```
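Last-write-wins, the simplest conflict resolution strategy, keeps whichever version carries the later timestamp. A minimal sketch (the record format here is hypothetical):

```python
def last_write_wins(local, remote):
    """Resolve a conflicting update by keeping the newer version.

    Each version is a dict carrying a 'timestamp' (from a wall clock or
    hybrid logical clock) and the written 'value'.  LWW is simple but
    silently discards the losing write: user B's update can vanish
    without any error being raised anywhere.
    """
    return local if local["timestamp"] >= remote["timestamp"] else remote

# Jakarta and Singapore both updated the same field concurrently.
jakarta   = {"timestamp": 1700000001.2, "value": "a@example.com"}
singapore = {"timestamp": 1700000001.7, "value": "b@example.com"}

winner = last_write_wins(jakarta, singapore)  # Singapore wrote later
```

The silent data loss is why systems that cannot tolerate dropped writes reach for merge logic or CRDTs instead.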

Synchronous vs. Asynchronous Replication

When the primary commits a write, it can wait for replicas to confirm before acknowledging the client (synchronous), or it can acknowledge immediately and let replicas catch up later (asynchronous). This is a fundamental tradeoff between durability and latency.

| Dimension | Synchronous | Asynchronous | Semi-Synchronous |
| --- | --- | --- | --- |
| Write latency | High (waits for replica ACK) | Low (immediate ACK from primary) | Medium (waits for 1 replica) |
| Data durability | Strong (data on multiple nodes) | Weak (data may exist only on primary) | Moderate (data on at least 2 nodes) |
| Risk of data loss on failure | None (all replicas confirmed) | Possible (unsynced writes lost) | Minimal (1 replica confirmed) |
| Impact of slow replica | Blocks all writes | No impact on writes | Blocks if the sync replica is slow |
| Throughput | Limited by slowest replica | Limited only by primary | Limited by one replica |
| Common use | Financial systems, compliance | Most web applications | PostgreSQL (sync standby), MySQL Group Replication |

Most production systems use asynchronous replication because the latency penalty of synchronous replication is too high, especially across data centers. A write that takes 2ms on the primary might take 50-200ms if it must wait for a replica in another region. Semi-synchronous replication is a common compromise: wait for at least one replica to confirm, then acknowledge the client.
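In PostgreSQL, for example, semi-synchronous behavior is expressed by requiring commit confirmation from a quorum of named standbys. An illustrative configuration fragment (the standby names are hypothetical):

```ini
# postgresql.conf -- wait for any one of the two named standbys
# to confirm before acknowledging a commit to the client.
synchronous_standby_names = 'ANY 1 (replica_sby1, replica_sby2)'
synchronous_commit = on
```

With `ANY 1`, a single slow standby does not block writes as long as the other one keeps up.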

Replication Lag

Replication lag is the delay between a write committed on the primary and that write becoming visible on a replica. In asynchronous replication, lag is typically under a second but can spike to minutes during high write load, long-running queries on replicas, or network congestion.

Replication lag causes stale reads. A user updates their profile on the primary, then immediately loads the profile page. If the read is routed to a replica that has not yet received the update, the user sees old data. This is the "read-your-own-writes" problem.

Lag is not constant. During normal operation, it might be 10-50 milliseconds. During a traffic spike that doubles write throughput, the replica's single-threaded apply process falls behind. Long-running analytical queries on replicas can also cause lag by competing for I/O and CPU resources.

As an illustration, consider typical replication lag over a 24-hour period. During off-peak hours (midnight to 6 AM), lag stays under 20 ms. During peak load (late morning and evening), it can spike above a 200 ms threshold. These spikes correlate with write volume: when the primary processes more writes per second, the replica falls further behind.

Handling Replication Lag

Several strategies mitigate the effects of replication lag:

  - Read-your-own-writes routing: after a user writes, route that user's reads to the primary for a short window, so they always see their own changes.
  - Monotonic reads: pin each session to a single replica so a user never sees data move backward in time between requests.
  - Lag-aware routing: monitor each replica's lag and stop sending reads to replicas that fall behind a threshold.
  - Wait-for-catch-up: the write returns the primary's log position (LSN or GTID); subsequent reads wait until the replica has replayed past that position.
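The first of these, pinning a user's reads to the primary for a short window after their own write, can be sketched as follows (connection objects and the `execute` method are hypothetical stand-ins for a real driver):

```python
import time

class SessionAwareRouter:
    """Route a user's reads to the primary shortly after they write.

    'pin_seconds' should comfortably exceed your typical replication
    lag; 'clock' is injectable so the behavior can be tested.
    """

    def __init__(self, primary, replica, pin_seconds=1.0, clock=time.monotonic):
        self.primary = primary
        self.replica = replica
        self.pin_seconds = pin_seconds
        self.clock = clock
        self._last_write = {}  # user_id -> time of that user's last write

    def write(self, user_id, query):
        self._last_write[user_id] = self.clock()
        return self.primary.execute(query)

    def read(self, user_id, query):
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            # Recent writer: read from the primary to guarantee
            # read-your-own-writes consistency.
            return self.primary.execute(query)
        # Everyone else tolerates a little replica lag.
        return self.replica.execute(query)
```

The tradeoff is extra read load on the primary: every recent writer bypasses the replicas for the pin window.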

Replication Strategy Comparison

| Dimension | Primary-Replica | Multi-Primary |
| --- | --- | --- |
| Write target | Single primary only | Any primary |
| Conflict potential | None (single writer) | Yes (concurrent writes on different primaries) |
| Write availability | Lost during failover | Available in all regions |
| Write latency | Low if client is near primary | Low (write to nearest primary) |
| Complexity | Low | High (conflict resolution logic) |
| Consistency | Strong (reads from primary) | Eventual (cross-region convergence) |
| Use case | Most applications | Multi-region, offline-capable, collaborative editing |

Systems Thinking Lens

Replication creates a fundamental balancing loop between consistency and performance. Adding replicas increases read throughput (reinforcing loop), but each replica introduces replication lag (balancing loop). The faster you write, the further behind replicas fall, and the more stale reads your users experience.

The leverage point is not the replication strategy itself but the read/write ratio of your workload. If your application is 95% reads and 5% writes, primary-replica replication with asynchronous replication works well. The lag affects only a small fraction of operations. If your application is 50% writes, replication lag becomes a dominant concern, and you may need synchronous replication or architectural changes (like event sourcing) that decouple reads from writes entirely.

Multi-primary replication has a hidden reinforcing loop: the more primaries you add, the more conflict resolution logic you need, the more edge cases emerge, the more engineering time you spend debugging subtle data inconsistencies instead of building features.

Assignment

Your application has a primary database server in Jakarta and an asynchronous replica in Surabaya. A user in Jakarta updates their profile (changes their display name). One second later, the same user loads their profile page. The read is routed to the Surabaya replica for load balancing.

  1. What will the user likely see? Explain why, referencing replication lag and the asynchronous replication model.
  2. Describe the "read-your-own-writes" problem in your own words. Why is it particularly frustrating for the user who just made the change?
  3. Propose three different solutions to fix this problem. For each solution, explain the mechanism and the tradeoff (what it costs in terms of complexity, latency, or infrastructure).
  4. Under what circumstances would you choose synchronous replication instead? What would that do to the user's write latency, given the network distance between Jakarta and Surabaya?