CAP Theorem
Session 5.1 · ~5 min read
The Trade-Off You Cannot Escape
In 2000, Eric Brewer stood before an audience at the ACM Symposium on Principles of Distributed Computing and made a conjecture: a distributed data store cannot simultaneously provide more than two of three guarantees. Those three guarantees are Consistency, Availability, and Partition Tolerance. Two years later, Seth Gilbert and Nancy Lynch published a formal proof confirming it. The conjecture became a theorem.
This result is not a design recommendation. It is a constraint of physics and logic. When you build a system that spans multiple networked nodes, the CAP theorem tells you what combinations are possible. Understanding it prevents you from chasing impossible architectures.
CAP doesn't ask what you want. It asks what you're willing to lose.
The Three Guarantees
Consistency (C) means every read receives the most recent write or an error. All nodes see the same data at the same time. If you write a value to one node, any subsequent read from any node returns that value. This is linearizability, the strongest form of consistency.
Availability (A) means every request receives a non-error response, without the guarantee that it contains the most recent write. The system always responds, even if the data might be stale.
Partition Tolerance (P) means the system continues to operate despite arbitrary message loss or delay between nodes. Network partitions are not theoretical. They happen in every distributed system, whether from cable cuts, switch failures, or cloud provider issues.
[Diagram: the CAP triangle — "pick two during a partition." Consistency (every read gets the latest write), Availability (every request gets a response), and Partition Tolerance (works despite network splits), with example systems at each pairing: CP (ZooKeeper, HBase, etcd, Redis Cluster), AP (Cassandra, DynamoDB, CouchDB, Riak), CA (single-node PostgreSQL or MySQL; no partition means no choice).]
Why Partition Tolerance Is Non-Negotiable
In any system distributed across a network, partitions will happen. You do not get to choose whether partitions occur. You only get to choose how your system behaves when they do. This means the real choice during a partition is between consistency and availability.
A CP system chooses consistency over availability. When a partition occurs, some nodes may refuse to serve requests rather than return stale data. ZooKeeper does this: if a node cannot reach a quorum, it rejects client requests. The data you get is always correct, but sometimes you get no data at all.
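The quorum rule behind CP behavior can be sketched in a few lines of Python. This is a deliberately simplified model, not ZooKeeper's actual protocol; the class and peer names are invented for illustration:

```python
class CPNode:
    """A node that serves requests only while it can reach a majority."""

    def __init__(self, cluster_size, reachable_peers):
        self.cluster_size = cluster_size
        self.reachable = reachable_peers  # peers this node can still contact

    def has_quorum(self):
        # Count this node itself plus every peer it can still reach.
        return (1 + len(self.reachable)) > self.cluster_size // 2

    def read(self, store):
        # CP choice: refuse the request rather than risk serving stale data.
        if not self.has_quorum():
            raise ConnectionError("no quorum: refusing to serve stale data")
        return store

# A 5-node cluster splits 3 vs 2: the majority side keeps serving,
# the minority side rejects requests until the partition heals.
majority = CPNode(cluster_size=5, reachable_peers=["b", "c"])  # 3 nodes total
minority = CPNode(cluster_size=5, reachable_peers=["e"])       # 2 nodes total
```

The key property is that at most one side of any partition can hold a majority, so the cluster never serves two divergent "current" values at once.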
An AP system chooses availability over consistency. When a partition occurs, every node continues serving requests, even if different nodes have different versions of the data. Cassandra does this: writes succeed on whichever nodes are reachable, and the system reconciles conflicts after the partition heals.
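The reconciliation step can be modeled with a last-write-wins merge, one of several strategies real AP stores use (Cassandra resolves cell-level conflicts by timestamp). This is a minimal sketch, not Cassandra's implementation; the keys and timestamps are invented:

```python
def lww_merge(replica_a, replica_b):
    """Merge two divergent replicas: for each key, keep the
    (timestamp, value) pair with the latest timestamp."""
    merged = dict(replica_a)
    for key, (ts, value) in replica_b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged

# Both sides accepted writes independently during the partition...
side_a = {"user:1": (10, "alice"), "user:2": (12, "bob")}
side_b = {"user:1": (15, "alicia"), "user:3": (11, "carol")}

# ...and converge once the network heals: the later write wins.
healed = lww_merge(side_a, side_b)
```

Note the cost of this availability: the write `(10, "alice")` is silently discarded. Last-write-wins is simple but lossy, which is why some AP stores (Riak, CouchDB) surface conflicts to the application instead.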
A CA system provides both consistency and availability, but only because it does not tolerate partitions. A single-node PostgreSQL instance is CA. The moment you add replication across a network, you must confront partitions, and CA is no longer an option.
Classifying Real Databases
| Database | CAP Classification | Partition Behavior | Typical Use Case |
|---|---|---|---|
| ZooKeeper | CP | Refuses reads/writes without quorum | Configuration management, leader election |
| etcd | CP | Raft-based quorum required | Kubernetes cluster state |
| HBase | CP | Region server unavailable if master unreachable | Large-scale analytical storage |
| Redis Cluster | CP | Minority partition stops accepting writes | Caching, session store |
| MongoDB (default) | CP | Primary election required for writes | Document storage, general purpose |
| Cassandra | AP | All nodes accept reads/writes independently | Time-series, IoT, high write throughput |
| DynamoDB | AP | Eventually consistent reads by default | Serverless applications, key-value store |
| CouchDB | AP | Multi-master replication with conflict resolution | Offline-first mobile apps |
| Riak | AP | Sloppy quorum, hinted handoff | High-availability key-value store |
| PostgreSQL (single node) | CA | No partition tolerance (single node) | Relational workloads, OLTP |
The PACELC Extension
Daniel Abadi proposed PACELC in 2010 because CAP only describes behavior during partitions. Most of the time, the network is healthy. PACELC asks: even when there is no partition, does the system trade latency for consistency?
The acronym unpacks as: if there is a Partition (P), choose Availability (A) or Consistency (C); Else (E), when the network is healthy, choose Latency (L) or Consistency (C).
Cassandra is PA/EL: during a partition, it chooses availability. During normal operation, it chooses low latency over strong consistency. MongoDB is PC/EC: it chooses consistency in both cases, accepting higher latency for correctness.
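These classifications can be captured as a small lookup table. The Cassandra and MongoDB entries come from the text above; the others follow Abadi's commonly cited classifications, included here as illustration:

```python
# PACELC classifications: "PA/EL" reads as "if Partition then
# Availability, Else Latency".
PACELC = {
    "Cassandra": "PA/EL",
    "DynamoDB":  "PA/EL",
    "Riak":      "PA/EL",
    "MongoDB":   "PC/EC",
    "HBase":     "PC/EC",
}

def tradeoff(db):
    """Expand a PACELC code into its two trade-off decisions."""
    partition, normal = PACELC[db].split("/")
    during = "availability" if partition == "PA" else "consistency"
    otherwise = "latency" if normal == "EL" else "consistency"
    return f"during a partition: {during}; otherwise: {otherwise}"
```

For example, `tradeoff("Cassandra")` expands to "during a partition: availability; otherwise: latency".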
Comparing Three Databases
The radar chart below compares PostgreSQL (single-node, CA), Cassandra (AP), and MongoDB (CP) across five dimensions. Scores are relative, on a 1-to-5 scale, and reflect typical default configurations.
Common Misconceptions
CAP does not mean "pick any two." It means that during a network partition, you must choose between consistency and availability. When there is no partition, you can have both. The choice is forced only when things go wrong.
CAP consistency is not the same as ACID consistency. CAP consistency refers to linearizability across distributed nodes. ACID consistency refers to database constraints and invariants being preserved after a transaction. They are related concepts with different scopes.
Most systems are not purely CP or AP. Many databases offer tunable consistency. DynamoDB lets you request strongly consistent reads at the cost of higher latency. Cassandra lets you set consistency levels per query (ONE, QUORUM, ALL). The CAP classification describes the default or primary behavior, not the only mode.
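The effect of a per-query consistency level can be modeled by counting replica acknowledgements. This is a simplified model of Cassandra-style levels, not the real protocol; the replica data is invented:

```python
def read(replicas, level):
    """Simulate a read at a given consistency level: wait for the
    required number of replicas, then return the newest value seen.
    Each replica holds a (timestamp, value) pair; None means unreachable."""
    required = {"ONE": 1,
                "QUORUM": len(replicas) // 2 + 1,
                "ALL": len(replicas)}[level]
    responses = [r for r in replicas if r is not None]
    if len(responses) < required:
        raise TimeoutError(f"level {level} needs {required} replicas")
    # Return the newest value among the replicas actually consulted.
    return max(responses[:required])[1]

# Three replicas; the third is partitioned away (None).
# The first replica has not yet received the latest write.
replicas = [(5, "old"), (9, "new"), None]
```

With this data, `ONE` consults a single replica and may return the stale `"old"`, `QUORUM` consults two and sees `"new"`, and `ALL` fails because one replica is unreachable: the same cluster slides along the consistency-availability spectrum per query.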
Further Reading
- Gilbert, S. & Lynch, N. (2002). "Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services."
- Brewer, E. (2012). "CAP Twelve Years Later: How the 'Rules' Have Changed."
- Wikipedia: PACELC Theorem
- Abadi, D. (2012). "Consistency Tradeoffs in Modern Distributed Database System Design."
- Wikipedia: CAP Theorem
Assignment
You are building a chat application with servers in Jakarta and Surabaya. The network link between the two cities goes down. You have two choices:
- Option A (CP): Reject all writes on both servers until the partition heals, ensuring both cities always see the same message history.
- Option B (AP): Allow both servers to continue accepting messages independently, then merge the two message histories when the network recovers.
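Option B's merge step can be sketched as an ordered union of the two histories. This is a simplification: a real system must also handle clock skew and duplicate delivery, and the field names here are invented:

```python
def merge_histories(jakarta, surabaya):
    """Merge two message logs from a healed partition: union by
    message id, then re-sort by (timestamp, id) for a stable order."""
    by_id = {m["id"]: m for m in jakarta}
    by_id.update({m["id"]: m for m in surabaya})
    return sorted(by_id.values(), key=lambda m: (m["ts"], m["id"]))

# Each city kept accepting messages during the partition.
jakarta = [{"id": "j1", "ts": 100, "text": "hello from Jakarta"}]
surabaya = [{"id": "s1", "ts": 101, "text": "hello from Surabaya"}]
```

Unlike last-write-wins on a single key, merging append-only logs loses no messages; the remaining risk is ordering, since users in the two cities saw different interleavings during the outage.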
For each option, answer:
- What is the user experience during the outage?
- What happens when the partition heals?
- What data integrity risks exist?
- Which option would you choose for a chat app, and why?