Chapter 9: Consistency and Consensus
3 min readCore Concepts
Why Consensus Matters
- Coordinate actions across distributed nodes
- Ensure all nodes agree on values
- Maintain consistency
Consistency Models
| Model | Description | Performance |
|---|---|---|
| Linearizable | Strongest consistency | Slowest |
| Causal | Preserves causality | Medium |
| Eventual | Eventually consistent | Fastest |
Linearizability
Definition
- Every operation appears to execute atomically
- Once write completes, all subsequent reads see that value
- Equivalent to single-copy serial execution
How to Achieve
Single Leader Replication:
- Leader handles all writes
- Followers replicate synchronously
- Reads from leader (or synchronous follower)
Consensus Algorithms:
- Raft, Paxos
- Linearizable by design
Transactions:
- Two-phase locking (2PL)
- Serializable snapshot isolation (SSI)
Linearizability and Quorums
Problem: Quorum reads/writes may not be linearizable
Solution:
- Read from leader
- Or use synchronous replication
- Or use consensus algorithm
Performance Cost
Network delays:
- Synchronous replication adds latency
- Geographically distributed systems suffer
Trade-off: Consistency vs Performance
Causal Consistency
Definition
- Operations ordered by causality
- Happens-before relationship preserved
- More scalable than linearizability
Implementations
Version Vectors:
- Track causality across nodes
- Detect concurrent operations
Sequence Numbers:
- Lamport timestamps
- Vector clocks
Consistent Prefix Reads
- Read operations see causally consistent prefix
- Prevents seeing effects before causes
Total Order Broadcast
Definition
- Every node receives messages in same order
- Equivalent to consensus on ordering
Properties
- Reliable delivery: No message lost
- Totally ordered: Same order on all nodes
- Causal order: If message m1 sent before m2, m1 appears before m2
Implementations
- Raft: Leader-based, log replication
- Paxos: Leaderless, more complex
- Zab: ZooKeeper's protocol
Relationship to Consensus
Linearizable operations ↔ Total order broadcast ↔ Consensus
Using Consensus for Consistency
Compare-and-Set:
- Linearizable via consensus
- Reject stale operations
Uniqueness Constraints:
- Require consensus
- Prevent duplicate assignments
Distributed Transactions and Consensus
Two-Phase Commit (2PC)
Protocol:
- Prepare phase: Coordinator asks all participants to prepare
- Commit phase: Coordinator tells all to commit (or abort)
Problems:
- Blocking: Coordinator failure blocks system
- Single point of failure
- Poor performance
Three-Phase Commit (3PC)
- Attempts to reduce blocking
- Requires synchronous network
- Not used in practice
Consensus Algorithms
Raft:
- Leader-based
- Log replication
- Leader election
- Membership changes
Paxos:
- Leaderless
- More complex
- Used in Google systems
Epoch Numbering:
- Each leader gets unique epoch
- Prevents split brain
- Ensures single leader
Practical Consensus Systems
ZooKeeper
- Coordination service
- Consensus-based
- Used by Kafka, HBase, etc.
etcd
- Distributed key-value store
- Consensus-based
- Used by Kubernetes
Consul
- Service discovery
- Health checking
- Consensus-based
Key Takeaways
- Linearizability is strongest but expensive
- Causal consistency is more scalable
- Total order broadcast is equivalent to consensus
- 2PC is blocking and has single point of failure
- Modern consensus algorithms (Raft, Paxos) are practical
- ZooKeeper/etcd provide consensus as a service