Chapter 9: Consistency and Consensus

Core Concepts

Why Consensus Matters

Coordinate actions across distributed nodes
Ensure all nodes agree on a value, such as who the leader is or whether a transaction committed
Keep replicas consistent despite node and network faults

Consistency Models

graph LR
    subgraph Strongest
    L[Linearizable<br/>single-copy semantics]
    end
    subgraph Medium
    C[Causal<br/>happens-before order]
    end
    subgraph Weakest
    E[Eventual<br/>replicas converge]
    end
    L -->|"more scalable →"| C -->|"more scalable →"| E
    E -->|"← stronger consistency"| C -->|"← stronger consistency"| L

Model	Guarantee	Relative performance
Linearizable	Behaves as if there were a single copy of the data	Slowest
Causal	Effects are never seen before their causes	Medium
Eventual	Replicas converge once writes stop	Fastest

Linearizability

Definition

Every operation appears to take effect atomically at some point between its start and its completion
Once a write completes, all subsequent reads see that value or a later one
Equivalent to operating on a single copy of the data

How to Achieve

Single Leader Replication:

Leader handles all writes
Followers replicate synchronously
Reads go to the leader or a synchronously updated follower; this is linearizable only if the node serving reads really is the current leader

Consensus Algorithms:

Raft, Paxos
Linearizable by design

Transactions:

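Two-phase locking (2PL) and actual serial execution are typically linearizable
Serializable snapshot isolation (SSI) is not: reads come from a consistent snapshot, which may omit more recent writes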

Linearizability and Quorums

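Problem: Quorum reads and writes may not be linearizable, even with strict quorums (w + r > n), because network delays let a later read miss a write that an earlier read already saw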

Solution:

Read from leader
Or use synchronous replication
Or use consensus algorithm
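
The problem stated above can be reproduced in a few lines. This is a minimal sketch, assuming three replicas that store hypothetical (version, value) pairs and a write that has so far reached only one of them; the names `replicas` and `quorum_read` are illustrative and not taken from any real database.

```python
from typing import List, Tuple

# Each replica stores a (version, value) pair; all names here are illustrative.
replicas: List[Tuple[int, str]] = [(1, "old"), (1, "old"), (1, "old")]


def quorum_read(indexes: List[int]) -> str:
    """Read the given replicas and return the value with the highest version."""
    version, value = max(replicas[i] for i in indexes)
    return value


# A writer is updating all replicas, but only replica 0 has applied the write so far.
replicas[0] = (2, "new")

# Reader A contacts replicas 0 and 1 and sees the new value.
read_a = quorum_read([0, 1])

# Reader B starts after A has finished, contacts replicas 1 and 2, and sees the old value.
read_b = quorum_read([1, 2])
```

Both reads used a quorum of two out of three, yet the later read returned an older value than the earlier one, which a linearizable system would not allow.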

Performance Cost

Network delays:

Synchronous replication adds latency
Geographically distributed systems suffer

Trade-off: Consistency vs Performance

Causal Consistency

Definition

Operations ordered by causality
Happens-before relationship preserved
More scalable than linearizability

Implementations

Version Vectors:

Track causality across nodes
Detect concurrent operations
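
As a rough illustration of how version vectors detect concurrency, the sketch below compares two vectors entry by entry. It assumes a vector is a plain dictionary from node name to counter; `compare` and `merge` are hypothetical helper names.

```python
from typing import Dict

VersionVector = Dict[str, int]


def compare(a: VersionVector, b: VersionVector) -> str:
    """Return 'equal', 'before', 'after', or 'concurrent' for a relative to b."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b
    if b_le_a:
        return "after"       # b happened before a
    return "concurrent"      # neither version has seen the other


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Take the element-wise maximum, as done after reconciling two versions."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}
```

For example, `compare({"a": 2}, {"a": 1, "b": 1})` reports the two versions as concurrent, because each contains an update the other has not seen.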

Sequence Numbers:

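Lamport timestamps: (counter, node ID) pairs that give a total order consistent with causality
Vector clocks: larger, but can also tell whether two operations are concurrent or causally ordered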
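
A Lamport timestamp can be sketched as a counter paired with a node ID. The example below is a minimal illustration, assuming each node calls a hypothetical `observe` method whenever it receives a counter from another node; the class name `LamportClock` is also an assumption.

```python
from typing import Tuple


class LamportClock:
    """Minimal Lamport clock; timestamps are (counter, node_id) pairs."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counter = 0

    def tick(self) -> Tuple[int, str]:
        """Advance the counter for a local operation and return its timestamp."""
        self.counter += 1
        return (self.counter, self.node_id)

    def observe(self, remote_counter: int) -> None:
        """Move the counter forward to the largest value seen from another node."""
        self.counter = max(self.counter, remote_counter)


a = LamportClock("a")
b = LamportClock("b")
t1 = a.tick()        # an operation on node a
b.observe(t1[0])     # node b learns of it
t2 = b.tick()        # a causally later operation on node b
```

Because tuples compare counter first and node ID second, `t1 < t2` holds here, so the causally later operation gets the larger timestamp.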

Consistent Prefix Reads

Reads see a causally consistent prefix of the write history
Prevents seeing effects before causes

Total Order Broadcast

Definition

Every node receives messages in same order
Equivalent to consensus on ordering

Properties

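Reliable delivery: No message is lost; if one node delivers a message, every node delivers it
Totally ordered delivery: Messages are delivered in the same order on all nodes
Sender order: If a node sends m1 before m2, m1 is delivered before m2 (a common additional guarantee)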
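
One simple way to picture these properties is a single sequencer that numbers every message, with each node delivering strictly in sequence-number order. This is only a sketch under that assumption: a lone sequencer is not fault-tolerant, which is the part that real protocols solve with consensus. `Sequencer` and `Node` are hypothetical names.

```python
from typing import Dict, List, Tuple


class Sequencer:
    """Assigns consecutive sequence numbers to messages (not fault-tolerant)."""

    def __init__(self) -> None:
        self.next_seq = 0

    def assign(self, payload: str) -> Tuple[int, str]:
        seq = self.next_seq
        self.next_seq += 1
        return (seq, payload)


class Node:
    """Buffers messages that arrive early and delivers them in sequence order."""

    def __init__(self) -> None:
        self.pending: Dict[int, str] = {}
        self.next_to_deliver = 0
        self.delivered: List[str] = []

    def receive(self, seq: int, payload: str) -> None:
        self.pending[seq] = payload
        # Deliver every message whose predecessors have all been delivered.
        while self.next_to_deliver in self.pending:
            self.delivered.append(self.pending.pop(self.next_to_deliver))
            self.next_to_deliver += 1


sequencer = Sequencer()
messages = [sequencer.assign(p) for p in ["x", "y", "z"]]

n1, n2 = Node(), Node()
for seq, payload in messages:            # arrives in order
    n1.receive(seq, payload)
for seq, payload in reversed(messages):  # arrives out of order
    n2.receive(seq, payload)
```

Both nodes end up with the same `delivered` list even though the network reordered the messages on their way to the second node.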

Implementations

Raft: Leader-based, log replication
Paxos: No fixed leader in the basic protocol (Multi-Paxos usually elects one), more complex
Zab: ZooKeeper's protocol

Relationship to Consensus

Linearizable compare-and-set ↔ Total order broadcast ↔ Consensus (each can be built from the others)

Using Consensus for Consistency

Compare-and-Set:

Linearizable via consensus
Reject stale operations

Uniqueness Constraints:

Require consensus
Prevent duplicate assignments
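
The two uses above can be sketched together. Assuming the consensus layer has already produced one agreed order of operations (represented here by a plain Python list), every replica that replays that log reaches the same decision about which compare-and-set succeeds. `CasRegister` and the sample log are illustrative.

```python
from typing import List, Optional, Tuple


class CasRegister:
    """A register whose only write operation is compare-and-set."""

    def __init__(self) -> None:
        self.value: Optional[str] = None

    def compare_and_set(self, expected: Optional[str], new: str) -> bool:
        if self.value != expected:
            return False  # stale operation: the register changed in the meantime
        self.value = new
        return True


# Hypothetical agreed log: two users race to claim an unowned username,
# then the owner hands it over.
log: List[Tuple[Optional[str], str]] = [
    (None, "alice"),
    (None, "bob"),
    ("alice", "carol"),
]


def replay(entries: List[Tuple[Optional[str], str]]) -> Tuple[List[bool], Optional[str]]:
    """Apply the log in order and return per-operation results plus the final value."""
    register = CasRegister()
    results = [register.compare_and_set(expected, new) for expected, new in entries]
    return results, register.value


replica_1 = replay(log)
replica_2 = replay(log)
```

The second claim is rejected on every replica, which is how a uniqueness constraint avoids assigning the same name twice.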

Distributed Transactions and Consensus

Two-Phase Commit (2PC)

Protocol:

graph TD
    C[Coordinator] -->|"1. PREPARE<br/>can you commit?"| P1[Participant A]
    C -->|"1. PREPARE"| P2[Participant B]
    C -->|"1. PREPARE"| P3[Participant C]
    P1 -->|"YES"| C
    P2 -->|"YES"| C
    P3 -->|"YES"| C
    C -->|"2. COMMIT"| P1
    C -->|"2. COMMIT"| P2
    C -->|"2. COMMIT"| P3

Problems:

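Blocking: If the coordinator fails after participants vote YES, they stay in doubt, holding locks, until it recovers
Single point of failure: The coordinator
Poor performance: Extra network round trips and forced disk writes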
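
The protocol in the diagram can be sketched as follows. This is a simplified in-memory illustration, assuming no crashes or timeouts and omitting the durable logging a real coordinator and its participants need; `Participant` and `two_phase_commit` are hypothetical names.

```python
from typing import List


class Participant:
    """In-memory participant that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self) -> bool:
        # Voting YES is a promise to commit later if the coordinator asks.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> str:
    # Phase 1: ask every participant whether it can commit.
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    # A real coordinator would write the decision to durable storage here.
    # Phase 2: send the decision to every participant.
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision
```

If the coordinator stopped between the two phases, every participant that voted YES would be left in the `prepared` state with no way to decide on its own, which is the blocking problem listed above.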

Three-Phase Commit (3PC)

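Attempts to avoid blocking when the coordinator fails
Assumes a network with bounded delay and nodes with bounded response times
Rarely used in practice, because real networks do not offer those guarantees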

Consensus Algorithms

Raft:

Leader-based
Log replication
Leader election
Membership changes

graph TD
    C[Client sends write] --> L[Leader receives write]
    L --> W[Append to local log<br/>uncommitted]
    W --> R[Replicate to followers]
    R --> M{"Majority<br/>acknowledged?"}
    M -->|Yes| X[Commit entry<br/>apply to state machine]
    M -->|No| N["Retry or<br/>trigger new election"]
    X --> A[Acknowledge client]
    N --> R
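
The "Majority acknowledged?" step in the diagram can be sketched as a small calculation. The function below assumes the leader tracks, for each follower, the highest log index known to be replicated there; `majority_commit_index` is a hypothetical name, and the sketch omits Raft's additional rule that a leader only commits entries from its own term this way.

```python
from typing import List


def majority_commit_index(leader_last_index: int, follower_match_indexes: List[int]) -> int:
    """Return the highest log index that is stored on a majority of the cluster."""
    indexes = sorted([leader_last_index] + follower_match_indexes, reverse=True)
    majority = len(indexes) // 2 + 1
    # The majority-th largest index is present on at least `majority` nodes.
    return indexes[majority - 1]


# Five-node cluster: the leader has 7 entries, followers have 7, 5, 3 and 2.
commit_index = majority_commit_index(7, [7, 5, 3, 2])
```

Here entries up to index 5 are on three of five nodes, so they can be committed; entries 6 and 7 stay uncommitted until one more follower catches up.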

Paxos:

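No fixed leader in the basic protocol; Multi-Paxos typically uses a distinguished leader
More complex to understand and implement
Used in Google systems such as Chubby and Spanner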

Epoch Numbering:

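Each leader election gets a unique, monotonically increasing epoch number (called a term in Raft)
Prevents split brain: nodes reject requests from a leader with an older epoch
Ensures at most one leader per epoch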
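
How an epoch number fences off a deposed leader can be sketched from the follower's side. This is a minimal illustration, assuming every request carries the sender's epoch; `Follower` and its `append` method are hypothetical names.

```python
from typing import List, Tuple


class Follower:
    """Accepts log entries only from the newest leader epoch it has seen."""

    def __init__(self) -> None:
        self.highest_epoch_seen = 0
        self.log: List[Tuple[int, str]] = []

    def append(self, epoch: int, entry: str) -> bool:
        if epoch < self.highest_epoch_seen:
            return False  # request from a stale leader: reject it
        self.highest_epoch_seen = epoch
        self.log.append((epoch, entry))
        return True


f = Follower()
first = f.append(1, "a")   # leader of epoch 1
second = f.append(2, "b")  # a new leader was elected in epoch 2
stale = f.append(1, "c")   # the old leader has not noticed it was replaced
```

The old leader's write is rejected, so even if two nodes both believe they are the leader, only the one with the newest epoch can make progress.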

Practical Consensus Systems

ZooKeeper

Coordination service
Consensus-based (Zab protocol)
Used by Kafka, HBase, etc.

etcd

Distributed key-value store
Consensus-based (Raft)
Used by Kubernetes

Consul

Service discovery
Health checking
Consensus-based (Raft)

Key Takeaways

Linearizability is the strongest of these models but costs latency and availability
Causal consistency preserves cause and effect and scales better
Total order broadcast is equivalent to consensus
2PC is blocking, and its coordinator is a single point of failure
Fault-tolerant consensus algorithms (Raft, Paxos, Zab) are practical
ZooKeeper and etcd provide consensus as a service
