Given one client and three server nodes, how do the nodes reach consensus on a message received from the client?
The Raft algorithm solves this with two mechanisms: Leader Election and Log Replication.
A node is always in one of three states: Follower, Candidate, or Leader.
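
As a rough illustration, the three states can be modelled as an enum with a table of allowed transitions. The names `State`, `TRANSITIONS`, and `can_transition` are made up for this sketch, and the transitions beyond follower to candidate to leader follow the state diagram in the Raft paper rather than anything stated in these notes.

```python
from enum import Enum


class State(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


# Hypothetical transition table for illustration only.
TRANSITIONS = {
    # A follower that hears nothing from a leader starts an election.
    State.FOLLOWER: {State.CANDIDATE},
    # A candidate wins, steps back on seeing a leader, or retries the election.
    State.CANDIDATE: {State.LEADER, State.FOLLOWER, State.CANDIDATE},
    # A leader steps down if it discovers a node with a higher term.
    State.LEADER: {State.FOLLOWER},
}


def can_transition(current: State, target: State) -> bool:
    """Return True if the simplified model allows current -> target."""
    return target in TRANSITIONS[current]
```

Note that a follower can never jump straight to leader in this model: it always has to pass through the candidate state and win an election first.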
If a follower doesn't hear from a leader, it can become a candidate.
Then the candidate requests votes from other nodes.
The nodes reply with their votes.
The candidate becomes the leader if it receives votes from a majority of nodes. From then on, all changes to the system go through the leader.
Every change from the client is added as an entry in the leader's log. The entry is initially uncommitted, so it does not yet update the node's value.
To commit the entry, the leader first replicates it to the follower nodes.
The leader waits until a majority of nodes have written the entry and sent back acknowledgments, then commits the entry and updates its own value.
The leader then notifies the followers that the entry has been committed.
The cluster has come to a consensus about the system state.
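
The difference between an uncommitted and a committed entry can be sketched as below. This is a minimal model, not an implementation of Raft: the `NodeState` class, its methods, and the `majority` helper are hypothetical names, and acknowledgments are simply counted by hand.

```python
class NodeState:
    """A node's value plus its log of (possibly uncommitted) entries."""

    def __init__(self):
        self.value = None
        self.log = []  # each entry: {"value": ..., "committed": bool}

    def append(self, new_value) -> int:
        """Append an uncommitted entry; the node's value is NOT changed yet."""
        self.log.append({"value": new_value, "committed": False})
        return len(self.log) - 1

    def commit(self, index: int) -> None:
        """Mark the entry committed and apply it to the node's value."""
        entry = self.log[index]
        entry["committed"] = True
        self.value = entry["value"]


def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority of the cluster."""
    return cluster_size // 2 + 1


leader = NodeState()
index = leader.append(5)      # entry is in the log, value is still None
acks = 2                      # assumed: the leader plus one follower wrote it
if acks >= majority(3):
    leader.commit(index)      # only now does the value become 5
```

With three nodes, `majority(3)` is 2, so the leader can commit as soon as a single follower has written the entry alongside it.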
There are two timeout settings that control the election.
The election timeout is the amount of time a follower waits before becoming a candidate.
The election timeout is randomized to be between 150ms and 300ms.
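
A minimal sketch of picking such a timeout, assuming a uniform distribution over the 150ms to 300ms range (the notes only say "randomized"). The constant and function names are hypothetical. Randomizing makes it less likely that several followers time out at the same instant.

```python
import random
from typing import Optional

ELECTION_TIMEOUT_MIN_MS = 150
ELECTION_TIMEOUT_MAX_MS = 300


def random_election_timeout_ms(rng: Optional[random.Random] = None) -> float:
    """Pick a fresh election timeout in milliseconds.

    Assumption: uniform sampling; an optional seeded generator can be
    passed in to make runs reproducible.
    """
    generator = rng if rng is not None else random.Random()
    return generator.uniform(ELECTION_TIMEOUT_MIN_MS, ELECTION_TIMEOUT_MAX_MS)
```

Each node would call this again whenever it resets its timer, so that successive timeouts differ from one another as well as from those of other nodes.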
After its election timeout elapses, the follower becomes a candidate, starts a new election term, and votes for itself.
Then it sends vote requests to the other nodes.
If the receiving node hasn't voted yet in this term, it votes for the candidate and resets its election timeout.
Once a candidate has a majority of votes, it becomes leader.
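
The voting steps above could look roughly like this. It is a simplified sketch under several assumptions: the `Node` class and `run_election` function are hypothetical, messages are plain method calls, timeouts are left out, and the check that the candidate's log is up to date (part of the full specification) is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    node_id: str
    current_term: int = 0
    voted_for: Optional[str] = None

    def handle_vote_request(self, candidate_id: str, term: int) -> bool:
        """Grant the vote if this node has not voted for anyone else in `term`."""
        if term < self.current_term:
            return False  # stale request from an older term
        if term > self.current_term:
            # A newer term begins: forget the vote cast in the old one.
            self.current_term = term
            self.voted_for = None
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False


def run_election(candidate: Node, peers: List[Node]) -> bool:
    """Start a new term, vote for self, ask peers; True if a majority agrees."""
    candidate.current_term += 1
    candidate.voted_for = candidate.node_id
    votes = 1  # the candidate's own vote
    for peer in peers:
        if peer.handle_vote_request(candidate.node_id, candidate.current_term):
            votes += 1
    cluster_size = len(peers) + 1
    return votes > cluster_size // 2
```

For example, with nodes `a`, `b`, and `c` all at term 0, `run_election(a, [b, c])` returns `True`, and a later request to `b` from `c` for the same term is refused because `b` has already voted.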
The leader starts sending AppendEntries messages to its followers at intervals specified by the heartbeat timeout.
Followers then respond to each AppendEntries message.
This election term will continue until a follower stops receiving heartbeats and becomes a candidate.
Let's stop the leader.
Once a follower's election timeout elapses without a heartbeat, it becomes a candidate and the leader election repeats.
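
The interplay between heartbeats and a follower's timer can be sketched with a simulated clock. The `FollowerTimer` class is hypothetical, and the 50ms heartbeat interval and 200ms election timeout are assumed values chosen for the example.

```python
class FollowerTimer:
    """Tracks how long a follower has gone without hearing from a leader."""

    def __init__(self, election_timeout_ms: int):
        self.election_timeout_ms = election_timeout_ms
        self.elapsed_ms = 0

    def on_heartbeat(self) -> None:
        """A heartbeat (AppendEntries) arrived: restart the countdown."""
        self.elapsed_ms = 0

    def tick(self, ms: int) -> bool:
        """Advance simulated time; return True once the timeout has elapsed."""
        self.elapsed_ms += ms
        return self.elapsed_ms >= self.election_timeout_ms


timer = FollowerTimer(election_timeout_ms=200)

# While the leader is alive, a heartbeat arrives every 50ms (assumed interval),
# so the follower never times out.
timed_out_while_leader_alive = False
for _ in range(10):
    timed_out_while_leader_alive |= timer.tick(50)
    timer.on_heartbeat()

# The leader stops: no more heartbeats, and time keeps passing.
ticks_until_candidate = 0
while not timer.tick(50):
    ticks_until_candidate += 1
```

As long as the heartbeat interval is shorter than the election timeout, the follower stays a follower. Once heartbeats stop, the timer runs out and the follower would start a new election.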
Once we have a leader elected, we need to replicate all changes to our system to all nodes. This is done by using the same AppendEntries message that was used for heartbeats.
First a client sends a change to the leader. The change is appended to the leader's log.
Then the change is sent to the followers on the next heartbeat.
An entry is committed once a majority of nodes, counting the leader itself, have acknowledged it.
A response is then sent to the client.
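
The replication flow above can be sketched end to end as follows. This is a toy model under stated assumptions: `Leader`, `Follower`, and their methods are hypothetical names, messages are direct method calls that never fail, and terms and the log consistency check from the specification are left out.

```python
from typing import List


class Follower:
    def __init__(self):
        self.log: List[str] = []
        self.commit_index = 0  # number of entries known to be committed

    def append_entries(self, entries: List[str], leader_commit: int) -> bool:
        """Write new entries (possibly none, for a pure heartbeat) and ack."""
        self.log.extend(entries)
        self.commit_index = min(leader_commit, len(self.log))
        return True


class Leader:
    def __init__(self, followers: List[Follower]):
        self.followers = followers
        self.log: List[str] = []
        self.commit_index = 0
        self.sent = 0  # entries already shipped to followers (simplification)

    def client_request(self, command: str) -> None:
        """Append the client's change to the leader's log, uncommitted."""
        self.log.append(command)

    def heartbeat(self) -> int:
        """Send pending entries to followers; commit if a majority acked."""
        pending = self.log[self.sent:]
        acks = 1  # the leader's own copy counts
        for follower in self.followers:
            if follower.append_entries(pending, self.commit_index):
                acks += 1
        self.sent = len(self.log)
        if acks > (len(self.followers) + 1) // 2:
            self.commit_index = len(self.log)
        return self.commit_index


f1, f2 = Follower(), Follower()
leader = Leader([f1, f2])
leader.client_request("SET 5")
committed_after_first = leader.heartbeat()   # entry replicated and committed
follower_commit_before = f1.commit_index     # follower has not heard of the commit yet
leader.heartbeat()                           # next heartbeat carries the commit index
```

In this sketch the followers learn that an entry is committed one heartbeat after the leader commits it, because the commit index travels in the next AppendEntries message.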
- Visualisations: https://raft.github.io
- Specification: https://raft.github.io/raft.pdf