Distributed Consensus - Crash Failures Scenario
Theory
Consider the fail-stop model with n processes, where up to processes may fail (and stop). The task is to reach consensus on a particular variable, say (integer). If we are to tolerate up to failures, we run the algorithm for rounds. We argue, using in-line exercises, that at the end of the procedure, local value in all active machines is guaranteed to be the consensus value. Every machine executes the protocol synchronously. This case assumes non-existence of Byzantine processes - i.e. adversary processes which may try to "fool" the system.
Algorithm
Every process runs the following for f+1 iterations:
- If the current value of has not been broadcast then :
- yj ← value (if any) received from process in this round;
- x ← min∀j(x,yj) ;
Final value of x is the consensus value.
Analysis
Agreement
Since there are rounds, there exist at least one round, say , in which all good processes could broadcast their value i.e. no process failed in . As a result, latest value is broadcast and received by all, and all update their values to the minima.
Correctness
Holds because once consensus is achieved among good processes, value cannot change. Processes are non-byzantine, and a failed process never restarts.
Termination
Since is finite, and there are constant number of operations in every iteration, algorithm clearly terminates,