Reaching Agreement in the Presence of Faults

M. Pease, R. Shostak and L. Lamport
Sanjana Patel, Dec 3, 2003

Introduction
• The algorithm proposed in this paper lets independent processes arrive at exact mutual agreement
• The algorithm works when there are at least 3m+1 processes in total, of which at most m are faulty
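The 3m+1 bound can be turned into simple arithmetic. The sketch below is only an illustration; the function names `min_processes` and `max_faults` are hypothetical and do not come from the paper.

```python
def min_processes(m: int) -> int:
    """Smallest total number of processes that tolerates m faulty ones."""
    return 3 * m + 1


def max_faults(n: int) -> int:
    """Largest m that satisfies n >= 3m + 1 (0 if n is too small)."""
    return max((n - 1) // 3, 0)


if __name__ == "__main__":
    for n in (3, 4, 7, 10):
        print(f"n={n}: tolerates up to {max_faults(n)} faulty process(es)")
```

For example, four processes tolerate one fault, while three processes tolerate none.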

Assumptions
• There are n isolated processes, of which no more than m are faulty
• Faulty processes need not be identified
• Processes communicate by means of two-party messages
• The communication channel is fail-safe and has negligible delay
• The sender of a message is identifiable

Goal
• Devise an algorithm based on an exchange of messages that allows each non-faulty process to compute an interactive consistency vector (ICV) of n values such that
  • The non-faulty processes compute exactly the same vector
  • The element of the vector corresponding to a given non-faulty process is the private value of that process
• Meeting these two conditions achieves interactive consistency
• The element corresponding to a faulty process may be arbitrary, as long as all non-faulty processes compute exactly the same value for it
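The two conditions in the goal can be written as a small checker. This is a minimal sketch under assumed data shapes: `vectors`, `private`, and `faulty` are hypothetical names, and process IDs are assumed to map to vector positions in sorted order.

```python
def is_interactively_consistent(vectors, private, faulty):
    """vectors: {pid: vector computed by pid}; private: {pid: private value};
    faulty: set of faulty pids. Vector position i belongs to sorted(private)[i]."""
    order = sorted(private)
    good = [p for p in order if p not in faulty]
    if not good:
        return True
    reference = vectors[good[0]]
    # Condition 1: all non-faulty processes compute exactly the same vector.
    agree = all(vectors[p] == reference for p in good)
    # Condition 2: the element for each non-faulty process is its private value.
    correct = all(reference[order.index(q)] == private[q] for q in good)
    return agree and correct


private = {1: 1, 2: 2, 3: 3, 4: 4}
vectors = {1: [1, 2, None, 4], 2: [1, 2, None, 4], 4: [1, 2, None, 4]}
print(is_interactively_consistent(vectors, private, faulty={3}))
```

The element recorded for the faulty process (here `None`) is unconstrained; it only has to be the same in every non-faulty vector.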

No-Fault Case
• If there are no faults, every process computes the same interactive consistency vector (i.e., each process has an identical vector containing the private value of every process)
[Figure: processes P1, P2, P3, P4 hold private values 1, 2, 3, 4; each computes the vector {1,2,3,4}]

Single-Fault Case
• Consider obtaining interactive consistency for m=1 and n=4
• Two rounds of information exchange are required
  • Round 1: exchange private values
  • Round 2: exchange the results of the first round
• For the faulty process’s ICV element, all non-faulty processes record the majority value, or ‘NIL’ when there is no majority

Single-Fault Case
[Figure: P1, P2, P3, P4 hold private values 1, 2, 3, 4; P3 is faulty and sends 3 to P1, Z to P2, and Y to P4]
Vectors received in round 2:
• At P1: P2:{1,2,Z,4} P3:{1,B,3,4} P4:{1,2,Y,4}
• At P2: P1:{1,2,3,4} P3:{A,2,Z,4} P4:{1,2,Y,4}
• At P4: P1:{1,2,3,4} P2:{1,2,Z,4} P3:{1,2,Y,4}
Based on majority, the ICV used is {1,2,NIL,4}: there is no majority value for P3, because P1, P2, and P4 each hold a different value for it.
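The two-round exchange on this slide can be replayed in a few lines. The sketch below hard-codes the slide's scenario (P3 faulty) and assumes that each process votes over its own directly received value plus the values relayed by the other processes; the names `received`, `p3_claims`, `report`, and `icv` are hypothetical.

```python
from collections import Counter

NIL = None  # recorded when no value has a strict majority
PIDS = (1, 2, 3, 4)
FAULTY = 3


def majority(values):
    """Return the value held by a strict majority of `values`, else NIL."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else NIL


# Round 1: received[p][q] is the value p got directly from q.
received = {
    1: {1: 1, 2: 2, 3: 3, 4: 4},
    2: {1: 1, 2: 2, 3: "Z", 4: 4},
    4: {1: 1, 2: 2, 3: "Y", 4: 4},
}
# Round 2: the vector the faulty P3 claims to have received, per recipient.
p3_claims = {
    1: {1: 1, 2: "B", 3: 3, 4: 4},
    2: {1: "A", 2: 2, 3: "Z", 4: 4},
    4: {1: 1, 2: 2, 3: "Y", 4: 4},
}


def report(r, p):
    """Vector that process r sends to process p in round 2."""
    return p3_claims[p] if r == FAULTY else received[r]


def icv(p):
    """Interactive consistency vector computed by non-faulty process p."""
    vector = []
    for q in PIDS:
        if q == p:
            vector.append(received[p][p])  # own private value
            continue
        # Own direct value for q, plus what every other process relays about q.
        votes = [received[p][q]] + [report(r, p)[q] for r in PIDS if r not in (p, q)]
        vector.append(majority(votes))
    return vector


for p in (1, 2, 4):
    print(f"P{p}: {icv(p)}")
```

Each non-faulty process ends with `[1, 2, None, 4]`: P3's fabricated values A and B are outvoted, and P3's own element has no majority.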

M-Fault Case
• m+1 rounds of information exchange are required to obtain interactive consistency in a system with m faulty processes
• Each vector element is either the majority value or NIL
• If broadcast is used for communication from round 2 onwards, at most n*(m+1) messages are exchanged before agreement is reached

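Impossibility for n < 3m+1
[Figure: n=3, m=1; P1, P2, P3 hold private values 1, 2, 3; P3 is faulty and sends 3 to P1 and Z to P2]
Vectors received in round 2:
• At P1: P2:{1,2,Z} P3:{1,B,3}
• At P2: P1:{1,2,3} P3:{A,2,Z}
There is no majority value for any of the ICV values, so no agreement can be reached.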
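The three-process scenario can be replayed the same way. In this sketch (hypothetical names, slide values hard-coded) each non-faulty process has only two opinions per element, its direct value and one relayed value, so any disagreement leaves no majority.

```python
NIL = None  # recorded when the two available opinions disagree
PIDS = (1, 2, 3)
FAULTY = 3

# Round 1: received[p][q] is the value p got directly from q.
received = {
    1: {1: 1, 2: 2, 3: 3},
    2: {1: 1, 2: 2, 3: "Z"},
}
# Round 2: the vector the faulty P3 claims to have received, per recipient.
p3_claims = {
    1: {1: 1, 2: "B", 3: 3},
    2: {1: "A", 2: 2, 3: "Z"},
}


def icv(p):
    """Vector computed by non-faulty process p when n = 3."""
    vector = []
    for q in PIDS:
        if q == p:
            vector.append(received[p][p])  # own private value
            continue
        relay = next(r for r in PIDS if r not in (p, q))  # the only other process
        relayed = p3_claims[p][q] if relay == FAULTY else received[relay][q]
        direct = received[p][q]
        # With two opinions, a majority exists only if they are identical.
        vector.append(direct if direct == relayed else NIL)
    return vector


print("P1:", icv(1))
print("P2:", icv(2))
```

P1 cannot tell whether P2 or P3 is lying about P2's value, so it cannot even record the correct value for the non-faulty P2.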

Algorithm using Authenticators
• The impossibility of reaching agreement with n < 3m+1 rests on the assumption that a faulty process may refuse to pass on, or may fabricate, the values it received from other processes
• Authentication guards against this: a faulty process may still lie about its own value or refuse to send it, but it cannot relay altered values without other processes being able to identify it as faulty

Algorithm using Authenticators
• An authenticator is a redundant augment appended to the data that only the sender can create
• The receiver should be able to use the authenticator to verify both the sender and that the value was not altered
• Public-key/private-key infrastructure, combined with message hashing, can be used to achieve this

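Example
[Figure: n=3, m=1; P1, P2, P3 hold private values 1, 2, 3; P3 is faulty and sends 3 to P1 and Z to P2]
Vectors received in round 2:
• At P1: P2:{1,2,Z} P3:{1,2,3}
• At P2: P1:{1,2,3} P3:{1,2,Z}
Since P3 cannot lie about P1’s or P2’s values without revealing itself as faulty, an agreement can be reached. The ICV {1,2,NIL} is used.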
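The effect of authenticators in this example can be sketched as follows. This is an illustration only: it uses HMAC from the Python standard library as a stand-in for the public-key signatures mentioned on the slide (with HMAC, whoever can verify could also forge, which a real scheme must avoid), and the keys and the names `sign`, `verify`, and `resolve` are hypothetical.

```python
import hashlib
import hmac

NIL = None
# Hypothetical per-process secrets; a real system would use private/public key pairs.
KEYS = {1: b"secret-p1", 2: b"secret-p2", 3: b"secret-p3"}


def sign(sender, value):
    """Return (sender, value, authenticator) for a value originated by sender."""
    tag = hmac.new(KEYS[sender], repr(value).encode(), hashlib.sha256).hexdigest()
    return (sender, value, tag)


def verify(message):
    """Check that the authenticator matches the claimed sender and value."""
    sender, value, tag = message
    expected = hmac.new(KEYS[sender], repr(value).encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag, expected)


def resolve(messages):
    """One validly signed value: accept it. None, or conflicting ones: NIL."""
    valid = {value for (_, value, _) in filter(verify, messages)}
    return valid.pop() if len(valid) == 1 else NIL


# P1's view. P2's value arrives directly; P3 tries to relay it altered to "B".
from_p2 = sign(2, 2)
forged_relay = (2, "B", from_p2[2])  # altered value, stale authenticator
p2_entry = resolve([from_p2, forged_relay])

# P3 signed 3 for P1 and Z for P2; P2 relays the signed Z to P1.
p3_entry = resolve([sign(3, 3), sign(3, "Z")])

print([1, p2_entry, p3_entry])
```

The altered relay fails verification and is ignored, while the two validly signed but conflicting values from P3 expose it as faulty, so P1 records NIL for P3.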

Conclusion
• The problem of obtaining interactive consistency is fundamental to the design of distributed fault-tolerant systems
• The algorithm is needed for at least three aspects of design
  • Synchronization of clocks
  • Stabilization of input from sensors
  • Agreement on the results of diagnostic tests
• Preliminary research assumed that a simple majority was sufficient. The realization that simple majorities were insufficient led to the results reported in this paper

Q&A?
