SECOND PART: Algorithms for UNRELIABLE Distributed Systems (The consensus problem)

Failures in Distributed Systems
• Link failure: a link fails and remains inactive; the network may become disconnected
• Processor crash: at some point, a processor stops taking steps
• Byzantine processor: a processor changes state arbitrarily and sends messages with arbitrary content (the name dates back to the untrustworthy Byzantine generals of the Byzantine Empire, IV–XV century A.D.)

Link Failures
[Figure: with non-faulty links, every message sent (a, b, c) is delivered; with a faulty link, some of the messages are not delivered.]

Crash Failures
[Figure: a non-faulty processor sends all of its messages (a, b, c); with a faulty processor, some of the messages are not sent.]
[Figure: timeline of rounds 1 to 5 with one failure; after the failure the processor disappears from the network.]

Byzantine Failures
[Figure: a non-faulty processor sends its messages (a, b, c) correctly; a faulty processor sends arbitrary messages, and some messages may not be sent at all.]
[Figure: timeline of rounds 1 to 6 with two failures; after a failure the processor may continue functioning in the network.]

Consensus Problem
• Every processor has an input x ∈ X
• Termination: eventually every non-faulty processor must decide on a value y
• Agreement: all decisions by non-faulty processors must be the same
• Validity: if all inputs are the same, then the decision of every non-faulty processor must equal the common input (this rules out trivial solutions, such as always deciding a fixed value)
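The three requirements can be stated as a small checker. This is only a sketch: the function name `check_consensus`, the dictionary layout, and the use of `None` for "not yet decided" are assumptions made for illustration, not part of the slides.

```python
def check_consensus(inputs, decisions, faulty=frozenset()):
    """Check the three consensus properties for one finished execution.

    inputs:    dict mapping processor id -> input value
    decisions: dict mapping processor id -> decided value (None = undecided)
    faulty:    set of processor ids that failed (hypothetical bookkeeping)
    """
    good = [p for p in inputs if p not in faulty]

    # Termination: every non-faulty processor has decided on some value.
    termination = all(decisions.get(p) is not None for p in good)

    # Agreement: the non-faulty processors decided on at most one value.
    agreement = len({decisions.get(p) for p in good}) <= 1

    # Validity: only constrains executions where all inputs are equal.
    common = set(inputs.values())
    validity = len(common) != 1 or all(decisions.get(p) in common for p in good)

    return {"termination": termination, "agreement": agreement, "validity": validity}


if __name__ == "__main__":
    print(check_consensus({1: 0, 2: 1, 3: 4}, {1: 0, 2: 0, 3: 0}))
    print(check_consensus({1: 1, 2: 1}, {1: 0, 2: 0}))  # validity is violated
```

Note that validity says nothing when the inputs differ, which is why the first call passes even though only one processor started with 0.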

Agreement
[Figure: at the start everybody has an initial value (the processors hold different values); at the finish all non-faulty processors have decided the same value (3 in the example).]

Validity
• If everybody starts with the same value, then the non-faulty processors must decide that value
[Figure: all processors start with 1 and all finish with 1.]

Negative result for link failures
• It is impossible to reach consensus in the presence of link failures, even in the synchronous case, and even if only a single link failure must be tolerated.

Consensus under link failures: the 2 generals problem
• There are two generals of the same army who have encamped a short distance apart.
• Their objective is to capture a hill, which is possible only if they attack simultaneously.
• If only one general attacks, he will be defeated.
• The two generals can communicate only by sending messengers, and a messenger may fail to arrive.
• Is it possible for them to attack simultaneously?

The 2 generals problem
[Figure: generals A and B, with a messenger carrying the message "Let’s attack" between them.]

Impossibility of consensus under link failures
• First, notice that messages must be exchanged to reach consensus (the generals might have different opinions in mind!).
• Assume the problem can be solved, and let Π be a shortest protocol (i.e., one using the minimum number of messages) for a given input configuration.
• Suppose now that the last message in Π does not reach its destination. Since Π is correct, consensus must be reached anyway. Hence the last message was useless: dropping it yields a correct protocol with fewer messages, contradicting the assumption that Π is shortest.

Negative result for processor failures in asynchronous systems
• It is impossible to reach consensus with crash failures in the asynchronous case, even if only a single crash failure must be tolerated.

Assumption on the communication model for crash and Byzantine failures
• Complete undirected graph
• Synchronous network: we assume that messages are sent, delivered and read in the very same round

Overview of Consensus Results • Let f be the maximum number of faulty processors

A simple algorithm for fault-free consensus
Each processor:
• Broadcasts its input to all processors
• Decides on the minimum value received
(Only one round is needed.)

Example (five processors)
• Start: the inputs are 0, 1, 4, 3, 2
• Broadcast values: every processor receives 0,1,2,3,4
• Decide on minimum: every processor decides 0
• Finish: all processors hold the decision 0

This algorithm satisfies the validity condition
• If everybody starts with the same initial value, everybody decides on that value (the minimum)
[Figure: all processors start with 1 and all finish with 1.]
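The one-round algorithm is short enough to simulate directly. The sketch below assumes a hypothetical function name `fault_free_consensus` and processor labels `p0` to `p4`; it models the single broadcast round by giving every processor the full set of inputs.

```python
def fault_free_consensus(inputs):
    """Simulate the one-round min algorithm with no failures.

    inputs: dict mapping processor id -> input value.
    Returns a dict mapping processor id -> decided value.
    """
    # Round 1: everybody broadcasts, so everybody receives every input.
    received = {p: set(inputs.values()) for p in inputs}
    # Each processor decides on the minimum of what it received.
    return {p: min(values) for p, values in received.items()}


if __name__ == "__main__":
    example = {"p0": 0, "p1": 1, "p2": 4, "p3": 3, "p4": 2}
    print(fault_free_consensus(example))                     # everybody decides 0
    print(fault_free_consensus({"p0": 1, "p1": 1, "p2": 1}))  # everybody decides 1
```

Any deterministic rule applied to the same received set would give agreement; taking the minimum also gives validity, because the minimum of identical inputs is that input.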

Consensus with Crash Failures
The simple algorithm doesn’t work:
Each processor:
• Broadcasts its value to all processors
• Decides on the minimum

Example (inputs 0, 1, 4, 3, 2; the processor with input 0 fails)
• Start: the failed processor doesn’t broadcast its value to all processors
• Broadcast values: some processors receive 0,1,2,3,4, the others receive only 1,2,3,4
• Decide on minimum: the former decide 0, the latter decide 1
• Finish: the decisions are 0 and 1. No consensus!
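The failure of the one-round algorithm can be reproduced with a small simulation. This is a sketch under stated assumptions: the name `one_round_with_crash`, the labels `p0` to `p4`, and the choice of which processors the crashing sender still reaches are all hypothetical.

```python
def one_round_with_crash(inputs, crashed, reached):
    """One-round min algorithm where `crashed` fails in the middle of its broadcast.

    inputs:  dict mapping processor id -> input value
    crashed: id of the processor that crashes during round 1
    reached: set of processor ids that still receive the crashed processor's value
    Returns the decisions of the surviving processors.
    """
    # Every surviving processor starts out knowing its own value.
    received = {p: {v} for p, v in inputs.items() if p != crashed}
    for sender, value in inputs.items():
        for p in received:
            # A correct sender reaches everybody; the crashed one only `reached`.
            if sender != crashed or p in reached:
                received[p].add(value)
    return {p: min(values) for p, values in received.items()}


if __name__ == "__main__":
    example = {"p0": 0, "p1": 1, "p2": 2, "p3": 3, "p4": 4}
    decisions = one_round_with_crash(example, crashed="p0", reached={"p2", "p3"})
    print(sorted(decisions.items()))  # p2 and p3 decide 0, p1 and p4 decide 1
```

The processors that saw 0 and those that did not cannot tell each other apart within a single round, which is why more rounds are needed.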

If an algorithm solves consensus in the presence of up to f failed (crashing) processors, we say it is an f-resilient consensus algorithm.

An f-resilient algorithm
• Round 1: broadcast my value
• Round 2 to round f+1: broadcast any newly received values
• End of round f+1: decide on the minimum value received
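The rounds above can be sketched as a synchronous simulation. The function name `f_resilient_consensus`, the processor labels, and the crash schedule format (a crashing processor reaches only a chosen subset of destinations in its last round) are assumptions made for this illustration. The caller is responsible for scheduling at most f crashes.

```python
def f_resilient_consensus(inputs, f, crashes=None):
    """Simulate the (f+1)-round flooding algorithm under crash failures.

    inputs:  dict mapping processor id -> input value
    f:       number of crash failures to tolerate (f+1 rounds are executed)
    crashes: dict mapping processor id -> (round, set of ids still reached
             in that round before the crash); hypothetical schedule format
    Returns the decisions of the processors that never crashed.
    """
    crashes = crashes or {}
    known = {p: {v} for p, v in inputs.items()}  # everything seen so far
    new = {p: {v} for p, v in inputs.items()}    # values not yet forwarded
    alive = set(inputs)

    for rnd in range(1, f + 2):                  # rounds 1 .. f+1
        inbox = {p: set() for p in alive}
        for sender in alive:
            crash = crashes.get(sender)
            crashing = crash is not None and crash[0] == rnd
            for dest in alive:
                # A crashing sender only reaches the destinations in its schedule.
                if dest != sender and (not crashing or dest in crash[1]):
                    inbox[dest] |= new[sender]
        # Processors that crashed in this round take no further steps.
        alive -= {p for p, (r, _) in crashes.items() if r == rnd}
        for p in alive:
            new[p] = inbox[p] - known[p]         # only forward unseen values
            known[p] |= new[p]

    return {p: min(known[p]) for p in alive}


if __name__ == "__main__":
    example = {"p0": 0, "p1": 1, "p2": 2, "p3": 3, "p4": 4}
    # p0 crashes in round 1 reaching only p3; p3 crashes in round 2 reaching only p4.
    schedule = {"p0": (1, {"p3"}), "p3": (2, {"p4"})}
    print(sorted(f_resilient_consensus(example, f=2, crashes=schedule).items()))
    # With the same two crashes but only f+1 = 2 rounds, agreement can break:
    print(sorted(f_resilient_consensus(example, f=1, crashes=schedule).items()))
```

In the first call every surviving processor decides 0. In the second call the schedule contains more crashes than the algorithm was configured to tolerate, and the survivors end up split between 0 and 1.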

Example: f=1 failures, f+1 = 2 rounds needed
• Start: the inputs are 0, 1, 4, 3, 2
• Round 1 (broadcast all values to everybody): the processor with input 0 fails during its broadcast, so 0 is a new value for only some processors; these hold 0,1,2,3,4, the others hold 1,2,3,4
• Round 2 (broadcast all new values to everybody): every remaining processor holds 0,1,2,3,4
• Finish (decide on minimum value): every remaining processor decides 0

Example: f=2 failures, f+1 = 3 rounds needed
• Start: the inputs are 0, 1, 4, 3, 2
• Round 1 (broadcast all values to everybody): the processor with input 0 fails (Failure 1); one processor holds 0,1,2,3,4, the others hold 1,2,3,4
• Round 2 (broadcast new values to everybody): a second processor fails (Failure 2); two processors hold 0,1,2,3,4, the other two still hold 1,2,3,4
• Round 3 (broadcast new values to everybody): every non-faulty processor holds 0,1,2,3,4
• Finish (decide on the minimum value): every non-faulty processor decides 0

Another example execution with 2 failures (f=2 failures, f+1 = 3 rounds needed)
• Start: the inputs are 0, 1, 4, 3, 2
• Round 1 (broadcast all values to everybody): the processor with input 0 fails (Failure 1); one processor holds 0,1,2,3,4, the others hold 1,2,3,4
• Round 2 (broadcast new values to everybody): no processor fails. Remark: at the end of this round all processors know about all the other values
• Round 3 (broadcast new values to everybody): a second processor fails (Failure 2); no new values are learned in this round
• Finish (decide on minimum value): every non-faulty processor decides 0

If there are f failures and f+1 rounds, then there is at least one round in which no processor fails
[Figure: example with 5 failures and 6 rounds; rounds 1 to 6 are shown and one of them has no failure.]

In the algorithm, at the end of the round with no failure:
• Every (non-faulty) processor knows about all the values of all other participating processors
• This knowledge doesn’t change until the end of the algorithm

Therefore, at the end of the round with no failure, everybody would decide the same value.
However, we don’t know the exact position of this round, so we have to let the algorithm execute for f+1 rounds.
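The counting step behind the f+1 bound is a pigeonhole argument: f crashes can touch at most f distinct rounds, so one of the f+1 rounds is left untouched. The brute-force check below is only an illustration; the helper names `clean_rounds` and `always_has_clean_round` are hypothetical.

```python
from itertools import product


def clean_rounds(crash_rounds, total_rounds):
    """Return the rounds (numbered from 1) in which no processor crashes."""
    return [r for r in range(1, total_rounds + 1) if r not in crash_rounds]


def always_has_clean_round(f):
    """Try every way of placing f crashes into f+1 rounds.

    Returns True if each placement leaves at least one failure-free round.
    """
    rounds = range(1, f + 2)
    return all(
        clean_rounds(set(assignment), f + 1)  # a non-empty list is truthy
        for assignment in product(rounds, repeat=f)
    )


if __name__ == "__main__":
    print(clean_rounds({1, 3}, 3))                           # [2]
    print(all(always_has_clean_round(f) for f in range(5)))  # True
```

The check says nothing about which round is failure-free, matching the remark that the position of that round is unknown to the processors.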
