470 likes | 694 Views
Randomized Protocols for Asynchronous Consensus. James Aspnes Distributed Computing (2003) 16: 165 - 175. Consensus problem. Definition: given a group of n processes, they must agree on a value. Requirements: Agreement . All processes that decide choose the same value.
E N D
Randomized Protocols for Asynchronous Consensus James Aspnes Distributed Computing (2003) 16: 165 - 175
Consensus problem • Definition: given a group of n processes, they must agree on a value. • Requirements: • Agreement. All processes that decide choose the same value. • Termination. All non-faulty processes eventually decide. • Validity. The common output value is an input value of some process.
The FLP Impossibility Result • There is no deterministic protocol that satisfies the agreement, termination, and non-triviality conditions for an asynchronous message passing system in which any single process can fail undetectably [Fischer, Lynch, Paterson]. • Consensus is impossible in an asynchronous shared-memory system with at least one undetectable failure [Loui, Abu-Amara].
Circumventing the FLP Result • Timing assumptions. • Adding synchrony to an asynchronous system. • Failure detectors. • Faulty processors are removed. • Strong primitives. • Test-and-set, move-and-swap, etc • Randomized algorithms. • Coin-flip operation.
Randomized Algorithm • Addition of coin-flip to distributed model. • Adversary • A function from partial executions to operations. • Chooses which operations happens next. • Simulates the executing environment. • The distributed algorithm must resist a pernicious adversary.
Types of Adversaries • Strong adversary • can see the entire history of execution: outcomes of coin flips, internal states of processes, contents of messages and memory. • May be too strong… consensus may be too hard. • Weak adversary • Several different models • E.g.: Chooses for each state, which process executes next.
Rand. Consensus’ Requirements • No change to Agreement & Validity. • Termination – for all adversaries, the protocol terminates with probability 1. • Probability of non-termination can be made infinitely small.
Ben-Or’s Consensus Protocol • First protocol to achieve consensus with probabilistic termination in a model with a strong adversary (1983). • Tolerates t < n/2 crash failures. • Requires exponential expected time to converge in the worst case.
Ben-Or’s Consensus Protocol • Operates in rounds, each round has two phases. • Suggestion phase – each process transmits its value, and waits to hear from other processes. • Decision phase – if majority found, take its value. Otherwise, flip a coin to decide a new local value.
Ben-Or’s Consensus Protocol • If enough processes detected the majority, decide. • If I know that someone detected majority, switch to the majority’s value. • Terminates, because eventually, the majority of processes will flip coins correctly. • Pitfall – don’t wait for all processes, because they might be dead.
Ben-Or’s Consensus Protocol Input: Boolean initial consensus value Output: Boolean final consensus value Data: Boolean preference, integer round begin preference := input round := 1 while true do end end Body of while statement
Body of while Statement send (1, round, preference) to all processes wait to receive n – t (1, round, *) messages if received more than n / 2 (1, round, v) messages then send (2, round, v, ratify) to all processes else send (2, round, ?) to all processes end wait to receive n – t (2, round, *) messages If received a (2, round, v, ratify) message then preference = v if received more than t (2, round, v, ratify) messages then output = v end else preference = CoinFlip() end round = round + 1
Ben-Or’s Agreement • At most one value can receive majority in the first stage of a round. • If some process sees t + 1 (2, r, v, ratify), then every process sees at least one (2, r, v, ratify) message. • If every process sees a (2, r, v, ratify) message, every process votes for v in the first stage of r + 1 and decides v in second stage of r + 1 (if it hasn’t decided before).
Ben-Or’s Validity • If all process vote for their common value v in round 1, then all processes send (2, v, 1, ratify) and decide on the second stage of round 1. • Only preferences of one of the processes is sent in the first round.
Ben-Or’s Termination • If no process sees the majority value: • They will flip coins, and start everything again. • Eventually a majority among the non-faulty processes flips the same random value. • The non-faulty processes will read the majority value. • The non-faulty processes will propagate ratify messages, containing the majority value. • Non-faulty process will receive the ratify messages, and the protocol finishes.
Worst Case Probability • Termination: The probability of disagreement for an infinite time is 0. (It is equal to the probability that every turn there will be one 1 and one 0 forever). • Complexity: The chance of n coins to be all 1 (assuming a fair coin) is 2-n. Hence, the expected time of the protocol to converge is O(2n). • This worse case happens if there are (n-1)/2 faulty processes.
Shared-memory Protocols • When a process fails, all the data in the shared memory survives. • Wait free algorithm: resists to up to n-1 failures. • Reads / Writes are atomic operations. • Consensus problem becomes a problem of getting the shared memory into a state that unambiguously determines the selected value.
Chor-Israeil-Li Protocol • Basic idea: a “race” between processes. • Each process writes in the shared memory his turn and his preference. • If there is a process far enough ahead, take its value. • Flip a coin to choose if to advance a turn or not, with chance of 1 / 2n to advance.
Chor-Israeli-Li Protocol Input: Initial preference Output: consensus value Local data: preference, round, maxround Shared data: one single-writer multi-reader register for each process. begin preference := Initial preference; round := 1; end while statement
While Statement while true do write (preference, round) read all registers R maxround := maxR R.round if for all R where R.round >= maxround - 1, R.preference = v then return v else if exists v such that for all R where R.round = maxround, R.preference = v then preference := v end with probability 1/2n do round := round(round + 1, maxround - 2) end
Weak Consensus • Weak adversary can delay particular processes, but cannot identify in which phase each process is. • Weak adversary cannot prevent a winner from emerging. • Strong adversary can keep all the process in the same round. • Stops a process that incremented its value, until all other processes have also done it.
Strong-adversary Protocols • It is possible to solve consensus with a strong adversary. • With techniques similar to Ben-Or’s algorithm we get exponential time complexity. • Solution: Weak shared coin.
Weak Shared Coin Protocol • The protocol returns a bit, 0 or 1, to each process; • The probability that the same value is returned to all invoking processes is at least some fixed parameter ‘e’; • The probability that the protocol returns 1 to all processes equals the probability that the protocol returns 0 to all; • The protocol is wait-free;
Consensus based on WSC • Execute weak shared coin repeatedly, in a framework that detects when all processors agree. • Since probability of success is constant (e > 0), then the expectation of the number of rounds before agreement is reached is constant, e.g. O(1/e). • Overall complexity is dominated by weak shared coin’s complexity.
Consensus based on WSC • “Race” of processors. • “Leaders” are the processors which are the farthest ahead. • If leaders agree on a value, that value will be selected. • If leaders disagree, use weak shared coin for leaders. • Eventually they’ll agree.
Aspnes and Herlihy with Shared Coins while true do mark[p][r] = true if mark[1-p][r+1] then p’=1-p else if mark[1-p][r] then p’=SharedCoin(r) else if mark [1-p][r-1] then p’=p else return p end If mark[p][r+1] = false then p=p’ end r = r +1 end Input: Initial preference value Output: Consensus value Local data: Boolean preference p; Integer round r; Boolean new preference p’; Shared data: Boolean mark[b][i], b in {0, 1}, i in Z+, mark[0][0] = mark[1][0] = true; all other mark[][] are false. Subprotocol: SharedCoinr, r = 0, 1… begin p = input r = 1 end while statement
Protocol’s Method • Processor P registers that he reached turn r with preference p. • If P sees a mark in mark[1-p][r+1], it knows that it is behind, and switches to the preference 1-p. • If the latest mark that it sees is [1-p][r-1], it keeps its current preference. • It does not decide yet, because it may be some processor Q with the opposite preference about to set the mark [1-p][r]. • Q will believe it is tied with Q.
Protocol’s Method • If P sees no mark later than mark[1-p][r-2], then any Q with a different preference has not yet executed the first statement of algorithm in round r-1; • After Q does so, it will see the mark that P already put in [p][r]. • Q will switch to P’s preference. • P can safely decide p.
Protocol’s Method • What if a process S, with preference p, has just advanced to turn r+1? • P discards p’, and keeps p. • New agreement will be tried on turn r+1.
Aspnes and Herlihy Agreement • If some process has decided p, then in r+1, r, r-1 there are no processes that think 1-p. Therefore, even if a process is going to write 1-p in r-1, it will discover p in r and change its value to that value. Therefore, if some process decides a value, all processes decide that value.
Aspnes and Herlihy Validity • No process ever changes its preference, unless it first sees that some other process has a different value.
Aspnes and Herlihy Termination • If leaders agree, then everyone will agree with them. If they don’t agree, then after flipping a coin, they will agree with probability > 0. • Some leaders might miss the coin flipping • They read mark[1-p][r] before others write it. • They all agree, because advanced together. • Process flipping shared coin may agree with then with probability > 0.
Protocol’s Analysis • Each round takes O(1) times O(SharedCoin()). • The expected number of rounds is constant, because probability of agreement ‘e’ is a constant > 0. • Number of rounds = O(1/e); • Therefore, complexity is dominated by SharedCoin’s complexity. • Constant probability as opposed to a function of n, gives us the improvement.
weak shared coin • Bracha and Rachman’s Algorithm • Cast together many votes. • Adversary can stop up to n – 1 votes from being written. • If the number of votes is big enough, the n-1 votes cannot interfere in the majority.
Classification of Votes • Common votes: votes that all the process read. • Extra votes: votes that not all the processes read. • The votes have been written after some processes finished reading the pool. • Hidden votes : the votes not written • The voters have been killed by the adversary.
Vote Distribution • How to distribute votes so that the adversary cannot invalidate the election so often? • C = Common votes: • E = Extra votes: • H = Hidden votes: • Each processor can see: C + E - H
What can the adversary do? Common votes: Extra votes: P1 P2 Pn Pi Pj
Can the adversary do anything? Common votes: Extra votes: P1 P2 Pn Pi Pj
WSC: the Central Idea • Each processor can see: C + E - H • The reason the protocol works is that we can argue that the common votes have at least a constant probability of giving a majority large enough that neither the random drift of the extra votes, nor the selective pressure of the hidden votes is likely to change the apparent outcome.
Vote Distribution • How to distribute votes so that the adversary cannot invalidate the election so often? • C = Common votes: n2 • E = Extra votes: n2/logn • Each processor delivers bunches with n/log2 votes. There are n processors. • H = Hidden votes: n-1
Votes per Processor • Each processor votes several times. • What if a processor is voting alone? • Time complexity could be n2. • In order to reduce this burden, each time a processor votes, it delivers n/logn votes at every time it votes.
Shared Coin Algorithm Input: none Output: Boolean value Local data: Boolean preference p; integer round r; utility variables c, total, and ones Shared data: single-writer register r[p] for each process p, each of which holds a pair of integers (flips, ones), initially (0,0) begin if (total/ones >= 1/2) then return 1 else return 0 end Body of repeat statement
Body of Repeat Statement repeat for i = 1 to n / log n do c = CoinFlip() r[p] = (r[p].flips + 1, r[p].ones + c) end read all registers r[p] total = sum(r[p].flips) until total > n2 read all registers r[p] total = sum(r[p].flips) ones = sum(r[p].ones)
Complexity • Loop may run up to n * logn times (in case a single processor casts all the votes). • Each loop requires reading n registers, hence, the entire cost in time complexity will be n2 * logn. • What would be the complexity if processors wrote one vote at time?
Amortized Termination Test • Biggest contribution of Bracha and Rachman protocol. • Extra votes are not biased by the adversary. • It is possible to tolerate more of them. • processors write n/logn votes at a time. • If each processor wrote one vote per turn, the algorithm could carry out n3 operations.
Shared Coin Algorithm • Every one reads the common votes. • The extra votes are different from process to process, but won’t be more than n2 / logn. • The hidden votes are also different but won’t be more than n - 1. • Common and extra votes aren’t biased, since the adversary can control the order in which votes are written, but not which values are written.
Some Probabilities • Probability of [net majority of 1 in n2 votes is at least 3*n] is a constant p. • Probability of [n2 /logn additional votes, changing the net majority by more than 2*n] is bounded by 1/(n2). • Therefore, even if multiplied by the number of processes, we get a probability of 1/n. • Thus, we have probability of p*(1-1/n) that the hidden votes won’t change the result.