Consensus – Randomized Algorithm

Slides by Prof. Jennifer Welch

Randomized Consensus • To get around the negative results for asynchronous consensus, we can: • weaken the termination condition: nonfaulty processors must decide with some nonzero probability • keep the same agreement and validity conditions • This version of consensus is solvable, in both shared memory and message passing!

Motivation for Adversary • Even without randomization, in an asynchronous system there are many executions of an algorithm, even when the inputs are fixed, depending on when processors take steps, when they fail, and when messages are delivered. • To be able to calculate probabilities, we need to separate out variation due to causes other than the random choices • Group executions of interest so that each group differs only in the random choices • Perform probabilistic calculations separately for each group and then combine somehow

Adversary • The concept used to account for all variability other than the random choices is that of an "adversary". • An adversary is a function that takes an execution prefix and returns the next event to occur. • The adversary must obey the admissibility conditions of the relevant model. • Other conditions might be put on the adversary (e.g., what information it can observe, how much computational power it has).

Probabilistic Definitions • An execution of a specific algorithm, exec(A,C0,R), is uniquely determined by • an adversary A • an initial configuration C0 • a collection of random numbers R • Given a predicate P on executions and a fixed adversary A and initial config C0, Pr[P] is the probability of {R : exec(A,C0,R) satisfies P} • Let T be a random variable (e.g., running time). For a fixed A and C0, the expected value of T is $\sum_{x} x \cdot \Pr[T = x]$, where the sum ranges over all values x of T.

Probabilistic Definitions • We define the expected value of a complexity measure to be the maximum over all admissible adversaries A and initial configurations C0, of the expected value for that particular A and C0. • So this is a "worst-case" average: worst possible adversary (pattern of asynchrony and failures) and initial configuration, averaging over the random choices.

A Randomized Consensus Algorithm • Works in message passing model • Tolerates f crash failures • more complicated version handles Byzantine failures • Works in asynchronous case • circumvents asynchronous impossibility result • Requires n > 2f • this is optimal

Consensus Algorithm
Code for processor pi:
Initially r = 1 and prefer = pi's input
while true do
    votes := get-core(<VOTE,prefer,r>)
    let v be majority of phase r votes
    if all phase r votes are v then decide v
    outcomes := get-core(<OUTCOME,v,r>)
    if all phase r outcome values are w
        then prefer := w
        else prefer := common-coin()
    r := r + 1
Notes: • get-core ensures a high level of consistency between what different processors get • common-coin uses randomization to imitate tossing a coin
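The loop above can be exercised with a small lockstep simulation. This is only a sketch under stated assumptions, not the slides' implementation: `get_core` here is a hypothetical spec-level stand-in that hands every caller an array containing a common core of more than n/2 entries plus a random subset of the rest, `common_coin` is the simplest possible coin (independent fair flips), inputs are binary, no processor crashes, and every processor flips the coin in every phase even if it ends up not using the result.

```python
import random
from collections import Counter

def get_core(values, rng):
    """Hypothetical stand-in: each returned array contains a common core (> n/2 entries)."""
    n = len(values)
    core = set(rng.sample(range(n), n // 2 + 1))
    arrays = []
    for _ in range(n):
        # Each caller also sees an arbitrary subset of the non-core entries.
        seen = core | {j for j in range(n) if rng.random() < 0.5}
        arrays.append([values[j] if j in seen else None for j in range(n)])
    return arrays

def common_coin(n, rng):
    """Stand-in coin: independent fair flips, one per processor."""
    return [rng.randint(0, 1) for _ in range(n)]

def randomized_consensus(inputs, rng, max_phases=10_000):
    """Run all processors in lockstep; return (decisions, phase in which the last one decided)."""
    n = len(inputs)
    prefer = list(inputs)
    decision = [None] * n
    for phase in range(1, max_phases + 1):
        votes = get_core(prefer, rng)
        majority = []
        for i in range(n):
            seen = [v for v in votes[i] if v is not None]
            v = Counter(seen).most_common(1)[0][0]  # ties resolved arbitrarily
            majority.append(v)
            if decision[i] is None and all(x == v for x in seen):
                decision[i] = v
        outcomes = get_core(majority, rng)
        coins = common_coin(n, rng)
        for i in range(n):
            seen = {w for w in outcomes[i] if w is not None}
            prefer[i] = seen.pop() if len(seen) == 1 else coins[i]
        if all(d is not None for d in decision):
            return decision, phase
    raise RuntimeError("no decision within max_phases")

if __name__ == "__main__":
    print(randomized_consensus([0, 1, 0, 1, 1], random.Random(7)))
```

The simulation stops once everyone has decided, whereas the pseudocode loops forever; that shortcut is harmless here because decisions are never revised.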

Properties of Get-Core • Executed by n processors, at most f of which can crash. • Input parameter is a value supplied by the calling processor. • Return parameter is an n-array, one entry per processor • Every nonfaulty processor's call to get-core returns. • There exists a set C of more than n/2 processors such that every array returned by a call to get-core contains the input parameter supplied by every processor in C.

Properties of Common-Coin • Subroutine implements an f-resilient common coin with bias $\rho$. • Executed by n processors, at most f of which can crash. • No input parameter • Return parameter is a 0 or 1. • Every nonfaulty processor's call to common-coin returns. • Probability that every return value is 0 is at least $\rho$. • Probability that every return value is 1 is at least $\rho$.

Correctness of Consensus Algorithm • For now, don't worry about how to implement get-core and common-coin. • Assuming we have subroutines with the desired properties, we'll show • validity • agreement • probabilistic termination (and expected running time)

Unanimity Lemma Lemma (14.6): if all procs. that reach phase r prefer v, then all nonfaulty procs decide v by phase r. Proof: • Since all prefer v, all call get-core with v • Thus get-core returns a majority of votes for v • Thus all nonfaulty procs. decide v

Validity • If all processors have input v, then all prefer v in phase 1. • By unanimity lemma, all nonfaulty processors decide v by phase 1.

Agreement Claim: If pi decides v in phase r, then all nonfaulty procs. decide v by phase r + 1. Proof: Suppose r is earliest phase in which any proc. decides. • pi decides v in phase r • all its phase r votes are v • pi 's call to get-core(<VOTE,prefer,r>) returns more than n/2 non-nil entries and all are <VOTE,v,r> • all entries for procs. in C are <VOTE,v,r>

Agreement • Thus every pj receives more than n/2 <VOTE,v,r> entries • pj does not decide a value other than v in phase r • Also if pj calls get-core a second time in phase r, it uses input <OUTCOME,v,r> • Every pk gets only <OUTCOME,v,r> as a result of its second call to get-core in phase r • pk sets preference to v at end of phase r • in phase r + 1, all prefer v and the Unanimity Lemma implies they all decide v in that phase.

Termination Lemma (4.10): Probability that all nonfaulty procs decide by any particular phase is at least $\rho$. Proof: Case 1: All nonfaulty procs set preference in that phase using common-coin. • Prob. that all get the same value is at least $2\rho$ ($\rho$ for 0 and $\rho$ for 1), by property of common-coin • Then apply Unanimity Lemma (14.6)

Termination Case 2: Some processor does not set its preference using common-coin. • All procs. that don't use common-coin to set their preference for that phase have the same preference, v (convince yourself) • Probability that the common-coin subroutine returns v for all procs. that use it is at least $\rho$. • Then apply the Unanimity Lemma (14.6).

Expected Number of Phases • What is the expected number of phases until all nonfaulty processors have decided? • Probability of all deciding in any given phase is at least $\rho$. • Probability of terminating after i phases is $(1 - \rho)^{i-1}\rho$. • Geometric random variable whose expected value is $1/\rho$.
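The expected value quoted above follows from the standard geometric-series identity. A short derivation, writing q = 1 − ρ (a symbol introduced here only for brevity) and treating ρ as the exact per-phase success probability; since the true probability is at least ρ, the result should be read as an upper bound on the expected number of phases:

```latex
\begin{align*}
E[\text{phases}]
  &= \sum_{i \ge 1} i \, q^{\,i-1} \rho
   = \rho \, \frac{d}{dq} \sum_{i \ge 0} q^{\,i}
   = \rho \, \frac{d}{dq} \left( \frac{1}{1-q} \right) \\
  &= \frac{\rho}{(1-q)^2}
   = \frac{\rho}{\rho^2}
   = \frac{1}{\rho}.
\end{align*}
```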

Implementing Get-Core • Difficulty in achieving consistency of messages is due to combination of asynchrony and crash possibility: • a processor can only wait to receive n - f messages • the first n - f messages that pi gets might not be from the same set of processors as pj 's first n - f messages • Overcome this by exchanging messages three times

Get-Core First exchange ("round"): • send argument value to all • wait for n - f first round msgs Second exchange ("round"): • send values received in first round to all • wait for n - f second round msgs • merge data from second round msgs Third exchange ("round"): • send values received in second round to all • wait for n - f third round msgs • merge data from third round msgs • return result
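The three exchanges can be simulated to see the core appear. This is a rough sketch under simplifying assumptions that are not in the slides: no processor actually crashes, the "first n - f messages to arrive" are modeled as a uniformly random set of n - f senders chosen independently per processor and per exchange, and the names `exchange`, `get_core_sim`, and `common_core` are hypothetical.

```python
import random

def exchange(state, quorum, rng):
    """One exchange: every processor merges the data of the first `quorum` senders it hears."""
    n = len(state)
    new_state = []
    for _ in range(n):
        merged = {}
        for sender in rng.sample(range(n), quorum):
            merged.update(state[sender])
        new_state.append(merged)
    return new_state

def get_core_sim(inputs, f, rng):
    """Return one n-array per processor; None marks an entry that processor never learned."""
    n = len(inputs)
    assert n > 2 * f, "the algorithm requires n > 2f"
    # Before the first exchange, each processor knows only its own argument.
    state = [{i: v} for i, v in enumerate(inputs)]
    for _ in range(3):
        state = exchange(state, n - f, rng)
    return [[s.get(j) for j in range(n)] for s in state]

def common_core(arrays):
    """Indices whose value appears in every returned array."""
    n = len(arrays[0])
    return [j for j in range(n) if all(a[j] is not None for a in arrays)]

if __name__ == "__main__":
    arrays = get_core_sim(list("abcdefg"), f=3, rng=random.Random(1))
    print(common_core(arrays))
```

Running it with different seeds changes which processors end up in the core, but the set of entries common to all returned arrays should always contain more than n/2 indices, matching the consistency property stated for get-core.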

Analysis of Get-Core • Lemmas 14.4 and 14.5 show that it satisfies the desired properties (termination and consistency). • Time is O(1) (using standard way of measuring time in an asynchronous system)

Implementing Common-Coin A simple algorithm: • Each processor independently outputs 0 with probability 1/2 and 1 with probability 1/2. • Bias $\rho = 1/2^n$ • Advantage: simple, no communication • Disadvantage: Expected number of phases until termination is $2^n$

A Common Coin with Constant Bias
c := 0 with probability 1/n, 1 with probability 1 - 1/n
coins := get-core(<FLIP,c>)
if there exists j s.t. coins[j] = 0 then return 0 else return 1
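A minimal sketch of this coin, run for all n processors at once. The `get_core` argument is assumed to be any callable that maps the list of local flips to one array per processor (with `None` for missing entries); `ideal_get_core`, which simply gives everyone the full array, is a hypothetical stand-in used only to make the example runnable.

```python
import random

def common_coin(n, get_core, rng):
    """Return the coin value computed by each of the n processors."""
    # Local flip: 0 with probability 1/n, 1 with probability 1 - 1/n.
    flips = [0 if rng.random() < 1.0 / n else 1 for _ in range(n)]
    arrays = get_core(flips)
    # A processor returns 0 as soon as it sees any 0 among the entries it received.
    return [0 if any(c == 0 for c in coins if c is not None) else 1
            for coins in arrays]

def ideal_get_core(values):
    """Hypothetical stand-in: every processor receives every value."""
    return [list(values) for _ in values]

if __name__ == "__main__":
    print(common_coin(5, ideal_get_core, random.Random(3)))
```

With the idealized stand-in all processors necessarily return the same value; with a real get-core they can disagree only when a 0 was flipped outside the core and seen by some processors but not others.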

Correctness of Common Coin Lemma (14.12): Common-coin implements a (n/2 - 1)-resilient coin with bias 1/4. Proof: Fix any admissible adversary that is weak (cannot see the contents of messages) and any initial configuration. All probabilities are calculated with respect to them.

Probability of Flipping 1 • Probability that all nonfaulty processors get 1 for the common coin is at least the probability that they all set c to 1. • This probability is at least $(1 - 1/n)^n$ • When n = 2, this function is 1/4 • This function increases up to its limit of 1/e. • Thus the probability that all nonfaulty processors get 1 is at least 1/4.

Probability of Flipping 0 • Let C be the set of core processors (whose existence is guaranteed by properties of get-core). • If any processor in C sets c to 0, then all the nonfaulty processors will observe this 0 after executing get-core, and thus return 0. • Probability at least one processor in C sets c to 0 is $1 - (1 - 1/n)^{|C|}$. • This expression is at least 1/4 (by arithmetic).
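The "by arithmetic" step, and the matching bound for flipping 1, can be checked numerically with exact fractions. A small sketch, assuming the smallest core allowed by the get-core property, |C| = floor(n/2) + 1; the function names are illustrative only.

```python
from fractions import Fraction

def prob_all_one(n):
    """Probability that all n processors set c to 1: (1 - 1/n)^n."""
    return (1 - Fraction(1, n)) ** n

def prob_core_has_zero(n):
    """Probability that some processor in a minimal core sets c to 0."""
    core_size = n // 2 + 1  # smallest integer strictly greater than n/2
    return 1 - (1 - Fraction(1, n)) ** core_size

if __name__ == "__main__":
    for n in (2, 3, 10, 100):
        print(n, float(prob_all_one(n)), float(prob_core_has_zero(n)))
```

Both quantities stay at or above 1/4 for every n checked, with equality for the all-ones bound only at n = 2.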

Summary of Randomized Consensus Algorithm • Using the given implementations for get-core and common-coin, we get a randomized consensus algorithm for f crash failures with • n > 2f • O(1) expected time complexity • expected number of phases is 4 • time per phase is O(1)
