CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS. Spring 2014 Prof. Jennifer Welch. Problems Solvable in Failure-Prone Asynchronous Systems.
Set 19: Asynchronous Solvability CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Spring 2014 Prof. Jennifer Welch
Problems Solvable in Failure-Prone Asynchronous Systems
• Although consensus is not solvable in failure-prone asynchronous systems (in either message passing or read/write shared memory), some interesting problems are solvable:
  • set consensus (a weakening of consensus)
  • approximate agreement (a weakening of consensus)
  • renaming (the "opposite" of consensus)
  • k-exclusion (a fault-tolerant variant of mutex)
Model Assumptions
• asynchronous
• shared memory with read/write registers
  • heavy use of atomic snapshot objects
• at most f crash failures of procs
• results can be translated to message passing if f < n/2 (cf. Chapter 10)
  • may be a few asides into message passing
Set Consensus Motivation
• By judiciously weakening the definition of the consensus problem, we can overcome the asynchronous impossibility
• We've already seen a weakening of consensus:
  • weaker termination condition for randomized algorithms
• How about weakening the agreement condition?
• One weakening is to allow more than one decision value:
  • allow a set of decisions
Set Consensus Definition
Termination: Eventually, each nonfaulty processor decides.
k-Agreement (new): The number of different values decided on by nonfaulty processors is at most k.
Validity: Every nonfaulty processor decides on a value that is the input of some processor.
Set Consensus Algorithm
• Uses a shared atomic snapshot object X
  • can be implemented with read/write registers
• update your segment of X with your input
• repeatedly scan X until there are at least n - f nonempty segments
• decide on the minimum value appearing in any segment
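The steps above can be tried out in a small simulation. This is a minimal sketch, not the course's code: the function name `k_set_consensus`, the random scheduler, and the simplification that faulty procs crash before taking any step are all assumptions made for illustration. Each scheduler step is one atomic update or one atomic scan.

```python
import random

def k_set_consensus(inputs, f, crashed=frozenset(), seed=0):
    """Simulate one execution; return {proc index: decision} for nonfaulty procs.

    Assumption: procs listed in `crashed` (at most f of them) never take a step.
    """
    rng = random.Random(seed)
    n = len(inputs)
    X = [None] * n                      # snapshot object, None = empty segment
    updated = set()
    decisions = {}
    pending = [i for i in range(n) if i not in crashed]
    while pending:
        i = rng.choice(pending)         # asynchronous scheduler picks a proc
        if i not in updated:
            X[i] = inputs[i]            # update own segment with own input
            updated.add(i)
        else:
            view = [v for v in X if v is not None]   # atomic scan
            if len(view) >= n - f:      # enough nonempty segments
                decisions[i] = min(view)
                pending.remove(i)
    return decisions

if __name__ == "__main__":
    print(k_set_consensus([5, 3, 9, 1, 7], f=2, crashed={3}, seed=1))
```

With f = 2 the decisions take at most f + 1 = 3 distinct values, whatever the schedule, so the run solves k-set consensus for any k > f.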
Correctness of Set Consensus Algorithm
• Termination: at most f procs crash, so at least n - f procs eventually update their segments and every nonfaulty proc's scan loop ends.
• Validity: every decision is some proc's input
• Why does k-agreement hold?
  • We'll show it does as long as k > f.
  • Sanity check: When k = 1, we have standard consensus, and the condition k > f allows fewer than 1 failure, that is, none.
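k-Set Agreement Condition
• Let S be the set of minimum values in the final scans of the nonfaulty procs; these are the nonfaulty decisions.
• Suppose in contradiction that |S| > f + 1.
• Let v be the largest value in S, the decision of pi.
• The other values in S, at least f + 1 of them, are all smaller than v, so none of them appears in pi's final scan (otherwise pi would not have decided v).
• Each of these values is the input of a different proc, so pi's final scan has at least f + 1 empty segments, i.e., fewer than n - f nonempty ones, contradicting the code.
• Hence |S| ≤ f + 1 ≤ k whenever k > f.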
Synchronous vs. Asynchronous?
• How does the previous, asynchronous, algorithm compare to the synchronous (message-passing) algorithm for k-set consensus from Chapter 5 homework?
• Recall the synchronous algorithm runs in ⌊f/k⌋ + 1 rounds.
Set Consensus Lower Bound
Theorem: There is no asynchronous algorithm for solving k-set consensus in the presence of f failures, if f ≥ k.
• Straightforward extensions of the consensus impossibility result fail; even proving the existence of an initial bivalent configuration is quite involved.
• Original proof of set-consensus impossibility used concepts from algebraic topology
• Textbook's proof uses more elementary machinery, but is still very involved
Approximate Agreement Motivation
• An alternative way to weaken the agreement condition for consensus:
  • Require that the decisions be close to each other, but not necessarily equal
• Seems appropriate for continuous-valued problems (as opposed to discrete)
Approximate Agreement Definition
Termination: Eventually, each nonfaulty processor decides.
ε-Agreement (new): All nonfaulty decisions are within ε of each other.
Validity (new): Every nonfaulty decision is within the range of the input values.
Approximate Agreement Algorithm
• Assume procs know the range from which input values are drawn:
  • let D be the length of this range
• wait-free: up to n - 1 procs can fail
• algorithm is structured as a series of "asynchronous rounds":
  • exchange values via a snapshot object, one per round
  • compute midpoint for next round
• continue until spread of values is within ε, which requires about log2(D/ε) rounds
Approximate Agreement Algorithm
Shared atomic snapshot objects ASO[1], ASO[2], ...
Initially local variable v = pi's input
Initially local variable r = 1
while true do
  update pi's segment of ASO[r] to be v
  let scan be set of values obtained by scanning ASO[r]
  v := midpoint(scan)
  if r = log2(D/ε) + 1 then decide v and terminate
  else r++
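The loop above can be exercised with a small simulation. This is a sketch under stated assumptions rather than the textbook's code: the function name `approximate_agreement` and the random scheduler are hypothetical, no proc crashes, and the round count is rounded up to an integer with `math.ceil`, which the pseudocode does not spell out.

```python
import math
import random

def approximate_agreement(inputs, D, eps, seed=0):
    """Simulate one failure-free execution; return {proc index: decision}."""
    rng = random.Random(seed)
    n = len(inputs)
    # Assumption: round the number of rounds up to an integer (at least 1).
    M = max(1, math.ceil(math.log2(D / eps)) + 1)
    ASO = [[None] * n for _ in range(M + 1)]   # ASO[1..M], one object per round
    v = list(inputs)                           # current estimate of each proc
    r = [1] * n                                # current round of each proc
    next_op = ["update"] * n
    decisions = {}
    active = list(range(n))
    while active:
        i = rng.choice(active)                 # asynchronous scheduler
        if next_op[i] == "update":
            ASO[r[i]][i] = v[i]
            next_op[i] = "scan"
        else:
            scan = [x for x in ASO[r[i]] if x is not None]   # atomic scan
            v[i] = (min(scan) + max(scan)) / 2               # midpoint
            if r[i] == M:
                decisions[i] = v[i]
                active.remove(i)
            else:
                r[i] += 1
                next_op[i] = "update"
    return decisions

if __name__ == "__main__":
    print(approximate_agreement([0.0, 10.0, 4.0, 7.5], D=10.0, eps=0.5, seed=3))
```

Every proc updates ASO[r] before scanning it, so each scan contains the first value written in that round; this is the property the halving argument relies on.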
Analysis of Approx. Agreement Alg.
Definitions w.r.t. a particular execution:
• M = log2(D/ε) + 1
• U_0 = set of input values
• U_r = set of all values ever written to ASO[r]
Helpful Lemma
Lemma (16.8): Consider any round r < M. Let u be the first value written to ASO[r]. Then every value written to ASO[r+1] (every element of U_{r+1}) lies in the range [(min(U_r) + u)/2, (max(U_r) + u)/2], where
min(U_r) ≤ (min(U_r) + u)/2 ≤ u ≤ (max(U_r) + u)/2 ≤ max(U_r).
Implications of Lemma
• The range of values written to the ASO object for round r + 1 is contained within the range of values written to the ASO object for round r.
  • range(U_{r+1}) ⊆ range(U_r)
• The spread (max - min) of values written to the ASO object for round r + 1 is at most half the spread of values written to the ASO object for round r.
  • spread(U_{r+1}) ≤ spread(U_r)/2
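Both implications follow from the lemma in one line each. The derivation below is a sketch that writes U_r for the set of values written to ASO[r] and u for the first value written to ASO[r], as in the lemma; the only fact used beyond the lemma is that u itself is in U_r.

```latex
% Containment: u is in U_r, so \min(U_r) \le u \le \max(U_r), hence
\[
\min(U_r) \;\le\; \frac{\min(U_r) + u}{2}
\quad\text{and}\quad
\frac{\max(U_r) + u}{2} \;\le\; \max(U_r),
\]
% which gives range(U_{r+1}) \subseteq range(U_r).
% Halving: U_{r+1} lies inside an interval whose length is
\begin{align*}
\mathrm{spread}(U_{r+1})
  &\le \frac{\max(U_r) + u}{2} - \frac{\min(U_r) + u}{2} \\
  &= \frac{\max(U_r) - \min(U_r)}{2}
   = \frac{\mathrm{spread}(U_r)}{2}.
\end{align*}
```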
Correctness of Algorithm
• Termination: Each proc executes M asynchronous rounds.
• Validity: The range at each round is contained in the range at the previous round.
• ε-Agreement: spread(U_M) ≤ spread(U_0)/2^M ≤ D/2^M ≤ ε
Handling Unknown Input Range
• Range might not be known.
• Actual range in an execution might be much smaller than maximum possible range.
• First idea: have a preprocessing phase in which procs try to determine input range
  • but asynchrony and possible failures make this approach problematic
• Instead…
Handling Unknown Input Range
• Use just one atomic snapshot object
• Dynamically recalculate how many rounds are needed as more inputs are revealed
• Skip over rounds to try to catch up to maximum observed round
• Only consider values associated with maximum observed round
• Still use midpoint
Unknown Input Range Algorithm
shared atomic snapshot object A; initially all segments hold ⊥
update_i(A, [x, 1, x]), where x is pi's input   // [original input, rd#, current estimate]
repeat
  scan A
  let S be spread of all inputs from scan (ignore ⊥ segments)
  if S = 0 then maxRound := 0
  else maxRound := log2(S/ε)
  let rmax be largest round from scan (ignore ⊥ segments)
  let values be set of estimates in segments with round number rmax
  update_i(A, [x, rmax+1, midpoint(values)])
until rmax ≥ maxRound
decide midpoint(values)
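One pass of the repeat loop is a pure function of the scan result, which makes it easy to check by hand. The sketch below is illustrative only: the name `iteration`, the tuple encoding of segments, the use of `None` for ⊥, and rounding maxRound up with `math.ceil` are assumptions not fixed by the pseudocode.

```python
import math

def iteration(scan, x, eps):
    """One pass of the repeat loop for a proc whose input is x.

    scan: list of segments, each None (empty) or a tuple
          (original input, round number, current estimate).
    Returns (segment to write next, True if the loop exits after this pass).
    """
    segs = [s for s in scan if s is not None]        # ignore empty segments
    inputs = [s[0] for s in segs]
    spread = max(inputs) - min(inputs)
    # Assumption: round log2(S/eps) up to an integer.
    max_round = 0 if spread == 0 else math.ceil(math.log2(spread / eps))
    rmax = max(s[1] for s in segs)
    values = [s[2] for s in segs if s[1] == rmax]    # estimates at round rmax only
    estimate = (min(values) + max(values)) / 2       # midpoint
    return (x, rmax + 1, estimate), rmax >= max_round

if __name__ == "__main__":
    # Two procs visible, both at round 1, inputs 0 and 8, eps = 1.
    print(iteration([(0.0, 1, 0.0), None, (8.0, 1, 8.0)], x=0.0, eps=1.0))
    # A late proc with input 2 sees round 4 and skips ahead to round 5.
    print(iteration([(0.0, 4, 4.0), (2.0, 1, 2.0), (8.0, 3, 5.0)], x=2.0, eps=1.0))
```

In the second call the proc's own round-1 estimate is ignored because only segments at the maximum observed round contribute to the midpoint.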
Analysis of Unknown Input Range Algorithm
Definitions w.r.t. a particular execution:
• U_0 = set of all input values
• U_r = set of all values ever written to A with round number r
• M = largest r s.t. U_r is not empty
With these changes, correctness proof is similar to that for known input range algorithm.
Key Differences in Proof
• Why does termination hold?
  • a proc's local maxRound variable can only increase if another proc wakes up and increases the spread of the observable inputs. This can happen at most n - 1 times.
• Why does ε-agreement hold?
  • If pi's input is observed by pj the last time pj computes its maxRound, same argument as before.
  • Otherwise, when pi wakes up, it ignores its own input and uses values from maxRound or later.
Renaming
• Procs start with unique names from a large domain
• Procs should pick new names that are still distinct but that are from a smaller domain
• Motivation: Suppose original names are serial numbers (many digits), but we'd like the procs to do some kind of time slicing based on their ids
Renaming Problem Definition
Termination: Eventually every nonfaulty proc pi decides on a new name yi
Uniqueness: If pi and pj are distinct nonfaulty procs, then yi ≠ yj.
We are interested in anonymous algorithms: procs don't have access to their indices, just to their original names. Code depends only on your original name.
Performance of Renaming Algorithm
• New names should be drawn from {1,2,…,M}.
• We would like M to be as small as possible.
• Uniqueness implies M must be at least n.
• Due to the possibility of failures, M will actually be larger than n.
Renaming Results
• Algorithm for wait-free case (f = n – 1) with M = n + f = 2n – 1.
• Algorithm for general f with M = n + f.
• Lower bound that M must be at least n + 1, for wait-free case.
  • Proof similar to impossibility of wait-free consensus
• Stronger lower bound that M must be at least 2n – 1, for wait-free case if n satisfies a certain number-theoretic property
  • If n does not satisfy the property, there is a wait-free algorithm with M = 2n – 2. (includes n = 6, 10, 14,...)
Wait-Free Renaming Algorithm
Shared atomic snapshot object A; initially all segments hold ⊥
s := 1   // suggestion for pi's new name
while true do
  update pi's segment of A to be [x,s], where x is pi's original name
  scan A
  if s is also someone else's suggestion (per scan result) then
    let r be rank of x among original names of non-⊥ segments
    let s be r-th smallest positive integer not currently suggested by another proc
  else decide on s for new name and terminate
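The loop above can be run under a simulated scheduler to watch names being claimed. This is a hedged sketch, not the textbook's code: `wait_free_renaming` and the random scheduler are hypothetical, no proc crashes, and the index `i` is used only by the simulator to address segments, so each proc's own logic still depends only on its original name.

```python
import itertools
import random

def wait_free_renaming(names, seed=0):
    """Simulate one failure-free execution; return {original name: new name}."""
    rng = random.Random(seed)
    n = len(names)
    A = [None] * n                     # segment: (original name, suggestion)
    s = [1] * n                        # current suggestion of each proc
    next_op = ["update"] * n
    new_name = {}
    active = list(range(n))
    while active:
        i = rng.choice(active)         # asynchronous scheduler
        if next_op[i] == "update":
            A[i] = (names[i], s[i])
            next_op[i] = "scan"
            continue
        view = list(A)                 # atomic scan
        others = {seg[1] for j, seg in enumerate(view)
                  if seg is not None and j != i}
        if s[i] in others:
            # Rank of own name among original names revealed so far (1-based).
            known = sorted(seg[0] for seg in view if seg is not None)
            r = known.index(names[i]) + 1
            # r-th smallest positive integer not suggested by another proc.
            free = (m for m in itertools.count(1) if m not in others)
            s[i] = next(itertools.islice(free, r - 1, None))
            next_op[i] = "update"
        else:
            new_name[names[i]] = s[i]  # decide; the segment keeps s forever
            active.remove(i)
    return new_name

if __name__ == "__main__":
    print(wait_free_renaming([907, 12, 455, 3001], seed=2))
```

A decided proc never updates again, so its final suggestion stays visible to every later scan, which is what rules out two procs deciding the same name.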
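Analysis of Renaming Algorithm
Uniqueness: Suppose in contradiction pi and pj choose the same new name, s. Without loss of generality, pi's last update before deciding, which suggests s, comes before pj's last update.
• pi's segment holds the suggestion s from that update on, since pi decides s without updating again.
• pj's last scan before deciding s follows pj's last update, and hence pi's last update, so it sees s as pi's suggestion and pj doesn't decide s: contradiction!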
Analysis of Renaming Algorithm
• New name space is {1, …, 2n – 1}.
• Why?
  • rank of a proc pi's original name is at most n (the largest one)
  • worst case is when each of the n – 1 other procs has suggested a different new name for itself, so suggested names are {1, …, n – 1}.
  • Then pi suggests the n-th smallest unsuggested name, n + n – 1 = 2n – 1.
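The worst case above is easy to check numerically. The helper below is a hypothetical illustration (the name `rth_free_name` is not from the slides) of "r-th smallest positive integer not currently suggested by another proc".

```python
from itertools import count, islice

def rth_free_name(r, suggested):
    """Return the r-th smallest positive integer that is not in `suggested`."""
    free = (m for m in count(1) if m not in suggested)
    return next(islice(free, r - 1, None))

if __name__ == "__main__":
    n = 4
    # Worst case: rank n, and the other n - 1 procs suggest 1 .. n - 1.
    print(rth_free_name(n, set(range(1, n))))   # prints 7, which is 2n - 1
```

In general the result is at most r plus the number of suggested names, so with r ≤ n and at most n – 1 other suggestions it never exceeds 2n – 1.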
Analysis of Renaming Algorithm
Termination: Suppose in contradiction some set T of nonfaulty procs never decide in some execution.
• Consider the suffix of the execution in which
  • each proc in T has already done at least one update and
  • only procs in T take steps (others have either already crashed or decided).
Analysis of Renaming Algorithm
• Let F be the set of new names that are free (not suggested at the beginning of the suffix by any proc not in T)
  • the trying procs need to choose new names from this set.
• Let z1, z2, … be the names in F in order.
• By the definition of the suffix, no proc wakes up during it and reveals an additional original name, so all procs in T are working with the same set of original names throughout the suffix.
• Let pi be the proc in T whose original name has smallest rank (among this set of original names). Let r be this rank.
Analysis of Renaming Algorithm
• Eventually procs other than pi stop suggesting zr as a new name:
  • After the suffix starts, every scan indicates a set of free names that is no larger than F.
  • Every trying proc other than pi has a larger rank and thus continually suggests a new name for itself that is larger than zr, once it does its first scan in the suffix.
Analysis of Renaming Algorithm
• Eventually pi does suggest zr as its new name:
  • By choice of zr as the r-th smallest free new name, and the fact that eventually other trying procs stop suggesting z1 through zr, eventually pi sees zr as the free name with r-th smallest rank.
  • Since no other proc suggests zr from then on, pi decides on it.
• Contradicts assumption that pi is trying (i.e., stuck).
• So termination holds.
General Renaming
• Suppose we know that at most f procs will fail, where f is not necessarily n - 1.
• We can use the wait-free algorithm, but it is wasteful in the size of the new name space, 2n – 1, if f < n – 1.
• We can do better (if f < n – 1) with a slightly different algorithm:
  • keep track in the snapshot object of whether you have decided
  • an undecided proc suggests a new name only if its original name is among the f + 1 lowest names of procs that have not yet decided.
k-Exclusion Problem
• A fault-tolerant version of mutual exclusion.
• Processors can fail by crashing, even in the critical section (stay there forever).
• Allow up to k processors to be in the critical section simultaneously.
• If fewer than k processors fail, then any nonfaulty processor that wishes to enter the critical section eventually does so.
k-Exclusion Algorithm
cf. paper by Afek et al. [5].
k-Assignment Problem
• A specialization of k-Exclusion to include:
  • Uniqueness: Each proc in the critical section has a variable called slot, which is an integer between 1 and m. If pi and pj are in the C.S. concurrently, then they have different slots.
• Models situation when there is a pool of identical resources, each of which must be used exclusively:
  • k is number of procs that can be in the pool concurrently
  • m is the number of resources
• To handle failures, m should be larger than k
k-Assignment Algorithm Schema
• k-assignment entry section:
  • k-exclusion entry section
  • renaming using m = 2k-1 names
• k-assignment exit section:
  • k-exclusion exit section
• what about repeated invocations?
k-Assignment Algorithm Schema
• k-assignment entry section:
  • k-exclusion entry section
  • request-name for long-lived renaming using m = 2k-1 names
• k-assignment exit section:
  • release-name for long-lived renaming using m = 2k-1 names
  • k-exclusion exit section
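The nesting in this schema (k-exclusion on the outside, long-lived renaming on the inside) can be illustrated with ordinary thread primitives. The sketch below deliberately swaps in simpler techniques and says so: a `threading.Semaphore` stands in for the k-exclusion algorithm and a lock-protected set stands in for long-lived renaming, so it tolerates no crashes and is not the read/write-register construction from the course. The class and method names are hypothetical.

```python
import threading

class KAssignment:
    """Illustrative only: shows the order of the four sections, not fault tolerance."""

    def __init__(self, k):
        self.k = k
        self.m = 2 * k - 1                        # size of the slot name space
        self._gate = threading.Semaphore(k)       # stand-in for k-exclusion
        self._lock = threading.Lock()             # protects the free-slot set
        self._free = set(range(1, self.m + 1))    # stand-in for long-lived renaming

    def acquire_slot(self):
        self._gate.acquire()                      # k-exclusion entry section
        with self._lock:                          # request-name
            slot = min(self._free)
            self._free.remove(slot)
        return slot

    def release_slot(self, slot):
        with self._lock:                          # release-name
            self._free.add(slot)
        self._gate.release()                      # k-exclusion exit section

if __name__ == "__main__":
    ka = KAssignment(k=3)
    slot = ka.acquire_slot()
    print("holding slot", slot, "of", ka.m)
    ka.release_slot(slot)
```

Because at most k procs are past the gate at any time, at most k take part in naming concurrently, which is why the naming layer can be sized in terms of k rather than n.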