Explore various election algorithms in distributed systems, compare with mutual exclusion, and learn about different types of election algorithms like probabilistic, static priority, and dynamic priority. Dive into the Bully Algorithm for leader election and self-stabilization concepts.
Lecture 4: Elections, Reset. Anish Arora, CSE 763. Notes include material from Dr. Jeff Brumfield
Reading Material • Hector Garcia-Molina, "Elections in a Distributed Computing System", IEEE Transactions on Computers, Vol. C-31, No. 1, January 1982, pp. 48-59 • E. W. Dijkstra, "Self-stabilizing Systems in Spite of Distributed Control", Communications of the ACM, Vol. 17, No. 11, November 1974, pp. 643-644 • A. Arora, M. Gouda, "Distributed Reset", IEEE Transactions on Computers, Vol. 43, No. 9, September 1994, pp. 1026-1038 • Chapters 10 and 12 in Paul Sivilotti’s book
Election in Distributed Systems Problem Select a unique site from a set of candidate sites Selection scheme must not require a coordinator or leader Applications • Selection of a coordinator for mutual exclusion, deadlock detection, two-phase commit, etc • Selection of sites for location of replicated objects • Selection of a site to assume the duties of a failed server
Election Versus Mutual Exclusion Similarities • Both election algorithms and mutual exclusion algorithms select one site from a set of candidate sites • Both types of algorithms must function correctly in the presence of failures Differences • In an election, fairness may not be important. In mutual exclusion, every site should eventually be selected • Every site must know the identity of the site that wins an election. Other sites do not need to know the site selected by a mutual exclusion algorithm
Types of Election Algorithms Probabilistic • Each competing site is equally likely to win the election Static Priority • Each site has a unique predefined priority • The site having the highest priority should win the election Dynamic Priority • Each site has a priority that varies over time • The site having the highest priority at the beginning of the election should win the election
A Probabilistic Algorithm Algorithm (to be carried out by each process) • Generate a random integer, b, uniformly distributed in the interval [0, N-1], where N is the number of processes • Send the selected value to every other process • When the values b_i for i = 0,…,N-1 have been received from all other processes, compute k := (Σ i : 0 ≤ i ≤ N-1 : b_i) mod N • Process k wins the election
A Probabilistic Algorithm Assumptions • Processes participating in the election are known a priori and are numbered 0,1,…,N-1 • Processes do not fail or send inconsistent information Analysis • The number of messages required in an election is N² - N, since each of the N processes sends its value to the N-1 others • If a process follows this algorithm, its probability of winning is 1/N, regardless of the values selected by other processes • All processes determine the same winner
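The sum-mod-N rule above can be sketched in a few lines of Python. This is a minimal single-machine sketch, not a networked implementation: the function names `draw_value`, `winner`, and `run_election` are illustrative assumptions, and message exchange is modeled only by counting messages.

```python
import random


def draw_value(n, rng):
    """One process picks b uniformly from the interval [0, n-1]."""
    return rng.randrange(n)


def winner(values):
    """Every process applies the same rule to the same list of values,
    so all processes determine the same winner k."""
    return sum(values) % len(values)


def run_election(n, seed=None):
    """Simulate one election among n reliable processes (hypothetical helper).

    Returns (k, messages), where messages counts the broadcast of each
    value to the n-1 other processes.
    """
    rng = random.Random(seed)
    values = [draw_value(n, rng) for _ in range(n)]
    messages = n * (n - 1)
    return winner(values), messages
```

Holding the other processes' values fixed and letting one process's value range over 0..N-1 makes the winner range over every process exactly once, which is why an honest process wins with probability 1/N no matter what the others choose.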
Variant of the Probabilistic Algorithm Unknown number of participants N • Generate N-1 values in the intervals [0,N-1], [0,N-2], … , [0,1] • Exchange values with other participants • When the number of participants is determined, use the appropriate set of values as in the previous algorithm
The Bully Algorithm Assumptions • Each process is assigned a unique priority number • The highest priority active process should always win the election • Every process knows of the existence of every other process and its priority number • Processes may fail during the election • A failed process may subsequently recover
The Bully Algorithm
The Algorithm
  Send election message to each higher priority process
  Delay for time T
  If no responses received then
    take over as leader
    inform each lower priority process of change
  Else (* response received *)
    delay for time T’
    If “I am leader” message received
      record this fact
    Else
      restart the algorithm
The Bully Algorithm Run this algorithm if • we receive no response from the leader • we receive an election message from a lower priority process • we have just recovered from failure Analysis • O(N²) messages may be required
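To see where the O(N²) bound comes from, the election can be simulated on one machine. The sketch below is a simplification under stated assumptions: the timeouts T and T’ are replaced by a direct check of which processes are up, no process fails mid-election, and the function name `bully_election` and its arguments are hypothetical.

```python
def bully_election(initiator, priorities, alive):
    """Simulate one bully election (simplified, failure-free during the run).

    priorities: dict mapping process id -> unique priority number
    alive:      set of process ids that are currently up (must contain initiator)
    Returns (leader, message_count).
    """
    messages = 0
    pending = [initiator]   # processes that still have to run the algorithm
    started = set()         # processes that have already run it
    leader = None
    while pending:
        p = pending.pop()
        if p in started:
            continue
        started.add(p)
        higher = [q for q in priorities if priorities[q] > priorities[p]]
        messages += len(higher)              # election messages
        responders = [q for q in higher if q in alive]
        messages += len(responders)          # responses from live processes
        if not responders:
            # No response: take over and inform each lower priority process.
            lower = [q for q in priorities if priorities[q] < priorities[p]]
            messages += len(lower)
            leader = p
        else:
            # Each responder received an election message from a lower
            # priority process, so it runs the algorithm itself.
            pending.extend(responders)
    return leader, messages
```

With five processes whose priorities equal their ids and process 4 down, an election started by process 0 elects process 3 after 19 messages in this model; the count grows quadratically because every live process ends up contacting all of its superiors.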
Self Stabilization • A system is self-stabilizing if, regardless of its initial state, it is guaranteed to arrive at a legitimate state in a finite number of steps • If a failure occurs in a self-stabilizing system, the system will correct itself without any form of outside intervention
Assumptions • Each site has a unique site number • Sites can communicate directly with neighboring sites • Each site maintains knowledge of its functioning neighbors
Objectives • The functioning site having the highest site number is the leader • Every functioning site knows the identity of the leader • Every functioning site knows a functioning path to the leader
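These three objectives define the legitimate states of the system, and they can be written as an executable predicate. The sketch below is an illustration only: it assumes the functioning part of the network is connected, represents each site's knowledge of a path by a single `parent` pointer, and uses a hypothetical function name `is_legitimate`.

```python
def is_legitimate(up, links, leader, parent):
    """Check the three objectives for a connected set of functioning sites.

    up:     set of functioning site numbers
    links:  set of frozenset({a, b}) for each functioning link
    leader: dict site -> site it believes is the leader
    parent: dict site -> next site on its path to the leader
    """
    if not up:
        return True
    top = max(up)  # objective 1: the highest-numbered functioning site leads
    for s in up:
        if leader[s] != top:  # objective 2: every site knows the leader
            return False
        # Objective 3: following parent pointers reaches the leader over
        # functioning links and sites, without looping.
        seen = set()
        cur = s
        while cur != top:
            nxt = parent[cur]
            if cur in seen or nxt not in up or frozenset((cur, nxt)) not in links:
                return False
            seen.add(cur)
            cur = nxt
    return True
```

A predicate like this is handy as a test oracle: after any perturbation, a self-stabilizing election should eventually drive the system back to a state where it returns True.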
Perturbations A perturbation in the system can be caused by a failure, a recovery from a failure, or an enhancement or reconfiguration of the system Possible perturbations to a system: • A site can fail or be removed from the system • A site can recover from failure or be added to the system • A communications link can fail or be removed from the system • A communications link can recover from failure or be added to the system • A variable in a site's local memory can be changed
Arora and Gouda’s Algorithm Each site maintains three variables: • leader - the identity of the site believed to be the leader • parent - the identity of the next node in a path to the leader • dist - the distance to the leader, measured in number of links
Algorithm Structure
This version of the algorithm assumes that a site's local variables cannot be corrupted
begin
  (our leader < self) or (we can’t communicate with parent) → our leader := self; our parent := self
[] (parent’s leader ≠ our leader) → our leader := parent’s leader
[] (a neighbor’s leader > our leader) → our leader := neighbor’s leader; our parent := neighbor
end
Simplified Algorithm
begin
  (leader.i < i) or (parent.i ∉ neighbor.i ∪ {i}) → leader.i, parent.i := i, i
[] parent.i = j and j ∈ neighbor.i and leader.i ≠ leader.j → leader.i := leader.j
[] j ∈ neighbor.i and leader.i < leader.j → leader.i, parent.i := leader.j, j
end
Formation of cycles • The corruption of a site's local variables can produce a cycle in the parent graph • The algorithm must be extended to automatically break cycles • Let K be an upper bound on the number of sites in the system
Complete Algorithm
begin
  (leader.i < i) or (parent.i = i and (leader.i ≠ i or dist.i ≠ 0)) or (parent.i ∉ neighbor.i ∪ {i}) or (dist.i ≥ K) → leader.i, parent.i, dist.i := i, i, 0
[] parent.i = j and j ∈ neighbor.i and dist.j < K and (leader.i ≠ leader.j or dist.i ≠ dist.j+1) → leader.i, dist.i := leader.j, dist.j+1
[] leader.i < leader.j and j ∈ neighbor.i and dist.j < K → leader.i, parent.i, dist.i := leader.j, j, dist.j+1
end
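The three guarded actions can be exercised with a small simulation that starts from a corrupted state. This is a sketch under stated assumptions, not the authors' implementation: sites read their neighbors' variables directly (shared-memory style), a round-robin scheduler runs one action at a time, the first action is checked before the other two, and the names `step` and `stabilize` are hypothetical.

```python
def step(i, neighbors, leader, parent, dist, K):
    """Execute one enabled action of site i, if any. Return True if one ran."""
    nbrs = neighbors[i]
    # Action 1: local state is inconsistent, so become a root.
    if (leader[i] < i
            or (parent[i] == i and (leader[i] != i or dist[i] != 0))
            or parent[i] not in nbrs | {i}
            or dist[i] >= K):
        leader[i], parent[i], dist[i] = i, i, 0
        return True
    # Action 2: copy leader and distance from the parent.
    j = parent[i]
    if (j in nbrs and dist[j] < K
            and (leader[i] != leader[j] or dist[i] != dist[j] + 1)):
        leader[i], dist[i] = leader[j], dist[j] + 1
        return True
    # Action 3: adopt a neighbor that knows a higher leader.
    for j in sorted(nbrs):
        if leader[i] < leader[j] and dist[j] < K:
            leader[i], parent[i], dist[i] = leader[j], j, dist[j] + 1
            return True
    return False


def stabilize(neighbors, leader, parent, dist, K, max_rounds=1000):
    """Run sites round-robin until no action is enabled anywhere.

    Returns True if a fixpoint was reached within max_rounds.
    """
    for _ in range(max_rounds):
        changed = False
        for i in sorted(neighbors):
            changed |= step(i, neighbors, leader, parent, dist, K)
        if not changed:
            return True
    return False


# Line network 0 - 1 - 2 - 3 with a fake leader value 9 and a parent cycle 0 <-> 1.
neighbors = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
leader = {0: 9, 1: 9, 2: 9, 3: 0}
parent = {0: 1, 1: 0, 2: 1, 3: 3}
dist = {0: 1, 1: 2, 2: 0, 3: 3}
converged = stabilize(neighbors, leader, parent, dist, K=4)
```

In this run the distances around the cycle climb until one reaches K, the affected site resets, the fake value 9 drains away, and every site ends up pointing along the line toward site 3.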
Fairness • Minimal: If some program action is enabled, then some enabled action is executed • Weak: If some program action is continuously enabled, then that program action is eventually executed • Process: If some process actions are continuously enabled, then some enabled action of the process is eventually executed • Strong: If some program action is infinitely often enabled, then that program action is infinitely often executed • Other notions: hyperfairness, extreme fairness, … Reference: “Fairness”, by Nissim Francez, Springer-Verlag, 1986
Fairness Theorem: The Arora-Gouda protocol is correct under minimal fairness Corollary: The Arora-Gouda protocol is correct under weak fairness, process fairness, … Proof sketch • Fake leader values disappear. Consider the minimum distance associated with a fake leader value: • (1) This value is non-decreasing • (2) This value eventually increases • (3) K is an upper bound for this value
Fairness • The process with the highest priority elects itself as leader by executing its first action: • Let the highest-priority up process be k • Unless leader.k = k ∧ dist.k = 0 ∧ parent.k = k holds, by (1), (2), (3) the leader value k will disappear, and leader.k < k will be continuously enabled until the first action of k is executed • By induction on d, the distance of a process from process k, argue that all processes at distance d will eventually “correctly join” the tree rooted at k: • Assuming that the tree up to depth d-1 is correctly formed, the second or the third action of a process at distance d is continuously enabled unless the process correctly joins the tree