170 likes | 344 Views
Election. Why Election?. Example 1: A Bank maintains multiple servers, but one special server elected to manage each bank account Example 2: (2 classes ago) In the sequencer-based algorithm for total ordering of multicasts What happens if the “special” sequencer process fails?
E N D
Why Election? • Example 1: A Bank maintains multiple servers, but one special server elected to manage each bank account • Example 2: (2 classes ago) In the sequencer-based algorithm for total ordering of multicasts • What happens if the “special” sequencer process fails? • Leader used to coordinate group of processes • But leader may fail (crash) • Some process detects this • Then what? • Focus of this lecture: Election
Principles of Election • Any process can call for an election. • A process can call for at most one election at a time. • There could be concurrent calls for the same election. • The result of an election does not depend on which process calls for it. • Processes involved in an election are called the “participants” of the election. • The non-failed process with the best election attribute value (e.g., highest id or address, fastest cpu, etc.) is elected. • A run of the election algorithm must always guarantee at the end: • Safety: P (P’s elected = (q: non-failed process with the best attribute value) or ) • Liveness: election( (election terminates) & P: non-crashed participant of election, P’s elected is not )
Ring Election • Processes are organized in a logical ring. • pi has a communication channel to p(i+1) mod N. • All messages are sent clockwise around the ring. • Any process that discovers a coordinator has failed initiates an “election” message that contains its id:attr. • When a process receives an election message, it compares the attr in the message with its own. • If the arrived attr is greater, the receiver forwards the message. • If the arrived attr is smaller and the receiver has not forwarded an election message earlier, it substitutes its own id:attr in the message and forwards it. • If the arrived id:attr is that of the receiver, then this process’s attr must be the greatest, and it becomes the new coordinator. This process then sends an “elected” message to its neighbor announcing the election result. • When a process pi receives an elected message, it • sets its variable electedi id of the message. • forwards the message if it is not the new coordinator.
33 17 4 24 9 1 15 24 28 A Ring-Based Election in Progress • The worst-case scenario occurs when the counter-clockwise neighbor has the highest id. • A total of N-1 messages is required to reach the new coordinator-to-be. • Another N messages are required until the new coordinator-to-be ensures it is the new coordinator. • Another N messages are required to circulate the elected messages. Note: The election was started by process 17.The highest process identifier encountered so far is 24. (final leader will be 33)
Ring-based Election Assume – no failures happen during the run of the election algorithm • Why is Safety satisfied? • Why is Liveness satisfied?
P1 P1 P1 P2 P2 P2 P0 P0 P0 P5 P5 P5 P3 P3 P3 P4 P4 2. P2 receives “election”, P4 dies P4 3. Election: 4 is forwarded for ever? 1. P2 initiates election Example: Ring Election Election: 4 Election: 2 Election: 4 Election: 3 Election: 4 May not work when process failure occurs during the election! Consider above example where attr==highest id
Modification to Ring Election • Processors are organized in a logical ring. • Any processor that discovers a coordinator has failed initiates an “election” message. • The message is circulated around the ring, bypassing failed nodes. • Each node adds its id to the message as it passes it to the next node. • Once the message gets to the initiator, it elects the node with the best election attribute value. • It then sends a “coordinator” message with the id of the newly-elected coordinator. Again, each node adds its id to the end of the message. • Once “coordinator” message gets back to initiator, • election is over if “coordinator” is in id-list. • else the algorithm is repeated.
P1 P1 P1 P1 P1 P1 P2 P2 P2 P2 P2 P2 P0 P0 P0 P0 P0 P0 P5 P5 P5 P5 P5 P5 P3 P3 P3 P3 P3 P3 P4 P4 P4 P4 P4 2. P2 receives “election”, P4 dies P4 3. P2 selects 4 and announces the result 1. P2 initiates election 4. P2 receives “Coord”, but P4 is not included 5. P2 re-initiates election 6. P3 is finally elected Example: Ring Election Election: 2, 3,4,0,1 Election: 2 Coord(4): 2 Election: 2,3 Election: 2,3,4 Coord(4): 2,3 Coord(4) 2, 3,0,1 Coord(3): 2, 3,0,1 Election: 2, 3,0,1 Coord(3): 2,3,0 Election: 2,3,0 Coord(3): 2 Election: 2 Election: 2,3 Coord(3): 2,3
Modified Ring Election • Requires reconfiguration of ring upon failures • Ok if all processes “know” about all other processes in the system • Supports concurrent elections – an initiator with a lower id blocks other initiators’ election message • How would you redesign the algorithm to be fault-tolerant to an initiator’s failure?
Election by the Bully Algorithm • Assumptions: • Synchronous system • All messages arrive within Ttrans units of time. • A reply is dispatched within Tprocess units of time of the receipt of a message. • if no response is received in 2Ttrans + Tprocess, the node is assumed to be dead. • attr=id • Each process knows all the other processes in the system (and thus their id’s) • If a process knows it has the highest id, it can elect itself as the coordinator, it simply sends a coordinator message to all processes with lower identifiers. • A node initiates election by sending an “election” message to the higher numbered nodes. • If no answer, announce itself to lower nodes as coordinator. • if any answer, there is a higher node active, and the node waits for announcement of the new coordinator. • A node that receives an “election” message replies OK and begins another election – unless it has begun one already.
The Bully Algorithm The election of coordinator p2, after the failure of p4 and then p3
P1 P1 P1 P2 P0 P2 P2 P0 P0 Election OK Election Election Election OK P5 P5 P5 P3 P3 P3 Election Election P4 P4 P4 3. P3 & P4 initiate election 2. P2 receives “replies 1. P2 initiates election P1 P1 P1 P2 P2 P0 P0 P2 P0 coordinator P5 P5 P3 P3 P5 OK P3 P4 P4 P4 5. P4 announces itself 5. P4 receives no reply 4. P3 receives reply Example: Bully Election
Performance of Bully Algorithm • Best case scenario: The process with the second highest id notices the failure of the coordinator and elects itself. • N-2 coordinator messages are sent. • Turnaround time is one message transmission time. • Worst case scenario: When the process with the least id detects the failure. • N-1 processes altogether begin elections, each sending messages to processes with higher ids. • The message overhead is O(N2). • Turnaround time is approximately 5 message transmission times.
Summary • Coordination requires a leader process, e.g., sequencer for total ordering in multicasts, Busey bank database example • Leader process might fail • Need to (re-) elect leader process • Three Algorithms • Ring algorithm • Modified Ring algorithm • Bully Algorithm • Is there always a sequence of failures so that no process is ever marked as leader? • Consensus Problem: next class, read the paper on website (see Lectures page) + section 11.1 (you don’t need to read section 11.5)
Voting V.S. Election • Maekawa’s algorithm is an example of voting • Processors are not aware of the result of their vote • Failure does not figure in the voting • Elections are initiated to select a coordinator or to grant special privileges to a process • All processes are informed of the result. • Election is usually initiated when processor failure occurs. P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P15 P16 Can this lead to a deadlock? Voting set for processor P7