Leader Election CS 271
Election Algorithms
• Many distributed algorithms need one process to act as coordinator
• It doesn’t matter which process does the job; the algorithm just needs to pick one
• Election algorithms: techniques for picking a unique coordinator (also known as leader election)
• Types of election algorithms: Bully and Ring algorithms
Bully Algorithm
• Each process has a unique numerical ID
• Each process knows the IDs and addresses of all other processes
• Communication is assumed reliable
• Key idea: select the process with the highest ID
• A process initiates an election if it has just recovered from failure or if the coordinator has failed
• 3 message types: Election, OK, I won
• Several processes can initiate elections simultaneously
• The result must still be consistent
Bully Algorithm Details
• Any process P can initiate an election
• P sends Election messages to all processes with higher IDs and awaits OK messages
• If it receives no OK messages, P becomes coordinator and sends I won to all processes with lower IDs
• If it receives an OK, it drops out and waits for an I won message
• If a process receives an Election message, it returns OK and starts an election of its own
• If a process receives I won, it treats the sender as coordinator
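
The rules above can be sketched as a small single-threaded simulation. This is only an illustrative sketch, not the course's implementation: the function name `bully_election`, the message log, and the setup of eight processes 0 to 7 with process 7 crashed are assumptions made for the example, and message delivery is modelled as instantaneous and reliable.

```python
from collections import deque

def bully_election(all_ids, alive, initiator):
    """Simulate one bully election; return (coordinator, message log)."""
    log = []                      # (sender, receiver, message type)
    pending = deque([initiator])  # processes that still have to hold an election
    held = set()                  # processes that already held one
    coordinator = None
    while pending:
        p = pending.popleft()
        if p in held:
            continue
        held.add(p)
        responders = []
        for q in sorted(i for i in all_ids if i > p):
            log.append((p, q, "ELECTION"))
            if q in alive:        # a live higher process answers OK
                log.append((q, p, "OK"))
                responders.append(q)
        if responders:
            pending.extend(responders)   # p drops out; responders take over
        else:
            coordinator = p              # nobody higher answered: p wins
            for q in sorted(i for i in all_ids if i < p):
                log.append((p, q, "I_WON"))
    return coordinator, log

# Assumed setup: processes 0..7, process 7 (the old coordinator) has crashed.
all_ids = range(8)
alive = set(range(7))
coordinator, log = bully_election(all_ids, alive, initiator=4)
print("coordinator:", coordinator)
print("messages sent:", len(log))
```

Under these assumptions process 4 is told to stop by 5 and 6, process 5 is told to stop by 6, and process 6 ends up as coordinator.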
Bully Algorithm Example
• Process 4 holds an election
• Processes 5 and 6 respond, telling 4 to stop
• Now 5 and 6 each hold an election
Bully Algorithm Example
• Process 6 tells 5 to stop
• Process 6 wins and tells everyone
Simple Ring-based Election
• Processes have unique IDs and are arranged in a logical ring
• Each process knows its neighbors
• Select the process with the highest ID as leader
• Begin an election if just recovered or if the coordinator has failed
• Send an Election message to the closest downstream node that is alive
• Sequentially poll each successor until a live node is found
• Each process tags its ID on the message
• When the message returns, the initiator picks the node with the highest ID and sends a coordinator message
• Multiple elections can be in progress at once; this does no harm
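
A minimal sketch of one pass of the Election message around the ring, under stated assumptions: the ring is a Python list in ring order, liveness is a set lookup rather than a timeout, and the names `ring_election` and `next_alive` are hypothetical. The final coordinator message is not simulated; the function only returns the leader the initiator would announce.

```python
def ring_election(ring, alive, initiator):
    """Circulate one Election message; return (leader, collected IDs, hops)."""
    n = len(ring)

    def next_alive(i):
        # Poll successors one by one until a live node is found.
        j = (i + 1) % n
        while ring[j] not in alive:
            j = (j + 1) % n
        return j

    start = ring.index(initiator)
    collected = [initiator]       # IDs tagged onto the message so far
    i = next_alive(start)
    hops = 1                      # Election messages sent so far
    while i != start:
        collected.append(ring[i])  # each live process tags its own ID
        i = next_alive(i)
        hops += 1
    leader = max(collected)       # initiator picks the highest ID
    return leader, collected, hops

# Assumed setup: ring 0..7, process 7 has crashed, process 2 notices.
ring = list(range(8))
alive = set(range(7))
leader, collected, hops = ring_election(ring, alive, initiator=2)
print("leader:", leader)
print("IDs on the message:", collected)
```

In this run the message skips the crashed process 7 on its way from 6 to 0, and the initiator selects 6 as leader.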
Ring Algorithm Example
[Figures: ring algorithm example, two slides]
Comparison
• Assume n processes and one election in progress
• Bully algorithm
  • Worst case: the initiator is the node with the lowest ID
  • This triggers n-2 elections at higher-ranked nodes: O(n²) messages
  • Best case: immediate election: n-2 messages
• Ring
  • 2(n-1) messages always
Highlights of Leader Election
• Basic idea: each process has a unique process ID
• Once the leader is discovered to have died, elect the process with the highest (or lowest) process ID
Broadcast Protocols
Broadcast Protocols
• Why broadcast protocols?
  • Data replication
  • Highly available servers
  • Cluster management
  • Distributed logging
  • …
• A message is sometimes received but delivered only later, in order to satisfy an ordering requirement
Ordering properties: FIFO (Cornell)
• FIFO or sender-ordered multicast: fbcast
• Messages are delivered in the order they were sent (by any single sender)
[Figure: timelines of processes p, q, r, s with messages a and e]
Ordering properties: FIFO
[Figure: timelines of processes p, q, r, s with messages a, b, c, d, e]
• Delivery of c to p is delayed until after b is delivered
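
One common way to get FIFO delivery is per-sender sequence numbers with a hold-back queue. The slides do not say how fbcast is implemented, so the sketch below is an assumption: the class `FifoReceiver` and its fields are hypothetical, and b and c are assumed to come from the same sender, here called "q".

```python
from collections import defaultdict

class FifoReceiver:
    """Delivers each sender's messages in the order that sender sent them."""

    def __init__(self):
        self.next_seq = defaultdict(int)  # sender -> next expected sequence number
        self.held = defaultdict(dict)     # sender -> {seq: payload} received early
        self.delivered = []               # (sender, payload) in delivery order

    def receive(self, sender, seq, payload):
        # Receiving only buffers the message; delivery happens when it is in order.
        self.held[sender][seq] = payload
        while self.next_seq[sender] in self.held[sender]:
            s = self.next_seq[sender]
            self.delivered.append((sender, self.held[sender].pop(s)))
            self.next_seq[sender] += 1

p = FifoReceiver()
p.receive("q", 1, "c")        # c arrives first and is held back
held_back = list(p.delivered)
p.receive("q", 0, "b")        # b arrives; b and then c are delivered
print(held_back, p.delivered)
```

The receiver separates receiving from delivering: c is received first but is only delivered once b, which was sent earlier by the same sender, has been delivered.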
Limitations of FIFO Broadcast
Scenario:
• User A broadcasts a message to a mailing list
• B delivers that message
• B broadcasts a reply
• C delivers B’s response without A’s original message and misinterprets it
Ordering properties: Causal
• Causal or happens-before ordering: cbcast
• If send(a) → send(b), then deliver(a) occurs before deliver(b) at common destinations
[Figure: timelines of processes p, q, r, s with messages a and b]
Ordering properties: Causal
[Figure: timelines of processes p, q, r, s with messages a, b, c]
• Delivery of c to p is delayed until after b is delivered
Ordering properties: Causal
[Figure: timelines of processes p, q, r, s with messages a, b, c, e]
• Delivery of c to p is delayed until after b is delivered
• e is sent (causally) after b
Ordering properties: Causal
[Figure: timelines of processes p, q, r, s with messages a, b, c, d, e]
• Delivery of c to p is delayed until after b is delivered
• Delivery of e to r is delayed until after b and c are delivered
Limitation of Causal Broadcast
• Causal broadcast does not impose any order on unrelated messages
• Two replicas can therefore deliver operations or requests in different orders
Ordering properties: Total
• Total or locally total multicast: atomic bcast
• Messages are delivered in the same order to all recipients (including the sender)
[Figure: timelines of processes p, q, r, s with messages a, b, c, d, e]
• All deliver a, b, c, d, then e
Simple Causal Broadcast Protocol
• Each broadcast message carries all causally preceding messages
• Before delivering a message, ensure causality by first delivering any causally preceding messages that were missed
Isis Causal Broadcast
• Each process maintains a time vector VT of size n
• Initially VT[i] = 0 for all i
• When p sends a new message m: VT[p]++
• Each message m is piggybacked with VTm, the current VT of the sender
• When p delivers a message, p updates its vector: for k in 1..n: VTp[k] = max{ VTp[k], VTm[k] }
Isis Causal Order
• Requirement for delivery at node j (VTsender is the vector carried by the message):
  • VTsender[sender] = VTreceiver[sender] + 1
    • This is the next message from sender
  • VTsender[k] ≤ VTreceiver[k] for all k not sender
    • Receiver has received all causally preceding messages
[Figure: sender and receiver with their vectors VTsender and VTreceiver]
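
The delivery test and the vector update can be sketched as follows. This is a simplified sketch under assumptions: processes are numbered from 0, a message is the tuple (sender, vector, payload), a process does not receive its own broadcast, and the names `can_deliver` and `CausalProcess` are hypothetical.

```python
def can_deliver(vt_msg, vt_recv, sender):
    """True if the message is the next one from sender and nothing it depends on is missing."""
    next_from_sender = vt_msg[sender] == vt_recv[sender] + 1
    no_missing_deps = all(vt_msg[k] <= vt_recv[k]
                          for k in range(len(vt_msg)) if k != sender)
    return next_from_sender and no_missing_deps

class CausalProcess:
    def __init__(self, pid, n):
        self.pid = pid
        self.vt = [0] * n      # time vector, initially all zero
        self.pending = []      # received but not yet deliverable
        self.delivered = []    # payloads in delivery order

    def send(self, payload):
        self.vt[self.pid] += 1                  # VT[p]++ on send
        return (self.pid, list(self.vt), payload)

    def receive(self, msg):
        self.pending.append(msg)
        progress = True
        while progress:                         # retry until nothing more unblocks
            progress = False
            for m in list(self.pending):
                sender, vt_msg, payload = m
                if can_deliver(vt_msg, self.vt, sender):
                    self.pending.remove(m)
                    self.delivered.append(payload)
                    # component-wise maximum of own vector and message vector
                    self.vt = [max(a, b) for a, b in zip(self.vt, vt_msg)]
                    progress = True

p0, p1, p2 = (CausalProcess(i, 3) for i in range(3))
a = p0.send("a")
p1.receive(a)
b = p1.send("b")      # b causally follows a
p2.receive(b)         # held: p2 has not yet delivered a
p2.receive(a)         # a is delivered, which unblocks b
print(p2.delivered)
```

Here b carries the vector [1, 1, 0], so p2 holds it back until it has delivered a, after which both conditions are met.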
Total order
• Different classes of total order broadcast:
  • Fixed sequencer
  • Moving sequencer using a token
  • Distributed agreement using timestamps
Using Sequencer (Amoeba)
• The delivery algorithm is similar to FIFO, except that a special “sequencer” orders the messages
• The sender attaches a unique id i to each message m and sends <m, i> to the sequencer as well as to all destinations
• The sequencer maintains a sequence number S (consecutive and increasing) and broadcasts <i, S> to all destinations
• Message(k) is delivered if all messages(j) (0 ≤ j < k) have been received
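
A minimal sketch of the fixed-sequencer scheme, assuming sequence numbers start at 0 and that each destination buffers both payloads and order announcements until it can deliver in sequence. The class and method names (`Sequencer.assign`, `Destination.on_message`, `Destination.on_order`) are hypothetical and are not Amoeba's API.

```python
class Sequencer:
    """Hands out consecutive, increasing sequence numbers."""

    def __init__(self):
        self.next_s = 0

    def assign(self, msg_id):
        s = self.next_s
        self.next_s += 1
        return (msg_id, s)          # the <i, S> pair broadcast to destinations

class Destination:
    def __init__(self):
        self.payloads = {}          # msg_id -> payload, received from senders
        self.seq_to_id = {}         # S -> msg_id, received from the sequencer
        self.next_s = 0             # next sequence number to deliver
        self.delivered = []

    def on_message(self, payload, msg_id):
        self.payloads[msg_id] = payload
        self._try_deliver()

    def on_order(self, msg_id, s):
        self.seq_to_id[s] = msg_id
        self._try_deliver()

    def _try_deliver(self):
        # Deliver while the next sequence number is known and its payload has arrived.
        while (self.next_s in self.seq_to_id
               and self.seq_to_id[self.next_s] in self.payloads):
            msg_id = self.seq_to_id.pop(self.next_s)
            self.delivered.append(self.payloads.pop(msg_id))
            self.next_s += 1

seq = Sequencer()
o1 = seq.assign("m1")
o2 = seq.assign("m2")

d1, d2 = Destination(), Destination()
d1.on_message("x", "m1"); d1.on_message("y", "m2")
d1.on_order(*o1); d1.on_order(*o2)
# d2 sees everything in a different arrival order
d2.on_order(*o2); d2.on_message("y", "m2")
d2.on_message("x", "m1"); d2.on_order(*o1)
print(d1.delivered, d2.delivered)
```

Both destinations end up delivering in the sequencer's order even though the payloads and order announcements reached them in different orders.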
Distributed Total Order Protocol (ISIS)
• Processes collectively agree on sequence numbers (priorities) in three rounds
• The sender sends message <m, id> to all receivers
• Receivers suggest a priority (sequence number) and reply to the sender with the proposed priority
• The sender collects all proposed priorities, decides on the final priority (breaking ties with process IDs), and sends the agreed final priority for message m to all receivers
• Receivers deliver message m according to the decided final priority
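
The three rounds can be sketched as below. Several details are assumptions not stated on the slide: a priority is modelled as a (number, process id) tuple so that ties are broken by process ID, each receiver proposes one more than the largest number it has seen, the sender takes the maximum proposal as the final priority, and a receiver delivers from the front of its queue only once the front entry is final. The names `IsisReceiver`, `propose`, `finalize` and `agree` are hypothetical.

```python
class IsisReceiver:
    def __init__(self, pid):
        self.pid = pid
        self.max_seen = 0      # largest priority number proposed or agreed so far
        self.queue = {}        # msg_id -> [number, pid, is_final, payload]
        self.delivered = []

    def propose(self, msg_id, payload):
        # Round 2: suggest a priority higher than anything seen so far.
        self.max_seen += 1
        self.queue[msg_id] = [self.max_seen, self.pid, False, payload]
        return (self.max_seen, self.pid)

    def finalize(self, msg_id, final):
        # Round 3: record the agreed priority, then deliver what is ready.
        number, pid = final
        self.max_seen = max(self.max_seen, number)
        self.queue[msg_id][:3] = [number, pid, True]
        while self.queue:
            head = min(self.queue, key=lambda m: tuple(self.queue[m][:2]))
            if not self.queue[head][2]:
                break          # lowest-priority entry is not final yet: wait
            self.delivered.append(self.queue.pop(head)[3])

def agree(proposals):
    """Sender side: the final priority is the largest proposed (number, pid)."""
    return max(proposals)

r1, r2 = IsisReceiver(1), IsisReceiver(2)
# Round 1: m1 and m2 reach the two receivers in opposite orders.
p_m1 = [r1.propose("m1", "a")]
p_m2 = [r1.propose("m2", "b")]
p_m2.append(r2.propose("m2", "b"))
p_m1.append(r2.propose("m1", "a"))
final_m1, final_m2 = agree(p_m1), agree(p_m2)
r1.finalize("m1", final_m1); r1.finalize("m2", final_m2)
r2.finalize("m2", final_m2); r2.finalize("m1", final_m1)
print(r1.delivered, r2.delivered)
```

In this run both messages get the number 2, the tie is broken by process ID, and both receivers deliver b before a.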
ISIS algorithm for total ordering
[Figure: group g = {P1, P2, P3, P4}; arrows labelled 1 Message, 2 Proposed Seq, 3 Agreed Seq]