200 likes | 208 Views
This article discusses the efficient implementation of the Omega failure detector in a distributed system, including the necessary conditions and algorithms for its implementation. The article also explores the limitations and challenges of implementing Omega in different communication models.
E N D
Packet Efficient Implementation of the Omega Failure Detector x Quentin Bramas Dianne Foreback Mikhail Nesterenko SébastienTixeuil w y Ω z Lyon, France November 8, 2016
What is Omega? Oracle • Abstract construct that enables a solution to an otherwise unsolvable problem in a particular distributed system model • Encapsulates the portion of solution that is impossible to implement Consensus requires non-crashed (correct) processes to agree on a value • FLP [11]: Consensus is impossible to solve in asynchronous system model even a if single process may crash: impossible to know if a process crashed or is merely slow • requires an oracle Omega - an oracle that eventually outputs the same leader at every correct process • Weakest failure detector to solve consensus in the asynchronous system model [6] • Encapsulates the least amount of synchrony for a solution to Consensus
Notation • Message – contents to be distributed to other processes (using packets) • Packet – portion of data transmitted across channel • Fair-lossy channel - if an infinite number of messages are sent along this channel, an infinite number of messages are received implementation efficiency metrics • Message Efficient Implementation: only a single process sends an infinite number of messages and all but finitely many messages are of constant size • Packet Efficient Implementation: if all but finitely many messages are transmitted using O(n) packets. • Packets of different messages may potentially use different channels ⇒ the number of used channels is not limited • Super Packet Efficient Implementation: requires only a O(n) channels to transmit packets of an infinite number messages ⇒ the number of used channels is limited
Omega Implementations Implementation conditions • Omega is impossible to implement in the asynchronous system model • Must strengthen the system model with added synchrony and reliability assumptions • What is the least restrictive communication model to implement Omega? Aguilera et al. [1] algorithm • Requires at least one process to have eventually timely channels to all correct processes • Requires a (possibly different) single process to have fair-lossy channels to/from all others • Message efficient Delporte-Gallot et al. [9] algorithm determines timely channel graph • Can be used to construct Omega • Requires at least one process to have timely paths to all correct processes • Requires all channels to be reliable • Super packet efficient timely channel drops all packets If channel timeliness/reliability is arbitrary, what are the necessary&sufficient conditions for Omega implementation?
Single-hop vs. Multi-hop Theorems 1 and 2 (summarized): If timely channel probability isthen, as network sizengrows, the probability of leader existence using • Single-hop direct communication: approaches zero exponentially fast • Multiple-hop communication: approaches oneexponentially fast Theorems 3 and 4 (summarized): The probability of a leader persisting while the timeliness of channels change • Single-hoptends to zero • Multiple-hop tends to infinity Practically speaking, a multi-hop omega implementation is far more likely to succeed in establishing a persistent leader v u w x t z y
Our Deterministic Results • Prove necessary conditions for Omega • At least one process (leader) needs to have a timely path to all correct processes • All processes have to have a fair-lossy path to the leader • Present an algorithm MPO that implements Omega under these conditions ⇒ the conditions are necessary and sufficient • Prove cannot have super packet-efficient algorithm ⇒ Algorithm of Delporte-Gallotet al. channel reliability requirement is necessary
Infinite Number of Messages Theorem 5 and Corollary 1 (summarized): In an implementation of Omega, at least one process needs to send infinitely many timely messages. This process must be the leader. x Intuition: The leader needs to periodically inform other processes of its correctness or they will not be able to detect its crash Why?:Cannot distinguish between crashed and slow process thus toggling between leaders may occur • Let process x be the leader • Let v, w and z crash • If x stops sending timely messages, y thinks x crashed and elects itself as leader • Send messages arbitrarily and now x picks y as leader • Repeat as above but with y y w X X X v z
No Channel Reliability/Timeliness Lemma 1 and Corollary 2 (summarized): If no channel reliability/timeliness properties, to timely deliver a message • Each recipient must send it across every outgoing channel (except possibly the channels to the leader and sender) • packets are needed (not packet efficient) Intuition: Cannot establish properties, have to send packets everywhere otherwise might skip be the channel that contains the only timely path leading to some process x y w v z X
Fair-Lossy Path Lemma 2: In any message efficient implementation of Omega, each correct process must have a fair-lossy path to the leader Intuition: must be able to provide feedback if the leader is no longer timely. • Let process x be the leader • If y no longer considers x its leader and gets no other messages, y elects itself leader • y does not send fair-lossy messages so others do not know of y’sdecision • Thus, no agreement on leader x y w v z
Eventually Timely Paths Theorem 6: For a message and packet efficient implementation • At least one process x must have an eventually timely path to all correct processes • every correct process has a fair-lossy path to the leader Intuition: • The leader xmust send an infinite number of messages (so others do not think it crashed) • If a proportional number of processes do not have timely incoming channels, then all processes must forward each message • ⇒ (per Lemma 2) number of packets required for each message is in - contradiction x y w v z
Our Deterministic Results • Prove necessary conditions for Omega • At least one process (leader) needs to have a timely path to all correct processes • All processes have to have a fair-lossy path to the leader • Present an algorithm MPO that implements Omega under these conditions ⇒ the conditions are necessary and sufficient • Prove cannot have super packet-efficient algorithm ⇒ Algorithm of Delporte-Gallot channel reliability requirement is necessary
No Super Packet Efficiency Theorem 7: Even if there the leader has an eventually timely path to every correct process and every correct process has a fair-lossy path to the leader, there does not exist a message and super packet efficient implementation of Omega S2 S1 eventuallytimely channel L1 L2 x y leader leader w z timelycomponent eventuallyreliablechannel timelycomponent Intuition: • Assume there are an equal number of processes in each component • If there is no communication between processes of the two components initially, processes in S1 assume those in S2are crashed and vice versa • Processes in S1 elect L1 from S1as their leader and processes in S2 elect L1from S2as their leader • The eventuality of the timely and reliable channels occur after the leaders are elected Thus, the conditions of the theorem exist. Yet, disagreement on the leader occurs
Our Deterministic Results • Prove necessary conditions for Omega • At least one process (leader) needs to have a timely path to all correct processes • All processes have to have a fair-lossy path to the leader • Present an algorithm MPO that implements Omega under these conditions ⇒ the conditions are necessary and sufficient • Prove cannot have super packet-efficient algorithm ⇒ Algorithm of Delporte-Gallot channel reliability requirement is necessary
MPO Algorithm: Features w x alive(x, phase#, shoutID) • Leader candidate estimates reliability of channels by building arboresence of timely channels • Sends packets over these channels • Gets failed messages if untimely • Process with lowest weight arborescence wins • Weight used to evaluate channel reliability • Proceeds in phases to preserve efficiency • Messages sent an infinite number of times are of constant size x y w X 4 y v z w v z startPhase(x, phase#, arb)
x Takes Leadership • x calculated weight of arbs from all processes and it has the lowest arb • x broadcasts startPhase sending the arb to limit the channels being used x y w startPhase(x, phase#, arb) v z
Leader x Sends alive Goal: alive timely received by all correct processes according to arborescence per phase#, otherwise, x keeps track of failed to build new arb should it lose leadership w x alive(x, phase#, shoutID) x y w X 4 y w v z 5 v z failed(x,y,z) Not timely received by z • z broadcastsfailed, and increases timer to gauge delay • All other processes broadcast z’sfailed once • x receives failed for the first time, increases yzweight—used to calculate arbs for next phase (should it lose leadership) Timely received via the arborescence: • w, y timely receive x’s alive from parent per current phase# (ignore if earlier phase#) • y forward alive according to arborescence or shouts (broadcast) • w‘s turn to shout alive • v first receives from non-parent w, waits to receive from parent y (or turns on timer if it was off)
x timeout Expires : Is x Still the Leader? xcalculates new arb considering updated edge weights of untimely channels • x still a leader: If arb is smaller than other processes arbs • Message efficient: uses old arb and never updates phase # (no need to send a new arb to all) • x loses leadership: • Increases its phase# broadcastingstopPhase • Message Efficient: stopPhase with newer phase# used by other processes to turn off their timer for x and to quit sending failed • x stops sending alive • x keeps own timer on to check if regains leadership x x y w y w 5 v z v z • Kept leadership • Use old arb, send alive • Timeoutexpires • Calculate new arb • Lost leadership • Broadcast stopPhase x stopPhase(x; phase#) y w v z
x Regains Leadership x arb is once again smaller than others • Other leader must have sent stopPhase due to non-timely channels x x startPhase(x; phases[x]; arbs[x]) alive(x, phase#, shoutID) y w y w v z v z • x timeout expires (process’ own timer is always on) • x kept collecting failed messages updating its edges—calculates its new arb • x broadcasts startPhasewith new arb to all other processes • So other processes will await alive messages and send failed if not timely received from parent according to new arb
MPO Algorithm Message and Packet Efficient Message Efficient: Only a single process x sends an infinite number of messages and all but finitely many messages are of constant size x x w alive(x, phase#, shoutID) Non-constant message, startPhase, sent finite times y y w w v v z z Packet Efficient: All but finitely many messages are transmitted using O(n) packets • Packets of different messages may potentially use different channels • Thus, the number of used channels is not limited w alive(x, phase#, shoutID) Eventually, only packets sent Broadcast precludes super packet efficiency of O(n) channels
Packet Efficient Implementation of the Omega Failure Detector Thank you! x w y Ω z Questions? Lyon, France November 8, 2016