
Around Self-Stabilization

Around Self-Stabilization. Part 2 : Strengthened Forms of Self-Stabilization Stéphane Devismes Post-Doc CNRS at the LRI (Paris VII). Roadmap. Self-Stabilization (recall) Motivation Tolerating more types of fault FTSS Enhance the convergence Snap-Stabilization Conclusion.


Presentation Transcript


  1. Around Self-Stabilization Part 2: Strengthened Forms of Self-Stabilization Stéphane Devismes Post-Doc CNRS at the LRI (Paris VII)

  2. Roadmap • Self-Stabilization (recall) • Motivation • Tolerating more types of fault • FTSS • Enhance the convergence • Snap-Stabilization • Conclusion Computer Science Department, University of Osaka

  3. Self-Stabilization (recall) • [Dijkstra 1974] • General approach for recovering from the effects of any transient fault

  4. Motivation • Self-Stabilization offers several advantages: • Tolerance to any transient fault: • No hypothesis on the nature or extent of transient faults • Recovers from the effects of those faults in a unified manner • No initialization: • Large scale systems • Dynamicity: • Self-organization in sensor and ad hoc networks

  5. Motivation • But also several drawbacks: • Impossibility results • Some fundamental problems have no self-stabilizing solution • Overhead • Self-stabilizing protocols can use a large amount of resources • Usually not tolerant of other kinds of faults • Eventual safety • During the convergence, almost nothing is guaranteed [Figure labels: Weakened Forms, Strengthened Forms]

  6. Motivation • Strengthened Forms for: • Tolerating more types of faults • Enhancing the convergence property • Converging quickly in some (frequent) cases • Ensuring some weak safety property when there are faults

  7. Tolerating more types of faults • Types of faults: • Transient • Intermittent • Crash • Byzantine

  8. Tolerating more types of faults • Transient Faults: • Usually handled by self-stabilization • Duration: finite • Periodicity: rare • Effect: alter the content of some component(s) of the network (processes and/or links) • E.g., memory/message corruption, crash-recover, loss of messages…

  9. Tolerating more types of faults • Intermittent Faults: • Duration: finite • Periodicity: frequent • Effect: alter the content of some component(s) of the network (processes and/or links) • E.g., memory/message corruption, crash-recover, loss of messages… • Some papers deal with both self-stabilization and certain types of intermittent faults, e.g., [Delaët and Tixeuil, JPDC’02] • Fair message loss + finite number of message corruptions

  10. Tolerating more types of faults • Crash Failures: • Duration: permanent • Effect: some component(s) of the network (processes and/or links) permanently stop working • E.g., process crash, link removal • Fault-Tolerant Self-Stabilization (FTSS) [Gopal and Perry, PODC’93] • Usually considers process crashes only

  11. Tolerating more types of faults • Byzantine Failures: • Duration: unlimited • Effect: some component(s) of the network (usually processes) behave in an arbitrary manner • E.g., processes hit by an attack • Byzantine-Tolerant Self-Stabilization [Dolev and Welch, PODC’95] • Restriction on the number of Byzantine processes and/or • Some synchrony assumptions

  12. Robust Stabilizing Leader Election Carole Delporte-Gallet (LIAFA) Stéphane Devismes (CNRS, LRI) Hugues Fauconnier (LIAFA)

  13. Topics • Designing Leader Election protocols in the message-passing model that are • Crash-tolerant • Self-Stabilizing • Communication-Efficient • With weak synchrony assumptions

  14. Model • Fully-connected network • Communications using messages • Link: • Unidirectional • No order on deliveries • May be synchronous • Process: • Synchronous or crashed • With identifier • State initially arbitrary [Figure: network of four processes, numbered 1 to 4]

  15. Communication-Efficiency • [Larrea, Fernandez, and Arevalo, 2000]: « An algorithm is communication-efficient if it eventually only uses n - 1 unidirectional links » [Figure: network of four processes, numbered 1 to 4]
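
As a quick count, assuming the fully-connected network of n processes with unidirectional links from the Model slide: there is one link per ordered pair of distinct processes, so a communication-efficient algorithm eventually uses only a small fraction of them. The value n = 4 is taken from the four-process figure; the rest is plain arithmetic.

```latex
\[
\underbrace{n(n-1)}_{\text{unidirectional links in the network}}
\quad\text{versus}\quad
\underbrace{n-1}_{\text{links eventually used}},
\qquad
n = 4:\; 4 \cdot 3 = 12 \ \text{versus}\ 3 .
\]
```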

  16. Related Works • [Gopal and Perry, PODC’93] • [Anagnostou and Hadzilacos, WDAG’93] • [Beauquier and Kekkonen-Moneta, JSS’97] • Communication-efficiency was never considered in these works

  17. Self-Stabilizing Leader Election in a fully timely network? Yes + communication-efficiency

  18. Algorithm (1/4) • Each process p such that Leader = p periodically sends ALIVE,p to every other process [Figure: four-process example with ALIVE,1 and ALIVE,2 messages in transit]

  19. Algorithm (2/4) • When an alive process p such that Leader = p receives ALIVE from process q, it sets Leader := q if q < p [Figure: four-process example with ALIVE,1 and ALIVE,2 messages in transit]

  20. Algorithm (3/4) • Each alive process q such that Leader ≠ q always chooses as leader the process from which it most recently received ALIVE [Figure: four-process example with ALIVE,1 messages in transit]

  21. Algorithm (4/4) • On timeout, each alive process p sets Leader to p [Figure: four-process example with ALIVE,1 and ALIVE,2 messages in transit]
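
The four rules of Algorithm (1/4) to (4/4) can be sketched as a small round-based simulation. This is only an illustration under assumptions that are not in the slides: rounds are synchronous and every link is timely, the names `run_round`, `simulate`, and `TIMEOUT` are hypothetical, and the timeout is fixed at two silent rounds. Initial `Leader` values and timers are drawn at random to mimic an arbitrary initial state.

```python
import random

TIMEOUT = 2  # assumed: silent rounds a follower tolerates before suspecting its leader


def run_round(ids, alive, leader, timer):
    """Run one synchronous round on timely links; return the set of links used."""
    # Snapshot the senders first: all sends of a round happen simultaneously.
    senders = sorted(p for p in alive if leader[p] == p)
    links = set()
    for p in alive:
        if leader[p] != p:
            timer[p] += 1
    for q in senders:                  # (1/4): a process with Leader = q sends ALIVE,q
        for p in ids:
            if p == q:
                continue
            links.add((q, p))          # the link q -> p carries a message
            if p not in alive:
                continue               # crashed processes never react
            if leader[p] == p:
                if q < p:              # (2/4): yield to a smaller identifier
                    leader[p], timer[p] = q, 0
            else:                      # (3/4): follow the most recent sender
                leader[p], timer[p] = q, 0
    for p in alive:                    # (4/4): on timeout, elect yourself
        if leader[p] != p and timer[p] >= TIMEOUT:
            leader[p], timer[p] = p, 0
    return links


def simulate(n, crashed, rounds, seed):
    """Start from a random (arbitrary) state; return final leaders and last-round links."""
    rng = random.Random(seed)
    ids = list(range(1, n + 1))
    alive = [p for p in ids if p not in crashed]
    leader = {p: rng.choice(ids) for p in alive}      # may point at a crashed process
    timer = {p: rng.randint(0, TIMEOUT) for p in alive}
    links = set()
    for _ in range(rounds):
        links = run_round(ids, alive, leader, timer)
    return leader, links


if __name__ == "__main__":
    leader, links = simulate(n=5, crashed={1, 3}, rounds=10, seed=0)
    print(leader, sorted(links))
```

In this sketch the alive processes end up agreeing on one alive process, and in the last round only that process sends, so n - 1 links are in use. The elected process is not necessarily the one with the smallest identifier: a self-declared leader only yields when it actually receives ALIVE from a smaller one.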

  22. Communication-Efficient Self-Stabilizing Leader Election in a system where at most one link is asynchronous? No

  23. Impossibility of Communication-Efficiency in a system with at most one asynchronous link • Claim: Any process p such that Leader ≠ p must periodically receive messages within a bounded time; otherwise it chooses another leader [Figure: the process chooses another leader]

  24. Self-Stabilizing (non communication-efficient) Leader Election in a system where some links are asynchronous? Yes

  25. Self-Stabilizing Leader Election in a system with a timely routing overlay • For each pair of alive processes (p,q), there exist at least two paths of timely links: • From p to q • From q to p

  26. Algorithm • Each process computes the set of alive processes and chooses as leader the smallest process of this set • To compute the set: • Each process p periodically sends ALIVE,p to every other process • Any ALIVE,p message is repeated n - 1 times (so that every other process periodically receives such a message)
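
The idea of slide 26 can be sketched as follows. This is a simplified model, not the authors' protocol: only timely links are represented, relaying an ALIVE message up to n - 1 times is modelled as n - 1 synchronous hops, and the names `alive_sets` and `elect` are hypothetical.

```python
def alive_sets(n, timely_links, alive):
    """For each alive process, the origins whose ALIVE reaches it within n - 1 hops."""
    heard = {p: {p} for p in alive}            # every process knows it is alive itself
    for _ in range(n - 1):                     # each ALIVE is relayed at most n - 1 times
        nxt = {p: set(s) for p, s in heard.items()}
        for u, v in timely_links:              # directed timely link u -> v
            if u in alive and v in alive:      # crashed processes neither relay nor receive
                nxt[v] |= heard[u]
        heard = nxt
    return heard


def elect(n, timely_links, alive):
    """Each alive process picks the smallest identifier in its computed alive set."""
    return {p: min(s) for p, s in alive_sets(n, timely_links, alive).items()}


if __name__ == "__main__":
    # Process 1 has crashed; the directed cycle 2 -> 3 -> 4 -> 2 is a timely overlay.
    print(elect(4, [(2, 3), (3, 4), (4, 2), (1, 2)], {2, 3, 4}))
```

Because the set is rebuilt from scratch in every period, an identifier left over from a corrupted initial state is not carried into the next computation, which is what makes this style of computation compatible with arbitrary initial states.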

  27. Self-Stabilizing Leader Election in a system without timely routing overlay? No

  28. Conclusion • Obtaining algorithms that are both self-stabilizing and crash-tolerant is highly desirable • But designing communication-efficient solutions requires strong synchrony assumptions, even if the network is fully connected • Solution: FTPS (Fault-Tolerant Pseudo-Stabilization)

  29. Enhance The Convergence • Fault-containing Self-Stabilization • Time-Adaptive Self-Stabilization • Safe-Converging Self-Stabilization • Superstabilization • Snap-Stabilization

  30. Fault-Containing Self-Stabilization • [Ghosh et al., PODC’96] • Self-stabilizing + if there are only a few faults: • Spatial containment: only a few processes can be contaminated by the faults • Fast convergence time

  31. Time-Adaptive Self-Stabilization • [Kutten & Patt-Shamir, PODC’97] • Self-stabilizing and, if f < k processes are faulty: • The output of the algorithm stabilizes in O(f) time [Figure: timeline: faults hit f processes, then the output is stabilized, then the state is stabilized]

  32. Safe-Converging Self-Stabilization • [Kakugawa & Masuzawa, IPDPS’06] • Self-stabilizing and fast convergence to a weaker (useful) predicate • E.g. Minimal Dominating Set (MDS) [Figure: arbitrary initial configuration, then dominating set (DS), then minimal dominating set (MDS)]
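
To make the two predicates of the MDS example concrete, here is a minimal sketch of checkers for "dominating set" (the weaker predicate reached quickly) and "minimal dominating set" (the final one). The graph encoding and the function names are assumptions made for illustration; these are checkers for the predicates, not the self-stabilizing algorithm itself.

```python
def is_dominating(graph, s):
    """Every vertex is in s or has at least one neighbour in s."""
    return all(v in s or any(u in s for u in graph[v]) for v in graph)


def is_minimal_dominating(graph, s):
    """s is dominating and removing any single vertex breaks domination."""
    return is_dominating(graph, s) and not any(
        is_dominating(graph, s - {v}) for v in s
    )


if __name__ == "__main__":
    path = {1: {2}, 2: {1, 3}, 3: {2}}              # the path 1 - 2 - 3
    print(is_dominating(path, {1, 2, 3}))           # safe, but not yet minimal
    print(is_minimal_dominating(path, {1, 2, 3}))
    print(is_minimal_dominating(path, {2}))
```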

  33. Superstabilization • [Dolev & Herman, CJTCS’97] • A superstabilizing algorithm: • Must be self-stabilizing • Must preserve a “passage predicate” • Passage predicate: defined with respect to a class of topology changes (a topology change falsifies legitimacy, so the passage predicate must be weaker than legitimacy but strong enough to be useful) [Figure labels: Passage Predicate, Topological change]

  34. Passage Predicate - Example • In a token ring: a processor crash can lose the token but still does not falsify the passage predicate
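
The token-ring example can be illustrated with two small predicates. The slide does not spell out the predicates, so this sketch assumes that legitimacy means "exactly one token" and that the passage predicate is "at most one token"; the ring encoding and the function names are hypothetical.

```python
def token_count(ring):
    """Number of processors currently holding a token."""
    return sum(1 for has_token in ring.values() if has_token)


def legitimate(ring):
    return token_count(ring) == 1       # assumed legitimacy: exactly one token


def passage_predicate(ring):
    return token_count(ring) <= 1       # assumed passage predicate: at most one token


def crash(ring, p):
    """Topology change: processor p leaves the ring, taking its token with it."""
    return {q: t for q, t in ring.items() if q != p}


if __name__ == "__main__":
    ring = {1: False, 2: True, 3: False, 4: False}
    after = crash(ring, 2)
    print(legitimate(ring), legitimate(after), passage_predicate(after))
```

Under these assumptions the crash of the token holder falsifies legitimacy (no token is left) while the passage predicate still holds, which matches the situation described on the slide.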

  35. Snap-Stabilization • [Bui et al., WSS’99] • A snap-stabilizing algorithm operates correctly immediately after the faults end • Request-based algorithm and user-centric point of view: • Each time a user initiates a request, it obtains a correct result for its request

  36. Snap-Stabilization

  37. Self vs. Snap [Figure labels: N.X, 1.X, 2.X]

  38. Self vs. Snap [Figure label: 1.X]

  39. Snap-Stabilization in Message-Passing Systems Sylvie Delaët (LRI) Stéphane Devismes (CNRS, LRI) Mikhail Nesterenko (Kent State University) Sébastien Tixeuil (LIP6)

  40. Message-Passing Model • Network bidirectional and fully-connected • Communications by messages • Links asynchronous, fair, and FIFO • Ids on processes • Transient faults [Figure: four processes, numbered 1 to 4, with messages m, m1, m2, m3, ma, mb in transit]

  41. Related Works in message-passing (reliable communication in self-stabilization) • [Gouda & Multari, 1991] • Deterministic + Unbounded Capacity => Unbounded Counter • Deterministic + Bounded Capacity => Bounded Counter • [Afek & Brown, 1993] • Probabilistic + Unbounded Capacity + Bounded Counter [Figure: the question <How old are you, Captain?> and the answers <I’m 21>, <I’m 12>, <I’m 60>]

  42. Related Works in message-passing (self-stabilization) • [Varghese, 1993] • Deterministic + Bounded Capacity • [Katz & Perry, 1993] • Unbounded Capacity, deterministic, infinite counter • [Delaët et al.] • Unbounded Capacity, deterministic, finite memory • Silent tasks

  43. Related Works (snap-stabilization) • Nothing in the Message-Passing Model • Only in the State Model: • Locally Shared Memory • Composite Atomicity • [Cournier et al., 2003]

  44. Snap-Stabilization in Message-Passing Systems

  45. Case 1: unbounded capacity links • Impossible for safety-distributed specifications

  46. Safety-distributed specification • Example: Mutual Exclusion [Figure labels: A, p, B, q]

  47. Safety-distributed specification [Figure labels: sp, A, p with messages m1 to m5; sq, B, q with messages m’1 to m’4]

  48. Safety-distributed specification [Figure labels: sp, A, p with messages m1 to m5; sq, B, q with messages m’1 to m’4]

  49. Case 2: bounded capacity links • Problem to solve: Reliable Communication • Starting from any configuration, if Tintin sends a question to Captain Haddock, then: • Tintin eventually receives good answers • Tintin only delivers the good answers

  50. Case 2: bounded capacity links • Case Study: Single-Message Capacity [Figure: each direction of the link holds 0 or 1 message]
