CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS

Self-Stabilization
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Spring 2014
Prof. Jennifer Welch

Reference
• Self-Stabilization, Shlomi Dolev, MIT Press, 2000.
  • Chapter 2
• Slides prepared for the book by Shlomi Dolev
  • available at http://www.cs.bgu.ac.il/~dolev/book/slides.html

Self-Stabilization
• A powerful form of fault tolerance.
• Starting from an arbitrary system configuration, the algorithm begins working properly entirely on its own.
• An arbitrary system configuration is caused by some transient failure: message loss, corrupted memory, processor failure, loss of synchrony, …
• As long as the system is well-behaved for sufficiently long, the algorithm can correct itself.
• The paradigm has been applied to both shared memory and message passing models.

Definitions
• An execution is no longer defined to start with an initial configuration
  • instead, it can start with an arbitrary configuration
• Depending on the problem to be solved, certain executions are considered legal, forming the set LE.
• A configuration C is safe if every admissible execution starting with C is in LE.
• An algorithm is self-stabilizing if every admissible execution reaches a safe configuration.

Self-Stabilization Definition
[Figure: an execution starts in an arbitrary configuration and reaches a safe configuration; from that point on it is a legal execution.]

Communication Model
• A "hybrid" of message passing and shared memory
• Communication topology is represented as an undirected graph
  • not necessarily fully connected
• Processors correspond to vertices
• Corresponding to each edge (pi, pj) are two shared read/write registers:
  • Rij: written by pi and read by pj
  • Rji: written by pj and read by pi
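A minimal Python sketch of this register layout, for illustration only. The names `make_registers`, `neighbors`, and `edges`, and the four-processor example graph, are assumptions and do not come from the slides. Registers are stored in a dictionary keyed by ordered pairs, so `R[(i, j)]` stands for Rij (written by pi, read by pj).

```python
def make_registers(edges, initial=0):
    """Create two single-writer registers per undirected edge.

    R[(i, j)] is written by processor i and read by processor j.
    """
    R = {}
    for i, j in edges:
        R[(i, j)] = initial
        R[(j, i)] = initial
    return R


def neighbors(edges, i):
    """Return the sorted list of neighbors of processor i."""
    return sorted({b for a, b in edges if a == i} | {a for a, b in edges if b == i})


# Hypothetical 4-processor graph (not fully connected).
edges = [(0, 1), (1, 2), (1, 3), (2, 3)]
R = make_registers(edges)
R[(1, 2)] = 7                 # p1 writes R12; only p2 reads it
print(len(R))                 # two registers per edge
print(neighbors(edges, 1))
```

Keying by ordered pair keeps the single-writer discipline visible: processor i only ever assigns to keys whose first component is i.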

Communication Model
[Figure: example graph on processors p0, p1, p2, p3 with registers R01, R10, R12, R21, R13, R31, R23, R32 (one pair per edge).]

Self-Stabilizing Spanning Tree Definition
• Every processor has a variable parent in its local state.
• There is a distinguished root processor.
• LE consists of all executions in which, in every configuration, the parent variables do not change and form a spanning tree rooted at the root.

SS Spanning Tree Algorithm
• Each processor has local variables
  • parent, id of the neighbor that is its parent
  • dist, estimated distance to the root
• The root sets dist to 0 and writes 0 into all its "outgoing" registers
• Each non-root reads its neighbors' states from its "incoming" registers, adopts as its parent the neighbor with the smallest distance, sets dist to one more than that distance, and writes the value of dist into all its "outgoing" registers
• Processors perform these actions repeatedly

SS Spanning Tree Algorithm
Code for root p0:
  while true do
    parent := ⊥                  // the root has no parent
    dist := 0
    for each neighbor pi do
      R0i := 0                   // write shared variable
    endfor
  endwhile

SS Spanning Tree Algorithm
Code for non-root pi:
  while true do
    for each neighbor pj do
      neigh-dist[j] := Rji       // read shared variable
    endfor
    dist := 1 + min{neigh-dist[j] : pj is a neighbor}
    foundParent := false
    for each neighbor pj do
      if (!foundParent) and (neigh-dist[j] = dist - 1) then
        parent := j; foundParent := true
      endif
      Rij := dist                // write shared variable
    endfor
  endwhile
Note: storage of negative values is not allowed.
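The root and non-root code can be exercised with a small Python simulation. This is a sketch under stated assumptions, not the slides' exact model: each processor runs one whole iteration of its while loop atomically (coarser than single-register reads and writes), every processor runs once per round in random order, and the names `run_spanning_tree` and `adj` and the example graph are hypothetical. The starting configuration is arbitrary but nonnegative.

```python
import random


def run_spanning_tree(adj, root, rounds, seed=0):
    """Simulate the algorithm from an arbitrary configuration; return (dist, parent)."""
    rng = random.Random(seed)
    # Arbitrary (but nonnegative) initial register contents and local state.
    R = {(i, j): rng.randrange(0, 10) for i in adj for j in adj[i]}
    dist = {i: rng.randrange(0, 10) for i in adj}
    parent = {i: rng.choice(adj[i]) for i in adj}
    for _ in range(rounds):
        order = list(adj)
        rng.shuffle(order)            # each processor runs one loop iteration per round
        for i in order:
            if i == root:
                parent[i], dist[i] = None, 0
                for j in adj[i]:
                    R[(i, j)] = 0
            else:
                neigh = {j: R[(j, i)] for j in adj[i]}      # read loop
                dist[i] = 1 + min(neigh.values())
                # First neighbor whose value equals dist - 1 becomes the parent.
                parent[i] = next(j for j in adj[i] if neigh[j] == dist[i] - 1)
                for j in adj[i]:                             # write loop
                    R[(i, j)] = dist[i]
    return dist, parent


# Hypothetical graph: p0 is the root and is adjacent only to p1.
adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
dist, parent = run_spanning_tree(adj, root=0, rounds=20)
print(dist)
print(parent)
```

Changing `seed` changes both the corrupted starting state and the schedule; the final `dist` and `parent` values should not depend on it once enough rounds have run.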

Output of Spanning Tree Algorithm
[Figure: example graph with each node labeled by its distance from the root (root = 0). Numbers are distances, red arrows indicate parents, and black edges are non-tree edges.]

Correctness Proof of SS ST Alg
Definition: Executions are partitioned into asynchronous rounds, which are the shortest segments containing at least one step by each processor.
Definition: Δ is the degree (maximum number of neighbors) of the communication graph.
Definition: D is the diameter of the communication graph.

Correctness Proof of SS ST Alg
Lemma: Consider any admissible execution. There exist T_1 < T_2 < … < T_D such that after asynchronous round T_k:
(a) every processor at distance ≤ k from the root has dist equal to its shortest-path distance to the root, and the parent variables form a BFS tree;
(b) every processor at distance > k from the root has dist ≥ k.

Correctness Proof of SS ST Alg
Proof: By induction on k.
Basis (k = 1): Let T_1 = 5Δ.
• Initially all distances are nonnegative.
• Processors might start with the program counter in the middle of an iteration of the outer while loop; after at most 2Δ rounds, partial iterations are done.
• After the next Δ rounds, all non-root processors have completed the read for-loop at least once and computed dist: all are > 0.
• After the next Δ rounds, all non-root processors have completed the write for-loop at least once.
• After the next Δ rounds, all non-root processors have completed the read for-loop at least once and computed dist: every neighbor of the root reads 0 from the root and > 0 from every other node, so it sets dist to 1 and parent to the root.

Correctness Proof of SS ST Alg
Induction (k > 1): Assume for k - 1 and show for k. Let T_k = T_{k-1} + 2Δ.
• Consider the execution just after the end of asynchronous round T_{k-1}.
• After the next Δ rounds, all non-root nodes have executed the write for-loop at least once (and written their dist values).
• After the next Δ rounds, all non-root nodes have executed the read for-loop at least once.
• Suppose pi is at distance d ≤ k from the root.
• pi has at least one neighbor pj at distance d-1 ≤ k-1 from the root, and no neighbor that is closer to the root.
• By the inductive hypothesis, pj's register has the correct value in it and all other neighbors of pi have registers with values ≥ d-1.
• Thus pi correctly computes dist and parent.

Correctness Proof of SS ST Alg
• Suppose pi is at distance > k from the root.
• Every neighbor of pi is at distance ≥ k from the root.
• By the inductive hypothesis, all their registers have values ≥ k-1.
• Thus pi computes dist to be ≥ k.

Correctness Proof of SS ST Alg
• Since every processor is at most distance D from the root, the previous lemma implies that a correct breadth-first spanning tree has been constructed after O(ΔD) asynchronous rounds, where Δ is the degree of the communication graph, no matter what the starting configuration.
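The round count can be made explicit by unrolling the recurrence from the proof. This worked step assumes the reading T_1 = 5Δ and T_k = T_{k-1} + 2Δ, with Δ the maximum degree and D the diameter:

```latex
T_D \;=\; T_1 + \sum_{k=2}^{D} 2\Delta \;=\; 5\Delta + 2\Delta\,(D-1) \;=\; (2D+3)\,\Delta \;=\; O(\Delta D)
```

When Δ is treated as a constant, this is O(D) asynchronous rounds.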

Another Classic SS Algorithm
• Proposed by Dijkstra
• Suggested for mutual exclusion
  • we will view it as a "token circulation" algorithm
• Uses a stronger model of computation
  • in one atomic step, a processor can read all its "incoming" registers and write all its "outgoing" registers

Ring Communication Topology
[Figure: unidirectional ring of processors p0, p1, p2, p3 with registers R0, R1, R2, R3.]
• Processors are arranged in a unidirectional ring.
• Only one register is needed per processor: p0 writes into R0, p1 reads from R0, etc.

Processor's States
• Each processor's state consists solely of an integer, ranging from 0 to K - 1 (for a suitable value of K)
• Actually, the processor just stores this information in its register.

Definition of Holding the Token
• Proc p0 holds the token if R0 = R_{n-1}.
• Proc pi (other than p0) holds the token if Ri ≠ R_{i-1}.
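The two rules translate directly into a predicate on the list of register values. The sketch below is illustrative; the function name `token_holders` and the sample configurations are assumptions, with `R[i]` standing for Ri.

```python
def token_holders(R):
    """Return the indices of processors holding the token in ring registers R."""
    n = len(R)
    holders = []
    if R[0] == R[n - 1]:           # rule for p0
        holders.append(0)
    for i in range(1, n):          # rule for every other pi
        if R[i] != R[i - 1]:
            holders.append(i)
    return holders


print(token_holders([3, 3, 3, 3]))   # all equal: only p0 holds the token
print(token_holders([4, 3, 3, 3]))   # only p1 differs from its predecessor
print(token_holders([3, 1, 0, 4]))   # arbitrary configuration: several holders
```

The last call shows why an arbitrary configuration is a problem: several processors can hold the token at once.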

Self-Stabilizing Token Circulation Definition
• LE consists of all executions in which
  • in every configuration only one processor holds the token, and
  • every processor holds the token infinitely often
(Note the resemblance to the mutual exclusion problem.)

Dijkstra's Algorithm
Code for p0:
  while true do
    if R0 = R_{n-1} then
      R0 := (R0 + 1) mod K
    endif
  endwhile
Code for pi, i ≠ 0:
  while true do
    if Ri ≠ R_{i-1} then
      Ri := R_{i-1}
    endif
  endwhile
Note: each if statement (read, test, and write) executes atomically.
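A short Python sketch of these two rules, run from an arbitrary configuration. The names `step` and `run`, the random schedule, and the starting values are assumptions for illustration; each call to `step` models one atomic step, and each round lets every processor step once in a shuffled order.

```python
import random


def step(R, i, K):
    """Let processor i take one atomic step on ring registers R (in place)."""
    n = len(R)
    if i == 0:
        if R[0] == R[n - 1]:
            R[0] = (R[0] + 1) % K
    elif R[i] != R[i - 1]:
        R[i] = R[i - 1]


def run(R, K, rounds, seed=0):
    """Run `rounds` rounds; in each, every processor steps once in random order."""
    rng = random.Random(seed)
    R = list(R)
    for _ in range(rounds):
        order = list(range(len(R)))
        rng.shuffle(order)
        for i in order:
            step(R, i, K)
    return R


start = [3, 1, 0, 4]                  # arbitrary configuration, n = 4
final = run(start, K=5, rounds=50)    # K = n + 1
print(final)
```

A step by a processor that does not hold the token leaves the registers unchanged, which is why `step` has no else branch.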

Analysis of Dijkstra's Algorithm
Lemma: If all registers are equal in a configuration, then the configuration is safe.
Proof: Suppose K = 5.
[Figure: ring of p0, p1, p2, p3 annotated with example register values.]
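The lemma can be illustrated by tracing an execution from an all-equal configuration with K = 5. In this sketch the names `holders` and `trace` and the starting value 3 are assumptions. Only the current token holder is moved at each step, since steps by other processors change nothing.

```python
def holders(R):
    """Processors that hold the token under Dijkstra's rules."""
    n = len(R)
    return [i for i in range(n)
            if (R[i] == R[n - 1] if i == 0 else R[i] != R[i - 1])]


def trace(R, K, steps):
    """Starting from R, repeatedly let a token holder move; record holders seen."""
    R = list(R)
    seen = []
    for _ in range(steps):
        h = holders(R)
        seen.append(h)
        i = h[0]                      # in a safe configuration h has one element
        R[i] = (R[i] + 1) % K if i == 0 else R[i - 1]
    return seen, R


seen, final = trace([3, 3, 3, 3], K=5, steps=8)
print(seen)     # [[0], [1], [2], [3], [0], [1], [2], [3]]
print(final)    # [0, 0, 0, 0]
```

In every configuration of the trace exactly one processor holds the token, and the token visits p0, p1, p2, p3 in turn, which is the behavior LE requires.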

Analysis of Dijkstra's Algorithm
• If an execution begins with arbitrary values between 0 and K-1 in the registers, how can we show that eventually all the values will be the same (i.e., a safe configuration is reached)?
• Depends on K being large enough.
• Suppose K = n+1 (so there are n+1 different values).
• Lemma 1: In every configuration, there is at least one integer in {0,…,K-1} that does not appear in any register.
  • true because there are only n registers

Analysis of Dijkstra's Algorithm
Lemma 2: In every admissible execution (starting from any configuration), p0 holds the token, and thus changes R0, at least once during every n rounds.
Proof: Suppose, for contradiction, that there is a segment of n rounds in which p0 does not change R0.
• Once p1 takes a step in the first round, R1 = R0, and this equality remains true.
• Once p2 takes a step in the second round, R2 = R1 = R0, and this equality remains true.
• …
• Once p_{n-1} takes a step in the (n-1)-st round, R_{n-1} = R_{n-2} = … = R0.
• So when p0 takes a step in the n-th round, it will change R0, a contradiction.

Analysis of Dijkstra's Algorithm
Theorem: In any admissible execution starting at any configuration C, a safe configuration is reached within O(n²) rounds.
Proof: Let j be a value not in any register in C (one exists by Lemma 1).
• By Lemma 2, p0 changes R0 (by incrementing it) at least once every n rounds.
• Thus R0 eventually holds j, in some configuration D, after O(n²) rounds.
• Since other processors only copy values, no register holds j between C and D.
• After at most n more rounds, the value j propagates around the ring to p_{n-1}, so all registers are equal and the configuration is safe.
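As an empirical check of the theorem (not a proof), the sketch below tries every starting configuration for n = 4 and K = n + 1 = 5 and counts rounds until all registers are equal. The function name `rounds_until_all_equal`, the random schedule, and the round limit are assumptions. Equality is tested after every single step, because an all-equal configuration can appear and disappear within one round.

```python
import itertools
import random


def rounds_until_all_equal(R, K, rng, limit=1000):
    """Count rounds until every register holds the same value (checked after each step)."""
    R = list(R)
    n = len(R)
    if len(set(R)) == 1:
        return 0
    for rnd in range(1, limit + 1):
        order = list(range(n))
        rng.shuffle(order)            # one step per processor per round
        for i in order:
            if i == 0:
                if R[0] == R[n - 1]:
                    R[0] = (R[0] + 1) % K
            elif R[i] != R[i - 1]:
                R[i] = R[i - 1]
            if len(set(R)) == 1:
                return rnd
    raise RuntimeError("did not stabilize within the round limit")


n, K = 4, 5                           # K = n + 1, as in the analysis
rng = random.Random(1)
worst = max(rounds_until_all_equal(c, K, rng)
            for c in itertools.product(range(K), repeat=n))
print(worst, "rounds in the worst case observed; n * K =", n * K)
```

The comparison value n * K follows the proof's counting: at most K - 1 increments of R0, each within n rounds, then fewer than n rounds of propagation.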

What about Reducing K?
• Easy to see that K = n (n different values) suffices: either there is a missing value or p0's value is unique.
• Can also show that K = n - 1 (n-1 different values) suffices.
• But if K < n - 1 (fewer than n-1 different values), then there is a counterexample.
• If the strong atomicity model is weakened to our familiar read/write atomicity, then K > 2n - 2 suffices.
