Stabilization and Refinement
Consider nonmasking Fault-Tolerance • Invariant • Fault-Span • Program computation that starts from fault-span is guaranteed to reach invariant? • What if Fault-span = set of all states? • Such systems are called self-stabilizing
Defining stabilization • Starting from an arbitrary state, the program eventually recovers to states from which all subsequent computations are legitimate (i.e., meet the specification)
Example • Consider a ring of processes 0..N • Each process has a variable x • Variable of j is x.j • Suppose x.j is an integer for now
Actions • At process j, j > 0 • x.j != x.(j-1) --> x.j = x.(j-1) • At process 0 • x.0 = x.N --> x.0 = x.N + 1 • Let initial state be such that all x values are 0
What if faults change value of x? • Can we show that recovery will be guaranteed from an arbitrary state • Where values of x are arbitrary • Assume that no processes actually fail.
Correctness Argument • What if process 0 never executes? • Program will stabilize to legitimate states • Assume that process 0 executes infinitely often • x.0 will keep on increasing • Eventually x.0 will be larger than all other x values • Now, before process 0 executes again, all x values must be equal • The state is one of the legitimate states
What if we restrict domain of x? • Let x be from 0..M-1 • Change action at 0 as • x.0 = x.N --> x.0 = (x.N + 1) mod M • What if M = 2 (assume N is arbitrary)? • What if M = N+1?
Correctness Argument (finite domain) • What if process 0 never executes? • Program will stabilize to legitimate states • Assume that process 0 executes infinitely often • x.0 will keep on changing • Eventually x.0 will be different from all other x values • Now, before process 0 executes again, all x values must be equal • The state is one of the legitimate states
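The two questions about M can be explored with a small simulation. The sketch below is illustrative only: it assumes the guard at process j > 0 is x.j != x.(j-1), uses a ring of four processes (N = 3), and fixes a hypothetical round-robin schedule 3, 2, 1, 0. The helper names `privileged`, `fire`, and `run_rounds` are not from the slides.

```python
def privileged(x):
    """Return the processes whose guard is true in state x (ring 0..N)."""
    N = len(x) - 1
    enabled = [0] if x[0] == x[N] else []
    enabled += [j for j in range(1, N + 1) if x[j] != x[j - 1]]
    return enabled


def fire(x, j, M):
    """Execute the action of process j in place, but only if its guard holds."""
    N = len(x) - 1
    if j == 0:
        if x[0] == x[N]:
            x[0] = (x[N] + 1) % M
    elif x[j] != x[j - 1]:
        x[j] = x[j - 1]


def run_rounds(start, M, rounds, order):
    """Run the given schedule for a number of rounds; return the final state."""
    x = list(start)
    for _ in range(rounds):
        for j in order:
            fire(x, j, M)
    return x


start = [0, 1, 0, 1]          # an arbitrary (corrupted) state
order = [3, 2, 1, 0]          # assumed fair schedule, chosen for illustration

with_m2 = run_rounds(start, 2, 10, order)   # M = 2
with_m4 = run_rounds(start, 4, 10, order)   # M = N + 1 = 4

print("M=2:", with_m2, "privileged:", privileged(with_m2))
print("M=4:", with_m4, "privileged:", privileged(with_m4))
```

Under this particular schedule, the M = 2 run keeps three processes enabled forever, while the M = N+1 run settles into states with a single enabled process. One schedule is not a proof for M = N+1, but one non-converging schedule is enough to show that M = 2 fails.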
Mutual Exclusion on Bidirectional Array • Processes 0..n-1 • s.0 : {1, 3} • s.(n-1) : {0, 2} • s.j : {0, 1, 2, 3} otherwise • Program Actions for j (k is a neighbor of j) • For j in {0, n-1} • s.k = s.j + 1 (mod 4) --> s.j = s.j + 2 (mod 4) • For j in 1..n-2 • s.k = s.j + 1 (mod 4) --> s.j = s.k
Normal Execution • Begin in a state where s.(n-1) = 2 and all other s values equal 1
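Normal execution can be traced mechanically. The following sketch assumes the guard of each action is s.k = s.j + 1 (mod 4), with the assignment to the right of it as the effect, and uses n = 4. The function names `enabled_actions` and `execute` are illustrative, not from the slides.

```python
def enabled_actions(s):
    """List (j, k) pairs for which the guard s.k = s.j + 1 (mod 4) holds."""
    n = len(s)
    pairs = []
    for j in range(n):
        for k in (j - 1, j + 1):
            if 0 <= k < n and s[k] == (s[j] + 1) % 4:
                pairs.append((j, k))
    return pairs


def execute(s, j, k):
    """Apply the action of process j with respect to neighbor k, in place."""
    n = len(s)
    if j in (0, n - 1):
        s[j] = (s[j] + 2) % 4      # end processes add 2
    else:
        s[j] = s[k]                # middle processes copy the neighbor


n = 4
s = [1] * (n - 1) + [2]            # s.(n-1) = 2, all other values 1
trace = []
for _ in range(4 * (n - 1)):
    actions = enabled_actions(s)
    assert len(actions) == 1       # exactly one enabled action at every step
    j, k = actions[0]
    trace.append(j)
    execute(s, j, k)

print(trace)
print(s)
```

In this run the single enabled action travels from process n-1 down to process 0 and back, which is the token of the mutual exclusion protocol.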
Stabilization • Always at least one action enabled • Not all values can be odd • Not all values can be even • There are neighbors where one node has an odd value and another has an even value
Observations • Total number of enabled processes never increases • If process j does not execute then • Evaluate effect of actions of process j-1 • Blue actions • Red actions • Black actions • Total number of enabled actions in 0..j-1 eventually reduces • Same for enabled actions in j+1..n • Hence, each process must execute infinitely often
Observations • Total number of enabled actions eventually reduces to 1 • Intuition • Consider variant function • <# of enabled actions, # of pairs with difference of 2> • Every • moves right until • It meets • It meets • Reaches process n-1
Stabilizing Graph Coloring for Planar Graphs • ALL = set of all colors (total 6) • c.j = color of j • nbc.j = colors of neighbors of j • Basic Action (for process j, k is nbr of j) • c.j = c.k --> c.j = b, where b in ALL - nbc.j
Problem • If j has more than 5 neighbors then ALL - nbc.j may be empty • Partition the neighbors of j into predecessors and successors • Ensure that each node has no more than 5 successors • sc.j = colors of successors of j • Revised Action • |succ(j)| <= 5 & k in succ(j) & c.j = c.k --> c.j = b, where b in ALL - sc.j
Creating Successors • Each node j maintains x.j • k is a successor of j iff (x.j < x.k or (x.j = x.k & j < k)) • Observation • In any planar graph, there is at least one node with degree <= 5 • Action • |succ(j)| > 5 --> x.j = maxx.j + 1, where maxx.j is the maximum x value among the neighbors of j • Eventually, each node has 5 or fewer successors
Principle • A node with fewer than 6 neighbors will never change its x value • Consider a graph after removing these nodes • Still a planar graph, still at least one node of degree less than 6 • These nodes can change their x value only once
Principle of Layered Recovery • This algorithm consists of • One layer that fixes successors • Another layer that fixes colors • Use of superposition • Allows us to pretend as if the lower layer has already stabilized
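The two layers can be run together on a small example. This is a sketch under stated assumptions: neighbor k counts as a successor of j when (x.j, j) < (x.k, k) lexicographically, the scheduler always picks the lowest-numbered enabled node, and the recoloring picks the smallest free color. The test graph is a wheel (hub 0 joined to a 6-cycle), which is planar and has one node of degree 6. The names `succ` and `step` are illustrative.

```python
ALL = set(range(6))                      # six colors


def succ(j, adj, x):
    """Successors of j: neighbors ordered after j by (x value, id)."""
    return [k for k in adj[j] if (x[j], j) < (x[k], k)]


def step(adj, x, c):
    """Execute one enabled action (lowest node id first); False if none."""
    for j in adj:
        s = succ(j, adj, x)
        if len(s) > 5:                   # lower layer: fix successors
            x[j] = max(x[k] for k in adj[j]) + 1
            return True
        if any(c[j] == c[k] for k in s): # upper layer: fix colors
            c[j] = min(ALL - {c[k] for k in s})
            return True
    return False


# Wheel graph: hub 0, rim 1..6.
adj = {0: [1, 2, 3, 4, 5, 6]}
for i in range(1, 7):
    adj[i] = [0, (i - 2) % 6 + 1, i % 6 + 1]

x = {j: 0 for j in adj}                  # arbitrary start: all x equal
c = {j: 0 for j in adj}                  # arbitrary start: all colors equal

for _ in range(1000):                    # safety cap for the sketch
    if not step(adj, x, c):
        break

print("x:", x)
print("colors:", c)
```

Both layers run in the same loop here. The color action simply waits on the guard |succ(j)| <= 5, which is the sense in which the upper layer can be designed as if the lower layer had already stabilized.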
Stabilizing Tree Correction • Goal: Build a tree rooted at a fixed node r • Assume that node r does not fail • Others could fail/recover
Variables • P.j = parent of j (in the tree) • d.j = distance of j (in the tree) from the root • d.r = 0 (by definition) • Maximum distance = n-1
Constraints • C1: (d.j < n) => d.j = d.(P.j)+1 • C2: d.(P.j) = n => d.j = n • C3: d.j < n
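A small checker makes the constraints concrete. It is a sketch only: it assumes P.j denotes the parent of j, that the root is excluded from the check, and that d values range over 0..n. The function name `violated` and the three-node example are hypothetical.

```python
def violated(j, d, P, n):
    """Return the names of the constraints that non-root node j violates."""
    bad = []
    if d[j] < n and d[j] != d[P[j]] + 1:     # C1
        bad.append("C1")
    if d[P[j]] == n and d[j] != n:           # C2
        bad.append("C2")
    if not d[j] < n:                         # C3
        bad.append("C3")
    return bad


n = 3
P = {1: 0, 2: 1}                 # line 0 - 1 - 2, rooted at node 0
good = {0: 0, 1: 1, 2: 2}
corrupt = {0: 0, 1: 3, 2: 2}     # a fault sets d.1 to n

print([violated(j, good, P, n) for j in P])
print([violated(j, corrupt, P, n) for j in P])
```

In the corrupted state node 1 violates only C3, while its child violates C1 and C2, which is why recovery has to proceed outward from the root.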
Actions (Guess 1) • !C1 --> d.j = d.(P.j) + 1
Actions • !C1 & (d.(P.j) != n-1) --> d.j = d.(P.j) + 1 • !C2 --> d.j = n
Actions • d.j = n & d.k < n-1 (k is a neighbor) --> P.j = k, d.j = d.k + 1
Correctness Argument • Eventually, nodes closest to the root stabilize • Their neighbors stabilize and so on
Refinement • So far, we used shared memory model • In one action, a process could read the state of its neighbors and write its own state. • We also assumed interleaving semantics • Only one action (non-deterministically) executed at a time. • What if the underlying model does not guarantee this?
Refining shared memory model • An action reads the state of neighbors and writes its own state. • To implement, one could • Read the state of neighbors • Evaluate guard • Execute action
Refinement may not preserve properties of interest • Example • Consider program with two processes. P1 has variable x, P2 has variable y • Action of P1 • x != y --> x = y • Action of P2 • x != y --> y = x
Refinement of previous program • Actions of P1 • true --> cy = y • x != cy --> x = cy • Actions of P2 • true --> cx = x • cx != y --> y = cx
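The difference between the two programs shows up under one specific interleaving. The sketch below assumes the guards are x != y (original) and x != cy, cx != y (refined), and picks a schedule in which both processes read before either writes. The function names and the dictionary layout are illustrative.

```python
def original_step(x, y, who):
    """One atomic action of the original program by process `who`."""
    if x != y:
        return (y, y) if who == "P1" else (x, x)
    return x, y


def refined_round(st):
    """Both processes read, then both write (one possible interleaving)."""
    st["cy"] = st["y"]            # P1: true --> cy = y
    st["cx"] = st["x"]            # P2: true --> cx = x
    if st["x"] != st["cy"]:       # P1: x != cy --> x = cy
        st["x"] = st["cy"]
    if st["cx"] != st["y"]:       # P2: cx != y --> y = cx
        st["y"] = st["cx"]


# Original program: a single action restores x = y.
print(original_step(0, 1, "P1"), original_step(0, 1, "P2"))

# Refined program: x and y swap on every round and never become equal.
st = {"x": 0, "y": 1, "cx": 0, "cy": 0}
history = []
for _ in range(4):
    refined_round(st)
    history.append((st["x"], st["y"]))
print(history)
```

Each process acts on a stale copy of the other's variable, so the two writes cross and the state x != y persists under this schedule.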
Read/Write Model • Intuitively, in each action, a process can • Read the state of one neighbor, or • Write its own state, • But not both • Each action can be thought of as a read action or a write action
Read/Write Model • Process has public variables and private variables. • In read action, a process can • Read public variables of ONE neighbor and write its private variables • In write action, a process can • Read all its variables and write all its variables
Preservation of Stabilization during Refinement • Stabilization not always preserved during refinement • Previous example on slides
Related Problem • Local Mutual Exclusion • Combine shared memory program with local mutual exclusion algorithm • Can graph coloring algorithm from earlier slides be used? • Problem: Graph coloring algorithm itself is in shared memory • Algorithm for local mutual exclusion must be stabilizing in the read/write model
Simple Example of Local Mutual Exclusion Algorithm • Process j maintains x.j • Process j is allowed to enter critical section iff • (x.j, j) is smallest value in its neighborhood • After completion, it increments x.j
A Sample Execution on a Line • Consider a system with the following processes (numbers represent IDs): 0 1 2 3 4 5 • Let the x values in such a system be as follows: 3 2 2 1 1 0
Observation • If all enabled processes execute simultaneously, then the number of enabled processes is maximal • A computation model where all enabled processes execute simultaneously is called synchronous or maximum-parallelism semantics
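The sample line with IDs 0..5 and x values 3 2 2 1 1 0 can be executed directly. This is a minimal sketch: it assumes pairs (x.j, j) are compared lexicographically and that a synchronous step lets every enabled process finish its critical section and increment x.j at once. The names `enabled` and `synchronous_step` are illustrative.

```python
def enabled(x):
    """Processes on a line whose (x.j, j) is smallest in their neighborhood."""
    n = len(x)
    result = []
    for j in range(n):
        nbrs = [k for k in (j - 1, j + 1) if 0 <= k < n]
        if all((x[j], j) < (x[k], k) for k in nbrs):
            result.append(j)
    return result


def synchronous_step(x):
    """All enabled processes execute together and increment their x value."""
    winners = enabled(x)
    return [v + 1 if j in winners else v for j, v in enumerate(x)]


x0 = [3, 2, 2, 1, 1, 0]
first = enabled(x0)
x1 = synchronous_step(x0)
second = enabled(x1)

print(first, x1, second)
```

No two neighbors are ever enabled in the same step, and under synchronous execution the two halves of the line alternate.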
Observation • Value of x is unbounded in the implementation • This is undesirable in the context of stabilization • Why?
What if We Want to Bound the Value of x • Let's observe the line:
More Generally • Suppose x value is bounded to be between 0..B-1 • How do you decide which x value is smaller? • One approach • (x 'behind' y) iff (y-x) mod B < n • (x 'far' y) iff NOT( (x 'behind' y) || (y 'behind' x) )
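The modular comparison is easy to get wrong by hand, so a short sketch may help. It reads 'far' as "neither value is behind the other", which is an assumption about the intended definition, and the values B = 16 and n = 4 are chosen only for illustration.

```python
def behind(x, y, B, n):
    """x is 'behind' y if y is fewer than n increments ahead of x (mod B)."""
    return (y - x) % B < n


def far(x, y, B, n):
    """x and y are 'far' if neither is behind the other (assumed reading)."""
    return not (behind(x, y, B, n) or behind(y, x, B, n))


B, n = 16, 4
print(behind(14, 1, B, n))   # 1 is 3 increments ahead of 14, wrapping around
print(behind(1, 14, B, n))   # 14 is 13 increments ahead of 1
print(far(0, 8, B, n))       # 8 apart in both directions
```

Python's `%` returns a non-negative result for a positive modulus, so `(y - x) % B` is the wrap-around distance from x to y without extra handling of negative differences.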
Algorithm • Action 1 • If x.j is 'behind' all its neighbors, increment x.j • Action 2 • If j has a neighbor k such that x.j is 'far' from x.k and x.j > x.k, set x.j = 0
Issues • Each process must participate whether it wants to execute or not • Very strong fairness
Other Models and Transformations • Sensor Networks • Communication model: Broadcast with collision • Can be modeled as 'Write-All-With-Collision' • Local mutual exclusion algorithm can be used for transformation to this model • The local mutual exclusion algorithm must itself be correct under this model • Easy to construct for • Fixed topology • With Randomization