1 / 73

Selected topics in distributed computing

Selected topics in distributed computing. Shmuel Zaks zaks@cs.technion.ac.il. Part 1: Lower bounds and impossibility results Part 2: On two fundamental notions: snapshot in a network, cost of synchronization Part 3 : Self stabilization.

bjorn
Download Presentation

Selected topics in distributed computing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Selected topics in distributed computing Shmuel Zaks zaks@cs.technion.ac.il

  2. Part 1:Lower bounds and impossibility results Part 2: On two fundamental notions: snapshot in a network, cost of synchronization Part 3: Self stabilization l'Aquila, Distributed Computing, 12-14/1/09

  3. A new approach to fault tolerance • A unified approach to transient failures by formally incorporating them into the design model l'Aquila, Distributed Computing, 12-14/1/09

  4. Part 3.1:Clock synchronization (of Gouda and Herman) clock synchronization shared memory, synchronous x x x x 6 7 8 8 7 6 x x 8 7 6 x x x x 6 7 8 8 7 6 l'Aquila, Distributed Computing, 12-14/1/09

  5. Program at each processor p : choose a neighbor x of p ; if t(p) = t(x) then t(p):= t(p)+1; Stabilizing if all start with the same time, however … l'Aquila, Distributed Computing, 12-14/1/09

  6. if t(p) = t(x) then t(p):= t(p)+1; 6 7 6 7 4 Is it stabilizing? no! (may even deadlock) l'Aquila, Distributed Computing, 12-14/1/09

  7. Program at each processor p : choose a neighbor x of p ; if t(p) = t(x) then t(p):= t(p)+1; if t(p) < t(x) then t(p):= t(x); Is it stabilizing? l'Aquila, Distributed Computing, 12-14/1/09

  8. if t(p) = t(x) then t(p):= t(p)+1; if t(p) < t(x) then t(p):= t(x); x x x x 8 7 6 7 7 8 x x 8 7 6 x x x x 7 7 8 8 7 4 Is it stabilizing? yes! l'Aquila, Distributed Computing, 12-14/1/09

  9. if t(p) = t(x) then t(p):= t(p)+1; if t(p) < t(x) then t(p):= t(x); x x x x 8 7 6 7 8 9 x x 8 7 6 x x x x 8 7 9 8 7 4 Is it stabilizing? no! l'Aquila, Distributed Computing, 12-14/1/09

  10. Program at each processor p : choose a neighbor x of p if t(p) = t(x) then t(p):= t(p)+1; fairly; if t(p) < t(x) then t(p):= t(x); yes! Is it stabilizing? l'Aquila, Distributed Computing, 12-14/1/09

  11. proof l'Aquila, Distributed Computing, 12-14/1/09

  12. 6 7 6 S: 7 4 n=5, range(S)=7-4=3, top(S)=2 l'Aquila, Distributed Computing, 12-14/1/09

  13. x x 7 6 7 7 x 7 6 S1: x x 7 7 7 4 n=5, range(S1)=7-7=0, top(S1)=5 l'Aquila, Distributed Computing, 12-14/1/09

  14. x x 7 6 7 8 x 7 6 S2: x x 8 7 7 4 n=5, range(S2)=8-7=1, top(S2)=2 l'Aquila, Distributed Computing, 12-14/1/09

  15. x x x x 8 7 6 7 8 9 x x 8 7 6 S3: x x x x 8 9 7 8 7 4 n=5, range(S3)=9-8=1, top(S3)=2 l'Aquila, Distributed Computing, 12-14/1/09

  16. hw: prove. 6 7 S : 6 7 4 Use for: correctness (especially termination) + complexity l'Aquila, Distributed Computing, 12-14/1/09

  17. Part 3.2:Mutual exclusion (Dijkstra’s 1st algorithm) • distributed control • ring of (finite-state machine) processes • for each machine define privileges (when a transition is enabled) • nodes are influenced only by neighbors • legitimate states • in each legitimate state there is at least one privileged machine • in each legitimate state each possible move will bring the system again to a legitimate state l'Aquila, Distributed Computing, 12-14/1/09

  18. A system is self-stabilizing if , regardless of the initial state and regardless of the privilege selected each time for the next move, at least one privilege will always be present and the system is guaranteed to find itself in a legitimate state after a finite number of moves. l'Aquila, Distributed Computing, 12-14/1/09

  19. Finite number of transitions non-stable states stable states l'Aquila, Distributed Computing, 12-14/1/09

  20. In Dijkstra’s model: • Token ring. • N machines, denoted 0 through N-1. • A machine is priviliged (“has token”) or not priviliged. • daemon/schedulerdetermines which of the priviliged machines will make the next move • Legitimate States: all states with exactly one privileged machine. l'Aquila, Distributed Computing, 12-14/1/09

  21. Notes to consider: scheduler (daemon) : • points at each time to one of the privileged machines, or • point at each time to one of the machines. • centralized scheduler, distributed scheduler fairness: • Do we need fairness, and of what kind? • Unconditional Fairness:Each machine gets pointed to infinitely often. • Type of fairness: - each machine is pointed at infinitely often, - round robin, or - other. l'Aquila, Distributed Computing, 12-14/1/09

  22. Dijkstra’s 1st algorithm • N processors 0,1,…,N-1 • Solution with k-state Machines • machine P0 if x[0] = x[N-1] then x[0] := x[0]+1 mod k • machine Pi ( i= 1,2,…,N-1) if x[i] ≠ x[i-1] then x[i] := x[i-1] l'Aquila, Distributed Computing, 12-14/1/09

  23. N=4, K=3 State : 0,1,2 p0 2 0 0 p1 p3 1 p2 l'Aquila, Distributed Computing, 12-14/1/09

  24. machine P0: if x[0] = x[N-1] then x[0] := x[0]+1 mod k p0 2 0 0 p1 p3 1 p2 l'Aquila, Distributed Computing, 12-14/1/09

  25. machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1] p0 2 0 0 p1 p3 1 p2 l'Aquila, Distributed Computing, 12-14/1/09

  26. So far: machine P0: if x[0] = x[N-1] then x[0] := x[0]+1 mod k machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1] p0 2 0 0 p1 p3 1 p2 l'Aquila, Distributed Computing, 12-14/1/09

  27. Example 1 machine P0: if x[0] = x[N-1] then x[0] := x[0]+1 mod k machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1] p0 0 2 K=3 2 1 0 0 0 2 0 p1 p3 2 0 1 p2 Etc… l'Aquila, Distributed Computing, 12-14/1/09

  28. Example 2 machine P0: if x[0] = x[N-1] then x[0] := x[0]+1 mod k machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1] p0 2 1 0 K=2 0 2 0 1 1 0 0 2 1 1 0 0 p1 p3 2 1 0 1 p2 Etc… l'Aquila, Distributed Computing, 12-14/1/09

  29. Example 3 machine P0: if x[0] = x[N-1] then x[0] := x[0]+1 mod k machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1] p0 2 0 1 0 0 K=2 1 0 0 0 2 0 1 0 1 2 0 0 1 1 1 0 p1 p3 0 2 1 1 1 1 p2 Etc… l'Aquila, Distributed Computing, 12-14/1/09

  30. Note: machine P0: if x[0] = x[N-1] then x[0] := x[0]+1 mod k machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1] Informally: “If machine is privileged, then make it non-privileged”. l'Aquila, Distributed Computing, 12-14/1/09

  31. machine P0: if x[0] = x[N-1] then x[0] := x[0]+1 mod k machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1] Legitimate States: exactly one privileged machine p0 0 1 2 1 K=2 0 1 2 0 1 2 1 0 p1 p3 1 1 2 p2 l'Aquila, Distributed Computing, 12-14/1/09

  32. Legitimate States 0 0 0 0 0 …… 1 0 0 0 0 1 1 0 0 0 0 0 K–1 K–1 K–1 …… 1 1 1 1 1 0 K-1 K–1 K–1 K–1 2 1 1 1 1 … 2 2 2 2 2 … K–1 K–1 K–1 K–1 K–1 l'Aquila, Distributed Computing, 12-14/1/09

  33. Intuition: 0 1 2 3 4 0 2 4 1 0 1 2 4 1 0 1 2 4 1 1 2 2 4 1 1 2 2 2 1 1 0 1 2 3 4 0 3 2 1 0 1 3 2 1 0 1 1 2 1 0 1 1 2 2 0 1 1 2 2 2 Goal: Show that from an arbitrary state, the system stabilizes to a legitimate state in a finite number of steps. N=5, K = 4 N=5, K = 5 l'Aquila, Distributed Computing, 12-14/1/09

  34. 0 1 2 3 4 1 1 0 2 1 2 1 0 2 1 2 1 0 2 2 2 1 0 0 2 2 1 1 0 2 2 2 1 0 2 0 2 1 0 2 0 2 1 0 0 0 2 1 1 0 0 2 2 1 0 0 0 2 1 0 N=5, K = 3 • 0 1 2 3 4 • 0 1 2 1 0 • 0 0 2 1 0 • 1 0 2 1 0 • 1 0 2 1 1 • 1 0 2 2 1 • 0 0 2 1 hw: which configurations have the same property? (Check also for all other examples.) hw: can you get to this configuration? l'Aquila, Distributed Computing, 12-14/1/09

  35. Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2 . l'Aquila, Distributed Computing, 12-14/1/09

  36. machine P0: if x[0] = x[N-1] then x[0] := x[0]+1 mod k machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1] Lemma 2: (liveness) At every configuration, at least one machine is privileged. Lemma 1: If all states are equal then the system is stabilized. Lemma 3: Starting from any configuration, P0will eventually make a move. (Hw) Corollary: Starting at any configuration, P0 will make an infinite number of steps. (this still does not guarantee stabilization.) l'Aquila, Distributed Computing, 12-14/1/09

  37. machine P0: if x[0] = x[N-1] then x[0] := x[0]+1 mod k machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1] Definition: Pi is called special in a configuration if Lemma 4: If P0 is special in a configuration, then the system will eventually stabilize. Lemma 5: If in a configuration one of the states is missing, then the system will eventually stabilize. l'Aquila, Distributed Computing, 12-14/1/09

  38. Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2 . Proof: Case 1:K< N-1 Case 2:K = N-1 Case 3:K = NCase 4:K > N l'Aquila, Distributed Computing, 12-14/1/09

  39. Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2 . Proof: Case 1. k < N-1 (e.g., N=4, k=2) l'Aquila, Distributed Computing, 12-14/1/09

  40. Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2 . Proof: Case 4.k > N … Lemma 5 l'Aquila, Distributed Computing, 12-14/1/09

  41. Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2 . Proof: Case 3.k = N Either one state is missing, or all states are distinct. … Lemma 5 … Lemma 4 l'Aquila, Distributed Computing, 12-14/1/09

  42. Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2 . Proof: Case 2.k = N - 1 Either one state is missing, or one state occurs twice. … Lemma 5 l'Aquila, Distributed Computing, 12-14/1/09

  43. So, one state occurs twice. Either P0 is special, or x[0] = x[i] … Lemma 4 i < N-1 P0 is not enabled, and after any other move there is a mising state, and … Lemma 5. i = N-1 If Pi, i 0 makes a move, there is missing state, and done by Lemma 5. If P0 makes a move, we get back to the previous case. l'Aquila, Distributed Computing, 12-14/1/09

  44. Theorem: Assuming a distrubted scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-1 . hw: prove l'Aquila, Distributed Computing, 12-14/1/09

  45. Use of a potential function S = a configuration (a system state) r(S) = number of maximal runs of equal states Example: S = 2222011000, f(S) = 4 r(S) + 1 if first(S) = last(S) f(S) = r(S) otherwise. Note: S is legitimate iff f(S) = 2. l'Aquila, Distributed Computing, 12-14/1/09

  46. Show: If K > N and f(S) > 2, then f(S) never increases and is eventually decreased. Recall: A common technique to prove termination and correctness Hw: 1. finish the proof. 2. use f to analyze complexity l'Aquila, Distributed Computing, 12-14/1/09

  47. Applications Some general comments on self stabilization • Distributed data structures • Digital and analog circuits • Genetic algorithms • Network protocols • Sensor networks l'Aquila, Distributed Computing, 12-14/1/09

  48. Self stabilization • is a special kind of fault tolerance • guarantees an automatic recovery from a transient failure • as a design goal is a too strong property and thus either too difficult to achieve or achieved at the expense of other goals l'Aquila, Distributed Computing, 12-14/1/09

  49. Back to Dijkstra’s algorithm Uniform protocol : same program at each processor machine P0: if x[0] = x[N-1] then x[0] := x[0]+1 mod k machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1] Theorem: There is no uniform protocol that solves the mutual exclusion problem (under either a centralized or a distributed scheduler) l'Aquila, Distributed Computing, 12-14/1/09

  50. p0 0 3 p1 p6 7 0 7 0 p5 3 0 3 0 p2 0 7 p3 l'Aquila, Distributed Computing, 12-14/1/09

More Related