Understand stability in distributed systems, termination detection, program design, proofs, refining, and correctness validation.
Lecture 3: State, Detection
Anish Arora, CSE 763
The Stability Detection Problem
• A stable property of a distributed system is one that persists: once a stable property is true, it remains true thereafter
• Examples:
  • “the computation has terminated”
  • “the system is deadlocked”
  • “all tokens in a token ring have disappeared”
• Solution
  • Determine the global state of the system
  • Test the global state to see if the stable property holds
Termination Detection
• Processes 0..N-1 arbitrarily connected by channels
• Each process is either idle or active
• An active process can become idle spontaneously
• An idle process can become active only upon receiving a message
The Problem : Detect that all processes are idle and all channels are empty
Program and Proof (hand-in-hand) Design
• Step 0 : How to count messages in channels.
  process j
    {send msg}     c.j := c.j + 1
  ▯ {receive msg}  c.j := c.j - 1
Proof : Invariant I1 ≡ (Sum j :: c.j) = # of messages in channels
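The counting step can be exercised with a small simulation. This is a minimal sketch, not part of the lecture: the random scheduler, the per-process `channels` inboxes, and the function name `simulate_counters` are all assumptions made for illustration. It checks I1 after every send or receive.

```python
import random
from collections import deque

def simulate_counters(n=4, steps=200, seed=0):
    """Randomly interleave sends and receives and check I1 after each step."""
    rng = random.Random(seed)
    c = [0] * n                                   # c[j]: sends minus receives at j
    channels = [deque() for _ in range(n)]        # hypothetical inbox per process
    in_transit = 0
    for _ in range(steps):
        j = rng.randrange(n)
        if channels[j] and rng.random() < 0.5:
            channels[j].popleft()                 # {receive msg}
            c[j] -= 1
        else:
            dest = rng.randrange(n)
            channels[dest].append("msg")          # {send msg}
            c[j] += 1
        in_transit = sum(len(ch) for ch in channels)
        assert sum(c) == in_transit               # invariant I1
    return c, in_transit
```

Individual counters may go negative (a process can receive more than it sends); only their sum is constrained.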
Refining the program
• Step 1 : How to detect that all processes are idle.
  Consider a logical ring 0 -> N-1 -> … -> 0 and pass a token.
  Let t denote the location of the token, and let q be the running sum of counters that the token carries.
  process j
    {send msg}          c.j := c.j + 1
  ▯ {receive msg}       c.j := c.j - 1
  ▯ {propagate token}   j ≠ 0 ∧ t = j ∧ idle.j → t := t - 1 ; q := q + c.j
  ▯ {retransmit token}  j = 0 ∧ t = j ∧ idle.j ∧ ¬(q + c.0 = 0) → t := N - 1 ; q := 0
Refining the proof
Proof : We begin with an idealized Invariant ≡ I1 ∧ Q, where
  Q ≡ (∀j : t<j ∧ j<N : idle.j) ∧ (q = (Sum j : t<j ∧ j<N : c.j))
However, Q is not preserved by one of the actions (the receive action for j, t<j ∧ j<N).
But when Q is violated, R becomes true, where
  R ≡ q + (Sum j : 0≤j ∧ j≤t : c.j) > 0
So we weaken: Invariant ≡ I1 ∧ (Q ∨ R)
However, R is not preserved by one of the actions (the receive action for j, 0≤j ∧ j≤t).
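A concrete state makes the first weakening visible. The sketch below assumes N = 3 with the token at process 1 (so process 2 has already been visited) and one message from process 0 still in transit to process 2; the state values and the helper `receive` are illustrative choices, not from the lecture. Q reads "every visited process is idle and q equals the sum of their counters"; R reads "q plus the counters of the unvisited processes is positive".

```python
def Q(t, q, c, idle):
    """Visited processes (t < j < N) are idle and q equals the sum of their counters."""
    visited = range(t + 1, len(c))
    return all(idle[j] for j in visited) and q == sum(c[j] for j in visited)

def R(t, q, c):
    """q plus the counters of the processes not yet visited (0 <= j <= t) is positive."""
    return q + sum(c[j] for j in range(0, t + 1)) > 0

def receive(j, c, idle):
    """The {receive msg} action at process j: decrement c.j; j becomes active."""
    c, idle = list(c), list(idle)
    c[j] -= 1
    idle[j] = False
    return c, idle

# Process 0 has sent one message that is still in transit to process 2.
c, idle = [1, 0, 0], [True, True, True]
t, q = 1, 0                           # token has left process 2 carrying q = c[2] = 0

q_before = Q(t, q, c, idle)           # holds before the message is delivered
c, idle = receive(2, c, idle)         # a visited process receives the message
q_after = Q(t, q, c, idle)            # Q is now violated
r_after = R(t, q, c)                  # but R holds, since c[0] = 1 is still ahead of the token
```

The receive leaves q and the counters at positions 0..t untouched, which is why R survives the step that breaks Q.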
Refining the program again
• Step 2 : How to abort a detection when unsure that the token traversal was uninterrupted.
  process j
    {send msg}          c.j := c.j + 1
  ▯ {receive msg}       c.j := c.j - 1 ; blacken j
  ▯ {propagate token}   j ≠ 0 ∧ t = j ∧ idle.j → t := t - 1 ; q := q + c.j ; whiten j
  ▯ {retransmit token}  j = 0 ∧ t = j ∧ idle.j ∧ ¬(q + c.0 = 0 ∧ 0 is white) → t := N - 1 ; q := 0 ; whiten j
Iterated refinement
Proof : Invariant ≡ I1 ∧ (Q ∨ R ∨ S), where
  S ≡ (∃j : 0≤j ∧ j≤t : j is black)
However, S is not preserved by one of the actions (the propagate action at a black node).
So we introduce a color for the token and get the final program:
  process j
    {send msg}          c.j := c.j + 1
  ▯ {receive msg}       c.j := c.j - 1 ; blacken j
  ▯ {propagate token}   j ≠ 0 ∧ t = j ∧ idle.j → t := t - 1 ; q := q + c.j ; if black j then blacken token ; whiten j
  ▯ {retransmit token}  j = 0 ∧ t = j ∧ idle.j ∧ ¬(q + c.0 = 0 ∧ token is white ∧ 0 is white) → t := N - 1 ; q := 0 ; whiten token ; whiten j
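The final program can be run as a single-threaded simulation. This is a sketch under stated assumptions rather than the lecture's own code: the random scheduler, the `budget` that bounds the number of sends so the underlying computation eventually stops, the `inbox` counters standing in for channels, and the initial state (token black at process 0, all processes active) are all hypothetical choices. Detection is taken to be the moment process 0 holds the token, is idle, and the retransmit guard's negated condition holds.

```python
import random

def run_detection(n=5, budget=40, seed=0, max_steps=100_000):
    """Simulate the final program; return (claim_was_correct, step_of_detection)."""
    rng = random.Random(seed)
    idle = [False] * n
    c = [0] * n                     # c[j]: sends minus receives at j
    black = [False] * n
    inbox = [0] * n                 # messages in transit to each process (model assumption)
    t, q, token_black = 0, 0, True  # assumed start: black token at process 0
    for step in range(max_steps):
        actions = []
        for j in range(n):
            if not idle[j]:
                actions.append(("idle", j))
                if budget > 0:
                    actions.append(("send", j))
            if inbox[j] > 0:
                actions.append(("recv", j))
        if idle[t]:
            actions.append(("token", t))
        kind, j = rng.choice(actions)
        if kind == "send":
            inbox[rng.randrange(n)] += 1
            c[j] += 1
            budget -= 1
        elif kind == "recv":
            inbox[j] -= 1
            c[j] -= 1
            black[j] = True         # blacken j
            idle[j] = False         # receiving a message activates j
        elif kind == "idle":
            idle[j] = True
        elif j != 0:                # {propagate token}
            q += c[j]
            token_black = token_black or black[j]
            black[j] = False
            t -= 1
        elif q + c[0] == 0 and not token_black and not black[0]:
            # Process 0 declares termination; check the claim against the model.
            return all(idle) and sum(inbox) == 0, step
        else:                       # {retransmit token}
            t, q, token_black = n - 1, 0, False
            black[0] = False
    raise RuntimeError("no detection within max_steps")
```

Calling `run_detection(seed=s)` for several seeds gives different interleavings; the first component of the result reports whether the declared termination matched the simulated global state.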
Termination Detection
Predicate Termination ≡ (∀j :: idle.j) ∧ # of msgs sent - # of msgs received = 0
Invariant ≡ (Sum j :: c.j) = # of msgs sent - # of msgs received ∧ (Q ∨ R ∨ S ∨ T)
  Q ≡ (∀j : t<j ∧ j<N : idle.j) ∧ (q = (Sum j : t<j ∧ j<N : c.j))
  R ≡ q + (Sum j : 0≤j ∧ j≤t : c.j) > 0
  S ≡ (∃j : 0≤j ∧ j≤t : j is black)
  T ≡ token is black
Proof of correctness
• (1) Invariant ∧ t = 0 ∧ 0 is white ∧ idle.0 ∧ q + c.0 = 0 ∧ token is white ⇒ Termination
• (2) Invariant ∧ Termination leads-to t = 0 ∧ 0 is white ∧ idle.0 ∧ q + c.0 = 0 ∧ token is white
Termination Detection
Proof of (1):
• 0 is white ∧ t = 0 ⇒ ¬S
• q + c.0 = 0 ∧ t = 0 ⇒ ¬R
• token is white ⇒ ¬T
• Hence the antecedent implies Invariant ∧ Q ∧ q + c.0 = 0, i.e., the antecedent implies Termination
Proof of (2):
• If termination has occurred, only the propagation and retransmission actions can execute
• After the first complete traversal of the ring by the token, all processes are white and the token is white
• At the end of the next traversal, when t = 0, the algorithm detects the termination of the underlying computation
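Claim (1) can also be checked by brute force on a small instance. The sketch below assumes N = 3 and counter values restricted to -2..2, both arbitrary bounds chosen to keep the enumeration small; the function names are illustrative. It enumerates every state with t = 0 and a white token, keeps those satisfying the invariant and the rest of the antecedent, and counts any in which Termination fails. Because I1 equates the sum of the counters with the number of messages in channels, "channels are empty" is modelled as the sum being 0.

```python
from itertools import product

N = 3
VALS = range(-2, 3)        # assumed small bound on q and on each c.j
BOOLS = [False, True]

def invariant(t, q, c, idle, black, token_black):
    """I1 and (Q or R or S or T), with I1 modelled as: sum of counters is non-negative."""
    if sum(c) < 0:
        return False
    visited, ahead = range(t + 1, N), range(0, t + 1)
    Q = all(idle[j] for j in visited) and q == sum(c[j] for j in visited)
    R = q + sum(c[j] for j in ahead) > 0
    S = any(black[j] for j in ahead)
    T = token_black
    return Q or R or S or T

def check_claim_1():
    """Return (states satisfying the antecedent, states among them violating Termination)."""
    checked = bad = 0
    t, token_black = 0, False                     # fixed by the antecedent
    for q, c, idle, black in product(VALS, product(VALS, repeat=N),
                                     product(BOOLS, repeat=N),
                                     product(BOOLS, repeat=N)):
        antecedent = (not black[0]) and idle[0] and q + c[0] == 0
        if antecedent and invariant(t, q, c, idle, black, token_black):
            checked += 1
            terminated = all(idle) and sum(c) == 0
            bad += not terminated
    return checked, bad
```

A zero second component only confirms the implication within the assumed bounds; the argument in the proof is what covers the general case.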