
CPSC 668 Distributed Algorithms and Systems

Fall 2006, Prof. Jennifer Welch. Logical Clocks Motivation: in an asynchronous system, we often cannot tell which of two events occurred before the other.


Presentation Transcript


  1. CPSC 668 Distributed Algorithms and Systems, Fall 2006, Prof. Jennifer Welch. Set 12: Causality

  2. Logical Clocks Motivation
     • In an asynchronous system, we often cannot tell which of two events occurred before the other.
     [Figure: Examples A and B, each with processors p0 and p1 and messages m0 and m1.]

  3. Logical Clocks Motivation
     • In Example A, processors cannot tell which message was sent first. Probably not important.
     • In Example B, processors can tell which message was sent first. Might be important.
     • Let's try to determine the relative ordering of some (not all) events.

  4. Happens Before Partial Order
     • Given an execution, computation event a happens before computation event b, denoted a → b, if
       • a and b occur at the same processor and a precedes b, or
       • a results in sending m and b includes the receipt of m, or
       • there exists a computation event c such that a → c and c → b (transitive closure).

  5. Happens Before Partial Order
     • Happens before means that information can flow from a to b, i.e., that a might cause b.
     [Figure: timelines of p0 and p1 with messages m0 and m1 and events labeled a and b.]

  6. Concurrent Events
     • If a does not happen before b, and b does not happen before a, then a and b are concurrent, denoted a || b.

  7. Happens Before Example
     Rule 1: a → b, c → d → e → f, g → h → i
     Rule 2: a → d, g → e, f → i
     Rule 3: a → e, c → i, …
     Concurrent: h || e, …
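The three rules can be applied mechanically. The sketch below is only an illustration: it assumes the example has events a, b at p0, events c, d, e, f at p1, events g, h, i at p2, and messages a → d, g → e, f → i (read off the Rule 1 and Rule 2 lists). The names `LOCAL_ORDER`, `MESSAGES`, `happens_before` and `concurrent` are hypothetical.

```python
from itertools import product

# Assumed reading of the example: per-processor event order and messages.
LOCAL_ORDER = [["a", "b"], ["c", "d", "e", "f"], ["g", "h", "i"]]
MESSAGES = [("a", "d"), ("g", "e"), ("f", "i")]  # (send event, receive event)


def happens_before(local_order, messages):
    """Return the set of pairs (x, y) with x -> y."""
    events = [e for proc in local_order for e in proc]
    hb = set()
    # Rule 1: consecutive events at the same processor.
    for proc in local_order:
        hb.update(zip(proc, proc[1:]))
    # Rule 2: a send happens before the matching receive.
    hb.update(messages)
    # Rule 3: transitive closure (Warshall; k is the outermost loop).
    for k, x, y in product(events, repeat=3):
        if (x, k) in hb and (k, y) in hb:
            hb.add((x, y))
    return hb


def concurrent(hb, x, y):
    """x || y: neither event happens before the other."""
    return x != y and (x, y) not in hb and (y, x) not in hb


HB = happens_before(LOCAL_ORDER, MESSAGES)
print(("a", "e") in HB, ("c", "i") in HB, concurrent(HB, "h", "e"))
```

Rule 1 only needs consecutive pairs here because the closure in Rule 3 fills in the rest.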

  8. Logical Clocks
     • Logical clocks are values assigned to events to provide some information about the order in which events happen.
     • Goal is to assign an integer L(e) to each computation event e in an execution such that if a → b, then L(a) < L(b).

  9. Logical Timestamps Algorithm
     • Each p_i keeps a counter (logical timestamp) L_i, initially 0.
     • Every message p_i sends is timestamped with the current value of L_i.
     • L_i is incremented at each step to be greater than both
       • its current value, and
       • the timestamps on all messages received at this step.
     • If a is an event at p_i, then assign L(a) to be the value of L_i at the end of a.
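A minimal sketch of the algorithm above, not a definitive implementation. The class name `LogicalClock` and its `step` method are hypothetical, and the sketch assumes that a message sent during a step carries the value of L_i at the end of that step. The usage replays an assumed execution with events a, b at p0, c, d, e, f at p1, g, h, i at p2, and messages a → d, g → e, f → i.

```python
class LogicalClock:
    """The counter L_i kept by one processor."""

    def __init__(self):
        self.value = 0  # L_i, initially 0

    def step(self, received=()):
        """Run one computation step.

        received: timestamps on the messages received in this step.
        Returns L(e) for this step's event e. Assumption: a message
        sent in this step is stamped with the returned value.
        """
        # Smallest increment that exceeds the current value and every
        # timestamp received in this step.
        self.value = max([self.value, *received]) + 1
        return self.value


p0, p1, p2 = LogicalClock(), LogicalClock(), LogicalClock()
L = {}
L["a"] = p0.step()
L["b"] = p0.step()
L["c"] = p1.step()
L["d"] = p1.step([L["a"]])  # d receives the message sent in a
L["g"] = p2.step()
L["e"] = p1.step([L["g"]])  # e receives the message sent in g
L["f"] = p1.step()
L["h"] = p2.step()
L["i"] = p2.step([L["f"]])  # i receives the message sent in f
print(L)
```

Incrementing by exactly one past the maximum is one choice among many; any larger jump would also satisfy the rule.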

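  10. Logical Timestamps Example
     [Figure: example execution with logical timestamps 1, 2 at p0 (events a, b); 1, 2, 3, 4 at p1 (events c, d, e, f); and 1, 2, 5 at p2 (events g, h, i).]
     a → b : L(a) = 1 < 2 = L(b)
     f → i : L(f) = 4 < 5 = L(i)
     a → e : L(a) = 1 < 3 = L(e)
     etc.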

  11. Getting a Total Order
     • If a total order is required, break ties using ids.
     • In the example, L(a) = (1,0), L(c) = (1,1), etc.
     • Timestamps are ordered lexicographically.
     • In the example, L(a) < L(c).
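As a small illustration, (timestamp, id) pairs can be stored as Python tuples, which already compare lexicographically. The pairs for b and g are an assumption derived from L(b) = 2 at p0 and L(g) = 1 at p2; the names `stamps` and `total_order` are hypothetical.

```python
# (logical timestamp, processor id) for a few events of the example.
stamps = {"a": (1, 0), "b": (2, 0), "c": (1, 1), "g": (1, 2)}

# Tuples compare lexicographically: timestamp first, then id as tie-breaker.
total_order = sorted(stamps, key=stamps.get)
print(total_order)
```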

  12. Drawback of Logical Clocks
     • a → b implies L(a) < L(b), but L(a) < L(b) does not necessarily imply a → b.
     • In the previous example, L(g) = 1 and L(b) = 2, but g does not happen before b.
     • Reason is that "happens before" is a partial order, but logical clock values are integers, which are totally ordered.

  13. Vector Clocks
     • Generalize logical clocks to provide non-causality information as well as causality information.
     • Implement with values drawn from a partially ordered set instead of a totally ordered set.
     • Assign a value V(e) to each computation event e in an execution such that a → b if and only if V(a) < V(b).

  14. Vector Timestamps Algorithm
     • Each p_i keeps an n-vector V_i, initially all 0's.
     • Entry j in V_i is p_i's estimate of how many steps p_j has taken.
     • Every message p_i sends is timestamped with the current value of V_i.
     • At every step, increment V_i[i] by 1.
     • When receiving a message with vector timestamp T, update V_i's components j ≠ i so that V_i[j] = max(T[j], V_i[j]).
     • If a is an event at p_i, then assign V(a) to be the value of V_i at the end of a.
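The following is a rough sketch of these rules, with hypothetical names (`VectorClock`, `step`). It assumes a message sent during a step carries V_i as it stands at the end of that step, and it replays an assumed three-processor execution with events a, b at p0, c, d, e, f at p1, g, h, i at p2, and messages a → d, g → e, f → i.

```python
class VectorClock:
    """The n-vector V_i kept by processor p_i."""

    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n  # initially all 0's

    def step(self, received=()):
        """Run one computation step.

        received: vector timestamps T on the messages received in this step.
        Returns V(e) for this step's event e. Assumption: a message sent
        in this step is stamped with the returned vector.
        """
        self.v[self.pid] += 1  # own entry counts own steps
        for t in received:
            for j in range(len(self.v)):
                if j != self.pid:
                    self.v[j] = max(self.v[j], t[j])
        return tuple(self.v)


p0, p1, p2 = (VectorClock(i, 3) for i in range(3))
V = {}
V["a"] = p0.step()
V["b"] = p0.step()
V["c"] = p1.step()
V["d"] = p1.step([V["a"]])  # d receives the message sent in a
V["g"] = p2.step()
V["e"] = p1.step([V["g"]])  # e receives the message sent in g
V["f"] = p1.step()
V["h"] = p2.step()
V["i"] = p2.step([V["f"]])  # i receives the message sent in f
print(V)
```

Returning an immutable tuple means a timestamp already attached to a message cannot change when the sender takes further steps.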

  15. Manipulating Vector Timestamps
     Let V and W be two n-vectors of integers.
     • Equality: V = W iff V[i] = W[i] for all i. Example: (3,2,4) = (3,2,4)
     • Less than or equal: V ≤ W iff V[i] ≤ W[i] for all i. Example: (2,2,3) ≤ (3,2,4) and (3,2,4) ≤ (3,2,4)
     • Less than: V < W iff V ≤ W but V ≠ W. Example: (2,2,3) < (3,2,4)
     • Incomparable: V || W iff !(V ≤ W) and !(W ≤ V). Example: (3,2,4) || (4,1,4)

  16. Manipulating Vector Timestamps
     • The partial order on n-vectors just defined is not the same as lexicographic ordering.
     • Lexicographic ordering is a total order on vectors.
     • Consider (3,2,4) vs. (4,1,4) in the two approaches: they are incomparable in the partial order, but (3,2,4) < (4,1,4) lexicographically.
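The comparisons can be written down directly. This is a sketch with hypothetical helper names (`leq`, `less`, `incomparable`); the last line contrasts the componentwise order with Python's built-in tuple comparison, which is lexicographic.

```python
def leq(v, w):
    """V <= W iff V[i] <= W[i] for all i."""
    return all(x <= y for x, y in zip(v, w))


def less(v, w):
    """V < W iff V <= W but V != W."""
    return leq(v, w) and tuple(v) != tuple(w)


def incomparable(v, w):
    """V || W iff neither V <= W nor W <= V."""
    return not leq(v, w) and not leq(w, v)


print(less((2, 2, 3), (3, 2, 4)))          # componentwise order
print(incomparable((3, 2, 4), (4, 1, 4)))  # componentwise order
print((3, 2, 4) < (4, 1, 4))               # built-in lexicographic order
```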

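  17. Vector Timestamps Example
     [Figure: example execution with vector timestamps (1,0,0), (2,0,0) at p0 (events a, b); (0,1,0), (1,2,0), (1,3,1), (1,4,1) at p1 (events c, d, e, f); and (0,0,1), (0,0,2), (1,4,3) at p2 (events g, h, i).]
     V(g) = (0,0,1) and V(b) = (2,0,0), which are incomparable. Compare with logical clocks L(g) = 1 and L(b) = 2.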

  18. Correctness of Vector Timestamps
     Theorem (6.5 & 6.6): Vector timestamps implement vector clocks.
     Proof: First, show a → b implies V(a) < V(b).
     Case 1: a and b both occur at p_i, a first. Since V_i increases at each step, V(a) < V(b).

  19. Correctness of Vector Timestamps
     Case 2: a occurs at p_i and causes m to be sent, while b occurs at p_j and includes the receipt of m.
     • During b, p_j updates its vector timestamp in such a way that V(a) ≤ V(b).
     • p_i's estimate of the number of steps taken by p_j is never an over-estimate. Since m is not received before it is sent, p_i's estimate of the number of steps taken by p_j when a occurs is less than the number of steps taken by p_j when b occurs. So V(a)[j] < V(b)[j].
     • Thus V(a) < V(b).

  20. Correctness of Vector Timestamps
     Case 3: There exists c such that a → c and c → b. By induction (from Cases 1 and 2) and transitivity of <, V(a) < V(b).
     Next show V(a) < V(b) implies a → b. Equivalent to showing !(a → b) implies !(V(a) < V(b)).

  21. Correctness of Vector Timestamps
     • Suppose a occurs at p_i, b occurs at p_j, and a does not happen before b.
     • Let V(a)[i] = k.
     • Since a does not happen before b, there is no chain of messages from p_i to p_j originating at p_i's k-th step or later and ending at p_j before b.
     • Thus V(b)[i] < k = V(a)[i], so V(a) ≤ V(b) fails in coordinate i.
     • Thus !(V(a) < V(b)).
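The theorem can be sanity-checked by brute force on a small execution. The sketch below is a check on one example, not a proof. It assumes the three-processor execution with events a, b at p0, c, d, e, f at p1, g, h, i at p2, and messages a → d, g → e, f → i; all names (`SCHEDULE`, `vector_timestamps`, `happens_before`, `less`) are hypothetical.

```python
from itertools import product

LOCAL_ORDER = [["a", "b"], ["c", "d", "e", "f"], ["g", "h", "i"]]
SENDER_OF = {"d": "a", "e": "g", "i": "f"}  # receive event -> send event
# One global schedule in which every message is sent before it is received.
SCHEDULE = ["a", "b", "c", "d", "g", "e", "f", "h", "i"]


def vector_timestamps():
    """Run the vector timestamps algorithm over SCHEDULE."""
    n = len(LOCAL_ORDER)
    owner = {e: p for p, evs in enumerate(LOCAL_ORDER) for e in evs}
    clocks = [[0] * n for _ in range(n)]
    stamp = {}
    for e in SCHEDULE:
        p = owner[e]
        clocks[p][p] += 1
        if e in SENDER_OF:
            t = stamp[SENDER_OF[e]]
            for j in range(n):
                if j != p:
                    clocks[p][j] = max(clocks[p][j], t[j])
        stamp[e] = tuple(clocks[p])
    return stamp


def happens_before():
    """Happens-before as a set of pairs, via transitive closure."""
    hb = {(x, y) for evs in LOCAL_ORDER for x, y in zip(evs, evs[1:])}
    hb |= {(s, r) for r, s in SENDER_OF.items()}
    for k, x, y in product(SCHEDULE, repeat=3):
        if (x, k) in hb and (k, y) in hb:
            hb.add((x, y))
    return hb


def less(v, w):
    return v != w and all(x <= y for x, y in zip(v, w))


V, HB = vector_timestamps(), happens_before()
agree = all(((x, y) in HB) == less(V[x], V[y])
            for x, y in product(SCHEDULE, repeat=2) if x != y)
print(agree)
```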

  22. Size of Vector Timestamps
     • Vector timestamps are big:
       • n components in each one
       • values in the components grow without bound
     • Is there a more efficient way to implement vector clocks?
     • Answer is NO, at least under some conditions.

  23. Vector Clock Size Lower Bound
     Theorem (6.9): Any implementation of vector clocks using vectors of real numbers requires vectors of length n (number of processors).
     Proof: For any value of n, consider this execution:
     [Figure: the execution.]

  24. Example Bad Execution
     For n = 4:
     [Figure: the execution for n = 4.]

  25. Vector Clock Size Lower Bound
     Claim 1: a_{i+1} || b_i for all i (with wraparound).
     Proof: Since each processor does all sends before any receives, there is no transitivity. Also p_{i+1} does not send to p_i.
     Claim 2: a_{i+1} → b_j for all j ≠ i.
     Proof: If j = i+1, obvious. If j ≠ i+1, then p_{i+1} sends to p_j:
     [Figure: p_{i+1} sending to p_j.]

  26. Vector Clock Size Lower Bound
     • Suppose, in contradiction, there is a way to implement vector clocks with k-vectors of reals, where k < n.
     • By Claim 1, a_{i+1} || b_i
       => V(a_{i+1}) and V(b_i) are incomparable
       => V(a_{i+1}) is larger than V(b_i) in some coordinate h(i)
       => this defines a function h : {0,…,n-1} → {0,…,k-1}

  27. Vector Clock Size Lower Bound
     • Since k < n, the function h is not 1-1. So there exist distinct i and j such that h(i) = h(j). Let r be this common value of h.
     • So V(a_{i+1}) is larger than V(b_i) in coordinate r, and V(a_{j+1}) is larger than V(b_j) in coordinate r also.
     • V(a_{j+1})[r] > V(b_j)[r]      by definition of r
                     ≥ V(a_{i+1})[r]  by Claim 2 (a_{i+1} → b_j) and correctness of V
                     ≥ V(b_i)[r]      by definition of r
     • Thus V(a_{j+1}) is not less than V(b_i), contradicting Claim 2 (a_{j+1} → b_i, since i ≠ j) and the assumed correctness of V.

  28. Application of Causality: Consistent Cuts
     • Consider an asynchronous message passing system with
       • FIFO message delivery per channel
       • at most one message received per computation step
     • Number the computation steps of each processor 1, 2, 3, …
     • A cut of an execution is K = (k_0,…,k_{n-1}), where k_i indicates the number of computation steps taken by p_i.

  29. Consistent Cuts
     [Figure: an execution with some cuts.]
     In a consistent cut K = (k_0,…,k_{n-1}), if step s of p_j happens before step k_i of p_i, then s ≤ k_j.
     (1,3) and (1,4) are consistent. (3,6) is inconsistent: step 4 of p0 happens before step 6 of p1.
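A cut can be tested with an equivalent formulation of the definition: a cut is consistent exactly when no message received inside the cut was sent outside it (local steps can only lead from inside the cut to outside, so any violating chain must contain such a message). The sketch below uses a hypothetical two-processor execution with a single message, sent in step 4 of p0 and received in step 6 of p1, chosen only to agree with the statements on this slide; `is_consistent` and `MESSAGES` are hypothetical names.

```python
def is_consistent(cut, messages):
    """cut[i] is the number of steps of p_i included in the cut.

    messages: list of ((j, s), (i, t)) meaning step s of p_j sends a
    message that is received in step t of p_i.
    """
    for (j, s), (i, t) in messages:
        received_inside = t <= cut[i]
        sent_outside = s > cut[j]
        if received_inside and sent_outside:
            return False
    return True


# Hypothetical execution: step 4 of p0 sends, step 6 of p1 receives.
MESSAGES = [((0, 4), (1, 6))]
for cut in [(1, 3), (1, 4), (3, 6)]:
    print(cut, is_consistent(cut, MESSAGES))
```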

  30. Finding a Recent Consistent Cut
     Problem Version 1: Processors are all given a cut K and must find a maximal consistent cut that is ≤ K.
     Application: Logging-based crash recovery.
     • Processors periodically write their state to stable storage.
     • When a processor recovers from a crash, it tries to recover to the latest logged state, but needs to coordinate with other processors.

  31. Vector Clocks Solution
     • Implement vector clocks using vector timestamps appended to application messages.
     • Store the vector clock of each computation step in a local array store.
     When p_i is given input cut K:
         for x := K[i] downto 1 do
             if store[x] ≤ K then return x    (x is entry i of the global answer)
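A sketch of the local computation, under stated assumptions: `store[x]` holds the vector timestamp of step x (index 0 is unused), and returning 0 when no step qualifies is a choice made here, since the slide does not say what happens in that case. The data is a hypothetical two-processor execution in which step 4 of p0 sends a message received in step 6 of p1; `local_answer` and `STORES` are hypothetical names.

```python
def leq(v, w):
    """Componentwise V <= W."""
    return all(x <= y for x, y in zip(v, w))


def local_answer(i, store, K):
    """Entry i of the answer, computed by p_i from its own store only."""
    for x in range(K[i], 0, -1):  # x := K[i] downto 1
        if leq(store[x], K):
            return x
    return 0  # assumption: no step of p_i can be included


# Hypothetical stores: p0 takes 4 steps, p1 takes 6; the message sent in
# step 4 of p0 is received in step 6 of p1.
STORES = [
    [None, (1, 0), (2, 0), (3, 0), (4, 0)],
    [None, (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (4, 6)],
]
K = (3, 6)
answer = tuple(local_answer(i, STORES[i], K) for i in range(len(K)))
print(answer)
```

Here step 6 of p1 is rejected because its timestamp records 4 steps of p0, more than K allows, so p1 falls back to step 5.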

  32. What About Channel State?
     • Processor states are not sufficient to capture the entire system state.
     • Messages in transit must be calculated.
     • Solution here requires
       • additional storage (number of messages)
       • additional computation at recovery time (involving replaying the original execution to capture messages sent but not received)

  33. Another Take on Recent Consistent State
     Problem Version 2: A subset of processors initiate (at arbitrary times) trying to find a consistent cut that includes the state of at least one of the initiators when it started.
     Called a distributed snapshot.
     Application: termination detection

  34. Marker Algorithm
     • Instead of adding extra information on each application message, insert control messages ("markers") into the channels.
     initially answer = -1 and num = 0
     when app msg arrives:
         num++; do app action
     when marker arrives or start:
         if answer = -1 then
             answer := num    (part of final answer)
             send marker to all neighbors
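The pseudocode can be exercised with a toy simulation. This is a sketch only: FIFO channels are modeled as deques keyed by (source, destination), and the names `Process`, `MARKER`, `receive` and `marker_or_start` are hypothetical.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical sentinel for the control message


class Process:
    def __init__(self, pid, neighbors, channels):
        self.pid = pid
        self.neighbors = neighbors
        self.channels = channels  # (src, dst) -> FIFO deque
        self.answer = -1          # initially answer = -1
        self.num = 0              # and num = 0

    def send(self, dst, msg):
        self.channels[(self.pid, dst)].append(msg)

    def marker_or_start(self):
        """Handler for 'when marker arrives or start'."""
        if self.answer == -1:
            self.answer = self.num  # part of final answer
            for q in self.neighbors:
                self.send(q, MARKER)

    def receive(self, src):
        """Deliver the oldest message on the channel from src."""
        msg = self.channels[(src, self.pid)].popleft()
        if msg == MARKER:
            self.marker_or_start()
        else:
            self.num += 1  # the application action would run here


channels = {(0, 1): deque(), (1, 0): deque()}
p0 = Process(0, [1], channels)
p1 = Process(1, [0], channels)
p0.send(1, "m0")
p0.send(1, "m1")
p0.marker_or_start()  # p0 initiates the snapshot
for _ in range(3):    # p1 gets m0, m1, then the marker, in FIFO order
    p1.receive(0)
p0.receive(1)         # p1's marker reaches p0; p0 already has an answer
print(p0.answer, p1.answer)
```

Because channels are FIFO, p1 counts both application messages before it sees the marker that p0 sent after them.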

  35. What About Channel States?
     • p_i records the sequence of messages received from p_j between the time p_i records its answer and the time p_i gets the marker from p_j.
     • These are the messages in transit from p_j to p_i in the cut returned by the algorithm.
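Channel recording can be sketched by adding two pieces of bookkeeping to a marker-algorithm process. Everything here is illustrative: `SnapshotProcess`, `marker_seen` and `in_transit` are hypothetical names, and FIFO channels are modeled as deques keyed by (source, destination).

```python
from collections import deque

MARKER = "MARKER"  # hypothetical sentinel for the control message


class SnapshotProcess:
    def __init__(self, pid, neighbors, channels):
        self.pid = pid
        self.neighbors = neighbors
        self.channels = channels
        self.answer = -1
        self.num = 0
        self.marker_seen = set()                     # neighbors whose marker arrived
        self.in_transit = {q: [] for q in neighbors}  # recorded channel states

    def send(self, dst, msg):
        self.channels[(self.pid, dst)].append(msg)

    def marker_or_start(self):
        if self.answer == -1:
            self.answer = self.num
            for q in self.neighbors:
                self.send(q, MARKER)

    def receive(self, src):
        msg = self.channels[(src, self.pid)].popleft()
        if msg == MARKER:
            self.marker_or_start()
            self.marker_seen.add(src)  # stop recording this channel
            return
        # Record messages that arrive after this process recorded its
        # answer but before the marker from src.
        if self.answer != -1 and src not in self.marker_seen:
            self.in_transit[src].append(msg)
        self.num += 1


channels = {(0, 1): deque(), (1, 0): deque()}
p0 = SnapshotProcess(0, [1], channels)
p1 = SnapshotProcess(1, [0], channels)
p1.send(0, "x")       # application message, sent before p1 records
p0.marker_or_start()  # p0 initiates and records answer = 0
p0.receive(1)         # "x" arrives after p0 recorded: in transit
p1.receive(0)         # p1 gets the marker, records, sends its own marker
p0.receive(1)         # p1's marker closes the channel from p1 to p0
print(p0.in_transit, p1.in_transit)
```

In this run "x" was sent before p1 recorded its answer and received after p0 recorded its own, so it belongs to the channel from p1 to p0 in the recorded cut.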
