1 / 116

Global Predicate Detection and Event Ordering

Global Predicate Detection and Event Ordering. Our Problem. To compute predicates over the state of a distributed application. Model. Message passing No failures Two possible timing assumptions: Synchronous System Asynchronous System No upper bound on message delivery time

zihna
Download Presentation

Global Predicate Detection and Event Ordering

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Global Predicate Detection • and Event Ordering

  2. Our Problem To compute predicates over the state of a distributed application

  3. Model • Message passing • No failures • Two possible timing assumptions: • Synchronous System • Asynchronous System • No upper bound on message delivery time • No bound on relative process speeds • No centralized clock

  4. Asynchronous systems • Weakest possible assumptions • Weak assumptions ´ less vulnerabilities • Asynchronous  slow • “Interesting” model w.r.t. failures

  5. s c Client-Server • Processes exchange messages using • Remote Procedure Call (RPC) A client requests a service by sending the server a message. The client blocks while waiting for a response

  6. #!?%! s c Client-Server • Processes exchange messages using • Remote Procedure Call (RPC) A client requests a service by sending the server a message. The client blocks while waiting for a response The server computes the response (possibly asking other servers) and returns it to the client

  7. Deadlock!

  8. Goal • Design a protocol by which a processor can determine whether a global predicate (say, deadlock) holds

  9. Wait-For Graphs • Draw arrow from pi to pj if pj has received a request but has not responded yet

  10. Wait-For Graphs • Draw arrow from pi to pj if pj has received a request but has not responded yet • Cycle in WFG ) deadlock • Deadlock )¦ cycle in WFG

  11. The protocol • p0 sends a message to p1 p3 • On receipt of p0 ‘s message, pi replies with its state and wait-for info

  12. An execution

  13. An execution

  14. Ghost Deadlock! An execution

  15. We have a problem... • Asynchronous system • no centralized clock, etc. etc. • Synchrony useful to • coordinate actions • order events

  16. Events and Histories • Processes execute sequences of events • Events can be of 3 types: local, send, and receive • epi is the i-th event of process p • The local historyhp of process p is the sequence of events executed by process • hpk : prefix that contains first k events • hp0 : initial, empty sequence • The historyH is the set hp0 [ hp1 [ … hpn-1 NOTE: In H, local histories are interpreted as sets, rather than sequences, of events

  17. time Ordering events • Observation 1: • Events in a local history are totally ordered

  18. Ordering events • Observation 1: • Events in a local history are totally ordered • Observation 2: • For every message m, send(m) precedes receive(m) time time time

  19. Happened-before(Lamport[1978]) • A binary relation defined over events • if eik,eil2hi and k<l , then eik !eil • if ei=send (m) and ej=receive(m), then ei!ej • if e!e’ and e’!e‘’, then e!e‘’

  20. Space-Time diagrams • A graphic representation of a distributed execution time

  21. Space-Time diagrams • A graphic representation of a distributed execution time

  22. Space-Time diagrams • A graphic representation of a distributed execution time

  23. Space-Time diagrams • A graphic representation of a distributed execution time

  24. H and impose a partial order Space-Time diagrams • A graphic representation of a distributed execution time

  25. Space-Time diagrams • A graphic representation of a distributed execution time H and impose a partial order

  26. Space-Time diagrams • A graphic representation of a distributed execution time H and impose a partial order

  27. Space-Time diagrams • A graphic representation of a distributed execution time H and impose a partial order

  28. Runs andConsistent Runs • A run is a total ordering of the events in H that is consistent with the local histories of the processors • Ex: h1,h2, … ,hn is a run • A run is consistent if the total order imposed in the run is an extension of the partial order induced by! • A single distributed computation may correspond to several consistent runs!

  29. Cuts • A cut C is a subset of the global history of H

  30. Cuts • A cut C is a subset of the global history of H • The frontier of C is the set of events

  31. Global states and cuts • The global state of a distributed computation is an tuple of n local states •  = (1 ... n ) • To each cut (1c1, ... ncn) corresponds a global state

  32. Consistent cuts and consistent global states • A cut is consistent if • A consistent global state is one corresponding to a consistent cut

  33. What sees

  34. What sees • Not a consistent global state: the cut contains the event corresponding to the receipt of the last message by p3 but not the corresponding send event

  35. Our task • Develop a protocol by which a processor can build a consistent global state • Informally, we want to be able to take a snapshot of the computation • Not obvious in an asynchronous system...

  36. Our approach • Develop a simple synchronous protocol • Refine protocol as we relax assumptions • Record: • processor states • channel states • Assumptions: • FIFO channels • Each m timestamped with with T(send(m))

  37. Snapshot I • 1. p0 selects tss • 2.p0 sends “take a snapshot at tss” to all processes • 3. when clock of pi reads tssthen • records its local state i • starts recording messages received on each of incoming channels • stops recording a channel when it receives first message with timestamp greater than or equal to tss

  38. Snapshot I • 1. p0 selects tss • 2.p0 sends “take a snapshot at tss” to all processes • 3. when clock of pi reads tssthen • records its local state i • sends an empty message along its outgoing channels • starts recording messages received on each of incoming channels • stops recording a channel when it receives first message with timestamp greater than or equal to tss

  39. Correctness Theorem:Snapshot I produces a consistent cut Proof: Need to prove < Definition > < 0 and 1> < 5 and 3> < Assumption > < Property of real time> < Definition > < Assumption > < 2 and 4>

  40. Clock Condition < Property of real time> Can the Clock Condition be implemented some other way?

  41. Lamport Clocks • Each process maintains a local variable LC • LC(e) = value of LC for event e

  42. Increment Rules Timestamp m with

  43. Space-Time Diagrams and Logical Clocks 3 2 6 7 8 7 1 8 9 4 5 6

  44. A subtle problem • whenLC=tdo S • doesn’t make sense for Lamport clocks! • there is no guarantee that LC will ever be t • S is anyway executed after • Fixes: • If e is internal/send and LC = t-2 • execute e and then S • If e = receive(m) Æ (TS(m) ¸ t) Æ (LC · t-1) • put message back in channel • re-enable e; set LC=t-1; execute S

  45. An obvious problem • No tss ! • Choose large enough that it cannot be reached by applying the update rules of logical clocks

  46. An obvious problem • No tss! • Choose large enough that it cannot be reached by applying the update rules of logical clocks • Doing so assumes • upper bound on message delivery time • upper bound relative process speeds • Better relax it

  47. Snapshot II • p0selects  • p0 sends “take a snapshot at tss” to all processes; it waits for all of them to reply and then sets its logical clock to  • when clock of pi reads  then pi • records its local state i • sends an empty message along its outgoing channels • starts recording messages received on each incoming channel • stops recording a channel when receives first message with timestamp greater than or equal to 

  48. empty message: take a snapshot at monitors channels records local state Process does nothing for the protocol during this time! sends empty message: Relaxing synchrony Use empty message to announce snapshot!

  49. Snapshot III • Processor p0 sends itself “take a snapshot “ • when pireceives “take a snapshot” for the first time from pj: • records its local state i • sends “take a snapshot” along its outgoing channels • sets channel from pj to empty • starts recording messages received over each of its other incoming channels • when receives “take a snapshot” beyond the first time from pk: • pi stops recording channel from pk • when pi has received “take a snapshot” on all channels, it sends collected state to p0 and stops.

  50. Snapshots: a perspective • The global state s saved by the snapshot protocol is a consistent global state

More Related