Coverage

Explore the nature and significance of causality in distributed systems, including modeling distributed events, logical clocks, and practical applications. Learn how causality helps ensure system reliability, consistency, and optimization.

Presentation Transcript


  1. Coverage • Nature of Causality • Causality: Why is it Important/Useful? • Causality in Life vs. Causality in Distributed Systems • Modeling Distributed Events - Defining Causality • Logical Clocks • General Implementation of Logical Clocks • Scalar Logical Time • Demo: Scalar Logical Time with Asynchronous Unicast Communication between Multiple Processes • Conclusions • Questions / Reference

  2. Nature of Causality • Consider a distributed computation which is performed by a set of processes: • The objective is to have the processes work towards and achieve a common goal • Processes do not share global memory • Communication occurs through message passing only

  3. Process Actions • Actions are modeled as three types of events: • Internal Event: affects only the process that executes it, • Send Event: a process passes messages to other processes, • Receive Event: a process gets messages from other processes. [Figure: space-time diagram of events a through r on processes P1, P2, and P3]
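The three event types above could be represented as a small data type. This is only a sketch: the names `EventKind`, `Event`, and `message_id` are hypothetical and do not come from the slides, and the three-event trace is made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventKind(Enum):
    INTERNAL = "internal"   # affects only the executing process
    SEND = "send"           # the process passes a message to another process
    RECEIVE = "receive"     # the process gets a message from another process


@dataclass(frozen=True)
class Event:
    name: str                          # label such as "a" or "b"
    process: str                       # process that executes the event
    kind: EventKind
    message_id: Optional[int] = None   # hypothetical: links a SEND to its RECEIVE


# Hypothetical trace: P1 computes, then sends message 1, which P2 receives.
trace = [
    Event("a", "P1", EventKind.INTERNAL),
    Event("b", "P1", EventKind.SEND, message_id=1),
    Event("c", "P2", EventKind.RECEIVE, message_id=1),
]
```

Pairing a send with its receive through a shared identifier is one possible design; it makes the message edges of the space-time diagram easy to recover later.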

  4. Causal Precedence • Ordering of events for a single process is simple: they are ordered by their occurrence. [Figure: events a, b, c, d in order on a single process P] • Send and Receive events signify the flow of information between processes, and establish causal precedence between events at the sender and receiver. [Figure: events a, b, c, d on processes P1 and P2, linked by messages]

  5. Distributed Events • The execution of this distributed computation results in the generation of a set of distributed events. • The causal precedence induced by the send and receive events establishes a partial order of these distributed events: • The precedence relation in our case is “Happened Before”, e.g. for two events a and b, a → b means “a happened before b”. [Figure: event a on P1 precedes event b on P2, a → b]

  6. Causality: Why is it Important/Useful? • This causality among events (induced by the “happened before” precedence relation) is important for several reasons: • Helps us solve problems in distributed computing: • Can ensure liveness and fairness in mutual exclusion algorithms, • Helps maintain consistency in replicated databases, • Facilitates the design of deadlock detection algorithms in distributed systems.

  7. Importance of Causality (Continued) • Debugging of distributed systems: allows the resumption of execution. • System failure recovery: allows checkpoints to be built which allow a system to be restarted from a point other than the beginning. • Helps a process to measure the progress of other processes in the system: • Allows processes to discard obsolete information, • Detect termination of other processes

  8. Importance of Causality • Allows distributed systems to optimize the concurrency of all processes involved: • Knowing the number of causally dependent events in a distributed system allows one to measure the concurrency of a system: All events that are not causally related can be executed concurrently.

  9. Causality: Life vs. Distributed Systems • We use causality in our lives to determine the feasibility of daily, weekly, and yearly plans, • We use global time and (loosely) synchronized clocks (wristwatches, wall clocks, PC clocks, etc.)

  10. Causality (Continued) • However, (usually) events in real life do not occur at the same rate as those in a distributed system: • Distributed systems’ event occurrence rates are obviously much higher, • Event execution times are obviously much smaller. • Also, distributed systems do not have a “global” clock that they can refer to, • There is hope though! We can use “Logical Clocks” to establish order.

  11. Modeling Distributed Events: Defining Causality and Order • A distributed program is modeled as a set of asynchronous processes p1, p2, …, pn, which communicate through a network using message passing only. Process execution and message transfer are asynchronous. [Figure: processes P1, P2, P3, and P4 connected by a network]

  12. Modeling Distributed Events • Notation: given two events e1 and e2, • e1 → e2 : e2 is dependent on e1 • if neither e1 → e2 nor e2 → e1 holds, then e1 and e2 are concurrent: e1 || e2 [Figure: two space-time diagrams of P1 and P2, one showing e1 → e2 and one showing e1 || e2]
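As a sketch of how the → relation and the concurrency test could be computed for a small recorded execution: the function names and the three-process trace below are assumptions for illustration, not part of the slides. The relation is built from per-process order plus send/receive pairs, then closed under transitivity.

```python
from itertools import product


def happened_before(process_events, messages):
    """Return the set of pairs (x, y) such that x -> y.

    process_events: dict mapping a process name to its events in local order.
    messages: list of (send_event, receive_event) pairs.
    """
    events = [e for evs in process_events.values() for e in evs]
    hb = set()
    # Events on the same process are ordered by their occurrence.
    for evs in process_events.values():
        for i in range(len(evs)):
            for j in range(i + 1, len(evs)):
                hb.add((evs[i], evs[j]))
    # A send event precedes its matching receive event.
    hb.update(messages)
    # Transitive closure (Warshall-style: k is the outermost loop variable).
    for k, i, j in product(events, repeat=3):
        if (i, k) in hb and (k, j) in hb:
            hb.add((i, j))
    return hb


def concurrent(hb, x, y):
    """x || y: neither event happened before the other."""
    return x != y and (x, y) not in hb and (y, x) not in hb


# Hypothetical execution: b on P1 is a send, c on P2 is its receive.
hb = happened_before(
    {"P1": ["a", "b"], "P2": ["c", "d"], "P3": ["e"]},
    messages=[("b", "c")],
)
```

Here `("a", "d")` ends up in `hb` only through transitivity (a → b → c → d), while `e` on P3 exchanges no messages and is therefore concurrent with every other event.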

  13. Logical Clocks • In a system of logical clocks, every process has a logical clock that is advanced using a set of rules. [Figure: processes P1, P2, and P3, each with its own logical clock]

  14. Logical Clocks - Timestamps • Every event is assigned a timestamp (which the processes use to infer causality between events). [Figure: P1 sends data to P2]

  15. Logical Clocks - Timestamps • The timestamps obey the monotonicity property, i.e. if an event a causally affects an event b, then the timestamp of a is smaller than the timestamp of b. [Figure: event a on P1 causally affects event b on P2]

  16. Formal Definition of Logical Clocks • The definition of a system of logical clocks: • We have a logical clock C, which is a function that maps events to timestamps, i.e. for an event e, C(e) is its timestamp. [Figure: event e on P1 sends data to P2, carrying the timestamp C(e)]

  17. Formal Definition of Logical Clocks • Call the set of all events in a distributed system H. Applying the function C to every event in H generates a set of timestamps T: ∀ e ∈ H, C(e) ∈ T • Example (events a, b, c, d on processes P1 and P2): H = { a, b, c, d }, T = { C(a), C(b), C(c), C(d) }

  18. Formal Definition of Logical Clocks • We define the relation for timestamps, “<”, to be our precedence relation: “happened before”. • Elements in the set T are partially ordered by this precedence relation, i.e.: The timestamps for each event in the distributed system are partially ordered by their time of occurrence. More formally, e1 → e2 ⇒ C(e1) < C(e2)

  19. Formal Definition of Logical Clocks • What we’ve said so far is, “If e2 depends on e1, then e1 happened before e2, and the timestamp of e1 is smaller than the timestamp of e2.” • This enforces monotonicity for timestamps of events in the distributed system, and is sometimes called the clock consistency condition.
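The clock consistency condition can be checked mechanically for a recorded execution. The following is a minimal sketch, assuming timestamps are stored in a dictionary and the → relation is given as a set of pairs; the function name and the sample data are hypothetical.

```python
def satisfies_clock_consistency(timestamps, happened_before):
    """True if C(x) < C(y) for every pair with x -> y."""
    return all(timestamps[x] < timestamps[y] for x, y in happened_before)


# Hypothetical relation: a -> b -> c (and therefore a -> c).
happened_before = {("a", "b"), ("b", "c"), ("a", "c")}

consistent = {"a": 1, "b": 2, "c": 3}
inconsistent = {"a": 1, "b": 2, "c": 2}   # b -> c, but C(b) is not smaller than C(c)
```

Note that the check runs in one direction only: it says nothing about pairs of events that are not related by →.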

  20. General Implementation of Logical Clocks • We need to address two issues: • The data structure used to represent the logical clock, and • The design of a protocol which dictates how the logical clock data structure is updated.

  21. Logical Clock Implementation: Clock Structure • The structure of a logical clock should allow a process to keep track of its own progress, and the value of the logical clock. There are three well-known structures: • Scalar: a single integer, • Vector: an n-element vector (n is the number of processes in the distributed system), • Matrix: an n×n matrix

  22. Logical Clock Implementation: Clock Structure • Vector: Each process keeps an n-element vector. For Process 1 with n = 3 the vector is [C1, C2, C3], where C1 is Process 1’s Logical Time, C2 is Process 1’s view of Process 2’s Logical Time, and C3 is Process 1’s view of Process 3’s Logical Time. • Matrix: Each process keeps an n-by-n matrix. For Process 1 with n = 3 the rows are [C1, C2, C3], [C1´, C2´, C3´], and [C1´´, C2´´, C3´´]. The first row is Process 1’s Logical Time and its view of Process 2’s and Process 3’s logical time; the last row is Process 1’s view of Process 3’s view of everyone’s Logical Time. ...
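The three structures can be written down directly. This sketch assumes n = 3 processes and zero as the initial value of every entry; the variable names are illustrative only.

```python
n = 3  # number of processes in the (hypothetical) system

# Scalar: a single integer per process.
scalar_clock = 0

# Vector: entry j is this process's view of process j's logical time.
vector_clock = [0] * n

# Matrix: row k is this process's view of process k's vector clock.
# Each row is built separately so that the rows do not share storage.
matrix_clock = [[0] * n for _ in range(n)]

# Storage per process grows as 1, n, and n * n integers respectively.
sizes = (1, len(vector_clock), sum(len(row) for row in matrix_clock))
```

The growth from 1 to n to n² integers per process is the storage cost paid for the additional information each structure carries.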

  23. Logical Clock Implementation: Clock Update Protocol • The goal of the update protocol is to ensure that the logical clock is managed consistently; consequently, we’ll use two general rules: • R1: Governs the update of the local clock when an event occurs (internal, send, receive), • R2: Governs the update of the global logical clock (determines how to handle timestamps of messages received). • All logical clock systems use some form of these two rules, even if their implementations differ; clock monotonicity (consistency) is preserved due to these rules.

  24. Scalar Logical Time • Scalar implementation – Lamport, 1978 • Again, the goal is to have some mechanism that enforces causality between some events, inducing a partial order of the events in a distributed system, • Scalar Logical Time can also be used to totally order all the events in a distributed system, • As with all logical time methods, we need to define both a structure and update methods.

  25. Scalar Logical Time: Structure • Local time and logical time are represented by a single integer, i.e.: • Each process pi uses an integer Ci to keep track of logical time. [Figure: processes P1, P2, and P3 with clocks C1, C2, and C3]

  26. Scalar Logical Time: Logical Clock Update Protocol • Next, we need to define the clock update rules: • For each process pi: • R1: Before executing an event, pi executes the following: Ci = Ci + d (d > 0), where d is a constant, typically the value 1. • R2: Each message contains the timestamp of the sending process. When pi receives a message with a timestamp Cmsg, it executes the following: Ci = max(Ci, Cmsg), then executes R1.
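Rules R1 and R2 translate almost directly into code. The class below is a minimal single-process sketch, not the author's implementation; the class and method names (`ScalarClock`, `tick`, `send`, `receive`) are assumptions, and the initial clock value of 0 is taken from the example slide.

```python
class ScalarClock:
    """Scalar logical clock for one process."""

    def __init__(self, d=1):
        if d <= 0:
            raise ValueError("d must be greater than 0")
        self.d = d
        self.time = 0   # assumed initial value

    def tick(self):
        """R1: advance the clock before executing any event."""
        self.time += self.d
        return self.time

    def send(self):
        """A send is an event: apply R1 and return the timestamp to attach."""
        return self.tick()

    def receive(self, msg_timestamp):
        """R2: take the maximum of local and message time, then apply R1."""
        self.time = max(self.time, msg_timestamp)
        return self.tick()


p1, p2 = ScalarClock(), ScalarClock()
stamp = p1.send()                 # p1's clock becomes 1
after_receive = p2.receive(stamp)  # p2: max(0, 1) = 1, then R1 gives 2
```

Because the receiver always ends up strictly above the message timestamp, the receive event is stamped later than the send event, which is exactly what the clock consistency condition requires.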

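  27. Scalar Logical Time: Update Protocol Example • Initially C1 = 0 and C2 = 0, with d = 1. • P1: C1 = 1 (R1), C1 = 2 (R1), C1 = 3 (R1), C1 = max(3, 6) (R2), C1 = 7 (R1) • P2: C2 = 1 (R1), C2 = 2 (R1), C2 = max(2, 1) (R2), C2 = 3 (R1), C2 = 4 (R1), C2 = 5 (R1), C2 = 6 (R1), C2 = 7 (R1) • The arguments to max show the two messages exchanged: P2 receives a message timestamped 1 from P1, and P1 receives a message timestamped 6 from P2.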
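The example can be replayed step by step. This sketch assumes that the two messages carry timestamps 1 (P1 to P2) and 6 (P2 to P1), as the arguments to max suggest; the interleaving of the two processes is one possible ordering, and the helper names `r1` and `r2` are illustrative.

```python
D = 1  # clock increment d


def r1(clock):
    """R1: advance the clock before an event."""
    return clock + D


def r2(clock, msg_timestamp):
    """R2: merge the message timestamp, then apply R1."""
    return r1(max(clock, msg_timestamp))


c1 = c2 = 0

c1 = r1(c1)        # C1 = 1, P1 sends a message
m1 = c1            # timestamp carried by that message
c2 = r1(c2)        # C2 = 1
c2 = r1(c2)        # C2 = 2
c2 = r2(c2, m1)    # C2 = max(2, 1) = 2, then R1 gives 3
c1 = r1(c1)        # C1 = 2
c2 = r1(c2)        # C2 = 4
c2 = r1(c2)        # C2 = 5
c2 = r1(c2)        # C2 = 6, P2 sends a message
m2 = c2            # timestamp carried by that message
c1 = r1(c1)        # C1 = 3
c1 = r2(c1, m2)    # C1 = max(3, 6) = 6, then R1 gives 7
c2 = r1(c2)        # C2 = 7
```

Both clocks finish at 7, although P1 executed only five events and P2 executed eight: receiving the message from P2 pulled P1's clock forward.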

  28. Scalar Logical Time: Properties • Properties of this implementation: • Maintains monotonicity and consistency properties, • Provides a total ordering of events in a distributed system.
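Scalar timestamps alone can tie, since two processes may stamp different events with the same value. The usual way to obtain a total order, following Lamport's 1978 construction, is to break ties with the process identifier. The sketch below assumes events are recorded as (timestamp, process id, label) tuples; the sample events are made up.

```python
# Each event is (timestamp, process_id, label); values are hypothetical.
events = [
    (2, 2, "c"),
    (1, 1, "a"),
    (1, 2, "b"),
    (2, 1, "d"),
]

# Order by timestamp first; equal timestamps are ordered by process id.
total_order = sorted(events, key=lambda e: (e[0], e[1]))
labels = [label for _, _, label in total_order]
```

Any fixed ordering of process identifiers works as the tie-breaker, as long as every process uses the same one.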

  29. Scalar Logical Time: Pros and Cons • Advantages • We get a total ordering of events in the system. All the benefits gained from knowing the causality of events in the system apply, • Small overhead: one integer per process. • Disadvantage • Clocks are not strongly consistent: a clock loses track of the timestamp of the event on which it depends, so C(e1) < C(e2) does not imply that e1 happened before e2. This is because we are using a single integer to store both the local and the logical time.
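A two-process execution with no messages is enough to show the lack of strong consistency. In this sketch (event labels and variable names are hypothetical), event a on P1 gets a smaller timestamp than event c on P2 even though the two events are causally unrelated.

```python
def r1(clock, d=1):
    """R1: advance the clock before an event."""
    return clock + d


c1 = c2 = 0

c1 = r1(c1)
ts_a = c1          # event a on P1, timestamp 1
c2 = r1(c2)
ts_b = c2          # event b on P2, timestamp 1
c2 = r1(c2)
ts_c = c2          # event c on P2, timestamp 2

# No messages were exchanged, so the only causal pair is b -> c.
happened_before = {("b", "c")}

smaller_timestamp = ts_a < ts_c                      # C(a) < C(c)
causally_related = ("a", "c") in happened_before     # but a and c are concurrent
```

Comparing timestamps therefore tells us that a did not happen after c, but it cannot distinguish "a happened before c" from "a and c are concurrent".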

  30. Demo - Simple Scalar Logical Time Application • Consists of several processes, communicating asynchronously via Unicast, • Only Send and Receive events are used; internal events can be disregarded since they only complicate the demo (imagine processes which perform no internal calculations), • Scalar logical time is used, • Written in Java.

  31. Demo: Event Sequence • Start one process (P1) • P1 uses a receive thread to process incoming messages asynchronously. • P1 will sleep for a random number of seconds • Upon waking, P1 will attempt to send a message to a random process, emulating asynchronous and random sending. P1 repeats this process. • Start process 2 (P2). The design of the application allows processes to know who is in the system at all times. • P2 performs the same steps as P1…
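The demo itself is written in Java and uses threads and real sleeps. As a rough, deterministic stand-in, the sketch below simulates the same event sequence in a single thread: a seeded random generator picks who sends to whom and when pending messages arrive. The function name `simulate`, its parameters, and the 50% delivery probability are all assumptions for illustration, and the set of processes is fixed at the start rather than joining over time.

```python
import random


def simulate(num_processes=3, steps=20, seed=0, d=1):
    """Return a log of (kind, process, clock, msg_timestamp) tuples."""
    rng = random.Random(seed)        # seeded so that runs are repeatable
    clocks = [0] * num_processes
    in_flight = []                   # (destination, timestamp) not yet delivered
    log = []

    def deliver(index):
        dest, msg_ts = in_flight.pop(index)
        clocks[dest] = max(clocks[dest], msg_ts) + d   # R2, then R1
        log.append(("recv", dest, clocks[dest], msg_ts))

    for _ in range(steps):
        if in_flight and rng.random() < 0.5:
            # Deliver a random pending message to emulate variable delay.
            deliver(rng.randrange(len(in_flight)))
        else:
            sender = rng.randrange(num_processes)
            dest = rng.choice([p for p in range(num_processes) if p != sender])
            clocks[sender] += d                        # R1 before the send event
            in_flight.append((dest, clocks[sender]))
            log.append(("send", sender, clocks[sender], clocks[sender]))

    while in_flight:                 # drain: every message is eventually received
        deliver(0)
    return log


log = simulate(num_processes=3, steps=30, seed=1)
```

Two properties can be read off the log regardless of the seed: each process's timestamps strictly increase, and every receive is stamped later than the message it received.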

  32. Conclusions • Logical time is used for establishing an ordering of events in a distributed system, • Logical time is useful for several important areas and problems in Computer Science, • Implementation of logical time in a distributed system can range from simple (scalar-based) to complex (matrix-based), and covers a wide range of applications, • Efficient implementations exist for vector-based and matrix-based clocks, and must be considered for any large-scale distributed system.
