Explore the concepts of logical clocks and vector clocks for causal delivery in distributed systems, understand event ordering, and multicast message timestamping in this comprehensive PhD study on computer networks.
Computer Networks: Causal Delivery. PhD. Saúl Pomares Hernández
Clocks, events and process states
• A distributed system is defined as a collection P of N processes pi, i = 1, 2, … N
• Each process pi has a state si consisting of its variables (which it transforms as it executes)
• Processes communicate only by messages (via a network)
• Actions of processes: Send, Receive, change own state
• Event: the occurrence of a single action that a process carries out as it executes, e.g. Send, Receive, change state
• Events at a single process pi can be placed in a total ordering denoted by the relation →i between the events, i.e. e →i e’ if and only if e occurs before e’ at pi
• A history of process pi is a series of events ordered by →i: history(pi) = hi = <ei0, ei1, ei2, …>
Logical time and logical clocks: Happened-before (Lamport 1978)
• Instead of synchronizing clocks, event ordering can be used
• If two events occurred at the same process pi (i = 1, 2, … N), then they occurred in the order observed by pi, that is, →i
• When a message m is sent between two processes, send(m) happened before receive(m): send(m) → receive(m)
• The happened-before relation → is transitive
• Not all events are related by →. Consider a and e (different processes and no chain of messages to relate them): they are not related by →; they are said to be concurrent, written a || e
• In the figure: a → b (at p1), c → d (at p2), b → c because of m1, and d → f because of m2
Lamport’s logical clocks
• A logical clock is a monotonically increasing software counter. It need not relate to a physical clock.
• Each process pi has a logical clock Li, which can be used to apply logical timestamps to events
• LC1: Li is incremented by 1 before each event at process pi
• LC2: (a) when process pi sends message m, it piggybacks t = Li
• LC2: (b) when pj receives (m,t) it sets Lj := max(Lj, t) and applies LC1 before timestamping the event receive(m)
• e → e’ implies L(e) < L(e’). The converse is not true, that is, L(e) < L(e’) does not imply e → e’
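Rules LC1 and LC2 can be sketched in a few lines of Python. This is only an illustration: the class name `LamportClock`, its method names, and the initial value 0 are assumptions, not part of the slides.

```python
class LamportClock:
    """Logical clock following rules LC1 and LC2 (illustrative sketch)."""

    def __init__(self):
        self.time = 0  # assumed initial value; the slide does not state one

    def tick(self):
        """LC1: increment by 1 before each event and return its timestamp."""
        self.time += 1
        return self.time

    def send(self):
        """LC2(a): sending is an event; the returned t is piggybacked on m."""
        return self.tick()

    def receive(self, t):
        """LC2(b): take max(Lj, t), then apply LC1 to the receive event."""
        self.time = max(self.time, t)
        return self.tick()


p1, p2, p3 = LamportClock(), LamportClock(), LamportClock()
a = p1.tick()       # local event a at p1
b = p1.send()       # p1 sends m1
c = p2.receive(b)   # p2 receives m1
e = p3.tick()       # local event e at p3, unrelated to the others
print(a, b, c, e)   # 1 2 3 1
```

Here L(e) < L(c) although e and c are not related by →, which illustrates why the converse of the clock condition fails.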
Vector clocks
• Vector clock Vi at process pi is an array of N integers
• VC1: initially Vi[j] = 0 for i, j = 1, 2, … N
• VC2: before pi timestamps an event it sets Vi[i] := Vi[i] + 1
• VC3: pi piggybacks t = Vi on every message it sends
• VC4: when pi receives (m,t) it sets Vi[j] := max(Vi[j], t[j]) for j = 1, 2, … N (then, before the next event, it adds 1 to its own element using VC2)
• Note that e → e’ implies V(e) < V(e’). The converse is also true
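A minimal Python sketch of VC1 to VC4 follows. The class and function names are assumptions made for illustration, indices are 0-based, and V < V’ is taken in the usual sense (every component less than or equal, and the vectors differ), which the slide does not spell out.

```python
class VectorClock:
    """Vector clock of process i among n processes (illustrative sketch)."""

    def __init__(self, n, i):
        self.i = i
        self.v = [0] * n                 # VC1: all entries start at 0

    def tick(self):
        self.v[self.i] += 1              # VC2: increment own entry
        return tuple(self.v)             # timestamp of the event

    def send(self):
        return self.tick()               # VC3: timestamp piggybacked on m

    def receive(self, t):
        # VC4: component-wise maximum, then VC2 for the receive event
        self.v = [max(x, y) for x, y in zip(self.v, t)]
        return self.tick()


def happened_before(u, w):
    """V(e) < V(e'): u <= w component-wise and u != w."""
    return all(x <= y for x, y in zip(u, w)) and u != w


def concurrent(u, w):
    """Timestamps of distinct events that are not ordered either way."""
    return not happened_before(u, w) and not happened_before(w, u)


p1, p2, p3 = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)
a = p1.tick()       # (1, 0, 0)
b = p1.send()       # (2, 0, 0)
c = p2.receive(b)   # (2, 1, 0)
e = p3.tick()       # (0, 0, 1)
print(happened_before(a, c), concurrent(e, c))  # True True
```

Unlike the scalar clock, the comparison detects that e and c are concurrent.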
Causal Order delivery [BIR91]
[Figure: processes A, B and C over time, each with initial vector (0,0,0). m1 carries timestamp (1,0,0) and m2 carries timestamp (1,1,0). The message m2 arrives but cannot be delivered; only after the delivery of m1 can m2 also be delivered.]
• Causal Ordering: If send(m) → send(m’), then ∀k ∈ g: deliveryk(m) → deliveryk(m’)
• Delivery condition at pj for a message m’ sent by pi: if (VT(m’)[i] = VT(pj)[i] + 1 and VT(m’)[k] ≤ VT(pj)[k] (∀k ≠ i, k = 1…n)) then delivery(m’)
Causal Order delivery
• The general algorithm of vector time for causal delivery is as follows:
• Initially, VT(pi)[j] = 0 ∀j = 1…n.
• For each event send(m) at pi, VT(pi)[i] = VT(pi)[i] + 1.
• Each message multicast by process pi is timestamped with the updated value of VT(pi).
• For each event deliveryj(m’), pj modifies its vector time in the following manner: VT(pj)[k] = max(VT(m’)[k], VT(pj)[k]) ∀k = 1…n. When the delivery condition holds, this amounts to VT(pj)[i] = VT(pj)[i] + 1
• For each reception receive(m’) at pj, i ≠ j, m’ = (i, VT(m’), message), to enforce a causal delivery of m’:
• i. Delivery condition: if not (VT(m’)[i] = VT(pj)[i] + 1 and VT(m’)[k] ≤ VT(pj)[k] (∀k ≠ i, k = 1…n)) then wait
• ii. else delivery(m’)
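The algorithm can be sketched as follows in Python. The class `CausalProcess`, the hold-back list `pending`, and the 0-based process ids are assumptions made for illustration; the run at the end mirrors the A, B, C situation in which m2 reaches C before m1.

```python
class CausalProcess:
    """Process pj that delivers multicast messages in causal order (sketch)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.vt = [0] * n        # VT(pj), initially all zeros
        self.pending = []        # received messages that must still wait
        self.delivered = []      # payloads in delivery order

    def multicast(self, payload):
        self.vt[self.pid] += 1   # send event: increment own entry
        return (self.pid, tuple(self.vt), payload)   # m = (i, VT(m), message)

    def _deliverable(self, msg):
        i, vt_m, _ = msg
        return (vt_m[i] == self.vt[i] + 1 and
                all(vt_m[k] <= self.vt[k]
                    for k in range(len(self.vt)) if k != i))

    def receive(self, msg):
        """Hold the message back, then deliver everything that is ready."""
        self.pending.append(msg)
        progress = True
        while progress:
            progress = False
            for m in list(self.pending):
                if self._deliverable(m):
                    self.pending.remove(m)
                    self.vt = [max(x, y) for x, y in zip(self.vt, m[1])]
                    self.delivered.append(m[2])
                    progress = True


A, B, C = CausalProcess(0, 3), CausalProcess(1, 3), CausalProcess(2, 3)
m1 = A.multicast("m1")            # timestamp (1, 0, 0)
B.receive(m1)
m2 = B.multicast("m2")            # timestamp (1, 1, 0)
C.receive(m2)                     # arrives first, must wait for m1
held = [m[2] for m in C.pending]
C.receive(m1)                     # m1 is delivered, which releases m2
print(held, C.delivered)          # ['m2'] ['m1', 'm2']
```

Re-scanning the hold-back list after every delivery is the simplest choice; a real implementation would probably index waiting messages by sender.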
Causal Order delivery, Multigroup Case
• If sendi(m,g) → sendj(m’,g’), then ∀k ∈ g ∩ g’: deliveryk(m) → deliveryk(m’)
• Groups: g1 = {p1, p2}, g2 = {p2, p3}, g3 = {p1, p3}; p2 ∈ g1 ∩ g2
[Figure: p1, p2 and p3 exchange m1, m2 and m3 over groups g1, g2 and g3 along time t, with per-group vector timestamps such as (1,0,x), (x,0,1), (1,x,0), (x,0,0) and (0,x,0). Delivery of event m3 must be delayed.]
• Delivery condition for a message m sent by pi to group ga and received at pj:
• VTa(m)[i] = VTa(pj)[i] + 1
• ∀k : (pk ∈ ga ∧ k ≠ i): VTa(m)[k] ≤ VTa(pj)[k], and
• ∀g : (g ∈ Gj): VTg(m) ≤ VTg(pj)
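The multigroup delivery condition can be checked with a short Python sketch. Several points are assumptions rather than statements from the slides: every process keeps one vector per group and every message carries all of them, the third clause is applied to the groups of the receiver other than the destination group (for the destination group the first clause already requires the sender entry to be larger), and the message scenario is a guess that happens to reproduce timestamps such as (1,0,x) and (1,x,0).

```python
GROUPS = {"g1": (1, 2), "g2": (2, 3), "g3": (1, 3)}   # membership from the slide


def new_clock():
    """One vector per group, keyed by member id, all zeros."""
    return {g: {p: 0 for p in members} for g, members in GROUPS.items()}


def stamp(clock, sender, group):
    """Increment the sender entry for the destination group, copy all vectors."""
    clock[group][sender] += 1
    return {g: dict(v) for g, v in clock.items()}


def deliverable(vt_m, sender, group, clock, receiver):
    local = clock[group]
    # Clause 1: m is the next message from the sender in the destination group
    if vt_m[group][sender] != local[sender] + 1:
        return False
    # Clause 2: nothing from other members of that group is missing
    if any(vt_m[group][k] > local[k] for k in GROUPS[group] if k != sender):
        return False
    # Clause 3: nothing is missing in the other groups the receiver belongs to
    for g, members in GROUPS.items():
        if g != group and receiver in members:
            if any(vt_m[g][k] > clock[g][k] for k in members):
                return False
    return True


def deliver(vt_m, clock):
    """Merge every carried vector into the local clock by maximum."""
    for g in clock:
        for k in clock[g]:
            clock[g][k] = max(clock[g][k], vt_m[g][k])


c1, c2, c3 = new_clock(), new_clock(), new_clock()
m1 = stamp(c1, 1, "g1")                            # p1 multicasts m1 to g1
m2 = stamp(c1, 1, "g3")                            # p1 multicasts m2 to g3
ok_m2 = deliverable(m2, 1, "g3", c3, 3)            # p3 can deliver m2
deliver(m2, c3)
m3 = stamp(c3, 3, "g2")                            # p3 multicasts m3 to g2
m3_before = deliverable(m3, 3, "g2", c2, 2)        # m1 not yet delivered at p2
deliver(m1, c2)
m3_after = deliverable(m3, 3, "g2", c2, 2)
print(ok_m2, m3_before, m3_after)                  # True False True
```

In this run p2 must hold m3 back because the g1 vector carried by m3 shows one message from p1 that p2 has not delivered yet.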
Exercise
[Figure: p1, p2 and p3 connected by channels ch1, ch2 and ch3, exchanging events e1, e2, e3, e4 and e5; p2 ∈ ch1 ∩ ch2. Delivery of event e5 must be delayed.]
• ch1 = {?}, ch2 = {?}, ch3 = {?}
The Basic Principles (cont.)
The causal relation, denoted by →:
1. ⟨x, a⟩ → ⟨y, b⟩ if x = y ∧ a < b
2. ⟨x, a⟩ → ⟨y, b⟩ if ⟨x, a⟩ is the sending of an event and ⟨y, b⟩ is the delivery of that event
3. ⟨x, a⟩ → ⟨y, b⟩ if ∃⟨z, c⟩ | (⟨x, a⟩ → ⟨z, c⟩ ∧ ⟨z, c⟩ → ⟨y, b⟩)
Immediate Dependency Relation
• The problem with causal ordering: the amount of control information emitted for large values of n = |G| is prohibitively high.
• Immediate Dependency Relation ↑: e ↑ e’ ⇔ [(e → e’) ∧ ∀e” ∈ E, ¬(e → e” → e’)]
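Read literally, the definition keeps only those causal pairs that have no event strictly in between, which is the transitive reduction of →. The Python sketch below computes it for a small, hypothetical event set; the function names and the example edges are illustrative assumptions, not taken from the slides.

```python
from itertools import product


def transitive_closure(events, direct):
    """Happened-before relation generated by the given direct pairs."""
    hb = set(direct)
    # Warshall's algorithm: the intermediate event k is the outermost loop
    for k, i, j in product(events, repeat=3):
        if (i, k) in hb and (k, j) in hb:
            hb.add((i, j))
    return hb


def immediate_dependency(events, hb):
    """Pairs (e, e') with e -> e' and no e'' such that e -> e'' -> e'."""
    return {(e, e2) for (e, e2) in hb
            if not any((e, x) in hb and (x, e2) in hb for x in events)}


events = ["e1", "e2", "e3", "e4", "e6"]
direct = [("e1", "e2"), ("e1", "e3"), ("e3", "e4"),
          ("e2", "e6"), ("e4", "e6")]          # hypothetical example
hb = transitive_closure(events, direct)
idr = immediate_dependency(events, hb)
print(len(hb), len(idr))                       # 8 5
```

Only 5 of the 8 causal pairs are immediate dependencies here; pairs such as (e1, e6) are implied by chains and need not be tracked separately.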
Immediate Dependency Relation (cont.)
• Causal Intra-Channel Ordering: If send(e) → send(e’), then ∀k ∈ c: deliveryk(e) → deliveryk(e’)
• Proposition 1: If ∀e, e’ ∈ E, send(e) ↑ send(e’), then ∀k ∈ c: deliveryk(e) → deliveryk(e’)
Serial Events
[Figure: processes p1 … p5 over time t exchanging the serial events e1, e2, e3 and e4, with the corresponding IDR graph over e1, e2, e3 and e4.]
Immediate Dependency Relation ↑: e ↑ e’ ⇔ [(e → e’) ∧ ∀e” ∈ E, ¬(e → e” → e’)]
Concurrent Events
[Figure: processes p1 … p5 over time t exchanging events e1 … e6, with the corresponding IDR graph.]
Immediate dependency: (e2 || (e3 ↑ e4)) ↑ e6
Immediate Dependency Relation ↑: e ↑ e’ ⇔ [(e → e’) ∧ ∀e” ∈ E, ¬(e → e” → e’)]
[Figure: IDR graph over e1, e2, e3, e4, e5 and e6.]
• Immediate Dependency Relation ↑: e ↑ e’ ⇔ [(e → e’) ∧ ∀e” ∈ E, ¬(e → e” → e’)]
• Concurrent Relation ||: e || e’ ⇔ ¬(e → e’ ∨ e’ → e)
• Observation: (e’ ↑ e ∧ e” ↑ e) ∨ (e ↑ e’ ∧ e ↑ e”) ⇒ e’ || e”
Problem
Which events immediately precede m5?
[Figure: messages m1, m2, m3, m4 and m5 exchanged over time t.]
Immediate Dependency Relation ↑: e ↑ e’ ⇔ [(e → e’) ∧ ∀e” ∈ E, ¬(e → e” → e’)]
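For Multi-Channel Case
• Immediate Inter-Channel Dependency Relation ↑ on (event, channel) pairs: (e,c) ↑ (e’,c’) ⇔ [((e,c) → (e’,c’)) ∧ ∀(e”,c”) ∈ E, ¬((e,c) → (e”,c”) → (e’,c’) together with the condition relating c” to c and c” to c’)]
• Observation: If only one channel exists in the system, then the Immediate Inter-Channel Dependency Relation coincides with the Immediate Dependency Relation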
• Causal Inter-Channel Ordering: If send(e,c) → send(e’,c’), then ∀k ∈ c ∩ c’: deliveryk(e) → deliveryk(e’)
• Proposition 2: If ∀e, e’ ∈ E, send(e,c) ↑ send(e’,c’), then ∀k ∈ c ∩ c’: deliveryk(e) → deliveryk(e’)
Inter-channel Dependency
[Figure: p1, p2 and p3 connected by channels ch1, ch2 and ch3 over time t, with events e1, e2 and e3; p2 ∈ ch1 ∩ ch2. Delivery of event e3 must be delayed.]
• Events with IICDR to e3: ((e1,ch1), (e2,ch3)) ↑ (e3,ch2)
• Proposition 2: If ∀e, e’ ∈ E, send(e,c) ↑ send(e’,c’), then ∀k ∈ c ∩ c’: deliveryk(e) → deliveryk(e’)
Exercise
[Figure: p1, p2 and p3 connected by channels ch1, ch2 and ch3, exchanging messages m1, m2, m3, m4 and m5; p2 ∈ ch1 ∩ ch2.]
• Immediate Inter-group Dependency Relation ↑: defined as the Immediate Inter-Channel Dependency Relation, over pairs (m,c) with m ∈ M
• Which events have an immediate inter-group dependency relation with m5?
• (((m2,ch1) || (m3,ch1)) ↑ (m4,ch3)) ↑ (m5,ch2)
Implementation
[Figure: The MCP General Structure. Class diagram showing JSDT, Session, Consistent Session, Channel, Causal Ordering Channel and Participant, linked by one-to-many (1 to *) and many-to-many (* to *) associations.]
Membership: Join Procedure
Actors: participant pk, the membership service, and the rest of the participants
• req_join(ch, pk): the only non-causal message; pk then waits for serv_join
• serv_join(ch, pk, np): memory is reserved for the new participant pk
• init_join(pk, pi, VT(pi)[i]): pk waits for np-1 init_join messages and updates its VT
• join(ch, pk): only after the reception of join is pk considered a member of channel ch
Membership: Leave Procedure
Actors: participant pk, the membership service, and the rest of the participants
• req_leave(ch, pk): pk then waits for serv_leave
• serv_leave(ch, pk): notification that pk is leaving
• leave(ch, pk): only after the reception of leave is all information concerning pk erased
Implementation (cont.)
The MCP Architecture, layers from top to bottom:
• Cooperative Distributed Engineering System
• Multi-Group Causal Protocol
• Java Shared Data Toolkit
• Light Reliable Multicast Protocol (channels 1, 2, …, g)
• Network
Partial Ordering at Interval Level
• Intervals: Each interval A ∈ I is a contiguous set of integers A = [a-, a+] = { x ∈ N : a- ≤ x ≤ a+ }, where a- and a+ denote the left and right endpoints of A.
• Each interval A is related to a participant p = Part(A) by the mapping Part: I → P
• Its elements are associated with events involving p by a one-to-one monotonic mapping Interv: A → Ep, i.e. for any x, y ∈ A, we have x < y ⇒ Interv(x) → Interv(y).
• We define the sent messages M(A) of an interval A by M(A) = { m ∈ M : send(m) ∈ Interv(A) }.
Partial Ordering at Interval Level (cont.)
• Definition. The relation “→I” on the set of intervals of a system is the smallest relation satisfying the following two conditions:
• 1. A →I B if ∀(m,m’) ∈ M(A) × M(B): m → m’
• 2. A →I B if ∃C | (A →I C ∧ C →I B)
• Proposition 1. The relation “→I” holds if the following two conditions are satisfied:
• 1. A →I B if a+ →M’ b-
• 2. A →I B if ∃C | (A →I C ∧ C →I B)
Homework
• What does “smallest relation” mean in the previous causal definition?
• Prove Proposition 1.
Definition. Two intervals A, B are said to be simultaneous “|||” if the following condition is satisfied: A ||| B ( a- || b- a+ || b+ : a- b+ b- a+ )
• The simultaneous relation ||| is the complement of the causal interval relation
• a b denotes a → b ∨ a || b, meaning that a could occur before b, or that a is parallel to b