Synchronization in Message Passing Systems

Synchronization in Message Passing Systems by Ye Su Advisor: Dr. Gurdip Singh Department of Computing and Information Sciences Kansas State University Ph. D. Thesis Defense

Outline
• Introduction to the region synchronization problem
• Brief review of the aspect-oriented methodology
• Correctness criteria for the region synchronization algorithm in distributed systems
• Algorithm to map the coarse-grained solution to a fine-grained solution in point-to-point networks
• Algorithm optimizations in point-to-point networks
• Algorithm to map the coarse-grained solution to a fine-grained solution in CAN-based systems
• Integrating our solutions into the SyncGen toolset
• Solving a complicated example using the proposed approach
• Summary and future work

1. Introduction
[Figure: processes P1, P2 and P3, their regions, and the synchronization between them]
• An aspect-oriented approach for developing synchronization code for shared-memory systems was proposed by Mizuno, Singh and Neilsen.
• Following their approach, we focus on techniques for deriving synchronization algorithms for message passing systems.

2. Overview of the aspect-oriented methodology

Overview of the aspect-oriented methodology
• Identifying synchronization regions

    void reader() {
      while (true) {
        ...other computation...
        /*** Region-Enter: Reader ***/
        ...read shared variables...
        /*** Region-Exit: Reader ***/
        ...other computation...
      }
    }

    void writer() {
      while (true) {
        ...other computation...
        /*** Region-Enter: Writer ***/
        ...write shared variables...
        /*** Region-Exit: Writer ***/
        ...other computation...
      }
    }

Overview of the aspect-oriented methodology
• Global invariant specification
• A global invariant I is a predicate defined over the in and out counters using arithmetic inequalities, arithmetic operators and boolean connectives.
[Figure: Reader region R_R with counters in[R] and out[R]; Writer region R_W with counters in[W] and out[W]]
I: ((in[R] = out[R]) ∨ (in[W] = out[W])) ∧ (in[W] - out[W] ≤ 1)

Overview of the aspect-oriented methodology
• Generation of the coarse-grained solution
• Two types of synchronization constructs, <S> and <await B → S>, are used in a coarse-grained solution.
Reader region:
  Entry: < await (in[W] = out[W]) → in[R]++ >
  Exit: < out[R]++ >
Writer region:
  Entry: < await ((in[R] = out[R]) ∧ (in[W] = out[W])) → in[W]++ >
  Exit: < out[W]++ >
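For readers who want to see the coarse-grained solution run, the following is a minimal shared-memory sketch in Python, assuming a monitor built on `threading.Condition`. The class name `ReadersWriters`, the dictionaries `n_in` and `n_out`, and the helper `_await` are illustrative names that do not come from the slides. Each `<await B → S>` is realized by waiting on the condition until B holds and then incrementing the counter while the lock is held.

```python
import threading


class ReadersWriters:
    """Monitor-style sketch of the coarse-grained readers/writers solution."""

    def __init__(self):
        self._cond = threading.Condition()
        # in/out counters per region, named as in the global invariant.
        self.n_in = {"R": 0, "W": 0}
        self.n_out = {"R": 0, "W": 0}

    def invariant(self):
        readers_idle = self.n_in["R"] == self.n_out["R"]
        writers_idle = self.n_in["W"] == self.n_out["W"]
        return (readers_idle or writers_idle) and (
            self.n_in["W"] - self.n_out["W"] <= 1
        )

    def _await(self, guard, counter, region):
        # <await B -> C++>: block until B holds, then increment atomically.
        with self._cond:
            while not guard():
                self._cond.wait()
            counter[region] += 1
            self._cond.notify_all()  # guards of waiting threads may now hold

    def enter_reader(self):
        self._await(lambda: self.n_in["W"] == self.n_out["W"], self.n_in, "R")

    def exit_reader(self):
        self._await(lambda: True, self.n_out, "R")

    def enter_writer(self):
        self._await(
            lambda: self.n_in["R"] == self.n_out["R"]
            and self.n_in["W"] == self.n_out["W"],
            self.n_in,
            "W",
        )

    def exit_writer(self):
        self._await(lambda: True, self.n_out, "W")
```

A writer that calls `enter_writer()` while a reader is inside its region blocks until the matching `exit_reader()` call, which is the behaviour the await guard prescribes.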

Overview of the aspect-oriented methodology
• Translation to synchronization code
• Fine-grained synchronization code in a target programming language or platform is obtained from the coarse-grained solution.
• Techniques to map coarse-grained solutions to multi-threaded programs based on monitors [And91] and Java synchronized blocks [Miz99] have been proposed.
• In this thesis, we focus on how to map a coarse-grained solution to fine-grained solutions in message passing systems.

Overview of the aspect-oriented methodology
• Weaving the code
• The final step in the methodology is to weave the synchronization code into the functional code.
• For example, in the active monitor approach, the monitor code and the code for the proxies are generated automatically. Furthermore, method calls are inserted at the appropriate points in the functional code.

3. Correctness Criteria
• In a distributed program, we define a synchronization statement Syn_i associated with the entry as well as the exit of each region.
• Syn_i has one of the following forms:
  • < C_i++ >
  • < await (B_i) → C_i++ >
  where C_i is in[x] or out[x] for some region R_x, and B_i is composed of local variables.

Correctness Criteria
• A simple centralized solution
[Figure: processes P1, P2, ..., Pn exchange request and reply messages with a central site Pc]

Correctness Criteria
• A distributed solution
[Figure: processes P1, P2, ..., Pn with monitors Mon1, Mon2, ..., Monn, request and reply messages, and the underlying message passing system]

Correctness Criteria
• A counterexample
[Figure: real-time diagram (time axis t) of P1, P2 and P3 with requests req_a, req_b and req_c and the resulting executions of Syn_a, Syn_b and Syn_c at each process; the outcome is marked inconsistent]

Correctness Criteria
• P': a virtual process that executes every process's synchronization statements in real time.
[Figure: P_i and P_j update the local counter variables (in[R]++, out[R]++, in[W]++), while P' updates the auxiliary shared variables (in[R]'++, out[R]'++, in[W]'++)]
I: ((in[R] = out[R]) ∨ (in[W] = out[W])) ∧ (in[W] - out[W] ≤ 1)
I': ((in[R]' = out[R]') ∨ (in[W]' = out[W]')) ∧ (in[W]' - out[W]' ≤ 1)
• Definition: An algorithm A solves the region synchronization problem for invariant I if I' is an invariant of A.

4. The algorithm for a point-to-point network
• Happened before (→): E_a → E_b if
  • E_a and E_b are events in the same process, and E_a occurred before E_b; or
  • E_a is the event of sending a message in a process P_i, and E_b is the event of receiving the same message in another process P_j; or
  • E_a → E_c and E_c → E_b for some event E_c.
• Total ordering of events:
  • E_a →tm E_b if and only if (TM_Ea < TM_Eb) or ((TM_Ea = TM_Eb) ∧ i < j), where TM_Ea is the timestamp of event E_a, and E_a and E_b occur at P_i and P_j, respectively.
  • E_a →t E_b if E_a occurs before E_b in real time.
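The timestamp order →tm can be illustrated with a short Python sketch. It assumes that timestamps are produced by Lamport-style logical clocks, which the slide does not state explicitly; the names `Event`, `LamportClock` and `precedes_tm` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    timestamp: int  # TM of the event
    pid: int        # index of the process where the event occurred


def precedes_tm(a: Event, b: Event) -> bool:
    """a ->tm b: smaller timestamp first, ties broken by process index."""
    return a.timestamp < b.timestamp or (
        a.timestamp == b.timestamp and a.pid < b.pid
    )


class LamportClock:
    """Assumed clock rule: tick on every event, catch up on receive."""

    def __init__(self, pid: int):
        self.pid = pid
        self.time = 0

    def local_event(self, name: str) -> Event:
        self.time += 1
        return Event(name, self.time, self.pid)

    def send(self, name: str) -> Event:
        # The returned event's timestamp travels with the message.
        return self.local_event(name)

    def receive(self, name: str, sent: Event) -> Event:
        self.time = max(self.time, sent.timestamp)
        return self.local_event(name)


p1, p2 = LamportClock(1), LamportClock(2)
a = p1.send("send m")        # timestamp 1 at P1
b = p2.local_event("local")  # timestamp 1 at P2, concurrent with a
c = p2.receive("recv m", a)  # timestamp 2 at P2
```

Here `a` and `b` carry the same timestamp, so the process index decides that `a` precedes `b` in →tm, while `a` precedes `c` because the receive event is stamped after the send.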

Definitions
• Seq is a sequence of statements, each of which increments a counter.
• Seq_1 || ... || Seq_n denotes the concurrent execution of the sequences.
• {P} Seq {Q} holds if, whenever the execution of Seq begins in a state satisfying P and terminates, the resulting state satisfies Q.
[Figure: execution of Seq from a state satisfying P to a state satisfying Q]

Definitions
• The weakest precondition, wp(Seq, Q), is a predicate defining the largest set of states such that the execution of Seq in any state satisfying wp(Seq, Q) results in a state satisfying Q.
• The strongest postcondition, sp(P, Seq), is a predicate defining the smallest set of states such that the execution of Seq with precondition P results in a state satisfying sp(P, Seq).
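Because every statement here is a counter increment, a sequence is deterministic and always terminates, so wp(Seq, Q) holds in a state exactly when Q holds after running Seq from that state. The sketch below uses this observation; the state representation (a dictionary keyed by counter name) and the function names `run`, `wp` and `invariant` are assumptions made for illustration, and the invariant is the readers/writers invariant used earlier in the talk.

```python
def run(seq, state):
    """Execute a sequence of counter increments on a copy of the state."""
    result = dict(state)
    for counter in seq:
        result[counter] += 1
    return result


def wp(seq, post):
    """Weakest precondition of a deterministic, terminating sequence."""
    return lambda state: post(run(seq, state))


def invariant(s):
    return (s["in[R]"] == s["out[R]"] or s["in[W]"] == s["out[W]"]) and (
        s["in[W]"] - s["out[W]"] <= 1
    )


idle = {"in[R]": 0, "out[R]": 0, "in[W]": 0, "out[W]": 0}
writing = {"in[R]": 0, "out[R]": 0, "in[W]": 1, "out[W]": 0}

reader_entry_ok_when_idle = wp(["in[R]"], invariant)(idle)
reader_entry_ok_while_writing = wp(["in[R]"], invariant)(writing)
reader_entry_ok_after_writer_exit = wp(["out[W]", "in[R]"], invariant)(writing)
```

The second value is false and the third is true: a reader may not enter while a writer is inside, but may enter once the writer's exit has been executed first.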

Definitions
• Let Seq_i and Seq_j be two sequences in P, and let I be a global invariant of P.
• If there exists P_i such that P_i ⇒ wp(Seq_i, I) is true but {P_i} Seq_i || Seq_j {I} does not hold, then we say Seq_j conflicts with Seq_i with respect to I, denoted Seq_j →cf Seq_i.
• If there exists P_i such that P_i ⇒ wp(Seq_i, I) is false but P_i ⇒ wp(Seq_j; Seq_i, I) is true, then we say Seq_j enables Seq_i with respect to I, denoted Seq_j →en Seq_i.
[Figure: two panels comparing the execution of Seq_i alone with the execution that also includes Seq_j, both starting from a state satisfying P_i, and whether I holds afterwards in each case]
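The two relations can be explored by brute force on the readers/writers invariant. The following Python sketch is only an approximation of the definitions under stated assumptions: it searches a bounded set of candidate states (counters up to 2, with out ≤ in), it treats a single state as the predicate P_i, and it relies on increments commuting so that the final state of Seq_i || Seq_j does not depend on the interleaving. All names (`candidate_states`, `conflicts`, `enables`) are hypothetical. Because the search does not check that a statement is actually executable in a state (for example, an exit without a prior entry), results for exit statements in the conflict check should be read with care.

```python
from itertools import product

COUNTERS = ("in[R]", "out[R]", "in[W]", "out[W]")


def run(seq, state):
    result = dict(state)
    for counter in seq:
        result[counter] += 1
    return result


def invariant(s):
    return (s["in[R]"] == s["out[R]"] or s["in[W]"] == s["out[W]"]) and (
        s["in[W]"] - s["out[W]"] <= 1
    )


def candidate_states(bound=2):
    """Assumed search space: small counter values with out <= in."""
    for values in product(range(bound + 1), repeat=len(COUNTERS)):
        s = dict(zip(COUNTERS, values))
        if s["out[R]"] <= s["in[R]"] and s["out[W]"] <= s["in[W]"]:
            yield s


def conflicts(seq_j, seq_i):
    """Some state lets seq_i preserve I alone but not together with seq_j."""
    return any(
        invariant(run(seq_i, s)) and not invariant(run(seq_i + seq_j, s))
        for s in candidate_states()
    )


def enables(seq_j, seq_i):
    """Some state makes seq_i break I alone but not when seq_j runs first."""
    return any(
        not invariant(run(seq_i, s)) and invariant(run(seq_j + seq_i, s))
        for s in candidate_states()
    )
```

For instance, `conflicts(["in[W]"], ["in[R]"])` holds because a writer entry can invalidate a reader entry, and `enables(["out[W]"], ["in[R]"])` holds because a writer exit can make a reader entry safe.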

Notations
• req_i and req_j are requests to execute Syn_i and Syn_j, respectively.
• ex_req_{i,x} denotes the event of P_x executing req_i, and ex_req_i denotes the event of the local execution of req_i.
[Figure: timelines of P', P_x and P_y with requests req_k and req_i and the events ex_req_{k,x}, ex_req_k, ex_req_{i,y} and ex_req_i]

The algorithm for a point-to-point network
• Let req_i be a request issued by P_k. We now define a set of rules that a process may follow to execute this request.
• R1: ∀ req_j, where j ≠ i, if (req_j →cf req_i) ∧ ¬(req_j →en req_i), then (ex_req_j →t ex_req_i) ⇒ (ex_req_{j,k} →t ex_req_i).
• R2: ∀ req_j, where j ≠ i, if (req_j →en req_i) ∧ ¬(req_j →cf req_i), then (ex_req_{j,k} →t ex_req_i) ⇒ (ex_req_j →t ex_req_i).
• R3: ∀ req_j, where j ≠ i, if (req_j →cf req_i) ∧ (req_j →en req_i), then (ex_req_{j,k} →t ex_req_i) ⇔ (ex_req_j →t ex_req_i).

The algorithm for a point-to-point network
• R1: ∀ req_j, where j ≠ i, if (req_j →cf req_i) ∧ ¬(req_j →en req_i), then (ex_req_j →t ex_req_i) ⇒ (ex_req_{j,k} →t ex_req_i).
[Figure: timelines of P', P_k and P_l showing the events ex_req_j, ex_req_{j,k} and ex_req_i]

The algorithm for a point-to-point network
• R2: ∀ req_j, where j ≠ i, if (req_j →en req_i) ∧ ¬(req_j →cf req_i), then (ex_req_{j,k} →t ex_req_i) ⇒ (ex_req_j →t ex_req_i).
[Figure: timelines of P', P_k and P_l showing the events ex_req_j, ex_req_{j,k} and ex_req_i]

The algorithm for a point-to-point network
• R3: ∀ req_j, where j ≠ i, if (req_j →cf req_i) ∧ (req_j →en req_i), then (ex_req_{j,k} →t ex_req_i) ⇔ (ex_req_j →t ex_req_i).
[Figure: timelines of P', P_k and P_l showing the events ex_req_j, ex_req_{j,k} and ex_req_i]

The algorithm for a point-to-point network
• Theorem: If all execution sequences of P satisfy R1, R2 and R3, then P is consistent.
• Given: I is true after ex_req_i at P_k, and P_k satisfies R1, R2 and R3.
• Question: Is I' true after ex_req_i at P'?
[Figure: ex_req_i at P_k, where I = true, next to ex_req_i at P', where it must be shown that I' = true]

The algorithm for a point-to-point network
• Suppose {P_i} Seq_i {Q_i} holds. If {P_i} Seq_i' {Q_i} also holds for every sequence Seq_i' obtained by reordering the statements in Seq_i, then we say that the triple {P_i} Seq_i {Q_i} is order-free.
• For example, the triple {x = 3 ∧ y = 1} x = 7; y = y + 1 {x = 7 ∧ y = 2} is order-free.
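The order-free property can be checked mechanically for small examples by trying every reordering. The sketch below does this in Python for the triple on the slide and for one counterexample; representing statements as functions that update a dictionary, and the names `run` and `is_order_free`, are choices made for this illustration only.

```python
from itertools import permutations


def run(statements, state):
    """Execute statements in the given order on a copy of the state."""
    result = dict(state)
    for statement in statements:
        statement(result)
    return result


def is_order_free(pre_states, statements, post):
    """True if every reordering maps every given pre-state into post."""
    return all(
        post(run(order, state))
        for state in pre_states
        for order in permutations(statements)
    )


def set_x(s):
    s["x"] = 7


def inc_y(s):
    s["y"] = s["y"] + 1


def y_from_x(s):
    s["y"] = s["x"] + 1


pre = [{"x": 3, "y": 1}]

# {x=3 and y=1} x=7; y=y+1 {x=7 and y=2}: the two statements are independent.
slide_example = is_order_free(
    pre, [set_x, inc_y], lambda s: s["x"] == 7 and s["y"] == 2
)

# {x=3 and y=1} x=7; y=x+1 {x=7 and y=8}: swapping the statements gives y=4.
dependent_example = is_order_free(
    pre, [set_x, y_from_x], lambda s: s["x"] == 7 and s["y"] == 8
)
```

The first triple is order-free, while the second is not, because `y = x + 1` reads the value that `x = 7` writes.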

The algorithm for a point-to-point network
• Lemma 1: If {P_k} Seq_k {P_i} St_i {I} holds and ∀ St_j in Seq_k where (St_j →cf St_i) ∧ ¬(St_j →en St_i), then {P_k} Seq_k' {P_i} St_i {I} still holds, where Seq_k' is obtained by removing St_j from Seq_k.
• To prove Lemma 1, we use the definitions of order-free, conflict and enable.
[Figure: Seq_k = St_k ... St_{j-1} St_j St_{j+1} ... followed by St_i, and Seq_k' = the same sequence without St_j; both lead from P_k through P_i to I]

The algorithm for a point-to-point network
• Lemma 2: If {P_k} Seq_k {P_i} St_i {I} holds and ∀ St_j where (St_j →en St_i) ∧ ¬(St_j →cf St_i), then {P_k} Seq_k' {P_i} St_i {I} still holds, where Seq_k' is obtained by adding St_j to Seq_k.
• To prove Lemma 2, we use the definitions of order-free, conflict and enable.
[Figure: Seq_k = St_k ... St_{j-1} St_{j+1} ... followed by St_i, and Seq_k' = the same sequence with St_j added; both lead from P_k through P_i to I]

The algorithm for a point-to-point network
• The proof of the theorem
• R1 + R2 + R3 + Lemma 1 + Lemma 2 ⇒ Theorem

The algorithm for a point-to-point network
• An algorithm satisfying R1, R2 and R3 may not be starvation-free.
[Figure: timeline of P_k with requests req_l, req_i and req_j, the events ex_req_{k,l} and ex_req_{k,j}, and req_i waiting while B_i = false]
• R4: ∀ req_j, where j ≠ i, if (req_j →cf req_i) ∧ ¬(req_j →en req_i), then (req_j →tm req_i) ⇒ (ex_req_j →t ex_req_i).

The algorithm for a point-to-point network
• The general idea of the algorithm
• If a process wants to execute a conflicting statement, it sends a request to all processes and waits for the ack messages.
• If a process wants to execute an enabling statement, it sends out a request.
• If a process receives a request from another process, it executes the request.

The algorithm for a point-to-point network
• The algorithm (part 1): (req_j →cf req_i) ∧ ¬(req_j →en req_i), and we assume that req_j →tm req_i.
[Figure: real-time diagram of P_x and P_y handling the conflicting requests req_j and req_i, with the events ex_req_{j,x}, ack, ex_req_j, ex_req_{i,y}, ack and ex_req_i]
• This part implements R1 and R4.

The algorithm for a point-to-point network
• The algorithm (part 2): (req_j →en req_i) ∧ ¬(req_j →cf req_i)
[Figure: real-time diagram of P_x and P_y in which req_j enables req_i: req_j is executed locally (ex_req_j), sent to the other process and executed there, and then req_i is executed (ex_req_i)]
• This part implements R2.

The algorithm for a point-to-point network
• The algorithm (part 3): (req_j →cf req_i) ∧ (req_j →en req_i)
• req_j can be handled as a conflicting request, and req_j will not be sent out as an enabling request after the execution of req_j.
[Figure: real-time diagram of P_x and P_y in which req_j both conflicts with and enables req_i, with the events ex_req_j, the remote execution of req_j, and ex_req_i]
• This part implements R3.

The algorithm for a point-to-point network
• Our algorithm satisfies the rules R1-R4, so it is consistent.
• The message complexity is less than 3 × N, where N is the number of processes.
• Message passing takes place only between those processes that need to synchronize. For example, in the readers/writers problem, readers send requests to enter the Reader region only to the writers instead of to all the processes.

The algorithm for a point-to-point network
• Fault tolerance
• We consider only node failures (process failures) and no link failures. To keep things simple, we also assume that a node does not crash while sending a message.
• The status of a process (whether it is in a region or not) can be identified by its last request. Each process P_k has a new variable, LAST_{k,j}, to record the last request received from each process P_j.

The algorithm for a point-to-point network
• Node failure
• P_y detects that node P_z has crashed.
• P_y checks LAST_{y,z}. If this last request, req_i, is a request to enter some region, then fakeREQ is the request to exit the same region. Otherwise, fakeREQ is empty.
• P_y sends SomebodyDied(P_z, fakeREQ) to inform P_x and P_y itself that P_z has died.
• P_x and P_y treat fakeREQ as a request from P_z and then remove P_z from their lists.
• However, the algorithm has some limitations. For example, consider the case of P_j entering R1 followed by R2 in a nested manner. If P_j fails after exiting R1 and before entering R2, then other processes wanting to enter R1 may be blocked forever since P_j will never exit R2.
[Figure: P_x, P_y and P_z; P_z's last request is req_i, and P_y sends SomebodyDied(P_z, fakeREQ)]
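The failure-handling steps can be sketched as two small Python functions. This is an illustration under assumptions: a request is modelled as a pair `(kind, region)` with `kind` being `"enter"` or `"exit"`, executing an exit request is modelled as incrementing the region's out counter, and the names `fake_request` and `handle_somebody_died` are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Request = Tuple[str, str]  # (kind, region), kind is "enter" or "exit"


def fake_request(last: Optional[Request]) -> Optional[Request]:
    """Derive fakeREQ from the last request received from the crashed node."""
    if last is not None and last[0] == "enter":
        return ("exit", last[1])  # the crashed node was inside this region
    return None  # fakeREQ is empty


def handle_somebody_died(
    out_counter: Dict[str, int],
    members: Set[str],
    dead: str,
    fake_req: Optional[Request],
) -> None:
    """Treat fakeREQ as a request from the dead node, then drop the node."""
    if fake_req is not None:
        _, region = fake_req
        out_counter[region] += 1  # assumed effect of executing an exit request
    members.discard(dead)


out_counter = {"R": 0, "W": 0}
members = {"Px", "Py", "Pz"}
handle_somebody_died(out_counter, members, "Pz", fake_request(("enter", "W")))
```

After the call, the out counter of region W has been incremented on behalf of the crashed node and `Pz` is no longer in the membership list.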

The algorithm for a point-to-point network
• Node recovery
• When a process P_j recovers from a failure, it sends a Join(P_j) message to all other processes and waits for Agree messages from them.
• When a process P_i receives the Join(P_j) message, it adds P_j to its list. P_i then sends an Agree message along with the latest request that it executed.
• When P_j receives the Agree message from P_i, it checks the latest request P_i executed. If the latest request is for entering R_x, P_j increases the entry counter for R_x, in[x], by one; otherwise, it does nothing. P_j can then execute requests from P_i once it has received P_i's Agree message. After P_j has received the Agree messages from all other processes, it can send its own requests.
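The counter reconstruction performed by the recovering process can be sketched as follows. The message representation is an assumption: each Agree message is modelled as the sender's latest executed request, a pair `(kind, region)` or `None`, and the function name `rebuild_entry_counters` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple

Request = Tuple[str, str]  # (kind, region), kind is "enter" or "exit"


def rebuild_entry_counters(
    agree_messages: Dict[str, Optional[Request]]
) -> Dict[str, int]:
    """Increment in[x] once for every sender whose latest request entered R_x."""
    in_counter: Dict[str, int] = defaultdict(int)
    for latest in agree_messages.values():
        if latest is not None and latest[0] == "enter":
            in_counter[latest[1]] += 1
        # Otherwise the sender is outside every region: nothing to do.
    return dict(in_counter)


agree = {
    "P1": ("enter", "R"),
    "P2": ("exit", "W"),
    "P3": ("enter", "R"),
    "P4": None,
}
recovered = rebuild_entry_counters(agree)
```

With these Agree messages, the recovering process concludes that two processes are currently inside region R and none is inside region W.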

5. Algorithm Optimizations
• The algorithm that we have proposed is for the general problem of region synchronization.
• We would like our general algorithm to match the message complexity of algorithms that have been designed for specific synchronization problems.
• For example, for the distributed mutual exclusion problem, our algorithm uses the same number of messages as the algorithm proposed by Lamport (3 × (N-1) messages). However, the Ricart-Agrawala algorithm requires only 2 × (N-1) messages.
• Chandy gives an algorithm requiring 0 to 2d messages for the dining philosophers problem, while our algorithm needs 3d messages, where d is the number of neighbors of a philosopher.
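To make the gap concrete, the message counts quoted above can be tabulated for particular values of N and d. The function names below are hypothetical; the formulas are the ones stated on the slide.

```python
def general_mutex_messages(n: int) -> int:
    """Unoptimized general algorithm (same as Lamport): 3 x (N-1)."""
    return 3 * (n - 1)


def ricart_agrawala_messages(n: int) -> int:
    """Ricart-Agrawala: 2 x (N-1)."""
    return 2 * (n - 1)


def general_dining_messages(d: int) -> int:
    """Unoptimized general algorithm for dining philosophers: 3d."""
    return 3 * d


def chandy_dining_range(d: int) -> tuple:
    """Chandy's algorithm: between 0 and 2d messages."""
    return (0, 2 * d)


for n in (2, 5, 10):
    print(n, general_mutex_messages(n), ricart_agrawala_messages(n))
```

For N = 5 the general algorithm sends 12 messages per critical-section entry against 8 for Ricart-Agrawala, and for a philosopher with d = 2 neighbors it sends 6 messages against at most 4.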

Algorithm Optimizations
• Optimization to handle remote requests
• inc_{l,k,y} keeps track of the number of times P_l has executed Sync_y until a request for Sync_z from P_k arrives.
[Figure: panels (a), (b) and (c) showing P_l and P_k executing Sync_x, Sync_y and Sync_z, with the conflict and enable relations, the request req_z and the reply ack(inc_{l,k,y})]

Algorithm Optimizations
• Optimization to handle local requests
[Figure: panels (a) and (b) showing P_l and P_k executing Sync_x, Sync_y and Sync_z, with the conflict relations, the requests req_x and req_z, and the replies ack and ack(inc_{l,k,y})]

Algorithm Optimizations
• Using application structure to optimize performance
[Figure: panels (a) and (b) showing P_l and P_k executing Sync_x, Sync_y and Sync_z, with the conflict relation, the requests req_y and req_z, and the replies ack and ack(inc_{l,k,x})]

Algorithm Optimizations
• Summary:
• We have proposed a general algorithm for synchronization in point-to-point systems.
• We also show that, with these optimizations, our algorithm achieves performance comparable to known algorithms for specific synchronization problems.

6. The algorithm for CAN-based systems
• Introduction to the CAN network
• Controller Area Network (CAN) is a serial data communications bus that supports distributed control systems by sending and receiving short real-time control messages.
• CAN is a broadcast bus.
• Each message is identified by a message identifier. The identifier is used not only for filtering upon reception but also for setting the priority of the message.
• The CAN bus behaves like a large AND gate for all bits sent at the same time.
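The AND-gate behaviour is what makes the identifier act as a priority: when several nodes start transmitting together, a 0 bit overrides a 1 bit, so the lowest identifier survives. The following Python sketch simulates this bitwise arbitration. It assumes 11-bit standard identifiers sent most significant bit first and unique identifiers per contender; the function name `arbitrate` is hypothetical.

```python
def arbitrate(identifiers, width=11):
    """Return the identifier that wins arbitration on a wired-AND bus."""
    contenders = set(identifiers)
    if not contenders:
        raise ValueError("at least one identifier is required")
    for bit in range(width - 1, -1, -1):  # most significant bit first
        sent = {ident: (ident >> bit) & 1 for ident in contenders}
        bus = 1
        for value in sent.values():
            bus &= value  # the bus carries the AND of all transmitted bits
        # A node that sent 1 but reads 0 has lost and stops transmitting.
        contenders = {ident for ident in contenders if sent[ident] == bus}
    (winner,) = contenders
    return winner


winner = arbitrate([0x123, 0x065, 0x7FF])
```

Here the message with identifier 0x065 wins, because it is the numerically smallest identifier and therefore has the highest priority.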

The algorithm for CAN-based systems
• Review of the correctness criteria
[Figure: real-time diagram (time axis t) of P1, P2 and P3 with requests req_a, req_b and req_c and the resulting executions of Syn_a, Syn_b and Syn_c at each process; the outcome is marked inconsistent]

The algorithm for CAN-based systems
• When a process (node) wants to execute a synchronization statement Syn_i, it must follow these rules:
• C1: If (∃ j, (Syn_i →cf Syn_j) ∨ (Syn_j →cf Syn_i)) and (∀ k, ¬(Syn_i →en Syn_k)), then a request is sent out. C_i' and C_i are incremented when B_i is true locally.
• C2: If (∃ j, Syn_i →en Syn_j) and (∀ k, ¬((Syn_k →cf Syn_i) ∨ (Syn_i →cf Syn_k))), then a request is sent out after B_i is true locally. Thus, C_i' and C_i are incremented before the request is sent.
• C3: If (∃ j, (Syn_i →cf Syn_j) ∨ (Syn_j →cf Syn_i)) and (∃ k, Syn_i →en Syn_k), then the request is sent first. Subsequently, C_i and C_i' are incremented when B_i is true locally, and then a notify message is sent out to all other processes. Other processes increment C_i locally only on receiving this notification.

The algorithm for CAN-based systems
• Implementation of the distributed approach
[Figure: nodes P1, ..., Pn, each with Application and Order Control layers, attached to the CAN bus]

The algorithm for CAN-based systems
• Implementation of the active monitor approach
[Figure: application nodes P1, ..., Pn with proxies and a monitor node M with its proxy, attached to the CAN bus]

The algorithm for CAN-based systems
• The example used in our implementation
• Sleeping barber problem: A shop has M barbers, one chair for each barber, and a waiting room with K chairs. If all barbers are busy when a customer, say A, arrives, then A waits in the waiting room (provided there is an empty chair). If a barber, say B, is free, then A sits in B's chair. After B finishes cutting A's hair, A leaves the shop. Subsequently, B waits for another customer to sit in its chair.
• We have implemented solutions to the sleeping barber problem using the active monitor approach and the distributed approach.
• The system consists of six 167CR boards connected via a 250 Kb/s CAN network: two barber nodes, three customer nodes, and a noise node that is added to the system to adjust the system load. The three customers arrive at the shop at random times, between 0 and 25 ms. A barber serves each customer for 25 ms.

The algorithm for CAN-based systems
• Performance analysis of the two approaches
