Distributed Mutual Exclusion

Preliminaries • Two broad classes of algorithms • Assertion-based • Token-based • System model • Each site (node) is in one of these states: • Requesting the critical section (CS) • Executing the CS • Idle • Distributed mutual exclusion algorithms should be • Free from deadlocks • Free from starvation • Fair • Fault-tolerant

Performance Measures • Measures • Number of messages per CS invocation • Synchronization delay = elapsed time from one site leaving the CS until the next site enters the CS • Response time = elapsed time between a site’s request and its exit from the CS • System throughput = rate of CS executions • Assume that • T is the average one-way message delay (½ of the network round-trip delay) • E is the average CS execution time • Low and high load performance • Best, average, and worst case performance

Primary Site Protocol • Simple centralized protocol where • A control site is assigned the task of regulating access to CS • Each site that wants to enter CS requests permission from control site • Performance • Synch. Delay = 2T • Throughput=1/(2T+E) • Advantages and disadvantages of this algorithm
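The control site's bookkeeping can be sketched in a few lines of Python. This is only an illustrative, single-process sketch: the names `ControlSite`, `request`, `release`, and `throughput` are hypothetical, and message passing is replaced by direct method calls.

```python
from collections import deque


class ControlSite:
    """Hypothetical control site: grants the CS to one site at a time."""

    def __init__(self):
        self.holder = None       # site currently allowed in the CS
        self.waiting = deque()   # sites waiting for permission, in arrival order

    def request(self, site):
        """Return True if `site` is granted the CS now, else queue it."""
        if self.holder is None:
            self.holder = site
            return True
        self.waiting.append(site)
        return False

    def release(self, site):
        """Holder leaves the CS; return the next site granted, or None."""
        if site != self.holder:
            raise ValueError("only the current holder may release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


def throughput(T, E):
    """Throughput under heavy load: one CS per (release msg + grant msg + E)."""
    return 1.0 / (2 * T + E)
```

The synchronization delay of 2T shows up in `release`: the departing site's release message takes T to reach the control site, and the grant takes another T to reach the next site.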

Quorum-Based Protocols • Given a set of nodes S, a quorum set (coterie) C is a set of subsets of S, called quorums, such that any two of them have a non-empty intersection • A site Si has a set of quorums associated with it, and whenever it wants to enter CS it requests permission from all the nodes in one of its quorums • A site in a quorum is assumed to give permission for CS to only 1 site at a time • Upon completing CS, a site usually informs its quorum • Quorum-based algorithms differ in the choice of quorums and in the “voting” protocol used by quorum sites
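The intersection property is easy to check mechanically. The helper below is a minimal sketch (the name `is_quorum_set` is made up for illustration); it tests only pairwise intersection, as stated above, and no further properties such as minimality.

```python
from itertools import combinations


def is_quorum_set(quorums):
    """True if every pair of quorums shares at least one node."""
    return all(set(a) & set(b) for a, b in combinations(quorums, 2))


# Majorities of {1, 2, 3} always overlap; two disjoint pairs do not.
majorities = [{1, 2}, {2, 3}, {1, 3}]
disjoint = [{1, 2}, {3, 4}]
```

Because any two quorums share a node, and that node grants permission to only one site at a time, two sites can never hold complete quorums simultaneously.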

Lamport’s Mutex Protocol • Setup • The quorum of each site is the set of all sites • Each site maintains a priority queue of all CS requests (requests are ordered by their Lamport timestamps) • Messages are delivered in FIFO order between any two sites (FIFO channels) • A requesting site Si sends REQUEST(TSi,i) to all sites; each site Sj places every REQUEST(TSi,i) msg it receives in its queue and sends a REPLY(TSj) message to Si • A site Si enters CS if • It has received a reply message with timestamp higher than TSi from every other site • Its request is at the head of its own queue • Upon completion of CS, site Si sends a RELEASE(TSi) message to each site, which in response removes Si’s request from its queue

Lamport’s Mutex Protocol • Correctness • A site Si executing CS has its request in every site’s queue; because channels are FIFO, no earlier request can still be in transit, so only 1 request can be at the head of every queue • Performance • Number of messages = 3(N-1) • Synch. Delay = T • Note that a site Sj does not need to send a REPLY(TSj) to a REQUEST(TSi,i) if Sj has already sent a REQUEST(TSj,j) to Si with TSj > TSi
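The entry condition can be exercised with a small in-memory sketch. Everything here is an assumption made for illustration: the class `LamportSite`, the helpers `deliver_request` and `deliver_release`, instantaneous delivery via method calls, and ties between equal timestamps broken by site id.

```python
import heapq


class LamportSite:
    def __init__(self, sid, n):
        self.sid = sid
        self.clock = 0
        self.queue = []                                  # heap of (timestamp, site id)
        self.latest = {j: 0 for j in range(n) if j != sid}  # newest timestamp seen per site
        self.my_request = None

    def _receive(self, ts, sender):
        self.clock = max(self.clock, ts) + 1             # Lamport clock rule
        self.latest[sender] = max(self.latest[sender], ts)

    def request(self):
        self.clock += 1
        self.my_request = (self.clock, self.sid)
        heapq.heappush(self.queue, self.my_request)
        return self.my_request                           # REQUEST(TSi, i)

    def on_request(self, ts, sender):
        self._receive(ts, sender)
        heapq.heappush(self.queue, (ts, sender))
        self.clock += 1
        return self.clock                                # timestamp of the REPLY

    def on_reply(self, ts, sender):
        self._receive(ts, sender)

    def on_release(self, ts, sender):
        self._receive(ts, sender)
        self.queue = [r for r in self.queue if r[1] != sender]
        heapq.heapify(self.queue)

    def can_enter(self):
        if self.my_request is None or self.queue[0] != self.my_request:
            return False
        return all(ts > self.my_request[0] for ts in self.latest.values())

    def release(self):
        self.queue.remove(self.my_request)
        heapq.heapify(self.queue)
        self.my_request = None
        self.clock += 1
        return self.clock                                # timestamp of the RELEASE


def deliver_request(sites, sender, req):
    """Deliver REQUEST to every other site and hand each REPLY back."""
    ts, sid = req
    for s in sites:
        if s is not sender:
            sender.on_reply(s.on_request(ts, sid), s.sid)


def deliver_release(sites, sender, ts):
    for s in sites:
        if s is not sender:
            s.on_release(ts, sender.sid)
```

With three sites where sites 0 and 1 request concurrently, site 0 wins the tie and site 1 can enter only after site 0's RELEASE has been delivered. Each CS entry costs (N-1) REQUEST, (N-1) REPLY, and (N-1) RELEASE messages, which is the 3(N-1) figure.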

Ricart-Agrawala’s Protocol • An optimization of Lamport’s protocol • Combines the RELEASE and REPLY messages • Algorithm • Site Si sends REQUEST(TSi,i) to all other sites • Site Sj, upon receiving REQUEST(TSi,i), sends a REPLY(TSj) to Si if either Sj is idle or it is requesting CS with TSj > TSi (Si’s request has priority); else Sj defers Si’s request • Si enters CS only after it has received REPLY msgs from all other sites • Upon completing CS, Si sends REPLY msgs to all deferred requests • Observe that a site with a higher-priority (lower timestamp) outstanding CS request defers all other (lower-priority) requests

Ricart-Agrawala’s Protocol • Performance • #messages = 2(N-1) • Synch. Delay = T • Note that further optimizations are possible (for example, an “authorization” given by Sj to Si remains implicitly valid until Si sends an authorization to Sj)
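The reply-or-defer decision is the heart of the protocol, so a short sketch may help. The class `RASite` and its method names are hypothetical, messages are modeled as direct calls, and equal timestamps are assumed to be broken by site id. A site defers an incoming request when it is executing the CS or when its own outstanding request has the lower (timestamp, id) pair.

```python
class RASite:
    def __init__(self, sid, n):
        self.sid, self.n = sid, n
        self.clock = 0
        self.state = "idle"      # "idle", "requesting", or "executing"
        self.my_ts = None
        self.replies = set()     # sites that have replied to our request
        self.deferred = []       # sites whose requests we are holding back

    def request(self):
        self.clock += 1
        self.state, self.my_ts = "requesting", self.clock
        self.replies = set()
        return (self.my_ts, self.sid)          # REQUEST(TSi, i)

    def on_request(self, ts, sender):
        """Return True to REPLY immediately, False if the request is deferred."""
        self.clock = max(self.clock, ts) + 1
        mine_first = self.state == "executing" or (
            self.state == "requesting" and (self.my_ts, self.sid) < (ts, sender)
        )
        if mine_first:
            self.deferred.append(sender)
            return False
        return True

    def on_reply(self, sender):
        self.replies.add(sender)
        if self.state == "requesting" and len(self.replies) == self.n - 1:
            self.state = "executing"

    def release(self):
        """Leave the CS; return the sites that now get their deferred REPLY."""
        self.state, self.my_ts = "idle", None
        out, self.deferred = self.deferred, []
        return out
```

The deferred REPLY sent on `release` plays the role of Lamport's RELEASE, which is why each CS entry needs only (N-1) REQUEST and (N-1) REPLY messages.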

More Quorum-based protocols • Based on different ways to construct quorum systems through different logical organizations of the nodes • Maekawa’s protocol • Organize the nodes in a grid • Agrawal and El Abbadi’s protocol • Organize nodes in a tree • Others • Organize nodes in a triangle, etc. • Implicit quorums (assign votes to each node: Thomas’s majority protocol, etc.) • In all these protocols care must be taken to avoid deadlocks • These can happen because requests are not sent to all sites and are not prioritized
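As an illustration of the grid organization, the sketch below builds a quorum for each site from its row and column. It assumes N is a perfect square and sites are numbered 0..N-1 in row-major order; the function name `grid_quorum` is made up, and this row-plus-column construction is one common simplification, not necessarily the exact quorums of the published protocol.

```python
import math


def grid_quorum(site, n):
    """Quorum of `site` in a sqrt(n) x sqrt(n) grid: its whole row and column."""
    k = math.isqrt(n)
    if k * k != n:
        raise ValueError("this sketch assumes n is a perfect square")
    row, col = divmod(site, k)
    same_row = {row * k + c for c in range(k)}
    same_col = {r * k + col for r in range(k)}
    return same_row | same_col
```

Any two such quorums intersect because the row of one always crosses the column of the other, and each quorum has 2·sqrt(N) - 1 members instead of N.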

Handling Deadlocks in Quorum-based Protocols • Three new messages are introduced • FAILED(i) from Si → Sj • Sent when Si cannot grant Sj’s request because it has already granted the request of another node • INQUIRE(i) from Si → Sj • Si wants to find out whether Sj has already succeeded in getting permission from everybody in its quorum • YIELD(i) from Si → Sj • Si “returns/frees” the authorization/vote that Sj has given to it

Handling Deadlocks in Quorum-based Protocols • Sj, upon receiving REQUEST(TSi,i) from Si • If Sj has already granted REQUEST(TSk,k) from Sk • If TSk < TSi then send FAILED(j) to Si • Else send INQUIRE(j) to Sk • Sk, in response to INQUIRE(j), sends a YIELD(k) to Sj if Sk has already received a FAILED or sent a YIELD message • Sj, in response to YIELD(k), places Sk’s request back in its queue and grants the request at the head of its queue (by sending it a REPLY message, and also sending any FAILED messages where appropriate)
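The behavior of a quorum member Sj under these rules can be sketched as a small state machine. The class `Arbiter`, the `on_release` handler, the `inquired` flag (used to avoid sending duplicate INQUIREs), and tie-breaking by site id are all assumptions of this sketch; each handler returns the list of (message, destination) pairs that Sj would send.

```python
import heapq


class Arbiter:
    """Quorum member Sj holding a single vote."""

    def __init__(self, sid):
        self.sid = sid
        self.granted = None    # (timestamp, site) that currently holds our vote
        self.queue = []        # heap of waiting (timestamp, site) requests
        self.inquired = False  # True while an INQUIRE is outstanding

    def on_request(self, ts, site):
        req = (ts, site)
        if self.granted is None:
            self.granted = req
            return [("REPLY", site)]
        heapq.heappush(self.queue, req)
        if self.granted < req:                 # vote holder has priority
            return [("FAILED", site)]
        if not self.inquired:                  # newcomer has priority
            self.inquired = True
            return [("INQUIRE", self.granted[1])]
        return []

    def on_yield(self, site):
        """Vote holder gives the vote back; re-grant to the best request."""
        heapq.heappush(self.queue, self.granted)
        self.granted = heapq.heappop(self.queue)
        self.inquired = False
        return [("REPLY", self.granted[1])]

    def on_release(self, site):
        """Vote holder finished its CS; grant to the next request, if any."""
        self.granted = heapq.heappop(self.queue) if self.queue else None
        self.inquired = False
        return [("REPLY", self.granted[1])] if self.granted else []
```

The sketch omits the extra FAILED messages that may be appropriate after a YIELD; it only shows how the vote moves toward the oldest request.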

Token-Based Protocols • Suzuki-Kasami’s protocol • Raymond’s Tree-based protocol

Suzuki-Kasami • Maintains a token, which carries • Array LN[i] = #CS executions by Pi • FIFO queue Q of all processes with outstanding requests • Each node Pi maintains an array • RN[i,j] = largest sequence number of a CS request by Pj known to Pi

Raymond’s Token-based MUTEX Protocol • Let P be the process that has the token (varies over time) • Each site Pi maintains • Holder = points to the neighbor of Pi on the path from Pi to P • Q = a queue of the neighbors of Pi that have sent a request to Pi for the token and have not received it yet • F = flag that is true iff Pi has sent a request for the token to Holder • Processes (nodes, sites) are organized in a logical tree rooted at the node that holds the token • The token is idle when the node holding it is not in CS

Raymond’s Protocol • When Pi wants CS • It adds itself to its queue Q • If it does not have the token and F=false, it sends a REQ msg to Holder and sets F to true • When Pj receives REQ from Pi • It adds Pi to its queue Q • If it does not have the token and F=false, it sends REQ to Holder and sets F to true • When an idle token becomes available at Pj, and its Q is not empty • Let P be the head of Pj’s Q • Remove P from Q, and set F to false • If P = Pj, Pj enters CS • Else • Send the idle token to P, and set Holder to P • If Q is still not empty • Send REQ to P, and set F to true
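The token's movement along the tree can be traced with a compact sketch. It is written under several assumptions: the class `Node` and its method names are hypothetical, messages are direct method calls, `holder is None` means the node itself has the token, and a node always enqueues the requester (itself or a neighbor) before deciding whether to forward a REQ, so that the head of Q identifies who gets the token next.

```python
from collections import deque


class Node:
    def __init__(self, name, holder=None):
        self.name = name
        self.holder = holder   # neighbor toward the token; None if we hold it
        self.q = deque()       # pending requesters (neighbors or self)
        self.asked = False     # flag F: REQ already sent to holder
        self.in_cs = False

    def want_cs(self):
        self.q.append(self)
        self._progress()

    def on_req(self, sender):
        self.q.append(sender)
        self._progress()

    def on_token(self):
        self.holder = None
        self._progress()

    def exit_cs(self):
        self.in_cs = False
        self._progress()

    def _progress(self):
        if self.holder is None and not self.in_cs and self.q:
            p = self.q.popleft()          # idle token and a waiting requester
            self.asked = False
            if p is self:
                self.in_cs = True
                return
            self.holder = p               # the edge now points toward the token
            p.on_token()
            if self.q:                    # others still wait: ask for it back
                self.asked = True
                p.on_req(self)
        elif self.holder is not None and self.q and not self.asked:
            self.asked = True             # forward a single REQ toward the token
            self.holder.on_req(self)
```

On a chain A - B - C with the token at A, a request by C travels C → B → A and the token comes back A → B → C, reversing the Holder pointers along the way.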

Suzuki-Kasami • Protocol (T: token) • If Pi requests CS and does not have the token, then • sn = ++RN[i,i] (increment first, then use the new value) • Broadcast REQ(i,sn) • When Pj receives REQ(i,sn) • Set RN[j,i] = max(RN[j,i], sn) • If Pj has the token and RN[j,i] = T.LN[i]+1, add Pi to T.q • When process Pi completes CS execution, it sets • T.LN[i]++ • When the token becomes idle at Pk • Send the token to the process P at the head of T.q and remove P from T.q
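A single-process simulation can make the role of RN and LN concrete. The class `SuzukiKasami` below is a hypothetical sketch, not a distributed implementation: broadcasts are delivered instantly (so a REQ always reaches the current token holder), and it assumes RN[i][i] is incremented on every request, including when Pi already holds the idle token, so that the counter stays aligned with the T.LN[i]++ step.

```python
from collections import deque


class SuzukiKasami:
    def __init__(self, n):
        self.n = n
        self.RN = [[0] * n for _ in range(n)]  # RN[i][j]: highest request no. of Pj known to Pi
        self.LN = [0] * n                      # carried by the token: completed CS count
        self.q = deque()                       # carried by the token: waiting processes
        self.holder = 0                        # process that currently has the token
        self.in_cs = None                      # process inside the CS, if any

    def request(self, i):
        self.RN[i][i] += 1
        sn = self.RN[i][i]
        if self.holder == i and self.in_cs is None:
            self.in_cs = i                     # idle token is already here
            return
        for j in range(self.n):                # broadcast REQ(i, sn)
            if j != i:
                self._on_req(j, i, sn)

    def _on_req(self, j, i, sn):
        self.RN[j][i] = max(self.RN[j][i], sn)
        fresh = self.RN[j][i] == self.LN[i] + 1
        if self.holder == j and fresh and i not in self.q:
            self.q.append(i)
            if self.in_cs is None:             # token is idle: hand it over now
                self._pass_token()

    def release(self, i):
        if self.in_cs != i:
            raise ValueError("process is not in the CS")
        self.in_cs = None
        self.LN[i] += 1
        self._pass_token()

    def _pass_token(self):
        if self.q:
            self.holder = self.q.popleft()
            self.in_cs = self.holder
```

The test RN[j][i] = LN[i] + 1 is what filters out stale requests: a REQ whose sequence number has already been served no longer satisfies it.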
