An Efficient Decentralized Algorithm for the Distributed Trigger Counting (DTC) Problem Venkatesan T. Chakravarthy (IBM Research-India) Anamitra Roy Choudhury (IBM Research-India) Vijay Garg (University of Texas at Austin) Yogish Sabharwal (IBM Research-India)
Distributed Trigger Counting (DTC) Problem • Distributed system with n processors • Each processor receives triggers from an external source • Report to the user when the total number of triggers received across all processors reaches w (in general, w >> n)
Applications of DTC Problem • Distributed monitoring • traffic volume: raise an alarm if the number of vehicles on a highway exceeds some threshold • wildlife behavior: raise an alarm if the number of sightings of a particular species in a wildlife region exceeds a value • Global snapshots: the distributed system must determine whether all in-transit messages have been received in order to declare a snapshot valid; this problem reduces to the DTC problem [Garg, Garg, Sabharwal 2006]
Assumptions: • complete graph model, i.e., any processor can communicate with any other processor • no shared clock and no shared memory • processors communicate using messages • reliable message delivery • no faults in the processors.
Measures of any DTC algorithm: • Low message complexity, the total number of messages exchanged in the system • Low MaxRcvLoad, the maximum number of messages received by any processor in the system • Low MsgLoad, the maximum number of messages communicated by any processor in the system
Trivial Algorithm • Fix one processor to be the master node • The total deficit (initially w) is maintained by the master node • Any processor that receives a trigger informs the master node • The master node decrements the deficit • Finish when the deficit reaches zero • Total messages = O(w) • MaxRcvLoad and MsgLoad are also O(w)
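As a point of reference, here is a minimal sketch of the trivial scheme, with message passing simulated by direct method calls; the class and method names are illustrative, not from the paper:

```python
# Minimal sketch of the trivial algorithm (illustrative only; message passing
# is simulated with direct method calls rather than a real network).

class Master:
    def __init__(self, w):
        self.deficit = w          # triggers still to be accounted for

    def report_trigger(self):
        # Called once per trigger received anywhere in the system: O(w) messages.
        self.deficit -= 1
        if self.deficit == 0:
            print("All w triggers received")

class Worker:
    def __init__(self, master):
        self.master = master

    def on_trigger(self):
        # Every trigger is forwarded to the master immediately.
        self.master.report_trigger()

# Usage: 4 workers, w = 10 triggers delivered in some arbitrary order.
master = Master(w=10)
workers = [Worker(master) for _ in range(4)]
for i in range(10):
    workers[i % 4].on_trigger()
```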
Previous Work: • Any deterministic algorithm has message complexity Ω(n log(w/n)) [Garg et al] • Centralized algorithm: message complexity O(n log w), but MaxRcvLoad can be as high as O(n log w) • Tree-based algorithm: message complexity O(n log n log w); more decentralized in a heuristic sense, but MaxRcvLoad can be as high as O(n log n log w) in the worst case
Modifications to the trivial algorithm • Any processor sends a message (the count of triggers received) to the master only after it receives B triggers, where B = w'/(2n) • Works in multiple rounds • w': deficit at the beginning of a round (initially w' = w) • The master keeps count of the triggers reported by other processors and the triggers received by itself • End-of-round is declared when this count reaches w'/2 • The system never enters a dead state: unreported triggers at each processor < B and the count at the master < w'/2, so fewer than nB + w'/2 = w' triggers can be unaccounted for • Message complexity O(n log w): log w rounds (the deficit at least halves every round) and at most w'/(2B) = n batch messages in every round (a sketch of one round appears below)
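A simulation-style sketch of this round structure is below, assuming (beyond what the slide states) that the end-of-round step also collects the leftover unreported counts from every processor; the function and variable names are illustrative only:

```python
# Illustrative sketch of the batched, round-based variant; a simulation, not
# the authors' code.  Processor 0 plays the role of the master.

def run_rounds(n, w, trigger_sources):
    """trigger_sources: iterable yielding, for each trigger, the index of the
    processor that receives it (hypothetical driver; must supply w triggers)."""
    sources = iter(trigger_sources)
    deficit = w                          # w' for the current round
    batch_messages = 0
    while deficit > 0:
        B = max(1, deficit // (2 * n))   # batch size B ~ w'/(2n)
        unreported = [0] * n             # unreported triggers per processor
        master_count = 0                 # triggers accounted for this round
        while master_count * 2 < deficit:        # end of round at w'/2
            p = next(sources)
            unreported[p] += 1
            if p == 0 or unreported[p] >= B:
                # report the batch to the master (one message if p != 0)
                master_count += unreported[p]
                unreported[p] = 0
                if p != 0:
                    batch_messages += 1
        # End of round: sweep up leftover counts and compute the new deficit.
        master_count += sum(unreported)
        deficit -= master_count
    return batch_messages
```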
Main Result: LayeredRand, a decentralized randomized algorithm • Theorem: For any trigger pattern, the message complexity of the LayeredRand algorithm is O(n log n log w). Also, there exist constants c, d > 1 such that Pr[MaxRcvLoad > c log n log w] < 1/n^d
LayeredRand Algorithm • n = 2^L - 1 processors arranged in L layers • Layer l has 2^l processors, for l = 0 to L-1 (layer 0 contains only the root) • Algorithm proceeds in multiple rounds • w': deficit at the beginning of a round (number of triggers yet to be received) • Threshold for layer l defined as τ(l) = ⌈w' / (4 · 2^l · log n)⌉ • C(x): counter maintained by x, covering triggers received by x and some triggers received by processors in layers below
LayeredRand Algorithm (Contd.) • For a non-root processor x at layer l • If a trigger is received: C(x)++ • If C(x) ≥ τ(l): pick a processor y from layer l-1 uniformly at random and send a coin to y; C(x) := C(x) - τ(l) • If a coin is received from layer l+1: C(x) := C(x) + τ(l+1) • Root r • maintains C(r) just like the others • If C(r) ≥ w'/2: initiates the end-of-round procedure • gets the total number of triggers received in this round • broadcasts the new value of w' for the next round
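The per-processor logic can be sketched as a single-process simulation, roughly as follows; coins are delivered immediately and the end-of-round procedure is collapsed into a direct counter sweep, so this illustrates the coin-flow rule rather than the distributed protocol itself:

```python
# Illustrative simulation of LayeredRand coin flow (not the authors' code).
import math
import random

class LayeredRand:
    def __init__(self, L, w):
        self.L = L
        self.n = 2**L - 1
        self.layer = [l for l in range(L) for _ in range(2**l)]  # layer of each processor
        self.first = [2**l - 1 for l in range(L)]                # first index in layer l
        self.C = [0] * self.n
        self.deficit = w                                          # w'
        self.delivered_this_round = 0
        self._set_thresholds()

    def _set_thresholds(self):
        logn = self.L   # take log n = ceil(log2(n+1)) = L, matching the example slide
        # tau(l) = ceil(w' / (4 * 2^l * log n)); tau(0) is unused (root uses w'/2)
        self.tau = [max(1, math.ceil(self.deficit / (4 * 2**l * logn)))
                    for l in range(self.L)]

    def on_trigger(self, x):
        """Deliver one external trigger to processor x."""
        self.delivered_this_round += 1
        self._add(x, 1)

    def _add(self, x, amount):
        self.C[x] += amount
        l = self.layer[x]
        if x == 0:
            if 2 * self.C[0] >= self.deficit:     # root: C(r) >= w'/2
                self._end_of_round()
        elif self.C[x] >= self.tau[l]:
            self.C[x] -= self.tau[l]
            # send a coin worth tau(l) to a random processor one layer up
            y = random.randrange(self.first[l - 1], self.first[l - 1] + 2**(l - 1))
            self._add(y, self.tau[l])

    def _end_of_round(self):
        # Simulation shortcut: the tree-based reduce is replaced by a direct count.
        self.deficit -= self.delivered_this_round
        self.delivered_this_round = 0
        self.C = [0] * self.n
        if self.deficit <= 0:
            print("All w triggers received")
        else:
            self._set_thresholds()
```

A driver might call, e.g., sim = LayeredRand(L=3, w=96) and then repeatedly sim.on_trigger(random.randrange(sim.n)); the choice log n = L reproduces the thresholds τ(1) = 4 and τ(2) = 2 used in the example slide for w' = 96.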
Example w’ = 96 G End of round w’ for next round = 96- 53 = 43 6 49 45 2 τ(1) = 4 E F 5 3 1 4 3 1 1 2 τ(2) = 2 2 1 2 1 1 1 1 D A B C
Analysis • The system does not stall in the middle of a round when all the triggers have been delivered (no dead state) • Message complexity is O(n log n log w) • MaxRcvLoad is bounded by O(log n log w) with high probability
Correctness • Consider the state of the system in the middle of any round, and suppose all w' triggers of the round have already been delivered (a dead state) • x: any non-root processor at layer l; in such a state C(x) < τ(l), otherwise x would have sent a coin to the layer above • Summed over all non-root processors, fewer than w'/4 triggers can therefore be unaccounted for below the root • A dead state thus implies C(r) > 3w'/4 ≥ w'/2, so the root would already have initiated the end-of-round procedure, leading to a contradiction
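The w'/4 bound follows from the layer thresholds; a back-of-the-envelope version (ignoring ceilings, and using L - 1 ≤ log n for n = 2^L - 1) is:

\[
\sum_{l=1}^{L-1} 2^l\,\tau(l) \;\approx\; \sum_{l=1}^{L-1} 2^l \cdot \frac{w'}{4\cdot 2^l \log n} \;=\; \frac{(L-1)\,w'}{4\log n} \;\le\; \frac{w'}{4},
\]

so in a dead state C(r) ≥ w' - w'/4 = 3w'/4 ≥ w'/2.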
Message Complexity O(n log n log w) • log w rounds (the deficit at least halves every round) • Every coin sent from layer l to layer l-1 means that at least τ(l) triggers have been accounted for at layer l in this round • #coins sent from layer l to layer l-1 is at most w'/τ(l) • #coins sent in a particular round is O(n log n), by summing w'/τ(l) over the layers (see below) • O(n) message exchanges for every end-of-round procedure
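Summing the per-layer coin bound (again ignoring ceilings):

\[
\sum_{l=1}^{L-1} \frac{w'}{\tau(l)} \;\le\; \sum_{l=1}^{L-1} 4\cdot 2^{l}\log n \;<\; 4\cdot 2^{L}\log n \;=\; O(n\log n),
\]

so with O(n) additional end-of-round messages per round and log w rounds in total, the overall message complexity is O(n log n log w).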
MaxRcvLoad O(log n log w) w.h.p.: Prob[MaxRcvLoad of some processor exceeds c log n log w] < n^{-(c-1)}, for any constant c ≥ 48 • In any given round, #coins received by layer l < w'/τ(l+1) < 4 · 2^{l+1} · log n • Each coin is sent uniformly and independently at random to one of the 2^l processors occupying layer l • Mx: r.v. denoting the number of coins received by x • E[Mx] ≤ 8 log n log w • Prob[Mx > 8a · log n log w] < 2^{-8a · log n log w} < n^{-8a}, for a ≥ 6 • The stated result follows by applying the union bound over all n processors
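One standard Chernoff-type tail bound that yields the slide's inequality (shown here as a sketch; the paper may use a slightly different form) states that for a sum X of independent indicator variables with mean μ, Pr[X ≥ R] ≤ 2^{-R} whenever R ≥ 6μ. With μ = E[Mx] ≤ 8 log n log w, R = 8a log n log w and a ≥ 6:

\[
\Pr[M_x > 8a\log n\log w] \;\le\; 2^{-8a\log n\log w} \;=\; n^{-8a\log w} \;\le\; n^{-8a},
\]

and a union bound over the n processors gives Pr[MaxRcvLoad > c log n log w] < n^{-(c-1)} with c = 8a ≥ 48.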
Concurrency • We assumed that triggers are delivered one at a time, i.e., all the processing required for handling a trigger is completed before the next trigger arrives • Relaxing that assumption creates a problem: an end of round can be declared while coins are still in transit, so the collected counts may add up to ΣC(x) = 53 instead of the 55 triggers actually received (as in the w' = 96 example)
Handling Concurrency • Triggers and coins received during a round are placed in a queue and processed one at a time • Additional features for handling end-of-round • A default queue and a priority queue: unprocessed triggers and coins go in the default queue, end-of-round messages in the priority queue, and the default queue is serviced only when the priority queue is empty • Counters C(x), D(x) and RoundNum • D(x): number of triggers processed by x since the beginning • C(x): reset after every round (a sketch of the queue discipline appears below)
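A sketch of the two-queue discipline at a single processor; the Message record and the message kinds are assumptions made for the simulation, and only the queue-priority rule comes from the slide:

```python
# Two-queue discipline at one processor (illustrative sketch).
from collections import deque, namedtuple

# Hypothetical message record; "kind" distinguishes triggers/coins from the
# end-of-round control messages named on the slides.
Message = namedtuple("Message", ["kind", "payload"])

class ProcessorQueues:
    def __init__(self):
        self.default_q = deque()     # triggers and coins
        self.priority_q = deque()    # end-of-round control messages
        self.suspended = False       # set while a RoundReset is in effect

    def enqueue(self, msg):
        if msg.kind in ("RoundReset", "Reduce", "Inform", "InformAck"):
            self.priority_q.append(msg)
        else:                        # "trigger" or "coin"
            self.default_q.append(msg)

    def next_message(self):
        # End-of-round messages always take precedence; the default queue is
        # serviced only when the priority queue is empty and processing has
        # not been suspended by a RoundReset.
        if self.priority_q:
            return self.priority_q.popleft()
        if self.default_q and not self.suspended:
            return self.default_q.popleft()
        return None
```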
End-of-round procedure • Processors are arranged in a tree • Four phases • First phase: the root initiates a RoundReset message • A processor x, on receiving RoundReset, suspends processing of its default queue until the end of the round, i.e., its D(x) value is not modified further until the new round • A non-leaf processor forwards RoundReset to its children; a leaf processor initiates the second phase
Second Phase • A leaf processor initiates a Reduce message containing its D(x) value • A processor x, on receiving Reduce messages from its children, sums their values • A non-root processor adds its own D(x) to the sum and forwards it to its parent • The root processor computes the new w' and decides between termination and the next round (in the running example, ΣD(x) = 55, so the new w' = 96 - 55 = 41)
Third Phase • The root broadcasts the new w' via an Inform message • Every non-leaf processor forwards it to its children • Leaf processors, on receiving the Inform message, initiate the fourth phase (in the running example, the new w' = 41 yields τ(1) = 2 for the next round)
Fourth Phase • Processors in this phase perform the following: RoundNum is incremented, signifying the new round, so the processor does not process any coin left over from previous rounds; C(x) is reset to zero; an InformAck message is sent to the parent; processing of the default queue is resumed • The system (all processors) enters the next round when the root receives the InformAck messages (in the running example, a coin from the previous round is discarded because its RoundNum is stale) • A condensed sketch of the four phases appears below
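Putting the four phases together, here is a condensed sketch on a tree of processors, with message passing replaced by recursive calls; the Node fields and the assumption that D(x) is cumulative (so the root computes the new deficit as w minus the grand total) are illustrative choices, not taken verbatim from the slides:

```python
# Condensed sketch of the four-phase end-of-round procedure (illustrative).

class Node:
    def __init__(self, idx, parent=None):
        self.idx = idx
        self.parent = parent
        self.children = []
        self.D = 0               # triggers processed since the beginning
        self.C = 0               # per-round counter
        self.round_num = 0       # coins tagged with an older value are discarded
        self.w_remaining = None  # new w' learned from the Inform message
        self.suspended = False

# Phase 1: RoundReset flows down the tree; D(x) values are frozen.
def round_reset(node):
    node.suspended = True
    for child in node.children:
        round_reset(child)

# Phase 2: Reduce flows up, summing the frozen D(x) values.
def reduce_counts(node):
    return node.D + sum(reduce_counts(c) for c in node.children)

# Phases 3 and 4: Inform flows down with the new w'; each processor bumps its
# round number, resets C(x), resumes its default queue, and (implicitly here)
# acknowledges with InformAck as the recursion returns.
def inform(node, new_w):
    node.round_num += 1
    node.C = 0
    node.w_remaining = new_w
    node.suspended = False
    for child in node.children:
        inform(child, new_w)

def end_of_round(root, w):
    # Assumes D(x) is cumulative, so the root can compute the new deficit as
    # w minus the total number of triggers processed so far.
    round_reset(root)                    # phase 1
    total = reduce_counts(root)          # phase 2
    new_w = w - total
    if new_w > 0:
        inform(root, new_w)              # phases 3 + 4
    return new_w
```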