Dynamic Load Distribution in the Borealis Stream Processor. Ying Xing, Stan Zdonik, Jeong-Heon Hwang. Brown Univ. ICDE 2005.
One-line comment • Proposes an algorithm that balances load dynamically by redistributing operators under highly fluctuating input data, in the context of a clustered Borealis (CQE) system
Problem
• In a push-based (CQE) system, load fluctuates as the input data rate changes
• A temporary load spike can significantly affect data processing latency
• We should avoid temporary overload as much as possible!
[Figure: cluster of Borealis nodes]
Challenges
• What operator mapping plan can balance the load best?
• How should we rearrange the plan dynamically as the load changes?
[Figure: two mappings of operators A, B (input stream S1, rate r) and C, D (input stream S2, rate 2r) onto a cluster of Borealis nodes. "Connected Plan": node loads 2cr and 4cr. "A Better Plan": node loads 3cr and 3cr.]
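The node loads in the two plans can be reproduced with a few lines of arithmetic. This is a minimal sketch under assumptions that the slide does not state explicitly: every operator costs c per tuple, every operator forwards all of its input (selectivity 1), and the better plan places A and C on one node and B and D on the other. The names `node_loads`, `N1` and `N2` are illustrative.

```python
def node_loads(plan, rates, cost):
    """Per-node load: sum of (input rate * per-tuple cost) over the node's operators."""
    return {node: sum(rates[op] * cost for op in ops) for node, ops in plan.items()}

# Units chosen so that loads print as multiples of c*r (assumed c = 1, r = 1).
c, r = 1, 1
# Assumed selectivity 1: B sees the same rate as A, and D the same rate as C.
rates = {"A": r, "B": r, "C": 2 * r, "D": 2 * r}

connected = {"N1": ["A", "B"], "N2": ["C", "D"]}  # each chain kept on one node
better = {"N1": ["A", "C"], "N2": ["B", "D"]}     # assumed split across the chains

print(node_loads(connected, rates, c))  # {'N1': 2, 'N2': 4}
print(node_loads(better, rates, c))     # {'N1': 3, 'N2': 3}
```

Under these assumptions the connected plan yields 2cr and 4cr, while the split plan yields 3cr on both nodes, matching the labels in the figure.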
Solution approach
• Busy all together, idle all together
• Find the operators that are busy at the same time
• Calculate the correlation between operators
• Distribute busy operators
• Move operators from a heavily loaded machine to an underloaded machine
• Perform the above operations periodically
• Proposes a two-phase operator distribution algorithm: a global algorithm for the initial operator mapping, and a pair-wise algorithm that dynamically rearranges the mapping
Statistics measured
• Load time series of operators and nodes
• Load of an operator: (# of tuples arrived) × (CPU time required per tuple)
• Load of a machine: sum of the loads of its operators
• Only the most recent K statistics are kept
• Average load of a machine X1: average of its load time series S1 = (s1, s2, …, sk)
• Correlation of operators X1, X2: correlation of their load time series SX1, SX2
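The statistics above can be sketched in a few lines. This is an illustration only: the slide does not say which correlation measure is used, so the Pearson coefficient is assumed here, and the window size `K`, the helper names, and the sample measurements are all hypothetical.

```python
from collections import deque
from math import sqrt

K = 10  # assumed window size: only the K most recent samples are kept

def mean(xs):
    return sum(xs) / len(xs)

def correlation(xs, ys):
    """Pearson correlation of two equal-length series (assumed measure); 0.0 if either is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def operator_load(tuples_arrived, cpu_time_per_tuple):
    """Load of an operator in one period: tuples arrived * CPU time per tuple."""
    return tuples_arrived * cpu_time_per_tuple

def node_series(operator_series):
    """Load time series of a machine: element-wise sum of its operators' series."""
    return [sum(vals) for vals in zip(*operator_series)]

# Hypothetical measurements: tuples arrived per period, 2 ms of CPU per tuple.
x1, x2 = deque(maxlen=K), deque(maxlen=K)
for arrived_1, arrived_2 in [(10, 40), (20, 30), (30, 20), (40, 10)]:
    x1.append(operator_load(arrived_1, 2))
    x2.append(operator_load(arrived_2, 2))

machine = node_series([x1, x2])
print(mean(machine))         # 100.0
print(correlation(x1, x2))   # -1.0
```

A `deque` with `maxlen=K` drops the oldest sample automatically, which is one simple way to keep only the recent K statistics.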
Ideal state of the cluster
• The average loads of all machines are equal
• The average of the machines' load variances is minimized
• Make the lower bound of the average variance as small as possible
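A small numeric example may help show why equal average loads alone are not enough. The sketch below compares two mappings of four hypothetical operators onto two nodes; both mappings give every node the same average load, but the average load variance differs. The operator names, the load series, and the helper functions are made up for illustration.

```python
def variance(xs):
    """Population variance of a series."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def average_node_variance(mapping, op_series):
    """Average, over nodes, of the variance of each node's total load series."""
    variances = []
    for ops in mapping:
        total = [sum(vals) for vals in zip(*(op_series[o] for o in ops))]
        variances.append(variance(total))
    return sum(variances) / len(variances)

# Hypothetical load series: a and b peak together, c and d peak when a and b are idle.
op_series = {
    "a": [1, 3, 1, 3],
    "b": [1, 3, 1, 3],
    "c": [3, 1, 3, 1],
    "d": [3, 1, 3, 1],
}

together = [["a", "b"], ["c", "d"]]  # positively correlated operators share a node
split = [["a", "c"], ["b", "d"]]     # negatively correlated operators share a node

print(average_node_variance(together, op_series))  # 4.0
print(average_node_variance(split, op_series))     # 0.0
```

In both mappings each node averages a load of 4, yet only the second mapping keeps each node's load flat over time.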
Pair-wise Load Distribution Algorithm: One-way
• Select the operators with the greatest score until the load of the selected operators exceeds (L1 − L2)/2
• Score function: Co(O1, M1) − Co(O1, M2)
[Figure: operators 1 to 7 on M1, operators 8 and 9 on M2]
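The one-way selection can be sketched as a greedy loop. Several points here are assumptions rather than statements from the slides: Co(O, M) is read as the Pearson correlation between an operator's load series and the node's total load series, L1 and L2 are read as the average loads of M1 and M2, scores are computed once rather than refreshed after each move, and the loop stops as soon as the moved load reaches (L1 − L2)/2. All function names and sample series are hypothetical.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def correlation(xs, ys):
    """Pearson correlation (assumed meaning of Co); 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)

def node_series(op_series):
    """Total load series of a node: element-wise sum of its operators' series."""
    return [sum(vals) for vals in zip(*op_series)]

def one_way_select(m1_ops, m2_ops):
    """Return the names of operators to move from M1 (more loaded) to M2."""
    s1, s2 = node_series(m1_ops.values()), node_series(m2_ops.values())
    target = (mean(s1) - mean(s2)) / 2
    if target <= 0:
        return []
    # Score from the slide: Co(O, M1) - Co(O, M2), highest first.
    ranked = sorted(
        m1_ops,
        key=lambda o: correlation(m1_ops[o], s1) - correlation(m1_ops[o], s2),
        reverse=True,
    )
    selected, moved = [], 0.0
    for op in ranked:
        if moved >= target:  # assumed stopping rule: moved load has reached (L1 - L2)/2
            break
        selected.append(op)
        moved += mean(m1_ops[op])
    return selected

# Hypothetical load series.
m1 = {"o1": [4, 0, 4, 0], "o2": [0, 2, 0, 2], "o3": [3, 3, 3, 3]}
m2 = {"o4": [0, 4, 0, 4]}
print(one_way_select(m1, m2))  # ['o1']
```

In this example o1 peaks together with M1 and is idle when M2 is busy, so it gets the highest score; moving it leaves both nodes with an average load of 4.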
Pair-wise Load Distribution Algorithm: Two-way
• Redistribute all movable operators
• The lower-loaded node is selected
• Operators are assigned one by one
• The operator with the highest score is selected
[Figure: operators 1 to 9 redistributed between M1 and M2]
Global Operator Distribution
• Redistribute all movable operators after a warm-up period
• The node with the lowest load is selected
• Operators are assigned one by one
• The operator with the highest score is selected
• Score function: [formula not preserved]
[Figure: operators distributed across M1 and M2]
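The greedy shape of the global distribution (and of the two-way pair-wise variant, which follows the same steps on two nodes) can be sketched as follows. The score function did not survive in this transcript, so the sketch uses a stand-in that is purely an assumption: the negated Pearson correlation between an operator's load series and the load already placed on the chosen node. It should not be read as the authors' score. Function names and sample series are hypothetical.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def correlation(xs, ys):
    """Pearson correlation; 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)

def node_series(op_series, k):
    """Total load series of a node; all zeros for an empty node."""
    total = [0.0] * k
    for series in op_series:
        for i, v in enumerate(series):
            total[i] += v
    return total

def global_assign(op_series, n_nodes):
    """Greedy mapping: repeatedly pick the least-loaded node, then the best-scoring operator."""
    k = len(next(iter(op_series.values())))
    nodes = [dict() for _ in range(n_nodes)]
    unassigned = dict(op_series)
    while unassigned:
        totals = [node_series(n.values(), k) for n in nodes]
        idx = min(range(n_nodes), key=lambda i: mean(totals[i]))
        # Stand-in score (assumption): prefer the operator least correlated with the node.
        best = max(unassigned, key=lambda o: -correlation(unassigned[o], totals[idx]))
        nodes[idx][best] = unassigned.pop(best)
    return nodes

# Hypothetical load series collected during the warm-up period.
ops = {"a": [1, 3, 1, 3], "b": [1, 3, 1, 3], "c": [3, 1, 3, 1], "d": [3, 1, 3, 1]}
result = global_assign(ops, 2)
print([sorted(n) for n in result])  # [['a', 'c'], ['b', 'd']]
```

With this stand-in score, operators that peak at different times end up on the same node, so each node's total load stays flat at 4.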
Experimental results
• 1. Computation overhead of the algorithms
• 2. Effectiveness of the global algorithm
  • Load variance
  • End-to-end latency
• 3. Pair-wise algorithms
  • Adaptivity to load changes
Experimental results
• AMD Athlon 3200+ 2 GHz, 1 GB memory
• Global algorithm
• Pair-wise algorithm
  • 6 ms for each pair when n = 10, 10 operators/machine
Experimental results
• Global algorithm evaluation
• Latency ratio = end-to-end latency / sum(processing delay)
• System load level = (sum of busy time) / (# of nodes × simulation duration)
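The two evaluation metrics are simple ratios; a minimal sketch follows. The function names and all input numbers are hypothetical and are not results from the experiments.

```python
def latency_ratio(end_to_end_latency, processing_delays):
    """End-to-end latency divided by the sum of the operators' processing delays."""
    return end_to_end_latency / sum(processing_delays)

def system_load_level(busy_times, simulation_duration):
    """Total busy time divided by (number of nodes * simulation duration)."""
    return sum(busy_times) / (len(busy_times) * simulation_duration)

# Hypothetical query: a chain of 10 operators, 1 ms each, observed latency 25 ms.
print(latency_ratio(25, [1] * 10))                 # 2.5

# Hypothetical cluster: 4 nodes busy for 80, 90, 70 and 80 s of a 100 s run.
print(system_load_level([80, 90, 70, 80], 100))    # 0.8
```

A latency ratio of 1 would mean tuples never wait in queues; larger values indicate queueing delay on top of pure processing time.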
Experimental results • Pair-wise algorithm evaluation
Critiques
• Strong points
  • Balances load according to changes in the input data rate (data pushed into the system)
  • A simple algorithm based on correlation
• Weak points
  • Unrealistic workload (operator chains, input streams)
  • Parameters for statistics measurement are hard to define
    • Load collection period, score threshold, # of time series, …
    • They must be changed depending on the workload
  • If an input fluctuation does not follow any historical behavior, the effect will be limited
  • Does not consider dynamic changes to the operator network (query addition, deletion)
Parameters for Experiments (supplementary)
• Independent linear operator chains (10 ops)
• Number of nodes = 20
• 20 queries
• Operator processing delay = 1 ms
• Load measuring time period = 1 sec
• # of samples in a load time series = 10
• The pair-wise algorithm is executed every second
• Operator migration time = 200 ms