Online Balancing of Range-Partitioned Data with Applications to P2P Systems Prasanna Ganesan Mayank Bawa Hector Garcia-Molina Stanford University
Motivation • Parallel databases use range partitioning • Advantages: Inter-query parallelism • Data locality • Low-cost range queries • High throughput [Figure: a key range 0–100 partitioned across nodes at boundaries 20, 35, 60, 80]
The Problem • How to achieve load balance? • Partition boundaries have to change over time • Cost: Data Movement • Goal: Guarantee load balance at low cost • Assumption: Load balance beneficial !! • Contribution • Online balancing -- self-tuning system • Slows down updates by small constant factor
Roadmap • Model and Definitions • Load Balancing Operations • The Algorithms • Extension to P2P Setting • Experimental Results
Model and Definitions (1) • Nodes maintain range partition (on a key) • Load of a node = # tuples in its partition • Load imbalance σ = Largest load/Smallest load • Arbitrary sequence of tuple inserts and deletes • Queries not relevant • Automatically directed to relevant node
Model and Definitions (2) • After each insert/delete: • Potentially fix “imbalance” by modifying partitioning • Cost= # tuples moved • Assume no inserts/deletes during balancing • Non-critical simplification • Goal: σ < constant always • Constant amortized cost per insert/delete • Implication: Faster queries, slower updates
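The model above can be sketched in a few lines of Python. This is a hypothetical illustration — `Node` and `imbalance` are names chosen here, not from the paper:

```python
# Sketch of the model: each node owns a contiguous key range, its load is
# the number of tuples it stores, and sigma is largest load / smallest load.

class Node:
    def __init__(self, low, high):
        self.low, self.high = low, high   # this node owns keys in [low, high)
        self.tuples = set()

    def load(self):
        # load of a node = # tuples in its partition
        return len(self.tuples)

def imbalance(nodes):
    # sigma = largest load / smallest load (guard against empty nodes)
    loads = [n.load() for n in nodes]
    return max(loads) / max(1, min(loads))
```

The balancing goal is then simply to keep `imbalance(nodes)` below a constant at all times, while charging only constant amortized data movement per insert/delete.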
Load Balancing Operations (1) • NbrAdjust: Transfer data between "neighbors" [Figure: nodes A and B with ranges [0,50) and [50,100); after NbrAdjust the boundary moves, giving [0,35) and [35,100)]
Is NbrAdjust good enough? • Can be highly inefficient • Ω(n) amortized cost per insert/delete (n = #nodes) [Figure: a chain of nodes A–F where data must cascade through every intermediate node]
Load Balancing Operations (2) • Reorder: Hand over data to a neighbor and split the load of some other node [Figure: nodes A–F with ranges [0,10), [10,20), [20,30), [30,40), [40,50), [50,60); E hands its data to F (now [40,60)) and reinserts itself to split A's [0,10) into [0,5) and [5,10)]
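The effect of the two operations on node loads can be sketched as follows. This is an assumed simplification (loads as a plain list, helper names invented here); the real operations also move the corresponding tuples and update range boundaries:

```python
# Illustrative sketch of the two balancing operations, tracking loads only.

def nbr_adjust(loads, i, amount):
    """NbrAdjust: move `amount` tuples from node i to its right neighbor."""
    loads[i] -= amount
    loads[i + 1] += amount
    return loads

def reorder(loads, i, j):
    """Reorder: node i hands all its data to a neighbor, then reinserts
    itself next to node j and takes over half of j's load."""
    nbr = i + 1 if i + 1 < len(loads) else i - 1   # hand-over target
    loads[nbr] += loads[i]
    half = loads[j] // 2
    loads[i] = loads[j] - half                     # i takes half of j's load
    loads[j] = half
    return loads
```

Note the cost difference: `nbr_adjust` moves `amount` tuples, while `reorder` moves node i's entire load plus half of node j's — but it lets a light node help a heavy one anywhere in the ring, avoiding the Ω(n) cascade.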
Roadmap • Model and Definitions • Load Balancing Operations • The Algorithms • Extension to P2P Setting • Experimental Results
The Doubling Algorithm • Geometrically divide loads into levels • Level i ⇔ load in (2^i, 2^(i+1)] • Try balancing on level change • Two invariants • Neighbors tightly balanced: at most 1 level apart • All nodes within 3 levels • Guarantees σ ≤ 8 [Figure: load scale with thresholds 1, 2, 4, 8, … marking levels 0, 1, 2, …]
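The level of a load under doubling thresholds can be computed exactly with integer arithmetic — a small sketch (the function name is illustrative):

```python
def level(load):
    # Level i  <=>  load in (2^i, 2^(i+1)]; integer-exact via bit_length.
    # Loads <= 1 fall below level 0 and return -1 in this simplification.
    return (load - 1).bit_length() - 1
```

A node triggers a (possible) balancing action exactly when an insert or delete changes the value of `level(load)`.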
The Doubling Algorithm (2) [Figure: animation of nodes A–F; a node whose load crosses a level boundary first tries NbrAdjust with a neighbor]
The Doubling Algorithm: Case 2 • Search for a "blue" (lightly loaded) node • If none, do nothing! [Figure: animation of nodes A–F; the light node hands off its data and moves to split the overloaded node's range]
The Doubling Algorithm (3) • Similar operations when load goes down a level • Try balancing with neighbor • Otherwise, find a red node and reorder yourself • Costs and Guarantees • σ ≤ 8 • Constant amortized cost per insert/delete
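The decision logic when a node's load rises a level can be sketched as below. This is a hedged approximation of the slides' two cases: the `// 4` lightness test is a stand-in for the algorithm's actual level-based thresholds, and loads are a plain list:

```python
# Sketch of the reaction when node i's load rises a level (names assumed).

def on_level_increase(loads, i):
    n = len(loads)
    # Case 1: a light neighbor exists -> NbrAdjust (even out the two loads).
    for j in (i - 1, i + 1):
        if 0 <= j < n and loads[j] <= loads[i] // 4:
            total = loads[i] + loads[j]
            loads[i], loads[j] = total - total // 2, total // 2
            return loads
    # Case 2: find the globally lightest ("blue") node; if light enough,
    # it hands its data to a neighbor and splits node i's load (Reorder).
    j = min(range(n), key=lambda k: loads[k])
    if j != i and loads[j] <= loads[i] // 4:
        nbr = j + 1 if j + 1 < n else j - 1
        loads[nbr] += loads[j]
        half = loads[i] // 2
        loads[j], loads[i] = loads[i] - half, half
    # Otherwise do nothing: no node is light enough to help.
    return loads
```

The symmetric routine for a level decrease would instead look for a heavy ("red") node to split, as the slide above describes.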
From Doubling to Fibbing • Change thresholds to Fibonacci numbers: F_(i+2) = F_(i+1) + F_i • σ ≤ φ³ ≈ 4.2 • Can also use other geometric sequences • Costs are still constant
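Switching to Fibonacci thresholds only changes the level computation — a minimal sketch (function name assumed):

```python
def fib_level(load):
    # Level i now means load in (F_i, F_(i+1)], with thresholds 1, 2, 3, 5, 8, ...
    a, b, i = 1, 2, 0
    while load > b:
        a, b = b, a + b      # F_(i+2) = F_(i+1) + F_i
        i += 1
    return i
```

Because successive thresholds grow by the golden ratio φ ≈ 1.62 rather than 2, adjacent levels are closer together, which is what tightens the guaranteed imbalance from 8 down to about φ³ ≈ 4.2.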
More Generalizations • Improve σ to (1+ε) for any ε > 0 [BG04] • Generalize neighbors to c-neighbors • Still constant cost O(1/ε) • Dealing with concurrent inserts/deletes • Allow multiple balancing actions in parallel • Paper claims it is OK
Application to P2P Systems • Goal: Construct P2P system supporting efficient range queries • Provide asymptotic performance a la DHTs • What is a P2P system? A parallel DB with • Nodes joining and leaving at will • No centralized components • Limited communication primitives • Enhance load-balancing algorithms to • Allow dynamic node joins/leaves • Decentralize implementation
Experiments • Goal: Study cost of balancing for different workloads • Compare to periodic re-balancing algorithms (Paper) • Trade-off between cost and imbalance ratio (Paper) • Results presented on Fibbing Algorithm (n=256) • Three-phase Workload • (1) Inserts (2) Alternating inserts and deletes (3) Deletes • Workload 1: Zipf • Random draws from Zipf-like distribution • Workload 2: HotSpot • Think key=timestamp • Workload 3: ShearStress • Insert at most-loaded, delete from least-loaded
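The adversarial ShearStress workload is easy to state in code — a hypothetical sketch over loads only (the real workload inserts and deletes actual tuples):

```python
# ShearStress: every insert targets the most-loaded node and every delete
# hits the least-loaded one, maximizing pressure on the balancer.

def shear_stress_step(loads, inserting):
    if inserting:
        i = max(range(len(loads)), key=loads.__getitem__)   # most loaded
        loads[i] += 1
    else:
        i = min(range(len(loads)), key=loads.__getitem__)   # least loaded
        loads[i] = max(0, loads[i] - 1)
    return loads
```

Left unbalanced, this workload drives σ toward infinity as fast as possible, which is why it is a useful stress test for the algorithms.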
Load Imbalance (Zipf) [Figure: load imbalance (y-axis, 0–4.5) vs. time ×1000 (x-axis, 0–3000), across the growing, steady, and shrinking phases]
Related Work • Karger & Ruhl [SPAA 04] • Dynamic model, weaker guarantees • Load balancing in DBs • Partitioning static relations, e.g., [GD92, RZML02, SMR00] • Migrating fragments across disks, e.g., [SWZ93] • Intra-node data structures, e.g., [LKOTM00] • Scalable Distributed Data Structures (SDDS), Litwin et al.
Conclusions • Indeed possible to maintain well-balanced range partitions • Range partitions competitive with hashing • Generalize to more complex load functions • Allow tuples to have dynamic weights • Change load definition in the algorithms!* • Range partitioning is powerful • Enables P2P system supporting range queries • Generalizes DHTs with the same asymptotic guarantees *Lots of caveats apply. Need load to be evenly divisible. No guarantees offered on costs. This offer not valid with any other offers. Etc., etc.