Distributed Load Balancing for Key-Value Storage Systems

Imranul Hoque, Michael Spreitzer, Malgorzata Steinder

Key-Value Storage Systems
• Usage:
  • Session state, tags, comments, etc.
• Requirements:
  • Scalability
  • Fast response time
  • High availability & fault tolerance
  • Relaxed consistency guarantee
• Examples: Cassandra, Dynamo, PNUTS, etc.

Load Balancing in K-V Storage
• Hash partitioned vs. range partitioned
  • Range-partitioned data ensures efficient range scan/search
  • Hash-partitioned data helps even distribution
[Figure: a table keyed by day of week (SUN through SAT) is split into tablets, and the tablets are spread across Servers 1-4]
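The trade-off between the two schemes can be sketched as follows. This is a minimal illustration, not the placement logic of any system named in the slides: the four servers and day-of-week keys follow the figure, while the split points `SPLITS` and all function names are hypothetical.

```python
import bisect
import hashlib

SERVERS = ["Server 1", "Server 2", "Server 3", "Server 4"]

def hash_partition(key: str) -> str:
    """Place a key by a stable hash: spreads keys out, but loses key order."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]

# Hypothetical split points: keys below "G" go to Server 1,
# keys in ["G", "N") to Server 2, ["N", "T") to Server 3, the rest to Server 4.
SPLITS = ["G", "N", "T"]

def range_partition(key: str) -> str:
    """Place a key by sorted range: neighbouring keys share a server."""
    return SERVERS[bisect.bisect_right(SPLITS, key)]

def range_scan_servers(lo: str, hi: str) -> list[str]:
    """Servers a scan over [lo, hi] has to contact under range partitioning."""
    first = bisect.bisect_right(SPLITS, lo)
    last = bisect.bisect_right(SPLITS, hi)
    return SERVERS[first:last + 1]
```

With these split points, a scan from "MON" to "SUN" touches only two contiguous servers, whereas under `hash_partition` the same keys may land on any server, so a scan would have to ask all of them. The price is that range partitioning can leave one server with far more data than another, as with the three keys starting with "T" or "W" here.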

Issues with Load Balancing
• Uneven space distribution due to range partitioning
  • Solution: partition the tablets and move them around
• A small number of very popular records
[Figure: the day-of-week tablets placed on Servers 1-4]

Contribution
• Algorithms for solving the load balancing problem
  • Load = space, bandwidth
  • Evenly distribute the spare capacity
  • Distributed algorithm, not a centralized one
  • Reduce the number of moves
• Previous solutions:
  • One-dimensional / key-space redistribution / bulk loading

Outline
• Motivation
• System modeling and assumptions
• Algorithms
  • One-to-one
  • One-to-n
  • Move suppression
• Design decisions
• Experimental results
  • Emulation of proposed distributed algorithms
• Future work

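System Modeling and Assumptions
[Figure: a table is split into tablets, each labeled with a load pair (B1, S1) through (B5, S5); the tablets are hosted on Servers A, B, and C, whose loads are labeled (BA, SA), (BB, SB), and (BC, SC)]
Assumptions:
1. … <= 0.01 in both dimensions
2. # of tablets >> # of nodes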

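System State
[Figure: servers plotted as points on a plane with axes B and S, together with the target point and the target zone around it]
• Target zone: helps achieve convergence
• Goal: move tablets around so that every server is within the target zone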
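The convergence test implied by this slide can be sketched in a few lines. The sketch assumes that the target point is the mean load over all servers and that the target zone is an axis-aligned square of half-width `width` around it; the slide does not give the zone's shape, so both choices, like the names `Load`, `target_point`, and `in_target_zone`, are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Load:
    bandwidth: float  # B axis
    space: float      # S axis

def target_point(loads):
    """Assumed target: the mean load across all servers."""
    n = len(loads)
    return Load(sum(l.bandwidth for l in loads) / n,
                sum(l.space for l in loads) / n)

def in_target_zone(load, target, width):
    """Assumed zone: a square of half-width `width` centred on the target."""
    return (abs(load.bandwidth - target.bandwidth) <= width
            and abs(load.space - target.space) <= width)

def converged(loads, width):
    """The stated goal: every server lies within the target zone."""
    target = target_point(loads)
    return all(in_target_zone(l, target, width) for l in loads)
```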

Load Balancing Algorithms
• Phase 1:
  • Global averaging scheme
  • Variance of the approximation of the average decreases exponentially fast
• Phase 2:
  • One-to-one gossip
  • One-to-n gossip
  • Move suppression
[Figure: timeline along t showing two Phase 1 intervals and two Phase 2 intervals]
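Phase 1 can be illustrated with a standard pairwise gossip-averaging loop. The slides do not say which averaging scheme is used, so this is a sketch of one common choice: in each round every node picks a random peer and both replace their estimates with the pair's mean. The function name and the `seed` parameter are hypothetical.

```python
import random

def gossip_average(values, rounds, seed=0):
    """Return each node's estimate of the global mean after `rounds` rounds.

    Needs at least two nodes. Each exchange keeps the sum of the estimates
    unchanged, so the estimates can only converge to the true mean.
    """
    rng = random.Random(seed)
    est = list(values)
    n = len(est)
    for _ in range(rounds):
        for i in range(n):
            # Pick a peer j != i uniformly at random.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            mean = (est[i] + est[j]) / 2
            est[i] = est[j] = mean
    return est
```

Each exchange removes part of the spread between the estimates, which is why the variance of the approximation shrinks geometrically with the number of rounds.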

One-to-One Gossip
• Point selection strategy
  • Midpoint strategy
  • Greedy strategy
• Tablet transfer strategy
  • Move to the selected point with minimum cost (space transferred)

Tablet Transfer Strategy
[Figure: Server 1, Server 2, and the target for Server 1 plotted on the B-S plane]

Tablet Transfer Strategy (2)
• Start with an empty bag
• Goal: take vectors from the servers so that they add up to the target vector
• If slope(bag + left + right) < slope(target):
  • Add right to bag, move right
• Otherwise:
  • Add left to bag, move left
[Figure: Server 1 with Left and Right markers]
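The bag-filling rule above leaves several details open, so the following is only one possible reading. It assumes that each tablet is a (bandwidth, space) vector, that tablets are sorted by slope (space / bandwidth) with `left` at the shallow end and `right` at the steep end, and that filling stops once the bag reaches the target in either dimension. The stopping rule and the names `slope_less` and `pick_tablets` are not from the slides.

```python
import math

def slope_less(v, w):
    """True if slope(v) < slope(w), where slope = space / bandwidth.

    Cross-multiplied to avoid dividing by a zero bandwidth.
    """
    return v[1] * w[0] < w[1] * v[0]

def pick_tablets(tablets, target):
    """Choose tablets whose vectors roughly add up to `target`.

    tablets: list of (bandwidth, space) pairs held by the sending server.
    target:  (bandwidth, space) amount the server wants to shed.
    """
    # Assumption: shallowest tablet on the left, steepest on the right.
    order = sorted(tablets, key=lambda t: math.atan2(t[1], t[0]))
    left, right = 0, len(order) - 1
    bag, bag_b, bag_s = [], 0.0, 0.0
    # Assumed stopping rule: stop once either dimension reaches the target.
    while left <= right and bag_b < target[0] and bag_s < target[1]:
        l, r = order[left], order[right]
        probe = (bag_b + l[0] + r[0], bag_s + l[1] + r[1])
        if slope_less(probe, target):
            pick = r          # bag would be too shallow: take the steep tablet
            right -= 1
        else:
            pick = l          # otherwise take the shallow tablet
            left += 1
        bag.append(pick)
        bag_b += pick[0]
        bag_s += pick[1]
    return bag, (bag_b, bag_s)
```

For example, with tablets (4, 1), (3, 1), (2, 2), (1, 3), (1, 4) and target (5, 5), the loop takes (4, 1) from the left, then (1, 4) from the right, and stops with a bag that sums exactly to the target.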

Initial Configurations
[Figure panels: Uniform, Two Extreme, Mid Quadrant]

Point Selection Strategy
• Midpoint strategy
  + Guaranteed convergence
  + No need to run Phase 1
  - Lots of extra movement
• Visualization demo
  • Uniform
  • Two extreme
  • Mid quadrant
[Figure: Server 1 and Server 2 plotted on the B-S plane]
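As a small illustration of the midpoint strategy, assume that the two gossiping servers both aim for the midpoint of their current (bandwidth, space) loads. The function name is hypothetical and the returned transfer vector is an idealized amount, before it is approximated with whole tablets.

```python
def midpoint_exchange(load1, load2):
    """Return the shared target point and the load server 1 should shed.

    A negative component in the transfer vector means that server 2
    sends that much load to server 1 instead.
    """
    mid = ((load1[0] + load2[0]) / 2, (load1[1] + load2[1]) / 2)
    transfer = (load1[0] - mid[0], load1[1] - mid[1])
    return mid, transfer
```

The target depends only on the two participants, which is consistent with the slide's note that Phase 1 is not needed. It also hints at the extra movement: two servers that are both far from the global average still exchange load even when that brings neither of them much closer to it.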

Point Selection Strategy (2)
• Greedy strategy
  • Take the point closer to the target
  • Move it to the target if doing so
    • improves the position of the other point, or
    • does not worsen it by more than δ
• Reduces movement
• Takes a long time to converge in some cases
[Figure: Server 1 and Server 2]
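The greedy rule can be sketched as follows, assuming Euclidean distance on the (bandwidth, space) plane and treating δ as an additive tolerance on the other server's distance to the target. Both are assumptions, as is the name `greedy_step`. Load is conserved: whatever the nearer server sheds to reach the target is absorbed by the other one.

```python
import math

def greedy_step(p1, p2, target, delta):
    """Try one greedy exchange between two servers.

    Returns the new (p1, p2) loads, or None if the exchange would push the
    farther server more than `delta` further away from the target.
    """
    d1, d2 = math.dist(p1, target), math.dist(p2, target)
    near, far = (p1, p2) if d1 <= d2 else (p2, p1)
    # Moving `near` onto the target shifts the difference onto `far`.
    shift = (near[0] - target[0], near[1] - target[1])
    new_far = (far[0] + shift[0], far[1] + shift[1])
    if math.dist(new_far, target) > math.dist(far, target) + delta:
        return None  # rejected: the other point worsens by more than delta
    return (target, new_far) if d1 <= d2 else (new_far, target)
```

A rejected exchange moves no data, which matches the slide's point that the greedy strategy reduces movement but can take a long time to converge when many exchanges are rejected.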

DHT-based Location Directory

DHT + Midpoint
• Greedy + fallback to DHT:
  • Convergence problem exists for some configurations
  • Visualization demo
• Solution:
  • Greedy + fallback to DHT with Midpoint
  • Demo: uniform, two extreme, mid quadrant
• Alternate approach:
  • Greedy + fallback to Midpoint
  • Trade-off: movement cost vs. DHT overhead

Experimental Evaluation
• Uniform configuration
  • Greedy + DHT (Midpoint)
  • Midpoint
  • Greedy + Midpoint (no DHT)
• Effect of varying target zone
• Effect of failed gossip count
• Metrics
  • Amount of space moved
  • # of gossip rounds
  • Multiple tablet move

Uniform Configuration: Results

Effect of Varying Target Zone
• Larger target zone = faster convergence, less accuracy
• Target zone width should depend on the target point value

Effect of Failed Gossip Count (Greedy)
• Large failed gossip count = more time in greedy mode, more unproductive gossip at the end

One-to-N Gossip
• Contact a few random nodes
  • Locked/unlocked mode
• Pick the most profitable one
  • Distance from the target is minimized
• Advantage
  • Better choices
• Initial results
  • Locked mode: may lead to deadlock
  • Unlocked mode: in most cases the other nodes start a transfer
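Partner selection in one-to-n gossip might look like the sketch below. It assumes that profitability is judged by how close a midpoint exchange with each candidate would land to the target; the slide only says that the distance from the target is minimized, so the scoring rule, the `peers` mapping, and the function name are all hypothetical. Locking is not modeled.

```python
import math
import random

def pick_partner(me, peers, target, k, rng):
    """Sample up to k peers and return the name of the most profitable one.

    me:     (bandwidth, space) load of the initiating server.
    peers:  dict mapping peer name -> (bandwidth, space) load.
    target: (bandwidth, space) target point.
    rng:    a random.Random instance used for sampling.
    """
    candidates = rng.sample(sorted(peers), min(k, len(peers)))

    def distance_after(name):
        # Assumed scoring: distance of the pair's midpoint from the target.
        peer = peers[name]
        mid = ((me[0] + peer[0]) / 2, (me[1] + peer[1]) / 2)
        return math.dist(mid, target)

    return min(candidates, key=distance_after)
```

Usage: `pick_partner((0.75, 0.25), peers, (0.5, 0.5), k=3, rng=random.Random(1))` scores three sampled peers and returns the one whose exchange would land closest to the target. In unlocked mode the chosen peer may already be busy with another transfer by the time it is contacted.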

Move Suppression
• Two global stages
  • Stage 1: one-to-one gossip, but moves are hypothetical
  • Stage 2: change to the chosen placement
• Advantage
  • No tablet is moved multiple times
• Challenges
  • When to switch from Stage 1 to Stage 2
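The two stages can be sketched as bookkeeping over a placement map. This is a simplified, centralized illustration of the idea, with hypothetical names; it sidesteps the open question of when to switch stages by taking the full list of Stage 1 moves as input.

```python
def plan_moves(initial, hypothetical_moves):
    """Collapse a sequence of hypothetical moves into at most one real move
    per tablet.

    initial:            dict mapping tablet -> server before Stage 1.
    hypothetical_moves: list of (tablet, destination) pairs agreed on
                        during Stage 1 gossip, in order.
    Returns a dict mapping tablet -> (source, destination) for Stage 2.
    """
    placement = dict(initial)
    for tablet, dest in hypothetical_moves:
        placement[tablet] = dest  # Stage 1: only the plan changes
    # Stage 2: move each tablet straight from its original server to its
    # final one; tablets that ended up where they started are not moved.
    return {t: (initial[t], dest)
            for t, dest in placement.items()
            if dest != initial[t]}
```

For instance, a tablet that is hypothetically moved from A to B and then from B to C is transferred once, from A to C, and a tablet that returns to its original server is not transferred at all.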

Future Work
• Handling initial placement
• Frequency of running the placement algorithm
• Considering the network hierarchy
• Handling failures
• Extending to heterogeneous resources

Questions?
