
Load Balancing & Update Propagation in Tashkent+


Presentation Transcript


  1. Load Balancing & Update Propagation in Tashkent+ • Sameh Elnikety, Steven Dropsho, Willy Zwaenepoel • EPFL

  2. Load Balancing • Traditional: • Optimize for equal load on replicas • MALB (Memory-Aware Load Balancing): • Optimize for in-memory execution

  3. MALB Doubles Throughput • [Bar chart: TPC-W Ordering on 16 replicas; throughput (TPS) relative to a single replica: traditional load balancing reaches 12X, MALB reaches 25X; scale marked up to 37X]

  4. Update Propagation • Traditional: • Propagate updates everywhere • Update Filtering: • Propagate updates to where they are needed

  5. MALB + UF Triples Throughput • [Bar chart: same TPC-W Ordering setup on 16 replicas; traditional 12X, MALB 25X, MALB+UF 37X relative to a single replica]

  6. Contributions • MALB (Memory-Aware Load Balancing): • Optimize for in-memory execution • Update filtering: • Propagate updates to where they are needed • Implementation: • Tashkent+ middleware • Experimental evaluation: • Big performance gain

  7. Background: DB Replication • [Diagram: a load balancer in front of Replica 1, Replica 2, and Replica 3, each running a single DBMS]

  8. Read Tx • [Diagram: the load balancer routes a read transaction T to a single replica, here Replica 2]

  9. Update Tx • [Diagram: an update transaction T executes at one replica; its writeset (ws) is then propagated to the other replicas]

  10. Traditional Load Balancing • [Diagram: the load balancer spreads work for equal load on Replicas 1 to 3] • MALB (Memory-Aware Load Balancing) instead optimizes for in-memory execution

  11. How Does MALB Work? • [Diagram: a database with tables 1, 2, 3; the workload has tx type A accessing tables 1 and 2, and tx type B accessing tables 2 and 3; a replica's memory is smaller than the whole database]

  12. Read Data From Disk • [Diagram: with Least Loaded balancing, both replicas receive the mixed stream A, B, A, B; the combined working set (tables 1, 2, 3) exceeds each replica's memory, so data is read from the slow disk]

  13. Data Fits in Memory • [Diagram: with MALB, Replica 1 receives only A, A, A, A and holds tables 1 and 2 in memory; Replica 2 receives only B, B, B, B and holds tables 2 and 3; disks stay fast] • But where does memory info come from, and what about many tx types and replicas?

  14. Estimate Tx Memory Needs • Exploit tx execution plan • Which tables & indices are accessed • Their access pattern • Linear scan, direct access • Metadata from database • Sizes of tables and indices
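A minimal sketch of such an estimate, assuming a hypothetical plan representation and catalog; the object names, sizes, and charging rule below are illustrative, not Tashkent+'s actual data structures:

    # Sketch: estimate a tx type's memory needs from the tables and
    # indices its execution plan touches, using catalog sizes.
    # All names and sizes are hypothetical.
    catalog_mb = {"address": 300, "address_pkey": 40,
                  "orders": 900, "orders_pkey": 120}  # object -> size (MB)

    # Per tx type: (object, access pattern) pairs from the plan.
    plans = {"A": [("address", "direct"), ("address_pkey", "direct")],
             "B": [("orders", "scan"), ("orders_pkey", "direct")]}

    def memory_needs_mb(tx_type):
        # Charge the full size of every object the plan touches: a
        # linear scan needs it all, and for direct (index) access
        # over-estimating is the conservative, safe choice.
        return sum(catalog_mb[obj] for obj, _pattern in plans[tx_type])

    print(memory_needs_mb("A"))  # 340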

  15. Grouping Transactions • Objective: • Construct tx groups that fit together in memory • Bin packing: • Item: tx memory needs • Bin: memory of replica • Heuristic: Best Fit Decreasing • Allocate replicas to tx groups • Adjust for group loads
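A minimal sketch of the Best Fit Decreasing step, with illustrative sizes in MB and one replica's memory as the bin capacity:

    # Sketch: pack tx types into replica-memory-sized bins with the
    # Best Fit Decreasing heuristic. Sizes are illustrative.
    def best_fit_decreasing(needs_mb, bin_capacity_mb):
        # needs_mb: {tx type: memory needs}. Returns tx groups, each
        # of which fits together in one replica's memory.
        bins = []  # each bin: [remaining capacity, [tx types]]
        for tx, size in sorted(needs_mb.items(), key=lambda kv: -kv[1]):
            fitting = [b for b in bins if b[0] >= size]
            if fitting:
                # Best fit: the bin that leaves the least room afterwards.
                best = min(fitting, key=lambda b: b[0] - size)
                best[0] -= size
                best[1].append(tx)
            else:
                bins.append([bin_capacity_mb - size, [tx]])  # open a new bin
        return [group for _rem, group in bins]

    needs = {"A": 600, "B": 300, "C": 250, "D": 500, "E": 150, "F": 100}
    print(best_fit_decreasing(needs, 1000))
    # [['A', 'B', 'F'], ['D', 'C', 'E']]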

  16. MALB in Action • [Diagram: from the memory needs for tx types A, B, C, D, E, F, MALB forms three groups, {A}, {B, C}, and {D, E, F}, and allocates each group a replica whose memory fits its working set]

  17. MALB Summary • Objective: • Optimize for in-memory execution • Method: • Estimate tx memory needs • Construct tx groups • Allocate replicas to tx groups

  18. Update Propagation • Traditional: • Propagate updates everywhere • Update Filtering: • Propagate updates to where they are needed

  19. Update Filtering Example • [Diagram: as before, Replica 1 (Group A) serves tx type A and holds tables 1 and 2 in memory; Replica 2 (Group B) serves tx type B and holds tables 2 and 3]

  20. Update Filtering Example • [Diagram, continued: an update to table 1 is needed only at Replica 1; an update to table 3 is needed only at Replica 2]

  21. Update Filtering in Action • [Animation: with UF, an update to the red table is propagated only to the replicas that host it, and likewise for an update to the green table]
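A minimal sketch of the filtering decision, assuming the system keeps a (hypothetical) map from each replica to the tables its tx group touches:

    # Sketch: forward a committed writeset only to the replicas that
    # host one of the tables it modifies. Mappings are illustrative.
    hosted_tables = {"replica1": {"t1", "t2"},  # group A uses tables 1, 2
                     "replica2": {"t2", "t3"}}  # group B uses tables 2, 3

    def destinations(writeset_tables, origin):
        # Replicas other than the origin whose hosted tables overlap
        # the writeset must apply it; the rest never see it.
        return [r for r, tables in hosted_tables.items()
                if r != origin and tables & writeset_tables]

    print(destinations({"t1"}, "replica2"))  # ['replica1']
    print(destinations({"t3"}, "replica1"))  # ['replica2']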

  22. Tashkent+ Implementation • Middleware: • Based on Tashkent [EuroSys 2006] • Consistency: • No change in consistency • Our workloads run serializably • GSI (Generalized Snapshot Isolation)

  23. Experimental Evaluation • Compare: • LeastLoaded: optimize for equal load • MALB: optimize for in-memory execution • MALB+UF: propagate updates to where needed • Environment: • Linux cluster running PostgreSQL • Workload: TPC-W Ordering (50% update txs)

  24. MALB Doubles Throughput • [Charts: TPC-W Ordering on 16 replicas; MALB improves throughput (TPS) by 105% over LeastLoaded, roughly 12X to 25X on the single-replica scale; a second panel shows read I/O, normalized]

  25. Big Gains with MALB • Throughput gain of MALB, by database size (rows) and replica memory size (columns):

               Mem Small   Mem Mid   Mem Big
    Big DB        12%        75%      182%
    Mid DB       105%        45%       48%
    Small DB      29%         4%        0%

  • Top-left region: the workload runs from disk even with MALB (small gain) • Bottom-right region: the workload already runs from memory (no gain)

  26. MALB + UF Triples Throughput • [Charts: TPC-W Ordering on 16 replicas; MALB+UF reaches 37X vs. MALB's 25X, a 49% gain; propagated updates drop from 15 to 7]

  27. Filtering Opportunities • [Chart: throughput ratio MALB+UF / MALB for the Browsing mix (5% updates) and the Ordering mix (50% updates)]

  28. Check out the Paper • MALB is better than LARD (we explain why) • Methods for grouping transactions • Be conservative when estimating memory • Dynamic reallocation for MALB • Adjust to workload changes • More workloads • TPC-W browsing & shopping • RUBiS bidding

  29. Conclusions • MALB (Memory-Aware Load Balancing): • Optimize for in-memory execution • Update filtering: • Propagate updates to where they are needed • Implementation: • Tashkent+ middleware • Experimental evaluation: • Big performance gain, achievable in many cases

  30. Thank You • Questions? • MALB (Memory-Aware Load Balancing): • Optimize for in-memory execution • Update filtering: • Propagate updates to where they are needed

  31. Example • Transaction type A: SELECT item-name FROM shopping-cart WHERE user = ? • Instances: A1, A2, A3

  32. Transaction Types A: SELECT * FROM address WHERE id=$1 B: INSERT INTO course VALUES ($1, $2) C: UPDATE emp SET sal=$1 WHERE id=$2 D: DELETE FROM bid WHERE id=$1

  33. Transaction Types A: SELECT * FROM address WHERE id=$1 B: INSERT INTO course VALUES ($1, $2) C: UPDATE emp SET sal=$1 WHERE id=$2 D: DELETE FROM bid WHERE id=$1 • A1: SELECT * FROM address WHERE id=10 • A2: SELECT * FROM address WHERE id=20 • A3: SELECT * FROM address WHERE id=76 • A4: SELECT * FROM address WHERE id=78 • A5: SELECT * FROM address WHERE id=97
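One way to collapse such instances into their type is to strip the literal parameter values, since instances differ only in those; a regex-based sketch, not necessarily Tashkent+'s actual mechanism:

    # Sketch: replacing numeric literals recovers the shared template,
    # i.e. the transaction type, from any of its instances.
    import re

    def tx_type(sql):
        return re.sub(r"\b\d+\b", "$?", sql)

    assert tx_type("SELECT * FROM address WHERE id=10") == \
           tx_type("SELECT * FROM address WHERE id=97")
    print(tx_type("SELECT * FROM address WHERE id=10"))
    # SELECT * FROM address WHERE id=$?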

  34. MALB Grouping • MALB-S • Size • MALB-SC • Size and contents • No double counting of shared tables & indices • MALB-SCAP • Size, contents, and access pattern • Overflow tx types • Each tx type in its own group

  35. Tx Grouping Alternatives • [Chart comparing the grouping alternatives MALB-S, MALB-SC, and MALB-SCAP]

  36. MALB Allocation – Load • Collect CPU and disk utilizations • Maintain smoothed averages • Group load: average across replicas • Example • 3 replicas • Data ( CPU, disk ) = (45, 10), (53, 9), (40, 8) • Group load = (46, 9) • Comparing loads: max( CPU, disk ) • Example • Max(46, 9) = 46%
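The bookkeeping in a minimal sketch; the smoothing factor is an illustrative assumption, and the slide's example values are reproduced:

    # Sketch: smoothed per-replica utilizations, averaged per group;
    # a group's load is the max of its CPU and disk components.
    ALPHA = 0.8  # smoothing weight on the old average (illustrative)

    def smooth(old_avg, sample):
        # Exponentially weighted moving average of a utilization sample.
        return ALPHA * old_avg + (1 - ALPHA) * sample

    def group_load(replica_utils):
        # replica_utils: list of (cpu%, disk%) pairs, one per replica.
        cpu = sum(c for c, _ in replica_utils) / len(replica_utils)
        disk = sum(d for _, d in replica_utils) / len(replica_utils)
        return max(cpu, disk)

    print(group_load([(45, 10), (53, 9), (40, 8)]))  # max(46, 9) = 46.0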

  37. MALB Allocation – Basic • Move one replica at a time • From least loaded group • To most loaded group • Future load estimates • 2 replicas, 20% load -> 1 replica, 40% • 6 replicas, 25% load -> 5 replicas, 30% • Hysteresis • Prevent oscillations • Trigger re-allocation if load difference > 1.25x
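One step of this basic policy in a minimal sketch; the group representation is hypothetical, and the 1.25x hysteresis threshold is from the slide:

    # Sketch: move one replica from the least to the most loaded group,
    # unless hysteresis says the imbalance is too small to act on.
    def reallocate_step(groups):
        # groups: list of (name, replica count, load%).
        least = min(groups, key=lambda g: g[2])
        most = max(groups, key=lambda g: g[2])
        if most[2] < 1.25 * least[2] or least[1] <= 1:
            return None  # within the hysteresis band, or nothing to give
        # Future load estimate: total work spread over the new count,
        # e.g. 2 replicas at 20% -> 1 replica at 40%.
        donor_after = least[2] * least[1] / (least[1] - 1)
        receiver_after = most[2] * most[1] / (most[1] + 1)
        return (least[0], donor_after), (most[0], receiver_after)

    print(reallocate_step([("M", 3, 70), ("N", 7, 10)]))
    # N donates one replica: N rises to ~11.7%, M drops to 52.5%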

  38. MALB Allocation – Fast • Quickly snap to a good configuration • Solve set of simultaneous equations • Example • Current configuration • Group M, 3 replicas, 70% • Group N, 7 replicas, 10% • Equations • Mrep + Nrep = 10 • (70% * 3) / Mrep = (10% * 7) / Nrep • Fine tune with additional adjustments
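Solving the slide's equations by hand: 210/Mrep = 70/Nrep gives Mrep = 3 * Nrep, and with Mrep + Nrep = 10 that yields Nrep = 2.5 and Mrep = 7.5, which must be rounded to an integer split. A minimal sketch of the same computation:

    # Sketch: the fast policy sizes each group in proportion to its
    # total work (load% * replicas), then rounds to whole replicas.
    def fast_allocation(total_replicas, m_reps, m_load, n_reps, n_load):
        m_work = m_load * m_reps  # 70% * 3 = 210
        n_work = n_load * n_reps  # 10% * 7 = 70
        m_new = round(total_replicas * m_work / (m_work + n_work))
        return m_new, total_replicas - m_new

    print(fast_allocation(10, 3, 70, 7, 10))  # (8, 2): 7.5 rounds to 8
    # Fine tuning then adjusts for the rounding, as the slide notes.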

  39. Configuration of MALB-SC

  40. Disk IO • [Chart: average disk I/O per transaction]

  41. MALB & Update Filtering • [Chart: TPC-W Ordering throughput for LeastCon, MALB, and MALB+UF on MidDB (1.8GB) with different memory sizes]

  42. MALB & Update Filtering • [Charts: the same comparison for SmallDB (0.7GB) and BigDB (2.9GB)]

  43. Dynamic Reallocation

  44. Related Work • Load balancing • LARD, HACC, FLEX • Workload of small static files • Do not estimate working set • Replicated databases • Front-end load balancer • Conflict-aware scheduling [Pedone06] • Reduce conflict and therefore abort rate • Conflict-aware scheduling [Amza03] • Correctness • Partial replication

  45. Related Work • Replicated database systems • J. Gray et al., The Dangers of Replication and a Solution, SIGMOD'96 • C. Amza, Consistent Replication for Scaling Back-end Databases for Dynamic Content Servers, Middleware'03 • C. Plattner et al., Ganymed: Scalable Replication for Transactional Web Applications, Middleware'04 • B. Kemme et al., Postgres-R(SI): Middleware-based Data Replication Providing Snapshot Isolation, SIGMOD'05 • K. Daudjee and K. Salem, Lazy Database Replication with Snapshot Isolation, VLDB'06 • Networked server systems • V. Pai et al., Locality-Aware Request Distribution in Cluster-based Network Servers, ASPLOS'98
