1 / 43

Andy Pavlo

Magical Parallel. OLTP Databases. Andy Pavlo. November 10 th , 2011 – MIT CSAIL. Databases?. Evan Jones?. Lebron is going to Miami!. The McRib will be back!. Michael Jackson is in trouble!. On-Line. Transaction. Processing. Fast. Repetitive. Small. H -Store. H -Store Partitioning.

butch
Download Presentation

Andy Pavlo

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Magical Parallel OLTP Databases Andy Pavlo November 10th, 2011 – MIT CSAIL

  2. Databases? Evan Jones?

  3. Lebron is going to Miami! The McRib will be back! Michael Jackson is in trouble!

  4. On-Line Transaction Processing

  5. Fast Repetitive Small

  6. H-Store

  7. H-Store Partitioning P1 P2 P3 P4 P5 WAREHOUSE P1 P2 P3 P4 P5 P1 P2 P3 P4 P5 ITEM ITEM DISTRICT STOCK P3 P2 P1 P4 P5 P1 P2 P3 P4 P5 CUSTOMER ITEM ITEM P1 P2 P3 P4 P5 ORDERS ITEM ITEM ITEM ITEM ITEM ITEMj Replicated P1 P2 P3 P4 P5 ORDER_ITEM ITEM Tables Partitions

  8. H-Store Cluster Procedure Name Input Parameters Client Application

  9. This transaction will execute 4 queries on partitions 1,3, and 6!

  10. Client Application Optimization #1 P1 P2 P3 P4

  11. Client Application Optimization #2 P1 P2 P3 P4 Client Application

  12. Client Application Optimization #2 P1 P2 P3 P4 Client Application

  13. Optimization #3 InsertOrder InsertOrderLineUpdateStock InsertOrderLineUpdateStock STOP LOGGING

  14. Optimization #4

  15. Why this Matters Throughput(txn/s)

  16. Pro Tip: Canadians do not like unnecessary surgeries.

  17. On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systemsin PVLDB, vol 5. issue 2,October 2011

  18. Client Application Houdini

  19. Main Idea: Use models to predict before execution.

  20. Step #1: Estimate the path that a transaction will take

  21. Step #2: Determine which optimizations to enable.

  22. Current State: Input Parameters: w_id=0 i_w_id=[0,1] i_ids=[1001,1002] ? ? GetWarehouse: SELECT * FROM WAREHOUSEWHERE W_ID = ?

  23. Input Parameters: +1 w_id=0 i_w_id=[0,1] i_ids=[1001,1002] +1 Transaction Estimate: +1 +1 +1 +1

  24. Limitations: Long/wide models. Keeping models in synch. Incorrect predictions.

  25. Current State: Input Parameters: w_id=0 i_w_id=[0,1] i_ids=[1001,1002] CheckStock: ? SELECT S_QTY FROM STOCKWHERE S_W_ID = ?AND S_I_ID = ?; ? InsertOrder: X INSERTINTO ORDERS (o_id, o_w_id) VALUES (?, ?);

  26. Special Guest: Evan

  27. Refinement: Partition models based on input properties.

  28. w_id=0 i_w_id=[0,1] i_ids=[1001,1002]

  29. Input Parameters: w_id=0 i_w_id=[0,1] i_ids=[1001,1002] CheckStock: SELECT S_QTY FROM STOCKWHERE S_W_ID = ?AND S_I_ID = ? ? ?

  30. Houdini: Execute txn at the best partition. Only lock the partitions needed. Disable undo logging if not needed. Speculatively commit transactions.

  31. Houdini: Estimate initial path. Update as transaction executes. Recompute if workload changes. Partition for better accuracy.

  32. Experimental Evaluation

  33. Model Accuracy TATP TPC-C AuctionMark 94.9% 95.0% 90.2%

  34. Estimation Overhead TATP TPC-C AuctionMark

  35. Throughput (txn/s) TATP TPC-C AuctionMark +57% +126% +117%

  36. Conclusion: Small overhead cost improves throughput.

  37. November 9, 2011

  38. h-store hstore.cs.brown.edu Twitter: @andy_pavlo

  39. Future work

More Related