Magical Parallel OLTP Databases Andy Pavlo November 10th, 2011 – MIT CSAIL
Databases? Evan Jones?
Lebron is going to Miami! The McRib will be back! Michael Jackson is in trouble!
On-Line Transaction Processing
Fast Repetitive Small
H-Store Partitioning [Figure: the tables WAREHOUSE, DISTRICT, STOCK, CUSTOMER, ORDERS, and ORDER_ITEM are each split across partitions P1 to P5; the ITEM table is replicated on every partition.]
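A minimal sketch of the partitioning idea on this slide, not H-Store's actual code. It assumes tables are partitioned on a warehouse-id column using a simple modulo scheme; the column mapping, the modulo rule, and the function name `partitions_for` are all hypothetical and chosen only for illustration.

```python
NUM_PARTITIONS = 5  # P1..P5, as drawn on the slide

# Hypothetical mapping: table -> partitioning column (None means replicated).
PARTITION_COLUMN = {
    "WAREHOUSE": "W_ID",
    "STOCK": "S_W_ID",
    "ORDERS": "O_W_ID",
    "ITEM": None,  # replicated on every partition
}


def partitions_for(table, row):
    """Return the 1-based partition numbers that hold (or must receive) this row."""
    column = PARTITION_COLUMN[table]
    if column is None:
        # A replicated table lives on every partition.
        return list(range(1, NUM_PARTITIONS + 1))
    # Assumed scheme: warehouse id modulo the partition count.
    return [row[column] % NUM_PARTITIONS + 1]
```

With this scheme every row that shares a warehouse id lands on the same partition, so a transaction that touches one warehouse can run on a single partition, while reads of ITEM never need to leave it.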
H-Store Cluster Procedure Name Input Parameters Client Application
This transaction will execute 4 queries on partitions 1, 3, and 6!
Client Application Optimization #1 P1 P2 P3 P4
Client Application Optimization #2 P1 P2 P3 P4 Client Application
Optimization #3 InsertOrder InsertOrderLine UpdateStock InsertOrderLine UpdateStock STOP LOGGING
Why this Matters Throughput(txn/s)
Pro Tip: Canadians do not like unnecessary surgeries.
On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems, in PVLDB, vol. 5, issue 2, October 2011
Client Application Houdini
Main Idea: Use models to predict before execution.
Step #1: Estimate the path that a transaction will take
Step #2: Determine which optimizations to enable.
Current State: Input Parameters: w_id=0 i_w_id=[0,1] i_ids=[1001,1002] GetWarehouse: SELECT * FROM WAREHOUSE WHERE W_ID = ?
Input Parameters: +1 w_id=0 i_w_id=[0,1] i_ids=[1001,1002] +1 Transaction Estimate: +1 +1 +1 +1
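One way to picture Steps #1 and #2 is a transition-count model: every finished transaction adds one to each edge it followed, and the estimate is the walk that always takes the most frequently seen edge. The sketch below is an illustration under that assumption, not Houdini's implementation; the class name `PathModel`, the (query name, partition) state encoding, and the greedy walk are hypothetical.

```python
from collections import defaultdict

BEGIN, COMMIT, ABORT = "BEGIN", "COMMIT", "ABORT"


class PathModel:
    """Counts state-to-state transitions seen in past runs of one procedure."""

    def __init__(self):
        # counts[src][dst] = number of times dst directly followed src
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, path, outcome=COMMIT):
        """Record one finished transaction; path is a list of (query, partition)."""
        states = [BEGIN, *path, outcome]
        for src, dst in zip(states, states[1:]):
            self.counts[src][dst] += 1

    def estimate(self, max_steps=100):
        """Greedily follow the most frequent edge; returns (path, outcome).

        outcome is None when the model has no data or max_steps is reached.
        """
        state, path = BEGIN, []
        for _ in range(max_steps):
            successors = self.counts.get(state)
            if not successors:
                return path, None
            state = max(successors, key=successors.get)
            if state in (COMMIT, ABORT):
                return path, state
            path.append(state)
        return path, None
```

The `max_steps` bound is there because a state that most often leads back to itself would otherwise make the greedy walk loop forever.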
Limitations: Long/wide models. Keeping models in synch. Incorrect predictions.
Current State: Input Parameters: w_id=0 i_w_id=[0,1] i_ids=[1001,1002] CheckStock: SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?; InsertOrder: INSERT INTO ORDERS (o_id, o_w_id) VALUES (?, ?);
Refinement: Partition models based on input properties.
w_id=0 i_w_id=[0,1] i_ids=[1001,1002]
Input Parameters: w_id=0 i_w_id=[0,1] i_ids=[1001,1002] CheckStock: SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?
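The refinement of partitioning models on input properties can be sketched as keeping one small model per "shape" of input instead of one large model per procedure. The features used here (the length of i_ids, and whether any i_w_id differs from w_id) are assumptions picked to match the example parameters, and the per-bucket "model" is simplified to a most-frequent-path counter; none of these names come from Houdini itself.

```python
from collections import Counter, defaultdict


def feature_key(params):
    """Hypothetical features: number of items, and whether any item is remote."""
    remote = any(w != params["w_id"] for w in params["i_w_id"])
    return (len(params["i_ids"]), remote)


class PartitionedModels:
    """One bucket of observed paths per feature key."""

    def __init__(self):
        self.paths = defaultdict(Counter)

    def observe(self, params, path):
        """Record the (query, partition) path taken by a finished transaction."""
        self.paths[feature_key(params)][tuple(path)] += 1

    def estimate(self, params):
        """Most frequent path for inputs with the same features, or None if unseen."""
        seen = self.paths.get(feature_key(params))
        if not seen:
            return None
        return list(seen.most_common(1)[0][0])
```

Transactions whose items all live in the home warehouse then never share a model with those that reach a remote warehouse, which keeps each model short and its predictions less ambiguous.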
Houdini: Execute txn at the best partition. Only lock the partitions needed. Disable undo logging if not needed. Speculatively commit transactions.
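The four optimizations above can be read as decisions derived from a predicted path. The sketch below assumes the prediction is a list of (query name, partition) pairs plus a flag saying whether the transaction might abort; the `Plan` fields, the tie-breaking rule, and the use of "last query index per partition" as the point where a partition could be released for speculative work are hypothetical simplifications, not Houdini's API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Plan:
    base_partition: int         # where to run the procedure's control code
    lock_partitions: frozenset  # only the partitions the transaction needs
    undo_logging: bool          # False when no abort is predicted
    last_use: dict              # partition -> index of the last query touching it


def plan_from_estimate(path, may_abort):
    """Turn a predicted path into optimization decisions."""
    if not path:
        raise ValueError("cannot plan from an empty path estimate")
    touched = Counter(partition for _, partition in path)
    # Best partition: the one accessed most often (ties go to the lowest id).
    base = max(sorted(touched), key=lambda p: touched[p])
    # After its last predicted query, a partition could be released early.
    last_use = {partition: i for i, (_, partition) in enumerate(path)}
    return Plan(base, frozenset(touched), may_abort, last_use)
```

For example, a path that touches partition 1 three times and partition 2 once would run at partition 1, lock only partitions 1 and 2, and skip undo logging when no abort is predicted.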
Houdini: Estimate initial path. Update as transaction executes. Recompute if workload changes. Partition for better accuracy.
Experimental Evaluation
Model Accuracy: TATP 94.9%, TPC-C 95.0%, AuctionMark 90.2%
Estimation Overhead TATP TPC-C AuctionMark
Throughput (txn/s): TATP +57%, TPC-C +126%, AuctionMark +117%
Conclusion: Small overhead cost improves throughput.
h-store hstore.cs.brown.edu Twitter: @andy_pavlo