210 likes | 370 Views
Automatic Database Partitioning in Parallel OLTP Systems. @ andy_pavlo. SIGMOD May 22 nd , 2012. Main Memory • Parallel • Shared-Nothing Transaction Processing.
E N D
Automatic Database Partitioning in Parallel OLTP Systems @andy_pavlo SIGMOD May 22nd, 2012
Main Memory • Parallel • Shared-Nothing Transaction Processing H-Store: A High-Performance, DistributedMain Memory Transaction Processing SystemProc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.
Transaction Execution Transaction Result Procedure Name Input Parameters Client Application Database Cluster Database Cluster 7
Automatic Database Design Tool for Parallel Systems Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP SystemsSIGMOD 2012
DDL CUSTOMER NewOrder SELECT * FROM WAREHOUSEWHERE W_ID = 10; SELECT * FROM DISTRICT WHERE D_W_ID = 10 AND D_ID =9; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID,…) VALUES(10, 9, 12345,…); ⋮ SELECT * FROM WAREHOUSE WHERE W_ID = 10; SELECT * FROM DISTRICT D_W_ID = 10 AND D_ID =9; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ SELECT * FROM WAREHOUSE WHERE W_ID = 10; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ ORDERS DDL SELECT * FROM WAREHOUSE WHERE W_ID = 10; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ Schema ITEM DTxn Estimator Skew Estimator ------- -------------- Workload Large-Neighorhood Search Algorithm … CUSTOMER CUSTOMER CUSTOMER CUSTOMER ORDERS ORDERS ORDERS ORDERS ITEM ITEM ITEM ITEM
CUSTOMER ITEM CUSTOMER ORDERS CUSTOMER CUSTOMER CUSTOMER ORDERS ORDERS ORDERS ITEM ITEM ITEM
? ? ? NewOrder(5, “Method Man”, 1234) ? ? ? Client Application CUSTOMER CUSTOMER CUSTOMER ORDERS ORDERS ORDERS ITEM ITEM ITEM
Large-Neighborhood Search Best Design Input Restart DDL Schema ------- -------------- Workload Initial Design Relaxation Local Search
Cost Model Distributed Transactions + Workload Skew Factor
Algorithm Comparison Horticulture (cost estimate) lower is better State-of-the-Art TATP TPC-C TPC-C Skewed
Throughput Horticulture (txn/sec) higher is better State-of-the-Art TATP TPC-C TPC-C Skewed +88% +16% +183%
Conclusion: Dating scene is still difficult. But partitioning your database is now easier.
http://github.com/apavlo/h-store http://hstore.cs.brown.edu