350 likes | 456 Views
Speculative Query Processing. Alkis Polyzotis Yannis Ioannidis. Speculation. Definition (Webster dict.): “to assume a business risk in hope of gain” Success Stories: Computer Architecture Operating Systems Database Systems. Motivation. Timeline. Select :. name. From employee. From :.
E N D
Speculative Query Processing Alkis Polyzotis Yannis Ioannidis
Speculation • Definition (Webster dict.): “to assume a business risk in hope of gain” • Success Stories: • Computer Architecture • Operating Systems • Database Systems
Motivation Timeline Select: name From employee From: employee Where age<25 Where: age<25 Select name GO! GO!
From employee Where age<25 Select name GO Original Cost: New Cost: Applying Speculation Select * From employee Where age<25 Into T Select name From T
Roadmap • Introduction • Speculative Processing Framework • Speculative Manipulations • Cost Optimization • Learning the User • Experimental Results • Conclusions
Setting • Interactive exploration • Ad-hoc queries • Long sessions • Read-only database Final Query DB Query Interface Partial Query
Speculator Learner Framework Query Interface DB Speculative Actions Speculative Query Partial Query Possible Manipulations Cost Optimizer Actions User Model Cost Model
Example Application Query Interface DB Select * From employee Where age<25 Into T1 Speculative Actions Select … From employee Where age<25 Materialize age<25 Cost Optimizer Speculator Cost Model Learner
Roadmap • Introduction • Speculative Processing Framework • Speculative Manipulations • Speculative Actions • Composition Rules • Cost Optimization • Learning the User • Experimental Results • Conclusions
Speculative Actions • Prepare DB for speculated query • Types: • Data Staging • Histogram Creation • Index Creation • Materialization • Simple • Complex
Data Staging • Pre-fetch pages of relations • Objective: warm up the buffer pool • Low risk • Difficult to implement!
Histogram Creation • Create histograms on predicate fields • Objective: Help DB Optimizer • Low Risk
Index Creation • Create indices on predicate fields • Objectives: • Provide additional access methods • Provide better statistics • Low Risk
T S Simple Materializations • Materialize predicates in new tables • High gain • Increased risk S R
T W Complex Materializations • Materialize joins in new tables • Very high gain • Very high risk! Y W R S
Roadmap • Introduction • Speculative Processing Framework • Speculative Manipulations • Speculative Actions • Composition Rules • Cost Optimization • Learning the User • Experimental Results • Conclusions
+ Create Index on T Composition Rules S R
R S S R Composition Rules • May cancel previous manipulations
Roadmap • Introduction • Speculative Processing Framework • Speculative Manipulations • Cost Optimization • Learning the User • Experimental Results • Conclusions
Cost Optimization • Select most promising manipulation • Evaluate cost in terms of: • Gain vs. risk • Possible final query • User’s behavior
Roadmap • Introduction • Speculative Processing Framework • Speculative Manipulations • Cost Optimization • Learning the User • Experimental Results • Conclusions
Survival of predicates • Will s appear in the final query? • Perceptron learner • Temporal Differences learning • Train on recent actions of the user • Large learning rate
Probability of joins • Will R,S be joined in the final query? • Bayesian learner • Re-train on previous n queries
Roadmap • Introduction • Speculative Processing Framework • Speculative Manipulations • Cost Optimization • Learning the User • Experimental Results • Conclusions
Methodology • Collect actual traces • Users explore a TPC-H dataset • Query interface records actions & timestamps • Replay traces • Normal Processing • Speculative Processing
Performance Evaluation • Metric: Absolute Improvement • Metric calculated as a histogram • Source parameter: execution time
Configuration • Dual Pentium-III 500 MHz • 1GB physical memory • Linux RH 6.2 • Oracle Server 8.1.6 • DB uses files, not raw partition Thank you Remzi!
Experiment I: Comparing Gain Buffer: 32MB Data Set: 100MB No Indices No Histograms
Experiment I:Comparing Gain Buffer: 32MB Data Set: 100MB No Indices No Histograms
Experiment II: Using the Cost Model Buffer: 32MB Data Set: 100MB Indices & Histograms
Experiment III: Performance Buffer: 32MB Data Set: 100MB Indices & Histograms
Experiment III: Performance Buffer: 320MB Data Set: 100MB Indices & Histograms
Experiment III: Performance Buffer: 32MB Data Set: 300MB Indices & Histograms
Conclusions • Speculative Processing Framework • Speculative Manipulations • Cost Optimization • Experimental Results: • Materializations offer most gain • Cost model avoids penalties • Performance improvement up to 40%
Future Work • Evaluate speculation on: • larger dataset • bigger schema • Apply speculation in: • Search queries • Web queries • Multi-user database • Client collaboration • Use speculation in the server