290 likes | 499 Views
Robust query processing. Goetz Graefe, Christian König, Harumi Kuno, Volker Markl, Kai-Uwe Sattler Dagstuhl – September 2010. Max-diff histograms. Equal height?. Equal area. True distribution Average value. Equal width. Max-diff. Histograms with slope. True distribution Average value.
E N D
Robust query processing Goetz Graefe, Christian König, Harumi Kuno,Volker Markl, Kai-Uwe Sattler Dagstuhl – September 2010
Max-diff histograms Equal height? Equal area True distributionAverage value Equal width Max-diff Dagstuhl - Robust Query Processing
Histograms with slope True distributionAverage value Linear regression Max-diff Max-diff with slope Dagstuhl - Robust Query Processing
Slope, patterns, extrapolation Dagstuhl - Robust Query Processing
Detecting query slowdown Dagstuhl - Robust Query Processing
External merge sort • Initial runs: size M, count N/M • Merge fan-in F= M − read-ahead buffers • Merge depth = merge levels = logF (N/M) Size = F×M … Fan-in = F … … … Size = M Dagstuhl - Robust Query Processing
Applies if M < N1≤ F×M 1 < N1/M ≤ F 0 < logF(N1/M) ≤ 1 Actual fan-out K: 1 < K ≤ F Hash table + K output buffers (M−K) + (K×M) ≥ N1 K ≥ (N1−M) / (M−1) Fairly smooth cost function Eases query optimization Eases memory management Hybrid hash join 1 1 … K K Dagstuhl - Robust Query Processing
Duality of sorting & hashing Merging vs. partitioning Dagstuhl - Robust Query Processing
Multiple optimization techniques are needed to find this plan • Join clause inferred between line item & part supply • Group-by list reduced by functional dependencies • Grouping (on alternative column) pushed down through join • “Interesting orderings” between scans, joins, grouping Dagstuhl - Robust Query Processing
Multiple optimization techniques in a hash-based plan Same as previous example, plus • Integrated hash operation … • … within a hash team • Disk-order scans Dagstuhl - Robust Query Processing
Star joins: semi-join reduction First, join each dimension table with an index of the fact table; then, (hash-) intersect bookmark lists; finally, fetch fact table rows Also considered: Cartesian products of dimension tables Dagstuhl - Robust Query Processing
Symmetric semi-join reduction Fields T1.*, T2.* Fetch using T1.s Select … from T1 join T2 on T1.a = T2.awhere … Fields T1.*, T2.s Fetch using T1.s Fields T1.s, T2.s Join “T1.a = T2.a” Fields T2.a, T2.s Index T1 (a, s) Index T2 (a, s) Dagstuhl - Robust Query Processing
Index-to-index navigation performance Trad. fetch Dagstuhl - Robust Query Processing
2-dimensional parameter space Dagstuhl - Robust Query Processing
Fast loads and fast queries ? Zoneindexes Multipleindexes PartitionedB-trees Zonefilters Query performance Zonemaps No indexes or statistics Load bandwidth Dagstuhl - Robust Query Processing
Traditional index choices • Don’t index. Scan for each query – no cost for index creation • Index creation before query processing • Useful for predictable workloads • “Monitoring and tuning” wizard • Extra effort, hard to predict Scan Index searches Index creation Index tuning Adaptive Indexing Adaptive merging
Adaptive merging in partitioned B-trees run generation z a z a z a z a merging z z a a z a z a z a … after merging a-j j k z k z k z k z a #0 #1 #2 #3 #4 Dagstuhl - Robust Query Processing September 27, 2014 17
Adaptive merging vs database cracking Database crackingImproved crackingAdaptive merging Dagstuhl - Robust Query Processing September 27, 2014 18
Traditional priority queue Enter and exit at root 2 log2 M comparisons Tree of winners Enter at leaf, exit at root log2 M comparisons Specific entry points Duplicate entries M/2 entries Tree of losers Enter at leaf, exit at root No duplicates, M entries Tree of losers Run 4: key A Array slot 0 1 Run 3: key D 2 0: F 3 7: B 4 1: G 5 2: E 6 5: D 7 6: C 2: E3: D 4: A5: D 6: C7: B 0: F1: G Dagstuhl - Robust Query Processing
Exploit large memory Even during small merge Merge from memory Smooth transition Run generation to merging Continuous cost function Effect of hybrid hash join 2 × 6 GB ÷ 100 MB/s = 120 sec = 2 min Graceful degradation 1 2 0 2 3 0 1 Dagstuhl - Robust Query Processing
Graceful degradation in memory hierarchy Rotating disk drive Main memory Output Run inmemory High fan-in merge A few runs on flash A few runs on disk Buffer for large disk pages Flash memory Dagstuhl - Robust Query Processing
SQL Server lock modes Dagstuhl - Robust Query Processing
Optimal B-tree node sizes in 1997 Dagstuhl - Robust Query Processing
Hilbert space-filling curve Dagstuhl - Robust Query Processing
Automatic Tuning: Relaxation-based Nicolas Bruno and Surajit Chaudhuri, Automatic Physical Database Tuning: A Relaxation-based Approach, in Proceedings of the ACM International Conference on Management of Data (SIGMOD), Association for Computing Machinery, Inc., 2005
Self-Tuning DB: AutoAdmin Sanjay Agrawal, Nicolas Bruno, Surajit Chaudhuri, and Vivek Narasayya, AutoAdmin: Self-Tuning Database Systems Technology, in Data Engineering Bulletin, IEEE Computer Society, 2006
Continuous Monitoring: SQLCM Surajit Chaudhuri, Arnd Christian König, and Vivek Narasayya, SQLCM: A Contiuous Monitoring Framework for Relational Database Engines, in ICDE 2004.