Self-Tuning Database Systems
Wang Haocong, Jan 8, 2009
Tuning databases
• Logical database design
• Physical database design (indexes)
• Memory, disks
• Materialized views
• Partitioning
Self-Tuning Systems
• Databases are complicated!
  • Schema design is hard
  • Lots of “knobs” to tweak
  • Need appropriate information
• Does the DB approach give us more ability to “self-tune” than some other approach (e.g., Java)?
What Would We Like to Auto-Tune?
• Query optimization – statistics, bad decisions, …
• The schema itself?
• Indices
• Auxiliary materialized views
• Data partitioning
• Perhaps logging?
What Are the Challenges in Building Adaptive Systems?
• Really, a generalization of those in adaptive query processing:
  • Information gathering – how do we get it?
  • Extrapolating – how do we do this accurately and efficiently?
    • Sampling or piloting
  • Minimizing the impact of mistakes if they happen
  • Using app-specific knowledge
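Sampling is the usual answer to the extrapolation challenge above. A minimal sketch of sampling-based selectivity estimation, with all names hypothetical (real systems sample pages rather than rows and must correct for skew):

```python
import random

def estimate_selectivity(rows, predicate, sample_size=1000):
    """Estimate the fraction of rows satisfying `predicate` from a
    uniform random sample instead of a full scan."""
    sample = random.sample(rows, min(sample_size, len(rows)))
    return sum(1 for r in sample if predicate(r)) / len(sample)

# Example: estimate the fraction of orders over $100 from a sample.
orders = [{"amount": random.uniform(0, 500)} for _ in range(100_000)]
print(estimate_selectivity(orders, lambda r: r["amount"] > 100))  # ~0.8
```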
Who’s Interested in These Problems?
• Oracle:
  • Materialized view “wizard”
• Microsoft “AutoAdmin”:
  • Index selection, materialized view selection
  • Stats on materialized views
  • Database layout
• IBM SMART (Self-Managing And Resource Tuning):
  • Histogram tuning (“LEO” learning optimizer)
  • Partitioning in clusters
  • Index selection
  • Adaptive query processing
A Particular Instance: Microsoft’s Index Tuning Wizard
• Why not let the system choose the best index combination(s) for a workload?
• The basic idea:
  • Log a whole bunch of queries that are frequently run
  • See what set of indices is best
• Why is this hard? Why not index everything?
• Create these indices with little or no human input
Possible Approaches
• Obviously: only consider indices that could actually be useful
  • The optimizer can “tell” which indices it might use in executing a query
  • But that is still a lot of indices!
• We could exhaustively compare all possible indices
  • Note that indices can interact (especially for updates)
• How do we compare the costs and benefits of indices?
  • Execute for real
  • Use the optimizer cost model with whatever stats we have
  • Gather some stats (e.g., build histograms, sample) and use the cost model
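A hedged sketch of the middle option: compare candidate configurations by optimizer-estimated cost rather than by real execution. The `optimizer_cost(query, config)` oracle is a hypothetical stand-in for a what-if optimizer call (in SQL Server’s AutoAdmin work this corresponds roughly to costing queries against simulated “hypothetical indexes”):

```python
def workload_cost(workload, config, optimizer_cost):
    """Optimizer-estimated total cost of a workload under a candidate
    index configuration, weighted by how often each query runs."""
    return sum(freq * optimizer_cost(query, config)
               for query, freq in workload)

def pick_cheapest(workload, candidate_configs, optimizer_cost):
    """Compare candidate configurations by estimated, not real, cost."""
    return min(candidate_configs,
               key=lambda cfg: workload_cost(workload, cfg, optimizer_cost))
```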
Their Approach in More Detail
• For a workload of n queries:
  • Generate a separate workload containing each query alone
  • Evaluate the candidate indices for this query to find the best “configuration” – limited to 2 indices, 2 tables, single joins
  • The candidate index set for the workload is the union of these per-query configurations
• Too expensive to enumerate all configurations; use a greedy algorithm (sketched below):
  • Exhaustively enumerate (using the optimizer) the best m-index configuration
  • Pick a new index I to add, whichever seems to save the most cost relative to adding some other I’ or to the current cost
  • Repeat until we’ve added “enough” (k) indices
• “Despite interaction among indices, the largest cost reductions often result from indices that are good candidates by themselves”
• They iteratively expand to 2-column indices; an index on the leading column must be desirable on its own for the 2-column index to be considered
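A minimal sketch of that greedy loop, assuming a `cost(workload, indexes)` oracle backed by the optimizer (names hypothetical; the real tool also seeds the search with the exhaustively chosen m-index configuration):

```python
def greedy_index_selection(workload, candidates, cost, k):
    """Greedily add the index whose addition yields the largest
    estimated cost saving, until k indexes are chosen or none helps."""
    chosen = set()
    current = cost(workload, chosen)
    while len(chosen) < k:
        best_index, best_cost = None, current
        for idx in candidates - chosen:
            c = cost(workload, chosen | {idx})
            if c < best_cost:
                best_index, best_cost = idx, c
        if best_index is None:          # no remaining index reduces cost
            break
        chosen.add(best_index)
        current = best_cost
    return chosen

# Toy cost model: each useful index shaves a fixed amount off the cost.
savings = {"idx_a": 40, "idx_b": 25, "idx_c": 5}
toy_cost = lambda wl, idxs: 100 - sum(savings[i] for i in idxs)
print(greedy_index_selection(None, set(savings), toy_cost, k=2))
# -> {'idx_a', 'idx_b'}
```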
Further Enhancements
• Use the tool for “what-if” analysis
  • What if a table grows by a substantial amount?
• Supplement with extra info gathered from real query execution
  • Maybe we can “tweak” estimates for certain selectivities
  • An attempt to compensate for the “exponential error” problem
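A toy illustration of the what-if idea: re-cost the workload against scaled statistics instead of real data. This is purely illustrative; the actual wizard answers such questions through hypothetical statistics inside the optimizer:

```python
def what_if_growth(stats, table, factor):
    """Return a copy of the statistics with one table's row count
    scaled, so the workload can be re-costed without touching data."""
    grown = dict(stats)
    grown[table] = {**stats[table], "rows": stats[table]["rows"] * factor}
    return grown

stats = {"orders": {"rows": 1_000_000, "avg_row_bytes": 120}}
print(what_if_growth(stats, "orders", 10)["orders"]["rows"])  # 10000000
```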
Physical tuning tool
• Decide when to tune
• Decide what the “representative” workload is
• Run the tool and examine the recommended physical design changes
• Implement them if appropriate
Alternative Tuning Models
• Alerter
  • Decides when to tune
  • Lightweight tools
• Workload as a sequence
  • Read/update queries
  • Create/drop physical structures
Dynamic tuning
• Low overhead; must not interfere with the normal functioning of the DBMS
• Balance the cost of transitioning against the potential benefits
• Avoid unwanted oscillations
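One common guard against oscillation, sketched here with invented thresholds: act only when the estimated benefit clearly exceeds the transition cost, and rate-limit changes so the system cannot flip-flop:

```python
import time

class TuningController:
    """Apply a physical-design change only if its estimated benefit
    clearly outweighs the cost of transitioning, and not too often."""

    def __init__(self, margin=1.5, min_interval_s=3600):
        self.margin = margin              # benefit must be 1.5x the cost
        self.min_interval_s = min_interval_s
        self.last_change = 0.0

    def should_apply(self, est_benefit, transition_cost):
        if time.time() - self.last_change < self.min_interval_s:
            return False                  # rate-limit to damp oscillation
        return est_benefit > self.margin * transition_cost

    def record_applied(self):
        self.last_change = time.time()
```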
Impact
• Tuning large workloads
  • Partition
  • Sample
• Tuning production servers
  • Use a test server
Future Directions
• Ability to compare the quality of automated physical design solutions
• Lightweight approaches
• Machine learning techniques, control theory, and online algorithms
Memory tuning in DB2
• Innovative cost-benefit analysis
  • Simulation technique vs. modeling
• Tunes both memory distribution and total memory usage
• Simple greedy memory tuner (sketched below)
  • Control algorithms to avoid oscillations
• Performs very well in experiments
  • For both OLTP and DSS
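A minimal sketch of such a greedy tuner, assuming diminishing returns: shift memory from the consumer that values it least to the one that values it most, with a post-move check to damp oscillation around the balance point. The `marginal_benefit` callback stands in for DB2's simulation-based cost-benefit estimates; all names are hypothetical:

```python
def rebalance(allocations, marginal_benefit, step_mb=64):
    """Greedy memory tuner: repeatedly shift `step_mb` from the consumer
    that values its memory least to the one that values it most, but
    only while the move still pays off after it is made."""
    while True:
        gains = {c: marginal_benefit(c, allocations[c]) for c in allocations}
        giver = min(gains, key=gains.get)
        taker = max(gains, key=gains.get)
        if giver == taker or allocations[giver] <= step_mb:
            break                        # nothing sensible left to move
        # Only move if the taker still out-values the giver *after* the
        # move; this post-move check prevents back-and-forth thrashing.
        if marginal_benefit(taker, allocations[taker] + step_mb) \
                <= marginal_benefit(giver, allocations[giver] - step_mb):
            break
        allocations[giver] -= step_mb
        allocations[taker] += step_mb
    return allocations

# Toy marginal-benefit curves (ms of query time saved per extra MB).
curves = {"bufferpool": lambda mb: 5000 / mb, "sortheap": lambda mb: 1000 / mb}
print(rebalance({"bufferpool": 512, "sortheap": 512},
                lambda c, mb: curves[c](mb)))
# -> {'bufferpool': 832, 'sortheap': 192}
```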
SBPX Operation
[Diagram: Disk, Buffer Pool, and SBPX. Steps: 1. Victimize page (move to SBPX); 2. Load new page from disk; 3. Page request; 4. Check buffer pool; 5. Check SBPX; 6. Start timer; 7. Victimize BP page (send to SBPX); 8. Load page from disk; 9. Stop timer.]
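The point of the SBPX is that hits in it measure disk time a larger buffer pool would have saved. A simplified sketch of that simulation (hypothetical names; DB2's real extension is maintained per buffer pool and stores only page identifiers, not page data):

```python
from collections import OrderedDict
import time

class SimulatedBufferPoolExtension:
    """Hold only the identifiers of pages recently evicted from the
    real buffer pool. A disk read whose page id is found here is one a
    larger pool would have avoided; we bank that time as the estimated
    benefit of giving this buffer pool more memory."""

    def __init__(self, capacity):
        self.ids = OrderedDict()          # page id -> None, in LRU order
        self.capacity = capacity
        self.saved_seconds = 0.0          # estimated benefit of a bigger pool

    def on_eviction(self, page_id):
        """Called when the real buffer pool victimizes a page."""
        self.ids[page_id] = None
        self.ids.move_to_end(page_id)
        if len(self.ids) > self.capacity:
            self.ids.popitem(last=False)  # forget the oldest eviction

    def read_from_disk(self, page_id, read_page):
        """Time the disk read; credit it if the SBPX 'had' the page."""
        start = time.perf_counter()
        page = read_page(page_id)
        if page_id in self.ids:           # SBPX hit: an avoidable read
            del self.ids[page_id]
            self.saved_seconds += time.perf_counter() - start
        return page
```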
Experimental results – workload shift
[Chart: per-query time in seconds (0–7000) against order of execution (queries 1–34). After some indexes are dropped, the average falls from 6206 s to 2285 s, a 63% reduction; a later segment averages 959 s.]