280 likes | 447 Views
Autonomic DBMSs: System Tune Thyself!. Pat Martin Database Systems Laboratory School of Computing. Supported by IBM, CITO and NSERC. Outline of Talk. The problem – system complexity The solution – autonomic computing systems Autonomic DBMSs
E N D
Autonomic DBMSs:System Tune Thyself! Pat MartinDatabase Systems LaboratorySchool of Computing Supported by IBM, CITO and NSERC
Outline of Talk • The problem – system complexity • The solution – autonomic computing systems • Autonomic DBMSs • Some current research - tuning multiple buffer pools • Summary
The Problem • Computer systems continually expanded to achieve greater functionality and efficiency • Expansion has led to a complexity crisis • Systems are too complex to be managed effectively!
A Solution – Autonomic Computing Systems Autonomic Computing Systems, like our nervous system, manage themselves
Autonomic Computing System • Aware of itself and its environment and acts accordingly • Able to reconfigure itself under varying and unpredictable conditions • Able to recover from events that cause it to malfunction • Able to anticipate optimized resources needed to perform a task • Able to protect itself
Autonomic DBMS Project • Goal is develop a DBMS that can automatically • Recognize properties of its workload • Monitor itself with minimal impact on applications’ performance • Reallocate resources to improve performance • Detect and diagnose performance problems • Recognize and react to changes in its environment and available resources
Example – Buffer Pool Tuning • Automatically configure tablespaces to buffer pools based on an analysis of the database and the workload (BP Configuration Problem) • Dynamically adjust sizes of buffer pools to minimize I/O costs for the database and workload (BP Sizing Problem)
Multiple Buffer Pools logical access physical write physical read index item warehouse customer
BP Configuration Problem Given a set of database objects and a workload, determine a mapping of database objects to buffer pools to maximize performance for the given workload.
Configuration Rules of Thumb • Separate data and indexes • Isolate a large data table • Separate objects that are updated frequently and objects that are primarily read • Put temporary tables in their own BP • Separate small frequently accessed tables from larger tables that are scanned • Isolate tables that are accessed frequently by short updates
BPConfig Approach • Analyze logical page reference trace • obtain trace of workload on default configuration • derive access patterns for DB objects • random, re-reference and sequential accesses • Create characterization vectors • type, access patterns, read/write info, size info • Partition DB objects into buffer pools • cluster based on characterization vectors
Partitioning DB Objects • Partition using k-means clustering algorithm • Similarity measured by weighted Euclidean distance • Considered different weighting schemes • equal • favour read/write • favour access pattern
Experiments • Experimental environment • IBM Netfinity 8500R: 4 900 MHz PIII Xeon CPU, 16 GB RAM, 70 disks, Windows NT • TPC-C benchmark: OLTP workload, 400 warehouse (40 GB) database • DB2 Version 7.1 • 100,000 4K pages for the buffer pools
Experiments (cont.) • Configuration schemes • BPConfig, expert, default (1BP), random, distributed (1 BP per DB object) • Evaluation criteria • Weighted Response Time • TPM • % Physical Reads
Experiments (cont.) • Properties of BPConfig configurations (3 buffer pools) • separates index and data objects • separates heavy access and light access objects • WID tables isolated (equal and read/write weightings)
BP Sizing Problem Given a workload, a set of buffer pools and a fixed number of buffer pages, determine the appropriate size of each buffer pool to maximize performance for the given workload.
Approaches to Sizing BPs – Class-based Optimization • Specify performance goals for each transaction class • Algorithm tries to satisfy goals • Logical access cost proportional to physical access cost • Physical access cost determined by buffer pool miss rates
Class-based Optimization (cont.) Collect performance data Choose target class Loop until goal metChoose target buffer poolChoose source buffer poolReallocate pages End Ti with worstperformance BP with greatestbenefit BP with leastcost
Class-based Optimization (cont.) • Problems: • How do we select appropriate performance goals for a class? • Some classes may be favoured over others • Thrashing between buffer pool states is a possibility
Approaches to Sizing BPs – System-based Optimization • BP sizes chosen to maximize system performance metric, eg. throughput • Use a simple greedy algorithm • Considered 2 cost functions: • Minimize hit rate • Minimize data access time (physical reads don’t all cost the same!)
System-based Optimization - Experiments • Experimental environment • IBM xSeries 240 PC Server: 2 1 GHz PIII CPUs, 2 GB RAM, 22 disks, Windows NT • TPC-C benchmark • DB2 Version 7.1 • 50,000 4K buffer pool pages • 3 buffer pools configured with BPConfig
Other AutoDBA Projects • Automatic diagnosis • Automatic recognition of workload type • Integration of BPConfig and sizing algorithm • Automatic BP management in PostgreSQL • Tools for DBMS capacity planning
AutoDBA Project Members • Queen’s: • Wendy Powley, Darcy Benoit, Said Elnaffar, Wenhu Tian, Xiaoyi Xu, Xilin Cui, Ted Wasserman, Nailah Ogeer • IBM: • Berni Schiefer, Sam Lightstone, Randy Horman, Robin Van Boeschoten, Keri Romanufa, Calisto Zuzarte