340 likes | 517 Views
DB2 Performance Best Practices September 13, 2007 Matt Emmerton DB2 Performance and Solutions Development IBM Toronto Laboratory memmerto@ca.ibm.com. Best Practices Overview. Many areas of focus System Hardware System Software Database Application All require proper design and tuning.
E N D
DB2 PerformanceBest PracticesSeptember 13, 2007Matt EmmertonDB2 Performance and Solutions Development IBM Toronto Laboratorymemmerto@ca.ibm.com
Best Practices Overview • Many areas of focus • System Hardware • System Software • Database • Application • All require proper design and tuning
System Software • Operating System • Ensure that latest required maintenance level and all required APARs have been applied • Only use certified/supported device drivers • Database Engine • Ensure that latest required maintenance level and all required APARs have been applied • Latest fixpacks provide bug-fixes and performance improvements • DB2 Viper (v9) • AIX 5.2 + TL08 SP2 + 64-bit kernel • AIX 5.3 + TL04 SP2 + APARs + 64-bit kernel • Recommended: AIX 5.3 + TL05 SP3 + 64-bit kernel • DB2 Viper 2 (v9.5) • AIX 5.3 + TL06 + 64-bit kernel (tentative)
Database Design • Good design involves many areas: • Disk hardware • Database • Tablespaces • Tables • Indexes • Bufferpools
Database Design – Disk Hardware • Number and size of disks matter • Few large disks will create a performance bottleneck • Many small disks will pose an administrative nightmare • Both OLTP and DSS applications benefit from more disks • Separation of logs and data • Logs should be on separate disk subsystem with separate controller(s), RAID-5 at a minimum • Data should be on separate disk subsystem with separate controller(s), RAID-1 at a minimum • Be sure to take advantage of the caching technologies of your RAID subsystem • Enabling read cache (on data disks) can provide performance benefits on DSS workloads (repeated scans can be cached) • Enabling write cache (on log disks) can provide performance benefits on OLTP workloads. • Must be battery-backed and mirrored to be durable • Battery-backed write cache has improved OLTP performance up to 3% in some workloads
Database Design – Database • Database encoding • Unicode databases should only be created in the following circumstances: • Need to store non-ASCII data • Need to store XML data • Use native code pages when possible • Very important when migrating from System i/p environments • Object page size • Using few page sizes will simplify memory management • Choice of page size can impact performance • OLTP benefits most from 4K and 8K page sizes • DSS benefits most from 16K and 32K page sizes
Database Design – Tablespace Type • DMS – Database Managed Storage • Database manages allocation of tablespace extents to database objects • Raw (device) access gives top performance • File (filesystem) access gives moderate to high performance • Space is pre-allocated at creation time • When DIO and CIO features are enabled, performance is comparable to raw • SMS – System Managed Storage • OS manages allocation of tablespace extents to database objects • Filesystem access gives moderate performance • Space is allocated at runtime • Fragmentation and flush delays can degrade performance
Database Design – Tablespace Types • Automatic Storage • Resizes tablespaces as necessary at runtime • Threshold for resize and amount of resize can be controlled by DBA • Created as DMS File for regular or large tablespaces • Created as SMS for temporary tablespaces
Database Design – Filesystem Features • Direct I/O (DIO) • DIO enables filesystem cache bypass • DB2 already caches pages, so why have the OS cache again? • Reduces filesystem cache growth and thus limits potential paging activity and related VMM problems • Enabled automatically if the underlying filesystem supports DIO • Can be over-ridden via ALTER TABLESPACE • Improves performance on DMS File to within 5-10% of DMS raw • Concurrent I/O (CIO) • Enables concurrent readers and writers to separate regions of a single file • Enabled automatically if the underlying filesystem supports CIO
Database Design – Tablespaces • Tablespaces should be spread over many filesystems or disk devices • Using more than one container is a good thing! • This will improve performance via parallelism at the device level • Extent size should be a multiple of the RAID stripe size • This will improve disk performance via parallellism at the disk level
Database Design - Bufferpools • On a dedicated database server, all available memory should be assigned to bufferpools • 32-bit DB2 has memory limitations: • Only supported on Windows • Limited to 2-4 GB depending on system config • 64-bit DB2 has no memory limitations on any platform
Database Design – Tablespace/Bufferpools • Bufferpools are assigned to tablespaces • Minimum 2 bufferpools and tablespaces per page size • One bufferpool/tablespace for temporary objects using SMS • One bufferpool/tablespace for permanent objects using DMS • Separate bufferpools for additional tablespaces may improve performance • Separating objects by type (tables, indexes) • Separating objects by access pattern (sequential, random, head, tail)
Database Design – Tables • Use primitive types for primary keys • Integer types are very efficient • Primary key access via index scans • Referential integrity checking and enforcement • Character types are expensive and inefficient • Consider setting table properties via ALTER TABLE • APPEND: Indicates that rows are seldom deleted • LOCKSIZE: Can choose from row or table locking • Table-based locking is beneficial for read-only table
Database Design – Tables • Row compression • Beneficial in situations where I/O is the dominant factor • reading fewer pages from disk improves response time • extra CPU cost for decompression can be absorbed • most useful for read-only queries • Varying benefit in CPU-bound and/or read-write situations • CPU-bound environments can’t absorb the cost of decompression • Read-Write situations require decompression and compression which increases CPU cost • Update log records are 2-3x the size • Only beneficial when the I/O rates are reduced enough to absorb the extra CPU used by compression
Database Design – Indexes • Avoid primary indexes on CHAR/VARCHAR fields • Very expensive to scan and enforce RI • Fine to use for secondary index access • Tables with fewer indexes are better for performance • Fewer index updates during UPDATE/INSERT/DELETE • Consider setting INDEX properties via ALTER INDEX • PAGE SPLIT LOW/HIGH: Indicates to DB2 how to populate pages during index reorganization • CLUSTER: Creates a clustering index • INCLUDE: Includes table columns within index structure for quick retrieval
Database Design – Advisor • DB2 Design Advisor (db2advis) • Analyses queries and suggests indexes, MDCs, MQTs and statistics that would be beneficial • Can read queries from: • Command-line • User-specified input file • Dynamic SQL cache • Will suggest schema changes: • Indexes to create/delete • Multi-Dimension Clustered tables (MDCs) • Materialized Query Tables (MQTs) • Will suggest maintenance operations: • RUNSTATS (with or without sampling and/or distribution stats) • REORG
Database Tuning • Tuning requires lots of data collection and analysis • OS tools • Vmstat, iostat • DB2 tools • Snapshots, event monitors • Tuning can happen at all areas of the solution stack • OS level • Filesystem tuning, VMM tuning • Database level • Db2set, dbm cfg, db cfg parameters • Schema changes • Consider using our autoconfigure tool
Tuning – Operating System • Paging Space • Should be on multiple disks • Not shared with any other data (/home, DB data or logs) • Paging should be the exception, not the rule on a database server • System Tools • Vmstat gives a continuous picture of system behaviour • Run queue (process/thread scheduling) • Kernel events (context switches, interrupts, system calls) • CPU usage (system, user, idle, I/O pending) • Iostat gives a continuous picture of disk behaviour • Per-controller and per-array breakdowns • Can be used to isolate I/O problems • Can expose DB2 tuning or application problems
Tuning – Filesystems (AIX) • Filesystem cache size can be reduced from AIX defaults • JFS Filesystem Cache • Minperm/Maxperm are the min/max number of pages allocated to JFS filesystem cache • Strict_maxperm determines whether this is a soft (0) or hard (1) limit • JFS2/NFS Filesystem Cache • Maxclient% is the number of pages allocated to JFS2 filesystem cache • Strict_maxclient determines whether this is a soft (0) or hard (1) limit • Must take into consideration the I/O characteristics of the system • Smaller filesystem cache can reduce system paging • Larger filesystem cache can benefit SMS tablespace performance (especially for temporary tables) • All of this is especially beneficial when using DIO/CIO for the majority of tablespaces in a database
Tuning – AIO (AIX) • Asynchronous I/O (AIO) • Allows DB2 to perform useful work while I/Os are being processed • AIO on filesystem-based (DMS File and SMS) tablespaces use AIO kprocs • The number of aioserver kprocs can be configured via the ‘maxservers’ tunable • Not uncommon to have a large number of AIO kprocs, as one is created for each concurrent AIO request • Newer versions of AIX will not use kprocs but instead will use internal structures to manage filesystem AIO
Tuning – Database – Autonomics • Many sizing, tuning and administrative tasks have been automated in v9 and v9.5 • Most manual controls are still available • Configuration Advisor • Will set various database parameters based on system characteristics (#CPU, memory, etc) • Automatic Runstats • Automatic Backup • Self-Tuning Memory Manager • Will manage the amount of memory needed for LOCKLIST, Package Cache, Application Heap, Bufferpools, Sort Heap
Tuning – Database – Number of Agents • NUM_INITAGENTS • Number of agents to initialize at startup • Set to the average number of connections • Can minimize the amount of time it takes to start up the set of active agents required at runtime • NUM_POOLAGENTS • Number of agents that are maintained during runtime • Set to the average number of connections • Can be configured automatically by DB2 (starting in v9.5)
Tuning – Database – Application Memory • PCKCACHESZ • “Package Cache Size” • Represents the amount of memory used to cache compiled statements in the database engine • If this is too small, compiled statements will be purged from the cache and thus will need to be recompiled (under the covers) before they can be executed again • Take database snapshots, and look for “pkg_cache_num_overflows”. If this is high, then it is a good indication that the cache is too small • Can be configured automatically by DB2 (starting in v9.5) • Can be tuned automatically tuned via STMM (starting in v9) • APPLHEAPSZ • “Application Heap Size” • Represents the amount of memory used as a “working set” for each database connection • May get SQL0954C errors if your application heap is too small • Can be configured automatically by DB2 (starting in v9.5) • Can be tuned automatically tuned via STMM (starting in v9)
Tuning – Database – Logger • LOGFILSIZ • “Log File Size” • Increase from the default; 5000 pages is a good starting point • LOGBUFSIZ • “Log Buffer Size” • Increase when “log pages read” counter is high in the database snapshots • A large LOGBUFSIZ ensures that when a transaction rolls back, it does not have to read log pages from disk (for past transactions) in order to complete the rollback operation • SOFTMAX • “Soft Checkpoint – Maximum Log Files” • The checkpoint interval expressed as the number of log files (in percent – 100 = 1 log file) • Modified data pages are written to disk after the transactions they are associated with are outside of the SOFTMAX interval • Large values will incur more I/O at recovery time (as more log must be read to recover) • Small values will incur more I/O at runtime (as data pages are written to disk sooner)
Tuning – Database – Bufferpools • NUM_IOCLEANERS • “Number of Page Cleaners” • Can be configured automatically by DB2 (starting in v9) • NUM_IOSERVERS • “Number of Prefetchers” • Can be configured automatically by DB2 (starting in v9) • SOFTMAX • Primarily a log tunable, but also controls the rate at which dirty pages are written to disk • CHNGPGS_THRESH • “Changed Pages Threshold” • Indicates the threshold of (dirty pages / total pages) at which to start writing dirty pages to disk • Lower values will provide a more constant I/O behaviour • DB2_USE_ALTERNATE_PAGE_CLEANING (APC) • Enables a different page cleaning algorithm that is more proactive and responsive to changing system dynamics • Known to be beneficial for most OLTP environments • Not the best on DSS workloads that use block-based bufferpools or large TEMPs • V9 FP3a and FP4 will have changes to fix the TEMP issue for DSS workloads
Tuning – Database – Sorting • SHEAPTHRES/SHEAPTHRES_SHR • Instance-wide soft limit on the number of pages to use for private and shared sorts • A limit set by the DBA • Can be tuned automatically by DB2 (starting in v9.5) • SORTHEAP • Per-sort limit on the number of pages to use for each private or shared sort • Can be tuned automatically by DB2 (starting in v9) • Tuning • Look for “sort overflows” in Database Snapshots • These indicate when sorts could not be contained in memory and had to “spill” to disk (as temporary pages which are placed in temporary tablespaces) • Sorts that spill are very inefficient (at least 2 additional I/Os per page that spills) • DSS workloads typically benefit from large SORTHEAP • OLTP workloads typically do not benefit from large SORTHEAP
Application Design and Tuning • Static SQL / PSM • Access plans “set in stone” when application is compiled • Make sure that code is bound against a database tuned for production use • Dynamic SQL (CLI) • Take the time to use prepared statements with parameter markers • The extra cost of using prepared statements is negligible when compared to the package cache churn when statements are continually purged and then recompiled • Stored Procedures / SQL Procedures (PSM) • Stored procedures can be used to execute multiple statements on the server • This can be used to minimize client/server network traffic and associated processing • SQL Functions • Can be used to simplify application logic by reducing repetitive code
Application Design and Tuning • Database Connections • Minimize the number of connections you application uses • Connections are not cheap – each connection uses ~100KB on the server • Consider using connection concentrator if you need a large number of connections • Analyse Query Plans • Use the db2exfmt and db2expln tools to produce query execution plans for your SQL statements • This will allow you to see what methods DB2 is using to execute the query • Changing the query or the schema can improve bad query plans • Always Close Cursors • Always close cursors once all data has been read • This frees up system resources used to maintain cursor state • In some cases, this will release locks • Always Commit • Always commit transactions, even read-only transactions • This releases locks and will reduce the amount of work required to traverse the lock lists
Questions? • Any questions?