310 likes | 323 Views
Systematic Oracle performance tuning. Guy Harrison, Chief Architect, Database Solutions March 2007. Agenda. A brief history of Oracle tuning Limitations of common approaches Systematically tuning by levels Application demand Database contention Reducing physical IO Tuning physical IO.
E N D
Systematic Oracle performance tuning Guy Harrison, Chief Architect, Database Solutions March 2007
Agenda • A brief history of Oracle tuning • Limitations of common approaches • Systematically tuning by levels • Application demand • Database contention • Reducing physical IO • Tuning physical IO
The story so far …. • First, the earth cooled … • Then, the dinosaurs came • Then, Oracle 5 was released • The Rules of Thumb: • Buffer cache hit rate > 90% • Every segment in a single extent • Index everything • Trial and error • Limited Instrumentation • Tkprof (6.0) • V$sysstat • No WWW
The performance tuning renaissance • Oracle 7-8 1992+ • An empirical, profiling based approach • Many champions, but notably: • Anjo Kolk • Carey Millsap • An Oracle performance community emerges (Oak Table, etc) • Yet Another Performance Profiling Methodology (YAPP): ResponseTime=ServiceTime+WaitTime
YAPP and Wait interface based tuning • Based primarily on existence of the “wait interface” and related Oracle instrumentation • When Oracle sessions wait for a resource (IO, lock, latch, buffer, etc) they record wait durations: • V$SYSTEM_EVENT, $SESSION_EVENT, $SESSION_WAIT • SQL trace files created with the ‘10046 event’ • Largest wait categories represent greatest opportunities for optimization • Average wait times can reveal contention points
The path to enlightenment • Problems to overcome: • Many waits were/are mysterious • Exactly what does “PMON to cleanup pseudo-branches at svc stop time” mean? • Message had yet to reach all in the community • Many still busy tweaking “hit ratios” • Service time – especially CPU time – was hard to accurately measure • Oracle’s internal CPU counters notoriously inaccurate • No breakdown within CPU time • Tuning less effective for CPU-bound systems • Response time included components outside of Oracle: especially as client-server gave way to 3/N tier systems • No guarantee that tuning would result in end user improvement • In 10g, most of the technical issues have been resolved: • Documentation and external resources (eg, the WWW) • Time model • Ms timings • Extended V$SQL timings
WBT pitfalls : an example Single block (e.g. indexed) I/Os represent 83% of DB elapsed time. Average time for an IO is about 10x expected. Therefore it would be reasonable to assume insufficient IO bandwidth (e.g., distinct disk drives) to support the workload.
But before you go upgrading that Disk array…. • A single missing index can cause huge increases in logical IO demand. • This magnifies disk IO demand, internal contention (latches, etc) and CPU utilization. • Furthermore, IO demands from a missing index can increase faster than hardware upgrades can be applied
Making a distinction between symptoms and causes • When faced with an obviously IO-bound database, it’s tempting to deal with the IO subsystem immediately. • However, as often as not, IO problems are symptoms of problems arising at other levels. • Typically SQL tuning or schema design • Likewise, eliminating a contention point might actually increase physical IO demand • Lock waits can reduce the demand on the IO subsystem • Tuning SQL might increase transaction rates • Which then results in latch contention • You are best advised to ignore symptoms in the lower levels until you have optimized the performance of high levels.
Interactions between layers Application demand Concurrency Mgt Logical IO Physical IO Application Oracle Software Memory: SGA and PGA IO Subsystem SQL/ PLSQL Block read/ write Disk reads/ writes Rows/ return code Data Blocks Data/ Return code Application DB access code (parse, execute, single fetch, array fetch) Parse SQL, check security, acquire locks, latches, buffer space, etc Cache data blocks for re-use; sort data as required, create hash tables for hash join Read/write data from disk devices
What it means (key slide) • Problems in one layer can be caused or cured by configuration in the higher layer. • The logical steps in Oracle tuning are therefore: • Reduce application demand to it’s logical minimum by tuning SQL, optimizing physical design (partitioning, indexing) and tuning PL/SQL • Maximize concurrency by minimizing contention for locks, latches, buffers and so on in the Oracle code layer • Having normalized logical IO demand by the preceding steps, minimize the resulting physical IO by optimizing Oracle memory • Now that the physical IO demand is realistic, configure the IO subsystem to meet that demand by providing adequate bandwidth and evenly distributing the resulting load
Examining the time model • High end tuning tools offer big returns on investment, but basic Oracle instrumentation serves well for most stages. • A lot of insight can be gained by looking at the time model combined with the wait interface
Why it doesn’t all add up…. • Items in the time model are “nested”: some categories incorporate times from other categories • Wait time contributes to DB time and background elapsed time • Does parse time include CPU time, etc? 1) background elapsed time 2) background cpu time 2) background wait time 1) DB time 2) DB CPU 2) User wait time 2) connection management call elapsed time 2) sequence load elapsed time 2) sql execute elapsed time 2) parse time elapsed 3) hard parse elapsed time 4) hard parse (sharing criteria) elapsed time 5) hard parse (bind mismatch) elapsed time 3) failed parse elapsed time 4) failed parse (out of shared memory) elapsed time 2) PL/SQL execution elapsed time 2) inbound PL/SQL rpc elapsed time 2) PL/SQL compilation elapsed time 2) Java execution elapsed time
Reduce the application demand • If you can, tune the application code • “The best optimized SQL is the SQL you didn’t do” (Anjo) • Reduce the amount of logical IO generated by SQL statements • This in turn reduces CPU and IO but which fluctuate less predictably. • Rewrite SQL or use stored outlines/profiles • Index wisely • Consider physical design changes • Partitioning • Denormalization • Don’t forget other application demand factors • Parse CPU consumption (also can cause latch contention) • PL/SQL execution time
Searching for a specific plan steps Table and index scans might be amenable to better indexing We could also look for row counts relative to the object being scanned Procedurally, we can even look for nested table scans and other “anti-patterns”; even before they cause problems
Types of contention • Locks • Mostly application, but occasionally system (ST in legacy tablespaces) • Latches • Often side effect of excessive application demand • Lack of bind variables • But sometimes the final constraint on DB throughput • Buffers • Buffer cache, redo buffer, etc • Hot blocks (buffer busy) • Slow “lazy” writers processes (DBWR, LGWR, RVWR) • Sequences • Redo/Archive logs • Contention between the LGWR and the DBWR or ARCH • Shared servers/PQO servers • Global Cache (RAC) contention • Most significant for “hot block” types of contention (cbc latch, buffer busy)
Reducing physical IO • Memory is used to cache data and to perform sorting and hashing (ORDER/GROUP BY, joins). • Oracle 10g manages allocations within PGA/SGA well enough • Determining the trade offs between these areas is the most important IO reduction method
A good buffer cache hit ratio doesn’t help much • A FTS can only generate so much IO • A disk sort can generate many times the IO required to read the data from disk • Some of this sort IO appears to be hidden from the wait interface
Advisories • Advisories exist that provide estimates of the workload impact if a memory area were a different size • In essence, Oracle is maintaining LRU chains that are longer than actual allocations • Key advisories: • V_$PGA_TARGET_ADVICE • V_$DB_CACHE_ADVICE • V_$SHARED_POOL_ADVICE • V_$SGA_TARGET_ADVICE (10g) • Unfortunately, Oracle suffered from “split-brain” during implementation: • PGA reports in bytes RW • DB cache reports in IOs saved • Shared pool reports in parse time • When comparing PGA and SGA advisories, you need to convert them to common units (wait time)
IO management • The demand for physical IO should be about right now • You’ve reduced the application logical IO demand • You’ve eliminated contention that might mask that demand • You’ve minimized the amount of logical IO than turns into physical IO by effective memory management • Now optimize the IO subsystem for the amount of physical IO you observe • Buy enough disks to meet IO demand and storage demand • Remember also that sparse disks are more efficient • Oracle’s SAME (Stripe and Mirror Everything) is a good default approach • Separating log/FRA and datafile IO is arguably the only split up you should have to make • Dedicated disks for sequential redo IO OR • Separate wide, fine-grained stripe for redo and FRA • ASM makes this very easy, but be careful: intolerant of poor underlying configurations
ASM makes it very easy • Only need to provision for total storage for all files and total IO demand • Balancing within the diskgroup is (partially) automatic • Can implement redundancy within the group • DBA gets more control – Sys admin involvement is minimal • BUT: • Still a version 1.0 technology • Simplistic algorithm does not work well if underlying disks are of different sizes/characteristics • Hard to map segments to spindles, especially if implemented on top of hardware storage appliance • Auto-rebalance has not been observed • ASM instance is a vulnerability