260 likes | 475 Views
10G - New Manageability Features. Presented by Lenka Vanek lenka.vanek@quest.com. Oracle 10g – Manageability. Active Session History (ASH) Contains recent session activity Automatic Workload Repository (AWR) = Infrastructure -> Central element – provides services
E N D
10G - New Manageability Features Presented by Lenka Vanek lenka.vanek@quest.com
Oracle 10g – Manageability • Active Session History (ASH) • Contains recent session activity • Automatic Workload Repository (AWR) • = Infrastructure -> Central element – provides services • Automatic Database Diagnostic Monitoring (ADDM) • Generate advice based on AWR data • Server Generated Alerts - metrics computation and and threshold validation done by Oracle database 10G directly
ASH =v$ session +History • Contains recent session activity • History of v$session_wait … records what is session waiting for • Every Second • Inactive sessions not sampled. • Design – rolling buffer in memory • Size - between 1M & 128M (avg. sample record 600 bytes) • --- Algorithm used to estimate ASH buffers sizememory_quota = max(2% of sga_target, 5% of shared_pool_size); /* sga_target = 0 when AUTO SGA is OFF */cpu_quota = 2MB * (# of CPUs);ash_size = min( cpu_quota, memory_quota );ash_size = max( 1MB, ash_size); /* atleast 1MB */ash_size = min( 128MB, ash_size); /* atmost 128MB */
ASH • Provider for ADDM • v$session and v$session_wait join eliminated – • Prior 10G - sessions experiencing waits were generally located by joining the v$session_wait view with the v$session view. • 10G - offers query simplification. All the wait event columns from v$session_wait have been added to v$session. • x$ash • V$active_session_history - contains one row for each active session per sample • DBA_HIST_ACTIVE_SESS_HISTORY - contains historical data • the greater the system activity, the smaller the number of seconds of session activity that can be stored in the circular buffer • Flushed every 30 minutes or when buffer is full • MMON every 30 minutes and by MMNL (Memory Monitor Light) whenever the buffer is full • wrh$active_session_history
ASH Statistics ASH v$session SGA Rolling Buffer Recent History 30 min is just goal V$ACTIVE_SESSION_HISTORY MMON MMNL AWR Snapshots
ASH - Limitations • Query of v$active_session_history needs a session • Query of v$active_session_history requires all relevant latches in SQL layer • If system is crippled ASH will impose more overhead on these latches
AWR – Automatic Workload Repository • AWR is Infrastructure • Collects, maintains and utilizes statistics • Two major parts: • In-memory statistics – fixed views- V$ • WR Schema, Snapshots – persistent portion for historical analysis. • SYSAUX tablespace - occupies 63.7% of space • Process MMON – memory monitor – disk transfer, snapshots purging, retention period
AWR SQL*Plus BG ….. SGA In memory statistics Collections: Time model, wait classes, OS stats, Metrics, SQL Stats Object Stats ASH BG AWR Snapshots SYSAUX 7 days - Default V$ DBA_% MMON FG FG v$sysstat v$sql v$segment_statistics v$sys_time_model v$osstat v$event_name ……… ADDM
AWR = STATSPACK ++ • Foundation for all of the other self-tuning features. • Runs every 30 min • Provides data for • ADDM • Alerts • Advisors • Cost Based Optimizer • End-to-end tracing • Automatically installed, populated, purged for 10G only • Default retention – 7 days. This can be changed.
AWR and Snapshots • Stores information in form of Snapshots (similar to statspack snapshots, but more precise) • Snapshot = set of data captured at a certain time • Each time a snapshot is taken, the ADDM is triggered to do an analysis of the period corresponding to the last two snapshots Snapshot can be taken manually BEGINDBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();END; /
AWR – Base Statistics and Metrics • Base statistics – raw data • Metrics – secondary, derived from base statistics • Updated by MMON • Example: avg number of physical reads per sec in the system in the last 30 minutes • Tract the rates of change • Indicators of DB performance • Deltas of Stats and Events over 15 and 60 seconds • Max, min, avg., standard deviation over 30 min • 10 minutes for File IO • 30 minutes for SQL – metric values are constantly increasing in x$ until 30 minute snapshot when it is externalized into dba_hist_sqlstat
AWR - Metrics • New Views • v$sysmetric ….not a 1 to 1 map of v$sysstat, there are some new values • v$sessmetric • v$metricname • v$filemetric, v$waitclassmetric, v$eventmetric, …… • 10G Supports metrics for • System • Session • File • Wait-event statistics • The wait event model - steadily gaining ground as a good tuning tool • At any given moment an Oracle process is either busy servicing a request or waiting for something to happen
Wait Event Enhancements • Formed new wait events classes … before too many individual events • Changes to v$event_name - CLASS# and CLASS columns are added. These columns help to group related events while analyzing the wait issues. • Example - list the events related to IO, • Understanding the overall health of the database. • New columns in the v$session and v$session_wait views that track the resources sessions are waiting for. • Histograms of wait durations • Assist in determining whether a wait event is a frequent problem that needs addressing or a unique event.
New Views • v$system_wait_class – the instance-wide time totals for the number of waits and the time spent in each class of wait events. Understanding the overall health of the database • v$session_wait_class - the number of waits and the time spent in each class of wait event on a per session basis. • v$event_histogram – a histogram of the number of waits, the maximum wait, and total wait time on a per-child cursor basis. • v$file_histogram – a histogram of all single block reads on a per-file basis. Determine if the bottleneck is a regular or a unique problem. • v$temp_histogram – a histogram of all single block reads on a per-tempfile basis. • v$session_wait_history – This view displays the last 10 wait events for each active session.
Wait Classes Overview • Administrative (39) • switch logfile • rebuild index • Application (11) • enqueues • sqlnet break/reset • Cluster (113) • Commit(1) • Log file Sync • Concurrency (12) • Latches: cbc, lbc, • Lib cache locks • Buffer busy wait • Configuration(20) • log file size • Enqueues: ST, HW, ITL • Latch: redo copy,shared pool • Idle(56) • Network(25) • System I/O(19) • Scheduler(6) • User I/O(12) • Other(485)
AWR and STATISTICS • Oracle10g to be either robust or simple • You can control the set of statistics to capture by using STATISTICS_LEVEL parameter. • If STATISTICS_LEVEL is set to: • BASIC: The computation of AWR statistics and all self-tuning capabilities are turned off. • TYPICAL: Only part of the statistics are collected. They represent what is typically needed to monitor the Oracle server behavior. DEFAULT. • ALL: All possible statistics are captured.
AWR Limitations • Only for 10G • Each DB has its own AWR repository • Cannot provide cross instance analysis • Overhead on production box • PL/SQL interface – cannot communicate if box is down • STATSPACK cannot be migrated into 10G • User cannot modify AWR Schema
Automatic Database Diagnostic Monitor • Performance Diagnostics within DB • Not a monitor – perform analysis – an Advisor • Analyzes the AWR data, much the same as a human DBA would analyze a STATSPACK report • Utilizes AWR and publishes report - every hour • Identify problems - ADDM searches for lock-and-latch contention, file I/O bottlenecks and SGA shortages, etc… • Proposes solutions – relies on Advisor for solution
ADDM • Where is the time spent? • What did DB do? • Elimination method – where is NOT your problem – TREE Structure • Uses a tree structure to represent all possible tuning issues • The tree is based on the new wait and time model statistics • Root of this tree represents the symptoms, and going down to the leaves • If time-based threshold is not exceeded for a particular node, ADDM prunes the corresponding sub-tree
Limitation of ADDM • Same limitation as AWR – only for 10G • Works of a set of rules and these are not external of DB. Based out of fixed thresholds. • Depends on ASH for analysis
Advisory Framework • Advisors are server components - provide useful feedback about resource utilization and performance • Automatic Database Diagnostic Monitor (ADDM) = An advisor for the database instance • ADDM can call • SQL Tuning Advisor - tuning advice for a SQL statement • SQL Access Advisor - determines optimal ways to access data • Space Advisor • Segment Advisor: Responsible for space issues regarding a database object. It analyzes the growth trends. • Undo Advisor: Suggests parameter values and the amount of additional space that is needed to support flashback for a specified time
Advisory Framework • Memory Advisor – MMAN PGA Advisor: recommends optimal usage of PGA memory based on your workload. SGA Advisor: tuning and recommending SGA size depending on pattern of access for the various components within the SGA: Buffer Cache Advisor: Predicts cache hit rates for buffer access for different sizes of the buffer cache. Library Cache Advisor: Predicts the cursor cache hit rate for the library cache for different sizes. • AWR = Common Data Source
10G Server Generated Alerts • Threshold and non-threshold alerts • Tablespace full • Snap-shot too old, Recovery Area Low On Free Space, Resumable Session Suspended • Notification by page/email/PDA • Push not pull for efficiency – • Advanced Queue • Server pushes alerts • Predefined persistent queue ALERT_QUE owned by SYS • Alerts persist to AWR, review historically • purged according to Workload Repository snapshot purging policy
10G Server Generated Alerts • Most Oracle server-generated alerts are configured by setting two threshold values on database metrics: • Warning threshold – 85% • Critical threshold – 97% • Only space-related alerts have thresholds defined by default – Tablespace Space usage • There are 161 metrics for which you can define thresholds
Server Alerts Limitations • Concept great – implementation has a lot to be desired • Only 4 alerts out of the box • Rest needs to be set up by user • Avoid false peaks – need to set # of occurrences. This is very difficult in production- How long should I wait?