660 likes | 1.08k Views
The Buffer Cache. Objectives. At the end of this section, you should be able to: Describe the architecture of the buffer cache Identify the algorithms that govern the operations in the buffer cache Describe wait events and statistics that are relevant for tuning the buffer cache.
E N D
Objectives • At the end of this section, you should be able to: • Describe the architecture of the buffer cache • Identify the algorithms that govern the operations in the buffer cache • Describe wait events and statistics that are relevant for tuning the buffer cache
The Kernel Buffer Cache • The kernel buffer cache is a means of reducing the • amount of disk I/O. • An Oracle database resides on disk. • Its data must be cached to enable optimized access for frequently accessed data, while allowing concurrent read/write access.
Buffer Cache • Provides fast access to commonly accessed database blocks (or pages) • Provides discipline for processing concurrent block change requests • Keeps consistent-read versions of database blocks (multiversion consistency) • Maintains cached buffers through various internal structures
Kernel Layout Oracle Call Interface OCI User Program Interface UPI Oracle Program Interface OPI Kernel Compilation Layer KK Kernel Execution Layer KX Kernel Distributed Execution Layer K2 Network Program Interface NPI Kernel Security Layer KZ Kernel Query Layer KQ Recursive Program Interface RPI Kernel Access Layer KA Kernel Data Layer KD Kernel Transaction Layer KT Kernel Cache Layer KC Kernel Services Layer KS Kernel Lock Management Layer KJ Kernel Generic Layer KG Operating System Dependencies S
Source Code • Buffer cache code is defined in the kcb layer. • kcb.c Get, change, and release buffers • kcbb.c Functions of the DBWn • kcbk.c Checkpoint queue manipulation • kcbl.c Direct I/O routines • kcbt.c Static routines (block dumps) • kcbw.c Working set management • kcbz.c Buffer handling (waits and latches)
Basic Terminology • Hash buckets are structures that maintain lists of data buffer headers, grouped by relative dba and class number. • A hash chain is a list of data buffer headers in one hash bucket. • LRU describes a mechanism that is used to determine which buffers should be used when searching for a free buffer.
Hash Buckets and Chains • Default number of hash buckets: (DB_BLOCK_BUFFERS x 2) • The parameter _DB_BLOCK_HASH_BUCKETS can be used to overwrite the default. • Each hash bucket contains a chain of buffer headers for a set of <RDBA, class> pairs. • Each hash bucket has a cache buffers chains latch to protect concurrent access to the chain.
Overview of Buffer Cache Hash bucket LRUW LRU Buffer memory area Buffer header . . . . . . . . . . . . . . . . . . . . . . . . . . . Buffer descriptor Buffer state object
TSN AFN RDBA Obj # Class CR env Mode Buffer Header (kcbbh) LRU chain kcbbh kcbbh Hash chain pid 4 pid 9 Users list Hash bucket pid 7 Waiters list Flags Buffer Cache Buffer address Buffer state Lock element CKPTQ and FQ RBAs (low, high and recovery) Change State foq = 1 : on write list = 2 : on ping list Working set description FOQ (flag on queue) Stats (touch count and time)
Multiple Buffer Pools • Three buffer pools • Default • Keep • Recycle • Usage • Move the large segment into a recycle cache • Move small warm segments into a keep cache • To view segment-level statistics (Oracle9i,release 2): • V$SEGSTAT_NAME • V$SEGSTAT • V$SEGMENT_STATISTICS
Buffer Pool (kcbwbpd) • Cache is structured into pools based on the longevity and usage of the data that is loaded in it. Buffer Cache Default Recycle Keep • Keep: For frequently updated small tables • Recycle: For randomly accessed large tables • Default : Buffers that are not allocated to the above two
LRU • Fundamentally, the LRU comprises two lists: • LRUW (LRU write list, also called the “dirty list”), maintaining current (dirty) buffers • LRU (least recently used list), maintaining the remaining buffers • A buffer can only be cached on either the LRUW list or the LRU list, not on both.
Multiple LRU lists • Controlled by the initialization parameter DB_BLOCK_LRU_LATCHES • Only useful on multi-CPU machines • Each LRU list is protected by a cache buffers lru chain latch
LRU Lists • Each LRU pair (LRU/LRUW) is called a working set. • Buffers are assigned to working sets in a round-robin fashion. • Processes try all latches until they can acquire an LRU latch. • In the case of multiple DBWR processes, each DBWR will manage its own LRU sets. • Multiple DBWRs are implemented with the parameter DB_WRITER_PROCESSES. In Oracle9i, Release 2, KCBMAXDBWRS, which defines max value for DB_WRITER_PROCESSES has been increased to 20.
Working Sets (kcbwds) • The default is the number of DBWRs x Maximum number of buffer pools (8). • These latches can be distributed among the working sets in each pool. Default Keep Recycle WS1 … WSn WSm … WSs WSt … WSz
Working Sets List Name Type of buffers on list LRU Replacement list LRU-W Write list; old dirty buffers LRU-P Ping list; RAC only LRU-XO Reuse object list; buffers to be written for drop/truncate LRU-XR Reuse range list; buffers to be written for reuse block range Thread CKPT queue Thread checkpoint buffers File CKPT queue Checkpoint buffers queued by file# Recovery CKPT queue Buffers for performing recovery
Buffer Cache Users • Data block reader: A foreground Oracle process that reads data blocks into buffers and makes changes to the blocks • Data block writer: A background Oracle process (DBWn) that manages the writing of modified (dirty) blocks to disk
Buffer Get • Procedure to obtain a buffer: • Get a “buffer descriptor” which includes the RDBA and the class of the desired block. • This action is implemented in the function kcbget(descriptor,lock_mode) • kcbget takes two arguments, the buffer descriptor and a lock mode: • Enforced by the cache layer through a handle or state object • Becomes part of the process state tree
Buffer Gets • Buffer Modes: • KCBMNEW : New buffer for exclusive access • KCBMSHR : Current buffer for shared access • KCBMEXL : Current buffer for exclusive access • KCBMCR : CR buffer for shared access • KCBMCRX : Variant of CR mode • KCBMNULL : Used to keep a reference to the buffer • Mode Compatibilities: Requested Mode NEWSHREXLCRCRXNULL NULL y y y y y y Held Mode SHR n y n y y n EXL n n n n n n
Getting Current Buffers • Scan the appropriate hash chain. • Find a usable buffer in the chain or read from disk. • Attach a state object to the buffer header. • Return pointer to the data area. Hash Chain 30 CURR 20 CURR 30 CR U W U W U W PID 4 PID 7
Buffer Management • Buffers are initially hashed to LRU-AUX list. • Buffers that are candidates for reuse • Buffers that are scanned by DBWR at the tail of LRU and proven to be clean • Foregrounds always start scanning LRU-AUX. • Allocated free buffers are moved to the middle of the MAIN list. • DBWR scans LRU-MAIN and moves dirty buffers to LRUW-MAIN.
Buffer Management • LRU-XR, LRU-XO and LRU-P are also called “write” lists. • Buffers are linked to these due to a specific write action. • These lists are candidates for immediate write-outs by the DBWR. • Enable write prioritization capabilities
Touch Count • Dump the buffer header to see touches per buffer: • Or query X$BH: SQL> alter session set events 2 'immediate trace name BUFFERS level 5'; BH #120 (0x80395bcc) file#: 4 rdba: 0x01000aea (4/2794) class 1 ba: 0x822ca000 set: 1, dbwrid: 0 hash: [803e841c,803e841c], lru: [803949c4,80395e7c] LRU flags: hot_buffer ckptq: [NULL] fileq: [NULL] st: XCURRENT, md: NULL, rsop: 0x0, tch: 100Touch count L:[0x0.0.0] H:[0x0.0.0] R:[0x0.0.0] buffer tsn: 3 rdba: 0x01000aea (4/2794) ... SQL> SELECT TCH FROM SYS.X$BH WHERE FILE#=4 AND DBABLK=2794; TCH ------------- 100
Write Priority • Three write priorities are defined. • Each write type is associated with a priority. Type of Write Priority Ping High Thread CKPT High Instance recovery checkpoints High Media recovery checkpoints Medium Incremental checkpoints Medium Cold dirty buffers (aging) Medium User initiated checkpoints Low Tablespace checkpoints Low
DBWR and Checkpoints • Working sets and DBWR • Dirty buffer management • Checkpoint services
Working Sets and DBWR • DBWR processes work on specific working sets that are assigned to them in a round-robin fashion. Working Sets s1 s4 s7 s10 DBW1 DBW2 DBW3
DBWR Invocation • Performs writes for several reasons: • Make free requests • Checkpoints • Ping writes • Cleanout of cold dirty buffers • Before Oracle8i, DBWR was initiated by timeouts and foreground postings, and each case was handled differently. • In Oracle8iand Oracle9i, all requests are managed by a single action handler.
DBWR: Make Clean Buffers • DBWR gets posted by foregrounds to write out dirty buffers. • DBWR scan depth: When the number of known clean buffers in a set is less than 50% of DBWR scan depth for that set • Foreground scan depth: If no clean buffers were found within foreground scan depth limit while scanning for a usable buffer • If the number of dirty buffers in the write list is greater than _DB_LARGE_DIRTY_QUEUE value
DBWR: Make Clean Buffers DBWR accumulates a batch worth of dirty buffers from all sets and issues a write Inc CKPT buffers Set 1 Copy the dirty buffers to write batch structure Set 2 Write to disk Set 3 Write Batch
Checkpoints • To ensure that the data blocks that have their redo generated up to a certain point in the redo log (RBA) are written to the disk • Checkpoint structure includes: • Checkpoint SCN • Checkpoint RBA • Thread that allocated the checkpoint • Enabled thread bitmap • Timestamp
DBWR: LGWR Invocation • Done at checkpoint time to perform batch writes • LGWR hands the checkpoint RBA to DBWR • DBWR then writes all buffers in the checkpoint queue (CKPTQ) whose RBA <= checkpoint RBA
Checkpoint and File Queues • Improve the performance of • Thread checkpoints • Tablespace checkpoints • Make the checkpoint algorithm scalable to large caches
Checkpoint and File Queues • Pre-Oracle8 DBWR scanned the entire cache to find buffers with checkpoint bit set. • CKPTQ and FQs eliminate this scan. • When the buffer is first modified, it is inserted into the CKPTQ in RBA order • The buffer is also inserted into the appropriate FQ • When a checkpoint is initiated, DBWR writes all buffers on the queue until the checkpoint RBA is less than the head of the CKPTQ RBA.
Checkpoint Queues • Checkpoint queues contain the dirty buffers that are ordered by the low RBA. • There is one file queue per data file containing: • Dirty buffers belonging to that file • Delayed logged buffers Buffer with highest low RBA kcbbh Working Set Descriptor kcbbh CKPTQ FILEQ1 FILEQ2 FILEQ3 FILEQn CKPT queue latch
Checkpoint Queue Management • Linking buffer to checkpoint queue: • Whenever the low RBA is set for a buffer, the buffer is added to tail of the CKPTQ and appropriate FQ. • Removal of buffer from checkpoint queue: • The low RBA is cleared when the buffer is added to write batch. • When the buffer is written to disk, it is unlinked from the FQ and CKPTQ. • Write of hot buffers from CKPTQ is deferred.
DBWR: Flush DBA Range • DBWR purges dirty buffers whose DBAs are within the low and high DBA range for each file in the tablespace, when issuing: • The files in the tablespace are handled iteratively. • These buffers are moved to LRU-XR. SQL> alter tablespace … begin backup;
DBWR: Flush Invalidated Range • DBWR purges dirty buffers whose DBAs are within the range pertaining to a dropped object. • It then invalidates all dirty (whether CURRENT or CR) blocks for reuse. • These buffers are moved to LRU-XO. • This mechanism is essential for recovery.
Buffer Cache Events • There are 17 events defined in the buffer cache layer in Oracle9i Release 2. • There are 13 events defined in the buffer cache layer. • Only a few are important: • Free buffer waits • Buffer busy waits • Write complete waits • The others are rarely an issue.
Buffer Cache Statistics • Buffer cache statistics are defined in kcb.h. • The statistics can be classified into three groups: • User statistics • Background (DBWR) statistics • RAC-related statistics • Description of the statistics can be found in Oracle9i Database Reference: Appendix C.
DBWR Statistics • DBWR statistics are calculated when the DBWR writes out buffers and scans the LRU lists. • Most important: • DBWR timeouts • "Make free" requests • Checkpoints • DBWR is a self-tuning process. • Behavioral changes based on how often it is posted and the percentage of clean cache relative to the total cache
Buffer Cache Latches • Buffer cache latches are used to protect: • Buffer chains • Buffer handles • LRU lists • Nine different latches protect the buffer cache. • Four of them have child latches: • Cache buffers lru chain • Cache buffers chains • Checkpoint queue latch • Buffer pool
Buffer Cache Fixed Tables • Significant fixed tables: • X$KCBWAIT*: Wait statistics by block class • X$KCBFWAIT : Wait statistics by file ID • X$KCBWBPD*: Buffer pool descriptors • X$KCBBHS : DBWR Histogram statistics • * These tables have V$ views built on them.
Tuning the Buffer Cache • Tuning the buffer cache also involves tuning I/O • and DBWR. • Start with a good buffer cache hit ratio (rule of thumb: 90%). • Eliminate the use of buffer cache whenever possible. • Use direct I/O (direct loader, direct insert). • SORT_MULTIBLOCK_READ_COUNT
Tuning the Buffer Cache • There are two primary events contributing to the greatest amount of wait time and performance bottlenecks: • Buffer busy waits • Free buffer waits • Always tune the event with the highest wait time first.