510 likes | 759 Views
Redo and Archiving. Objectives. After completing this lesson, you should be able to: Explain how to measure redo resource contention List the causes of redo contention Detail the LGWR algorithm and tuning concepts Identify redo-less operations and their likely effect on performance
E N D
Objectives • After completing this lesson, you should be able to: • Explain how to measure redo resource contention • List the causes of redo contention • Detail the LGWR algorithm and tuning concepts • Identify redo-less operations and their likely effect on performance • Describe the ARCH process and best archiving practices
Redo Architecture Concepts • Physiological logging: • Page actions: Change Vectors • Minitransactions: Redo Records • Page fix rule • Write ahead log • Log force at commit • Logical ordering of redo
Logging Methods TABLE A INDEX B INDEX C
Logging Methods Logical Log insert operation delete operation Physical Log After Images C B A A' B' C' Physiological Log dba A, delete op, insert op dba B, delete op, insert op dba C, delete op, insert op
Server DBWR DBWn CKPT Oracle9i Architecture Diagram SMON PMON System Global Area Redo Log Buffer ControlFiles Buffer Cache ARCn LGWR RedoLogs ArchivedRedo Logs Database Files
Redo Logging in the Oracle Server • Complex actions are structured as a sequence of minitransactions, called redo records: • Constitute an atomic change to the database state • Are used to move the database state forward • Redo records are structured as a sequence of page actions, called change vectors: • Change one Oracle block • Are used to move the block consistency forward
Page Fix Rule • The page fix rule guarantees that: • The buffer cache location of a data block is locked before reading or modifying the block • Redo and undo change vectors are generated before modifying the block • Consistency and isolation of the buffer are maintained
Write-Ahead Logging • Write-ahead logging guarantees that: • Before a block buffer is changed, in the buffer cache, redo must have been written to the log buffer • Before DBWR writes a data block buffer, LGWR must have written the necessary redo records to the log files.
Log Force at Commit • A transaction commit: • Signals the completion of the transaction • Posts LGWR to write the contents of log buffer to disk • Waits for LGWR to complete the write • Does not force a data block write • Group commit: When several transactions commit simultaneously, only one post to LGWR is sufficient to flush all the contents of the log buffer.
Logical Ordering of Redo • Logical ordering of redo is achieved through the implementation of system change numbers (SCNs) • Each database has one global SCN generator. • This sequence acts as an internal clock identifying the committed version of a database. • An SCN is assigned to a transaction when it commits and stored with each redo record
Redo Log • A redo log is an ordered list of redo records. • Redo logs are normally mirrored, forming a redo logs group. • Each individual log in a group is called a member. • The database server requires a minimum of two redo log groups. • Redo log groups are used in a circular fashion. • Each instance creates its own sequence of redo logs. This is known as a thread of redo.
Redo Byte Address • Pointer to a specific location within a redo thread • Recorded at the beginning of each redo record • Made of three components: • Log sequence number • Block number within the redo log • Byte offset within the block REDO RECORD - Thread:1 RBA: 0x000065.00000002.0010 LEN: 0x02f0 VLD: 0x01 SCN scn: 0x0000.0000e4d3 05/12/02 15:19:54
Redo Log Buffer Buffer cache Shared pool Redo log buffer SGA DBWR LGWR Redo log files Data files
Redo Log Buffer • The redo log buffer is a memory area for temporary storage of redo records • The total size is determined by the LOG_BUFFER parameter Subdivided into several pieces that are used in a circular fashion • The size of each piece is an operating system block (usually defined as 512 bytes). • Each piece maps to an on-disk block of the current redo log.
Redo Generation • The server process: • Pins a data block buffer in Exclusive mode • Builds change vectors and bundles them into a redo record in the PGA • Determines the space that is required in the log buffer • Allocates that space in the log buffer • Writes the redo record into the log buffer • Changes the data block in the buffer cache
Redo Generation Release redoallocation latch Is there space available? Calculate redo size Get redo allocation latch N Y Release redo copy latch Increment statistics Adjust remaining free space Increment redo buffer allocation retries Determine SCN Release redo allocation latch Get redo copy latch Get redo writing latch Copy change vectors to log buffer Y Return true Release redo copy latch Has space become available? Release redo writing latch N Y Release redo writing latch Is there space in the current log file? Wait for log buffer space Post LGWR to perform I/O N Return false
Writes to Redo Log Files • LGWR writes the contents of the buffer to disk: • If posted by foreground processes: • Because redo log buffer space is not available • Because the transaction is committed • If the log buffer is 1/3 full • If 1 MB worth of redo records has been logged • If the thread is closed • At a log switch (user-initiated or normal) • At a three-second LGWR inactivity timeout
LGWR Algorithm • Acquire the redo writing and redo allocation latches. • Determine the buffers to write out. • Release the redo allocation latch. • Determine how many writes are required. B2 A B1 Log Buffer Request A: Single write Request B: Two writes
LGWR Algorithm • Calculate the target RBA for advancing the incremental checkpoint. • Release the redo writing latch. • Ensure that all foregrounds have completed modifying the redo buffers that need to be written. • Update redo block headers in the log buffer, including checksums if necessary. • Write the blocks to disk.
Redo Wait Events • There are 12 wait events that are directly related to redo. • Under normal operation, only a few of these events are waited on: • Log file parallel write • Log file sync • Log file switch • In a well-tuned system, redo-related waits should be none or minimal.
Redo Statistics • There are 16 redo statistics: • Most of them (14) are defined in kcrfh.h. • The others are defined in kcb.h and are related to the buffer cache and log file synchronization. • Statistics are calculated in these kcr functions: • kcrfwr(): Write redo into the log buffer • kcrfws(): Wait for space (log switch) • kcrfwi(): Write redo into the log file
Redo Latches • There are three types of redo latches: • Redo copy • Redo allocation • Redo writing • There can be multiple redo copy latches. • There is only one redo allocation and one redo writing latch.
Optimizing the Redo Buffer • Log buffer contention is not generally a major problem: • It is easily identified by checking the statistic redo buffer allocation retries. • Ideally, there should be no waits for redo log space. • Two ways to reduce space requests: • Reduce the amount of redo that is generated. • Improve the efficiency of LGWR.
Optimizing the Redo Buffer • The parameter LOG_BUFFER defines the size of redo log buffer. • Increase this parameter if there are no I/O bottlenecks on the redo disks. • A value of 128K is reasonable for most systems. • Striping redo log files may help LGWR to flush the log buffer faster.
Tuning Redo Latch Contention • If waits for log file sync have the greatest impact: • If waits for LGWR wait for redo copy are also high, then probably there are too many copy latches. • If not, then the I/O is too slow. • If waits for latch free have the greatest impact: • Contention for redo copy latch: Increase the number of copy latches • Contention for redo allocation latch: Consider redo-less operations • Contention for redo writing latch: Increase the size of the log buffer or reduce _LOG_IO_SIZE
Redo-less Operations • Certain operations can be redo-less: • Direct loader and direct path INSERT • CREATE TABLE asSELECT (CTAS) • ALTER TABLE ..MOVE • CREATE INDEX • ALTER INDEX REBUILD / SPLIT PARTITION • All partition operations that involve data movement • Redo-less operation is initiated by the NOLOGGING attribute. • The object cannot be recovered if it is created with NOLOGGING. • Standby database must be refreshed after NOLOGGING.
NOLOGGING Performance Time (Seconds) Redo Generated (KB) 100000 250 22807 20731 201 200 10000 5004 150 1000 104 77 100 100 30 (Logarithmic Scale) 64 13 39 44 10 50 35 0 1 CTAS CREATE INDEX Direct INSERT CTAS CREATE INDEX Direct INSERT NOLOGGING LOGGING
Redo with NOLOGGING • A redo record is still written for NOLOGGING operations. • For each write there is an invalidation record. • Each record covers several blocks (range). • When this redo is applied, the range of blocks is marked soft-corrupt.
Archiving • Archivelog mode tracks redo logs that must be archived. • Online redo logs must be archived before being overwritten. • Database flag changes to media recovery enabled. • ARCH process is normally started to perform automatic archiving. • User processes can also archive logs if they are instructed to do so.
Events that Post the Archiver • When archiving is manually enabled (ARCHIVE LOG START command) • Log switch • Timeouts every 300 seconds (5 minutes) • During instance recovery
Redologfile Archivelogfile ARCH Process Flow • Read online redo log asynchronously • Fill archive log buffers • Write to archive log file asynchronously Archive log buffer Async write Async read Archive log buffer Async read Async write
Archiver Operations • Performs error recovery • Switches to another member if errors occur • If all members are bad, then returns an error • Performs load balancing • Reads each buffer from a different member • Performs redo log validation • Ensures validation regardless of any initialization parameter • Checks log file header structure for errors • Enables log block checksums by setting DB_BLOCK_CHECKSUM to true
Tuning ARCH • Wait event log file switch (archiving needed) • Ensure optimal file configuration: • Optimize throughput for LGWR and ARCH. • Maintain resilience to media failures. • Use asynchronous I/O: • Operating system asynchronous I/O support • Oracle asynchronous I/O parameters • Evaluate size and number of redo logs.
Tuning ARCH • Establish a strategy for temporary peaks: • Reduce the number of archive destinations. • Use multiple ARCH processes. • Tune the archiving process by adjusting: • _LOG_ARCHIVE_BUFFER_SIZE • _LOG_ARCHIVE_BUFFERS
Multiple Archive Log Processes • Support multiple archive locations • Increase archiving throughput • Reduce the need to perform manual archives • Are controlled by the new dynamic parameter: LOG_ARCHIVE_MAX_PROCESSES • Are started if LOG_ARCHIVE_START=TRUE and automatic archiving is enabled • Default value is 1
Archive Problem Solving • Obtain a thorough description of the problem. • Perform debugging with archive commands. • Check v$ views. • Check any archive trace files. • Determine what ARCH is calling. • Check for platform-specific issues. • Find a solution to prevent the problem occurring.
Summary • In this lesson, you should have learned about: • Causes of performance problems that are related to redo record generation • Internal implementation of LGWR and the redo log buffer • Tuning redo generation • Internal implementation of ARCH • Tuning archiving
References • WebIV Note: 73163.1 • Source: kcrfw.c,kcrf.h,kcrfh.h,kcrr.h,kcrr.c