Data Guard Basics. Julian Dyke Independent Consultant. Web Version - February 2008. juliandyke.com. © 2008 Julian Dyke. Agenda. Data Guard The Theory The Reality. Data Guard The Theory. Data Guard Reasons for Deployment. Site Failures Power failure Air conditioning failure Flooding
Data Guard Basics
Julian Dyke, Independent Consultant
Web Version - February 2008
juliandyke.com
© 2008 Julian Dyke
Agenda
• Data Guard
  • The Theory
  • The Reality
Data Guard: Reasons for Deployment
• Site Failures
  • Power failure
  • Air conditioning failure
  • Flooding
  • Fire
  • Storm damage
  • Hurricane
  • Earthquake
  • Terrorism
  • Sabotage
  • Plane crash
• Planned Maintenance
• HUMAN ERROR
Data Guard: Standby Database
[Diagram: primary instance and primary database at Site 1; redo is shipped to the standby instance and standby database at Site 2]
Data Guard: Physical Standby
• Technology introduced in Oracle 7.2
• Marketed as Data Guard in Oracle 8.1.7 and above
• Standby is identical copy of primary database
• Redo changes
  • transported from primary to standby
  • applied on standby (Redo Apply)
• Can switch operations to standby
  • Planned (switchover / switchback)
  • Unplanned (failover)
• Failover time dependent on various factors
  • Rate of redo generation / size of redo logs
  • Redo transport / apply configuration
Data Guard: Logical Standby
• Introduced in Oracle 9.2
• Subset of database objects
• Redo copied from primary to standby
• Changes converted into logical change records (LCR)
• Logical change records applied on standby (SQL Apply)
• Standby database can be opened for updates
  • Can modify propagated objects
  • Can create new indexes for propagated objects
• May need larger system for logical standby
  • LCR apply can be less efficient than redo apply
  • Array updates on primary become single row updates on standby
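The last point on the logical standby slide can be illustrated with a small Python sketch. It only shows why SQL Apply may do more work than Redo Apply: one multi-row update on the primary is expanded into one change per row. The function name `expand_array_update` and the dictionary layout are hypothetical and do not reflect Oracle's actual LCR format.

```python
def expand_array_update(table, new_values, row_keys):
    """Expand one multi-row update into one change record per affected row.

    The dict layout is a hypothetical stand-in, not Oracle's LCR format.
    """
    return [
        {"table": table, "key": key, "set": dict(new_values)}
        for key in row_keys
    ]


# One statement on the primary that touched three rows...
changes = expand_array_update("EMP", {"SAL": 5000}, [101, 102, 103])

# ...is represented as three single-row changes for the standby to apply.
for change in changes:
    print(change)
```

The number of changes to apply grows with the number of rows touched, which is one reason a logical standby may need a larger system than the primary workload alone suggests.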
Data Guard: Protection Modes
• Three protection modes:
  • Maximum protection - zero data loss
    • Redo synchronously transported to standby database
    • Redo must be written to at least one standby before transactions on primary can be committed
    • Processing on primary is suspended if no standby is available
  • Maximum availability - minimal data loss
    • Similar to maximum protection mode
    • If no standby database is available processing continues on primary
  • Maximum performance (default)
    • Redo asynchronously shipped to standby database
    • If no standby database is available processing continues on primary
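The difference between the three modes comes down to what the primary does at commit time, with and without a reachable standby. The following Python sketch encodes only the rules stated on the slide; the names `ProtectionMode` and `commit_behaviour` are illustrative assumptions and not part of any Oracle interface.

```python
from enum import Enum


class ProtectionMode(Enum):
    MAXIMUM_PROTECTION = "maximum protection"
    MAXIMUM_AVAILABILITY = "maximum availability"
    MAXIMUM_PERFORMANCE = "maximum performance"


def commit_behaviour(mode, standby_available):
    """Describe what the primary does at commit, per the slide's rules."""
    if mode is ProtectionMode.MAXIMUM_PERFORMANCE:
        # Asynchronous shipping: the commit never waits for the standby.
        return "commit without waiting for standby"
    if standby_available:
        # Both synchronous modes wait for the standby to confirm the redo.
        return "commit after standby confirms redo"
    if mode is ProtectionMode.MAXIMUM_PROTECTION:
        # Zero data loss is preserved by stopping work on the primary.
        return "primary processing suspended"
    # Maximum availability keeps the primary running without a standby.
    return "commit without waiting for standby"


for mode in ProtectionMode:
    for available in (True, False):
        print(f"{mode.value:22} standby={available!s:5} -> "
              f"{commit_behaviour(mode, available)}")
```

The only case in which the primary stops is maximum protection with no standby available, which is the price of guaranteeing zero data loss.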
Data Guard: Redo Log Shipping
• ARCH background process
  • Copies completed redo log files to standby
• LGWR background process - modes are:
  • ASYNC - asynchronous
    • Oracle 10.1 and below
      • redo written by LGWR to dedicated area in SGA
      • read from SGA by LNSn background process
    • Oracle 10.2 and above
      • redo written by LGWR to local disk
      • read from disk by LNSn background process
  • SYNC - synchronous
    • Redo written to standby by LGWR - modes are:
      • AFFIRM - wait for confirmation redo written to disk
      • NOAFFIRM - do not wait
Data Guard: ARCH Redo Transmission
[Diagram: Primary Database with LGWR, online redo log, ARC0 / ARC1 / ARCn, archived redo logs, LOG_ARCHIVE_DEST_1 and LOG_ARCHIVE_DEST_2; Standby Database with RFS, standby redo log, archived redo logs and MRP/LSP]
Data Guard: LGWR (ASYNC) Redo Transmission
[Diagram: Primary Database with LGWR, online redo log, LNSn, ARCn, archived redo logs and LOG_ARCHIVE_DEST_1; Standby Database with RFS, standby redo log, ARCn, archived redo logs and MRP/LSP]
Data Guard: LGWR (SYNC) Redo Transmission
[Diagram: Primary Database with LGWR, online redo log, LNSn, ARCn, archived redo logs and LOG_ARCHIVE_DEST_1; Standby Database with RFS, standby redo log, ARCn, archived redo logs and MRP/LSP]
Data Guard: Role Transitions
• There are two types of role transition
  • Switchover
    • Planned transition to standby database
    • Original primary becomes new standby
    • Original standby becomes new primary
    • No data loss
    • Can switch back at any time
  • Failover
    • Unplanned failover to standby database
    • Original standby becomes new primary
    • Original primary may need to be rebuilt
    • Possible data loss
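The two role transitions can be modelled as state changes on a two-site configuration. This Python sketch is a toy model under that assumption; the `Configuration` class and the site names are hypothetical and stand in for whatever tooling actually performs the transition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Configuration:
    primary: str
    standby: Optional[str]
    rebuild_needed: bool = False
    data_loss_possible: bool = False


def switchover(cfg):
    """Planned transition: the two databases simply exchange roles."""
    return Configuration(primary=cfg.standby, standby=cfg.primary)


def failover(cfg):
    """Unplanned transition: the standby takes over and the old primary
    is left out of the configuration until it is rebuilt."""
    return Configuration(
        primary=cfg.standby,
        standby=None,
        rebuild_needed=True,
        data_loss_possible=True,
    )


before = Configuration(primary="site1", standby="site2")
print(switchover(before))
print(switchover(switchover(before)))  # switching back restores the original
print(failover(before))
```

A switchover is symmetric, so applying it twice returns the original configuration. A failover is one-way: the model leaves the configuration without a standby until the original primary is rebuilt or reinstated.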
Data Guard: Switchover
[Diagram: before switchover, the primary instance and database at Site 1 ship redo to the physical standby at Site 2; after switchover, the primary at Site 2 ships redo to the physical standby at Site 1]
Data Guard: Failover
[Diagram: before failover, the primary at Site 1 ships redo to the physical standby at Site 2; after failover, the primary database at Site 1 is unavailable and the database at Site 2 is the primary]
Data Guard: Read-Only Mode
• Physical standby database can be opened in read-only mode
  • (Managed) Recovery must be suspended
• Reports can use temporary tablespaces
  • Sorts
  • Temporary tables
• Reports cannot modify permanent objects
• Failover times may be affected
  • Suspended redo must be applied
Data Guard: Delayed Redo Application
• Delay in redo application can be configured
• Redo is transported immediately
  • Provides protection against site failure
• Redo is not applied immediately
  • Provides protection against human error
  • Increases potential failover times
• In Oracle 10.1 and above flashback database can be used as an alternative to delayed redo application
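The protection against human error comes from the gap between redo arriving on the standby and redo being applied. A minimal Python sketch of that window follows. It is a simplification that assumes the delay is measured from the minute each change arrives, and the function name and sample changes are hypothetical.

```python
def split_by_delay(received_redo, delay_minutes, now):
    """Split received redo into changes already applied and changes still held.

    received_redo is a list of (arrival_minute, description) pairs.
    A change becomes eligible for apply once the delay has elapsed.
    """
    applied = [c for t, c in received_redo if t + delay_minutes <= now]
    pending = [c for t, c in received_redo if t + delay_minutes > now]
    return applied, pending


redo = [
    (0, "insert orders"),
    (30, "update prices"),
    (100, "drop table customers (mistake)"),
]

# At minute 130 with a 60 minute delay, the mistaken drop has been
# transported to the standby but has not yet been applied there.
applied, pending = split_by_delay(redo, delay_minutes=60, now=130)
print("applied:", applied)
print("pending:", pending)
```

Everything in `pending` is safe on the standby site, so a site failure loses nothing, yet the mistake can still be stopped before it is applied. The same backlog is what lengthens a failover, because pending redo has to be applied first.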
Data Guard: Data Guard Broker
• Introduced in Oracle 9.2
• Stable in Oracle 10.2 and above
• Managed using DGMGRL utility
• Contains Data Guard configuration
• Additional layer of complexity
• Used by Enterprise Manager to manage standby
• Mandatory for some new functionality e.g.
  • Fast Start Failover
Data Guard: Fast Start Failover
[Diagram: primary database on Node 1 at Site 1, standby database on Node 2 at Site 2, observer at Site 3]
Data Guard: Fast Start Failover
• Detects failure of primary database
• Automatically fails over to nominated standby database
• Requirements include
  • Flashback logging must be configured
  • DGMGRL must be used
  • Observer process running in third independent site
    • Highly available in Oracle 11.1 and above
  • MAXIMUM AVAILABILITY protection mode
    • Standby database archive log destination must be configured as LGWR SYNC
  • MAXIMUM PERFORMANCE protection mode
    • Oracle 11.1 and above
• Primary database can potentially be reinstated automatically
  • Using flashback logs
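The requirements listed above lend themselves to a simple checklist. The Python sketch below encodes only the conditions named on the slide and is not a real Oracle or DGMGRL interface; the function name, parameter names and message wording are all hypothetical.

```python
def fast_start_failover_problems(flashback_logging, uses_dgmgrl,
                                 observer_running, protection_mode,
                                 transport_mode, oracle_version):
    """Return a list of unmet requirements (empty if none were found).

    oracle_version is a (major, minor) tuple such as (10, 2) or (11, 1).
    """
    problems = []
    if not flashback_logging:
        problems.append("flashback logging is not configured")
    if not uses_dgmgrl:
        problems.append("configuration is not managed with DGMGRL")
    if not observer_running:
        problems.append("no observer process is running")

    if protection_mode == "MAXIMUM AVAILABILITY":
        if transport_mode != "LGWR SYNC":
            problems.append("standby destination is not LGWR SYNC")
    elif protection_mode == "MAXIMUM PERFORMANCE":
        if oracle_version < (11, 1):
            problems.append("MAXIMUM PERFORMANCE needs Oracle 11.1 or above")
    else:
        # The slide lists only the two modes handled above.
        problems.append(f"protection mode {protection_mode} is not listed")
    return problems


print(fast_start_failover_problems(
    True, True, True, "MAXIMUM AVAILABILITY", "LGWR SYNC", (10, 2)))
print(fast_start_failover_problems(
    True, True, False, "MAXIMUM PERFORMANCE", "LGWR ASYNC", (10, 2)))
```

An empty list means none of the listed requirements is missing. It does not prove that a configuration is ready, since the slide says the requirements "include" these items rather than claiming the list is complete.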
Data Guard: Fast Start Failover
• Advantages
  • No interconnect network required between sites
  • No storage network required between sites
  • RAC licences not required if each site is a single-instance
• Disadvantages
  • Active / Passive
  • Requires Enterprise Edition licence
  • Remaining infrastructure must also failover
    • Network
    • Application tier
    • Clients
Data Guard: Oracle 11g New Features
• Snapshot Standby
  • Standby can be converted to snapshot standby
  • Can be opened in read-write mode (for testing)
  • Redo transport continues
  • Redo apply delayed
  • Standby can subsequently be converted back to physical standby
• Active Data Guard
  • Separately licensed option
  • Updates applied to primary
  • Changes can be read immediately on standby databases
  • Standby database can be opened in read-only mode
  • Redo can continue to be applied
Data Guard: Licensing
• Standby database nodes must be fully licensed
  • Same metric as primary (named user, CPU etc)
• Standard Edition
  • Cannot use Data Guard
  • Use user-defined scripts to transport redo
  • Use Automatic Recovery to apply redo
  • Manually resolve archive log gaps
• Enterprise Edition
  • Use Managed Recovery to apply redo
  • Use Fetch Archive Log (FAL) to resolve archive log gaps
  • Additional licenses required for Active Data Guard
Data Guard: Alternatives
• Standard Edition
  • Manual log shipping using scripts
• SAN level replication technologies
  • NetApp SnapMirror, MetroCluster
  • EMC SRDF, MirrorView
  • HP StorageWorks
• Redo log replication technologies
  • Quest SharePlex
Data Guard: The Reality
• Many sites run physical standbys
  • Well proven technology
  • Spare capacity on standby often used for development or testing during normal operations
• Relatively few sites run a logical standby
  • Streams is much more popular
• Many sites enable flashback logging
  • In both development and production environments
• Very few sites use automatic failover
• Very few sites working with Oracle 11g yet
  • Consequently none using Active Data Guard
Data Guard: The Reality
• Failover times
  • Normally dependent on management decisions
  • Usually some investigation before failover
  • Time to fail over database is minimal (5-10 minutes)
  • Time to fail over infrastructure can be hours
    • Network configuration
    • DNS
    • Application / web servers
    • Clients
  • Failover SLAs often up to 48 hours
• Rebuild times
  • Can take minutes using flashback logging
  • Can take much longer depending on reason for failover
Thank you for your interest
• References
  • http://www.juliandyke.com/References/References.html
• Questions
  • info@juliandyke.com