ORACLE 10g DATAGUARD
Ritesh Chhajer, Sr. Oracle DBA
Agenda
  - Physical vs. Logical Standby
  - Standby Protection Modes
  - Log Transport Attributes
  - Standby Redo Logs
  - Setup Physical Standby step-by-step
  - Managing and Monitoring standby
  - Role Transition: Switchover/Failover
Overview
Purpose
  To provide an efficient disaster recovery solution by maintaining transactionally consistent copies of the production database at a remote site.
Physical Standby
  - Kept in sync with the primary by using media recovery to apply the redo generated on the primary.
  - Used for BCP (business continuity planning).
  - Can be opened in read-only mode, but redo is not applied while it is open.
Logical Standby
  - Kept in sync with the primary by transforming the redo data received from the primary into SQL statements and executing them against the standby database.
  - Used to offload reporting from the primary database.
  - Can be open for read-only access while the changes are being applied.
Protection Modes
Decide on the standby protection mode before setting it up.
1. MAXIMUM PROTECTION
  Pre-requisites
  - Redo must be transported synchronously, using LGWR SYNC AFFIRM.
  - Standby redo logs (SRLs) need to be created on the standby site.
  - At least one standby must be available for the primary database to function.
  - A high-speed network is needed.
  Pros
  - Zero data loss.
  Cons
  - The primary shuts down if, because of network issues, it cannot write the redo to a standby at commit time.
Protection Modes
2. MAXIMUM AVAILABILITY
  Pre-requisites
  - Redo must be transported synchronously, using LGWR SYNC AFFIRM.
  - Standby redo logs (SRLs) need to be created on the standby site.
  Features
  - If there are network issues, the primary temporarily behaves as in maximum performance mode; when the fault is corrected it returns to maximum availability.
  - Data is lost only if the primary loses its redo logs.
  SQL> alter database set standby database to maximize availability;
3. MAXIMUM PERFORMANCE
  - Asynchronous redo shipping using ARCH or LGWR ASYNC.
  - No impact on the primary's performance, even if there are network issues.
  - No need to create SRLs unless real-time apply is needed on the standby site.
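The prerequisites of the three protection modes can be condensed into a small lookup. The following is only a sketch of that summary in Python; the names REQUIREMENTS and missing_prerequisites are made up for illustration and are not part of any Oracle tool.

```python
# Sketch: transport attributes and SRL requirement per protection mode,
# as summarised on the slides. All names here are illustrative assumptions.
REQUIREMENTS = {
    "MAXIMUM PROTECTION":   {"attrs": {"LGWR", "SYNC", "AFFIRM"}, "needs_srl": True},
    "MAXIMUM AVAILABILITY": {"attrs": {"LGWR", "SYNC", "AFFIRM"}, "needs_srl": True},
    "MAXIMUM PERFORMANCE":  {"attrs": set(), "needs_srl": False},
}


def missing_prerequisites(mode, dest_attributes, has_srl):
    """Return a list of problems that block the chosen protection mode.

    dest_attributes is the text of log_archive_dest_2, e.g.
    'SERVICE=stdby LGWR SYNC AFFIRM'; has_srl says whether standby
    redo logs exist on the standby site.
    """
    req = REQUIREMENTS[mode.upper()]
    present = {word.upper() for word in dest_attributes.split()}
    problems = [f"missing attribute {a}" for a in sorted(req["attrs"] - present)]
    if req["needs_srl"] and not has_srl:
        problems.append("standby redo logs not created")
    return problems


if __name__ == "__main__":
    # An ASYNC destination without SRLs does not qualify for maximum availability.
    for problem in missing_prerequisites(
        "maximum availability", "SERVICE=stdby LGWR ASYNC", has_srl=False
    ):
        print(problem)
```

The check compares whole words, so ASYNC is never mistaken for SYNC.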
Log Transport Services
Log transport service attributes are defined on the primary in log_archive_dest_2.
ARCH (default)
  - An ARC process first archives the online redo log to the local destination on the primary.
  - A second ARC process is then spawned and writes the archive to the remote standby.
  - By default, log_archive_local_first=true in init.ora on the primary. DO NOT CHANGE IT.
LGWR
  - In contrast to ARCH, which transmits redo to the standby only at log switch time, the LGWR attribute instructs the LGWR process to transmit redo to the standby at the same time as it is written to the online redo logs.
  - Transmission of redo can be done synchronously (SYNC) or asynchronously (ASYNC).
AFFIRM
  - All disk I/O at the standby is performed synchronously.
Log Transport Services
SYNC
  - By default, LGWR archives synchronously. Once I/O is initiated, archiving must wait for the I/O to complete.
  - This means a transaction is not committed on the primary database until the redo data needed to recover that transaction has been received by the destination.
ASYNC
  - LGWR does not wait for the I/O to complete. The LGWR network server process (LNS) performs the actual network I/O.
  - A user-configurable buffer accepts outbound redo data from LGWR. The size is given in 512-byte blocks: ASYNC=20480 indicates a 10MB buffer. The maximum is 50MB.
MAX_FAILURE
  - Defines the number of times to retry a destination that has been closed due to a failure.
NET_TIMEOUT
  - Used with the LGWR attribute (SYNC or ASYNC). Defines how many seconds to wait before giving up on a network connection.
REOPEN
  - Determines how long the primary waits before retrying a connection.
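The ASYNC buffer figures follow from simple arithmetic, since the ASYNC value counts 512-byte blocks. A minimal Python sketch of that calculation follows; the function name async_buffer_mb is hypothetical and the 102400-block limit is just the 50MB maximum expressed in blocks.

```python
BLOCK_BYTES = 512           # the ASYNC=n value is counted in 512-byte blocks
MAX_ASYNC_BLOCKS = 102400   # 50 MB expressed in blocks (the maximum quoted above)


def async_buffer_mb(blocks):
    """Convert an ASYNC=<blocks> setting into a buffer size in megabytes."""
    if not 0 <= blocks <= MAX_ASYNC_BLOCKS:
        raise ValueError(f"ASYNC blocks must be between 0 and {MAX_ASYNC_BLOCKS}")
    return blocks * BLOCK_BYTES / (1024 * 1024)


if __name__ == "__main__":
    print(async_buffer_mb(20480))             # the example from the slide
    print(async_buffer_mb(MAX_ASYNC_BLOCKS))  # the upper limit
```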
Setup
Creating a physical standby
Both primary and standby systems must be identical in configuration with regard to operating system, platform architecture and database version. The hardware configuration may differ.
1. Enable archiving on primary
   log_archive_dest_1='LOCATION=<path where logfiles will be archived locally>'
   log_archive_format=%t_%s_%r.dbf
   log_archive_start=true (deprecated as of the 10g release)
   SQL> shutdown immediate;
   SQL> startup mount;
   SQL> alter database archivelog;
   SQL> alter database open;
Setup contd.
2. Enable force logging on primary
   SQL> alter database force logging;
   This is required because nologging operations would otherwise not be recorded in the redo stream. In force logging mode, nologging operations are still permitted to run, but their changes are written to redo.
3. Create a password file on primary and standby (if not created yet)
   orapwd file=orapw<SID> password=<pwd>
   Set remote_login_passwordfile=exclusive in init.ora.
   The SYS password must be identical on both primary and standby for log transport services to function.
4. Create the standby controlfile on primary
   SQL> alter database create standby controlfile as '<../path/standby.ctl>';
5. Take a hot backup of the primary and copy the datafiles, archivelogs and standby controlfile to the standby
   NOTE: DO NOT COPY REDO LOGS SINCE THE STANDBY WILL CREATE ITS OWN
6. Create tnsnames.ora aliases for primary and standby on both primary and standby
Setup contd.
7. Prepare init.ora on primary
   db_name='TEST'
   db_unique_name='PRI'
   service_names='PRI_SERVICE'
   log_archive_config='DG_CONFIG=(PRI,STDBY)'   (the db_unique_name of the primary and of the standby)
   log_archive_dest_1='LOCATION=<archivelogpath>'
   log_archive_dest_state_1=enable
   log_archive_dest_2='SERVICE=<STDBY_SERVICE> LGWR ASYNC reopen=300 max_failure=0 net_timeout=60 VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=STDBY'
   log_archive_dest_state_2=enable
   log_archive_min_succeed_dest=1
   log_archive_max_processes=2
   standby_file_management=auto
   fal_server=<STDBY_SERVICE>
   fal_client=<PRI_SERVICE>
Setup contd.
8. Prepare init.ora on standby
   db_name='TEST'
   db_unique_name='STDBY'
   service_names='STDBY_SERVICE'
   log_archive_config='DG_CONFIG=(PRI,STDBY)'   (the db_unique_name of the primary and of the standby)
   log_archive_dest_1='LOCATION=<archivelogpath>'
   log_archive_dest_state_1=enable
   log_archive_dest_2='SERVICE=<PRI_SERVICE> LGWR ASYNC reopen=300 max_failure=0 net_timeout=60 VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=PRI'
   log_archive_dest_state_2=enable
   log_archive_min_succeed_dest=1
   log_archive_max_processes=2
   standby_file_management=auto
   fal_server=<PRI_SERVICE>
   fal_client=<STDBY_SERVICE>
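The two init.ora files mirror each other: each side names itself in db_unique_name, service_names and fal_client, and names the other side in log_archive_dest_2 and fal_server. The Python sketch below only illustrates that symmetry; role_parameters is a hypothetical helper, and the transport attributes (reopen, max_failure, net_timeout and so on) are deliberately left out.

```python
def role_parameters(local, remote):
    """Return the parameters that differ between the two init.ora files.

    local and remote are dicts with 'unique_name' and 'service' keys.
    Transport attributes are omitted on purpose; only the mirrored
    settings are shown.
    """
    return {
        "db_unique_name": local["unique_name"],
        "service_names": local["service"],
        "log_archive_dest_2": (
            f"SERVICE={remote['service']} "
            "VALID_FOR=(ALL_LOGFILES,ALL_ROLES) "
            f"DB_UNIQUE_NAME={remote['unique_name']}"
        ),
        "fal_server": remote["service"],
        "fal_client": local["service"],
    }


if __name__ == "__main__":
    pri = {"unique_name": "PRI", "service": "PRI_SERVICE"}
    stdby = {"unique_name": "STDBY", "service": "STDBY_SERVICE"}
    for title, (local, remote) in {"primary": (pri, stdby), "standby": (stdby, pri)}.items():
        print(f"# init.ora on {title}")
        for name, value in role_parameters(local, remote).items():
            print(f"{name}={value}")
```

Swapping the two arguments produces the other side's settings, which is why the same parameter file layout keeps working after a role change.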
Setup contd.
9. Mount the standby and start applying the changes
   SQL> startup mount;
   SQL> alter database recover managed standby database disconnect;
To stop the apply:
   SQL> alter database recover managed standby database cancel immediate;
To start real-time apply (this needs standby redo logs (SRLs) to be created):
   SQL> alter database recover managed standby database using current logfile disconnect;
To put the standby in read-only mode, stop the apply first and then open the database:
   SQL> alter database recover managed standby database cancel;
   SQL> alter database open read only;
To put the standby in read-write mode (this makes it a primary):
   SQL> alter database activate standby database;
Note: Once the standby is made primary (read-write), verify redo logs and tempfiles.
Monitoring
On the primary, check that archive logs are getting copied to the standby:
   SQL> select status from v$archive_dest where dest_id=2;
On the standby, monitor the MRP process:
   SQL> select status from v$managed_standby where process like '%MRP%';
   Status must be "APPLYING_LOG" or "WAIT_FOR_LOG".
   ps -ef|grep mrp
On the standby, detect an archive gap:
   SQL> select * from v$archive_gap;
   This returns records if the MRP status is "WAIT_FOR_GAP".
With 10gR2, v$dataguard_stats is introduced to monitor redo transport/apply progress:
   SQL> select value from v$dataguard_stats where name='apply lag';
   SQL> select value from v$dataguard_stats where name='transport lag';
Note: If the Data Guard standby is RAC, MRP applies redo on only one of the nodes. If that node crashes, MRP must be started on the surviving node to which the VIP of the crashed node has failed over.
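The monitoring checks lend themselves to a small script once the query results are in hand. The sketch below, in Python, assumes the lag value comes back as a day-to-second interval string such as '+00 00:01:30'; that format, and the helper names lag_seconds and mrp_healthy, are assumptions made for illustration. No database connection is made here.

```python
import re

# Assumed shape of the 'apply lag' / 'transport lag' value: '+DD HH:MM:SS'
LAG_PATTERN = re.compile(r"^\+(\d+) (\d{2}):(\d{2}):(\d{2})$")


def lag_seconds(value):
    """Convert a lag string such as '+00 00:01:30' into seconds."""
    match = LAG_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"unexpected lag format: {value!r}")
    days, hours, minutes, seconds = (int(group) for group in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def mrp_healthy(status):
    """True when the MRP status is one of the two expected values."""
    return status in ("APPLYING_LOG", "WAIT_FOR_LOG")


if __name__ == "__main__":
    print(lag_seconds("+00 00:01:30"))
    print(mrp_healthy("WAIT_FOR_GAP"))
```

A status of WAIT_FOR_GAP is reported as unhealthy, which is the cue to query v$archive_gap.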
Standby Redo Logs
Guidelines when creating standby redo logs:
  - The number of standby redo logs should be the number of online redo logs plus one.
  - Standby redo logs should be exactly the same size as the online redo logs.
  - SRLs should be created on both primary and standby to facilitate seamless role changes.
  - In a RAC environment, all SRLs should be on a shared disk and may be thread specific.
  - Used with the maximum protection and maximum availability modes, and when real-time apply is used.
How do SRLs work?
  - The LGWR process on the primary initiates a connection with the standby.
  - The standby listener responds by spawning a process called RFS (remote file server).
  - The RFS process creates a network connection with the processes on the primary and waits for data to arrive.
  - Once data arrives, RFS places it into the standby redo logs.
  - When a log switch occurs on the primary, the standby redo logs are switched as well and RFS moves to the next available standby redo log.
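The "online redo logs plus one" guideline is easy to work out by hand, but a short Python sketch makes the RAC case explicit. It assumes the rule is applied per redo thread, that is (groups per thread + 1) multiplied by the number of threads; that per-thread reading and the function name recommended_srl_groups are assumptions added for illustration.

```python
def recommended_srl_groups(online_groups_per_thread, threads=1):
    """Number of standby redo log groups to create.

    Applies 'online redo logs plus one' to each redo thread, so a
    single-instance database uses threads=1.
    """
    if online_groups_per_thread < 1 or threads < 1:
        raise ValueError("groups and threads must be positive")
    return (online_groups_per_thread + 1) * threads


if __name__ == "__main__":
    print(recommended_srl_groups(3))             # single instance, 3 online groups
    print(recommended_srl_groups(3, threads=2))  # two-node RAC, 3 groups per thread
```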
SWITCHOVER
Switchover allows a primary and a standby to reverse roles without any data loss. There is no need to re-create the old primary. It is performed for planned maintenance.
Steps:
1. Verify that the primary can be switched over to the standby role
   SQL> select switchover_status from v$database;
   If the value returned is "TO STANDBY", it is all right to switch the primary to the standby role.
2. Convert the primary to standby
   SQL> alter database commit to switchover to physical standby;
   If the value from step 1 is "SESSIONS ACTIVE", then
   SQL> alter database commit to switchover to physical standby with session shutdown;
3. Shut down and restart the old primary as standby
   SQL> shutdown immediate;
   SQL> startup mount;
   At this point, both databases are standbys.
4. On the target standby database, verify the switchover status. If the value is "TO PRIMARY", then
   SQL> alter database commit to switchover to primary;
   If the value is "SESSIONS ACTIVE", append "WITH SESSION SHUTDOWN" to the above command.
5. Shut down and restart the new primary database
   SQL> shutdown immediate;
   SQL> startup;
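Steps 1, 2 and 4 amount to a small decision table keyed on switchover_status. The Python sketch below captures that table; the function switchover_statement is hypothetical, and it accepts the status written with either a space or an underscore.

```python
def switchover_statement(status, target_role):
    """Pick the switchover command for a given switchover_status.

    target_role is 'standby' (run on the current primary) or 'primary'
    (run on the target standby). Returns None when the status does not
    allow the switchover to proceed.
    """
    normalized = status.strip().upper().replace("_", " ")
    base = {
        "standby": "alter database commit to switchover to physical standby",
        "primary": "alter database commit to switchover to primary",
    }[target_role]
    ready = {"standby": "TO STANDBY", "primary": "TO PRIMARY"}[target_role]
    if normalized == ready:
        return base + ";"
    if normalized == "SESSIONS ACTIVE":
        return base + " with session shutdown;"
    return None


if __name__ == "__main__":
    print(switchover_statement("SESSIONS ACTIVE", "standby"))
    print(switchover_statement("TO PRIMARY", "primary"))
```

Any other status returns None, signalling that the switchover should not be attempted until the cause is investigated.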
FAILOVER
Failover can involve data loss and can result in the need to re-create the old primary.
Steps:
1. Identify and resolve any gaps that may exist on the standby.
   SQL> select * from v$archive_gap;
   Copy the missing archives from primary to standby and register them in the standby controlfile.
   SQL> alter database register physical logfile '<archivepath>';
2. If standby redo logs are configured and active,
   SQL> alter database recover managed standby database finish;
   If there are no SRLs or they are not active,
   SQL> alter database recover managed standby database finish skip standby logfile;
3. Convert the standby to primary
   SQL> alter database commit to switchover to primary;
4. Restart the new primary
   SQL> shutdown immediate;
   SQL> startup;
Note: Once the standby is made primary (read-write), verify redo logs and tempfiles.
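Step 1 can involve several missing archives, each of which needs its own register statement. As a rough Python sketch, the gap range reported by v$archive_gap (thread#, low_sequence#, high_sequence#) can be expanded into file names using the log_archive_format from the setup. The directory, the resetlogs ID and the helper names are hypothetical values chosen for illustration.

```python
def archive_name(fmt, thread, sequence, resetlogs_id):
    """Expand %t (thread), %s (sequence) and %r (resetlogs ID) in the format."""
    return (
        fmt.replace("%t", str(thread))
        .replace("%s", str(sequence))
        .replace("%r", str(resetlogs_id))
    )


def register_statements(archive_dir, fmt, thread, low_seq, high_seq, resetlogs_id):
    """Build one 'register physical logfile' statement per missing sequence."""
    statements = []
    for sequence in range(low_seq, high_seq + 1):
        path = f"{archive_dir}/{archive_name(fmt, thread, sequence, resetlogs_id)}"
        statements.append(f"alter database register physical logfile '{path}';")
    return statements


if __name__ == "__main__":
    # Hypothetical gap: thread 1, sequences 7 to 8, resetlogs ID 123456
    for statement in register_statements("/arch", "%t_%s_%r.dbf", 1, 7, 8, 123456):
        print(statement)
```

The generated paths are only as good as the inputs, so each file should be confirmed to exist on the standby before the statements are run.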
THANK YOU Get in Touch http://www.ritzyblogs.com