440 likes | 823 Views
Using Oracle Data Guard for Applications Disaster Recovery … and More. Jeff Slavitz Independent Consultant Oracle Applications DBA and Developer Jeff@OracleAppsPro.Com. What Would You Do???. It’s Monday morning.
E N D
Using Oracle Data Guard for Applications Disaster Recovery … and More Jeff Slavitz Independent Consultant Oracle Applications DBA and Developer Jeff@OracleAppsPro.Com
What Would You Do??? • It’s Monday morning. • Fatal production system hardware problem, or production hardware location is now a smoking hole • Expect Production to be down for an unforseen amount of time • It is year-end, you have a new manager and your senior DBA is on vacation Jeff Slavitz - NorCal OAUG Training Day 2010 v3
What is your backup situation? • Do you have a cold backup? On-site or off-site? • Do you have a hot backup. On disk? On tape? On another accessible site? How recent is it? • Will you be able to read your tape backups? • How recent are the archive logs that are now on the system that is down? • Where are your archive log backups since the last hot backup? Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Options • Restore and clone latest backup to a test box. How familiar is your DBA with RMAN recover? Is the box big enough to run the company? Will interfaces work (EDI, Payroll, …) • Rent or buy a box. How long would it take to find and setup a new box? • How long will it take to get the company up and running? Have you practiced this process or is this a fire drill? How many P1s will you need? • How much data will the company lose? • What does the company do while Oracle is down? Jeff Slavitz - NorCal OAUG Training Day 2010 v3
How Do You Feel? How Does Your Manager Feel? Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Or… • In 15 minutes you’ve fully switched Oracle Applications over to a standby database • You’ve tested the procedure with all DBAs • The failover procedure is fully supported and documented by Oracle • Interfaces are tested on the standby database • You run the entire company once a month for a short period of time on the standby database • You have a test standby database that DBAs practice on and you use for reporting purposes Jeff Slavitz - NorCal OAUG Training Day 2010 v3
How Do You Feel? How Does Your Manager Feel? Jeff Slavitz - NorCal OAUG Training Day 2010 v3
My Goal Today • Tell you how you can be relaxed, not stressed, in a disaster recovery situation • What is Data Guard • Describe added value Data Guard and Flashback provide in addition to disaster recovery • On a high level, review steps to implement Data Guard and Flashback • Lessons learned - what Metalink notes don’t tell you • Please ask questions as I go along • Apologies to the non-DBAs Jeff Slavitz - NorCal OAUG Training Day 2010 v3
What is Data Guard? • Data Guard is part of Oracle, not a separate product, which provides a set of services that create, maintain, manage, and monitor one or more standby databases • Data Guard maintains these standby databases as transactionally consistent copies of the production database • Data Guard can switch any standby database to the production role Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Without Data Guard Jeff Slavitz - NorCal OAUG Training Day 2010 v3
With Data Guard Oracle® Database High Availability Best Practices 10g Release 2 (10.2) Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Why Not Use Data Guard? • Too expensive – not! You already have and license it if you have Oracle Enterprise Edition. Nothing to purchase in order to implement what we discuss today. • Too bleeding edge. Data Guard has been around since Oracle 9i and Oracle 8i as Standby Server. • Too difficult to implement. Well documented by Oracle and completely supported. This presentation is a roadmap through the process plus my own observations and experiences. • No DR box. Use test box for standby. Doesn’t protect against smoking hole but it’s better than nothing. • How often do you really need DR? Is once enough? Plus a standby database can be used for other purposes. Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Why You Should Use Data Guard • If your Production Oracle Applications instance went away would anybody notice? • Our job is to protect the company’s data • Your manager will thank you – DR is important. • Data Guard plus Flashback provide added value: • Instant clone • Reporting instance • Many exciting possibilities! Jeff Slavitz - NorCal OAUG Training Day 2010 v3
What Is Flashback? • Introduced in Oracle 9i • Flashback is part of your Oracle database – no licensing requirement • Provides a rewind button for your data • Flashback Database - return database to a previous point in time • Flashback Table - return a table to a previous point in time • Flashback Query – see data at a previous point in time Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Data Guard and Flashback Together Many possibilities when used together … Scenario: Show stopper problem in Production Need to test a patch or data fix with current production data on test system ASAP Old school: Clone Time to complete: ??? Hours Data Guard: Use standby database as an instant clone Time to complete: 10 minutes Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Instant Clone • Activate standby database • Perform testing on standby database • When testing complete, flashback standby database to activation point in time • Test again? Allows destructive testing. • When all done, flashback and re-activate standby database Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Reporting Database Scenario: You want a data warehouse which is a snapshot of Production as of 9pm the day before. Users need access via Oracle Applications and Discoverer Old School: Daily clone Data Guard: Use standby database as a reporting database Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Reporting Database • Activate standby database • Users login to standby database with Oracle Applications, Discoverer, or any other tool. Users can make changes in Application though they won’t be saved after Flashback. • Archive logs accumulate on standby database • At 9pm each day flashback database to the way it was at 9pm the day before, apply archive logs created since then, reopen database • Can have multiple standby databases – one for DR, one for reporting, one for ??? • Oracle 11i Active Data Guard allows read-only access to standby database while it is being updated from primary Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Bumps Along the Way • Time implementation so it’s not at year-end • Bumpy transition into production – low Test database activity is not an accurate picture of how Data Guard and Flashback will work in Production • 3mb network was not enough for standby to keep up with primary - needed 10mb just for Data Guard • Do periodic full company test of standby system • Data Guard and Flashback implementation are well documented but there is a learning curve Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Implementing Data Guard and Flashback Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Hardware Configuration Primary (California) Standby (Nevada) Database Database 45mb of which 8-10mb used by Data Guard Application Application Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Software Configuration • Database • 10.2.0.4 • Data Guard Physical Bundle patch 7936993 • Data Guard Logical Bundle patch 7937113 • Data Guard Broker Bundle patch 7936793 • Applications 11.5.10.2 Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Data Guard Setup Two Metalink notes: • Note 216212.1: Business Continuity for Oracle Applications Release 11i, Database Releases 9i and 10g • Note 452056.1: Business Continuity for Oracle Applications Release 12 on Database Release 10gR2 (earned my vote for best Metalink note ever) • Use Release 12 note • Better configuration, simpler and works fine with 11i • Substitute APPL_TOP for Instance Top in text Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Logical vs Physical Two types of standby databases • Physical • Byte-for-byte duplicate of the primary database • Archived redo logs transferred from primary database are directly applied to the standby database • When standby in recovery mode it is mounted not open • Logical • Different structure than the primary database • Oracle uses SQL statements to update standby • Standby can be open while it is being updated Today we are talking about Physical standby databases Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Maximum Protection Mode • Three types of protection – Maximum Protection is highest level of data security • Primary database changes are not committed until it has been confirmed that the data is available on at least one standby database • If Oracle determines that the redo data cannot be transferred from the primary server to the standby server, it will automatically stop the primary database instance • Data transmitted synchronously - network implications Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Maximum Performance Mode • Default protection mode • The commit operation on the primary database is not contingent upon the data being received by the standby server • If all of the standby servers become unavailable, processing will continue on the primary database • Data transmitted asynchronously • This performance mode is what are discussing today Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Maximum Availability Mode • Second highest level of data security • Primary database changes are not committed until it has been confirmed that the data is available on at least one standby database • If the standby database becomes unavailable for any reason, the protection mode is temporarily lowered to maximum performance until the problem has been corrected Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Data Guard Installation Overview • Develop a naming convention for your primary and standby database servers e.g. PROD_<primary server>, PROD_<standby server> • Create standby redo logs: one more than number of redo logs, same size as redo logs, do not multiplex • Clone database to standby • Copy datafiles to standby with closed database or with open database using RMAN (see note 753241.1, Configuring Standby Database on R12 using RMAN Hot Backup) • Test primary and standby and listeners using tnsping • Clone application to standby • You are now ready to turn on Data Guard Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Primary init.ora ifile Parameters # Global unique name db_unique_name=PROD_CAMELDB1 # Flash recovery area. This is the default location for control files, online # redo logs, archived redo logs, flashback logs, RMAN backups. db_recovery_file_dest=/u01/oracle/flash_recovery_area # First destination for archived redo logs log_archive_dest_1 = 'LOCATION=USE_DB_RECOVERY_FILE_DEST MANDATORY‘ # Second destination for archived redo logs log_archive_dest_2 = 'SERVICE=PROD_LNXDRDB1 valid_for=(online_logfiles,primary_role) db_unique_name=PROD_LNXDRDB1 LGWR ASYNC=20480 OPTIONAL REOPEN=15 NET_TIMEOUT=30‘ Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Primary init.ora ifile Parameters # Limit on the total space available to the flash recovery area. db_recovery_file_dest_size=100g # In minutes, how long you want to keep flashback files. Since using # guaranteed flashback really don't need this to be too big. db_flashback_retention_target=120 # On/Off switch for sending logs to standby system log_archive_dest_state_2 = defer # Databases in Data Guard configuration using my instance naming convention log_archive_config='dg_config=(PROD_CAMELDB1,PROD_LNXDRDB1)‘ # Log gap detection and resolution when this database is the standby fal_server = 'PROD_LNXDRDB1‘ fal_client = 'PROD_CAMELDB1' Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Start Data Guard Start primary and second database and listeners • Primary: startup • Standby: startup mount Turn on archive redo log transport on primary • alter system set log_archive_dest_state_2=enable; Put standby in recovery mode • Recover managed standby database using current logfile disconnect; Confirm logs are shipping to standby • Primary: alter system archive log current; • Standby: Check archive directory or query v$archive_log Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Care and Feeding of Data Guard • Check alert log of primary and standby on a regular basis • Set Enterprise Manager to email you when an error shows up in alert log • Automatic archive log gap resolution usually works but network connectivity issue can require you to manually fix • Primary alert log will notify you when a gap exists: FAL[client]: Failed to request gap sequence GAP - thread 1 sequence 435-442 • Copy missing archive logs from primary to standby • On standby: alter database register logfile … Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Flashback • db_recovery_file_dest • defines Flash recovery area disk location • This is the default location for control files, online redo logs, archived redo logs, flashback logs, RMAN backups • Archive logs are placed in flashback area in sub-directory with date • DO NOT delete files in flashback area manually; database calculates space available based on what RMAN has done • db_recovery_file_dest_size • Specifies limit on the total space to be used by the flash recovery area • Check space usage using v$flash_recovery_area_usage • When recovery area is full database stops! • db_flashback_retention_target • Defines in minutes, how long you want to keep flashback files • Since using guaranteed flashback don't need this to be too big • To turn flashback on: mount database and alter database flashback on Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Switchover – Database Steps • Seamless switch from primary to standby • Switch existing primary to new standby: • Shutdown application • Confirm all logs received and applied on standby • alter database commit to switchover to standby with session shutdown; • Recover managed standby database using current logfile disconnect • Switch existing standby to new primary: • alter database commit to switchover to primary; • alter database open; • alter system set log_archive_dest_state_2=enable; • Confirm logs are shipping to standby by checking alert log, archive directory or query v$archive_log Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Switchover – Application Steps • Some of these steps are not in Metalink note but are required • Clear application context: exec fnd_conc_clone.setup_clean • Run adautocfg on database and application tiers • Run cmclean.sql (note 134007.1) to avoid Output Post Processing manager not starting • Update fnd_concurrent_requests • Change logfile_name and outfile_name to new APPLCSF path • Change logfile_node_name and outfile_node_name to new host name • Update fnd_conc_req_outputs (for XML reports) • Change file_name to new APPLCSF patch • Change file_node_name to new host name Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Switchover – Application Steps • Update wf_notification_attributes.text_value with new node name • Update wf_item_attribute_values.text value with new node name • Check for any other profile values that contain old node name • Clear Apache cache • Start application and send users new login url • Synchronize APPLCSF log and out files • Use rsync • Faster than scp and only copies changed files rsync -avz $APPLCSF/out/${TWO_TASK}*/ - applprod@lnxdrapp1:$APPLCSF/out/${TWO_TASK}* • Consider setting cron job to synchornize APPLCSF in case of system failure Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Failover • Immediate switch of primary to standby system • Some archive log data may be lost due to network latency • On standby-soon-to-be-primary: • Cancel database recovery • alter database commit to switchover to primary • Configure application as shown in Switchover • Even with cron job synchronizing APPLCSF log and out files you probably won’t get all report output • Repair former primary system Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Failover – Sync Old Primary as New Standby • Establish new standby by copying primary datafiles using RMAN, cold backup or flashback. To use flashback: • Find the SCN when the existing primary database became the primary: select to_char(standby_became_primary_scn) from v$database; • On new standby: • Startup mount • flashback database to scn <above #> • alter database convert to physical standby; • recover managed standby database using current logfile disconnect; Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Instant Clone or Reporting Database • On standby: Cancel standby recovery create restore point STBY_ON guarantee flashback database Active and open database Configure application as in Switchover • Users can now login to the standby database through Oracle Applications, Discoverer or anything. • Apply patches, test, update data in standby database. • To revert to restore point STBY_ON and do further testing as needed: Shutdown immediate Startup mount Flashback database to guaranteed restore point STBY_ON Alter database open Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Resume Standby Operation On Standby: • Shutdown application • Shutdown database • Startup mount • Flashback database to restore point STBY_ON • Alter database convert to physical standby Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Lessons Learned • Script as much as possible – you will make typos under time pressure • Use Enterprise Manager to see how Data Guard affects your system and tune accordingly • Implement Data Guard and flashback at different times • Setup one test database using Data Guard for testing and training Jeff Slavitz - NorCal OAUG Training Day 2010 v3
For the Future • Try Maximum Availability protection • Use Enterprise Manager to manage Data Guard. This requires Data Guard Broker implementation which requires use of spfile. • Completely script switchover and failover as shown in note 452056.1 • Use RMAN to backup standby database only • Upgrade database to 11G and experiment with Active Data Guard. This allows you to create a read-only database that is open and is constantly updated from primary. Jeff Slavitz - NorCal OAUG Training Day 2010 v3
For More Information • Google (it knows everything) • http://blogs.oracle.com/stevenchan • Note 452056.1 – Business Continuity for Oracle Apps Release 12 on Database 10GR2 (works fine for 11i) • Note 216212.1 – Business Continuity for Oracle Apps Release 11i, Database Releases 9i and 10G • Note 753241.1 – Configuring Standby Database on R12 using RMAN Hot Backup (works fine for 11i) • Note 805438.1 – How to Open Physical Standby For Read Write Testing and Flashback • Data Guard Redo Transport & Network Best Practices Oracle 10GR2 White Paper • Oracle Data Guard Concepts and Administration manual • High Availability Best Practices manual • Backup and Recovery Reference manual Jeff Slavitz - NorCal OAUG Training Day 2010 v3
Questions? Want the latest version of this presentation? www.OracleAppsPro.com Jeff Slavitz Jeff@OracleAppsPro.com (415) 388 - 3003 Jeff Slavitz - NorCal OAUG Training Day 2010 v3