Learn from Tim Boles, a Senior Staff Database Administrator, about the causes of data loss and its cost, basics of backups, building a backup policy, and how to ensure you can recover your data. Explore topics such as availability versus recovery, the new thing: cloud computing, virtual RAC standby database, and when availability does not help. Discover strategies for recovering from hardware failure, human error, software corruption, and more. Don't miss valuable resources and experts' insights on data recovery.
Are You Sure You Can Recover In Any Circumstance? Tim Boles Database Administrator Senior Staff
Who Am I? • Tim Boles • DBA with Lockheed Martin IS&GS Civil Division • Oracle Database Administrator Since 1998 • Experience with databases from gigabytes to terabytes • timothy.s.boles@lmco.com • www.lockheedmartin.com/isgs
Topics • Availability Is Not Recovery • Causes of Data Loss And Its Cost • Basics of Backups • Building A Backup Policy • How To Be Sure You Can Recover
The New Thing • Cloud Computing • Virtual • RAC • Standby Database • 99.999% Uptime
When Does Availability Not Help? • RAC – Lose the underlying data files. • ESAN – Lose Power and drives don’t come up. • Virtual – Disgruntled Employee Drops Schema • COOP Site – Software Bug / Virus corrupts data • DISK MIRROR – Data Corruption
The Experts Say http://gbr.ppperdine.edu/033/dataloss.html http://www.ontrackdatarecovery.com/understanding-data-loss
Hardware Failure Recovery • Server Failure(s) • Drive Failure(s) • ESAN • Disaster Recovery Site
What If? • Server Disk Failure with Oracle Software Binaries • SAN with redo logs fail • Mirrored Master Destruction with Administrative Files • listener.ora • tnsnames.ora • password • dataguard configuration • Enterprise Manager configuration files • RMAN repository failure
Human Error • OS Commands • Bad DDL • Bad DML • Compounded by Additional Mistakes • Features Only Help When Enabled • What’s Your Plan?
Software Corruption • Customized COTS / In-House • COTS • Leopard OS • Oracle BUG http://tomkarpik.com/articles/massive-data-loss-bug-in-leopard/
How Would You Recover? • DROP SCHEMA CASCADE • Oracle software deletion • Wrong data deletion • detected immediately • detected several hours later • Batch Job corruption • Software Upgrade • Block Corruption Detected in Backup
Counting the Cost Meta Group of Stamford, CT in October of 2000: IT Performance Engineering & Measurement Strategies: Quantifying Performance Loss.
What The? RMAN Backup RMAN Restore
RMAN Does Not Back Up • Oracle Software Home (binaries) • BFILES • Password Files • pfiles (spfiles are covered with newer versions) • tnsnames.ora • listener.ora • sqlnet.ora • /etc/oratab • scripts (shell, sql)
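Because the files listed above fall outside RMAN, they need their own copy job. The following is a minimal Python sketch of one way to do that, not a recommended production tool: it bundles whichever of the listed files exist into a gzip tar archive and reports the ones it could not find. The function name `archive_support_files` and the example paths in the comment are assumptions for illustration; the real locations differ on every server.

```python
import tarfile
from pathlib import Path


def archive_support_files(paths, archive_path):
    """Copy every existing path into a gzip tar archive.

    Returns (saved, missing) so the caller can see which files RMAN's
    companion job actually captured and which were not found.
    """
    saved, missing = [], []
    with tarfile.open(archive_path, "w:gz") as tar:
        for p in map(Path, paths):
            if p.exists():
                # Store the path relative to "/" so a restore can target any directory.
                tar.add(str(p), arcname=str(p).lstrip("/"))
                saved.append(str(p))
            else:
                missing.append(str(p))
    return saved, missing


# Hypothetical usage; every path below is an assumption, not a standard:
#   home = "/u01/app/oracle/product/11.2.0/dbhome_1"
#   archive_support_files(
#       [home + "/network/admin/tnsnames.ora",
#        home + "/network/admin/listener.ora",
#        home + "/network/admin/sqlnet.ora",
#        "/etc/oratab"],
#       "oracle_support_files.tar.gz")
```

A non-empty `missing` list is worth treating as a failed backup, since a file that silently moved is exactly the one that will be absent at restore time.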
The Basics • Backup and Recovery Plan • Physical Backups • Data Storage • data files, control files, Archived Redo • Support Files • Binaries, Initialization Files, Scripts, .ora, password • Logical Backups (Exports) • Logical data structure such as tables, tablespaces, objects, users, data within tables http://www.oracle.com/technology/deploy/availability/htdocs/BR_Overview.htm
Where to Start? • Stake Holders • Who Cares About The Data? • Users • Auditor, Lawyer, Regulator • Security System Administrators • Who Touches The Data? • System / Backup Administrators • Database Administrators
Basic Concerns • Size of Database (growth potential) • Backup Window • Space Available for Backup Storage • Media Used • Tools Available • Data Retention Times • Acceptable Mean Time To Recovery (MTTR)
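Several of the concerns above (database size, growth, backup window) reduce to simple arithmetic. The sketch below is a rough estimate only, assuming a single sustained throughput figure and compound monthly growth; the function names and the sample numbers are illustrative assumptions, not measurements from any real system.

```python
def backup_hours(size_gb, throughput_mb_per_s):
    """Hours needed to move size_gb at a sustained rate given in MB/s."""
    return (size_gb * 1024) / throughput_mb_per_s / 3600


def projected_size_gb(size_gb, monthly_growth_rate, months):
    """Database size after compound growth, e.g. 0.03 means 3% per month."""
    return size_gb * (1 + monthly_growth_rate) ** months


def fits_window(size_gb, throughput_mb_per_s, window_hours):
    """True when the estimated backup time fits inside the backup window."""
    return backup_hours(size_gb, throughput_mb_per_s) <= window_hours


if __name__ == "__main__":
    # Assumed figures: 2 TB database, 100 MB/s to backup media, 6-hour window.
    today_gb = 2000
    print(round(backup_hours(today_gb, 100), 2))       # about 5.69 hours
    print(fits_window(today_gb, 100, 6))               # fits today
    next_year_gb = projected_size_gb(today_gb, 0.03, 12)
    print(fits_window(next_year_gb, 100, 6))           # no longer fits
```

The same calculation run against restore throughput gives a lower bound for the Mean Time To Recovery, since a restore cannot finish faster than the data can be read back.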
Beyond the Basics • Encryption • Storage of Encryption Keys • Access to Encryption Keys • Design of Database • Read-Only Tablespaces • Tablespace Partitions • Compression Algorithms
Is Your Backup Good? • Backup Log • Physical Check • Logical Check • Only good if you can recover
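The backup log check is the easiest of these to automate. As a minimal sketch, assuming the log is plain text and that failures surface as message codes with the usual `RMAN-` or `ORA-` prefix followed by five digits, the function below returns every line that carries such a code. The name `find_errors` is hypothetical, and a clean result only means the log reported no errors; it does not prove the backup can be restored.

```python
import re

# Message codes such as RMAN-03009 or ORA-19502; channel names like ORA_DISK_1
# use an underscore and are deliberately not matched.
ERROR_PATTERN = re.compile(r"\b(?:RMAN|ORA)-\d{5}\b")


def find_errors(log_text):
    """Return (line_number, line) pairs for lines containing an error code."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical usage against a log file whose name is an assumption:
#   with open("rman_backup.log") as fh:
#       problems = find_errors(fh.read())
#   if problems:
#       raise SystemExit("backup log reports errors: %r" % problems)
```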
What Is Your Source? • Memory / Experience • Oracle Documentation / Books • Internet Search Engines • Co-worker • Monitoring Tools (e.g. Oracle Enterprise Manager) • Customized Documentation
Documentation Is Your Friend • Good Business Sense • Every System Is Different • Boosts Ability to Concentrate • Gain Experience and Knowledge • Refine Backup / Restore Policies • Refine Procedures
B&R Document 20000 ft View • Overall Backup Strategy • Architecture Summary • Script Listing and Description • Procedures • Test Documentation
Overall Backup Strategy • Types of Backups And Reasons • Physical • Hot / Cold • Full / Incremental • Exports • Full • Schema, Table, (Transportable) Tablespace • Tools • Scheduling • Notification • Retention Policies (Time and Off-site Location) • System Specifics
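A time-based retention policy has one trap worth writing down: a backup older than the retention period may still be required. The sketch below uses a simplified model that is an assumption of this example, not RMAN's own algorithm: it considers full backups only and keeps the newest one taken at or before the cutoff, because recovering to the oldest point in the window has to start from it.

```python
from datetime import date, timedelta


def obsolete_full_backups(full_backup_dates, window_days, today):
    """Return full-backup dates that are no longer needed.

    Everything inside the window is kept. Of the backups taken at or before
    the cutoff, the newest is kept as the starting point for recovery to the
    cutoff; the rest are reported as obsolete.
    """
    cutoff = today - timedelta(days=window_days)
    at_or_before_cutoff = sorted(d for d in full_backup_dates if d <= cutoff)
    return at_or_before_cutoff[:-1]


if __name__ == "__main__":
    # Assumed schedule: weekly full backups, 7-day recovery window.
    fulls = [date(2010, 9, 1), date(2010, 9, 8), date(2010, 9, 15), date(2010, 9, 22)]
    print(obsolete_full_backups(fulls, 7, today=date(2010, 9, 25)))
    # September 15 is older than the window but is still kept.
```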
Architecture Summary • Server Configuration • Tool Integration • Database Configuration
Tools and Technology Available • Media Failure • Restore Media from Backup • Recover using RMAN or SQL Commands • Full • Partial • Tablespace point-in-time (TSPITR) • Time-based (PITR) • Cancel-based • Change-based • Human or Software Error • Flashback http://www.oracle.com/technology/deploy/availability/htdocs/BR_Overview.htm
Flashback 9i and 10g R1 • Oracle 9i • Flashback Query • Oracle Database 10g R1 • Flashback Database • Flashback Table • Flashback Drop • Flashback Version Query • Flashback Transaction Query
Flashback 10g R2 and 11g • Oracle Database 10g R2 • Restore Points • Flashback Database Through Resetlogs • Oracle Database 11g R1 • Flashback Transaction • Flashback Data Archive • Oracle Database 11g R2 • Flashback Data Archive tracks most DDL http://www.oracle.com/technology/deploy/availability/htdocs/Flashback_Overview.htm
Cheat Sheet • System Name • System Description • Server Name • Server Description • Lead Administrator Contact Information • Database Information • Name • Oracle version/patch level • CSI • Important users / Password Expire Dates • Features Enabled
Cheat Sheet Continued: Locations • ORACLE_HOME • Oracle User Home • Administration SQL scripts • Administration Shell scripts • RMAN/backup scripts • Backup Logs • Backup Storage • control files • Archive Logs
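One way to keep a cheat sheet honest is to store it as structured data and check it for blanks. The sketch below is only an illustration: the `CheatSheet` class, its field names, and the sample values are assumptions that cover a subset of the items above, and a real sheet would carry every entry the system needs.

```python
from dataclasses import dataclass, fields


@dataclass
class CheatSheet:
    # Identification
    system_name: str = ""
    server_name: str = ""
    lead_admin_contact: str = ""
    database_name: str = ""
    oracle_version: str = ""
    # Locations
    oracle_home: str = ""
    rman_scripts: str = ""
    backup_logs: str = ""
    backup_storage: str = ""
    control_files: str = ""
    archive_logs: str = ""


def missing_entries(sheet):
    """Names of cheat-sheet fields that are still blank."""
    return [f.name for f in fields(sheet) if not getattr(sheet, f.name).strip()]


if __name__ == "__main__":
    # Sample values are placeholders, not real system details.
    sheet = CheatSheet(system_name="payroll", database_name="PAYPRD")
    print(missing_entries(sheet))
```

Running the check as part of a documentation review makes a half-filled sheet visible before a recovery depends on it.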
Script Listing and Description • Location • Usage • Execution Syntax • Parameters with Descriptions
Test Documentation • Backup Procedures • Recovery Scenarios To Test • Document Restore Procedures
Media Loss • Loss of a Control File • Loss of a data file for a tablespace • System, rollback segment, UNDO, user data, Index, read-only, partition • Loss of Redo Log file • Inactive Online, Current Online, Archived • Loss of entire redo group • Inactive Online, Current Online, Archived • Data Block Corruption • Physical • Logical • In Backup
Recovery of Entire Database • Recovery with No RMAN catalog • With / Without control file • With / Without redo logs • Recovery to New Machine • Recovery to New File System • Point in Time Recovery of Entire Database • Recovery of RMAN catalog • Creation of Standby Database • Creation of Duplicate Database on Test System
More Than Just One File • If database crashes during backup • If binaries are destroyed • If entire database server has to be replaced • If SAN loses multiple drives • If database crashes during table movement • If database crashes during use of Flashback Technology • If Read-Only tablespace was created before last backup • If Read-Only tablespace was created after last backup
User / Software Error (Flashback) • Recovery of Dropped Schema • Recovery of Dropped Table • Data Corruption in Row • Transaction Flashback • Single • All resulting transactions • Software Installation Failure • Data Corruption in entire schema • Data Corruption in schema 5 hours old but rest of database needs to remain • Trigger or procedure is recompiled with wrong code
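The recovery scenarios listed across the preceding slides are only useful if each one has actually been rehearsed. A minimal sketch of tracking that, assuming the team records the date each scenario was last tested: the function below lists scenarios that were never tested or whose last test is older than a chosen limit. The name `scenarios_due` and the 180-day default are assumptions for illustration.

```python
from datetime import date


def scenarios_due(last_tested, today, max_age_days=180):
    """Return scenario names never tested or not tested within max_age_days.

    last_tested maps a scenario name to the date of its last successful
    test, or to None when it has never been tested.
    """
    due = []
    for name, tested_on in last_tested.items():
        if tested_on is None or (today - tested_on).days > max_age_days:
            due.append(name)
    return sorted(due)


if __name__ == "__main__":
    # Scenario names come from the slides; the dates are made up.
    history = {
        "Loss of a Control File": date(2010, 8, 1),
        "Recovery of Dropped Schema": date(2009, 11, 15),
        "Recovery to New Machine": None,
    }
    print(scenarios_due(history, today=date(2010, 9, 20)))
```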
Visit the IOUG Booth This Week • Located in the User Group Pavilion - Moscone West, 2nd Floor • Learn why over 23,000 have joined IOUG and what it can do for you • Chat with the IOUG Board of Directors • Hear about new regional IOUG BI user communities • Find out how to submit an abstract for COLLABORATE 11 – IOUG Forum • Enter for a chance to win a COLLABORATE 11 registration • Stock up on IOUG gear and educational materials!