DB2 Recovery Solutions & More. Bill Arledge, DB2 Data Management Analyst, BMC Software
When Availability is Critical, Recovery is Crucial! • Unplanned downtime is an unfortunate fact of life... • Up to 80% of all unplanned downtime is caused by software or human error* • Up to 70% of recovery is “think time”! *Source: Gartner, “Aftermath: Disaster Recovery”, Vic Wheatman, September 21, 2001 ©2006 BMC Software
Recovery is a Real Challenge • Cost of Downtime varies • By Industry • By Business Cycle • Staff Productivity and Expertise pressures • Harder to get and keep good technicians • Recovery is a ‘part time’ job, skills may wane • A lot of hours can go into DR test ‘preparations’ • Planned downtime (backups) pressures • Consistent Copies/Dumps require outage • Even a brief outage may impact business • Unplanned outages happen at painful times ©2006 BMC Software
Recovery Elements [Diagram labels: Failure; Analysis; Create Recovery JCL; Execute Recovery; Recovery Management; Fast Utilities; Application Outage]
Let’s think out of the BOX! • Who says the only way to recover to a PiT is forward recovery? • What if you could ‘avoid’ recovery for some objects in an application? • What if you could exploit storage technology instead of tape copy? • What if you could go ‘backwards’ through the log? • What if you want to make the bad SQL just go away by ‘undoing’ it? • This presentation will show how to use BMC Software to reduce or eliminate the downtime for Backup, Recovery, Replication, and Batch Restart [Diagram labels: Image Copy; 23 Hours of Good Transactions; Recovery Point; 1 Hr of Bad Transactions; Recovery started; Apply 23 Hours of Log]
Recovery Management for DB2: Solving Business Problems with Innovation and Automation • Solution Integration • Building on core components to leverage BMC technology • Backups – High availability techniques for necessary process • COPY PLUS Value Proposition • Snapshot Copy (Software, Hardware, Instant Snapshot) • Hybrid Copy Technique – mix and match for effective backup and recovery • Cabinet Copy – dramatic reduction in elapsed and CPU time • Encrypted Image copy – secure offsite tape storage • Online Consistent Copy – clean copy with NO Quiesce • Application Recovery – speed and automation for an infrequent event • Recovery Management interface • Innovative forward and PIT recovery techniques – Index automation, Backout, Backup & Recovery Avoidance, Timestamp Recovery, Log accumulation • Creative uses for the DB2 Log data – Reporting, UNDO, Migration • Disaster Recovery • Local site preparation automation • Remote site System and Application Recovery automation • DR Reporting, Estimation, Simulation • Remote Recovery and Replication
Recovery Management for DB2 • Top ROI Features • Large Application Group (e.g. SAP) Backup and Recovery Automation* • Snapshot Copy • Software, Hardware*, Instant Snapshot*, Instant Restore* • Online Consistent Copy* • Index Copy and Recover Automation* • Physical Backout* • Backup and Recovery Avoidance* • Log Manipulation • Extract & store by filter, Reports (including Data change Audit), Logical UNDO • Dropped Object Recovery* • Disaster Recovery and Coordinated Recovery automation* • Disaster Recovery Reporting* • Recovery Time Estimation* • Recovery Simulation* • Timestamp Recovery* • DB2 Version 8 Exploitation (more later!) ©2006 BMC Software
BMC Recovery Management for DB2: Solving Business Problems with Innovation and Automation. Integrating the capabilities of mature, patented functions. Intelligent * Integrated * Automated * Optimized. The sum is greater than the parts. Components • COPY PLUS for DB2 – High Speed Backup Technology • RECOVERY MANAGER for DB2 – Job Generation and Management • Log Master™ for DB2 – Log Analysis Redo/Undo • RECOVER PLUS for DB2 – Fast, Dependable Recoveries • SNAPSHOT UPGRADE FEATURE® Solution Exclusives • Recovery Simulation • Recovery Estimation • Disaster Recovery Tracking and Reporting • Backout to Forward Recovery Automation • DR Mirror Management • Online Consistent Copy • Inflight Recovery (Recover to ANY Timestamp) • Encrypted Image Copy • Cabinet Copy • DB2 9 Support
COPY PLUS for DB2 • DB2 OBJECT: 6 Part Table space, Average Row Length 101, 2 Secondary Indexes, 60 Million Rows, 8,644,513 Pages/~33GB • BMC OPTIONS USED: Options Maxtasks 6 Output Tcopy Unit CARTVTS, STACK YES, Shrlevel Change, Indexes Yes, Resetmod no, Group Yes • IBM OPTIONS USED: Shrlevel Change Copy DDN(Tape1) Parallel (6) [Chart: timings in hours:minutes:seconds for the BMC and IBM runs; values not recoverable]
3 Types of Snapshot from BMC • Software snapshot • Brief outage required for ‘clean’ copy, known to the database • Makes a DBMS image copy, typically tape • Restore from tape or disk • Exploits processor cache • Hardware snapshot • Uses volume mirrors or data set snaps as source for DBMS copies • Brief outage required for ‘clean’ copy, known to the database • Restore from tape or disk • Instant Snapshot • Uses hardware data set snaps to create disk-resident backup • Note – NOT a DBMS formatted image copy (but can copy imagecopy) • Brief outage required for ‘clean’ copy, known to the database • NO OUTAGE for ‘fuzzy’ copy, known to the database • Instant Restore from Disk – in SECONDS!! ©2006 BMC Software
Hybrid Copy Illustration with STACK CABINET [Diagram labels: Many Small IMS/DB2/VSAM Databases; A Few Large IMS/DB2/VSAM Databases; BMC COPY; Single disk Copy dataset; Instant Snapshots; IMS/DB2 LOGS; BMC RECOVER; Recovered Databases]
Hybrid Copy DB2 Example with STACK CABINET • Part cabinet copy, part Instant Snapshot Copy • One BMC Copy statement - • COPY TABLESPACE DB.* SHRLEVEL CHANGE STACK CABINET • OUTSIZE parm drives large objects to DSSNAP, smaller to single cabinet copy dataset • Generated copy for 1828 data sets • 198 Instant Snapshots, 1630 regular copies to cabinet copy dataset • Less than 17 Minutes elapsed time (NO OUTAGE) • Very little CPU time • Recovery time for entire application – less than 1 hour • Without BMC, recovery time with DSNUTILB over 360 minutes – 6 HOURS!!
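The size-based routing described for the OUTSIZE parm can be pictured with a small sketch. This is only an illustration of the idea, not BMC's implementation: the `DataSet` type, the `route_copies` function, the megabyte unit, and the "at or above the threshold" rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    name: str       # hypothetical object name
    size_mb: int    # assumed unit; the slide does not state one


def route_copies(data_sets, outsize_mb):
    """Split data sets into (snapshot, cabinet) lists by a size threshold.

    Assumption: objects at or above the threshold go to an instant
    snapshot, everything smaller goes to the single cabinet copy.
    """
    snapshot = [d for d in data_sets if d.size_mb >= outsize_mb]
    cabinet = [d for d in data_sets if d.size_mb < outsize_mb]
    return snapshot, cabinet


spaces = [DataSet("DB.TS1", 12000), DataSet("DB.TS2", 40), DataSet("DB.TS3", 3)]
snap, cab = route_copies(spaces, outsize_mb=1000)
print([d.name for d in snap], [d.name for d in cab])
```

One statement covering a whole database then yields two backup populations, which is the mix the slide reports (198 snapshots, 1630 cabinet copies).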
Encrypted Image Copies • Satisfies need for SOX compliance to protect financial and customer information • Encrypted Copies using DES (64-bit) or AES (128-bit) algorithms • KEYDSNAME is created at installation with restricted access • Holds key, timestamp, optional algorithm identifier, optional comment • Requires BMC® Recovery Management for DB2 Solution [Diagram: COPY DB.TS ENCHIPER YES writes encrypted output copies (illustrated as unreadable ciphertext) using the keydsn; RECOVER DB.TS with the keydsn returns the original record, e.g. Joe Blogs 123 45 6789]
Online Consistent Copy • Used for migration of a consistent set of data • Test Database creation • Data Warehouse population • No outage required • Very fast, exploits intelligent storage technology • Copy contains only committed data – uncommitted data excluded • Supports copying a group of spaces at the same point of consistency • Supports multi-dataset non-partitioned spaces • Note: CAN be used as input to recovery requiring log apply ©2006 BMC Software
BMC Snapshot Online Consistent Copy • OCC may be input to UNLOAD PLUS • OCC can be created with normal copy process [Diagram labels: MVS Operating System; DB2 ‘A’; DB2 ‘B’; BMC Copy; Snap Request; Storage Device; Snaps data set; Data set; Log Records; Apply log records to consistent point; Register ‘Online Consistent Copy’; RECOVER TOCOPY WITH NO LOGAPPLY; Migrate data to another DB2 with RECOVER TOCOPY WITH OBIDXLAT]
DB2 Recovery Manager - Overview • ISPF application with DB2 repository tables • Access DB2 Recovery Resources and … • Group objects for recovery • Validate recoverability of objects • Specify/Generate recovery jobs • Most processes available in Batch DB2 Recovery Resources • ICF Catalog • SYSLGRNG/X • SYSCOPY / Image Copies • Active Logs • Archive Logs • BSDS • DB2 Catalog • Tablespace • Indexes • RI structures Recovery Manager Repository Recovery Jobs Backup Jobs Disaster Recovery
Application Recovery - 101 • Without Recovery Manager • Determine what objects need to be recovered. • By Plan, Volume, Database • What objects are included in the above? • Where do I need to recover to? • Image Copy (which one) • Quiesce Point • Current • Build the process … • One recovery job or multiple? • Log being used in recovery • Recover Indexes or Rebuild? • Recover Plus Backout an option? • Do all spaces actually need to be recovered? • Run the jobs • Sit and wait … hopefully no abends.
Application Backup and Recovery [Diagram labels: Complete Subsystem-wide Backup; ARMBGPS; Generated Balanced Jobs; To Scheduler; Scheduled Jobs; INITIATORS; Copies]
RECOVER groups TO Current, Timestamp or RBA [Diagram labels: RECOVER Group; Generated jobs; SUBMIT; INITIATOR; Recoveries; DB2 Subsystem]
RECOVER PLUS – Fast Forward Recovery • Page built in memory, only written once • Simultaneous COPY • Simultaneous key extract • INDEX work-area can use memory to reduce I/O [Diagram labels: Active Logs; Archive Logs; LOG INPUT; LOG SORT; Table Space Copies (Full, Inc); MERGE Copies; TABLE; KEY SORT; Index Space Work Dataset; INDEX BUILD; Index]
Change Accumulation File R+/CHANGE ACCUM Job Another Recovery Resource • R+/CHANGE ACCUM : • Preprocesses/sorts log data to optimize log apply • Allows all other RECOVER PLUS options • No availability outage • No Tablespace STOPS, Locks, or Drains • Can be used in lieu of frequent copies Extract tablespace UNDO/REDO records for specified objects Sort into merge sequence (same sequence as copy output)
Index Copy and Recovery Automation • Some Indexes are better Recovered than Rebuilt • Non-partitioned Indexes can have LARGE record counts • Rebuild requires scan of all PARTs • INDEXES can be COPIED and RECOVER can apply Log Records • BMC can help • Automatically COPY Indexes based on size-threshold • Index copies can be Incremental Copies • Automatically RECOVER copied Indexes, REBUILD uncopied indexes • User does not have to specify recovery type – we decide • This can DRAMATICALLY REDUCE recovery time!! ©2006 BMC Software
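The "we decide" rule above (copy large indexes, then RECOVER what was copied and REBUILD the rest) can be sketched as a simple threshold decision. The function name, the page-count unit, and the sample threshold below are hypothetical; the product's actual criteria are not given on the slide.

```python
def plan_index_actions(index_pages, copy_threshold_pages):
    """Map each index to (backup action, recovery action).

    Assumption: an index is image-copied when its size reaches the
    threshold; copied indexes are RECOVERed from copy plus log, and
    uncopied indexes are REBUILT from the table space.
    """
    plan = {}
    for name, pages in index_pages.items():
        copied = pages >= copy_threshold_pages
        plan[name] = (
            "COPY" if copied else "NO COPY",
            "RECOVER" if copied else "REBUILD",
        )
    return plan


index_plan = plan_index_actions({"IX_BIG": 500_000, "IX_SMALL": 200}, 10_000)
print(index_plan)
```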
Point in Time Recovery – Physical Backout • The fastest way to get the database to the point prior to the application error is to remove one hour of records, rather than restoring 23 hours of records. • Should BACKOUT fail, automatically do normal Forward Recovery [Diagram labels: Image Copy; 23 Hours of Good Transactions; Recovery Point; 1 Hr of Bad Transactions; Recovery started; Backout 1 Hr of Log]
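The arithmetic behind the slide is simply a comparison of how much log each direction has to process, plus a fallback to forward recovery if the backout fails. The sketch below illustrates that reasoning with hypothetical function names and hours as the unit; it is not how RECOVER PLUS is implemented.

```python
def choose_strategy(copy_time_h, recovery_point_h, current_h):
    """Pick the direction that has less log to process.

    Forward recovery applies log from the image copy up to the recovery
    point; backout removes log from 'now' back to the recovery point.
    """
    forward_log = recovery_point_h - copy_time_h
    backout_log = current_h - recovery_point_h
    if backout_log < forward_log:
        return "BACKOUT", backout_log
    return "FORWARD", forward_log


def recover(strategy, run_backout, run_forward):
    """Run the chosen strategy, falling back to forward recovery."""
    if strategy == "BACKOUT":
        try:
            run_backout()
            return "BACKOUT"
        except RuntimeError:
            pass  # backout failed: fall through to forward recovery
    run_forward()
    return "FORWARD"


# Slide scenario: copy at hour 0, error began at hour 23, now hour 24.
decision = choose_strategy(0, 23, 24)
print(decision)
```

With the slide's numbers the backout path touches 1 hour of log instead of 23.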
Doing Nothing Smarter With XUNCHANGED • The Fastest Recovery is the one that can be AVOIDED. • How do We Know? • SYSIBM.SYSLGRNX tracks Open for Update ranges for all objects • Recovery to a Point in Time is usually an ‘application’ event • But not all objects in an application get updated every transaction • BMC Recovery Management for DB2 solution can… • Read SYSIBM.SYSLGRNX to figure out “what has changed” • Issue GENJCL BACKUP XUNCHANGED syntax • Issue GENJCL RECOVER XUNCHANGED syntax • Only application objects that have changed since the designated Point in Time will be recovered – a sometimes dramatic impact ©2006 BMC Software
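The avoidance idea can be sketched as a filter over open-for-update ranges: an object only needs recovery if some update range extends past the chosen point in time. The data layout below (a dict of `(start_rba, stop_rba)` tuples, with `None` for a range that is still open) is an assumption for illustration and is not the actual SYSIBM.SYSLGRNX row format.

```python
def changed_since(update_ranges, pit_rba):
    """Return the objects that were updated after pit_rba.

    update_ranges: {object_name: [(start_rba, stop_rba_or_None), ...]}
    A range with stop None is treated as still open for update.
    """
    changed = []
    for name, ranges in update_ranges.items():
        if any(stop is None or stop > pit_rba for _, stop in ranges):
            changed.append(name)
    return sorted(changed)


ranges = {
    "EMP.PAYROLL": [(100, 400), (900, 1200)],  # updated after RBA 1000
    "EMP.DEPT": [(100, 300)],                  # untouched since RBA 300
    "EMP.AUDIT": [(950, None)],                # still open for update
}
to_recover = changed_since(ranges, pit_rba=1000)
print(to_recover)
```

Here `EMP.DEPT` would be skipped entirely, which is where the time saving comes from.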
TIMESTAMP RECOVERY: No QUIESCE required (Forward Flavor). Statements shown: OPTION RECOVERYPOINT TIMESTAMP 2004-04-20-09.00.00.. RECOVER TABLESPACE EMP.PAYROLL [Diagram labels: Image copy at 000000000100; Quiesce at 000000000900; Bad Update at 000000001000; 000000001200; Pit Range; PIT_RBA; START_RBA; UR1; UR2]
TIMESTAMP RECOVERY: No QUIESCE required (Backout Flavor). Statements shown: OPTION RECOVERYPOINT TIMESTAMP 2004-04-20-09.00.00.. RECOVER TABLESPACE EMP.PAYROLL [Diagram labels: Quiesce at 000000000900; Bad Update at 000000001000; 000000001200; Pit Range; PIT_RBA; START_RBA; UR1; UR2]
Mining the DB2 Log Data - Log Master [Diagram labels: Member BSDS (x3); Active Logs (x3); Archive Logs (x3); DB2; Batch Log Scan; Logical Log; Repository; On-line Interface; Report Writer; Reports; SQL Generator; SQL Processor; High Speed Apply; DML; DDL; TABLE DDL Generator; Load Generator; Load File; Load Utility]
Log Master • Allows logical ‘UNDO’ of application transactions via SQL • Prevents potential data integrity problems by identifying when UNDO processing will affect updates that were performed later. • Provides Data Migration to DB2 or DS databases • Display statistics on log activity including analysis of data capture changes impact. • Comprehensive reporting • Log Information – Audit, Summary, Detail • Performance – Commit, Rollback, Image Copy, Data Capture • Backout Integrity • Miscellaneous – Open Transaction, Quiet Point • Support for recovery of Dropped Objects • High speed SQL apply feature • Conflict Resolution ©2006 BMC Software
Log Master Output – Reports • Miscellaneous Reports • Quiet Point • Find Physical Quiet Points for filtered objects • Optionally insert a QUIESCE into SYSIBM.SYSCOPY for RECOVER purposes • DURATION – Added in V310… only report on quiet points greater than or equal to the specified duration • Open Transaction • What URIDs were still active at the TO point ©2006 BMC Software
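Both reports on this slide reduce to interval questions over units of recovery. As a rough sketch, assuming each unit of recovery is a hypothetical `(urid, begin, end)` tuple on a common log scale (with `end` set to `None` when it has not ended), open transactions are the intervals covering the TO point and quiet points are the gaps where no interval is active. This illustrates the concept only and is not Log Master's algorithm.

```python
def open_transactions(urs, to_point):
    """URIDs that had begun but not yet ended at to_point."""
    return [u for (u, b, e) in urs if b <= to_point and (e is None or e > to_point)]


def quiet_points(urs, start, stop, min_duration=0):
    """Gaps (gap_start, gap_end) in [start, stop] with no active UR.

    Only gaps at least min_duration long are returned, mirroring the
    DURATION filter mentioned on the slide.
    """
    active = sorted(
        (max(b, start), min(stop if e is None else e, stop))
        for _, b, e in urs
        if b < stop and (e is None or e > start)
    )
    gaps, cursor = [], start
    for b, e in active:
        if b > cursor and b - cursor >= min_duration:
            gaps.append((cursor, b))
        cursor = max(cursor, e)
    if stop > cursor and stop - cursor >= min_duration:
        gaps.append((cursor, stop))
    return gaps


urs = [("A", 10, 20), ("B", 15, 30), ("C", 50, None)]
print(open_transactions(urs, 18))
print(quiet_points(urs, 0, 60))
print(quiet_points(urs, 0, 60, min_duration=15))
```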
Log Master Output – Reports • Information Reports • DETAIL • All Column data presented • AUDIT • Index Key Value presented • Only reports changed columns for updates • SUMMARY • No Data, Just INSERT, UPDATE, DELETE counts • CSV or SDF Format for Spread Sheet loading/analysis • SUMMARY ALL ACTIVITY • Avoids much of Log Master overhead • Not URID boundary aware • Includes Compensation Record counts as well • CSV or SDF Format also ©2006 BMC Software
UNDO - take away only the bad data • LOGMASTER for DB2 can apply UNDO SQL to get rid of bad transactions. Database remains online for optimal e-vailability. Good Transaction 1 Good Transaction 2 UNDO Bad Transactions Bad Transaction Generate UNDO SQL Apply UNDO SQL ©2006 BMC Software
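Logical UNDO is conceptually the inverse of each logged change, applied in reverse order: an INSERT becomes a DELETE, a DELETE becomes an INSERT of the before-image, and an UPDATE restores the before-image. The sketch below shows that inversion on a made-up record format; the dict layout and the naive literal formatting are assumptions for illustration, and the output is not the SQL that Log Master actually generates.

```python
def undo_sql(records):
    """Build UNDO statements from change records given in log order.

    Each record is a dict with keys: op, table, key, before, after.
    Values are formatted with repr() for brevity; real code should use
    parameter markers instead of inline literals.
    """
    statements = []
    for r in reversed(records):  # undo the most recent change first
        where = " AND ".join(f"{c} = {v!r}" for c, v in r["key"].items())
        if r["op"] == "INSERT":
            statements.append(f"DELETE FROM {r['table']} WHERE {where}")
        elif r["op"] == "DELETE":
            cols = ", ".join(r["before"])
            vals = ", ".join(repr(v) for v in r["before"].values())
            statements.append(f"INSERT INTO {r['table']} ({cols}) VALUES ({vals})")
        elif r["op"] == "UPDATE":
            sets = ", ".join(f"{c} = {v!r}" for c, v in r["before"].items())
            statements.append(f"UPDATE {r['table']} SET {sets} WHERE {where}")
    return statements


bad_transaction = [
    {"op": "INSERT", "table": "EMP", "key": {"ID": 7},
     "before": None, "after": {"ID": 7, "SALARY": 50}},
    {"op": "UPDATE", "table": "EMP", "key": {"ID": 3},
     "before": {"SALARY": 100}, "after": {"SALARY": 999}},
]
undo = undo_sql(bad_transaction)
print("\n".join(undo))
```

Because only the bad transaction's rows are touched, the rest of the database can stay online while the statements run.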
REDO - re-apply ONLY the good data • Customers can perform a point in time recovery and then re-apply good transactions using REDO SQL. Database is briefly offline to recover to consistency, then back online. • 1. Generate REDO SQL • 2. Point-in-time recovery to a quiet point prior to the bad transaction. • 3. Apply REDO SQL • Be sure to generate the REDO SQL BEFORE the RECOVER TO PIT!! [Diagram labels: Good Transaction 1; Bad Transaction; Good Transaction 2; Recovery started; REDO Good Transaction 2]
Automated Drop Recovery • Recreates dropped objects • Process is initiated from the online interface • Generates JCL and outputs to automate Drop Recovery • UNDO DDL to recreate the dropped object • Syntax for recovery and object ID translation • DB2 commands to rebind application plans that were invalidated when the object was dropped • Drop Recovery Report • Drive Recovery Technology using copy and log from Dropped Object. [Diagram labels: Log Master Technology; Scans DB2 Log Records; Recovery Plus Technology; OBID Translation; Applies log to point of DROP; Log Master Technology; Post recovery SQL and Rebind; DB2 Subsystem]
DB2 Data Migration • Don’t replicate entire files, just migrate the changes!!! [Diagram: DB2 LOG with RBA 1000, RBA 2000, RBA 3000, RBA 4000 (inflight URID 1988); three Log Master BATCH PGM runs, each producing Logical Log(+1) for a TABLE: Migrated 1000 - 2000 less inflight URID 1988; Migrated 2000 - 3000 plus inflight URID 1988; Migrated 3000 - 4000. REPOSITORY: Migrated RBA range, In-flight URIDs. Input to LOAD utility or Apply SQL process]
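The "less inflight URID 1988 / plus inflight URID 1988" bookkeeping in the diagram can be sketched as follows: each run migrates the units of recovery that committed inside its RBA range and carries the uncommitted ones forward to the next run. The data layout, the function name, and the "commit RBA below the range end" rule are assumptions for illustration only.

```python
def migrate_range(urs, lo, hi, carried):
    """Return (migrated_urids, still_inflight) for the range [lo, hi).

    urs: {urid: (begin_rba, commit_rba_or_None)}
    carried: URIDs left in flight by earlier runs (kept in a repository).
    """
    candidates = set(carried) | {u for u, (b, _) in urs.items() if lo <= b < hi}
    migrated = sorted(
        u for u in candidates if urs[u][1] is not None and urs[u][1] < hi
    )
    return migrated, candidates - set(migrated)


# URID 1988 begins before RBA 2000 but commits at 2400 (hypothetical values).
log_urs = {1200: (1200, 1300), 1988: (1988, 2400),
           2500: (2500, 2600), 3100: (3100, 3200)}

run1, inflight1 = migrate_range(log_urs, 1000, 2000, carried=set())
run2, inflight2 = migrate_range(log_urs, 2000, 3000, carried=inflight1)
run3, inflight3 = migrate_range(log_urs, 3000, 4000, carried=inflight2)
print(run1, run2, run3)
```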
Recover Plus Output Options: More Than Just Recovery. Statements shown: OPTION RECOVERYPOINT TIMESTAMP 2004-04-20-09.00.00.. RECOVER TABLESPACE EMP.PAYROLL [Diagram labels: INPUT IMAGE COPY; LOGS; INCOPY OBIDXLAT; OUTCOPY ONLY; OUTPUT IMAGE COPY; INDEP OUTSPACE OBIDXLAT; DROPRECOVERY INCOPY OBIDXLAT; DB21; DB22; PROD; TEST]
Disaster Recovery • Options from weekly dumps to offsite logging • Dumps - Simple, cheap, maximum data loss • Weekly dumps means several days data loss • Remote Mirror - Complex, expensive, no data loss • Disk, Network, Software, Facilities, Operations • Compromise - Periodic vaulting of Copies & Logs • Daily or hourly log shipment will minimize data loss Cost Complexity Data Loss Outage Time ©2006 BMC Software
Disaster Recovery Preparation • Generation of recovery JCL for DB2 system tables and BMC tables. • Grouping and generation of recovery JCL based on application or other criteria. • Recovery Simulation • Test and validate your recovery locally. • Can provide input into DR planning. • DR Mirroring • Monitors and reports existence/persistence of remote mirror volumes • Mirroring support reflected in JCL Generation. • Pick tape lists for recovery copies and logs ©2006 BMC Software
Recovery Management for DB2 DR support (Sysprog and DBA) • Offsite log recovery without complexity • Less data loss than weekly dumps • Automated process, easy to implement • Dialog driven generation of ARM Utilities [Diagram labels: DB2 Catalog/Directory & BMC RM Repository Remote Full Copies (Change or Reference); ARMBSRR (Gen System); ARMBGEN (Gen Apps JCL); ARMBLOG (Switch Active Log); ARMBARC (copy log); Application Remote Copies (Full/Incremental, Change/Reference); ICF Dump; TMS Pull; Truck]
Disaster Recovery At the recovery site. • Generates necessary JCL to restart DB2 subsystem. • Supports mirrored subsystems based on installation mirroring configuration. • Recovers system resources and critical BMC repositories (if not mirrored) • Generates JCL for recovery of application data based on load balancing and job synchronization • Automatically collects statistical data on actual recoveries. • Archives recovery statistics for updating history repositories at local site. ©2006 BMC Software
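The slides do not say how recovery jobs are load balanced. One common way to spread work evenly, shown here purely as an assumed illustration, is the greedy "longest job first" heuristic: sort objects by estimated recovery time and always add the next one to the currently lightest job. The function name and the minute estimates are hypothetical.

```python
import heapq


def balance_jobs(estimates, n_jobs):
    """Spread objects over n_jobs so total estimated minutes are even.

    estimates: {object_name: estimated_recovery_minutes}
    Returns a list of (total_minutes, [object_names]) per job.
    """
    heap = [(0, i, []) for i in range(n_jobs)]  # (load, job index, members)
    heapq.heapify(heap)
    by_cost = sorted(estimates.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, cost in by_cost:
        total, i, members = heapq.heappop(heap)  # lightest job so far
        members.append(name)
        heapq.heappush(heap, (total + cost, i, members))
    return [(total, members) for total, _, members in sorted(heap, key=lambda t: t[1])]


jobs = balance_jobs({"TS1": 60, "TS2": 30, "TS3": 20, "TS4": 10}, n_jobs=2)
print(jobs)
```

Estimates of this kind could plausibly come from the recovery statistics the deck says are collected automatically, though the slides do not describe that link.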
Remote Site Execution • Remote site DB2 startup is easy • Assuming all the tapes made it!! [Diagram labels: Truck; TMS Restore; ICF Restore; ARMBSRR Job 1 (CLI, VSAM Allocates Initialize Actives); ARMBSRR Job 2 (Cat/Dir/RM Recoveries); Application Dataset Restores; Application Database Recoveries; Business Resumption (as of ‘last nights’ ARCHIVE LOG point)]
Recovery Estimation / Simulation • Estimation • Predict Recovery Times • Based on history maintained for largest tablespaces • Current tablespace attributes and artifacts • Simulation • Validates recovery artifacts through actual execution of the recovery at local site without disruption to ‘real’ data. • Provides input to decisions on DR planning. • Reporting • Online and batch reporting available to view recovery statistics of actual, estimated and simulation runs. ©2006 BMC Software
BMC Software Recovery Value [Diagram labels: Return On Investment; Intelligence; Innovation; Additional Daily Benefits (Migration, Replication, Audit Reports, Encrypt Copies); Physical Backout; Timestamp Recovery; Transaction Recovery; Auto Disaster Recovery; E-Net Remote Replication; Business Continuity; Hybrid Copy; Drop Recovery; Recovery Groups; Recovery Avoidance; Assess and Improve Recoverability of Data; B&R Options; Multi-Vendor Hardware Support; Snapshot Copy; Rapid Recovery; Index B&R]
BMC Software Recovery Management Value Proposition • Reduce or eliminate planned and unplanned outages • Improve Application Availability • Perform backup while applications are online • Perform ‘logical’ recoveries while applications are online • Perform ‘physical’ recovery with only log input • Manage Complexity • Automate complicated backup and recovery processes • Leverage investment in intelligent storage devices • Prepare for local and disaster recovery scenarios • Validate recoverability and recovery assets • Increase staff and resource efficiencies • Simplify the recovery process for the DBAs • Assure successful and consistent recovery • Utilize computing resources efficiently