760 likes | 1.2k Views
BID211: High Availability & Disaster Recovery With Sybase IQ. Sid Sipes Principal System Consultant sipes@sybase.com / 972-687-6477 August 7, 2003. For This Presentation. Let’s Keep It Interactive I will entertain questions on a subject Would be happy to speak to you offline, if desired
E N D
BID211: High Availability & Disaster Recovery With Sybase IQ Sid Sipes Principal System Consultantsipes@sybase.com / 972-687-6477August 7, 2003
For This Presentation • Let’s Keep It Interactive • I will entertain questions on a subject • Would be happy to speak to you offline, if desired • No Question is too Basic • Chances are others may have the same question • Open Question Forum at the end • Time Permitting
Target Audience • Data Warehouse Management • Database Administrators • All Levels • System Administrators • All Levels
Disaster Recovery Agenda • We will be discussing: • Why Do I Need to Backup A Data Warehouse? • How do you Backup Terabytes of data? • Are there any 3rd party tools? • Alternative Backup/Restore Techniques • We will also be discussing Disaster Recovery Planning and Implementation
Disaster Recovery Why do I need to Backup My Database? Why do I need a Disaster Recovery Plan? • There seems to be an attitude that Data Warehouses do not need the same type of safeguards that OLTP systems require!!!!
Disaster Recovery Why Bother with Backups? • A Data Warehouse is Not Mission Critical • Who cares if it goes down for a while? • Only a few analysts will miss it!!! • Very few on-line users • I can always rebuild it • A Data Warehouse only contains historical data. • The source data is around, somewhere in the OLTP systems
Disaster Recovery Here’s Why: • A Data Warehouse is Not Mission Critical • Not True Anymore!!!! • Many Data Warehouse systems are 24x7 • Decisions made on very recent data • Many users are demanding historical data combined with “Near-RealTime” data. • Potential for many thousands of on-line users via web interface
Disaster Recovery I can always rebuild it ! • How? • Where is the Data? • Do you have access to source systems? • What impact will massive extracts have on source systems? • How efficient is your ETL process? • Do you have the network bandwidth to move that much data?
Disaster Recovery I can always rebuild it ! • Chances are great that you can not rebuild your Data Warehouse. • Not Enough Time • Data doesn’t exist anymore (purged from OLTP systems) • Many, Many, Many, Many More reasons why!!! • Things only get worse with users breathing down your neck.
Disaster Recovery It will Happen to YOU!!!!!!! • Backup / Recovery and Disaster Planning are Critical Components of any Data Warehouse Operational Plan. • Sooner or later, you will loose your database • Hardware Problem • Data Corruption • Gremlins • Plan for It
Disaster Recovery Ok, How do I backup Terabytes of Data? • Just How Do I perform Backups Given: • Huge Amounts of Data!! • Slow Backup Devices!! • Not Enough Time • What are the Alternatives?
Disaster Recovery Ok, How do I backup Terabytes of Data? • How Much Time Does a Backup Take? • How Do I handle Terabytes of Data? • Question Recently Asked on IQUG! • Case Size Disk/ File Time Time Rate • (MB) Tape Size (Secs) (HHMM) (MB/Sec)
Disaster Recovery Backup Rates • Best Rate Reported • 55.6 MByte/Sec • ~ 5 Hours / TeraByte!!! • Slowest Rate Reported: • 3.7 MByte/Sec • ~ 75 Hours / TeraByte!!!!
Disaster Recovery Another Interesting Case Sybase ASIQ Client Recently Ran Tests on VLDB • Anticipating 10+TB IQ Database • Tested Backup to Conventional Tape Arrays • 27 Tape Array (20/30 Gig Tapes) • Requires 400+ Tapes to Backup Database • High Probability of Tape Failure • Requires 27 Hours to Backup Database
Disaster Recovery Too Slow!!!! What are the Backup Alternatives? • Sybase ASIQ Built-In Backup • ASIQ Backup Alternatives • Fast Extract • System Level Backup • Third Party Backup Utilities • “Instant” Disk Level Imaging
Disaster Recovery Sybase ASIQ Built-in Backup • Backup Utility Built Into Server • Adaptive Server IQ provides three types of backups: • FULL • INCREMENTAL • INCREMENTAL SINCE FULL
Disaster Recovery Sybase ASIQ Built-in Backup • FULL Backup • Causes a full backup of both the Catalog Store and the IQ Store. • FULL is the default action.
Disaster Recovery Sybase ASIQ Built-in Backup • INCREMENTAL Backup • Makes a full backup of the Catalog Store, • Then backs up all changes to the IQ Store since the last IQ backup of any type.
Disaster Recovery Sybase ASIQ Built-in Backup • INCREMENTAL SINCE FULL • Makes a full backup of the Catalog Store, • Then backs up all changes to the IQ store since the last full IQ backup.
Disaster Recovery Sybase ASIQ Built-in Backup • All three backup types fully back up the Catalog Store. • In most cases, the Catalog Store is much smaller than the IQ Store. • Should be very fast • Temporary Store data is not backed up. • However, the meta data and any other information needed to recreate the Temporary Store structure is backed up.
Disaster Recovery Sybase ASIQ Built-in Backup • BACKUP backs up committed data only. • Adaptive Server IQ backs up only those database blocks actually in use at the time of backup. • Free blocks are not backed up • It does not back up the transaction log file. • It does not use the transaction log to restore the database.
Disaster Recovery Sybase ASIQ Built-in Backup • Backups begin with an automatic checkpoint. • At this point, the backup program determines what data will be backed up. • It backs up the current snapshot version of your database as of the time of this checkpoint. • Any data that is not yet committed when this checkpoint occurs is not included in the backup
Disaster Recovery Sybase ASIQ Built-in Backup • BACKUP always makes a full backup of the Catalog Store on the first archive device, and then backs up the data from the IQ Store in parallel across all of the devices you specify. • Blocks are not distributed evenly across archive media. • You may have more on one device than others, depending on the processing speed of individual threads
Disaster Recovery Sybase ASIQ Built-in Backup • You can back up any IQ database onto either disk or magnetic tape. • AS IQ supports backup and restore using multiple tape drives at near device speeds, or to multiple disks.
Disaster Recovery Sybase ASIQ Built-in Backup • When Backing Up / Restoring from Multiple Devices: • Beware of Exceeding I/O Bandwidth to SAN/Disks with Too Many Tape Drives • Each Drive operates at about 20MB / Sec • Typical I/O Bandwidth is 1 GigaBit (~200 MB/Sec) • 27 Drives * 20 MB/Sec = 540 MB/Sec I/O Requests • You’ve just blown your bandwidth for a single Fiber Channel
Disaster Recovery Sybase ASIQ Built-in Backup • Adaptive Server IQ allows Stacker drives with multiple tapes • Adaptive Server IQ BACKUP does not support • jukeboxes • robotic loaders. • Quarter Inch Cartridge (QIC) drives.
Disaster Recovery Sybase ASIQ Built-in Backup • Is ASIQ Backup Fast Enough? • Probably not for Terabytes of Data • Should Consider Other methods • Tuning for Faster ASIQ Backups is Possible • May provide enough performance • We’ll explore other options in a few slides
Disaster Recovery Sybase ASIQ Built-in Backup Increasing Performance • Increasing the number of archive devices • BACKUP and RESTORE write your IQ data in parallel to or from all of the archive devices you specify. • The Catalog Store is written serially to the first device. • Faster backups and restores result from greater parallelism. • To achieve greater performance when backing up or restoring a large database, specify more archive devices.
Disaster Recovery Sybase ASIQ Built-in Backup Increasing Performance • Eliminating data verification • You can also improve the speed of backup and restore operations by setting • CRC OFF in the BACKUP command. This setting deactivates cyclical redundancy checking. • Default is CRC ON • WARNING – Introduces some Risk
Disaster Recovery Sybase ASIQ Built-in Backup Increasing Performance • Spooling backup data • You may find that it is faster and more efficient to create backups on disk, and then spool them onto tape for archival storage. • If you choose this approach, you need to unspool the data onto disk before restoring it.
Disaster Recovery Sybase ASIQ Built-in Backup Increasing Performance • Increasing memory used during backup • The amount of memory used for buffers during backup directly affects backup speed, primarily for tape backups. • The BLOCK FACTOR parameter of the BACKUP command controls the amount of memory used. • If your backups are slow, you may want to increase the value of BLOCK FACTOR for faster backups.
Disaster Recovery Sybase ASIQ Built-in Backup Increasing Performance • Balancing system load • Adaptive Server IQ allows you to perform backups concurrently with all other read/write operations, except those that affect the structure of the database. • It is still a good idea to schedule backups during times of low system use, however, to make the best possible use of system resources—disk, memory, and CPU cycles.
Disaster Recovery Sybase ASIQ Built-in Backup • OK, ASIQ Backup/Restore May Not Be Fast Enough. • What do I do? • Alternatives • Fast Extract • System Level Backups • Array Vendor “Instant” Backup • Shadow-Image / BCV • True-Copy • SRDF
Disaster Recovery ASIQ Fast Extract. • Use ASIQ Built-In Fast-Extract option • SQL Query specifies what to Backup • Data extracted to ASCII or BINARY files • Single or Multiple Tables • Useful for Backing Up/Restoring Huge Tables • Data Restored using IQ SQL Load Syntax • Very Fast Operation
Disaster Recovery ASIQ Fast Extract. • Downside – Does not save/restore any metadata or system information. • Little or no error checking
Disaster Recovery System Level (OS) Backups • You can use system-level backups for an IQ database. • For Example, DD raw devices. • Must be done carefully, and with certain safe-guards in place • If you attempt to restore your IQ database files from a system-level backup without these safeguards in place, • You will cause data loss or corruption, either from activity in the database while the system-level backup occurred, or from missing files.
Disaster Recovery System Level (OS) Backups • Shutting down the database (non-multiplex) • Your IQ database MUST NOT be running during a system-level backup. • You must shut down your IQ database before starting the system-level backup. • You must also ensure that no one starts the IQ database until the system-level backup is complete
Disaster Recovery System Level (OS) Backups • Shutting down the database (multiplex) • Your multiplex WRITE SERVER MUST NOT be running during a system-level backup. • You must shut down your write server before starting the system-level backup. • You must also ensure that no one starts the write server until the system-level backup is complete.
Disaster Recovery System Level (OS) Backups • You must Backup the following Files • SYSTEM dbspace file, • typically named dbname.db • The transaction log file, which is required for system recovery, • typically named dbname.log • The IQ_SYSTEM_MAIN dbspace file, • typically named dbname.iq • Or the corresponding raw device(s). • Files for any additional dbspaces that have been added to IQ_SYSTEM_MAIN
Disaster Recovery System Level (OS) Backups • You only need to save the lengths of the following files: • The IQ_SYSTEM_TEMP dbspace file, • typically named dbname.iqtmp • Or, the corresponding raw device(s) size. • Files / Devices for any additional dbspaces that have been added to IQ_SYSTEM_TEMP
Disaster Recovery System Level (OS) Backups • System Level Backup • It is not required that you back up the temporary dbspaces. • IQ can reconstruct any temporary dbspace provided that it sees a file of the correct length at the time the database starts. • Therefore, you may simply keep records of the sizes of the files or raw devices used to hold the temporary dbspaces.
Disaster Recovery System Level (OS) Backups • Optional Files • The ASCII message files such as dbname.iqmsg • The $ASLOGDIR/*.svrlog • The $ASLOGDIR/*.stderr files. • Not Required, but contain useful information • ASIQ Engineering may need these files, should something go catastrophically wrong during the restore.
Disaster Recovery Third Party Backup Utilities • Veritas Net Backup • Now GA • Certified (by Veritas and Sybase) for Sybase ASIQ • Requires ASIQ 12.4.3 EBF 4 (or Newer) • Similar in Implementation to Net Backup Agent for Sybase ASE • Will Fit within Existing Veritas Installations
Disaster Recovery Third Party Backup Utilities • Veritas Net Backup • Fits within Veritas Enterprise Architecture • Works with • Veritas Master Node (Scheduler) • Veritas Media Server • Veritas Client (Net Backup Agent
Disaster Recovery Third Party Backup Utilities • Veritas Net Backup • Currently Single-Threaded. • Can only write to a single archive device at a time. • Sybase and Veritas working to add parallelism
Disaster Recovery Third Party Backup Utilities • Sun Solaris HA Suite • Provides ability to take Snap-Shot of Disk Devices • Full Snap-Shot • Incremental Snap-Shots • Might be able to provide Point-In-Time recovery. • MUST OBSERVE Same SAFEGARDS as OS/System Level Backups • Simplex – Server must be shutdown • Multiplex – Write Server must be shutdown
Disaster Recovery What About Disk Array Replication? • Array Vendors (EMC, Hitachi, Sun) offer the ability to remotely replicate all of your IQ data base devices • EMC – SRDF • Hitachi – True Copy? • Replication is Transparent to IQ • Entire Database is Replicated • Catalog • IQ Store • Transaction Log
Disaster Recovery Disk Array Replication • ASIQ Is Certified with the Vendors • EMCC • Hitachi • Hitachi
Disaster Recovery Disk Array Replication • Instant Recoverability • Little Down Time • Both Local and Remote copies of data • True Disaster Prevention • Some of you are already backing up to disk • This will automate the process • You can still perform tape backups on remote copy of database
Disaster Recovery Disk Array Replication • How Does it Work? • As changes are made to disk blocks, the disk array controller • captures the changes • Replicates in near-real-time the changed block to a mirroring set of disks. • Similar to Disk Mirroring • Can replicate to other Array Frames across LAN • Can replicate to remote frames, across WAN.