Oracle B&R for Physics Databases

Explore CERN's Oracle B&R policies, techniques, performance measurements, and recovery scenarios for maintaining data integrity. Learn about RMAN, custom scripts, incremental backups, and more.


Presentation Transcript


  1. Oracle B&R for Physics Databases LCG 3D Workshop, September 2006 Luca Canali, CERN IT

  2. Outline
     • Oracle Backup & Recovery
       • Goals and B&R policies
       • Technology choices
     • CERN’s use of RMAN
       • Architecture of our backup solution
       • Deployment
       • Performance
       • Lessons learned
     LCG 3D Workshop, CERN 14-09-2006 - 2

  3. B&R
     • ‘Thou shalt not lose data’
       • In case of failures and corruption (HW, software, human error)
       • Full and ‘point in time’ recoveries
       • Disaster recovery
     • B&R: agree on a policy with the users
       • Retention period
       • Max recovery time
       • Disaster recovery scenarios

  4. Oracle B&R
     • Many techniques, no out-of-the-box solution
       • May change with ‘Oracle Secure Backup’ and EM (Enterprise Manager)
       • Need additional effort
     • Techniques
       • RMAN + media manager (recommended)
       • ‘Manual scripts’ and manual tape archiving (low budget)
       • Snapshot copies (high-end storage)
     • Additional effort:
       • Develop, maintain and schedule backup scripts
       • Interface with the ‘tape group’ / sysadmin
       • Overall, the DBA team must maintain specialized B&R knowledge

  5. RMAN @ CERN
     • RMAN is used to back up all production DBs
     • The Media Manager is Tivoli (IBM)
       • Run by a specialized group
       • Holds several tens of TB of space
       • High-end architecture and currently growing
       • Physics DBs are the main clients
     • Tape drives are expensive
       • Trade-off between performance and cost

  6. Custom Scripts
     • Scripts developed in house
     • Dedicated server for scheduling and monitoring
     • RMAN 10g backups scheduled:
       • Level 0 to tape
       • Level 1 cumulative to tape
       • Level 1 differential to tape
     • Extra protection (recommended):
       • Full backup to disk
       • Leverage 10g incremental recovery of copy to disk
       • We keep the backup to disk 2 days behind production
       • Note: need a large flash recovery area
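
The backup kinds listed on this slide can be sketched as RMAN command text generated from Python. This is only an illustration, not CERN's in-house scripts: the function names, the channel name `t1`, the tag `disk_copy`, and the use of a plain `sbt` channel without media-manager parameters are all assumptions. The RMAN statements themselves follow the standard 10g syntax for incremental backups and for an incrementally updated image copy that is kept a fixed number of days behind production.

```python
# Sketch only: names, tag and channel setup are assumptions,
# not the production scripts described on the slide.

TAPE_BACKUPS = {
    "level0": "BACKUP INCREMENTAL LEVEL 0 DATABASE;",
    "level1_cumulative": "BACKUP INCREMENTAL LEVEL 1 CUMULATIVE DATABASE;",
    "level1_differential": "BACKUP INCREMENTAL LEVEL 1 DATABASE;",
}


def tape_backup_script(kind: str) -> str:
    """Return an RMAN RUN block for one of the three tape backup kinds."""
    command = TAPE_BACKUPS[kind]  # raises KeyError for an unknown kind
    return "\n".join([
        "RUN {",
        "  ALLOCATE CHANNEL t1 DEVICE TYPE sbt;",  # media-manager PARMS omitted
        "  " + command,
        "}",
    ])


def disk_copy_script(tag: str = "disk_copy", lag_days: int = 2) -> str:
    """Return an RMAN RUN block that rolls an image copy forward.

    The copy is recovered only up to `lag_days` ago, so it stays that many
    days behind production; a new level 1 backup is then taken for the
    next roll-forward.
    """
    return "\n".join([
        "RUN {",
        f"  RECOVER COPY OF DATABASE WITH TAG '{tag}' UNTIL TIME 'SYSDATE-{lag_days}';",
        f"  BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG '{tag}' DATABASE;",
        "}",
    ])


if __name__ == "__main__":
    print(tape_backup_script("level1_cumulative"))
    print(disk_copy_script())
```

Keeping the disk copy two days behind means a logical error noticed within that lag has not yet been applied to the copy, at the cost of the disk space the slide warns about.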

  7. Backup scheduling and retention
     • Has to be agreed upon with the experiments
     • Current status for LHC DBs
       • Full, every 2 weeks
       • Level 1 cumulative, twice per week
       • Level 1 differential, every day
       • Archive logs, every hour
     • Retention
       • RMAN retention policy: recovery window of 31 days
       • Guarantees that backups are kept for recovery to any point in time during the last 31 days
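
As a rough illustration of the schedule and retention rule above, the following sketch picks the backup type for a given day and checks whether a recovery target falls inside the 31-day window. The slide gives only the frequencies, so the anchor date of a full backup and the weekdays chosen for the cumulative backups are hypothetical, as are the function names. The hourly archive log backups are not modelled.

```python
from datetime import date, datetime, timedelta

FULL_ANCHOR = date(2006, 9, 3)   # hypothetical date of one full backup (a Sunday)
CUMULATIVE_WEEKDAYS = {2, 6}     # hypothetical: Wednesday and Sunday (Monday is 0)
RECOVERY_WINDOW_DAYS = 31        # from the slide


def backup_kind(day: date) -> str:
    """Pick the backup type to run on `day` under the assumed calendar."""
    if (day - FULL_ANCHOR).days % 14 == 0:
        return "full"                      # every 2 weeks
    if day.weekday() in CUMULATIVE_WEEKDAYS:
        return "level1_cumulative"         # twice per week
    return "level1_differential"           # every other day


def within_recovery_window(target: datetime, now: datetime) -> bool:
    """True if `target` is a point in time the retention policy still covers."""
    return now - timedelta(days=RECOVERY_WINDOW_DAYS) <= target <= now


if __name__ == "__main__":
    for offset in range(15):
        day = FULL_ANCHOR + timedelta(days=offset)
        print(day.isoformat(), backup_kind(day))
```

In this sketch a full backup takes precedence when it falls on a cumulative day, so such a week gets one cumulative backup instead of two.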

  8. Recovery Scenarios
     • Typical scenarios:
       • Point-in-time recovery because of human error
       • Full recovery because of HW fault
       • Disaster recovery from tape
     • Point-in-time recoveries
       • Time-consuming
       • Better done in a separate environment
       • 10g ‘flashback’ technology can often be used instead of backups to solve the underlying issue

  9. Test Recovery System
     • A cluster dedicated to test recoveries
       • Installed like the production DBs
     • Used to restore production backups
       • All production procedures must be tested
       • RMAN has an unfriendly syntax that needs practice on a regular basis (~ twice per year)
     • Used for point-in-time recoveries
       • Recovered data to be exported back to production
     • Can be activated as production, if needed

  10. Performance
      • Measurements from a VLDB: Compass
      • Full backup
        • Backup with 2 channels
        • Every tape channel has a throughput of ~30 MB/sec
        • Full backup to tape: 3.5 TB in 16 hours
      • Incremental backups
        • We use block change tracking
        • Average daily activity for the month of August 2006
          • Block change tracking file = 420 MB
          • Incremental level 1 (differential): 9 minutes
          • Backup size to tape: 16 GB
          • Corresponding archived redo logs: 58 GB of redo, 230 files
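
The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes decimal units (1 TB = 10^6 MB, 1 GB = 10^3 MB) and that both channels sustain the quoted ~30 MB/sec for the whole full backup; the function names are illustrative. Under those assumptions, two channels give about 60 MB/sec, which puts a 3.5 TB full backup at roughly 16 hours, in line with the measured value. The same arithmetic gives an average rate for the daily incremental and an average size per archived redo log file.

```python
MB_PER_GB = 1000          # decimal units are an assumption
MB_PER_TB = 1000 * 1000


def hours_to_backup(size_tb: float, channels: int, mb_per_sec_per_channel: float) -> float:
    """Estimated duration in hours if every channel runs at its full rate."""
    seconds = size_tb * MB_PER_TB / (channels * mb_per_sec_per_channel)
    return seconds / 3600


def rate_mb_per_sec(size_gb: float, minutes: float) -> float:
    """Average throughput in MB/sec for a backup of `size_gb` taking `minutes`."""
    return size_gb * MB_PER_GB / (minutes * 60)


full_hours = hours_to_backup(3.5, channels=2, mb_per_sec_per_channel=30)
incr_rate = rate_mb_per_sec(16, minutes=9)
avg_log_mb = 58 * MB_PER_GB / 230

if __name__ == "__main__":
    print(f"Full backup estimate: {full_hours:.1f} hours")
    print(f"Incremental level 1 average rate: {incr_rate:.1f} MB/sec")
    print(f"Average archived redo log size: {avg_log_mb:.0f} MB")
```

The incremental works out to just under 30 MB/sec on average, about one tape channel's worth, although the slide does not say how many channels the incremental backup used.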

  11. Conclusions
      • B&R for Oracle at CERN:
        • RMAN and Tivoli
        • Custom scripts + dedicated backup scheduler server
        • Incremental backup strategy
        • Backups to disk complementing tape
        • Running for many years
      • Define and test main recovery scenarios
        • With dedicated HW
      • Performance measurements
        • Block change tracking for level 1 backups
