Lecture no. 28 - Backup
TDT4285 Planlegging og drift av IT-systemer (Planning and Operation of IT Systems), Spring 2011
Anders Christensen, IDI
TDT4285 Planl&drift IT-syst
Definition
Backup is the use of redundant storage of information to avoid loss due to human error, crashes, mishaps, etc., i.e. by removing the single point of failure with respect to storage.
Dimensions of backup
• Domain: all data or just the important data?
• Levels: all data or just changed data?
• Degrees: just file contents, or metadata too?
• Redundancy: number of copies of each file
• Granularity: how often a backup is taken
• Versioning: number of versions kept
• Change rate: percentage of the file contents that changes
• Medium: what kind of backup medium is used
Levels of backup
[Figure: timeline with a full backup (level 0) followed by incremental backups at level 1 and level 2. Each incremental backup is based on the backup at the level below it and holds only the new or changed files among the example files "Large file", "Your file", "Myfile", "Some file" and "New".]
Full and incremental backup
Full backup: a copy of all data.
Incremental backup, level 1: a copy of all new and changed data since the last full backup.
Incremental backup, level 2: a copy of all new and changed data since the last incremental backup at level 1.
And so on for higher levels.
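The level rule above can be sketched in a few lines of Python. This is only an illustration, not the lecture's own tooling: the function `files_to_back_up` and the two mappings `mtimes` and `last_backup_at` are hypothetical names, and the time stamps are made-up numbers. The rule implemented is the one stated on the slide: a level n backup copies everything that is new or changed since the latest backup at level n - 1.

```python
from typing import Dict, List


def files_to_back_up(level: int,
                     mtimes: Dict[str, float],
                     last_backup_at: Dict[int, float]) -> List[str]:
    """Return the names of the files a backup at `level` should copy.

    mtimes maps file name -> last modification time.
    last_backup_at maps backup level -> time of the latest backup at that level.
    """
    if level == 0:
        # Full backup: copy everything.
        return sorted(mtimes)
    # Level n: copy what is new or changed since the latest level n - 1 backup.
    reference = last_backup_at[level - 1]
    return sorted(name for name, t in mtimes.items() if t > reference)


# Hypothetical example: full backup at t=10, level 1 backup at t=20.
mtimes = {"large_file": 5.0, "your_file": 12.0, "my_file": 25.0}
last_backup_at = {0: 10.0, 1: 20.0}

for level in (0, 1, 2):
    print(level, files_to_back_up(level, mtimes, last_backup_at))
```

Restoring then requires the latest full backup plus the latest backup at each higher level, which is why the length of a complete cycle matters.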
Trends in backup
[Figure: development over time. Labels: tape backup, cheap disks, RAID, versioning, versioned file systems.]
Three degrees of backup
1. Mirror image: contents, metadata and "implementation"
2. All data: contents and metadata
3. File contents: only file contents
[Figure: each degree drawn as blocks labelled "Index" and "File contents".]
Metadata which may be copied
• Access control lists (ACLs)
• User and group ownership
• All time stamps related to the file (typically at least 3-4 on most operating systems)
• Information about "holes" in files (sparse files)
• Device files, special files and links
• File attributes (read-only, hidden, system file, ...)
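As a rough illustration of what "metadata too" means in practice, the sketch below reads several of the items listed above with Python's standard `os.lstat`. The function name `collect_metadata` and the dictionary keys are hypothetical. ACLs are not available through `os.lstat` and are left out, and the sparse-file check is only a heuristic that relies on the Unix-only `st_blocks` field.

```python
import os
import stat
import tempfile


def collect_metadata(path: str) -> dict:
    """Gather metadata a backup may need in addition to the file contents."""
    st = os.lstat(path)  # lstat describes a link itself, not its target
    info = {
        "mode": stat.S_IMODE(st.st_mode),  # permission bits
        "uid": st.st_uid,                  # owning user
        "gid": st.st_gid,                  # owning group
        "atime": st.st_atime,              # last access
        "mtime": st.st_mtime,              # last content modification
        "ctime": st.st_ctime,              # metadata change (Unix) or creation (Windows)
        "size": st.st_size,
        "is_symlink": stat.S_ISLNK(st.st_mode),
        "is_device": stat.S_ISBLK(st.st_mode) or stat.S_ISCHR(st.st_mode),
    }
    # Heuristic only: st_blocks (where available) counts allocated 512-byte
    # blocks. Fewer allocated bytes than the logical size hints at holes, but
    # compression or inline storage in the file system can give the same result.
    blocks = getattr(st, "st_blocks", None)
    info["maybe_sparse"] = blocks is not None and blocks * 512 < st.st_size
    return info


# Hypothetical usage on a small temporary file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
meta = collect_metadata(f.name)
os.remove(f.name)
print(meta["size"], meta["is_symlink"], meta["is_device"])
```

A backup that stores only file contents loses all of these fields, so a restored file would get default ownership, permissions and time stamps.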
Backup methods
[Figure: overview of backup methods. Labels: user (the user is personally responsible for ensuring that backups are taken); versioning (all versions of each file are stored); hard disk, synchronised; backup database (all data is stored once in a database); tape backup, full and incremental (tape station, 1 tape per day).]
Redundancy and granularity
Assume this schedule:
• Full backup every month
• Incremental level 1 every week
• Incremental level 2 every day
• Tape rotation: 6 months
Worst case: a file with a life span of 1 month is stored on only a single tape.
For every defective tape, how many files are "permanently" lost?
What is the probability of a successful restore, given the life span of a file and the rate of tape defects?
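One simple way to approach the last question is sketched below. It rests on assumptions that are not in the slides: every tape is defective independently with the same probability, and a restore succeeds as long as at least one tape holding the file is readable. Under those assumptions a file stored on k tapes is restorable with probability 1 - p^k. The function name and the 5 % defect rate are hypothetical.

```python
def restore_probability(copies: int, tape_defect_rate: float) -> float:
    """Probability that at least one of `copies` tapes is readable.

    Assumes independent tape defects with the same rate for every tape.
    """
    if copies < 0:
        raise ValueError("copies must be non-negative")
    if not 0.0 <= tape_defect_rate <= 1.0:
        raise ValueError("tape_defect_rate must be between 0 and 1")
    # All copies are lost only if every tape is defective: p ** copies.
    return 1.0 - tape_defect_rate ** copies


# Hypothetical defect rate of 5 % per tape.
for copies in (1, 2, 4):
    print(copies, restore_probability(copies, 0.05))
```

The worst case on the slide corresponds to `copies = 1`, where the chance of losing the file equals the tape defect rate itself. Files with a longer life span end up on more tapes and are correspondingly safer.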
Do you need to copy the whole file?
[Figure: two example files holding records (car makes and months from January to August).]
• Database file (records are updated in place): back up the whole file, or back up only the changes
• Append-only log file (records are appended): back up the whole file, or back up only the added records
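For the append-only case, backing up only the added records can be as simple as remembering how many bytes were copied last time. The sketch below shows the idea, assuming the log is never rewritten or truncated; `backup_appended` and the file names are hypothetical.

```python
import os
import tempfile


def backup_appended(log_path: str, backup_path: str, offset: int) -> int:
    """Copy the bytes added to log_path since `offset`; return the new offset."""
    with open(log_path, "rb") as src:
        src.seek(offset)          # skip what was already backed up
        new_data = src.read()
    with open(backup_path, "ab") as dst:
        dst.write(new_data)       # append only the new records
    return offset + len(new_data)


# Hypothetical usage: two backup runs with one record appended in between.
with tempfile.TemporaryDirectory() as d:
    log = os.path.join(d, "app.log")
    bak = os.path.join(d, "app.log.bak")
    with open(log, "ab") as f:
        f.write(b"record 1\n")
    offset = backup_appended(log, bak, 0)
    with open(log, "ab") as f:
        f.write(b"record 2\n")
    offset = backup_appended(log, bak, offset)
    with open(bak, "rb") as f:
        backed_up = f.read()

print(offset, backed_up)
```

This shortcut does not work for a database file that is updated in place: there the backup program must either track which parts changed or copy the whole file.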
Backup: proactive or reactive?
[Figure: timeline from the original hard disk, through a crash, to the restored hard disk, with tape as the backup medium. Proactive phase (before the crash): backup, routine. Reactive phase (after the crash): restore, time critical.]
Location for storing backups
[Figure: storage locations numbered 1-5 set against threats. Threats: burglary, small fire, large fire, fire with loss of the building, landslide. Locations: tape robot, same room, fire safe, same building, next-door building.]
Archived backup
• Make a separate set of backups, apart from the daily backup sets
• Extract a set of tapes from the daily sets: a selection that comprises a complete backup set
• Remove a RAID 1 disk from the backup repository
• Store an additional full backup of every partition on a separate archive tape station (or in a staging area, written to tape when complete)
Time aspects of backup
• Backups are taken more often than restores are performed, so automating backup is more important
• Timing is more often critical when restoring (reactive)
• Restore is something users should be able to do themselves (reducing the workload of the system administrators)
Some metrics
• Time for the nightly backup run
• Number of days in a complete backup cycle, when using incremental backups
• Typical time to restore a single, small file
• Time to restore the largest partition completely
• Transfer rate when restoring
• Degree of compression of the backed-up data
• Number of versions and the time interval between versions
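Several of these metrics are simple ratios, and a back-of-the-envelope calculation is often enough to spot a problem. The helpers below are a sketch with hypothetical names, and the 500 GB partition, 50 MB/s transfer rate and compression figures are made-up example values, not measurements from the lecture.

```python
def estimated_restore_hours(partition_gb: float, rate_mb_per_s: float) -> float:
    """Time to restore a partition at a sustained transfer rate (1 GB = 1024 MB)."""
    return partition_gb * 1024 / rate_mb_per_s / 3600


def compression_ratio(original_bytes: int, stored_bytes: int) -> float:
    """How many bytes of original data each stored byte represents."""
    return original_bytes / stored_bytes


def fits_in_window(backup_hours: float, window_hours: float = 24.0) -> bool:
    """True if a backup run finishes before the next one is due to start."""
    return backup_hours < window_hours


# Hypothetical figures: a 500 GB partition restored at 50 MB/s.
hours = estimated_restore_hours(500, 50)
print(round(hours, 2), compression_ratio(1000, 400), fits_in_window(hours))
```

Estimates like these are no substitute for timing a real restore, since tape positioning, small files and network load all reduce the effective rate.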
Backup schedule at IDI
• Full backup to tape every 10th day
• Incremental backup to tape every day
• Full and incremental backups are stored for 60 days
• Archive backup 3-4 times per year
[Figure: full and incremental backups plotted by days and partitions.]
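A schedule like this is easy to express as a function of the day number. The sketch below uses the 10-day cycle and 60-day retention from the slide. Two things are assumptions rather than facts about IDI's setup: that the full backups of different partitions are staggered across the cycle, and that a full backup replaces the incremental one on that day. The function names are hypothetical.

```python
def backup_type(day: int, partition: int, cycle_days: int = 10) -> str:
    """Return "full" or "incremental" for a partition on a given day.

    Assumption: partition p gets its full backup on days where (day - p)
    is a multiple of cycle_days, so full backups are spread over the cycle.
    """
    return "full" if (day - partition) % cycle_days == 0 else "incremental"


def is_expired(age_days: int, retention_days: int = 60) -> bool:
    """True if a backup is older than the retention period."""
    return age_days > retention_days


# Print 12 days for 3 partitions: F = full, i = incremental.
for partition in range(3):
    row = "".join("F" if backup_type(day, partition) == "full" else "i"
                  for day in range(12))
    print(partition, row)
```

Staggering keeps the nightly volume roughly even, because only a fraction of the partitions get a full backup on any given night.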
Some oops! factors
• Nobody has timed a full restore
• The licence for the restore program has expired
• Nobody has checked the logs for errors, and all tapes from the last 5 months are empty
• Disk system sizes increase more quickly than the capacity of the tape systems
• The daily backup takes slightly more than 24 hours
• A database file was in use and is now inconsistent