CIT 500: IT Fundamentals Packages and Filesystems
Topics • Backups • Policies and planning • Backup software • RAID • LVM • Syslog • /proc
Backup Decisions • Why? Why are you backing up data? What would happen if you lost data and had no backup? What types of data do you have? • What? What to back up: the entire system or specific filesystems? Which OS to back up? What other things to back up (MBR, LVM)? • When? When is the best time to back up? How often? • Where? Where will the backup occur? Where will backup volumes be stored? • Who? Who is going to provide the backup system? Who will do the backups? • How? How are you going to do backups: tape, mirrors, off-site, etc.?
Why Backups? • Accidental deletions. • Hardware failures. • Data corruption. • Security incidents. • Plan for the worst: • System catches fire. • Fire spreads to replicated systems. • Sprinklers destroy backup system in data center.
Backup Types • Full backup: complete copy of all files as of a particular time. Backup: slow, requires high capacity. Restore: fast, simple. • Incremental backup: copy of only the files changed since the last backup. Backup: fast, may store many incrementals per tape. Restore: slow, complex (requires multiple tapes).
Backup Levels • Levels define how much is backed up relative to backups at other levels. • Lower levels back up more data, but have a higher cost in media and time. • Higher levels are incremental backups that store data changed since the most recent backup at a lower level. • Higher-level backups are performed more frequently than lower-level backups, since they are faster and cheaper.
Backup Level Examples • Level 0: A full backup of the selected filesystems. • Level 1: An incremental backup of only the files changed since the last level 0 backup. • Level 2: An incremental backup of only the files changed since the last level 1 backup.
Using a 3-Level Backup Backup plan: • Perform a level 0 backup on the first of the month. • Perform a level 1 backup on the first day of each week. • Perform a level 2 backup each day. Restore with the following procedure: • Restore the most recent level 0 backup. • Restore the most recent level 1 backup made since then, if any. • Restore the most recent level 2 backup made since then, if any.
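The backup plan above can be sketched as a small shell function that picks the level for a given date. This is only an illustration: the function name backup_level is made up here, and it assumes the week starts on Monday (weekday 1 in date +%u numbering).

```shell
# backup_level DAY_OF_MONTH WEEKDAY
# Prints the backup level for a date under the 3-level plan:
#   level 0 on the first of the month,
#   level 1 on the first day of the week (assumed to be Monday = 1),
#   level 2 on every other day.
backup_level() {
    day_of_month=$1   # 1..31
    weekday=$2        # 1 (Monday) .. 7 (Sunday)
    if [ "$day_of_month" -eq 1 ]; then
        echo 0
    elif [ "$weekday" -eq 1 ]; then
        echo 1
    else
        echo 2
    fi
}

# Level for today; strip the leading zero that date +%d adds to days 1-9.
today=$(date +%d)
today=${today#0}
echo "today's backup level: $(backup_level "$today" "$(date +%u)")"
```

The printed level could then be handed to whichever backup tool is in use, for example as the level argument of dump.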
Capacity Planning Requirements • How long do you need to retain data? • How much media do you need for each backup? Example: 3 months of backups • 3 level 0 sets of media (one per month) • 5 level 1 sets of media (up to 5 weeks per month) • 7 level 2 sets of media (7 days per week)
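The arithmetic behind the example can be written out as a short sketch. The function name media_sets is hypothetical, and the counts for level 1 and level 2 are taken directly from the slide.

```shell
# media_sets MONTHS
# Prints the number of media sets needed to retain MONTHS months of backups.
media_sets() {
    months=$1
    level0=$months   # one level 0 set per retained month
    level1=5         # a month can contain up to 5 week starts
    level2=7         # one set per day of the week
    echo $((level0 + level1 + level2))
}

media_sets 3   # prints 15
```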
Verifying Backups • Select backup media to test. • Choose one level 2 per week, one level 1 per month, one level 0 per year • List files on backup media. • Restore a random file. • Verify that a file of appropriate size was created. • Verify contents of file.
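The verification steps above can be rehearsed with tar on a scratch directory. This is a sketch under assumptions: the sample files and the archive name backup.tar are invented for the demonstration, and the file to restore is picked by hand rather than at random.

```shell
# Build a tiny data set and back it up, so there is something to verify.
work=$(mktemp -d)
mkdir -p "$work/data" "$work/restore"
printf 'alpha\n' > "$work/data/a.txt"
printf 'beta\n'  > "$work/data/b.txt"
tar -cf "$work/backup.tar" -C "$work" data

# 1. List the files on the backup media.
tar -tf "$work/backup.tar"

# 2. Restore a single file into a separate directory.
tar -xf "$work/backup.tar" -C "$work/restore" data/b.txt

# 3. Verify that a non-empty file was created, then 4. verify its contents.
if [ -s "$work/restore/data/b.txt" ] &&
   cmp -s "$work/data/b.txt" "$work/restore/data/b.txt"; then
    verify_result=ok
else
    verify_result=mismatch
fi
echo "restore check: $verify_result"
```

Restoring into a separate directory keeps the test from overwriting the live copy of the file.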
Backup Software • OS provided (back up individual systems): cpio, dd, dump, tar, ntbackup • Open source (backup servers): AMANDA, Bacula • Commercial (backup servers): Tivoli Storage Manager (IBM), Veritas NetBackup
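dd dd: copy data from an input file to an output file. • if=inputfile • of=outputfile • bs=blocksize (e.g. 1M) Primarily used for disk-level backups: dd if=/dev/sda of=sda.dd bs=1M Copying the whole disk (/dev/sda) backs up the MBR, partition table, and unused disk space; copying a single partition such as /dev/sda1 captures only that partition.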
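A minimal sketch of the same imaging workflow, run against an ordinary file so that it needs no root access. The names disk.img and disk.dd are placeholders, and bs=4096 is an arbitrary block size chosen for portability; on a real system the input would be a device such as /dev/sda.

```shell
# Work in a scratch directory; an ordinary file stands in for the disk device.
work=$(mktemp -d)
dd if=/dev/zero of="$work/disk.img" bs=4096 count=16 2>/dev/null

# Copy the stand-in "disk" block by block into an image file.
dd if="$work/disk.img" of="$work/disk.dd" bs=4096 2>/dev/null

# A byte-for-byte comparison confirms that the image matches the source.
if cmp -s "$work/disk.img" "$work/disk.dd"; then
    dd_result=identical
else
    dd_result=different
fi
echo "image is $dd_result"
```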
cpio cpio: copy in/out. • -i Extract files from a backup read on STDIN • -o Write a backup to STDOUT Used for file-level backups. Receives the list of files to back up on STDIN, so: find / -print | cpio -o > backup.cpio
tar tar: Tape ARchive. • c Create archive • x Extract files from archive • f Use a file instead of tape • z Compress with gzip (faster, lower compression) • j Compress with bzip2 (slower, higher compression) tar is the most commonly used file backup tool and the easiest to use; it accepts BSD-style options, so the leading - is optional. tar cf /tmp/home-backup.tar /home
Compression Rely on hardware compression • Most tape drives perform compression. • Compression improves speed since there is less data to write to tape. • Tape capacities often assume 50% compression. Use software compression • gzip for fast, low compression • bzip2 for higher but slower compression • 7zip for highest but slowest compression
Redundant Disks Disks are the component most likely to fail • Moving parts • Constant heavy use For high reliability, we need redundant disks • Backups will save our data, but if a disk fails, the system will be down until a new disk is installed and the backup is restored. • Redundant disks don’t remove the need for backups; what happens if the data center is destroyed?
RAID Redundant Array of Independent Disks Combine physical disks into single logical unit. Can be implemented in hardware or software. Hardware RAID controllers may provide: Caching for higher performance Hot swapping for higher reliability Advantages of RAID over single disks: Capacity Reliability Throughput
Striping • Distribute data across multiple disks. • Improve speed by accessing disks in parallel. • Independent requests can be serviced in parallel by separate disks. • Single multi-block requests can be serviced by multiple disks. • Performance vs. reliability • Performance increases with # disks. • Reliability decreases with # disks.
Parity Store extra bit with each chunk of data. • Odd parity • add 0 if # of 1s is odd • add 1 if # of 1s is even • Even parity • add 0 if # of 1s is even • add 1 if # of 1s is odd
Error Detection with Parity Even: every byte must have even # of 1s. What if you read a byte with an odd # of 1s? • It’s an error. • An odd # of bits were flipped. What if you read a byte with an even # of 1s? • It may be correct. • It may be an error where an even # of bits are bad.
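The parity rules can be tried out with a few shell functions operating on strings of 0s and 1s. The function names are invented for this sketch; it shows even parity only.

```shell
# count_ones BITS: print how many 1s appear in a string of 0s and 1s.
count_ones() {
    ones=$(printf '%s' "$1" | tr -cd '1')
    echo "${#ones}"
}

# even_parity_bit DATA: print the bit to append so the total number of 1s is even.
even_parity_bit() {
    echo $(( $(count_ones "$1") % 2 ))
}

# parity_ok WORD: succeed if WORD (data plus parity bit) has an even number of 1s.
parity_ok() {
    [ $(( $(count_ones "$1") % 2 )) -eq 0 ]
}

data=1011001
word="$data$(even_parity_bit "$data")"   # four 1s already, so the parity bit is 0
parity_ok "$word"    && echo "$word: passes the parity check"
parity_ok 00110010   || echo "00110010: one bit flipped, error detected"
parity_ok 01110010   && echo "01110010: two bits flipped, error goes undetected"
```

The last case shows the limitation described above: an even number of flipped bits leaves the parity unchanged.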
RAID 0: Striping, no Parity Performance Throughput = n * disk speed Reliability • Lower reliability. • If one disk lost, entire set is lost. • MTBF = (avg MTBF)/# disks Capacity n * disk size
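A worked example of the three RAID 0 formulas. All of the input figures are hypothetical values chosen to make the arithmetic easy to follow.

```shell
n=4                 # number of disks (hypothetical)
disk_speed=100      # MB/s per disk (hypothetical)
disk_size=500       # GB per disk (hypothetical)
disk_mtbf=500000    # average MTBF per disk in hours (hypothetical)

throughput=$((n * disk_speed))    # Throughput = n * disk speed
capacity=$((n * disk_size))       # Capacity   = n * disk size
array_mtbf=$((disk_mtbf / n))     # MTBF       = (avg MTBF) / # disks

echo "throughput: $throughput MB/s"
echo "capacity:   $capacity GB"
echo "array MTBF: $array_mtbf hours"
```

With four disks the array is four times as fast and as large as a single disk, but is expected to fail four times as often.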
RAID 1: Disk Mirroring Performance • Reads are faster since read operations will return after first read is complete. • Writes are slower because write operations return after second write is complete. Reliability • System continues to work after one disk dies. • Doesn’t protect against disk or controller failure that corrupts data instead of killing disk. • Doesn’t protect against human or software error. Capacity • n/2 * disk size
RAID 3: Striping + Dedicated Parity Reliability Survive failure of any 1 disk. Performance • Striping increases performance, but • Parity disk must be accessed on every write. • Parity calculation decreases write performance. • Good for sequential reads (large graphics + video files.) Capacity (n-1) * disk size
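RAID parity is the XOR of the corresponding data on each data disk, which is what allows any single failed disk to be rebuilt from the survivors. The sketch below uses one byte per disk; the three byte values are arbitrary.

```shell
# One byte from each of three data disks (arbitrary example values).
d0=165   # binary 10100101
d1=60    # binary 00111100
d2=15    # binary 00001111

# The parity disk stores the XOR of the data bytes.
parity=$((d0 ^ d1 ^ d2))

# Suppose disk 1 fails: XOR the surviving data with the parity to rebuild it.
rebuilt=$((d0 ^ d2 ^ parity))

echo "parity byte:  $parity"
echo "rebuilt byte: $rebuilt (original was $d1)"
```

The same calculation also shows why every write touches the parity disk: changing any data byte changes the XOR.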
RAID 4: Stripe + Block Parity Disk • Identical to RAID 3 except uses block striping instead of byte striping.
RAID 5: Stripe + Distributed Parity Reliability Survive failure of any 1 disk. Performance • Fast reads (like RAID 0), but slow writes. • Like RAID 4 but without the bottleneck of a single parity disk. • Altering any data block still requires reading the old data and parity blocks and writing a new parity block. Capacity (n-1) * disk size
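The capacity formulas for the RAID levels covered so far can be collected into one helper for comparison. The function name raid_capacity and the example disk counts and sizes are assumptions made for this sketch.

```shell
# raid_capacity LEVEL N SIZE
# Prints the usable capacity of N disks of SIZE each at the given RAID level.
raid_capacity() {
    level=$1
    n=$2
    size=$3
    case $level in
        0)     echo $((n * size)) ;;          # striping: all space is usable
        1)     echo $((n / 2 * size)) ;;      # mirroring: half the disks hold copies
        3|4|5) echo $(((n - 1) * size)) ;;    # one disk's worth of space holds parity
        *)     echo "unsupported RAID level: $level" >&2
               return 1 ;;
    esac
}

# Four 500 GB disks at each level.
for level in 0 1 5; do
    echo "RAID $level: $(raid_capacity "$level" 4 500) GB"
done
```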
You still need backups Human and software errors • RAID won’t protect you from rm -rf / or copying over the wrong file. System crash • Crashes can interrupt write operations, leading to a situation where data is updated but parity is not. Correlated disk failures • Accidents (power failures, dropping the machine) can impact all disks at once. • Disks bought at the same time often fail at the same time. Hardware data corruption • If a disk controller writes bad data, all disks will have the bad data.
Logical Volumes What are logical volumes? • Appear to the user as an ordinary disk partition. • But can span multiple partitions and/or disks. Why logical volumes? • Aggregate disks for performance/reliability. • Grow and shrink logical volumes on the fly. • Move logical volumes between physical devices. • Replace volumes without interrupting service.
System Logs • Logs record status and error conditions. • Where do log messages come from? • Kernel • Accounting system • System services • Logging methods: • Service records own logs (apache, cron). • Service uses syslog service to manage logs.
Rotation • Keep backup log files for each day or week logfile logfile.1 logfile.2 logfile.3 • Additional features: • Compress rotated logs to save disk space. • Remove/archive logs that are X days old.
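The rotation scheme above can be sketched as a shell function. This is an illustration rather than a replacement for a real rotation tool: the name rotate_log is made up, and the sketch ignores the problem of services that keep the old log file open.

```shell
# rotate_log LOGFILE KEEP
# Shifts LOGFILE -> LOGFILE.1 -> LOGFILE.2 ..., keeping at most KEEP old copies.
rotate_log() {
    log=$1
    keep=$2

    # Discard the oldest copy, then shift each remaining copy up by one.
    rm -f "$log.$keep"
    i=$((keep - 1))
    while [ "$i" -ge 1 ]; do
        if [ -f "$log.$i" ]; then
            mv "$log.$i" "$log.$((i + 1))"
        fi
        i=$((i - 1))
    done

    # The current log becomes LOGFILE.1 and a new, empty log takes its place.
    if [ -f "$log" ]; then
        mv "$log" "$log.1"
    fi
    : > "$log"
}
```

Calling rotate_log /path/to/logfile 3 once a day would keep logfile plus logfile.1 through logfile.3.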
logrotate • Program to handle log rotation. • Run via /etc/cron.daily. • Configured via /etc/logrotate.conf. • Options • How often to rotate • How long to keep logs • Compression or not • Log file permissions • Pre- and post-rotate scripts
logrotate.conf
# rotate log files weekly
weekly
# keep 4 weeks worth of backlogs
rotate 4
# create new (empty) log files after rotating old ones
create
# uncomment if you want your log files compressed
#compress
# RPM packages drop log rotation information into this directory
include /etc/logrotate.d
# no packages own wtmp -- we'll rotate them here
/var/log/wtmp {
    monthly
    create 0664 root utmp
    rotate 1
}
Syslog Comprehensive logging system. Frees programmers from managing log files. Gives sysadmins control over log management. Sorts messages by Sources (services that generate log messages) Importance (as reported by the service) Routes messages to different destinations Files Network Terminals
Syslog Components • syslogd: daemon that does the actual logging. • klogd: additional daemon that collects kernel messages. • logger: user-level program to submit log messages to syslog; can be used from shell scripts.
Syslog Message Format • Timestamp: date and time of message • Hostname on which event occurred • Name of program generating log message • Text of log message
Example Syslog Messages
Feb 11 10:17:01 localhost /USR/SBIN/CRON[1971]: (root) CMD ( run-parts --report /etc/cron.hourly)
Feb 11 10:37:22 localhost -- MARK --
Feb 11 10:51:11 localhost dhclient: DHCPREQUEST on eth1 to 192.168.1.1 port 67
Feb 11 10:51:11 localhost dhclient: DHCPACK from 10.42.1.1
Feb 11 10:51:11 localhost dhclient: bound to 10.42.1.55 -- renewal in 35330 seconds.
Feb 11 14:37:22 localhost -- MARK --
Feb 11 14:44:21 localhost mysqld[7340]: 060211 14:44:21 /usr/sbin/mysqld: Normal shutdown
Feb 12 04:46:42 localhost sshd[29093]: Address 218.38.30.101 maps to ns.thundernet.co.kr, but this does not map back to the address - POSSIBLE BREAKIN ATTEMPT!
Feb 12 04:46:44 localhost sshd[29097]: Invalid user matt from ::ffff:218.38.30.101
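Messages like these can be split into the four fields of the syslog message format with plain shell string handling. The function name parse_syslog_line is hypothetical, and the sketch assumes the "program[pid]: text" layout, so it does not handle the -- MARK -- lines.

```shell
# parse_syslog_line LINE
# Sets the variables timestamp, hostname, program and message from one log line.
parse_syslog_line() {
    # read splits on whitespace; the last variable receives the rest of the line.
    read -r month day clock host rest <<EOF
$1
EOF
    timestamp="$month $day $clock"
    hostname=$host
    program=${rest%%:*}       # text before the first colon, e.g. sshd[29097]
    program=${program%%\[*}   # drop the [pid] suffix if there is one
    message=${rest#*: }       # text after the first ": "
}

parse_syslog_line 'Feb 12 04:46:44 localhost sshd[29097]: Invalid user matt from ::ffff:218.38.30.101'
echo "time:    $timestamp"
echo "host:    $hostname"
echo "program: $program"
echo "text:    $message"
```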
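Configuring Syslog • Configured in /etc/syslog.conf • Format: selector <Tab> action • Ex: mail.info /var/log/mail.log Selector components (facility.level) • Source (facility): a facility name, a list of facilities separated by commas, or *. • Importance (level): a level name (emerg .. debug), none, or *; a named level matches messages at that level or higher.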
/etc/syslog.conf
# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none    /var/log/messages
# The authpriv file has restricted access.
authpriv.*    /var/log/secure
# Log all the mail messages in one place.
mail.*    /var/log/maillog
# Log cron stuff
cron.*    /var/log/cron
# Everybody gets emergency messages
*.emerg    *
# Save news errors of level crit and higher in a special file.
uucp,news.crit    /var/log/spooler
# Save boot messages also to boot.log
local7.*    /var/log/boot.log
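How a single selector is matched against a message can be sketched in shell. This is a simplified model, not syslogd's actual code: the function names are invented, and it handles one facility.level term at a time, so it does not combine semicolon-separated terms such as *.info;mail.none.

```shell
# level_number LEVEL: print the numeric severity (0 = most severe).
level_number() {
    case $1 in
        emerg) echo 0 ;; alert)  echo 1 ;; crit) echo 2 ;; err)   echo 3 ;;
        warning) echo 4 ;; notice) echo 5 ;; info) echo 6 ;; debug) echo 7 ;;
        *) return 1 ;;
    esac
}

# selector_matches SELECTOR FACILITY LEVEL
# Succeeds if a message from FACILITY at LEVEL is selected by SELECTOR.
selector_matches() {
    sel_facilities=${1%.*}   # part before the last dot, e.g. "uucp,news" or "*"
    sel_level=${1##*.}       # part after the last dot, e.g. "crit"

    # The facility must be "*" or appear in the comma-separated list.
    case ",$sel_facilities," in
        ,\*,|*,"$2",*) ;;
        *) return 1 ;;
    esac

    # "none" excludes the facility, "*" accepts every level.
    case $sel_level in
        none) return 1 ;;
        \*)   return 0 ;;
    esac

    # A named level selects messages at that level or higher (more severe).
    [ "$(level_number "$3")" -le "$(level_number "$sel_level")" ]
}

selector_matches 'uucp,news.crit' news emerg && echo "news.emerg goes to /var/log/spooler"
selector_matches 'uucp,news.crit' news info  || echo "news.info does not"
```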
Logger logger -p facility.level message • facility = source facility (kern, user, … local0-7) • level = emerg .. debug • message = text message string, quote if spaces
/proc/sys View running kernel configuration data • ex: cat /proc/sys/fs/file-max • ex: sysctl net.ipv4.ip_forward Change running kernel configuration (as root) • ex: echo 48000 > /proc/sys/fs/file-max • ex: sysctl -w net.ipv4.ip_forward=1 Use /etc/sysctl.conf for permanent changes
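The two access methods name the same settings: a sysctl key corresponds to a file under /proc/sys with the dots replaced by slashes. A small sketch of that mapping follows; the function name sysctl_to_path is made up, and keys whose components themselves contain dots are not handled.

```shell
# sysctl_to_path KEY: print the /proc/sys file that holds a sysctl setting.
sysctl_to_path() {
    echo "/proc/sys/$(printf '%s' "$1" | tr . /)"
}

path=$(sysctl_to_path net.ipv4.ip_forward)
echo "net.ipv4.ip_forward is stored in $path"

# Read the current value where the file exists (Linux only).
if [ -r "$path" ]; then
    echo "current value: $(cat "$path")"
fi
```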