Learn about UPS types, emergency boot disks, and RAID for improved server performance and reliability, minimizing downtime and ensuring data security.
Day 10 Hardware Fault Tolerance RAID
High availability • All servers should be on UPSs • 2 Types • Smart UPS • Serial cable connects from UPS to computer. Notifies the computer to shut down when it's about to run out of power. • Uses powerd or the manufacturer's software to monitor the UPS. • Dumb (Simple) UPS • Keeps you up for as long as it can, then just dies.
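The smart-UPS behavior above comes down to one decision: shut down cleanly while the battery can still cover the shutdown. The following is a minimal sketch of that decision only. The function name, its parameters, and the 60-second safety margin are all hypothetical, and it does not reflect how powerd or any vendor's software is actually implemented.

```python
def should_shut_down(runtime_left_s, shutdown_needs_s, margin_s=60):
    """Return True once the battery can barely cover a clean shutdown.

    All three values are in seconds. The names and the default margin
    are illustrative assumptions, not taken from any real UPS tool.
    """
    return runtime_left_s <= shutdown_needs_s + margin_s


# Plenty of battery left: keep running.
print(should_shut_down(runtime_left_s=600, shutdown_needs_s=90))
# Battery nearly exhausted: start the shutdown now.
print(should_shut_down(runtime_left_s=100, shutdown_needs_s=90))
```

A dumb UPS gives the computer no runtime figure at all, so no check like this is possible.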
Minimizing downtime • Backups make sure your data is secure. But they don’t help if a memory DIMM goes bad. That machine is down until you replace that DIMM. • It's always a good idea to have: • More than one main server. • Some replacement parts on hand just in case. • Maybe even a redundant machine. • Emergency boot disks. • Servers in multiple physical locations if possible.
Emergency Boot Disks • For use when a hard drive failure leaves the system unable to boot. Having backups on tape doesn’t help until you can get the system booted. • A boot disk image ships on most CD versions of Linux. In addition, you can download one from the manufacturer's web site. • Of course this assumes you can find the CD, or get to the web.
Booting from Emergency Boot Disk • Ensure your machine's BIOS is set to boot from the floppy drive (A:). • Insert the boot disk. • Boot. • Some boot disks also require a root disk, which can be obtained from the same place. When instructed, insert the root disk and press Enter. • Now you have a booted (but limited) system. • You may have to mount your partitions manually.
RAID • Redundant Array of Inexpensive Disks. • RAID buys you 2 things: • Improved Performance • Improved Reliability • You use multiple drives that look like one disk. • This can be done in • Hardware [expensive, faster] • Software [cheap, slower]
RAID Linear • Used to join multiple drives into one big drive. • Was important when large drives were expensive. • Today 30GB of drive space costs less than $300, so this is not as useful as it once was. • Provides no fault tolerance. • Provides no speed improvement.
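To illustrate how linear mode joins drives end to end, here is a minimal sketch of the address mapping. The function name and the drive sizes are made up for the example. A logical block simply falls on whichever drive's range contains it.

```python
def linear_locate(block, drive_sizes):
    """Map a logical block number to (drive index, block offset on that drive).

    drive_sizes lists each drive's capacity in blocks, in array order.
    Hypothetical helper for illustration only.
    """
    if block < 0:
        raise IndexError("block number must not be negative")
    for drive, size in enumerate(drive_sizes):
        if block < size:
            return drive, block
        block -= size  # skip past this drive and keep looking
    raise IndexError("block is past the end of the array")


sizes = [100, 50]  # two drives: 100 blocks and 50 blocks
print(linear_locate(0, sizes))    # first block of the first drive
print(linear_locate(100, sizes))  # first block of the second drive
```

Consecutive blocks sit on the same drive, so only one drive is busy at a time. That is why this mode gives no speed improvement.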
RAID 0 • Spreads the contents of each file over multiple drives. • Since each drive can write its piece of the data at the same time, you get great performance improvements. • In theory, performance scales with the number of drives in the set. • 3 drives = 3 times the performance. • However, if any 1 drive fails, you lose all data.
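The striping idea can be sketched in a few lines. This is an illustration only: the 4-byte chunk size and the function names are assumptions, and real arrays use much larger chunks.

```python
def stripe_locate(block, n_drives):
    """Return (drive index, position on that drive) for a logical block."""
    return block % n_drives, block // n_drives


def stripe(data, n_drives, chunk=4):
    """Deal fixed-size chunks of data out to the drives round-robin."""
    drives = [[] for _ in range(n_drives)]
    for i in range(0, len(data), chunk):
        drives[(i // chunk) % n_drives].append(data[i:i + chunk])
    return drives


for number, contents in enumerate(stripe(b"AAAABBBBCCCCDDDD", 3)):
    print(number, contents)
```

Each drive ends up holding only every third chunk. Neighboring chunks can be written in parallel, but losing any one drive leaves gaps throughout every file.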
RAID 1 • Mirrors all data on multiple drives. • If one drive fails, the data is still on the other. • Provides no write performance increase; writes are actually slower, since every write has to go to each drive. • Doubles the price of your storage.
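A toy model of mirroring shows why a single drive failure loses nothing. The class and its method names are hypothetical, and plain dictionaries stand in for the drives.

```python
class Mirror:
    """Toy RAID 1 set: every write goes to every working drive."""

    def __init__(self, n_drives=2):
        self.drives = [{} for _ in range(n_drives)]

    def write(self, block, data):
        # The same data is stored once per drive, hence the doubled cost.
        for drive in self.drives:
            if drive is not None:
                drive[block] = data

    def fail(self, index):
        """Simulate a dead drive."""
        self.drives[index] = None

    def read(self, block):
        # Any surviving copy will do.
        for drive in self.drives:
            if drive is not None:
                return drive[block]
        raise OSError("all mirrors have failed")


m = Mirror()
m.write(0, b"payroll")
m.fail(0)
print(m.read(0))  # still readable from the second drive
```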
RAID 3 • RAID 0 + Parity. • Writes data in a stripe across multiple drives. • Provides better fault tolerance by writing parity to a separate drive. • If a data drive fails, the parity can be used to reconstruct the missing drive. • If the parity drive fails, it can be rebuilt from the data drives. • Speed of RAID 0 + good fault tolerance.
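Parity reconstruction is easy to demonstrate with XOR, the usual parity operation for this kind of RAID. The sketch below uses tiny two-byte "drives" with made-up contents, and the helper name is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives, each holding one small block of the stripe.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data)  # what the parity drive would store

# Drive 1 dies: XOR the survivors with the parity to get its block back.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

The same function rebuilds a lost parity drive: just XOR all the data blocks again.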
RAID 5 • Same as RAID 3, except the parity information is also spread across the disks instead of living on one dedicated drive. • Slower to write than RAID 3, because data and parity compete to be written to the same disks.
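The difference from RAID 3 is only where the parity lives. The sketch below prints one simple rotation of the parity block across stripes. This particular rotation order and the function names are assumptions for illustration; real implementations offer several layouts.

```python
def parity_drive(stripe, n_drives):
    """Drive that holds parity for the given stripe (one simple rotation)."""
    return (n_drives - 1 - stripe) % n_drives


def layout(n_stripes, n_drives):
    """Return rows of 'D' (data) and 'P' (parity), one row per stripe."""
    rows = []
    for s in range(n_stripes):
        p = parity_drive(s, n_drives)
        rows.append(["P" if d == p else "D" for d in range(n_drives)])
    return rows


for row in layout(3, 3):
    print(" ".join(row))
```

Every drive carries both data and parity, so no single drive is reserved for parity alone.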
Hardware RAID • RAID Controller card. • Expensive • Almost always SCSI. • They have their own memory to cache data reads and writes. • Have their own mini processor to compute the parity bits. • Hot swappable drives make for ease of replacement. • To the Operating System the drives appear as normal drives. It doesn’t have to know about the RAID controller.
Software RAID • Use Disk Druid to set up software RAID. • Linux supports RAID 0, 1, and 5. • Can be done with IDE or SCSI drives. • Specific setup information: • http://www.redhat.com • Search for RAID.