CHAPTER Network Reliability: Fault Tolerance and Other Issues
Chapter Objectives • Discuss network reliability issues • Fault tolerance, tape backup, UPS etc • Describe different levels of fault tolerance • Levels 1, 2 and 3 • Examine the relevance of file allocation tables to fault tolerance • Explain RAID technology
Chapter Modules • An overview of network reliability • Level 1 fault tolerance • Level 2 and level 3 of fault tolerance • Practical implementation examples • RAID
MODULE An Overview of Network Reliability
Importance of Fault Tolerance • Mission critical applications are today run on networks in many organizations • Important to provide built-in fault tolerance in networks to support mission critical applications
Fault Tolerance • The ability to continue to function when a fault occurs • Example • A server with built-in fault tolerance can continue to operate even when one of its hard disks fails
Focus of Fault Tolerant Features • Most fault tolerance features are centered on a server • Disk storage in the server is the focal point of a number of fault tolerant features • Mechanical components are more susceptible to failure than electronic components • The hard disk is most vulnerable to failure in a server • A number of fault tolerant features address the possible failure of hard disks
Fault Tolerance Implementation • Software based • Hardware based • A combination of both
Server-Based Implementation of Fault Tolerance • Level 1 • Level 2 • Level 3
Preview of Fault Tolerance • Based on the premise of maintaining multiple copies of critical components • Level 1 • Duplicate FATs • Level 2 • Duplicate server hard disks • Level 3 • Duplicate servers
RAID Storage: The Practical Implementation • Redundant Array of Independent Disks • Data is stored in a RAID subsystem • A largely hardware-based implementation
Other Features • Uninterruptible Power Supply (UPS) • Tape backup
Uninterruptible Power Supply (UPS) • Ensures the uninterruptible supply of power to the server • Batteries in the UPS will continue to provide power in the event of a power outage
UPS Implementations • AS-400 example • When the power goes down, the UPS takes over and systematically shuts down the system, preserving the data files • Other implementations • On power loss, the UPS takes over • A standby generator is then activated by sensors • The process is reversed when the power comes back
Tape Backup • Used more as a precautionary measure than a fault tolerant measure • Data on the server is periodically backed up on a tape • If the disk storage fails on the server: • A previously stored version of the data is loaded on to a newly installed disk storage on the server • Offers some degree of protection against the total loss of data
Network Operating System Support for Reliability • Support for Levels 1 and 2 of fault tolerance is readily available in network operating systems • Support for Level 3 fault tolerance is now available as well • RAID 0, 1 and 5 are commonly supported
MODULE Level 1 Fault Tolerance
Level 1 Fault Tolerance • A Software Based Solution
Support for Fault Tolerance • Provided by the network OS • Support for Levels 1 and 2 has been available in network operating systems for some time • Newer operating systems have support for Level 3 fault tolerance • Support for RAID is also incorporated • RAID may be considered an extension of Level 2 fault tolerance
Level 1 Fault Tolerance • A backup copy of the File Allocation Table (FAT) is kept on the server disk • NOS uses the backup FAT should the original FAT become corrupted • This would ensure the continued operation of the server • The problem should be rectified as soon as possible
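The fallback described on this slide can be sketched in a few lines. This is a minimal illustration, not how any particular NOS does it: the CRC32 check used to detect corruption, the `load_fat` helper and the dictionary layout of a FAT entry are all assumptions made for the example.

```python
import zlib

def checksum(table: dict) -> int:
    """CRC32 over a text form of the table (an illustrative corruption check)."""
    return zlib.crc32(repr(sorted(table.items())).encode())

def load_fat(primary, backup):
    """Return (table, source). Each copy is a (table, stored_checksum) pair.

    The primary copy is tried first; the backup is used only if the
    primary fails its check.
    """
    for name, (table, stored) in (("primary", primary), ("backup", backup)):
        if checksum(table) == stored:
            return table, name
    raise IOError("both FAT copies are corrupted")

# Hypothetical FAT holding a single entry.
fat = {"FILE_A": {"size_kb": 34, "track": 2, "sector": 1}}
good_copy = (fat, checksum(fat))
# A corrupted copy: the contents no longer match the stored checksum.
corrupt_copy = ({"FILE_A": {"size_kb": 0, "track": 0, "sector": 0}}, checksum(fat))

table, source = load_fat(corrupt_copy, good_copy)
print(source)  # backup
```

The server keeps running on the backup copy, but it now has no spare, which is why the slide says the problem should be rectified as soon as possible.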
File Allocation Table (FAT) • [Figure: the FAT and its backup copy, with a sample entry for File A: size 34K, starting at sector 1, track 2]
A Summary of File Allocation Table (FAT) Features • Keeps track of files on the disk • Uses pointers to point to the location of the files • Tracks, sectors • Stores file related information • Size, date last modified, security information etc. • If a FAT is corrupted, none of the files on the disk can be retrieved
A Note on File Systems • Newer file systems have been introduced following FAT16 • FAT32: Windows 95/98/ME systems, Windows 2000 OS • NTFS: Windows NT related filing technology, Windows 2000 • HPFS: OS/2 related filing technology • ext2: Linux
Newer File System Characteristics • Support longer file names • Support larger hard disks • Abide by Uniform Naming Convention (UNC) • Provide better security • Allows greater control to be exercised over access to directories, files, etc.
Format of Uniform Naming Convention • \\computer_name\directory_name\file_name
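As a small illustration of the format above, the sketch below splits a UNC path into its parts. The function name `parse_unc` and the sample path are hypothetical, and the parser handles only the simple `\\computer_name\directory_name\file_name` shape shown on the slide (with optional extra directory levels).

```python
def parse_unc(path: str):
    """Split a UNC path into (computer_name, [directories], file_name)."""
    if not path.startswith("\\\\"):
        raise ValueError("a UNC path must start with two backslashes")
    parts = path[2:].split("\\")
    # Need at least computer, one directory and a file; no empty components.
    if len(parts) < 3 or not all(parts):
        raise ValueError("expected \\\\computer_name\\directory_name\\file_name")
    computer, *directories, file_name = parts
    return computer, directories, file_name

print(parse_unc(r"\\server1\reports\q1.txt"))
# ('server1', ['reports'], 'q1.txt')
```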
MODULE Levels 2 and 3 of Fault Tolerance
Levels 2 and 3 • A predominantly hardware-based solution • Software support in the OS is also required
Level 2 Fault Tolerance (FT) • Implemented by installing a duplicate disk in the server • The server data is duplicated on the second disk in real-time to provide fault tolerance • The duplication process itself is automatic when a NOS that supports Level 2 FT is used • In the event of a failure of the primary hard disk, the network will continue to operate using the secondary hard disk • However, immediate action must be taken to replace the failed hard disk
Level 2 FT Implementation • Types of Implementation • Disk Mirroring • Disk Duplexing • Disk Mirroring • One controller supporting two drives • Disk Duplexing • Two controllers and two drives • Each drive would have its own controller • Better protection compared to disk mirroring
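The behaviour of a mirrored pair can be sketched as follows. This is a toy in-memory model under stated assumptions: the `Disk` and `MirroredVolume` classes are hypothetical, and the model does not represent controllers, so it does not distinguish mirroring from duplexing (which differ only in whether the two drives share a controller).

```python
class DiskFailure(Exception):
    """Raised when a failed disk is accessed."""

class Disk:
    def __init__(self, name):
        self.name = name
        self.blocks = {}      # block number -> data
        self.failed = False

    def write(self, block_no, data):
        if self.failed:
            raise DiskFailure(self.name)
        self.blocks[block_no] = data

    def read(self, block_no):
        if self.failed:
            raise DiskFailure(self.name)
        return self.blocks[block_no]

class MirroredVolume:
    """Every write goes to both disks; reads fall back to the survivor."""

    def __init__(self, primary, secondary):
        self.disks = [primary, secondary]

    def write(self, block_no, data):
        succeeded = 0
        for disk in self.disks:
            try:
                disk.write(block_no, data)
                succeeded += 1
            except DiskFailure:
                pass  # keep going: the other disk may still be healthy
        if succeeded == 0:
            raise DiskFailure("all disks failed")

    def read(self, block_no):
        for disk in self.disks:
            try:
                return disk.read(block_no)
            except DiskFailure:
                continue
        raise DiskFailure("all disks failed")

primary, secondary = Disk("primary"), Disk("secondary")
volume = MirroredVolume(primary, secondary)
volume.write(0, b"payroll")
primary.failed = True      # simulate failure of the primary hard disk
print(volume.read(0))      # b'payroll'
```

After the primary fails, the volume keeps working but only one copy of each new block exists, which is why the failed disk must be replaced promptly.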
Level 2 Fault Tolerance Implementation • [Figure: mirroring (one controller attached to two hard disks) compared with duplexing (two controllers, each attached to its own hard disk)]
Level 3 Fault Tolerance • Dual interconnected servers are used to support the network • Second server is simply a mirror of the first server • Data mirroring is done automatically by the NOS that supports Level 3 fault tolerance
Level 3 Fault Tolerance Implementation • [Figure: main server and mirrored server joined by a high-speed link, both serving the workstations]
Actual Implementation of Fault Tolerance • Level 1 is universally deployed • Level 2 requires additional hardware • Best deployed by using the RAID storage subsystem • Level 3 requires considerably more hardware and software resources • Largely used in networks that support mission critical applications
MODULE RAID Storage Subsystem
RAID Storage • Redundant Array of Independent Disks • Data is stored striped over different disks in a RAID storage subsystem
Purpose of RAID • Provide fault tolerance • Offer better performance
RAID Basics • Data is stored striped over multiple disks • Data striping is the fundamental concept behind RAID • Data can be recreated from the redundant disks • MTBF (Mean Time Between Failures) of the subsystem as a whole is reduced: roughly the MTBF of a single disk divided by the number of disks in the subsystem
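The MTBF point can be made concrete with a short calculation. The sketch assumes disks fail independently and at a constant rate, which is the usual simplification behind the "divide by the number of disks" rule; the 500,000-hour figure is a made-up example value, not a vendor specification.

```python
def array_mtbf(disk_mtbf_hours: float, disk_count: int) -> float:
    """Approximate MTBF until the first disk failure in an array.

    Assumes independent disks with a constant failure rate, so the
    failure rates add and the MTBF divides by the number of disks.
    """
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    return disk_mtbf_hours / disk_count

# Hypothetical disk rated at 500,000 hours, used in a 5-disk subsystem.
print(array_mtbf(500_000, 5))  # 100000.0
```

More disks therefore mean a disk failure somewhere in the subsystem is expected sooner, which is why redundancy matters as the array grows.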
RAID Storage Standards • RAID 0 through RAID 5 • Popular RAID formats • RAID 0, RAID 1, RAID 5 • Other formats • RAID 10 and RAID 50
RAID 0 • Data is simply stored striped over multiple disks • Does not offer fault tolerance • Offers better performance • Multiple heads access the data stored on the different drives for faster data access
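A minimal sketch of RAID 0 style striping, using in-memory lists to stand in for disks. The `stripe` and `reassemble` helpers and the tiny 2-byte stripe size are assumptions chosen to keep the output readable.

```python
def stripe(data: bytes, disk_count: int, stripe_size: int):
    """Distribute data over disks in round-robin stripes (no redundancy)."""
    disks = [[] for _ in range(disk_count)]
    for start in range(0, len(data), stripe_size):
        stripe_no = start // stripe_size
        disks[stripe_no % disk_count].append(data[start:start + stripe_size])
    return disks

def reassemble(disks):
    """Read the stripes back in their original order."""
    out = []
    rows = max(len(d) for d in disks)
    for row in range(rows):
        for d in disks:
            if row < len(d):   # the last row may be only partly filled
                out.append(d[row])
    return b"".join(out)

data = b"ABCDEFGHIJKL"
disks = stripe(data, disk_count=3, stripe_size=2)
print(disks)              # [[b'AB', b'GH'], [b'CD', b'IJ'], [b'EF', b'KL']]
print(reassemble(disks))  # b'ABCDEFGHIJKL'
```

Each disk holds only every third stripe, so all three can be read in parallel, but no disk holds a second copy of anything: losing one disk leaves gaps throughout the data.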
RAID 0 Striping Source: Adaptec
More on Striping • Striping logically divides each hard disk into stripes • The stripes are interleaved in a rotating sequence among the various disks • Together, the stripes form a logical sequence of storage space composed alternately of stripes from each disk (drive) • A stripe can be as small as a sector (512 bytes) or as large as several megabytes • In general, a record falls entirely within one stripe
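The rotating arrangement of stripes amounts to simple integer arithmetic on the logical address. The sketch below is one way to express it; the function name `locate`, the 64 KB stripe size and the 4-disk set are illustrative assumptions.

```python
def locate(logical_offset: int, stripe_size: int, disk_count: int):
    """Map a logical byte offset to (disk, stripe_on_disk, offset_in_stripe)."""
    stripe_no = logical_offset // stripe_size       # position in the logical sequence
    disk = stripe_no % disk_count                   # stripes rotate across the disks
    stripe_on_disk = stripe_no // disk_count        # how far down that disk
    offset_in_stripe = logical_offset % stripe_size
    return disk, stripe_on_disk, offset_in_stripe

# 64 KB stripes over 4 disks: logical byte 300,000 lies in logical stripe 4,
# which wraps around to the second stripe (index 1) on disk 0.
print(locate(300_000, stripe_size=65_536, disk_count=4))  # (0, 1, 37856)
```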
RAID 0 Data Access Performance Source: Adaptec
Multiple I/O Access • Most operating systems support concurrent disk I/O • I/O load must be balanced on the disks for optimum performance • Striping promotes load balancing and hence improves disk I/O performance
RAID 0 Configuration • Large stripes for multiple users • Small stripes for single users
Advantages and Disadvantages • Fast access • If one disk fails, the data on all the disks becomes unusable
Windows Support for RAID 0 • Windows Server 2003 supports RAID 0 • 2 to 32 disks can be used in a set known as a striped volume