Data Protection: RAID

Section 1 : Storage System Data Protection: RAID Chapter 3

Why RAID • Performance limitation of disk drive • An individual drive has a certain life expectancy • Measured in MTBF (Mean Time Between Failure) • The more the number of HDDs in a storage array, the larger the probability for disk failure. For example: • If the MTBF of a drive is 750,000 hours, and there are 100 drives in the array, then the MTBF of the array becomes 750,000 / 100, or 7,500 hours • RAID was introduced to mitigate this problem • RAID provides: • Increase capacity • Higher availability • Increased performance

Chapter objectives After completing this chapter, you will be able to: • Describe what is RAID and the needs it addresses • Describe the concepts upon which RAID is built • Define and compare RAID levels • Recommend the use of the common RAID levels based on performance and availability considerations • Explain factors impacting disk drive performance

Host RAID Array Components Physical Array Logical Array RAIDController Hard Disks RAID Array

RAID-redundant array of inexpensive disks • It is an enclosure that contains a number of HDDs • HDD inside RAID contained in smaller sub –enclosure called physical array=hold a fixed num of HDD • A subset of disks within a RAID array can be grouped to form logical associations called logical array.

RAID Implementations • Hardware (usually a specialized disk controller card) • Controls all drives attached to it • Array(s) appear to host operating system as a regular disk drive • Provided with administrative software • Software • Runs as part of the operating system • Performance is dependent on CPU workload • Does not support all RAID levels

RAID levels • Are defined on the basis of striping, mirroring and parity • These techniques determine the data availability and performance characteristics of an array.

RAID Levels • 0 Striped array with no fault tolerance • 1 Disk mirroring • Nested RAID (i.e., 1 + 0, 0 + 1, etc.) • 3 Parallel access array with dedicated parity disk • 4 Striped array with independent disks and a dedicated parity disk • 5 Striped array with independent disks and distributed parity • 6 Striped array with independent disks and dual distributed parity

striping • The process of dividing a body of data into blocks and spreading the data blocks across several partitions on several HD • A RAID is a group of disks • Within each disk, a predefined num of contiguous addressable disk blocks are defined as strips • The set of aligned strips that spans across all the disks with the RAID set is called stripe

Strip Stripe Stripe Data Organization: Striping Strip 1 Strip 2 Strip 3 Stripe 1 Stripe 2 Strips

mirroring • Techniques whereby data is stored on two different HDD-have two copies of data. • Involved duplication of data • Expensive • Improved read performance • But not the write performance A A B B

Parity • Method used to protect striped data from HDD failures without the cost of mirroring. • Additional HDD is added to the stripe width to hold parity. • Mathematical construct that allow re-creation of the missing data. • A redundancy check that ensure a full protection of data without maintaining a full set of duplicate data. • The even number =0 • The odd number = 1

Parity-example 0 1 0 1 0 0 1 1 1 1

RAID 0 • Data is distributed across the HDDs in the RAID set. • Allows multiple data to be read or written simultaneously, and therefore improves performance. • Does not provide data protection and availability in the event of disk failures.

RAIDController Host RAID 0 0 1 5 9 2 6 10 3 7 11

RAID 1 • Data is stored on two different HDDs, yielding two copies of the same data. • Provides availability. • In the event of HDD failure, access to data is still available from the surviving HDD. • When the failed disk is replaced with a new one, data is automatically copied from the surviving disk to the new disk. • Done automatically by RAID the controller. • Disadvantage: The amount of storage capacity is twice the amount of data stored. • Mirroring is NOT the same as doing backup!

Block 1 Block 0 Block 1 Block 1 Block 0 Block 0 RAID 1 RAIDController Host

Nested RAID • Combines the performance benefits of RAID 0 with the redundancy benefit of RAID 1. • RAID 0+1 – Mirrored Stripe • Data is striped across HDDs, then the entire stripe is mirrored. • If one drive fails, the entire stripe is faulted. • Rebuild operation requires data to be copied from each disk in the healthy stripe, causing increased load on the surviving disks. • RAID 1+0 – Striped Mirror • Data is first mirrored, and then both copies are striped across multiple HDDs. • When a drive fails, data is still accessible from its mirror. • Rebuild operation only requires data to be copied from the surviving disk into the replacement disk.

RAID 3 and RAID 4 • Stripes data for high performance and uses parity for improved fault tolerance. • One drive is dedicated for parity information. • If a drive files, data can be reconstructed using data in the parity drive. • For RAID 3, data read / write is done across the entire stripe. • Provide good bandwidth for large sequential data access such as video streaming. • For RAID 4, data read/write can be independently on single disk.

RAID 5 and RAID 6 • RAID 5 is similar to RAID 4, except that the parity is distributed across all disks instead of stored on a dedicated disk. • This overcomes the write bottleneck on the parity disk. • RAID 6 is similar to RAID 5, except that it includes a second parity element to allow survival in the event of two disk failures. • The probability for this to happen increases and the number of drives in the array increases. • Calculates both horizontal parity (as in RAID 5) and diagonal parity. • Has more write penalty than in RAID 5. • Rebuild operation may take longer than on RAID 5.

RAID Comparison

RAIDController Hot Spares

Hot spare • Refer to a spare HDD in a RAID array that temporary replaces a failed HDD of a RAID set. • One of the following method of data recovery is performed depending on the RAID implementation: • If parity RAID is used, then the data is rebuild on the HS from the parity and data on the surviving HDDs in the RAID set. • If mirroring is used then the data from the surviving mirror is used to copy the data. • When the failed HDD is replaced with a new HDD, one of the following takes place; • The HS replace the new HDD permanently – no longer HS, need to configured new HS • When a new HDD is added to the system, data from the HS is copied to it. • The HS returns to idle state

RAID penalty • I/O operations =R/W • Read operation =no penalty since we can read many times • Write operation= have penalty – change the data , every write operation translates into more I/O overhead for the disks=known as write penalty. • Example 1: in RAID 1, write has to be performed to disks-data has to be written twice –once on each disks (RAID 1 has 2 disks-mirror set} • Therefore the write penalty =2 • Example 2: RAID 5 , have write penalty =4 since data has to be performed 4 times to disks- read existing data, read parity, write new data, write new parity!

Continue.. • RAID level 6 has 6 write penalty…why? • Why we need to know this? • To calculate the disk load in different types of RAID • Example 1: • we have an application that generates 5,200 input output per second (IOPS), with 60% of them being read. The disk load in RAID 5 is calculated as below: = 0.6 x 5200 + 4 X (0.4 x 5200) = 3120 + 4 X 2080 =3120+8320 = 11,440 IOPS

Continue… • Example 2: the disk load in RAID 1 is calculated below: = 0.6 X 5200 + 2 x( 0.4 x 5200) = 3120 + 2 x 2080 = 3120+4160 = 7280 IOPS From these calculation, we can determine the number of disk required for an applications. Example, an HDD with a specification of a maximum 180 IOPS for the application needs to be used, the number of disks required to meet the workload for the RAID configuration will be: RAID 5: 11,440/180 =64 disks RAID 1 : 7280/180 = 42 disk ( approx to the nearest even number).

Chapter Summary Key points covered in this chapter: • What RAID is and the needs it addresses • The concepts upon which RAID is built • Some commonly implemented RAID levels

#1 IT company For more information visit http://education.EMC.com

Data Protection: RAID