Data Protection: RAID. Chapter 3. Presented by: Anupam Mittal. Lecture 7. Data protection: Concept of RAID and its Components. Chapter Objectives. After completing this chapter, you will be able to: Describe what RAID is and the needs it addresses
Data Protection: RAID Chapter 3 Presented by: Anupam Mittal
Lecture 7 • Data protection: Concept of RAID and its Components
Chapter Objectives After completing this chapter, you will be able to: • Describe what RAID is and the needs it addresses • Describe the concepts upon which RAID is built • Define and compare RAID levels • Recommend the use of the common RAID levels based on performance and availability considerations • Explain factors impacting disk drive performance
Why RAID • Performance limitations of a single disk drive • Limited capacity • Limited access speed • An individual drive has a certain life expectancy • Measured in MTBF (mean time between failures) • Example: if the MTBF of a drive is 750,000 hours and there are 100 drives in the array, then the MTBF of the array becomes 750,000 / 100, or 7,500 hours • RAID was introduced to mitigate this problem • RAID provides: • Increased capacity • Higher availability • Increased performance
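The MTBF example above can be checked with a few lines of Python. This is a minimal sketch, assuming identical drives with independent failures and an array with no redundancy, so that any single drive failure counts as an array failure; the function name `array_mtbf` is hypothetical.

```python
def array_mtbf(drive_mtbf_hours: float, drive_count: int) -> float:
    """Approximate MTBF of a non-redundant array of identical drives.

    Assumption: failures are independent, and the first drive failure
    counts as a failure of the whole array.
    """
    if drive_count < 1:
        raise ValueError("drive_count must be at least 1")
    return drive_mtbf_hours / drive_count


mtbf = array_mtbf(750_000, 100)
print(f"{mtbf:,.0f} hours (~{mtbf / 24:.1f} days)")  # prints: 7,500 hours (~312.5 days)
```

Expressed in days, the 100-drive array can expect a drive failure roughly once every 312.5 days, which is the problem RAID redundancy is meant to mitigate.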
RAID - Redundant Array of Independent Disks [Figure: Host connected to a RAID controller and a RAID array]
RAID Array Components [Figure: Host, RAID controller, and a RAID array of hard disks grouped into a physical array and a logical array]
RAID Implementations • Hardware (usually a specialized disk controller card) • Controls all drives attached to it • Array(s) appear to host operating system as a regular disk drive • Provided with administrative software • Software • Runs as part of the operating system • Performance is dependent on CPU workload • Does not support all RAID levels
RAID Levels • 0 Striped array with no fault tolerance • 1 Disk mirroring • 3 Parallel access array with dedicated parity disk • 4 Striped array with independent disks and a dedicated parity disk • 5 Striped array with independent disks and distributed parity • 6 Striped array with independent disks and dual distributed parity • Nested RAID (e.g., 1 + 0, 0 + 1)
RAID Redundancy: Parity [Figure: Host and RAID controller; blocks 0 to 11 striped across four data disks (0 4 8, 1 5 9, 2 6 10, 3 7 11), with a parity disk holding the parity of blocks 0 1 2 3, 4 5 6 7, and 8 9 10 11]
Parity Calculation [Figure: RAID array with data disks holding 5, 3, 4, 2 and a parity disk holding 14] • Parity: 5 + 3 + 4 + 2 = 14 • The middle drive fails: 5 + 3 + ? + 2 = 14 • ? = 14 – 5 – 3 – 2 • ? = 4
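The slide uses ordinary addition to keep the idea visible. Parity in practice is usually computed with bitwise XOR, as the later performance slide notes with "(XOR operations)". The sketch below repeats the same rebuild using XOR, assuming equal-length strips; the function names `xor_parity` and `rebuild_missing` are hypothetical.

```python
from functools import reduce


def xor_parity(strips: list[bytes]) -> bytes:
    """XOR equal-length strips together, byte by byte."""
    if len({len(s) for s in strips}) != 1:
        raise ValueError("strips must be non-empty and of equal length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the one missing strip: XOR of the survivors and the parity."""
    return xor_parity(surviving + [parity])


# Same values as the slide, one byte per strip.
data = [bytes([5]), bytes([3]), bytes([4]), bytes([2])]
parity = xor_parity(data)

# The drive holding the value 4 fails; rebuild it from the rest.
rebuilt = rebuild_missing([data[0], data[1], data[3]], parity)
assert rebuilt == data[2]
```

XOR is its own inverse, so the same operation serves for both generating parity and rebuilding a lost strip.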
Lecture 8, 9, 10 • Different RAID levels and their suitability for different application environments: RAID 0, RAID 1
Data Organization: Striping [Figure: Strip 1, Strip 2, and Strip 3 on individual disks; the aligned strips across the disks form Stripe 1 and Stripe 2]
RAID 0 – Striped Array with no Fault Tolerance [Figure: Host and RAID controller; data blocks striped across four disks with no parity]
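The striped layout can be expressed as a simple address calculation. This is a minimal sketch, assuming round-robin placement of strips across the disks; the function name `locate_block` and the `blocks_per_strip` parameter are hypothetical.

```python
def locate_block(logical_block: int, disk_count: int,
                 blocks_per_strip: int = 1) -> tuple[int, int]:
    """Return (disk index, block offset on that disk) for a RAID 0 logical block.

    Assumption: strips are placed round-robin, so consecutive strips land on
    consecutive disks and each full pass over the disks is one stripe.
    """
    strip_number, offset_in_strip = divmod(logical_block, blocks_per_strip)
    stripe_number, disk_index = divmod(strip_number, disk_count)
    return disk_index, stripe_number * blocks_per_strip + offset_in_strip


# Four disks, one block per strip: list which blocks each disk holds.
layout = {disk: [] for disk in range(4)}
for block in range(12):
    disk, _ = locate_block(block, disk_count=4)
    layout[disk].append(block)

for disk, blocks in layout.items():
    print(f"disk {disk}: {blocks}")
```

With four disks and one block per strip, disk 1 holds blocks 1, 5, and 9, which agrees with the block numbers shown in the parity figure earlier in the deck.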
RAID 1 – Disk Mirroring [Figure: Host and RAID controller; Block 0 and Block 1 are each written to both disks of the mirrored pair]
Nested RAID – 0+1 (Striping and Mirroring) [Figure: Host and RAID controller; Blocks 0 to 3 are striped (RAID 0) across one set of disks, and the whole striped set is mirrored (RAID 1) to a second set]
Nested RAID – 1+0 (Mirroring and Striping) [Figure: Host and RAID controller; each block is mirrored (RAID 1) on a pair of disks, and Blocks 0 to 3 are striped (RAID 0) across the mirrored pairs]
RAID 0+1 vs. RAID 1+0 • Benefits are identical under normal operations • Rebuild operations are very different • RAID 1+0 uses mirrored pairs: if a disk fails, only that one disk is rebuilt from its mirror • RAID 0+1: if a single drive fails, the entire stripe set is faulted • RAID 0+1 is a poorer solution and is less common
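The difference in fault handling can be made concrete by counting which double-disk failures each layout survives. This is a minimal sketch, assuming a hypothetical four-disk array split into two groups of two; the names `GROUPS`, `raid10_survives`, and `raid01_survives` are illustrative only.

```python
from itertools import combinations

DISKS = range(4)
GROUPS = [{0, 1}, {2, 3}]  # hypothetical layout: two groups of two disks


def raid10_survives(failed: set[int]) -> bool:
    """RAID 1+0: each group is a mirrored pair.

    Data is lost only if both disks of the same pair fail.
    """
    return all(not pair <= failed for pair in GROUPS)


def raid01_survives(failed: set[int]) -> bool:
    """RAID 0+1: each group is a stripe set, and the sets mirror each other.

    One failed disk faults its whole stripe set, so at least one
    stripe set must be completely healthy.
    """
    return any(not (stripe_set & failed) for stripe_set in GROUPS)


two_disk_failures = [set(c) for c in combinations(DISKS, 2)]
raid10_ok = sum(raid10_survives(f) for f in two_disk_failures)
raid01_ok = sum(raid01_survives(f) for f in two_disk_failures)

print(f"RAID 1+0 survives {raid10_ok} of {len(two_disk_failures)} double failures")
print(f"RAID 0+1 survives {raid01_ok} of {len(two_disk_failures)} double failures")
```

Under these assumptions RAID 1+0 survives four of the six possible double failures and RAID 0+1 survives two, which is consistent with the slide's conclusion that 0+1 is the poorer choice.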
RAID Redundancy: Parity [Figure: Host, RAID controller, data disks, and parity disk] • Parity calculation: 4 + 6 + 1 + 7 = 18 • The middle drive fails: 4 + 6 + ? + 7 = 18 • ? = 18 – 4 – 6 – 7 • ? = 1
RAID 3 – Parallel Transfer with Dedicated Parity Disk [Figure: Host and RAID controller; Blocks 0 to 3 are written in parallel across the data disks, and the parity generated for blocks 0 1 2 3 is written to the dedicated parity disk]
RAID 4 – Striping with Dedicated Parity Disk [Figure: Host and RAID controller; Blocks 0 to 7 are striped across the data disks, and the generated parity P 0 1 2 3 and P 4 5 6 7 is written to the dedicated parity disk]
RAID 5 – Independent Disks with Distributed Parity [Figure: Host and RAID controller; Blocks 0 to 7 are striped across the disks, and the generated parity P 0 1 2 3 and P 4 5 6 7 is placed on different disks]
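Distributed parity means the parity strip moves to a different disk from one stripe to the next. The slides do not say which rotation scheme is used, so the sketch below assumes a simple one in which parity starts on the last disk and moves one disk to the left per stripe; the function name `raid5_layout` is hypothetical. Holding `parity_disk` constant instead would give the dedicated parity disk of RAID 4.

```python
def raid5_layout(disk_count: int, stripe_count: int) -> list[list[str]]:
    """Return one row per stripe; each cell is 'P' or a data block number."""
    rows = []
    next_block = 0
    for stripe in range(stripe_count):
        # Assumed rotation: last disk first, then one disk to the left per stripe.
        parity_disk = (disk_count - 1 - stripe) % disk_count
        row = []
        for disk in range(disk_count):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(str(next_block))
                next_block += 1
        rows.append(row)
    return rows


# Five disks, as in the four-data-blocks-plus-parity stripes on the slide.
for row in raid5_layout(disk_count=5, stripe_count=5):
    print(" ".join(f"{cell:>2}" for cell in row))
```

Over five stripes each of the five disks holds parity exactly once, so parity writes are spread across the array instead of concentrating on one disk.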
RAID 6 – Dual Parity RAID • Two disk failures in a RAID set lead to data unavailability and data loss in single-parity schemes such as RAID 3, 4, and 5 • Increasing the number of drives in an array and increasing drive capacity lead to a higher probability of two disks failing in a RAID set • RAID 6 protects against two disk failures by maintaining two parities • Horizontal parity, which is the same as RAID 5 parity • Diagonal parity, which is calculated by taking diagonal sets of data blocks from the RAID set members • EVENODD and Reed-Solomon are two commonly used algorithms for calculating parity in RAID 6
RAID Implementations • Hardware (usually a specialized disk controller card) • Controls all drives attached to it • Performs all RAID-related functions, including volume management • Array(s) appear to the host operating system as a regular disk drive • Dedicated cache to improve performance • Generally provides some type of administrative software • Software • Generally runs as part of the operating system • Volume management performed by the server • Provides more flexibility for hardware, which can reduce the cost • Performance is dependent on CPU load • Has limited functionality
Lecture 11 • Comparison of RAID Levels
RAID Comparison
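One dimension on which the RAID levels in this chapter are commonly compared is usable capacity. The sketch below is an illustration under stated assumptions and not the slide's own comparison: it assumes n identical disks, one disk's worth of parity for RAID 3, 4, and 5, two for RAID 6, and two-way mirroring for RAID 1 and the nested levels. The function name `usable_disks` is hypothetical.

```python
def usable_disks(level: str, n: int) -> float:
    """Number of disks' worth of usable capacity in an n-disk RAID set."""
    if level in ("1", "1+0", "0+1"):
        return n / 2  # assumption: two-way mirroring, half the raw capacity
    parity_disks = {"0": 0, "3": 1, "4": 1, "5": 1, "6": 2}
    return n - parity_disks[level]


for level, n in [("0", 4), ("1", 2), ("1+0", 4), ("5", 5), ("6", 6)]:
    usable = usable_disks(level, n)
    print(f"RAID {level:<3} with {n} disks: {usable:g} usable ({usable / n:.0%})")
```

Capacity is only one axis of the comparison; performance and availability, the other considerations named in the chapter objectives, have to be weighed alongside it.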
RAID Impacts on Performance [Figure: RAID controller writing to data disks D1, D2, D3, D4 and parity P0] • Small (less than element size) write on RAID 3 & 5 • Ep = E1 + E2 + E3 + E4 (XOR operations) • If parity is valid, then: Ep new = Ep old – E4 old + E4 new (XOR operations) • 2 disk reads (old data, old parity) and 2 disk writes (new data, new parity) • Parity vs. Mirroring • Reading, calculating, and writing the parity segment introduces a penalty on every write operation • Parity RAID penalty manifests as slower cache flushes • Increased write load can cause contention and slower read response times
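The small-write update can be checked directly: with XOR, the "+" and "–" in the formulas above are the same operation, so the new parity is the old parity XOR the old element XOR the new element. This is a minimal sketch with made-up two-byte elements; the function names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(elements: list[bytes]) -> bytes:
    """Ep = E1 xor E2 xor ... : parity recomputed from every data element."""
    parity = bytes(len(elements[0]))
    for element in elements:
        parity = xor_bytes(parity, element)
    return parity


def small_write_parity(old_parity: bytes, old_element: bytes,
                       new_element: bytes) -> bytes:
    """Ep new = Ep old xor E4 old xor E4 new, without reading E1..E3."""
    return xor_bytes(xor_bytes(old_parity, old_element), new_element)


# Illustrative values only.
elements = [b"\x11\x22", b"\x33\x44", b"\x55\x66", b"\x77\x88"]
ep_old = full_parity(elements)

e4_new = b"\xaa\xbb"
ep_new = small_write_parity(ep_old, elements[3], e4_new)

# The shortcut matches a full recomputation over the updated stripe.
assert ep_new == full_parity(elements[:3] + [e4_new])
```

The shortcut touches only the changed element and the parity, which is where the two reads and two writes per small write come from.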
Hot Spares [Figure: RAID controller and a RAID array with a hot spare drive]
Check Your Knowledge • What is a RAID array? • What benefits do RAID arrays provide? • What methods can be used to provide higher data availability in a RAID array? • What is the primary difference between RAID 3 and RAID 5? • What is the advantage of using RAID 6? • What is a hot spare?