This chapter discusses the concept of ideal disks and introduces RAIDs (Redundant Arrays of Inexpensive Disks), which are a data storage virtualization technology used for data redundancy and performance improvement. It covers various types of RAIDs, including RAID level 0 (striping), RAID level 1 (mirroring), RAID level 4 (saving space with parity), and RAID level 5 (rotating parity). The chapter also explores how to evaluate a RAID based on capacity, reliability, and performance. Additional topics such as RAID comparison and other related issues are discussed.
Chapter 38: Redundant Arrays of Inexpensive Disks (RAIDs) Joongsuk Park (jspark@dcslab.snu.ac.kr), Jun Heo (heojun18@gmail.com) School of Computer Science and Engineering, Seoul National University
Outline • Ideal Disks? • RAIDs • Various Types of RAIDs • RAID level 0: Striping • RAID level 1: Mirroring • RAID level 4: Saving space with parity • RAID level 5: Rotating parity • RAID Comparison • Other Issues • Summary
Ideal Disk? • Fast • Large • Reliable • RAIDs approximate an ideal disk
RAIDs • RAID is a data storage virtualization technology for the purpose of data redundancy or performance improvement • Advantages of RAIDs • Performance • Parallelism • Capacity • Multiple disk drive components • Reliability • Mirroring • Striping with parity • RAIDs provide these advantages transparently • Improve deployability
How to Evaluate a RAID • Capacity • A set of N disks, each with B blocks • Without redundancy: N ∙ B • Keeping two copies of each block (mirroring): (N ∙ B) / 2 • Reliability • Fault tolerance: how many disk failures the design can tolerate • Performance • Depends on the workload
Evaluating RAID Performance • Two types of performance metrics • Single-request latency • How much parallelism can exist during a single logical I/O operation • Steady-state throughput • Total bandwidth of many concurrent requests • Two types of workloads • Sequential (S MB/s) • - Spend little time seeking and waiting for rotation • - Spend most of its time transferring data • Random (R MB/s) • - Spend most of its time seeking and waiting for rotation • - Spend relatively little time transferring data
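To make S and R concrete, the following is a minimal sketch of how the two rates could be estimated for a single disk. The disk parameters (7 ms average seek, 3 ms average rotational delay, 50 MB/s transfer rate) and the request sizes are illustrative assumptions, not measurements from the slides, and `effective_throughput` is a hypothetical helper name.

```python
def effective_throughput(transfer_mb, seek_ms, rotation_ms, transfer_rate_mb_s):
    """Return the effective bandwidth (MB/s) of one request on a single disk."""
    positioning_s = (seek_ms + rotation_ms) / 1000.0   # time spent before any data moves
    transfer_s = transfer_mb / transfer_rate_mb_s      # time spent actually moving data
    return transfer_mb / (positioning_s + transfer_s)

# Assumed disk: 7 ms seek, 3 ms rotational delay, 50 MB/s transfer rate.
S = effective_throughput(10.0, 7.0, 3.0, 50.0)   # sequential: one 10 MB transfer
R = effective_throughput(0.01, 7.0, 3.0, 50.0)   # random: one 10 KB transfer (taken as 0.01 MB)

print(f"S = {S:.2f} MB/s, R = {R:.2f} MB/s")
```

Under these assumptions the sequential rate is close to the raw transfer rate, while the random rate is dominated by positioning time, which is why the two workloads are analyzed separately.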
RAID Level 0: Striping
RAID 0 layout (each row is a stripe; each number is a block)
DISK 0   DISK 1   DISK 2   DISK 3
   0        1        2        3
   4        5        6        7
   8        9       10       11
  12       13       14       15
• Excellent upper-bound on performance and capacity • Simple round-robin striping over multiple disks • Extract the most parallelism in I/O requests • The stripe consists of multiple blocks of data in the same row
RAID Level 0: Striping
RAID 0 layout with chunk size = 2 blocks
DISK 0   DISK 1   DISK 2   DISK 3
   0        2        4        6
   1        3        5        7
   8       10       12       14
   9       11       13       15
• Chunk size mostly affects performance • Performance tradeoff with chunk size • Parallelism vs. positioning time to access blocks on multiple disks • Small chunk size increases parallelism of read/write • Big chunk size reduces positioning time
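The round-robin placement on the two striping slides can be sketched as a mapping from a logical block number to a (disk, offset) pair. The function names `raid0_map` and `raid0_layout` are hypothetical, and the sketch assumes the total number of blocks is a multiple of the stripe size.

```python
def raid0_map(block, num_disks, chunk_blocks=1):
    """Map a logical block to (disk, offset on that disk) under RAID 0."""
    chunk = block // chunk_blocks          # which chunk the block belongs to
    disk = chunk % num_disks               # chunks go round-robin across disks
    stripe = chunk // num_disks            # row of chunks this chunk sits in
    offset = stripe * chunk_blocks + block % chunk_blocks
    return disk, offset

def raid0_layout(num_blocks, num_disks, chunk_blocks=1):
    """Build a grid (rows = offsets, columns = disks) for inspection."""
    rows = num_blocks // num_disks
    grid = [[None] * num_disks for _ in range(rows)]
    for block in range(num_blocks):
        disk, offset = raid0_map(block, num_disks, chunk_blocks)
        grid[offset][disk] = block
    return grid

for row in raid0_layout(16, 4, chunk_blocks=2):
    print(row)
```

With `chunk_blocks=1` the grid has rows 0 to 3, 4 to 7, and so on; with `chunk_blocks=2` each disk holds two consecutive blocks before the placement moves to the next disk.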
RAID Level 0: Analysis • Capacity • Useful capacity: N (the number of disks)∙ B (the number of blocks) • Reliability • Any disk failure leads to data loss • Performance • All disks are utilized, often in parallel (N: the number of disks, S: sequential throughput, R: random throughput, T: Single disk’s latency)
RAID Level 1: Mirroring
RAID 1 layout (DISK 0 and DISK 1 mirror each other, as do DISK 2 and DISK 3)
DISK 0   DISK 1   DISK 2   DISK 3
   0        0        1        1
   2        2        3        3
   4        4        5        5
   6        6        7        7
• Simply make copies of each block • Each copy should be placed on a separate disk • Tolerate a disk failure • Read/Write a block • When reading a block, RAID 1 can choose either copy • When writing a block, RAID 1 must update both copies
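The mirrored placement above can be sketched as follows, assuming a mirroring level of 2, an even number of disks, and adjacent disks forming a mirror pair. The helper name `raid1_map` is hypothetical.

```python
def raid1_map(block, num_disks):
    """Return (primary disk, mirror disk, offset) for a logical block."""
    if num_disks % 2 != 0:
        raise ValueError("this sketch assumes an even number of disks")
    pairs = num_disks // 2
    pair = block % pairs           # blocks are striped across mirror pairs
    offset = block // pairs        # same offset on both disks of the pair
    return 2 * pair, 2 * pair + 1, offset

primary, mirror, offset = raid1_map(5, 4)
# A read may go to either disk of the pair; a write must go to both.
print(f"block 5: read from disk {primary} or {mirror}, write to both, offset {offset}")
```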
RAID Level 1: Analysis (N: the number of disks, S: sequential throughput, R: random throughput, T: Single disk’s latency) • Mirroring Level = 2 • Capacity • Half of our peak useful capacity: (N ∙ B) / 2 • Reliability • Tolerates the failure of any one disk for certain (up to N / 2 failures, depending on which disks fail) • Performance
RAID Level 4: Saving Space With Parity
RAID 4 layout (four data disks plus one dedicated parity disk)
DISK 0   DISK 1   DISK 2   DISK 3   Parity
   0        1        2        3       P0
   4        5        6        7       P1
   8        9       10       11       P2
  12       13       14       15       P3
• Add a parity disk • Overcome the huge space penalty of mirroring • Add a single parity block for each stripe of blocks
RAID Level 4: Saving Space With Parity • How to compute parity • The number of 1s in any row, including the parity bit, must be even (not odd) • Simply perform a bitwise XOR across each bit of the data blocks • Put the result of each bitwise XOR into the corresponding bit slot in the parity block
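The parity rule above can be sketched in a few lines. This is an illustrative example on one-byte blocks, not the slides' implementation; `parity` and `reconstruct` are hypothetical helper names, and the sample bit patterns are made up.

```python
from functools import reduce

def parity(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct(surviving_blocks, parity_block):
    """Recover one lost block by XORing the survivors with the parity block."""
    return parity(list(surviving_blocks) + [parity_block])

# One stripe of four data blocks (two bits of interest per block).
data = [bytes([0b00]), bytes([0b10]), bytes([0b11]), bytes([0b10])]
p = parity(data)

# Pretend the disk holding data[2] failed and rebuild its block.
lost = reconstruct([data[0], data[1], data[3]], p)
print(p, lost)
```

Because XOR is its own inverse, the same routine serves both for computing parity and for rebuilding a single missing block, which is why the scheme tolerates exactly one disk failure.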
RAID Level 4: Analysis (N: the number of disks, S: sequential throughput, R: random throughput, T: Single disk’s latency) • Capacity • One disk for parity information: (N - 1) ∙ B • Reliability • Tolerate one disk failure and no more • Performance
RAID Level 4: Analysis • Performance • Steady-state throughput • Sequential read • Utilize all of the disks except for the parity disk: (N - 1) ∙ S • Sequential write • When writing a big chunk of data to disk, RAID 4 performs an optimization called a full-stripe write • - Compute the parity from the new data blocks and write the whole stripe, including parity, in parallel: (N - 1) ∙ S • Random read • Utilize all of the disks except for the parity disk: (N - 1) ∙ R • Random write • To overwrite one block, the parity block must be updated both correctly and efficiently • - Two methods for updating a parity block: additive parity and subtractive parity
RAID Level 4: Analysis • Additive parity • 1. Write new data • 2. Read the other data blocks of the stripe from all other data disks • 3. Compute new parity (XOR of the new block and the blocks just read) • 4. Write new parity • Larger RAIDs require a high number of reads to compute parity
RAID Level 4: Analysis • Subtractive parity • 1. Read old data (C_old) • 2. Read old parity (P_old) • Compute new parity: P_new = (C_old ⊕ C_new) ⊕ P_old • 3. Write new data (C_new) • 4. Write new parity (P_new)
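The two parity-update methods can be compared side by side in a short sketch. The block values are made up for illustration and the function names are hypothetical; the point is only that both methods arrive at the same new parity while reading different blocks.

```python
def xor_blocks(a, b):
    """Bitwise XOR of two equal-length byte blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def additive_parity(stripe, index, new_block):
    """Recompute parity from the new block plus every other data block in the stripe."""
    result = new_block
    for i, block in enumerate(stripe):
        if i != index:                 # one read per remaining data disk
            result = xor_blocks(result, block)
    return result

def subtractive_parity(old_block, new_block, old_parity):
    """P_new = (C_old XOR C_new) XOR P_old: two reads, independent of stripe width."""
    return xor_blocks(xor_blocks(old_block, new_block), old_parity)

stripe = [b"\x0f", b"\x33", b"\x55", b"\xaa"]
old_parity = b"\x00"
for block in stripe:
    old_parity = xor_blocks(old_parity, block)

new_block = b"\x77"                    # overwrite the block at index 1
p_add = additive_parity(stripe, 1, new_block)
p_sub = subtractive_parity(stripe[1], new_block, old_parity)
print(p_add == p_sub)
```

The additive method reads every other data block in the stripe, so its cost grows with the number of disks, while the subtractive method always reads only the old data block and the old parity block.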
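RAID Level 4: Analysis • Small-write problem • The parity disk is the bottleneck • Even though the data disks can be accessed in parallel, every small write must also update the parity disk, which serializes the writes • The parity disk performs one read and one write per logical write, so random write = R / 2, no matter how many disks are added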
RAID Level 5: Rotating Parity
RAID 5 layout (the parity block moves one disk to the left on each stripe)
DISK 0   DISK 1   DISK 2   DISK 3   DISK 4
   0        1        2        3       P0
   5        6        7       P1        4
  10       11       P2        8        9
  15       P3       12       13       14
  P4       16       17       18       19
• Rotate the parity block across drives • Remove the parity-disk bottleneck of RAID 4 • Parity information is spread over all disks
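One way to express a rotating-parity placement is sketched below. It assumes a left-symmetric rotation, in which the parity block moves one disk to the left on each stripe and the data blocks continue on the disk after the parity block, wrapping around. Real arrays may use other rotation schemes, and `raid5_map` and `raid5_layout` are hypothetical names.

```python
def raid5_map(block, num_disks):
    """Return (data disk, parity disk, stripe) for a logical block."""
    data_per_stripe = num_disks - 1
    stripe = block // data_per_stripe
    position = block % data_per_stripe                   # index of the block within its stripe
    parity_disk = (num_disks - 1 - stripe) % num_disks   # parity rotates leftward
    data_disk = (parity_disk + 1 + position) % num_disks # data starts right after parity
    return data_disk, parity_disk, stripe

def raid5_layout(num_stripes, num_disks):
    """Build a grid of block labels (rows = stripes, columns = disks)."""
    grid = []
    for stripe in range(num_stripes):
        row = [None] * num_disks
        for position in range(num_disks - 1):
            block = stripe * (num_disks - 1) + position
            data_disk, parity_disk, _ = raid5_map(block, num_disks)
            row[data_disk] = str(block)
            row[parity_disk] = f"P{stripe}"
        grid.append(row)
    return grid

for row in raid5_layout(5, 5):
    print(row)
```

Since consecutive stripes keep their parity on different disks, two small writes to different stripes usually touch different parity disks and can proceed in parallel.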
RAID Level 5: Analysis (N: the number of disks, S: sequential throughput, R: random throughput, T: Single disk’s latency) • Much of the analysis for RAID 5 is identical to RAID 4 • Effective capacity and failure tolerance • Sequential read/write performance is also identical to RAID 4 • Random read performance is a little better • Due to the utilization of all disks • Random write performance improves over RAID 4 • Allow for parallelism across requests
RAID Comparison: A Summary • RAID Capacity, Reliability, and Performance
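The comparison table on this slide did not survive extraction. As a stand-in, the sketch below encodes the formulas stated on the earlier analysis slides, filling the remaining entries (RAID 1 throughput, RAID 5 random read and write, and the latencies) from the standard textbook analysis of these levels. Those filled-in entries are an assumption about what the slide showed, and `raid_summary` is a hypothetical helper.

```python
def raid_summary(level, n, b, s, r, t):
    """Capacity, guaranteed failure tolerance, throughput and latency for one RAID level.

    n: disks, b: blocks per disk, s: sequential MB/s, r: random MB/s, t: single-disk latency.
    """
    table = {
        0: dict(capacity=n * b, tolerated=0,
                seq_read=n * s, seq_write=n * s,
                rand_read=n * r, rand_write=n * r,
                read_latency=t, write_latency=t),
        1: dict(capacity=n * b / 2, tolerated=1,       # up to n/2 failures if lucky
                seq_read=n / 2 * s, seq_write=n / 2 * s,
                rand_read=n * r, rand_write=n / 2 * r,
                read_latency=t, write_latency=t),
        4: dict(capacity=(n - 1) * b, tolerated=1,
                seq_read=(n - 1) * s, seq_write=(n - 1) * s,
                rand_read=(n - 1) * r, rand_write=r / 2,  # parity disk is the bottleneck
                read_latency=t, write_latency=2 * t),
        5: dict(capacity=(n - 1) * b, tolerated=1,
                seq_read=(n - 1) * s, seq_write=(n - 1) * s,
                rand_read=n * r, rand_write=n / 4 * r,    # four I/Os per logical write
                read_latency=t, write_latency=2 * t),
    }
    return table[level]

# Illustrative numbers only: 5 disks of 1000 blocks, S = 50 MB/s, R = 1 MB/s, T = 10 ms.
for level in (0, 1, 4, 5):
    print(level, raid_summary(level, n=5, b=1000, s=50.0, r=1.0, t=10.0))
```

Read across the rows, the tradeoff is the one the chapter describes: RAID 0 maximizes capacity and throughput with no fault tolerance, RAID 1 pays half the capacity for reliability and good random I/O, and RAID 4 and RAID 5 recover most of the capacity at the cost of small-write performance.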
Other Issues • Other RAID designs • Levels 2, 3, and 6 • Handling a disk failure • Hot spare • Automatically fills in for the failed disk • Software RAID systems • Pros • Lower cost due to the lack of RAID-dedicated hardware • Cons • Lower RAID performance because the CPU also runs the OS and applications
Summary • Ideal disk • Large, fast, and reliable • RAIDs • Various RAID levels • RAID level 0: Striping • RAID level 1: Mirroring • RAID level 4: Saving space with parity • RAID level 5: Rotating parity • RAID comparison • Other issues