T111 RAID (not the fly spray…)
RAID • Redundant Array of Inexpensive Disks • HDDs are the main type of storage used in computing • They hold vast amounts of Data for their size • They are relatively cheap but… • They are relatively slow
Question… • What is better? • 1 x 1TB HDD • 2 x 512GB HDD
RAID • One HDD is limited to: • One read / write at a time • One interface • Multiple HDDs working together can overcome these limitations and more! • Disks used in a RAID array should be the same size, otherwise every disk is treated as if it were only as large as the smallest disk
Performance • When a file or several files are shared across two or more disks, they can be accessed / read / written faster • Imagine a large video file 3GB in size: • 1 disk can copy the file in 3 units of time • 2 disks can copy the file in 1.5 units of time • 3 disks can copy the file in 1 unit of time • 4 disks can copy the file in… ? • This is how RAID improves performance (in theory…)
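The arithmetic on this slide can be sketched in a few lines of Python. This is only the idealised model the slide describes (the file is split evenly and every disk works in parallel with no overhead); the function name `ideal_copy_time` is made up for illustration.

```python
def ideal_copy_time(single_disk_time: float, disks: int) -> float:
    """Idealised time to copy a file that is spread evenly across `disks` disks.

    Assumes perfect parallelism and no controller or interface overhead,
    which real arrays do not achieve.
    """
    if disks < 1:
        raise ValueError("need at least one disk")
    return single_disk_time / disks


# The 3GB video file takes 3 units of time on one disk.
for n in range(1, 5):
    print(n, "disk(s):", ideal_copy_time(3.0, n), "units of time")
```

In practice the gain flattens out as disks are added, which is why the slide says "in theory".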
Reliability • Disks are prone to failure • The more disks you have – the more likely it is that one will fail • If, however, you have two disks that both hold a copy of exactly the same data, you have greatly reduced your chance of losing that data due to disk failure, because both disks would have to fail before the data is lost
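A rough way to put numbers on these two points is sketched below. It assumes each disk fails independently with the same probability `p` over some period; the 5% figure is hypothetical and the function names are made up for illustration. Real disks in the same machine do not fail completely independently, so treat the results as an estimate only.

```python
def p_any_disk_fails(p: float, disks: int) -> float:
    """Chance that at least one of `disks` independent disks fails."""
    return 1 - (1 - p) ** disks


def p_all_copies_lost(p: float, copies: int = 2) -> float:
    """Chance that every disk holding a copy of the data fails."""
    return p ** copies


p = 0.05  # hypothetical failure probability for one disk

print(f"1 disk fails:              {p:.4f}")
print(f"at least 1 of 2 fails:     {p_any_disk_fails(p, 2):.4f}")
print(f"at least 1 of 10 fails:    {p_any_disk_fails(p, 10):.4f}")
print(f"both mirrored copies lost: {p_all_copies_lost(p):.4f}")
```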
Management • Windows assigns a drive letter to each HDD installed in the computer • If you have 10 HDDs then you have 10 different drive letters to manage • If your 10 HDDs are all 90% full you cannot store a file that would take up 11% of one of your HDDs, even though the free space adds up to a whole disk • RAID can combine these disks to appear as one
RAID 0 • RAID is defined in levels • Level 0 RAID uses two or more disks and spreads files across all the disks in the array • [Diagram: one HDD without RAID compared with a RAID 0 array made of HDD1 and HDD2]
RAID 0 • Good for performance • The read / write performance will increase for every disk added to the array • If one of the disks fails – all data is lost! Bye bye data…
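The idea of spreading a file across the array can be sketched as follows. This is a toy model, not how any particular controller works: the "disks" are Python lists, the 4-byte chunk size is arbitrary, and the names `stripe` and `unstripe` are made up for illustration.

```python
def stripe(data: bytes, disks: int, chunk: int = 4):
    """Deal fixed-size chunks of `data` round-robin onto `disks` disks."""
    array = [[] for _ in range(disks)]
    for i in range(0, len(data), chunk):
        array[(i // chunk) % disks].append(data[i:i + chunk])
    return array


def unstripe(array) -> bytes:
    """Reassemble the file by taking the chunks back in round-robin order."""
    out = []
    longest = max(len(disk) for disk in array)
    for row in range(longest):
        for disk in array:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


array = stripe(b"ABCDEFGHIJKLMNOPQRST", disks=2)
print(array[0])  # chunks held by HDD1
print(array[1])  # chunks held by HDD2
print(unstripe(array))
```

Each disk holds only every other chunk, so if either list is lost the file cannot be put back together. That is the "all data is lost" risk from the slide.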
RAID 0 • High performance • High risk – low reliability • Good for storage of temporary files where performance needs to be high but reliability does not • Would the page file benefit from this?
RAID 1 • Also called Disk Mirroring • Requires two disks • All data is written twice – once to each disk • At any time, there are two copies of all the data
RAID 1 • Performance is similar to having one disk • Write performance is slightly slower as everything has to be written to both disks – albeit simultaneously • Read performance can be slightly faster as there are two copies to read from • High reliability – if one disk fails, not only is there another copy of the data but the computer can carry on unaffected • Low efficiency – one of the disks is essentially wasted as two 500GB disks will be seen as one 500GB disk • Good for when reliability needs to be high • When read performance needs to be higher than write performance e.g. the operating system
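Mirroring can be sketched with a toy model in which each "disk" is a Python dictionary mapping a block number to its data. The class name `Mirror` and its methods are made up for illustration and do not correspond to any real RAID software.

```python
class Mirror:
    """Toy RAID 1: every write goes to both disks, reads use any working disk."""

    def __init__(self):
        self.disks = [{}, {}]  # two in-memory "disks": block number -> data

    def write(self, block: int, data: bytes) -> None:
        for disk in self.disks:
            if disk is not None:      # skip a disk that has failed
                disk[block] = data

    def fail(self, index: int) -> None:
        self.disks[index] = None      # simulate a dead disk

    def read(self, block: int) -> bytes:
        for disk in self.disks:
            if disk is not None:
                return disk[block]
        raise IOError("both disks have failed")


m = Mirror()
m.write(0, b"operating system")
m.fail(0)           # first disk dies
print(m.read(0))    # the second copy is still readable
```

Two disks hold the data but only one disk's worth of space is usable, which is the low efficiency noted on the slide.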
RAID 5 • Similar to RAID 0 except that the equivalent of one extra disk is used for Parity • Minimum of 3 disks
RAID 5 • Uses Parity to add redundancy to the array • If any one disk fails, its contents can be rebuilt using the other disks • But how?! • Using XOR (Exclusive Or)
XOR • You do not need to know how XOR works… • But you do need to know the rules • The symbol for XOR is ⊕ • X, Y and Z are three numbers. If X ⊕ Y = Z then Y ⊕ X = Z and X ⊕ Z = Y and Z ⊕ Y = X etc.
XOR • This rule is easily extended so that if W is also some number then: If X ⊕ Y ⊕ Z = W then X ⊕ Y = W ⊕ Z and X ⊕ Z = W ⊕ Y and Z ⊕ Y = W ⊕ X and Z = W ⊕ X ⊕ Y and X = W ⊕ Z ⊕ Y etc. • Given that 0x45 ⊕ 0x76 = 0x33 what is 0x76 ⊕ 0x33 ?
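The XOR rules can be checked directly in Python, where `^` is the XOR operator on integers. The first part uses the numbers from the slide; the values `0x12`, `0x34` and `0x56` in the second part are arbitrary and chosen only for illustration.

```python
# Two-value rule: if X ^ Y == Z then any two of the three give the third.
x, y = 0x45, 0x76
z = x ^ y
print(hex(z))        # 0x33, as stated on the slide
print(hex(y ^ z))    # gives back x
print(hex(x ^ z))    # gives back y

# Three-value rule: if A ^ B ^ C == W then any one value
# can be recovered by XORing W with the other two.
a, b, c = 0x12, 0x34, 0x56
w = a ^ b ^ c
assert a == w ^ b ^ c
assert b == w ^ a ^ c
assert c == w ^ a ^ b
assert a ^ b == w ^ c
```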
RAID 5 • RAID 5 uses XOR to add redundancy to an array • In RAID 5 the parity is spread across the disks
RAID 5 • Disk 2 has failed… • This is how the RAID controller will recover your data: • Block A: read directly from disk 1. • Block B: read block A from disk 1 and (A ⊕ B) from disk 3, calculate B = A ⊕ (A ⊕ B) • Block C: read directly from disk 1. • Block D: read directly from disk 3. • Block E: read block (E ⊕ F) from disk 1 and F from disk 3, calculate E = (E ⊕ F) ⊕ F • Block F: read directly from disk 3. • If you want to save new data B’, this is what the RAID controller does: • Read Block A from disk 1 and calculate new (A ⊕ B)’ = A ⊕ B’ and write this to disk 3.
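The recovery steps above can be sketched with a toy three-disk array. The block layout is inferred from the steps on the slide (disk 1 holds A, C and E ⊕ F; disk 2 holds B, C ⊕ D and E; disk 3 holds A ⊕ B, D and F). The block contents and the helper names `xor_bytes` and `rebuild` are made up for illustration; a real controller works on much larger blocks.

```python
def xor_bytes(p: bytes, q: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    return bytes(i ^ j for i, j in zip(p, q))


# Six 4-byte data blocks with arbitrary contents.
A, B, C, D, E, F = (bytes([n] * 4) for n in (0x11, 0x22, 0x33, 0x44, 0x55, 0x66))

# Parity is spread across the disks, one parity block per stripe.
disk1 = [A, C, xor_bytes(E, F)]
disk2 = [B, xor_bytes(C, D), E]
disk3 = [xor_bytes(A, B), D, F]


def rebuild(survivor_a, survivor_b):
    """Each block of the failed disk is the XOR of the two surviving blocks."""
    return [xor_bytes(p, q) for p, q in zip(survivor_a, survivor_b)]


# Disk 2 has failed: recover its contents from disks 1 and 3.
rebuilt = rebuild(disk1, disk3)
assert rebuilt == disk2

# Saving new data B' while disk 2 is missing: only the parity on disk 3 changes.
B_new = bytes([0x77] * 4)
disk3[0] = xor_bytes(A, B_new)
assert rebuild(disk1, disk3)[0] == B_new
```

The same `rebuild` call works whichever disk is lost, because every stripe satisfies "block on disk 1 ⊕ block on disk 2 = block on disk 3" in some order.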
RAID 5 • Read performance is better than a single disk as all disks can be read simultaneously • Write performance will be poor because we need to write the data onto one disk, calculate the parity and then store the parity on another disk • Reliability • One disk can fail without data loss • While a disk is failed, read and write performance will be terrible as a simple read may need to be calculated from the data and parity on the other disks • Efficiency • The equivalent of one disk in the array will be used for parity, this means that you lose one disk's worth of storage
RAID 5 • Good all-rounder • Lower wastage of disk space (compared to RAID 1) • Fair performance • Your H: drive probably uses RAID 5 or similar • Not so good for operating system drives as the OS often crashes when one disk fails due to the poor performance
Disk Reads / Writes • The number of times a disk has to be read or written for a given file affects the performance of the operation • The performance of the operation can be increased by performing these reads / writes at the same time on different disks
RAID Quiz • If you have four 500GB drives in a RAID 5 array, how big is the logical drive as seen by the operating system? • What about five 500GB drives?
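One way to check your answers is the small calculation below. It relies only on the rule from the earlier slides that the equivalent of one disk is used for parity, and it assumes all disks are the same size. The function name `raid5_usable_gb` is made up for illustration.

```python
def raid5_usable_gb(disks: int, disk_size_gb: int) -> int:
    """Usable capacity of a RAID 5 array of equally sized disks.

    One disk's worth of space holds parity, so (disks - 1) disks hold data.
    """
    if disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disks - 1) * disk_size_gb


for n in (4, 5):
    print(n, "x 500GB drives:", raid5_usable_gb(n, 500), "GB usable")
```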