Three-Dimensional Redundancy Codes for Archival Storage • J.-F. Pâris, U. of Houston • D. D. E. Long, U. C. Santa Cruz • W. Litwin, U. Paris-Dauphine
Background • Archival files • Must be kept a long time • At lowest possible cost • Emphasis on • Providing highest reliability at lowest cost • Update speed is less important • Focus on multi-dimensional RAID arrays • Highly reliable • Very space-efficient
A two-dimensional RAID array • Data disks: D11, D12, D21, D22 • Parity disks: P1, P2, Q1, Q2 • Four parity disks • Four parity stripes • Four data disks
A better array • Parity disks: P1, P2, P3, P4 • Data disks: D12, D13, D14, D23, D24, D34 • Four parity disks • Four parity stripes • Six data disks
Can we do better? • Use a three-dimensional organization • Replace parity stripes by parity planes • Each parity plane will contain one parity disk • Place data disks at the intersections of three parity planes
Example • Parity planes α, β, γ and δ • Four data disks αβγ, αβδ, αγδ and βγδ
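The example can be enumerated mechanically. The short Python sketch below lists one data disk per choice of three parity planes, then the data disks each plane contains. The function name `build_layout` and the disk-naming scheme are assumptions made for illustration only.

```python
from itertools import combinations

def build_layout(planes):
    """Return (data_disks, members): one data disk per triple of planes.

    members maps each plane to the data disks it contains; the plane's
    own parity disk is implied and not listed.
    """
    data_disks = ["".join(trio) for trio in combinations(planes, 3)]
    members = {p: [d for d in data_disks if p in d] for p in planes}
    return data_disks, members

data_disks, members = build_layout("αβγδ")
print(data_disks)
for plane, disks in members.items():
    print(f"plane {plane}: parity disk P{plane} + data disks {disks}")
```

With four planes, every plane holds three of the four data disks plus its own parity disk.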
Advantages • With n_p parity disks, we can protect n_p(n_p - 1)(n_p - 2)/6 data disks against all triple failures • 2-D organizations with the same number of parity disks could only protect n_p(n_p - 1)/2 data disks, and only against all double failures
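As a rough check of these capacities, the sketch below assumes one data disk at every intersection of three parity planes (3-D) or of two parity stripes (2-D), so the counts are binomial coefficients. The helper name `data_disks_protected` is hypothetical.

```python
from math import comb

def data_disks_protected(parity_disks, dimension):
    """Data disks placed at every intersection of `dimension` parity groups."""
    return comb(parity_disks, dimension)

for n_p in (4, 6, 7):
    d3 = data_disks_protected(n_p, 3)   # 3-D: survives all triple failures
    d2 = data_disks_protected(n_p, 2)   # 2-D: survives all double failures
    print(f"n_p = {n_p}: 3-D protects {d3} data disks, 2-D protects {d2}")
```

Under this assumption, six parity disks give the 20-data-disk 3-D array and seven parity disks give the 21-data-disk 2-D array that appear in the selected results.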
Drawback of 3-D arrays • More complex update procedure • Each time we modify a data block, we have to update three parity blocks • Not an issue for data that are rarely updated • Archives, media
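The update cost can be illustrated with a small sketch. It assumes plain XOR parity, blocks modeled as Python integers, and disk names whose characters are the three planes the disk belongs to. None of these details come from the slides, and `update_block` and `parity_is_consistent` are hypothetical helpers.

```python
from functools import reduce
from itertools import combinations
from operator import xor

planes = "αβγδ"
# Four data disks with arbitrary sample contents.
data = {"".join(t): v
        for t, v in zip(combinations(planes, 3), (0x11, 0x22, 0x44, 0x88))}
# Each plane's parity block is the XOR of the data blocks in that plane.
parity = {p: reduce(xor, (v for d, v in data.items() if p in d)) for p in planes}

def update_block(data, parity, disk, new_value):
    """Write new_value to a data disk and patch its three parity blocks."""
    delta = data[disk] ^ new_value   # bits that change
    data[disk] = new_value
    for plane in disk:               # one parity update per plane
        parity[plane] ^= delta
    return len(disk)                 # number of parity blocks touched

def parity_is_consistent(data, parity):
    """Recompute every plane's parity from scratch and compare."""
    return all(
        parity[p] == reduce(xor, (v for d, v in data.items() if p in d))
        for p in parity
    )

touched = update_block(data, parity, "αβδ", 0x5A)
print(touched, parity_is_consistent(data, parity))
```

Each write touches three parity blocks instead of the two a 2-D array would need, which is the extra cost named above.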
Handling quadruple failures • Only a few specific quadruple failures are fatal • We show that the array can tolerate all but a small fraction of quadruple failures
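One way to count the fatal failure patterns is by brute force. The sketch below is not the authors' derivation. It assumes XOR parity with one parity disk per plane, so each disk corresponds to a bitmask of the planes it belongs to, and a set of failed disks is recoverable exactly when those bitmasks are linearly independent over GF(2). All function names are hypothetical.

```python
from itertools import combinations

def disk_columns(n_planes):
    """Bitmask of parity planes for every disk (parity disks first)."""
    parity = [1 << i for i in range(n_planes)]
    data = [sum(1 << i for i in trio)
            for trio in combinations(range(n_planes), 3)]
    return parity + data

def is_recoverable(masks):
    """True if the masks are linearly independent over GF(2)."""
    basis = {}                        # highest set bit -> reduced vector
    for v in masks:
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]
        else:
            return False              # v reduced to zero: dependent set
    return True

def fatal_failures(n_planes, k):
    """Return (fatal, total) over all sets of k simultaneously failed disks."""
    sets = list(combinations(disk_columns(n_planes), k))
    fatal = sum(not is_recoverable(s) for s in sets)
    return fatal, len(sets)

for k in (3, 4):
    fatal, total = fatal_failures(6, k)
    print(f"{k} failed disks: {fatal} fatal patterns out of {total}")
```

The call uses six planes, which corresponds to an array with 6 parity disks and 20 data disks. Exhaustive enumeration is only practical for small arrays like this one.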
Selected results • Compared the MTTDL of a 3-D array with 20 data disks and 6 parity disks with those of • Two RAID arrays with 10 data disks and 3 parity disks each • 60 disks using three-way mirroring to store the equivalent content of 20 data disks • A 2-D array with 21 data disks and 7 parity disks • Under standard stochastic assumptions
System Parameters • Disk mean time to fail was assumed to be 100,000 hours (11 years and 5 months) • Corresponds to a failure rate λ of 8 to 9 percent per year • High end of failure rates observed by Schroeder and Gibson and by Pinheiro et al. • Disk repair times varied between 12 hours and one week
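The conversion between the mean time to fail and the yearly failure rate is simple arithmetic. The sketch below assumes a 365-day year and exponentially distributed lifetimes; the function name is hypothetical.

```python
from math import exp

HOURS_PER_YEAR = 8760  # 365 days; leap years ignored

def annual_failure_stats(mttf_hours):
    """Return (failure rate per year, probability of failing within a year)."""
    rate_per_year = HOURS_PER_YEAR / mttf_hours       # λ expressed per year
    prob_within_year = 1 - exp(-rate_per_year)        # exponential lifetime
    return rate_per_year, prob_within_year

rate, prob = annual_failure_stats(100_000)
print(f"MTTF = {100_000 / HOURS_PER_YEAR:.1f} years")
print(f"λ = {rate:.2%} per year, P(failure within one year) = {prob:.2%}")
```

Both figures fall within the 8 to 9 percent range quoted on the slide.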
Conclusion • 3-D RAID arrays require • Fewer parity disks than comparable RAID array organizations to achieve • Higher MTTDLs • Sole limitation is cost of updates
Work in Progress • Can we build zero-maintenance disk arrays? • Start with a 3-D RAID array • Add enough spares to last several years • Critical factor is failure rate of unused spares • Potential for one or two MS theses • Require willingness to learn Python
Our Model • Device failures are mutually independent and follow a Poisson law • A reasonable approximation • Device repairs can be performed in parallel • Device repair times follow an exponential law • Not true but fairly robust • H.-W. Kao, J.-F. Pâris, T. Schwarz, S. J., and D. D. E. Long, A Flexible Simulation Tool for Estimating Data Loss Risks in Storage Arrays, Proc. MSST Symposium, May 2013.
State Diagram • State 0 is initial state • a is the fraction of quadruple disk failures that result in a data loss
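A simplified version of such a state diagram can be evaluated in a few lines. The sketch below is deliberately pessimistic and is not the authors' model: it treats every fourth concurrent failure as a data loss (that is, a = 1), uses independent exponential failures and parallel exponential repairs, and computes the expected time to absorption of the resulting birth-death chain. The function name and parameters are hypothetical.

```python
def mttdl_pessimistic(n_disks, fatal_count, mttf_hours, mttr_hours):
    """Expected hours until `fatal_count` disks are down at the same time."""
    lam = 1.0 / mttf_hours      # per-disk failure rate
    mu = 1.0 / mttr_hours       # per-disk repair rate
    total = 0.0
    step = 0.0                  # expected time to first move from state i to i + 1
    for i in range(fatal_count):
        fail = (n_disks - i) * lam    # any of the working disks fails
        repair = i * mu               # the i failed disks are repaired in parallel
        step = (1.0 + repair * step) / fail
        total += step
    return total

HOURS_PER_YEAR = 8760
for repair_hours in (12, 168):        # 12 hours and one week
    hours = mttdl_pessimistic(26, 4, 100_000, repair_hours)
    print(f"repair time {repair_hours} h: MTTDL >= {hours / HOURS_PER_YEAR:.3g} years")
```

Because only a fraction a of quadruple failures actually causes a data loss, the values printed here are lower bounds for this simplified chain, not results from the study.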