Parity Logging: Overcoming the Small Write Problem in Redundant Disk Arrays. Daniel Stodolsky, Garth Gibson, Mark Holland. Contents: Overview of some RAID systems; Small write problem; Parity logging; Floating data and parity; Comparison between different models; Concluding remarks
Parity Logging: Overcoming the Small Write Problem in Redundant Disk Arrays • Daniel Stodolsky, Garth Gibson, Mark Holland
Contents • Overview of some RAID systems • Small write problem • Parity logging • Floating data and parity • Comparison between different models • Concluding remarks • Questions
Small Write Problem • A RAID level 5 small write may require four disk accesses: prereading the old data, writing the new data, prereading the corresponding old parity, and writing the new parity. • RAID level 5 is therefore penalized by a factor of four over nonredundant arrays for workloads of mostly small writes. • Mirrored disks are penalized only by a factor of two, since the data only needs to be written to two separate disks.
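The four accesses counted above can be sketched as a read-modify-write in Python. This is a minimal illustration, not code from the slides: the in-memory `disks` lists, the function names, and the tiny two-byte blocks are all hypothetical stand-ins for real disk I/O.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_small_write(disks, data_idx, parity_idx, block_no, new_data):
    """Read-modify-write of one block; returns the number of disk accesses."""
    old_data = disks[data_idx][block_no]        # access 1: preread old data
    old_parity = disks[parity_idx][block_no]    # access 2: preread old parity
    disks[data_idx][block_no] = new_data        # access 3: write new data
    # access 4: new parity = old parity XOR old data XOR new data
    disks[parity_idx][block_no] = xor_blocks(
        old_parity, xor_blocks(old_data, new_data)
    )
    return 4


# Hypothetical stripe: two data disks and one parity disk, one block each.
d0, d1 = b"\x0f\xf0", b"\x33\x55"
disks = [[d0], [d1], [xor_blocks(d0, d1)]]
ios = raid5_small_write(disks, data_idx=0, parity_idx=2, block_no=0,
                        new_data=b"\xaa\xbb")
```

After the call, the parity block still equals the XOR of the two data blocks, at the cost of four accesses for a single logical write.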
OLTP and Small Writes • OLTP (on-line transaction processing) systems represent a substantial segment of the secondary storage market; a banking system is an example. • OLTP systems require update-intensive database services. • The performance of OLTP systems is largely determined by small write performance.
Disk Bandwidth • The three components of disk access time are seek time, rotational positioning time, and data transfer time. • Small disk writes make inefficient use of disk bandwidth. • Random cylinder accesses move data twice as fast as random track accesses, which in turn move data ten times faster than random block accesses.
Parity Logging • A powerful mechanism for eliminating the small write penalty. • Based on the much higher disk bandwidth of large accesses compared with small ones. • A logging (journaling) technique that transforms small random accesses into large sequential accesses to the log and parity disks.
Basic Parity Logging Model • A RAID level 4 disk array with one additional disk, a log disk. • On each small write, the parity update image (the XOR of the old and new data) is held in a fault-tolerant buffer. • When enough parity update images are buffered, they are written to the end of the log on the log disk. • When the log disk fills up, the out-of-date parity and the log of parity update information are read into memory. • The out-of-date parity is updated in memory and rewritten with large sequential writes.
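The steps of the basic model can be sketched as a toy simulation. This is an illustration under stated assumptions, not the authors' implementation: the class name `BasicParityLog`, the Python lists standing in for the parity disk, the log disk, and the fault-tolerant buffer, and the capacities used below are all hypothetical.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


class BasicParityLog:
    """Toy model: one parity 'disk', one log 'disk', one in-memory buffer."""

    def __init__(self, parity, buffer_capacity, log_capacity):
        self.parity = list(parity)      # one parity block per stripe
        self.buffer = []                # stands in for the fault-tolerant buffer
        self.log = []                   # append-only log disk
        self.buffer_capacity = buffer_capacity
        self.log_capacity = log_capacity

    def small_write(self, data_disk, stripe, new_data):
        old_data = data_disk[stripe]            # preread old data
        data_disk[stripe] = new_data            # write new data in place
        # Parity update image: XOR of old and new data.
        self.buffer.append((stripe, xor_blocks(old_data, new_data)))
        if len(self.buffer) >= self.buffer_capacity:
            self.log.extend(self.buffer)        # one large sequential log write
            self.buffer.clear()
            if len(self.log) >= self.log_capacity:
                self.reintegrate()

    def reintegrate(self):
        """Apply every logged image to the stale parity, then empty the log."""
        for stripe, image in self.log:
            self.parity[stripe] = xor_blocks(self.parity[stripe], image)
        self.log.clear()


# Hypothetical array: two data disks with two one-byte stripes each.
d0 = [b"\x01", b"\x02"]
d1 = [b"\x10", b"\x20"]
plog = BasicParityLog([xor_blocks(a, b) for a, b in zip(d0, d1)],
                      buffer_capacity=2, log_capacity=4)
for i in range(4):
    plog.small_write(d0, i % 2, bytes([i + 7]))
```

With these capacities the fourth write fills the log and triggers reintegration, after which the parity again matches the XOR of the data disks. Between reintegrations the parity on disk is stale, which is why the log must be applied before a failed data disk can be reconstructed.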
Reliability of the Basic Logging Model • Data disk failure: update the parity disk, then reconstruct the lost data. • Log or parity disk failure: install a new, empty log disk (or parity disk), then reconstruct the parity.
Parity Maintenance Time Analysis (basic model vs. RAID 4) • Notation: T tracks per cylinder, V cylinders per disk, D data units per track. • Every D small writes cause one track write to the log. • Every TVD small writes cause the log disk to fill up, which triggers 3 full-disk accesses at the cylinder data rate. • So the parity maintenance for TVD small writes consumes as much disk time as TV(D/10) + 3V(T/2 × D/10) = TVD/4 random block accesses. • Result: the disk time consumed by the parity update I/Os is reduced by about a factor of eight.
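The arithmetic above can be checked with a short calculation. The sketch below assumes costs are measured in random block accesses, uses the bandwidth ratios from the Disk Bandwidth slide (a track access costs D/10 block accesses, a cylinder access costs half of T track accesses), and assumes that RAID 4 spends two block accesses on the parity disk per small write (parity preread plus parity write). That last figure is not stated on the slide; it is the assumption under which the factor of eight comes out. The geometry values are made up.

```python
from fractions import Fraction


def parity_logging_cost(T, V, D):
    """Parity-maintenance cost of T*V*D small writes, in block accesses."""
    track_write = Fraction(D, 10)                  # one track at 10x block rate
    cylinder_access = Fraction(T, 2) * track_write # one cylinder at 2x track rate
    log_writes = T * V * track_write               # one track write per D small writes
    reintegration = 3 * V * cylinder_access        # 3 full-disk accesses
    return log_writes + reintegration


def raid4_parity_cost(T, V, D):
    """Assumed RAID 4 cost: parity preread + parity write per small write."""
    return 2 * T * V * D


# Hypothetical geometry: tracks/cylinder, cylinders/disk, data units/track.
T, V, D = 14, 949, 48
logging_cost = parity_logging_cost(T, V, D)
ratio = raid4_parity_cost(T, V, D) / logging_cost
```

The ratio is independent of the geometry chosen, because both costs are proportional to the product TVD.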
Enhancing the Basic Parity Logging Model • Limitation: the basic parity logging model is impractical, since an entire disk’s capacity of random access memory is required to hold the parity while the parity updates are applied. • Enhancement (parity logging regions): divide the array into regions. • Every region is treated the same way as the entire disk in the basic model. • Each region has its own fault-tolerant buffer.
Enhancing Parity Logging Regions • Limitation: the log and parity disks may become performance bottlenecks if there are many disks in the array. • Enhancement (log and parity rotation): distribute the parity and the logs across all the disks in the array.
Enhancing Log and Parity Rotation • Limitation: the log and parity bandwidth for a particular region is still that of a single disk. • Enhancement (block parity striping): distribute the parity log for each region over multiple disks.
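Analytical Model • Notation: S is the average seek time, R the average rotational latency (half a revolution, so a full revolution takes 2R), and D the number of data units per track. • A single small write access in parity logging takes on average S + R + 2R/D + (2R - 2R/D) + 2R/D, which simplifies to S + (3 + 2/D)R. • Without the preread: S + (1 + 2/D)R. • More analysis: writing the fault-tolerant buffers to the parity log regions, and log/parity integration.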
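The two expressions can be evaluated term by term. The sketch below assumes that R is the average rotational latency (half a revolution), which is the reading under which the stated simplifications hold; the millisecond values are hypothetical and do not come from the slides.

```python
import math


def data_access_time(S, R, D, preread=True):
    """Average busy time of one small access on a data disk.

    S: average seek time, R: average rotational latency (half a revolution),
    D: data units per track, so one unit transfers in 2R/D.
    """
    transfer = 2 * R / D
    time = S + R + transfer             # seek, rotational latency, one transfer
    if preread:
        # After the preread, wait for the unit to come around again
        # (a full revolution minus one transfer), then write it.
        time += (2 * R - transfer) + transfer
    return time


# Hypothetical disk parameters, in milliseconds and data units per track.
S, R, D = 12.5, 7.0, 24
with_preread = data_access_time(S, R, D, preread=True)
without_preread = data_access_time(S, R, D, preread=False)
```

The preread adds exactly one full revolution (2R) to the access, which is why caching the data to be overwritten matters so much for small write cost.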
Parity Logging Overheads vs. RAID 5 Overhead (per small write) • Contributions to disk busy time for the example disk array (previous slide). • The extra I/O done by RAID 5 costs nearly 35 milliseconds.
Alternative Schemes: Floating Data and Parity • Organize data and parity into cylinders that contain either data only or parity only. • Maintain a single track of empty space per cylinder.
Floating Data and Parity (analysis) • For RAID 5, the busy time for each data update and each parity update is S + R + 2R/D + (2R - 2R/D) + 2R/D. • With the new technique, the (2R - 2R/D) term is replaced by a head switch H and a short rotational delay (0.76 data units for the sample array mentioned before). • A small random write in floating data and parity therefore takes 2S + (2 + 11.04/D)R + 2H. • This is close to mirroring performance if D is large and H is small.
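The total above can be rebuilt from its parts. In the sketch below, the RAID 5 and floating expressions follow the slide; the mirrored-write expression (two disks, each doing a plain write with no preread) is an assumption added only to illustrate the "close to mirroring" remark, and the parameter values are hypothetical.

```python
import math


def raid5_small_write(S, R, D):
    """Data update plus parity update, each a read-rotate-write access."""
    one = S + R + 2 * R / D + (2 * R - 2 * R / D) + 2 * R / D
    return 2 * one


def floating_small_write(S, R, D, H, gap_units=0.76):
    """Same two accesses, with the near-full rotation replaced by a head
    switch H and a short rotational delay of gap_units data units."""
    unit = 2 * R / D                    # transfer time of one data unit
    one = S + R + unit + H + gap_units * unit + unit
    return 2 * one


def mirrored_small_write(S, R, D):
    """Assumed mirroring cost: a plain write on each of two disks."""
    return 2 * (S + R + 2 * R / D)


# Hypothetical parameters (milliseconds, data units per track).
S, R, D, H = 12.5, 7.0, 24, 1.0
raid5 = raid5_small_write(S, R, D)
floating = floating_small_write(S, R, D, H)
mirrored = mirrored_small_write(S, R, D)
```

Under these assumptions the gap between floating data and parity and mirroring is 2H + 7.04R/D, which shrinks as D grows and H falls.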
Model Estimates (as predicted by analysis) • I/Os per second per disk
Concluding Remarks • Parity logging achieves better performance than RAID level 5 arrays. • When data must be preread before being overwritten, parity logging is comparable to floating data and parity. • Its performance is superior to mirroring and to floating data and parity when the data to be overwritten is cached.
Questions • What is parity logging? • Describe the general technique of parity logging. • What is the small write problem, and why is it so important? • What are the advantages and disadvantages of floating data and parity?