Parity Logging: Overcoming the Small Write Problem in Redundant Disk Arrays

Daniel Stodolsky, Garth Gibson, Mark Holland

Contents • Overview of some RAID systems • Small write problem • Parity logging • Floating data and parity • Comparison between different models • Concluding remarks • Questions

RAID systems considered in this paper.

Small Write Problem • A RAID level 5 small write may require four disk accesses: prereading the old data, writing the new data, prereading the corresponding old parity, and writing the new parity. • RAID level 5 is therefore penalized by a factor of four relative to nonredundant arrays for workloads of mostly small writes. • Mirrored disks are penalized only by a factor of two, since each block of data simply needs to be written to two separate disks.
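
The four accesses listed above can be sketched in a few lines of Python. This is only an illustrative model, not code from the paper: blocks are plain integers, the parity is kept in a separate list rather than rotated across disks as real RAID 5 does, and the names `stripe_parity`, `raid5_small_write` and `mirrored_small_write` are hypothetical.

```python
from functools import reduce
from operator import xor

def stripe_parity(data_disks, stripe):
    """XOR of the blocks that all data disks hold at this stripe."""
    return reduce(xor, (disk[stripe] for disk in data_disks), 0)

def raid5_small_write(data_disks, parity, disk, stripe, new_value):
    """Read-modify-write update of one block; returns the disk accesses used."""
    accesses = 0
    old_value = data_disks[disk][stripe]   # 1. preread old data
    accesses += 1
    data_disks[disk][stripe] = new_value   # 2. write new data
    accesses += 1
    old_parity = parity[stripe]            # 3. preread old parity
    accesses += 1
    parity[stripe] = old_parity ^ old_value ^ new_value  # 4. write new parity
    accesses += 1
    return accesses

def mirrored_small_write(primary, mirror, stripe, new_value):
    """Mirroring writes the same block to two disks and needs no preread."""
    primary[stripe] = new_value
    mirror[stripe] = new_value
    return 2

# Toy array: three data disks with two stripes each (values are arbitrary).
data_disks = [[1, 2], [3, 4], [5, 6]]
parity = [stripe_parity(data_disks, s) for s in range(2)]
used = raid5_small_write(data_disks, parity, disk=1, stripe=0, new_value=9)
```

After the call, `used` is 4 and `parity[0]` still equals the XOR of the data blocks in stripe 0, which is the invariant the two extra parity accesses pay for.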

OLTP and Small Writes • OLTP (on-line transaction processing) systems represent a substantial segment of the secondary storage market; a banking system is an example. • OLTP systems require update-intensive database services. • OLTP performance is largely determined by small write performance.

Disk Bandwidth • The three components of disk access time are seek time, rotational positioning time, and data transfer time. • Small disk writes make inefficient use of disk bandwidth. • Random cylinder accesses move data twice as fast as random track accesses, which in turn move data ten times faster than random block accesses.

Parity Logging • A powerful mechanism for eliminating the small write penalty. • Based on the much higher disk bandwidth of large accesses compared with small ones. • Applies logging (journaling) to transform small random accesses into large sequential accesses to the log and parity disks.

Basic Parity Logging Model • A RAID level 4 disk array with one additional disk, the log disk. • Each small write produces a parity update image (the XOR of the old and new data), which is held in a fault-tolerant buffer. • When enough parity update images are buffered, they are written to the end of the log on the log disk. • When the log disk fills up, the out-of-date parity and the log of parity update information are read into memory. • The out-of-date parity is updated (in memory) and rewritten with large sequential writes.
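
The steps above can be sketched as a small in-memory model. It is a minimal sketch under stated assumptions, not the authors' implementation: blocks are integers, disks are Python lists, and the class name `ParityLoggingArray` and the parameters `buffer_cap` and `log_cap` are hypothetical stand-ins for the buffer size and log disk capacity.

```python
class ParityLoggingArray:
    """Toy model: data disks, a parity disk, a buffer, and a log disk."""

    def __init__(self, n_data, n_stripes, buffer_cap, log_cap):
        self.data = [[0] * n_stripes for _ in range(n_data)]
        self.parity = [0] * n_stripes   # may be out of date between integrations
        self.buffer = []                # stands in for the fault-tolerant buffer
        self.log = []                   # stands in for the log disk
        self.buffer_cap = buffer_cap    # hypothetical: images per buffer flush
        self.log_cap = log_cap          # hypothetical: images the log disk holds

    def small_write(self, disk, stripe, new_value):
        old_value = self.data[disk][stripe]                 # preread old data
        self.data[disk][stripe] = new_value                 # write new data
        self.buffer.append((stripe, old_value ^ new_value)) # parity update image
        if len(self.buffer) >= self.buffer_cap:
            self.flush_buffer()

    def flush_buffer(self):
        """One large sequential write of buffered images to the end of the log."""
        self.log.extend(self.buffer)
        self.buffer.clear()
        if len(self.log) >= self.log_cap:
            self.integrate_log()

    def integrate_log(self):
        """Apply every logged image to the parity, then start an empty log."""
        for stripe, image in self.log:
            self.parity[stripe] ^= image
        self.log.clear()

    def true_parity(self, stripe):
        value = 0
        for disk in self.data:
            value ^= disk[stripe]
        return value

array = ParityLoggingArray(n_data=3, n_stripes=4, buffer_cap=2, log_cap=4)
array.small_write(0, 1, 5)
array.small_write(1, 1, 9)    # buffer is full: flushed to the log
array.small_write(2, 0, 7)    # stays in the buffer
stale = list(array.parity)    # parity has not been touched yet
array.flush_buffer()
array.integrate_log()         # forced here; normally triggered by a full log
```

Until `integrate_log` runs, the parity list is stale and the log plus buffer carry the difference; afterwards each parity block again equals the XOR of its stripe.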

Basic Parity Logging Model

Reliability of Basic Logging Model • Data disk failure: update the parity disk by applying the logged updates, then reconstruct the lost data. • Log or parity disk failure: install a new, empty log disk (or parity disk), then reconstruct the parity.
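
Both recovery paths can be illustrated with XOR on toy data. The sketch below assumes that the list `log` holds every parity update image not yet applied to the parity disk (including any still in the fault-tolerant buffer) and that parity is rebuilt from the data disks; the function names are hypothetical.

```python
from functools import reduce
from operator import xor

def apply_log(parity, log):
    """Return an up-to-date copy of the parity with all logged images applied."""
    updated = list(parity)
    for stripe, image in log:
        updated[stripe] ^= image
    return updated

def reconstruct_data_disk(surviving_disks, parity, log):
    """Data disk failure: bring parity up to date, then XOR with the survivors."""
    current = apply_log(parity, log)
    return [
        reduce(xor, (disk[s] for disk in surviving_disks), current[s])
        for s in range(len(current))
    ]

def rebuild_parity(data_disks):
    """Log or parity disk failure: recompute parity from the data disks."""
    n_stripes = len(data_disks[0])
    return [reduce(xor, (disk[s] for disk in data_disks), 0) for s in range(n_stripes)]

# Toy state: all blocks started at 0, then three small writes were logged.
data = [[6, 0], [3, 0], [0, 8]]
stale_parity = [0, 0]
log = [(0, 6), (0, 3), (1, 8)]   # (stripe, old XOR new) for each write

lost = reconstruct_data_disk([data[0], data[2]], stale_parity, log)  # disk 1 failed
fresh_parity = rebuild_parity(data)
```

Here `lost` reproduces the contents of the failed disk, and `fresh_parity` matches what applying the log to the stale parity would have produced.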

Tracks, Cylinders, and Sectors

Parity Maintenance Time Analysis (Basic Model vs. RAID 4) • Every D small writes cause one track write to the log. • Every TVD small writes fill the log disk, which triggers 3 full-disk accesses at the cylinder data rate. • Parity maintenance for TVD small writes therefore consumes as much disk time as TV(D/10) + 3V(T/2 x D/10) = TVD/4 random block accesses. • Result: the disk time consumed by the parity update I/Os is reduced by about a factor of eight.
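
The arithmetic behind the factor of eight can be checked with exact fractions. The sketch assumes D is blocks per track, T is tracks per cylinder and V is cylinders per disk (consistent with the formula, but not spelled out on the slide), that costs are measured in random block accesses, and that RAID 4 spends two accesses per small write on parity (preread plus write). The sample geometry is arbitrary.

```python
from fractions import Fraction

def parity_logging_cost(T, V, D):
    """Parity-maintenance time for T*V*D small writes, in block-access units."""
    track_write = Fraction(D, 10)          # D blocks at 10x the block-access rate
    cylinder_access = T * track_write / 2  # cylinder rate is twice the track rate
    log_writes = T * V * track_write       # one track write per D small writes
    integration = 3 * V * cylinder_access  # read log, read parity, write parity
    return log_writes + integration

def raid4_parity_cost(T, V, D):
    """Assumed baseline: parity preread plus parity write per small write."""
    return 2 * T * V * D

T, V, D = 14, 949, 24                      # hypothetical disk geometry
logging_cost = parity_logging_cost(T, V, D)
speedup = raid4_parity_cost(T, V, D) / logging_cost
```

With any positive geometry the logging cost equals TVD/4 and the ratio to the assumed RAID 4 cost of 2TVD is exactly 8.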

Enhancing the Basic Parity Logging Model • Limitation: the basic parity logging model is completely impractical, since random access memory equal to an entire disk's capacity is required to hold the parity while the parity updates are applied. • Enhancement (parity logging regions): divide the array into regions. • Each region is treated the same way as an entire disk in the basic model. • Each region has its own fault-tolerant buffer.

Parity Logging Regions

Enhancing Parity Logging Regions • Limitation: the log and parity disks may become performance bottlenecks if there are many disks in the array. • Enhancement (log and parity rotation): distribute the parity and logs across all the disks in the array.

Log and parity Rotation

Enhancing Log and Parity Rotation • Limitation: the log and parity bandwidth for a particular region is still that of a single disk. • Enhancement (block parity striping): distribute the parity log for each region over multiple disks.

Block Parity Striping

Analytical Model • A single small write access in parity logging takes, on average, a time that simplifies to S + (3 + 2/D)R. • Without the preread, this becomes S + (1 + 2/D)R. • Further analysis covers: • Writing fault-tolerant buffers to parity log regions. • Integrating the log into the parity.
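
The two closed forms can be evaluated term by term. The sketch assumes S is the average seek time, R the average rotational latency (half a revolution, so a full revolution is 2R) and D the number of data units per track, so one data unit transfers in 2R/D; the breakdown with preread follows the read-rotate-write expression given later for RAID 5. The sample values are hypothetical, not the simulation parameters.

```python
import math

def small_write_time(S, R, D, preread=True):
    """Busy time of the data-disk access for one small write."""
    transfer = 2 * R / D                 # time to pass over one data unit
    if not preread:
        return S + R + transfer          # seek, rotate, write
    wait_for_return = 2 * R - transfer   # rest of the revolution after the read
    return S + R + transfer + wait_for_return + transfer

S, R, D = 12.0, 7.0, 14                  # hypothetical: ms, ms, units per track
with_preread = small_write_time(S, R, D)
without_preread = small_write_time(S, R, D, preread=False)
```

For these values the two results agree with S + (3 + 2/D)R and S + (1 + 2/D)R respectively, and the gap between them is one full revolution, 2R.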

Simulation Parameters

Parity Logging Overheads vs. RAID 5 Overhead (per small write) • Contributions to disk busy time for the example disk array (previous slide). • The extra I/O done by RAID 5 costs nearly 35 milliseconds.

Alternative Schemes: Floating Data and Parity • Organize data and parity into cylinders that contain either data only or parity only. • Maintain a single track of empty space per cylinder.

Floating Data and Parity

Floating Data and Parity (Analysis) • For RAID 5, the busy time for each of the data and parity updates is S + R + 2R/D + (2R - 2R/D) + 2R/D. • With the new technique, the (2R - 2R/D) term is replaced by a head switch and a short rotational delay (0.76 data units for the sample array mentioned earlier). • A small random write with floating data and parity therefore takes 2S + (2 + 11.04/D)R + 2H. • This is close to mirroring performance if D is large and H is small.
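
The closed form can be re-derived by substituting the head switch and short delay into the RAID 5 expression and doubling for the data and parity updates. In this sketch S, R and D are read as average seek time, average rotational latency and data units per track, H as the head switch time, and the 0.76-unit delay is taken from the bullet above; the numeric values of S, R, D and H are hypothetical.

```python
import math

def raid5_update_time(S, R, D):
    """One read-rotate-write update (data or parity) in RAID 5."""
    unit = 2 * R / D                       # transfer time of one data unit
    return S + R + unit + (2 * R - unit) + unit

def floating_update_time(S, R, D, H, delay_units=0.76):
    """Same update, with the rotation wait replaced by a head switch
    plus a short rotational delay measured in data units."""
    unit = 2 * R / D
    return S + R + unit + (H + delay_units * unit) + unit

def floating_small_write_time(S, R, D, H):
    """A small write updates both the data block and its parity block."""
    return 2 * floating_update_time(S, R, D, H)

S, R, D, H = 12.0, 7.0, 14, 1.0            # hypothetical sample values
floating_total = floating_small_write_time(S, R, D, H)
raid5_total = 2 * raid5_update_time(S, R, D)
closed_form = 2 * S + (2 + 11.04 / D) * R + 2 * H
```

Each update contributes S + (1 + 5.52/D)R + H, so two of them give the 11.04/D coefficient; with the sample values the floating total stays well below the RAID 5 total.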

Model Estimates (as predicted by analysis): I/Os per second per disk

Response Times and Utilization.

Response Time Standard Deviation

Concluding Remarks • Parity logging achieves better performance than RAID level 5 arrays. • When data must be preread before being overwritten, parity logging is comparable to floating data and parity. • Parity logging is superior to both mirroring and floating data and parity when the data to be overwritten is cached.

Questions • What is parity logging? • Describe the general technique of parity logging. • What is the small write problem, and why is it so important? • What are the advantages and disadvantages of floating data and parity?
