Automatic RAID Construction. Ba-Quy Vuong (Bryan) and Yiying Zhang Department of Computer Sciences University of Wisconsin-Madison. Outline. Introduction Architecture and Design Implementation Performance Evaluation Conclusion & Future Work. What is RAID?.
Automatic RAID Construction Ba-Quy Vuong (Bryan) and Yiying Zhang Department of Computer Sciences University of Wisconsin-Madison
Outline • Introduction • Architecture and Design • Implementation • Performance Evaluation • Conclusion & Future Work
What is RAID? • Redundant Arrays of Inexpensive Disks • Purposes: • Reliability • Performance
RAID Implementation • Hardware RAID: • Using dedicated hardware to control the disk array • Host independent • Software RAID: • Using a software layer sitting above the disk drivers to control the disk array • Host dependent
Problems with Software RAID • There are many ways to build RAID systems, including: • Different checksum-based schemes • Different parity-based schemes • Not Flexible: Each RAID level requires a specific RAID driver • Not Robust: Writing a new RAID driver is time-consuming and may introduce many bugs
Our Solution: Automatic RAID Construction • Approach: • A way to describe checksum and parity-based schemes • Mapping the specified scheme to a RAID driver • Advantages: • Flexibility • Robustness
Outline • Introduction • Architecture and Design • Implementation • Performance Evaluation • Conclusion & Future Work
Architecture • Design Consideration • Parity on top of Checksum • Checksum on top of Parity
Architecture • Example: • 3-disk RAID 5 • Mirroring checksum
Automatic Parity • Goal: Allow any parity scheme • Two data structures • Layout matrix: How blocks are laid out • The whole matrix corresponds to a stripe • Each row corresponds to one strip • Zeros mean data blocks, ones mean parity blocks • Number of columns is the number of disks • Figure: example layout matrices for 4-disk RAID 0+1, 4-disk RAID 4, and 4-disk RAID 5
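A minimal Python sketch of how a layout matrix could drive block placement. The matrix values are assumptions chosen to match the description (zeros for data, ones for parity, one column per disk); they are not taken from the slide's figures. Treating the mirror copies in RAID 0+1 as parity blocks is also an assumption, and `data_positions` and `locate` are hypothetical helper names.

```python
# Assumed layout matrices: 0 = data block, 1 = parity block.
# One column per disk; the whole matrix describes one stripe.
RAID4_LAYOUT = [
    [0, 0, 0, 1],
]
RAID5_LAYOUT = [
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
RAID01_LAYOUT = [
    [0, 0, 1, 1],  # mirror copies treated as "parity" blocks (assumption)
]


def data_positions(layout):
    """Return (row, disk) for every data block in a stripe, in row-major order."""
    return [(r, d)
            for r, row in enumerate(layout)
            for d, cell in enumerate(row)
            if cell == 0]


def locate(layout, logical_block):
    """Map a logical data block number to (stripe, row, disk)."""
    positions = data_positions(layout)
    stripe, offset = divmod(logical_block, len(positions))
    row, disk = positions[offset]
    return stripe, row, disk
```

Under these assumed matrices, `locate(RAID5_LAYOUT, 5)` lands on disk 3 of row 1, because row 1 keeps its parity block on disk 2 and the data skips over it.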
Automatic Parity • Two data structures • Parity matrix: What data blocks contribute to a parity block • #rows: #parity blocks in one stripe • #columns: #data blocks in one stripe • An entry of one at row i, column j means that data block j is used to calculate parity block i • Figure: example parity matrices for 4-disk RAID 4, 4-disk RAID 0+1, and 4-disk RAID 5
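The parity matrix can be read as a recipe: row i lists the data blocks that feed parity block i. The Python sketch below assumes XOR as the parity function, which the slides do not state, and uses assumed matrices for 4-disk RAID 4 and RAID 0+1. With a single one per row, XOR reduces to a plain copy, which is how mirroring fits the same scheme. `compute_parity` and `rebuild` are hypothetical names.

```python
# Assumed parity matrices: rows = parity blocks, columns = data blocks.
RAID4_PARITY = [[1, 1, 1]]          # one parity block over three data blocks
RAID01_PARITY = [[1, 0], [0, 1]]    # each "parity" block mirrors one data block


def xor_blocks(blocks, size):
    """XOR equally sized byte blocks together."""
    out = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def compute_parity(parity_matrix, data_blocks):
    """Compute every parity block of a stripe from its data blocks."""
    size = len(data_blocks[0])
    return [
        xor_blocks([data_blocks[j] for j, used in enumerate(row) if used], size)
        for row in parity_matrix
    ]


def rebuild(parity_matrix, data_blocks, parity_blocks, lost):
    """Rebuild data block `lost` from the first parity row that covers it."""
    for i, row in enumerate(parity_matrix):
        if row[lost]:
            survivors = [data_blocks[j]
                         for j, used in enumerate(row)
                         if used and j != lost]
            return xor_blocks(survivors + [parity_blocks[i]],
                              len(parity_blocks[i]))
    raise ValueError("no parity block covers data block %d" % lost)
```

Because reconstruction only reads the matrix, the same two functions serve any scheme that can be written as rows of contributing data blocks.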
Automatic Checksum - Goals • Checksum over data and parity blocks • Flexible number of blocks as a checksum unit • Flexible checksum size • Flexible functions • Flexible locations
Automatic Checksum - Design • User specified parameters: • # of blocks as a checksum unit • Checksum size for each block • Checksum function • Example: • 3 blocks as a checksum unit • 1 block for checksums • One more level mapping
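The extra level of mapping in the example (3 blocks per checksum unit, 1 block for checksums) might look like the Python sketch below. It assumes the checksum block sits directly after the data blocks of its unit and that checksums are packed in order inside it; the slides do not specify the placement, and `map_block` and `checksum_slot` are hypothetical names.

```python
def map_block(logical, unit_blocks=3, checksum_blocks=1):
    """Map a logical block to (physical block, checksum block of its unit).

    Assumed placement: every unit is `unit_blocks` data blocks followed by
    `checksum_blocks` checksum blocks.
    """
    unit, offset = divmod(logical, unit_blocks)
    base = unit * (unit_blocks + checksum_blocks)
    return base + offset, base + unit_blocks


def checksum_slot(logical, unit_blocks=3, checksum_size=4):
    """Byte offset of a block's checksum inside its unit's checksum block."""
    return (logical % unit_blocks) * checksum_size
```

For instance, logical block 3 starts the second unit, so it maps to physical block 4 and its checksum lives in physical block 7.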
Outline • Introduction • Architecture and Design • Implementation • Performance Evaluation • Conclusion & Future Work
Implementation • RAID driver is implemented as a device driver in Linux • Checksums and parities are specified by users • Checksum functions • Provided: sum, hash-based
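A rough Python sketch of the two kinds of checksum function named above. The actual driver is a Linux device driver, so this only illustrates the idea: the 4-byte default size, the use of SHA-256 as the hash, and the `CHECKSUMS` registry are all assumptions, not details from the slides.

```python
import hashlib


def sum_checksum(block, size=4):
    """Sum of all bytes, truncated to `size` bytes (little-endian)."""
    return (sum(block) % (1 << (8 * size))).to_bytes(size, "little")


def hash_checksum(block, size=4):
    """First `size` bytes of a SHA-256 digest (the hash choice is an assumption)."""
    return hashlib.sha256(block).digest()[:size]


# Hypothetical registry so a user can select a function by name.
CHECKSUMS = {"sum": sum_checksum, "hash": hash_checksum}


def verify(name, block, stored, size=4):
    """Return True if `stored` matches the checksum recomputed over `block`."""
    return CHECKSUMS[name](block, size) == stored
```

A plain sum is cheap but cannot detect reordered bytes: `sum_checksum(b"ab")` equals `sum_checksum(b"ba")`. That is one reason to offer a hash-based alternative.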
Implementation • Memory-based version • Uses each memory chunk as a disk • Easy to build and debug • No significant effect on the overall code • Disk-based version • Uses real disks • Communicates with disk drivers through bio structure • Problems of synchronization due to asynchronous IOs
Outline • Introduction • Architecture and Design • Implementation • Performance Evaluation • Conclusion & Future Work
Performance Evaluation: Setup • Host: VMware, Fedora 8, Intel Core 2 Duo 2.2GHz, 1GB RAM • Memory-based • Simulating disk delay • Each low-level disk read: 15ms • Each low-level disk write: 17ms • Simulating disk failure • Unable to read (20%) • Read inconsistency (20%)
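One way to picture the memory-based setup is a small simulated disk, sketched here in Python. It assumes the 20% figures are independent per-read probabilities, charges the delays to a virtual clock instead of sleeping, and models read inconsistency as a flipped first byte. None of these details come from the slides, and `SimulatedDisk` is a hypothetical name.

```python
import random


class SimulatedDisk:
    """In-memory disk with simulated latency and read faults (illustrative only)."""

    READ_DELAY_MS = 15   # per low-level read, from the setup above
    WRITE_DELAY_MS = 17  # per low-level write, from the setup above

    def __init__(self, num_blocks, block_size=4096,
                 fail_rate=0.2, corrupt_rate=0.2, seed=0):
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.fail_rate = fail_rate        # assumed: chance a read fails outright
        self.corrupt_rate = corrupt_rate  # assumed: chance a read returns bad data
        self.rng = random.Random(seed)
        self.elapsed_ms = 0               # virtual clock instead of real sleeping

    def write(self, n, data):
        self.elapsed_ms += self.WRITE_DELAY_MS
        self.blocks[n] = bytes(data)

    def read(self, n):
        self.elapsed_ms += self.READ_DELAY_MS
        roll = self.rng.random()
        if roll < self.fail_rate:
            raise IOError("simulated read failure on block %d" % n)
        data = self.blocks[n]
        if roll < self.fail_rate + self.corrupt_rate:
            # Simulated read inconsistency: flip the first byte.
            return bytes([data[0] ^ 0xFF]) + data[1:]
        return data
```

A checksum layer on top of such a disk would catch the corrupted reads, and a parity layer would rebuild blocks after failed ones.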
Performance Evaluation: Settings • Evaluation settings • With and without reconstruction • Different layouts, parity logics, and checksum functions • Different workloads • Systems: • System 1: 4-disk no parity, no checksum • System 2: 4-disk RAID 0+1 with hash-based checksum • System 3: 4-disk RAID 0+1 with sum checksum • System 4: 4-disk RAID 5 with hash-based checksum • System 5: 4-disk RAID 5 with sum checksum • Workload: • reading, writing 30KB files • mkfs, mount
Outline • Introduction • Architecture and Design • Implementation • Performance Evaluation • Conclusion & Future Work
Conclusion • Why automatic RAID? • Flexible vs. fixed RAID drivers • Robustness • Approach • Automatic Parity with two matrices • Automatic Checksum with user-defined parameters • Lessons learned • Performance is a big issue • Disk-based RAID is much harder to implement than memory-based RAID
Future Work • Complete the disk-based version • Improve the performance • Check for input correctness • Extend the parity and checksum layers to handle more schemes