180 likes | 194 Views
Data Integrity and Protection. Jiwoong Park(jwpark@dcslab.snu.ac.kr) School of Computer Science and Engineering Seoul National University. Overview. Disk Failure Modes Fail-Stop Fail-Partial Types of Failures Latent-Sector Error ( LSE ) Block Corruption Solutions
E N D
Data Integrity and Protection Jiwoong Park(jwpark@dcslab.snu.ac.kr) School of Computer Science and Engineering Seoul National University
Overview • Disk Failure Modes • Fail-Stop • Fail-Partial • Types of Failures • Latent-Sector Error (LSE) • Block Corruption • Solutions • Redundancy Mechanism for Recovery • Checksum for Error Detection • Block Corruption • Misdirected Write • Lost Write • Overhead • Space / Time
Disk Failure Modes New States! • Fail-Stop • The entire disk is working • The entire disk fails completely The Fail-Stop model cannot describe the modern view of disk failure !! • Fail-Partial • The entire disk is working • The entire disk fails completely • The disk is seemingly working • but have one or more inaccessible blocks • but hold the wrong contents
Types of Failures: Latent-Sector Error (LSE) Head Crash Source: http://en.wikipedia.org/wiki/Head_crash (retrieved on 2015/06/07) Cosmic Rays Source: http://apod.nasa.gov/apod/ap060814.html (retrieved on 2015/06/07) • Latent-Sector Error (LSE) • Cause • Disk sectors have been damaged in some way • Head Crash • Can damage the surface, making the bits unreadable • Cosmic Rays • Can flip bits, leading to incorrect contents • Detection • Easily detected with in-diskError Correcting Code (ECC) • The disk returns an error in case of LSE • Recovery • Can be solved by redundancy mechanism
Types of Failures: Block Corruption Wrong Location Source: http://www.derby-solicitors.co/consumer-rights/faulty-goods/ (retrieved on 2015/06/07) • Block Corruption • Cause • Disk blocks become corrupt in a way no detectable by the disk itself • Buggy Disk Firmware • Can write a block to the wronglocation • Faulty Bus • Can transfer the corrupt data • Detection • Not detectable by the disk itself (Silent Faults) • Can be solved by checksum • Recovery • Can be solved by redundancy mechanism
Solutions: Redundancy Mechanism for Recovery Disk Fail Recovery D3 = D1 XOR D2 XOR P1 P2 = D4 XOR D5 XOR D6 D8 = D7 XOR P3 XOR D9 D11 = P4 XOR D10 XOR D12 D1 D2 D3 D3 P1 LSE D4 D5 P2 P2 D6 D5 D7 P3 D8 D8 D9 P4 P10 P11 P11 P12 Physical Disk 1 Physical Disk 2 Physical Disk 3 Physical Disk 4 • How to Recover from Disk Failures? • Redundancy Mechanism • Example • Mirrored RAID (RAID-4, RAID-5) • Can reconstruct the block from the other blocks in the parity group • Problem • Recovery fails when full-disk faults and LSEs occur at the same time • Solution • RAID-DP • Two parity disks instead of one + Write Anywhere File Layout (WAFL)
Solutions: Checksum for Error Detection • How to Detect Silent Faults? • Checksum • A primary mechanism used by modern storage systems for data integrity • The result of the checksum function that • Take a chunk of dataas input • Compute a function over the data • Produce a small summary of the contents of the data • How to Use Checksum? • When reading a block D • Read the stored checksum • Compute the checksum over the retrieved block D • Compare the stored checksum and the computed checksum • If equal, (No Error) • Return the data to the user • Else if not equal, (data corruption) • Use redundant mechanism or return an error
Common Checksum Function Two-bit Corruption 0011 1011 1110 0100 0110 1010 1100 0000 0101 0001 1110 1011 1110 0100 1111 1110 1100 1000 0010 1111 0100 1010 1100 0110 1100 1001 0011 0110 1101 0010 1010 0110 1011 1101 XOR 0010 0000 0000 0001 1011 1001 0100 0000 0011 Detection Failed • XOR-based Checksum • Basic Idea • XOR each chunk of the data block being checksummed • Produce a single value • Example • 4-byte checksum over a block of 16 bytes • Limitation • Cannot detect two bits corruptions in the same position within each checksummed unit
Common Checksum Function 2’s Complement Conversion ~X + 1 0011 1011 1110 0100 1100 0100 0001 1011 0110 1010 1100 0000 1001 0101 0011 1111 0101 0001 1110 1011 1010 1110 0001 0100 1110 0100 1111 1110 0001 1011 0000 0001 0011 0111 1101 0000 1100 1000 0010 1111 1011 0101 0011 1001 0100 1010 1100 0110 1100 1001 0011 0110 0011 0110 1100 1001 1101 0010 1010 0110 0011 1110 0110 1010 ADD 1 1110 0001 1101 1110 1000 1110 0000 0001 • Addition-based Checksum • Basic Idea • Perform 2’s complement addition over each chunk of the data • Produce a single value, ignoring overflow • Example • 4-byte checksum over a block of 16 bytes • Limitation • Not good if the data is shifted
Common Checksum Function Initialize: s1 = s2 = 0 Start More data? Yes s1 = (s1 + di) mod 255 s2 = (s2 + s1) mod 255 No 16-bit checksum s2 s1 Stop Output: • Fletcher Checksum • Basic Idea • Compute two check bytes, s1 and s2 • s1 = (s1 + di) mod 255 • s2 = (s2 + s1) mod 255 • Example • 8-bit Fletcher Checksum • Can detect all single-bit errors, all double-bit errors, a large percentage of burst errors
Common Checksum Function Input • 0 1 0 1 0 1 1 1 0 0 0 0 0 0 0 0 • % 1 0 0 0 0 0 1 1 1 • = 0 1 0 1 0 1 1 1 0 0 0 0 0 0 0 0 • % 1 0 0 0 0 0 1 1 1 • = 0 0 0 1 0 1 1 0 1 1 0 0 0 0 0 0 • % 1 0 0 0 0 0 1 1 1 • = 0 0 0 1 0 1 1 0 1 1 0 0 0 0 0 0 • % 1 0 0 0 0 0 1 1 1 • = 0 0 0 0 0 1 1 0 1 0 1 1 0 0 0 0 • % 1 0 0 0 0 0 1 1 1 • = 0 0 0 0 0 1 1 0 1 0 1 1 0 0 0 0 • % 1 0 0 0 0 0 1 1 1 • = 0 0 0 0 0 0 1 0 1 0 1 0 1 1 0 0 • % 1 0 0 0 0 0 1 1 1 • = 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 • % 1 0 0 0 0 0 1 1 1 • = 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 Divisor Output • Cyclic Redundancy Check (CRC) • Basic Idea • Treat the data block as if it is a large binary number • Divide it by an agreed upon value (k) • Produce a single value • The remainder of this division • Example • CRC-8-ATM with Polynomial x8+x2+x+1 • 1x8+0x7+0x6+0x5+0x4+0x3+1x2+1x1+1x0 • Polynomial (Divisor) = 100000111(2) • Input Di = 01010111(2) • Output CRC(Di) = 10100010(2)
Good Checksum Function TNSTAAFL Source: http://www.hagencartoons.com/ (retrieved on 2015/06/07) • Collision • Two data blocks with non-identical contents can have identical checksums • There’s No Such Thing As A Free Lunch (TNSTAAFL) • The more protection you get, the costlier it is • Good Checksum Function • Easy to compute • Lower chance of collisions
Checksum Layout D0 D1 D2 D3 D4 C[D0] C[D1] C[D2] C[D3] C[D4] C[D0] C[D1] C[D2] C[D3] C[D4] D0 D1 D2 D3 D4 • Basic Approach • Simply store a checksum with each disk sector (or block) • Problem • Disks only can write in sector-sized chunks (512 bytes) • Solution • Format the drive with 520-byte sectors • Single write is needed to overwrite block D • Store the n checksums together in a sector, followed by n data blocks • One read and two writes are needed to overwrite block D • A Read for stored checksum + Writes for data and checksum
Misdirected Writes C[D0] Disk=1 Block=0 D0 C[D1] Disk=1 Block=1 D1 C[D2] Disk=1 Block=2 D2 Disk 1 C[D0] Disk=0 Block=0 C[D1] Disk=0 Block=1 C[D2] Disk=0 Block=2 D0 D1 D2 Disk 0 • Cause • Write the data to disk correctly, but in the wrong location • Solution • Add a little more information to each checksum • Physical Identifier of the disk and sector number of the block
Lost Writes • Cause • The device informs the host that a write has completed but in fact it hasn’t • Previous approaches cannot help to detect lost writes because • The old block likely has a matching checksum • The physical ID will also be correct • Solution • Write Verify (Read-After-Write) • Immediately read back the data after a write • Quite slow, doubling the number of I/Os needed to complete a write • Checksum in elsewhere • Used in Zettabyte File System (ZFS) • Store checksum in inode / indirect block for every block included within a file • Can fail iffthe writes to both the inode and the data are lostsimultaneously
Scrubbing • Periodical checking • Disk Scrubbing • Periodicallyread through every block of the system • Check if checksums are still valid • Schedule scans on a nightly or weekly basis Source: http://stockfresh.com/royalty-free-stock-photos/dozing (retrieved on 2015/06/07) • When Checksum Actually Get Checked • When Accessed by Applications • Most data is rarely accessed, remaining unchecked • Unchecked data is problematic for a reliable storage system
Overheads of Checksumming • Space • Disk • The space for stored checksum in the disk cannot be used for user data • Memory • Accessing data needs room in memory for both checksums and the data itself • Time • CPU • CPU must compute the checksum over each block • Whenever the data is stored / accessed • Combined copying / checksumming can be quite effective • I/O • Extra I/Os to access the checksums stored distinctly from the data • Can be reduced by design • Extra I/Os needed for background scrubbing • Can limit the impact by controlling the start time of disk scrubbing (e.g., midnight)
Summary • Disk Failure • Latent-Sector Errors (LSEs) are easily detectable • Silent faults are hard to detect • Block corruption, Misdirected Writes, Lost Writes • Solution • Redundancy Mechanism • Checksum • Time overheads can be noticeable • Space overheads are negligible • Best Checksum? • It depends • Addition checksum for very lightweight ones • Fletcher checksum for most intermediate complexity ones • CRC for ones that needs the best error detection with more computational cost