TRAP-Array: A Disk Array Architecture Providing Timely Recovery to Any Point-in-time
Authors: Qing Yang, Weijun Xiao, Jin Ren (University of Rhode Island)
Presented by: Anuradharthi T.
Outline
• Introduction
• Background
• Related Work
• TRAP-4 Architecture
• Results
• Conclusion
Introduction
RAID Architecture
- The most prominent architectural advance in disk I/O systems
- In use for more than two decades
- Types of RAID:
• RAID1: provides 2N data redundancy to protect data
• RAID3, 4 & 5: store data in parity stripes across multiple disks to improve space efficiency & performance
RAID Architecture
• Advantages:
- Can recover data from more than one disk failure
- Improves data reliability
• Disadvantages:
- Recovery is not possible when the damaged data are not confined to 1 or 2 disks
- Such damage accounts for 60% to 80% of data losses
- Examples of such damage: software defects, virus attacks, power failure or site failure
Solution
Timely Recovery to Any Point-in-time (TRAP)
- Keeps a log of all previous versions of changed data blocks in time sequence
- Uses a fast & simple encoding scheme => less log space
- Recovers data to any point in time faster, because far less stored data has to be read => improved performance
- Thus balances storage space against recovery performance
Background
Recovery of data in the real world is measured by two key parameters:
1. RPO (Recovery Point Objective): the maximum acceptable age of data at the time of an outage
2. RTO (Recovery Time Objective): the maximum acceptable length of time to resume normal data processing operations after an outage
Classification of storage architecture
• Storage architectures that can recover data after an outage are classified by the two key parameters, RPO and RTO, into four classes:
• TRAP-1
• TRAP-2
• TRAP-3
• TRAP-4
Related work
• TRAP-1
- Uses periodic backups & snapshots
- Time consuming & degrades application performance
- Data are transferred to tapes or disks for backup
• TRAP-2
- Performs file versioning, which records a history of changes to files
- Versioning has to be done manually
- Has controllable RTO & RPO
- File system dependent
• TRAP-3
- Keeps a log of changed data for each data block in time sequence (with timestamps)
- Continuous Data Protection (CDP)
- Requires a huge amount of storage space
TRAP-4 Architecture
Keeps a log of parities, one entry for each write to a block
• Suppose a host writes to the data block with logical address $A_i$, which belongs to a data stripe $(A_1, A_2, \ldots, A_i, \ldots, A_n)$
• The RAID controller performs the parity calculation as follows:
$P_T(k) = A_i(k) \oplus A_i(k-1) \oplus P_T(k-1)$
Where,
$P_T(k)$: new parity for the corresponding stripe
$A_i(k)$: new data for data block $A_i$
$A_i(k-1)$: old data of data block $A_i$
$P_T(k-1)$: old parity of the stripe
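The parity update can be tried out on a toy stripe. The sketch below is only an illustration of the XOR relation "new parity = new data XOR old data XOR old parity"; the helper names (`xor_blocks`, `update_parity`, `full_parity`) and the 2-byte blocks are assumptions made for this example, not part of the TRAP-Array design.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same size")
    return bytes(x ^ y for x, y in zip(a, b))


def update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Small-write update: new parity = new data XOR old data XOR old parity."""
    return xor_blocks(xor_blocks(new_data, old_data), old_parity)


def full_parity(blocks: list) -> bytes:
    """Parity recomputed from scratch over every data block in the stripe."""
    return reduce(xor_blocks, blocks)


# Hypothetical 3-block stripe with 2-byte blocks (real blocks are far larger).
stripe = [bytes([0x0F, 0xF0]), bytes([0x33, 0x33]), bytes([0xAA, 0x55])]
old_parity = full_parity(stripe)

# The host overwrites the first block of the stripe.
new_block = bytes([0x0F, 0xF1])
new_parity = update_parity(old_parity, stripe[0], new_block)
stripe_after = [new_block] + stripe[1:]
```

The incremental update touches only the written block and the parity block, yet `new_parity` matches the parity recomputed over the whole updated stripe.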
Optimizing the parity
• $P'_T(k) = A_i(k) \oplus A_i(k-1)$ is appended to the parity log stored in the TRAP disk after a simple encoding
• Only 5% to 20% of the bits inside a data block actually change on a write operation
• Parity $P'_T(k)$ reflects the exact bit-level changes that the new write makes to the existing block
• As a result, this parity block contains mostly zeros, with only a small portion of the bit stream nonzero
• Thus it can easily be encoded into a small parity block before being appended to the parity log, reducing the amount of storage space
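Why a mostly-zero parity block shrinks well can be seen with a small experiment. The slides do not name the encoding scheme, so the sketch below uses `zlib` from the Python standard library purely as a stand-in; the 4096-byte block size, the 20-byte overwrite, and the function name `parity_delta` are likewise assumptions for illustration.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def parity_delta(old: bytes, new: bytes) -> bytes:
    """XOR of new and old block contents: nonzero only where bits changed."""
    return bytes(x ^ y for x, y in zip(old, new))


# An existing block, and a write that changes only a 20-byte region of it.
old_block = bytes(range(256)) * (BLOCK_SIZE // 256)
updated = bytearray(old_block)
updated[100:120] = b"\xff" * 20
new_block = bytes(updated)

delta = parity_delta(old_block, new_block)

# Count how many bytes of the delta are nonzero.
nonzero_bytes = sum(1 for b in delta if b)

# Stand-in encoder: any scheme that collapses long zero runs would do.
encoded = zlib.compress(delta)
```

Here `delta` is as long as the block but has only 20 nonzero bytes, so `encoded` is a small fraction of the block size, and `zlib.decompress(encoded)` returns the delta unchanged.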
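Recovery on outage
• Consider the parity log corresponding to a data block $A_i$ after a series of write operations
• The log contains $P'_T(k), P'_T(k-1), \ldots, P'_T(2), P'_T(1)$ with timestamps $T(k), T(k-1), \ldots, T(2), T(1)$ associated with the parities
• Suppose an outage occurs at time $t_1$ and we would like to recover the data as of time $t_0$ ($t_0 < t_1$)
• Let $T(r)$ be the latest timestamp in the log with $T(r) \le t_0$, and let $A_i(0)$ be the image of the block before the first logged write
• Note that $P'_T(l) \oplus A_i(l-1) = A_i(l)$ for all $l = 1, 2, \ldots, r$
• Hence $A_i(r) = P'_T(r) \oplus P'_T(r-1) \oplus \cdots \oplus P'_T(1) \oplus A_i(0)$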
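Recovery from the parity log can be sketched as a chain of XORs. The code below is an illustrative model, not the authors' implementation: it assumes the log is a list of (timestamp, parity) pairs in ascending time order, where each parity is the XOR of the block's contents after and before that write. The function names and sample data are hypothetical. The backward ("undo") variant is an addition that follows from XOR being its own inverse.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_redo(base_image: bytes, log: list, t0: int) -> bytes:
    """Roll forward from the oldest image, applying every parity stamped <= t0."""
    block = base_image
    for timestamp, parity in log:
        if timestamp > t0:
            break
        block = xor_blocks(block, parity)
    return block


def recover_undo(current_image: bytes, log: list, t0: int) -> bytes:
    """Roll back from the newest image, undoing every parity stamped > t0."""
    block = current_image
    for timestamp, parity in reversed(log):
        if timestamp <= t0:
            break
        block = xor_blocks(block, parity)
    return block


# Hypothetical history of one block: the original image plus three writes.
versions = [b"AAAA", b"ABAA", b"ABCA", b"ABCD"]
timestamps = [10, 20, 30]
log = [
    (ts, xor_blocks(versions[i], versions[i + 1]))
    for i, ts in enumerate(timestamps)
]

# Recover the block as it was at time 25, i.e. after the write stamped 20.
from_oldest = recover_redo(versions[0], log, 25)
from_newest = recover_undo(versions[3], log, 25)
```

Both directions return the same image, so recovery can start from whichever end of the log is closer to the target time.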
Conclusions
• A new disk array architecture capable of providing timely recovery to any point in time for user data stored in the array
• Up to 2 orders of magnitude improvement in storage efficiency
• Quick recovery time
• Provides continuous data protection capability