MASSIVE ARRAYS OF IDLE DISKS FOR STORAGE ARCHIVES
D. Colarelli, D. Grunwald, U. Colorado, Boulder
Highlights
• Paper proposes
  • To replace tape libraries by large non-redundant arrays of disks
  • To cache on active drives
    • Files that have been recently accessed
    • Update logs for other files
  • To keep other drives mostly inactive by spinning them down between accesses
Introduction (I)
• Robotic tape libraries are now the standard solution for archiving very large amounts of data
• Disadvantages include
  • Slow access times: average search time of 41 s for T9940 drives
  • Not much cheaper than disk drives
• Could we replace them with massive arrays of hard drives?
Introduction (II)
• Major limitation of hard drive solution is power consumption
  • Almost ten times that of equivalent tape library
• Could power down disks that are not currently accessed
  • 50% of data are likely to be never accessed
  • 25% of data are likely to be accessed once
Introduction (III)
• Must be at least as reliable as tape libraries
• No need to use a redundant scheme
• Solution is a Massive Array of Idle Disks (MAID)
• Paper investigates design issues through trace-driven simulations
Design Issues
• Two major design decisions
  • Data migration or duplication (caching)
  • File system or block-level interface
Migration or caching
• Migration would move “hot” data to active drives
  • Uses disk space more efficiently
  • Requires a map or directory mechanism that maps the storage across all drives
• Caching would cache read data and act as a write log for write data
  • Keeps two copies of all cached files
  • Maps or directories are proportional to the size of the cache (see the sketch below)
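The map-size difference between the two schemes can be made concrete with a small sketch. Everything below is illustrative only; the names and data structures are assumptions, not the paper's.

```python
# Migration: any block's home drive can change, so the map must cover
# the whole store -- one entry per block across all drives.
migration_map = {}                    # block_id -> (drive_id, offset)

def migrate(block_id, active_drive, offset):
    """Move a hot block to an active drive; its only copy moves with it."""
    migration_map[block_id] = (active_drive, offset)

# Caching: passive drives keep the authoritative copy, so the map only
# needs entries for blocks currently cached -- proportional to cache size.
cache_map = {}                        # block_id -> cache_slot

def locate(block_id, home_location):
    """Prefer the cached copy; otherwise fall back to the fixed home."""
    return cache_map.get(block_id, home_location)
```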
File system or block interface
• A file system interface could use file system information to cache entire files
  • Would probably perform better
  • Would require system modifications
• A block-level interface would work with existing systems
MAID with caching
• Figure: system architecture (a hypothetical sketch follows)
  • Virtualization Manager
  • Cache Manager, managing the active drives (always on)
  • Passive Drive Manager, managing the passive drives (spin up/down)
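As a rough illustration of how the figure's components could fit together, here is a hypothetical Python sketch; the class and method names and the wiring are assumptions, not the paper's design.

```python
class CacheManager:
    """Serves requests from the always-on active drives."""
    def __init__(self):
        self.store = {}                      # chunk_id -> data
    def read(self, chunk_id):
        return self.store.get(chunk_id)      # None signals a cache miss

class PassiveDriveManager:
    """Spins a passive drive up on demand and down after an idle timeout."""
    def read(self, chunk_id):
        # Placeholder: spin up the owning drive, read, restart idle timer
        return b"data-from-passive-drive"

class VirtualizationManager:
    """Presents one address space and routes each request."""
    def __init__(self, cache, passive):
        self.cache, self.passive = cache, passive
    def read(self, chunk_id):
        data = self.cache.read(chunk_id)     # try the active drives first
        return data if data is not None else self.passive.read(chunk_id)
```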
Design choices (I)
• Compared MAID-cache and MAID-no-cache
• MAID-cache
  • Caches reads and writes on active drives
  • Caching unit is a “chunk” of 64 sectors
  • Cache policy is LRU
  • All writes are placed in the cache write log, where they wait to be committed to the non-active (passive) drives (sketched below)
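A minimal sketch of an LRU chunk cache with a write log, using the stated parameters (64-sector chunks, LRU eviction). The class name, capacity handling, and commit mechanism are assumptions.

```python
from collections import OrderedDict

SECTOR = 512
CHUNK_SECTORS = 64                     # caching unit from the paper
CHUNK_BYTES = CHUNK_SECTORS * SECTOR   # 32 KiB per cached chunk

class ChunkCache:
    """LRU cache of 64-sector chunks; writes also enter a write log
    where they await commit to the passive drives."""

    def __init__(self, capacity_chunks):
        self.capacity = capacity_chunks
        self.chunks = OrderedDict()    # chunk_id -> data, in LRU order
        self.write_log = []            # (chunk_id, data) pending commit

    def _touch(self, chunk_id, data):
        self.chunks[chunk_id] = data
        self.chunks.move_to_end(chunk_id)      # mark most recently used
        if len(self.chunks) > self.capacity:
            self.chunks.popitem(last=False)    # evict least recently used

    def read(self, chunk_id):
        if chunk_id in self.chunks:
            self._touch(chunk_id, self.chunks[chunk_id])
            return self.chunks[chunk_id]       # hit: no spin-up needed
        return None                            # miss: caller spins up drive

    def write(self, chunk_id, data):
        self._touch(chunk_id, data)            # cache the new data...
        self.write_log.append((chunk_id, data))  # ...and log it for commit
```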
Design choices (II)
• Must always check the write log before reading data from the cache or the passive drives
• Passive drives remain on standby until
  • A cache miss occurs, or
  • The write log becomes too long
• Return to standby when the spin-down inactivity time limit is reached
• Varying this time limit is the primary way to affect system performance and energy consumption (see the sketch after this list)
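The lookup ordering and the two spin-up triggers could look like the following sketch; the threshold value and helper names are hypothetical.

```python
WRITE_LOG_LIMIT = 1024   # hypothetical: commit once the log grows past this

def read_chunk(chunk_id, write_log, cache, passive):
    # 1. The write log may hold a newer version of the chunk than either
    #    the cache or the passive drives, so it must be checked first.
    for logged_id, data in reversed(write_log):   # newest entries first
        if logged_id == chunk_id:
            return data
    # 2. Then the cache of recently accessed chunks on the active drives.
    data = cache.read(chunk_id)
    if data is not None:
        return data
    # 3. Only on a cache miss is a passive drive spun up.
    return passive.read(chunk_id)

def maybe_flush(write_log, passive):
    # The other spin-up trigger: a write log that has grown too long.
    if len(write_log) > WRITE_LOG_LIMIT:
        for chunk_id, data in write_log:
            passive.write(chunk_id, data)         # commit to passive drives
        write_log.clear()
```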
Simulation parameters
• Power management policy
  • Always on
  • Fixed-delay spin-down
  • Adaptive spin-down
• Data layout (the two mappings are sketched below)
  • Linear: keep successive blocks on the same drive
  • Striped: spread successive blocks across drives
• Caching / no caching
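The two layouts reduce to two block-to-drive mappings. The sketch below assumes fixed-size drives and a flat block address space; the array size and capacity are hypothetical.

```python
DRIVES = 8                    # hypothetical array size
BLOCKS_PER_DRIVE = 1_000_000  # hypothetical drive capacity in blocks

def linear(block):
    """Successive blocks stay on one drive, so a sequential scan
    spins up as few drives as possible."""
    return divmod(block, BLOCKS_PER_DRIVE)       # -> (drive, offset)

def striped(block):
    """Successive blocks rotate across drives, spreading the load but
    touching every drive on a sequential scan."""
    return (block % DRIVES, block // DRIVES)     # -> (drive, offset)

# Blocks 5 and 6 share a drive under linear but not under striped:
print(linear(5), linear(6))    # (0, 5) (0, 6)
print(striped(5), striped(6))  # (5, 0) (6, 0)
```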
Simulation results (I)
• Based on a supercomputer center workload
• All MAID configurations achieve similar power consumption
  • 15 to 16% of that of the always-on configuration
• MAID configurations without a cache have average response times comparable to that of the always-on configuration
  • Workload had little locality
Simulation results (II)
• Average response times of MAID configurations with a cache are much worse than that of the always-on configuration
  • 0.680 to 0.720 s compared to 0.303 s
• The striped configuration with a fixed spin-down delay has the lowest average response time of all MAID configurations
  • 0.309 s
Conclusion
• MAID can achieve average response times comparable to that of an always-on configuration with much lower power consumption
• IMPORTANT: In a more recent paper, the authors found that cached configurations worked much better for workloads exhibiting more locality of accesses than their supercomputer center workload