Ensuring data longevity in large-scale archival systems
Mehul Shah, Mary Baker
HP Labs
NSF Grand Challenges Workshop
Long-term assets turning digital
• Drivers
  • Denser, cheaper storage
  • New applications
• Enterprise data
  • Media companies’ strategic assets are digital
    • E.g. Disney, Dreamworks, Electronic Arts, etc.
  • Legislation: Sarbanes-Oxley, etc.
• Personal data
  • Email, photos, videos, tax records, etc.
  • Soon can store all you see and hear
• Keep stuff for a while
  • Decades, centuries …
The Holy Grail
• Today’s storage systems designed for
  • Availability and performance
  • Not longevity
• Challenge: Build large-scale archival system
  • Maintain integrity of information forever
  • Efficient retrieval forever
  • Affordable forever
Wait, why is this hard?
• Indefinite time-horizon
  • Infrequent threats eventually occur
  • Limited budget: $/byte/year
  • Organizational failure
  • Media and format obsolescence
• Increasing system scale
  • More interacting components
  • More component complexity
  • More opportunities for failure (less reliable)
Global Warming?
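The "infrequent threats eventually occur" point can be made concrete with a little arithmetic. The sketch below is not from the slides: it assumes a threat that strikes independently each year with a fixed probability, and the function name `prob_at_least_once` and the 1% figure are hypothetical, chosen only for illustration.

```python
def prob_at_least_once(annual_prob: float, years: int) -> float:
    """Chance that an event occurs at least once within `years` years.

    Assumes (for illustration only) that each year is an independent
    trial with the same probability `annual_prob`.
    """
    if not 0.0 <= annual_prob <= 1.0:
        raise ValueError("annual_prob must be between 0 and 1")
    if years < 0:
        raise ValueError("years must be non-negative")
    # P(at least once) = 1 - P(never) = 1 - (1 - p)^n
    return 1.0 - (1.0 - annual_prob) ** years


if __name__ == "__main__":
    # Hypothetical threat with a 1% chance per year.
    for horizon in (1, 10, 100, 1000):
        print(horizon, "years:", round(prob_at_least_once(0.01, horizon), 4))
```

Under these assumptions, a threat that is negligible over a single year becomes more likely than not over a century, which is why an indefinite time-horizon changes the design problem.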
Why is this “Grand”?
• Must take end-to-end perspective
  • Deal with errors at all levels
    • device, application, user, organization
• Need diverse expertise
  • systems, data management, security, HCI, etc.
More “Grand”ness …
• Ongoing battle we must continually win
• Need proactive solutions
  • Vigilance against known threats
    • Find hidden errors before it is too late
  • Monitor and develop defenses for new threats
• Passive methods doomed
• Integrate process and technology
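One common way to "find hidden errors before it is too late" is a periodic integrity audit: record a checksum for every archived file, then re-read the files on a schedule and compare. The sketch below is an illustration, not a method described in the slides. It assumes the archive is a plain directory tree and uses SHA-256; the names `file_digest`, `build_manifest`, and `audit` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest at ingest time."""
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Re-read every file in the manifest and report missing or corrupt ones."""
    problems: dict[str, str] = {}
    for rel, expected in manifest.items():
        path = root / rel
        if not path.is_file():
            problems[rel] = "missing"
        elif file_digest(path) != expected:
            problems[rel] = "corrupt"
    return problems
```

An empty result from `audit` means every recorded file is present and unchanged. The manifest itself is also data at risk, so a real system would need to store it redundantly and audit it too; that is outside the scope of this sketch.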
Immediate Challenges
• Develop prototype
  • Show robustness over 100 years
  • Need rapid-aging strategies
• Understand failures
  • Track visible and hidden errors
  • Pinpoint causes
• Understand costs: initial + ongoing