The Oceanic Data Utility: (OceanStore) Global-Scale Persistent Storage
John Kubiatowicz
Properties of the ODU
• Motivation:
  • Growing quantity of inconsistent data
  • Widespread mobility of producers and consumers
  • Simplicity: subsume Web, email, filesystems, databases
• Nomadic Data: Serverless, Homeless
  • Sharing of information between anyone, anywhere
  • Promiscuous caching of data enabled by tacit information (option 5/introspection)
  • Efficient dissemination of information (multicast)
  • Federation of many different companies, just like phone service or electric grid
• Highly available: data always duplicated
  • Higher-probability access
  • Copies placed with low probability of correlated failure
• Shares technology with options 1, 4, 5, and 8
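The "low probability of correlated failure" bullet can be illustrated with a small placement sketch. The slides do not say how placement is done; the greedy rule below, the `Node` fields (`provider`, `region`), and the function name `place_replicas` are all assumptions made for illustration. The idea is simply that no two chosen copies share a provider or a region, so a single outage is less likely to take out more than one copy.

```python
from collections import namedtuple

# Hypothetical node description: the two failure domains are assumptions.
Node = namedtuple("Node", ["name", "provider", "region"])

def place_replicas(nodes, k):
    """Greedily pick up to k nodes so that no two share a provider or region."""
    chosen = []
    used_providers = set()
    used_regions = set()
    for node in nodes:
        # Skip any node that shares a failure domain with an earlier choice.
        if node.provider in used_providers or node.region in used_regions:
            continue
        chosen.append(node)
        used_providers.add(node.provider)
        used_regions.add(node.region)
        if len(chosen) == k:
            break
    return chosen

candidates = [
    Node("a", "P1", "us"),
    Node("b", "P1", "eu"),
    Node("c", "P2", "us"),
    Node("d", "P2", "eu"),
    Node("e", "P3", "asia"),
]
replicas = place_replicas(candidates, 3)
print([n.name for n in replicas])
```

A greedy pass is not optimal (it can return fewer than k nodes even when a valid set of k exists), but it keeps the constraint easy to see.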
Technical Challenges
• Scalability: performance easy to destroy
  • vast number of entities: ~billions
  • cross-administrative domains
• Security is not optional: data never cleartext
• Availability
  • Should bootstrap redundancy available on global scale
  • Economies of scale applied to achieving data reliability
• Maintainability
  • Too large for human intervention in normal operation
• Naming: How to maintain global namespace?
• Indexability
  • Must enable efficient location/searching of data
• Consistency/Conflict resolution
  • Multiple copies must have well-defined relationship
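One way to give multiple copies a "well-defined relationship" is to tag each copy with a version vector and compare the tags. The slides do not name a mechanism, so version vectors are a swapped-in, well-known technique here, and the function name `compare` and the result labels are hypothetical. A minimal sketch:

```python
def compare(a, b):
    """Compare two version vectors (dicts mapping replica id -> update count).

    Returns "equal", "a_newer", "b_newer", or "concurrent".
    Missing entries are treated as a count of zero.
    """
    keys = set(a) | set(b)
    a_ahead = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    b_ahead = any(b.get(k, 0) > a.get(k, 0) for k in keys)
    if a_ahead and b_ahead:
        return "concurrent"   # neither copy subsumes the other: a conflict
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"

print(compare({"r1": 2, "r2": 1}, {"r1": 1, "r2": 1}))
print(compare({"r1": 2}, {"r2": 1}))
```

Only the "concurrent" case needs a conflict-resolution policy; in the other cases one copy can simply replace the other.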
State of the Art?
• Remote file-system community: NFS, AFS
  • All have single points of failure
  • Only caching at endpoints
• Mobile computing community: Coda
  • Small scale, fixed coherence mechanism
• Web caching community: Inktomi, others?
  • Specialized, incremental solutions
  • Caching along client/server path, various bottlenecks
• Database Community: Mariposa
  • Still small scale, specialized types of queries
  • Economic model not quite right but on right track
• Internet backup companies: Medley
  • Very limited in scope and flexibility
• PalmPilot: inspired general conflict-resolution
Our Enabling Technologies
• Data Economy
  • User pays monthly fee to a primary utility provider who is responsible for reliability of data
  • Utilities buy and sell capacity (both data and bandwidth); prices set for quantity and reliability
  • Authoritative naming servers paid per query?
• Underlying database organization
  • User-visible structure (e.g. filesystem) synthesized
• Federation of overlapping data location structures (indices) + Introspection
  • Separate the absolute authority for data location from moment-to-moment “hearsay” authorities
  • Partially consistent indices continually adapted to improve performance
• Conflict Resolution, not consistency
  • policies set via domain specific language
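The split between an absolute authority and "hearsay" authorities can be sketched as a lookup that trusts a cheap local hint only after checking it, and falls back to the authoritative index otherwise. Everything named below (`Locator`, `holds`, `authority_queries`, the dict-based index) is a hypothetical stand-in, not the design from the slides.

```python
class Locator:
    """Toy data locator: hearsay hints first, authoritative index as fallback."""

    def __init__(self, authority, holds):
        self.authority = authority   # dict: object id -> set of node names (assumed ground truth)
        self.holds = holds           # callable (node, obj_id) -> bool, probes a node
        self.hints = {}              # hearsay: object id -> last node seen holding it
        self.authority_queries = 0   # counts the expensive lookups

    def locate(self, obj_id):
        hint = self.hints.get(obj_id)
        # A hint may be stale, so verify it before returning it.
        if hint is not None and self.holds(hint, obj_id):
            return hint
        # Fall back to the authoritative index and refresh the hint.
        self.authority_queries += 1
        nodes = self.authority.get(obj_id)
        if not nodes:
            self.hints.pop(obj_id, None)
            return None
        node = sorted(nodes)[0]      # deterministic pick, for the example only
        self.hints[obj_id] = node
        return node

authority = {"doc": {"n1", "n2"}}
stored = {("n1", "doc"), ("n2", "doc")}
loc = Locator(authority, lambda node, obj: (node, obj) in stored)

first = loc.locate("doc")    # goes to the authority
second = loc.locate("doc")   # served from the verified hint

# The copy on n1 disappears; the stale hint is detected and replaced.
stored.discard(("n1", "doc"))
authority["doc"] = {"n2"}
third = loc.locate("doc")
print(first, second, third, loc.authority_queries)
```

The hints may be wrong at any moment without harming correctness, which is what allows them to be only partially consistent and tuned for performance.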
3 Year Plan for Success
• Year 1:
  • Initial design and refinement of four components:
    • naming & security scheme (security based on name)
    • fluid, partially coherent index structures
    • introspection for intelligent migration of data
    • initial take on economic models
  • Begin prototype implementation with all components
• Year 2:
  • Finish prototyping and refinement of first-generation prototype
  • Client implementation for Windows and/or UNIX
• Year 3:
  • Second-generation prototype on Millennium infrastructure
  • Formulate plan for large-scale test
  • Final evaluation and usability results
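The slides do not explain what "security based on name" means in practice. One common reading, offered here purely as an assumption, is a self-certifying name: the name is a secure hash of the owner's public key plus a human-readable label, so anyone can check that a claimed key really matches the name. The functions `make_name` and `name_matches`, the SHA-256 choice, and the placeholder key bytes are all hypothetical.

```python
import hashlib

def make_name(owner_public_key: bytes, label: str) -> str:
    """Derive a name from the owner's public key and a label (assumed scheme)."""
    h = hashlib.sha256()
    h.update(owner_public_key)
    h.update(b"\x00")                  # separator between key and label
    h.update(label.encode("utf-8"))
    return h.hexdigest()

def name_matches(name: str, claimed_key: bytes, label: str) -> bool:
    """Check that a claimed owner key is consistent with the name."""
    return name == make_name(claimed_key, label)

alice_key = b"alice-public-key-bytes"      # placeholder, not a real key
mallory_key = b"mallory-public-key-bytes"  # placeholder, not a real key

name = make_name(alice_key, "/home/alice/notes.txt")
print(name_matches(name, alice_key, "/home/alice/notes.txt"))
print(name_matches(name, mallory_key, "/home/alice/notes.txt"))
```

With a scheme like this no naming server has to be trusted to bind names to owners, although a real system would still need signatures over the data itself.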