OceanStore is a global data store that manages itself, providing durability, resistance to attack/failures, fault tolerance, and churn resistance. It scales to billions of users and exabytes of data.
OceanStore Maintenance-Free Global Data Storage, S. Rhea, C. Wells, P. Eaton, D. Geels, B. Zhao, H. Weatherspoon, J. Kubiatowicz, IEEE Internet Computing, 5(5):40-49, 2001. David Choffnes, Winter 2006
Data makes the world go round • We’re addicted to persistent storage • Wouldn’t it be great if it would follow us globally? • And automatically make itself resilient to failures? • But that would require 1,000s or millions of PCs! • OceanStore • A global data store that manages itself • Scales to billions of users and exabytes of data • Features: • Durability • Resistance to attack/failures • Fault tolerant • Churn-resistant • Catchy name CS 395/495 Autonomic Computing Systems, EECS, Northwestern University
A bit of this, a byte of that • Routing messages/data • Self-maintaining (DHT) • Durability • m-of-n erasure coding • Security/Fault tolerance • Byzantine updates, secure hashes, encryption • Availability • Introspective replica management
Don’t call it a comeback • Erasure codes • Break object into m fragments; encode them into n fragments, n > m • Encode fragments such that any m of the n can reconstruct the entire object
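The "any m of n" property can be sketched with a toy Reed-Solomon-style code. This is only an illustration under stated assumptions, not the coding scheme OceanStore itself uses: it works over the prime field of size 257 and interpolates polynomials with Lagrange's formula, and the names `encode`, `decode`, and `_interpolate` are hypothetical.

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, modulo P (Lagrange)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: bytes, m: int, n: int) -> dict:
    """Split data into m fragments, then extend to n fragments.

    Returns {fragment_index: list_of_symbols}. Fragments 0..m-1 hold the
    original bytes; fragments m..n-1 hold redundancy symbols (0..256).
    """
    assert 0 < m <= n < P
    padded = data + bytes(-len(data) % m)  # zero-pad to a multiple of m
    size = len(padded) // m
    frags = {i: list(padded[i * size:(i + 1) * size]) for i in range(m)}
    for x in range(m, n):
        frags[x] = [
            _interpolate([(i, frags[i][c]) for i in range(m)], x)
            for c in range(size)
        ]
    return frags


def decode(available: dict, m: int, length: int) -> bytes:
    """Rebuild the original bytes from any m surviving fragments."""
    points = sorted(available.items())[:m]
    assert len(points) == m, "need at least m fragments"
    size = len(points[0][1])
    rebuilt = [
        [_interpolate([(x, frag[c]) for x, frag in points], i) for c in range(size)]
        for i in range(m)
    ]
    return bytes(b for frag in rebuilt for b in frag)[:length]
```

With m = 4 and n = 8, for example, any four of the eight fragments passed to `decode` should give back the original bytes, at the cost of storing twice the original size.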
An object by any other GUID would smell as sweet • Each object assigned a GUID based on 160-bit SHA-1 hash • OceanStore supports versioning, so each object has an active GUID that points to a list of GUIDs from different versions • Each version GUID names the root of a B-tree of links to chunks
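A minimal sketch of hash-based naming and versioning follows. It makes several simplifying assumptions: a single root block listing child GUIDs stands in for the B-tree, the active GUID is derived from a made-up object name, and `store_version`, `read_version`, and the in-memory `store` dictionary are hypothetical helpers, not OceanStore's API.

```python
import hashlib


def guid(data: bytes) -> str:
    """160-bit SHA-1 digest, written as 40 hex characters."""
    return hashlib.sha1(data).hexdigest()


def store_version(blocks, store):
    """Store each block under its content hash and return the version GUID.

    Simplification: one root block listing the child GUIDs replaces the B-tree.
    """
    child_guids = []
    for block in blocks:
        g = guid(block)
        store[g] = block  # identical blocks share one entry
        child_guids.append(g)
    root = "\n".join(child_guids).encode()
    root_guid = guid(root)
    store[root_guid] = root
    return root_guid


def read_version(version_guid, store):
    """Fetch a version, checking every block against the GUID that names it."""
    root = store[version_guid]
    assert guid(root) == version_guid
    out = []
    for g in root.decode().split():
        block = store[g]
        assert guid(block) == g  # content is self-verifying
        out.append(block)
    return b"".join(out)


store = {}
versions = {}  # active GUID -> list of version GUIDs, oldest first
active = guid(b"example-object-name")  # hypothetical name for illustration
versions[active] = [store_version([b"hello ", b"world"], store)]
versions[active].append(store_version([b"hello ", b"ocean"], store))
```

Because blocks are named by their contents, the unchanged block `b"hello "` is stored once and shared by both versions, and a reader can detect any tampered block by rehashing it.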
Magic Tapestry Ride • Tapestry = DHT • Allows nodes to join and leave network relatively seamlessly • Named objects found in deterministic # of hops • OceanStore uses multiple “root nodes” for each object • More redundancy • Lower latency • An inner ring of servers is made responsible for each object
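One way to picture multiple root nodes per object is to hash the object's GUID with a few different salts and map each result to a node. The sketch below assumes that approach and assumes a global view of all node IDs; a real Tapestry node resolves the ID digit by digit, hop by hop, using its own routing table. The names `root_for` and `salted_roots` and the tie-breaking rule are hypothetical.

```python
import hashlib


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def root_for(key_id: str, node_ids) -> str:
    """Pick the node whose ID shares the longest prefix with key_id.

    Ties go to the smallest node ID so every caller picks the same root
    (an assumed rule for this sketch).
    """
    return min(node_ids, key=lambda nid: (-shared_prefix_len(nid, key_id), nid))


def salted_roots(object_guid: str, node_ids, salts: int = 3):
    """Derive one root per salt value by rehashing the object's GUID."""
    roots = []
    for salt in range(salts):
        key_id = hashlib.sha1(f"{object_guid}:{salt}".encode()).hexdigest()
        roots.append(root_for(key_id, node_ids))
    return roots
```

If one root is unreachable, a lookup can try the next salt, which is where the extra redundancy comes from.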
He says, she says • Servers may be faulty or malicious • Byzantine agreement protocol ensures correctness if fewer than 1/3 of servers are faulty/misbehaving • New technique vastly reduces n^2 message complexity • Cached data is signed • Proactive threshold signatures allow the same public key despite inner ring membership changes
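The "fewer than 1/3" bound can be made concrete with a little arithmetic. This is a sketch of the general quorum reasoning behind Byzantine agreement, not OceanStore's actual protocol; `max_faulty`, `quorum_size`, and `agreed_result` are hypothetical names.

```python
from collections import Counter


def max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Smallest quorum such that any two quorums share at least f + 1 servers,
    so their overlap always contains at least one correct server."""
    f = max_faulty(n)
    return (n + f) // 2 + 1


def agreed_result(votes, n: int):
    """Return the value reported by a full quorum of the n servers, else None."""
    if not votes:
        return None
    value, count = Counter(votes).most_common(1)[0]
    return value if count >= quorum_size(n) else None
```

With an inner ring of 4 servers, one may misbehave and 3 matching votes are required; with 7 servers, two may misbehave and 5 matching votes are required.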
Know thyself • Introspection • Servers measure independence of failure rates and change encoding rate appropriately • Auto repair: root node can check redundancy, regenerate/redistribute blocks • Durability: sweep/repair
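How a measured failure rate could drive the encoding rate, and how a sweep could decide what to regenerate, can be sketched as follows. The model assumes fragment failures are independent with the same probability, which is exactly the property the servers would need to measure; `loss_probability`, `choose_n`, and `repair_plan` are hypothetical helpers.

```python
import math


def loss_probability(m: int, n: int, p_fail: float) -> float:
    """Probability that fewer than m of n fragments survive (object lost),
    assuming each fragment fails independently with probability p_fail."""
    p_ok = 1.0 - p_fail
    return sum(
        math.comb(n, k) * p_ok ** k * p_fail ** (n - k) for k in range(m)
    )


def choose_n(m: int, p_fail: float, target: float, n_max: int = 256) -> int:
    """Smallest n whose loss probability is at or below the target."""
    for n in range(m, n_max + 1):
        if loss_probability(m, n, p_fail) <= target:
            return n
    raise ValueError("target not reachable within n_max fragments")


def repair_plan(live_fragments: int, m: int, n: int):
    """Fragments to regenerate during a sweep, or None if the object is lost."""
    if live_fragments < m:
        return None  # too few fragments left to reconstruct
    return max(0, n - live_fragments)
```

For instance, `choose_n(16, 0.1, 1e-6)` picks the fewest fragments that keep the loss probability at or below one in a million for a 16-fragment object, and `repair_plan` then tells a sweep how many fragments to rebuild once some holders disappear.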
Put it all together
Some Performance Numbers