HAIL (High-Availability and Integrity Layer) for Cloud Storage
Alina Oprea
Joint work with Kevin Bowers and Ari Juels
RSA Laboratories
Cloud storage
Mostly static data:
• Back-up
• Archival
[Diagram: a client asks the cloud storage provider (web server and storage server) "Is my data available?"]
Proofs of Retrievability (PORs)
[Diagram: the client encodes file F with key k and stores it at the cloud storage provider]
• The encoding corrects small corruption
Proofs of Retrievability (PORs)
[Diagram: the client, holding key k, sends a challenge to the cloud storage provider storing F and receives a response]
• Detects large corruption
• Requires integrity checks on server or client
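Why challenge-response spot checks catch large corruption but miss small corruption can be seen from a short calculation. The sketch below assumes the client samples blocks independently and uniformly at random, which is a simplification for illustration and not necessarily the exact sampling used by a POR scheme; the function name is hypothetical.

```python
def detection_probability(corrupted_fraction: float, num_challenges: int) -> float:
    """Chance that at least one of `num_challenges` independently and
    uniformly sampled blocks lands in the corrupted part of the file."""
    if not 0.0 <= corrupted_fraction <= 1.0:
        raise ValueError("corrupted_fraction must be in [0, 1]")
    if num_challenges < 0:
        raise ValueError("num_challenges must be non-negative")
    # Each challenge misses the corruption with probability (1 - fraction).
    return 1.0 - (1.0 - corrupted_fraction) ** num_challenges


if __name__ == "__main__":
    # Same number of challenges, different amounts of corruption.
    for fraction in (0.001, 0.01, 0.1):
        print(fraction, detection_probability(fraction, 100))
```

With 100 challenges, corruption of 10% of the blocks is detected almost surely, while corruption of 0.1% of the blocks is missed most of the time. That gap is what the error-correcting layer of the encoding is meant to cover.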
When PORs fail
[Diagram: challenge-response between the client (key k) and the cloud storage provider storing F; the decoder's output is marked "Unrecoverable"]
HAIL Goals
• Resilience against cloud provider failure or temporary unavailability
  • Amazon S3 went down several times, once for 8 hours
  • Linkup lost 45% of its customer data
• Use multiple cloud providers to construct a reliable cloud storage service out of unreliable components
  • RAID (Redundant Array of Inexpensive Disks) for cloud storage
• Provide clients with verification capabilities
  • Efficient proofs of file availability by interacting with cloud providers
Replicate across multiple providers
Naïve approach
[Diagram: the client stores a full copy of F at each provider (Google, EMC Atmos, Amazon S3)]
• Sample and check consistency across providers
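The cross-provider check can be sketched as follows. This is a minimal illustration that assumes each replica is available as an in-memory list of equally indexed blocks; `sample_and_check` is a hypothetical helper, not an API of any provider, and a real client would fetch the sampled blocks over the network.

```python
import random
from typing import List, Optional, Sequence


def sample_and_check(
    replicas: Sequence[Sequence[bytes]],
    num_samples: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Sample block positions and return those where the replicas disagree."""
    rng = rng or random.Random()
    num_blocks = len(replicas[0])
    # Draw distinct positions; never ask for more positions than exist.
    positions = rng.sample(range(num_blocks), min(num_samples, num_blocks))
    # A position is inconsistent if the replicas hold more than one distinct value.
    return sorted(i for i in positions if len({r[i] for r in replicas}) > 1)


if __name__ == "__main__":
    original = [bytes([i]) * 4 for i in range(1000)]
    tampered = list(original)
    tampered[123] = b"oops"  # a single corrupted block on one provider
    found = sample_and_check([original, tampered, original], num_samples=20)
    print("inconsistent positions in sample:", found)
```

With 20 samples out of 1000 blocks, the single corrupted block is usually not among the sampled positions, so the check reports nothing.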
Roadmap
• Adversarial model for HAIL
• Small-corruption attack on replication scheme
• Encoding layer for each replica individually
• Reduce storage overhead by dispersal
• Increasing file lifetime with secret keys
Adversarial model
• Static: corrupts a fixed number b of the n total providers over time
  • Create enough redundancy in the file to handle this (b+1 replicas)
  • Is this realistic?
• Mobile (proactive): corrupts b out of n providers in each epoch
  • Separate each server into code base and storage base
  • At the beginning of an epoch, the code base of all servers is cleaned (through reboot, for instance)
  • All servers might have residual data corruption
  • Reactive design: check integrity and redistribute
Attack on replication scheme
[Diagram: replicas of F at Google, EMC Atmos, and Amazon S3, checked by the client]
• The probability that the client samples the corrupted block is low
• The file cannot be recovered after ⌈n/b⌉ epochs
Replication with POR
[Diagram: F is encoded with an ECC, and the replica at each provider (Google, EMC Atmos, Amazon S3) is checked with its own POR]
• Cons: requires integrity checks for each replica
Replication with POR
[Diagram: replicas of F at Google, EMC Atmos, and Amazon S3]
• Sample and check consistency across providers
Replication with POR
[Diagram: replicas of F at Google, EMC Atmos, and Amazon S3; each replica is labeled with ε_d and >ε_c]
• Sample and check consistency across providers
• Large storage overhead due to replication
• File lifetime still limited by ⌈n/b⌉ · (ε_c/ε_d)
  • ε_c: correction threshold of POR encoding
  • ε_d: detection threshold of POR
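The two lifetime bounds from the slides can be evaluated numerically. The sketch below only computes the stated expressions, ⌈n/b⌉ for plain replication and ⌈n/b⌉ · (ε_c/ε_d) for replication with POR; the function names and the sample parameter values are assumptions chosen for illustration.

```python
def replication_lifetime_epochs(n: int, b: int) -> int:
    """Bound for plain replication: ceil(n / b) epochs."""
    if n <= 0 or b <= 0:
        raise ValueError("n and b must be positive")
    return -(-n // b)  # integer ceiling division


def por_replication_lifetime_epochs(n: int, b: int, eps_c: float, eps_d: float) -> float:
    """Bound for replication with POR: ceil(n / b) * (eps_c / eps_d) epochs."""
    if eps_c <= 0 or eps_d <= 0:
        raise ValueError("thresholds must be positive")
    return replication_lifetime_epochs(n, b) * (eps_c / eps_d)


if __name__ == "__main__":
    # Hypothetical setting: 5 providers, 2 corrupted per epoch,
    # correction threshold 10%, detection threshold 1%.
    print(replication_lifetime_epochs(5, 2))
    print(por_replication_lifetime_epochs(5, 2, eps_c=0.10, eps_d=0.01))
```

In this hypothetical setting, POR encoding stretches the bound by the factor ε_c/ε_d, but the lifetime remains finite, which is the limitation the slide points out.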
Reduce storage overhead
[Diagram: the client applies an (n, m) dispersal code to F, producing n fragments; F can be decoded from m fragments]
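To make the idea of an (n, m) dispersal code concrete, here is a deliberately simplified sketch with a single XOR parity fragment, so n = m + 1 and any one lost fragment can be rebuilt. This is a stand-in chosen for brevity and is not the code used in HAIL; a code such as Reed-Solomon is needed to tolerate more than one lost fragment. The function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def disperse(data: bytes, m: int) -> List[bytes]:
    """Split data into m equal fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // m)  # ceiling division
    padded = data.ljust(frag_len * m, b"\0")
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(m)]
    return fragments + [reduce(xor_bytes, fragments)]


def recover(fragments: List[Optional[bytes]], m: int, original_len: int) -> bytes:
    """Rebuild the data when at most one of the m + 1 fragments is None."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("XOR parity tolerates only one missing fragment")
    repaired = list(fragments)
    if missing:
        # XOR of all surviving fragments equals the missing one.
        present = [f for f in fragments if f is not None]
        repaired[missing[0]] = reduce(xor_bytes, present)
    return b"".join(repaired[:m])[:original_len]


if __name__ == "__main__":
    data = b"mostly static back-up data"
    stored = disperse(data, m=4)   # 5 fragments, one per provider
    stored[1] = None               # one provider is unavailable
    print(recover(stored, m=4, original_len=len(data)) == data)
```

The storage cost here is (m + 1)/m times the file size, compared with a factor of n for full replication across n providers.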
Dispersal code
[Diagram: the client disperses F with an (n, m) code across providers P1 to P5; some providers hold fragments of F and the others hold dispersal code parity blocks]
Dispersal code
[Diagram: providers P1 to P5 hold fragments of F and dispersal code parity blocks; a stripe spans the providers, and the stored data carries a POR encoding]
• Check that stripe is a codeword in dispersal code
• POR encoding to correct small corruption
• How to increase file lifetime?
Increasing file lifetime with MACs
[Diagram: a MAC is attached to the data held by each of the providers P1 to P5]
• Can we reduce storage overhead?
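Attaching a MAC to each stored block lets the client detect even a single corrupted block, at the cost of storing one tag per block. A minimal sketch using HMAC-SHA256 from the Python standard library follows; the choice of HMAC-SHA256 and the way the provider number and block index are bound into the tag are assumptions for illustration, not details taken from the slides.

```python
import hashlib
import hmac


def tag_block(key: bytes, provider_id: int, block_index: int, block: bytes) -> bytes:
    """MAC over the block together with its location, so that a valid block
    cannot be moved to another position or provider without detection."""
    message = provider_id.to_bytes(4, "big") + block_index.to_bytes(8, "big") + block
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_block(key: bytes, provider_id: int, block_index: int,
                 block: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = tag_block(key, provider_id, block_index, block)
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = b"client secret key (example only)"
    block = b"fragment data held by provider 3"
    tag = tag_block(key, provider_id=3, block_index=0, block=block)
    print(verify_block(key, 3, 0, block, tag))                 # intact block
    print(verify_block(key, 3, 0, block[:-1] + b"X", tag))     # corrupted block
```

Each tag adds 32 bytes per block in this sketch, which is the overhead the slide's closing question refers to.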
Integrity-protected dispersal code
[Diagram: providers P1 to P5; the Reed-Solomon dispersal code maps message m to parity symbols h_k1(m) and h_k2(m)]
• UHF + PRF
Integrity-protected dispersal code
[Diagram: providers P1 to P5; a PRF value is added to the parity symbols computed from message m]
• MACs embedded into parity symbols
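The "UHF + PRF" idea can be illustrated with a toy example: a parity symbol computed as a polynomial evaluation of the stripe acts as a universal hash of the message, and adding a keyed pseudorandom value turns it into a MAC that occupies the same space as the parity symbol. The sketch below works over a prime field and uses HMAC-SHA256 as the PRF; both choices, and all function names, are assumptions made for readability and should not be taken as the construction from the HAIL paper.

```python
import hashlib
import hmac
from typing import Sequence

P = 2**61 - 1  # Mersenne prime used as the toy field modulus


def uhf(point: int, symbols: Sequence[int]) -> int:
    """Polynomial-evaluation hash: sum(symbols[i] * point**i) mod P (Horner's rule)."""
    acc = 0
    for s in reversed(symbols):
        acc = (acc * point + s) % P
    return acc


def prf(key: bytes, stripe_index: int) -> int:
    """Keyed pseudorandom field element for a given stripe."""
    digest = hmac.new(key, stripe_index.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P


def protected_parity(point: int, prf_key: bytes, stripe_index: int,
                     symbols: Sequence[int]) -> int:
    """Parity symbol with the MAC embedded: UHF output masked by the PRF value."""
    return (uhf(point, symbols) + prf(prf_key, stripe_index)) % P


def verify_stripe(point: int, prf_key: bytes, stripe_index: int,
                  symbols: Sequence[int], parity: int) -> bool:
    """Client-side check that the stored parity matches the message symbols."""
    return protected_parity(point, prf_key, stripe_index, symbols) == parity


if __name__ == "__main__":
    secret_point, prf_key = 123456789, b"prf key (example only)"
    stripe = [3, 1, 4, 1, 5]
    parity = protected_parity(secret_point, prf_key, 0, stripe)
    print(verify_stripe(secret_point, prf_key, 0, stripe, parity))
    print(verify_stripe(secret_point, prf_key, 0, [4, 1, 4, 1, 5], parity))
```

Because the tag replaces the plain parity symbol instead of being stored next to it, the integrity check adds no storage beyond what the dispersal code already requires.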
Current work and open problems
• Proofs of Retrievability
  • Lower bounds akin to Naor and Rothblum’s lower bounds for memory checking
  • What is the cost of file updates?
• HAIL
  • K. Bowers, A. Juels and A. Oprea, “HAIL (High-Availability and Integrity Layer) for Cloud Storage”, CCS 2009
  • Different adversarial models
  • Investigate alternative constructions
  • Supporting file updates