ObliviStore: High Performance Oblivious Cloud Storage
UC Berkeley, UMD
http://www.emilstefanov.net/Research/ObliviousRam/
Cloud Storage • Dropbox • Amazon S3, EBS • Windows Azure Storage • SkyDrive • EMC Atmos • Mozy • iCloud • Google Storage
Data Privacy • Data privacy is a growing concern. • So, many organizations encrypt their data. • But encryption is not enough: access patterns leak sensitive information. • E.g., access patterns can reveal 80% of search queries (Islam et al.)
Oblivious Storage (ORAM) • Goal: Conceal access patterns to remote storage. • An observer cannot distinguish the sequence of read/write operations (Read(x), Write(y, data), Read(z), ...) that the client issues to the untrusted cloud storage from random. • Proposed by Goldreich and Ostrovsky [GO96, OS97]. • Recently: [WS08, PR10, GM10, GMOT11, BMP11, SCSL11, SSS12, GMOT12, KLO12, WR12, LPMRS13, …]
Hybrid Cloud • Public cloud (untrusted): heavyweight, offers scalability; hosts the ORAM nodes. • Private cloud (trusted, e.g., a corporate cloud): lightweight, stores 0.25% of the data; hosts the oblivious load balancer and serves the clients.
Trusted Hardware in the Cloud • The entire storage system backing the ORAM nodes is untrusted, and so is the networking. • The oblivious load balancer runs on a few machines with trusted hardware and serves the clients.
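As a rough illustration of the load-balancing idea (a sketch only; the class and method names below are invented, not ObliviStore's API): each block lives on a uniformly random ORAM node and is re-assigned to a fresh random node whenever it is accessed, so the node contacted for a request reveals nothing about which block was requested.

import secrets

class ObliviousLoadBalancer:
    """Illustrative sketch: dispatch block requests to ORAM nodes so that
    which node is contacted is independent of the access pattern."""

    def __init__(self, oram_nodes):
        self.nodes = oram_nodes      # ORAM node clients (hypothetical interface)
        self.position = {}           # block_id -> index of the node holding it

    def read(self, block_id):
        # Contact the node that currently holds the block; because every block
        # was placed on a uniformly random node, this choice leaks nothing.
        node = self.nodes[self.position[block_id]]
        block = node.read(block_id)
        # Re-assign the block to a freshly chosen random node for its next access.
        new_idx = secrets.randbelow(len(self.nodes))
        self.position[block_id] = new_idx
        self.nodes[new_idx].queue_write(block_id, block)
        return block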
Contributions • Built end-to-end oblivious storage system. • Open source code available. • Fully asynchronous design – no blocking on I/O • Efficiently handles thousands of simultaneous operations. • High performance (throughput & response time) • High throughput over high latency connections. • Much faster than existing systems. • Oblivious load balancing technique for distributing the ORAM workload. • Optimized for both SSDs and HDDs.
Performance Challenges • Client ↔ untrusted cloud: bandwidth cost, response time, block size. • Server storage (HDD/SSD): storage I/O cost, seeks. • Client: client storage. • Focus on exact (not asymptotic) performance. • Scalability to multiple servers.
Security Challenges • Goals: • Oblivious asynchronous scheduling: scheduling should not leak private information. • Oblivious load balancing across multiple machines: load distribution should be independent of the access pattern. • Adversary can: • Observe raw storage locations accessed. • Observe network traffic patterns. • Maliciously delay storage and network I/O. • Attempt to corrupt data. • etc.
Partition • Based on the Goldreich-Ostrovsky scheme. • Level i has 2^i blocks. • Each partition has a logarithmic number of levels.
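A tiny sketch of the level geometry, assuming (as in Goldreich-Ostrovsky style hierarchies) that level i holds 2^i blocks:

import math

def partition_levels(partition_capacity):
    """Real-block capacity of each level of one partition,
    assuming level i holds 2**i blocks."""
    num_levels = math.ceil(math.log2(partition_capacity)) + 1
    return [2 ** i for i in range(num_levels)]

# e.g. a partition holding ~sqrt(N) blocks when N = 2**20 blocks in total:
print(partition_levels(2 ** 10))   # [1, 2, 4, ..., 1024]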
Reading from a Partition • The client reads one block from each filled level on the server. • One of them is the real block; the rest are dummies.
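A minimal sketch of this read path (the partition/level method names are hypothetical): one block is fetched from every filled level, the real one from the level that holds the requested block and a fresh dummy from every other level, so the shape of the access is the same no matter which block is requested.

def read_from_partition(partition, block_id):
    """Fetch one block from each filled level; only one is the real block."""
    result = None
    for level in partition.filled_levels():
        if level.contains(block_id):
            addr = level.position_of(block_id)   # location of the real block
        else:
            addr = level.next_dummy()            # a never-read-before dummy
        block = partition.server_fetch(level, addr)
        if level.contains(block_id):
            result = block
    return result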
Writing to a Partition (shuffling) • Shuffle the consecutively filled levels: the client downloads and shuffles their blocks. • Write them into the next unfilled level on the server.
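A sketch of the shuffle step under the same hypothetical partition interface; the real system also re-encrypts every block before writing it back.

import secrets

def shuffle_into_next_level(partition, new_block):
    """Merge the consecutively filled levels with the incoming block,
    permute them randomly, and write the result into the next unfilled level."""
    blocks = [new_block]
    filled = partition.consecutively_filled_levels()     # levels 0..k-1
    for level in filled:
        blocks.extend(partition.read_level(level))        # download level contents
    # Random permutation (Fisher-Yates); the real system also re-encrypts.
    for i in range(len(blocks) - 1, 0, -1):
        j = secrets.randbelow(i + 1)
        blocks[i], blocks[j] = blocks[j], blocks[i]
    target = partition.next_unfilled_level()
    partition.write_level(target, blocks)                 # upload shuffled level
    for level in filled:
        partition.mark_empty(level)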
Challenge • Parallelism • Overlapping reading & shuffling • Maintaining low client storage • Preserving security
Partitions (system architecture)
• Client-side components: ORAM Main (Read(blockId) / Write(blockId, block)), Partition Reader (ReadPartition(partition, blockId)), Background Shuffler, Partition States, Eviction Cache (Fetch(blockId) / Store(partition, block)), Storage Cache (CacheIn(addr) / CacheOut(addr), Fetch(addr) / Store(addr, block)), and Semaphores (increment / decrement). The server stores the partitions.
• ORAM Read/Write requests enter the system through ORAM Main.
• The requests are then assigned to partitions.
• The partition reader reads levels of the partitions.
• The background shuffler writes and shuffles levels of the partitions.
• Semaphores bound the client memory.
• The storage cache temporarily stores data for the background shuffler and helps ensure consistency.
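A schematic of how these components fit together; the method names follow the diagram, but the glue code is only a sketch, not ObliviStore's implementation.

class OramMain:
    """Sketch of the entry point: Read(blockId) / Write(blockId, block)."""

    def __init__(self, partition_reader, background_shuffler, eviction_cache):
        self.partition_reader = partition_reader
        self.background_shuffler = background_shuffler
        self.eviction_cache = eviction_cache
        self.position_map = {}       # block_id -> partition currently holding it

    def read(self, block_id):
        partition = self.position_map[block_id]
        # The partition reader fetches one block per filled level of that partition.
        block = self.partition_reader.read_partition(partition, block_id)
        # The block waits in the eviction cache until the background shuffler
        # writes it back into a freshly chosen random partition.
        self.eviction_cache.store(partition, block)
        # Every ORAM operation also creates work for the background shuffler,
        # whose client memory use is bounded by the semaphores.
        self.background_shuffler.add_job(partition)
        return block

    def write(self, block_id, new_block):
        # A write looks exactly like a read to the server; only the cached
        # copy that will later be evicted is replaced.
        self.read(block_id)
        self.eviction_cache.store(self.position_map[block_id], new_block)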
Background Shuffler • Each ORAM Read/Write operation creates a shuffling job. • A job consists of (on average): reading blocks from the consecutively filled levels, shuffling them locally, and writing the blocks back, along with 1 additional block: the block to be written that is associated with the shuffle job.
Without Pipelining • Without pipelining, shuffle jobs are latency limited: on the order of log N round-trips per ORAM operation. • Example: 50 ms latency, 1 TB ORAM, 4 KB blocks (roughly 30 round-trips per operation). • Writing a 1 MB file takes 256 ORAM operations. • Total time: about 50ms * 256 * 30 = 384 seconds. • Without pipelining it would take over 6 minutes to write a 1 MB file!
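The arithmetic behind this example, as a quick check:

latency_s = 0.050                      # 50 ms round-trip latency
file_bytes = 1 * 1024 * 1024           # 1 MB file
block_bytes = 4 * 1024                 # 4 KB ORAM blocks
ops = file_bytes // block_bytes        # = 256 ORAM operations
round_trips_per_op = 30                # roughly log N, without pipelining
print(ops, latency_s * ops * round_trips_per_op)   # 256 operations, 384.0 seconds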
Pipelining Across One Job • Pipelining the I/O operations within each job reduces its round-trips from about log N to a small constant (one batch of reads, then one batch of writes). • Still not enough: about 15 seconds for a 1 MB file even when bandwidth is plentiful. • In practice it is even worse: the distribution of job sizes is highly skewed. • Need to pipeline I/Os across multiple shuffle jobs.
Asynchronous Shuffling Pipeline • Each shuffle job goes through: start reads → complete all reads → shuffle locally → start writes → complete all writes. • Multiple shuffle jobs (1, 2, 3, ...) overlap in time. • Memory resources are reserved before reading blocks and released after the blocks are written. • Note: meanwhile, blocks may be read by the partition reader.
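A minimal asyncio sketch of this pipeline: each job reserves memory, issues all of its reads concurrently, shuffles locally, issues all of its writes concurrently, and then releases the memory, while many jobs are in flight at once. For simplicity the semaphore here counts whole jobs rather than individual blocks, and none of this is ObliviStore's actual code.

import asyncio
import random

async def fetch_block(addr):
    await asyncio.sleep(0.05)                  # simulated 50 ms round-trip
    return ("block", addr)

async def store_block(addr, block):
    await asyncio.sleep(0.05)

async def shuffle_job(mem, addrs):
    async with mem:                            # reserve memory before reading
        blocks = list(await asyncio.gather(    # start + complete all reads
            *(fetch_block(a) for a in addrs)))
        random.shuffle(blocks)                 # shuffle locally
        await asyncio.gather(                  # start + complete all writes
            *(store_block(a, b) for a, b in zip(addrs, blocks)))
    # memory released here, after the writes finish

async def main():
    mem = asyncio.Semaphore(4)                 # at most 4 jobs' blocks buffered
    jobs = (shuffle_job(mem, range(8 * i, 8 * i + 8)) for i in range(10))
    await asyncio.gather(*jobs)                # many jobs overlap in time

asyncio.run(main())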
Semaphores • Carefully designed semaphores • Enforce bound on client memory. • Control de-amortized shuffling speed. • Independent of the access pattern. • Eviction • Unshuffled blocks that were recently accessed. • Early cache-ins • Blocks read during shuffling of a partition. • Shuffling buffer • Blocks currently being shuffled. • Shuffling I/O • Pending work for the shuffler.
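Restating the four semaphores as a sketch of the bookkeeping (the names and values below are illustrative only, not ObliviStore's actual parameters):

# One counter per resource class; the initial values bound client memory.
SEMAPHORE_LIMITS = {
    "eviction":         2_000,   # unshuffled blocks that were recently accessed
    "early_cache_ins":  2_000,   # blocks read while their partition is shuffling
    "shuffling_buffer": 4_000,   # blocks currently being shuffled
    "shuffling_io":     8_000,   # pending I/O work for the shuffler
}

# Security-relevant invariant: the amounts by which these counters are
# incremented or decremented depend only on public quantities (level sizes,
# number of partitions), never on which logical blocks were accessed.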
Security • Secure in the malicious model. • Adversary only observes (informally): • The behavior of a synchronous system, i.e., without ObliviStore's optimizations, which is proven secure. • Semaphore values: independent of the access pattern. • Timings: independent of the access pattern. • Security proof in the full online paper.
Performance 1 node (2x1TB SSD) (300 GB ORAM) (50ms simulated client latency) Speed: 3.2 MB/s
Scalability 10 nodes (2x1TB SSD each) (3 TB ORAM) (50ms simulated client latency) Speed: 31.5 MB/s Response time: 66ms (full load)
HDD “Friendly” 4 to 10 seeks per operation Works well on both SSDs and HDDs
Comparison to other ORAM implementations • About 17 times higher throughput than PrivateFS (under a very similar configuration).
Comparison to other ORAM implementations • Lorch et al. also implemented ORAM, built on top of real-world secure processors. • Lots of overhead from the limitations of secure processors: very limited I/O bandwidth and very limited computation capabilities. • Many other ORAM constructions exist, but not many full end-to-end implementations.
Conclusion • Fully asynchronous. • High performance. • Full end-to-end implementation (open source). • Already been used for mining biometric data (Bringer et al.). Thank you! http://www.emilstefanov.net/Research/ObliviousRam/