Providing Secure Storage on the Internet Barbara Liskov & Rodrigo Rodrigues MIT CSAIL April 2005
Internet Services • Store critical state • Are attractive targets for attacks • Must continue to function correctly in spite of attacks and failures
Replication Protocols • Allow continued service in spite of failures • Failstop failures • Byzantine failures • Byzantine failures really happen! • Malicious attacks
Internet Services 2 • Very large scale • Amount of state • Number of users • Implies lots of servers • Must be dynamic • System membership changes over time
BFT-LS • Provide support for Internet services • Highly available and reliable • Very large scale • Changing membership • Automatic reconfiguration • Avoid operator errors • Extending replication protocols
Outline • Application structure • MS specification • MS implementation • Application methodology • Performance and analysis
System Model • Many servers, clients • Service state is partitioned among servers • Each “item” has a replica group • Example applications: file systems, databases
[Figure: clients (C) and servers (S) connected by an unreliable network]
[Figure] Client accesses current replica group
[Figure] Client accesses new replica group
[Figure] Client contacts wrong replica group
The Membership Service (MS) • Reconfigures automatically to reduce operator errors • Provides accurate membership information that nodes can agree on • Ensures clients are up-to-date • Works at large scale
System runs in Epochs • Periods of time, e.g., 6 hours • Membership is static during an epoch • During epoch e, the MS computes membership for epoch e+1 • Epoch duration is a system parameter • Assumption: no more than f failures in any replica group while it is useful
Server IDs • Ids chosen by MS • Consistent hashing • Very large circular id space
Membership Operations • Insert and delete node • Admission control • Trusted authority produces a certificate • Insert certificate includes • ip address, public key, random number, and epoch range • MS assigns the node id h(ip, k, n)
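The id assignment can be illustrated with a small sketch. SHA-1 and the exact field encoding are my assumptions; the slide only gives id = h(ip, k, n).

import hashlib

def node_id(ip: str, public_key: bytes, nonce: int) -> int:
    """Map the certificate fields (ip, public key, random number) onto the ring."""
    data = ip.encode() + public_key + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha1(data).digest(), "big")   # 160-bit circular id space

# Two distinct certificates land at effectively random points on the ring.
print(hex(node_id("18.26.4.9", b"pk-A", 42)))
print(hex(node_id("18.26.4.10", b"pk-B", 7)))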
Monitoring • MS monitors the servers • Sends probes (containing nonces) • Some responses must be signed • Delayed response to failures • Timing of probes and number of missed probes are system parameters • Byzantine-faulty (BF) nodes: code attestation
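A toy sketch of the probing loop described above; the interval, the missed-probe threshold, and the server.probe stub are illustrative stand-ins for the unspecified system parameters.

import os, time

PROBE_INTERVAL = 30   # seconds; illustrative value only
MAX_MISSED = 5        # consecutive misses before proposing removal; also illustrative

def monitor(server, propose_eviction):
    """MS-side loop: nonce'd probes, delayed reaction to missed responses."""
    missed = 0
    while True:
        nonce = os.urandom(16)
        reply = server.probe(nonce, timeout=PROBE_INTERVAL)   # hypothetical stub
        if reply is not None and reply.nonce == nonce:        # some replies must also be signed
            missed = 0
        else:
            missed += 1
        if missed >= MAX_MISSED:       # delayed response to failures
            propose_eviction(server)   # still needs agreement (see Decision Making)
            return
        time.sleep(PROBE_INTERVAL)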
Ending Epochs • Stop epoch after fixed time • Compute the next configuration: epoch number, adds, and deletes • Sign it • MS has a well-known public key • Propagated to all nodes • Over a tree plus gossip
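A minimal data sketch of the signed configuration for epoch e+1; the field names and encoding are assumptions, only the contents (epoch number, adds, deletes, MS signature) come from the slide.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Configuration:
    epoch: int                                            # epoch number e+1
    adds: List[bytes] = field(default_factory=list)       # insert certificates for new nodes
    deletes: List[int] = field(default_factory=list)      # ids of nodes removed this epoch
    signature: bytes = b""                                # made with the shared MS private key

    def body(self) -> bytes:
        # Illustrative encoding only; the real wire format is not given in the talk.
        return repr((self.epoch, self.adds, self.deletes)).encode()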
Guaranteeing Freshness • Client sends a challenge <nonce> to the MS • MS replies with <nonce, epoch #>σMS, signed with its key • The reply gives the client a time period T during which it may execute requests • T is calculated using the client's clock
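A sketch of the challenge-response, assuming hypothetical ms.challenge and verify_ms_signature helpers; the key point is that T is measured on the client's own clock, so no synchronized clocks are required.

import os, time

LEASE_SECONDS = 60   # illustrative; the real T is a system parameter

def refresh(ms, verify_ms_signature):
    nonce = os.urandom(16)
    start = time.monotonic()                 # read the client clock before sending
    reply = ms.challenge(nonce)              # hypothetical RPC stub
    assert reply.nonce == nonce              # response is fresh
    assert verify_ms_signature(reply)        # signed under the well-known MS key
    lease_expiry = start + LEASE_SECONDS     # T is computed on the client's own clock
    return reply.epoch, lease_expiry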
Implementing the MS • At a single dedicated node • Single point of failure • At a group of 3f+1 • Running BFT • No more than f failures in system lifetime • At the servers themselves • Reconfiguring the MS
System Architecture • All nodes run the application • 3f+1 of them run the MS
Implementation Issues • Nodes run BFT • State machine replication (e.g., add, delete) • Decision making • Choosing MS membership • Signing
Decision Making • Each replica probes independently • Removing a node requires agreement • One replica proposes • 2f+1 must agree • Then can run the delete operation • Ending an epoch is similar
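A sketch of the eviction decision: the 2f+1 threshold and the propose/agree flow come from the slide, while the surrounding stubs (confirms_failure, run_delete_operation) are hypothetical.

F = 1                 # illustrative; the MS runs on 3f+1 replicas
QUORUM = 2 * F + 1    # endorsements needed before the delete operation runs

def try_remove(node_id, i_saw_it_fail, peers, run_delete_operation):
    if not i_saw_it_fail(node_id):
        return                                  # this replica has no evidence of failure
    endorsements = 1                            # the proposer counts itself
    for peer in peers:                          # the other MS replicas probe independently
        if peer.confirms_failure(node_id):      # hypothetical stub
            endorsements += 1
    if endorsements >= QUORUM:
        run_delete_operation(node_id)           # executed through the BFT state machine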
Moving the MS • Needed to handle MS node failures • To reduce attack opportunity • Move must be unpredictable • Secure multi-party coin toss • Next replicas are h(c,1), …, h(c,3f+1)
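A sketch of relocating the MS once the multi-party coin toss yields c. The slide only names the hash points h(c,1), …, h(c,3f+1); mapping each point to its clockwise successor among live servers, and SHA-1, are my assumptions.

import hashlib

RING = 1 << 160

def h(c: bytes, i: int) -> int:
    return int.from_bytes(hashlib.sha1(c + i.to_bytes(4, "big")).digest(), "big")

def next_ms_replicas(c: bytes, live_ids, f: int):
    """Pick the new 3f+1 MS committee from the unpredictable coin c."""
    chosen = []
    for i in range(1, 3 * f + 2):
        point = h(c, i)
        successor = min(live_ids, key=lambda nid: (nid - point) % RING)
        if successor not in chosen:
            chosen.append(successor)
    return chosen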
Signing • Configuration must be signed • There is a well-known public key • Proactive secret sharing • MS replicas have shares of the private key • f+1 shares needed to sign • Keys are re-shared when the MS moves
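A toy (f+1)-of-n Shamir sharing sketch, included only to show why f+1 shares suffice. Real proactive secret sharing and threshold signing never reconstruct the MS private key in one place and periodically refresh the shares; this toy omits both.

import random

P = 2**127 - 1   # a Mersenne prime, large enough for a demo secret

def split(secret, f, n):
    coeffs = [secret] + [random.randrange(P) for _ in range(f)]   # random degree-f polynomial
    return [(x, sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0; any f+1 shares determine the secret.
    total = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

shares = split(123456789, f=1, n=4)           # 3f+1 = 4 MS replicas, f = 1
assert reconstruct(shares[:2]) == 123456789   # any f+1 = 2 shares suffice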
Changing Epochs: Summary of Steps • Run the endEpoch operation on the state machine • Select new MS replicas • Share refreshment • Sign the new configuration • Discard old shares
Example Service • Any replicated service • Dynamic Byzantine Quorums (dBQS) • Read/Write interface to objects • Two kinds of objects • Mutable public-key objects • Immutable content-hash objects
dBQS Object Placement • Consistent hashing • 3f+1 successors of the object id are responsible for the object [ring figure]
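A sketch of placement under consistent hashing: the replica group is the 3f+1 servers whose ids succeed the object id on the ring (the successor rule is from the slide; the SHA-1 sized id space is an assumption).

import hashlib

RING = 1 << 160

def object_id(content: bytes) -> int:
    return int.from_bytes(hashlib.sha1(content).digest(), "big")

def replica_group(obj_id: int, server_ids, f: int):
    """Return the 3f+1 servers that succeed obj_id in clockwise order on the ring."""
    return sorted(server_ids, key=lambda sid: (sid - obj_id) % RING)[: 3 * f + 1]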
Byzantine Quorum Operations • Public-key objects contain • State, signature, version number • Quorum is 2f+1 replicas • Write: • Phase 1: client reads to learn the highest v# • Phase 2: client writes with a higher v# • Read: • Phase 1: client gets the value with the highest v# • Phase 2: write-back if some replicas have a smaller v#
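A sketch of the two-phase public-key write against 2f+1 quorums; the replica stubs (read_version, store) and the signing hook are hypothetical. Reads mirror it: fetch the highest-version value, then write it back to replicas that are behind.

F = 1
QUORUM = 2 * F + 1    # out of 3f+1 replicas

def write(replicas, new_state, sign):
    # Phase 1: learn the highest version number held by the replicas.
    versions = [r.read_version() for r in replicas]              # hypothetical stubs
    highest = max((v for v in versions if v is not None), default=0)
    # Phase 2: store the new state under a higher, signed version number.
    v = highest + 1
    acks = sum(1 for r in replicas if r.store(new_state, v, sign(new_state, v)))
    assert acks >= QUORUM, "write did not reach a quorum"
    return v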
dBQS Algorithms – Dynamic Case • Tag all messages with epoch numbers • Servers reject requests for the wrong epoch • Clients execute each phase entirely within one epoch • Must be holding a valid challenge response • Servers upgrade to the new configuration • If needed, perform state transfer from the old group • A methodology
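A sketch of the server-side epoch check in the dynamic case; the message layout and method names are illustrative, while the behavior (reject wrong-epoch requests, upgrade with state transfer) follows the bullets above.

class DBQSServer:
    """Sketch of a dBQS replica that enforces the current epoch on every request."""

    def __init__(self, epoch):
        self.epoch = epoch
        self.store = {}

    def handle(self, request):
        if request["epoch"] != self.epoch:
            # Reject and report the current epoch so the stale party can catch up.
            return {"status": "wrong-epoch", "epoch": self.epoch}
        # ... execute the read or write phase against the local store ...
        return {"status": "ok"}

    def upgrade(self, new_epoch, old_group_state=None):
        self.epoch = new_epoch
        if old_group_state is not None:        # state transfer from the old replica group
            self.store.update(old_group_state)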
Evaluation • Implemented the MS and two example services • Ran a set of experiments on PlanetLab, RON, and a local-area network
MS Scalability • Probes – use sub-committees • Leases – use aggregation • Configuration distribution – use diffs and distribution trees
Time to Reconfigure • Time to reconfigure is small • Variability stems from PlanetLab nodes • Only used f = 1, a limitation of the APSS protocol
Failure-free Computation • Depends on no more than f failures while a group is useful • How likely is this?
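A back-of-the-envelope answer to "how likely is this?": assuming an independent per-node failure probability p during the window in which the group is useful (both p and the independence are illustrative assumptions), the chance that a 3f+1 group exceeds f failures is a binomial tail.

from math import comb

def p_too_many_failures(f: int, p: float) -> float:
    """P(more than f of the 3f+1 group members fail), nodes failing independently."""
    n = 3 * f + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(f + 1, n + 1))

# Illustrative numbers only: with f = 1 and p = 1% the bad case is about 0.06%.
print(p_too_many_failures(1, 0.01))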
Conclusion • Providing support for Internet services • Scalable membership service • Reconfiguring the MS • Dynamic replication algorithms • dBQS – a methodology • Future research • Proactive secret sharing • Scalable applications
Providing Secure Storage on the Internet Barbara Liskov and Rodrigo Rodrigues MIT CSAIL April 2005