
Providing Secure Storage on the Internet


Presentation Transcript


  1. Providing Secure Storage on the Internet Barbara Liskov & Rodrigo Rodrigues MIT CSAIL April 2005

  2. Internet Services • Store critical state • Are attractive targets for attacks • Must continue to function correctly in spite of attacks and failures

  3. Replication Protocols • Allow continued service in spite of failures • Failstop failures • Byzantine failures • Byzantine failures really happen! • Malicious attacks

  4. Internet Services 2 • Very large scale • Amount of state • Number of users • Implies lots of servers • Must be dynamic • System membership changes over time

  5. BFT-LS • Provide support for Internet services • Highly available and reliable • Very large scale • Changing membership • Automatic reconfiguration • Avoid operator errors • Extending replication protocols

  6. Outline • Application structure • MS specification • MS implementation • Application methodology • Performance and analysis

  7. System Model [figure: clients (C) and servers (S) communicating over an unreliable network] • Many servers, clients • Service state is partitioned among servers • Each “item” has a replica group • Example applications: file systems, databases

  8. Client accesses the current replica group [figure: client C and the ring of servers]

  9. Client accesses the new replica group [figure: client C and the ring of servers after reconfiguration]

  10. Client contacts the wrong replica group [figure: client C directing its request to a stale replica group]

  11. The Membership Service (MS) • Reconfigures automatically to reduce operator errors • Provides accurate membership information that nodes can agree on • Ensures clients are up-to-date • Works at large scale

  12. System runs in Epochs • Periods of time, e.g., 6 hours • Membership is static during an epoch • During epoch e, MS computes membership for epoch e+1 • Epoch duration is a system parameter • No more than f failures in any replica group while it is useful

  13. Server IDs • Ids chosen by MS • Consistent hashing • Very large circular id space

  14. Membership Operations • Insert and delete node • Admission control • Trusted authority produces a certificate • Insert certificate includes • ip address, public key, random number, and epoch range • MS assigns the node id ( h(ip,k,n) )
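
The id assignment h(ip, k, n) above can be sketched directly. The sketch below assumes SHA-1 and a 160-bit circular id space (matching the "very large circular id space" of slide 13); the exact hash function and byte encoding are assumptions, not taken from the slides.

```python
import hashlib

ID_BITS = 160  # "very large circular id space" (slide 13); exact size assumed

def node_id(ip: str, public_key: bytes, nonce: bytes) -> int:
    """Sketch of the MS assigning a node id as h(ip, k, n) (slide 14)."""
    h = hashlib.sha1()
    h.update(ip.encode())   # ip address from the insert certificate
    h.update(public_key)    # the node's public key k
    h.update(nonce)         # the random number n from the certificate
    return int.from_bytes(h.digest(), "big") % (1 << ID_BITS)
```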

  15. Monitoring • The MS monitors the servers • Sends probes (containing nonces) • Some responses must be signed • Delayed response to failures • The timing of probes and the number of missed probes are system parameters • Byzantine-faulty (BF) nodes (code attestation)
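
A minimal sketch of the delayed-response rule: an MS replica only flags a server for eviction after several consecutive missed probes. The threshold value and the class and field names are hypothetical; real probes carry nonces and some replies must be signed, which is abstracted here as an already-verified `responded` flag.

```python
from dataclasses import dataclass

MISSED_PROBE_LIMIT = 3  # missed probes tolerated; a system parameter (value assumed)

@dataclass
class ProbeMonitor:
    """One MS replica's view of one monitored server (slide 15)."""
    missed: int = 0

    def record_probe(self, responded: bool) -> bool:
        """Return True if the server should be proposed for removal."""
        self.missed = 0 if responded else self.missed + 1
        # delayed response to failures: act only after several missed probes
        return self.missed >= MISSED_PROBE_LIMIT
```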

  16. Ending Epochs • Stop the epoch after a fixed time • Compute the next configuration: epoch number, adds, and deletes • Sign it • The MS has a well-known public key • Propagated to all nodes • Over a tree plus gossip
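
One way to picture the configuration the MS produces at the end of an epoch: the next epoch number plus the adds and deletes, signed under the MS's well-known key. The field layout, encoding, and the `sign` callback are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Configuration:
    """Sketch of the per-epoch configuration certificate (slide 16)."""
    epoch: int
    adds: List[bytes]       # insert certificates of nodes joining next epoch
    deletes: List[int]      # ids of nodes removed
    signature: bytes = b""  # produced with the MS's well-known key pair

def next_configuration(current_epoch: int, adds, deletes,
                       sign: Callable[[bytes], bytes]) -> Configuration:
    cfg = Configuration(current_epoch + 1, list(adds), list(deletes))
    cfg.signature = sign(repr((cfg.epoch, cfg.adds, cfg.deletes)).encode())
    return cfg  # then propagated to all nodes over a tree plus gossip
```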

  17. Guaranteeing Freshness • Client sends a challenge <nonce> to the MS • The response <nonce, epoch #>σMS gives the client a time period T during which it may execute requests • T is calculated using the client's clock
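
The client side of the freshness challenge might look like the sketch below: send a fresh nonce, check the MS signature over <nonce, epoch #>, and start the period T on the client's own clock. The `ms.challenge` and `verify_ms_sig` interfaces and the lease length are hypothetical.

```python
import os
import time

LEASE_SECONDS = 60.0  # length of the freshness period T (value assumed)

def refresh_freshness(ms, verify_ms_sig):
    """Client-side sketch of the freshness challenge (slide 17)."""
    nonce = os.urandom(16)
    started = time.monotonic()              # T is measured on the client's clock
    epoch, signature = ms.challenge(nonce)  # MS replies with <nonce, epoch #>σMS
    if not verify_ms_sig((nonce, epoch), signature):
        raise ValueError("bad MS signature on freshness response")
    return epoch, started + LEASE_SECONDS   # requests may execute until this deadline
```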

  18. Implementing the MS • At a single dedicated node • Single point of failure • At a group of 3f+1 • Running BFT • No more than f failures in system lifetime • At the servers themselves • Reconfiguring the MS

  19. System Architecture • All nodes run the application • 3f+1 run the MS

  20. Implementation Issues • Nodes run BFT • State machine replication (e.g., add, delete) • Decision making • Choosing MS membership • Signing

  21. Decision Making • Each replica probes independently • Removing a node requires agreement • One replica proposes • 2f+1 must agree • Then can run the delete operation • Ending an epoch is similar
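
A sketch of the removal rule: since each replica probes independently, a proposed eviction only reaches the replicated state machine once 2f+1 MS replicas endorse it. The `state_machine.execute` interface and f = 1 are assumptions.

```python
F = 1                  # assumed failure bound; the MS runs on 3f+1 replicas
QUORUM = 2 * F + 1     # endorsements needed before acting (slide 21)

def try_remove(node_id: int, endorsements: set, state_machine) -> bool:
    """Run the delete operation only after enough replicas also observed the failure."""
    if len(endorsements) >= QUORUM:
        state_machine.execute("delete", node_id)  # hypothetical BFT state-machine op
        return True
    return False       # not enough agreement yet; keep probing
```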

  22. Moving the MS • Needed to handle MS node failures • To reduce attack opportunity • Move must be unpredictable • Secure multi-party coin toss • Next replicas are h(c,1), …, h(c,3f+1)
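
Given the jointly tossed coin c, the new MS replicas can be derived as on the slide: hash the points h(c,1) … h(c,3f+1) and take each point's successor on the id ring. SHA-1, the encoding of i, and f = 1 are assumptions; deduplication of successors is omitted.

```python
import hashlib
from bisect import bisect_left

def new_ms_replicas(coin: bytes, ring: list, f: int = 1) -> list:
    """Sketch of slide 22: next MS replicas succeed h(c,1) ... h(c,3f+1) on the ring.
    `ring` is the sorted list of current server ids; duplicates are not handled here."""
    replicas = []
    for i in range(1, 3 * f + 2):
        digest = hashlib.sha1(coin + i.to_bytes(4, "big")).digest()
        point = int.from_bytes(digest, "big") % (1 << 160)
        idx = bisect_left(ring, point) % len(ring)   # successor on the circular id space
        replicas.append(ring[idx])
    return replicas
```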

  23. Signing • Configuration must be signed • There is a well-known public key • Proactive secret sharing • MS replicas have shares of the private key • f+1 shares needed to sign • Keys are re-shared when the MS moves
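
The "f+1 shares needed to sign" property rests on threshold secret sharing; the Shamir-style sketch below shows why any f+1 points determine the secret while f reveal nothing. The prime field is an arbitrary choice, a real threshold scheme signs without ever reconstructing the key in one place, and proactive refresh (re-sharing when the MS moves) is omitted.

```python
import random

P = 2**127 - 1  # prime field (choice assumed); real schemes use the signing key's group

def split(secret: int, f: int, n: int):
    """Split `secret` into n shares so that any f+1 of them recover it (slide 23)."""
    coeffs = [secret] + [random.randrange(P) for _ in range(f)]  # degree-f polynomial
    return [(x, sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at 0 over any f+1 shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P  # pow(..., P-2, P) inverts den
    return secret

# e.g. with f = 1 and 3f+1 = 4 replicas: recover(split(s, 1, 4)[:2]) == s
```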

  24. Changing Epochs: Summary of Steps • Run the endEpoch operation on state machine • Select new MS replicas • Share refreshment • Sign new configuration • Discard old shares

  25. Example Service • Any replicated service • Dynamic Byzantine Quorums dBQS • Read/Write interface to objects • Two kinds of objects • Mutable public-key objects • Immutable content-hash objects

  26. dBQS Object Placement • Consistent hashing • The 3f+1 successors of the object id are responsible for the object
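
A sketch of the placement rule under the stated consistent hashing: the replica group for an object is the 3f+1 servers whose ids follow the object id on the ring. The ring representation and f = 1 are assumptions.

```python
from bisect import bisect_left

def replica_group(object_id: int, ring: list, f: int = 1) -> list:
    """The 3f+1 successors of the object id store the object (slide 26).
    `ring` is the sorted list of server ids for the current epoch."""
    start = bisect_left(ring, object_id) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(3 * f + 1)]
```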

  27. Byzantine Quorum Operations • Public-key objects contain • State, signature, version number • Quorum is 2f+1 replicas • Write: • Phase 1: client reads to learn the highest v# • Phase 2: client writes with a higher v# • Read: • Phase 1: client gets the value with the highest v# • Phase 2: write-back if some replicas have a smaller v#
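
The two-phase write and two-phase read above can be sketched as client-side pseudocode over hypothetical replica stubs. Signature verification, epoch tags (slide 28), and waiting for only 2f+1 matching replies are simplified away; the method names are assumptions.

```python
F = 1
QUORUM = 2 * F + 1  # quorum size for public-key objects (slide 27)

def write(replicas, key, new_state, sign):
    """Two-phase write sketch for a public-key object."""
    # Phase 1: read from a quorum to learn the highest version number
    versions = [r.get_version(key) for r in replicas[:QUORUM]]
    v = max(versions) + 1
    # Phase 2: write the new state with the higher v#, signed by the object's owner
    for r in replicas:
        r.put(key, new_state, v, sign((key, new_state, v)))

def read(replicas, key):
    """Two-phase read sketch: take the highest v#, then write back where needed."""
    replies = [r.get(key) for r in replicas[:QUORUM]]        # (state, v#, sig) triples
    state, v, sig = max(replies, key=lambda reply: reply[1])
    # Phase 2: write-back to replicas that returned a smaller v#
    for r, (_, rv, _) in zip(replicas[:QUORUM], replies):
        if rv < v:
            r.put(key, state, v, sig)
    return state
```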

  28. dBQS Algorithms – Dynamic Case • Tag all messages with epoch numbers • Servers reject requests for wrong epoch • Clients execute phases entirely in an epoch • Must be holding a valid challenge response • Servers upgrade to new configuration • If needed, perform state transfer from old group • A methodology
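
The server-side epoch check is simple to sketch: requests tagged with the wrong epoch are rejected, which forces the client back to the MS for the current configuration before retrying the phase. The message fields are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Request:
    epoch: int  # every dBQS message is tagged with the sender's epoch (slide 28)
    op: str

def handle(server_epoch: int, req: Request) -> dict:
    """Reject requests for the wrong epoch; otherwise proceed as in the static protocol."""
    if req.epoch != server_epoch:
        return {"ok": False, "error": "wrong-epoch", "current": server_epoch}
    return {"ok": True}  # ... execute req.op against the object store
```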

  29. Evaluation • Implemented the MS and two example services • Ran a set of experiments on PlanetLab, RON, and a local-area network

  30. MS Scalability • Probes – use sub-committees • Leases – use aggregation • Configuration distribution • Use diffs and distribution trees

  31. Fetch Throughput

  32. Time to reconfigure • Time to reconfigure is small • Variability stems from PlanetLab nodes • Only used f = 1, a limitation of the APSS protocol

  33. dBQS Performance

  34. Failure-free Computation • Depends on no more than f failures while a group is useful • How likely is this?

  35. Probability of Choosing a Bad Group

  36. Probability of Choosing a Bad Group

  37. Probability that the System Fails

  38. Conclusion • Providing support for Internet services • Scalable membership service • Reconfiguring the MS • Dynamic replication algorithms • dBQS – a methodology • Future research • Proactive secret sharing • Scalable applications

  39. Providing Secure Storage on the Internet Barbara Liskov and Rodrigo Rodrigues MIT CSAIL April 2005
