
PeerReview: Practical Accountability for Distributed Systems

Explore why accountability matters in distributed systems: detecting faults, identifying faulty nodes, and letting correct nodes prove their correctness. Learn about the challenges, the advantages, and a practical approach that achieves accountability without relying on a trusted entity. See how tamper-evident logs and witness nodes help ensure system reliability and integrity.

jpascarella

Presentation Transcript


  1. PeerReview: Practical Accountability for Distributed Systems (SOSP 2007)

  2. Why have Accountability? • Nodes can fail • An attacker can compromise a node • Accidental misconfiguration • Multiple administrative domains

  3. Distributed state, incomplete information • General case: Multiple admins with different interests (figure source: www.sosp2007.org/talks/sosp118-haeberlen.ppt)

  4. What is Accountability? • Fault = Anything besides expected behavior • Ideal Accountability: • Detect a fault • Identify the faulty node (Completeness) • Correct node can prove its correctness (Accuracy) • Expose the faulty node

  5. A few advantages: • Deterring faults • Augmenting fault-tolerant systems • Augmenting best-effort systems

  6. Challenges: What can/cannot be detected? • Unobservable faults: • Node’s internal state • CPU overheating, failed display • Need trusted probes! • Observable faults: • Causally affect a correct node • No trusted entity required! • How to verify that a node reports correctly? • How to distinguish omission from long delays?

  7. Request • Grant • Release

  8. Accountability: How much can we do? • Completeness: • Eventually suspected • Eventually exposed • Accuracy • No correct node is forever suspected • No correct node ever exposed by a correct node

  9. FullReview • Characteristics: • A trusted entity exists • All messages go through trusted entity • Each node maintains a log for every other node • Check the log • Suspect/Expose a deviant node • Complete? • Accurate? • Practical?

  10. PeerReview: Practical Accountability • No trusted entity • Nodes only keep their own log • May retrieve other nodes' logs when needed • Logs are tamper-evident • Witness nodes: check correctness of a node • Challenge/Response protocol
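The challenge/response idea can be illustrated with a small sketch of how a witness might track its opinion of one node. This is an assumption-laden illustration, not the protocol from the talk: the names `Status`, `WitnessView`, `challenge_unanswered`, `response_received`, and `proof_of_misbehavior` are hypothetical. It only shows why a slow but correct node is not suspected forever (it can still answer), while verifiable evidence exposes a node permanently.

```python
from enum import Enum


class Status(Enum):
    TRUSTED = "trusted"
    SUSPECTED = "suspected"   # has not yet answered a challenge
    EXPOSED = "exposed"       # verifiable evidence of misbehavior exists


class WitnessView:
    """One witness's opinion of one node (hypothetical helper class)."""

    def __init__(self):
        self.status = Status.TRUSTED
        self.open_challenges = set()

    def challenge_unanswered(self, challenge_id):
        # An unanswered challenge cannot be told apart from a long delay,
        # so the node is only suspected, never exposed, on this basis.
        if self.status is Status.EXPOSED:
            return
        self.open_challenges.add(challenge_id)
        self.status = Status.SUSPECTED

    def response_received(self, challenge_id):
        # A correct node can always clear suspicion by answering.
        self.open_challenges.discard(challenge_id)
        if self.status is Status.SUSPECTED and not self.open_challenges:
            self.status = Status.TRUSTED

    def proof_of_misbehavior(self):
        # Evidence that anyone can check is final in this sketch.
        self.status = Status.EXPOSED
```

In this sketch a late answer moves the node back to `TRUSTED`, which mirrors the accuracy goal that no correct node is suspected forever.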

  11. System Model • Each node modeled as: • A state machine • A detector • An application • Assumptions: • Deterministic state machine • Correct nodes can communicate • A reference implementation of the node software • A secure signature mechanism is available

  12. Overview • Nodes maintain a log of I/O • Witnesses of a node audit its log • If faulty, gather evidence • Make it known

  13. Tamper-evident logs • Append-only list of I/O • Log entries connected in a hash chain • Authenticator: A signed statement by a node • If a node tampers with its log, it will be evident • Logs must be complete • No entries missed • Logs must be correct • No forged entries • No multiple logs
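A minimal sketch of a hash-chained, append-only log with authenticators may help make the slide concrete. Everything here is an assumption made for illustration: the entry layout, the names `Log`, `entry_hash`, `verify_chain`, and `proves_fork`, and above all the use of an HMAC as a stand-in for the node's signature. A real deployment would need a public-key signature so that third parties can verify authenticators without holding the signing key.

```python
import hashlib
import hmac

ZERO = b"\x00" * 32  # hash value that precedes the first entry


def entry_hash(prev: bytes, seq: int, kind: str, payload: bytes) -> bytes:
    """Hash of an entry, chained to the hash of the previous entry."""
    h = hashlib.sha256()
    h.update(prev)
    h.update(seq.to_bytes(8, "big"))
    h.update(kind.encode())
    h.update(hashlib.sha256(payload).digest())
    return h.digest()


def sign(key: bytes, seq: int, top: bytes) -> bytes:
    # Stand-in for a signature (assumption): HMAC over (seq, top hash).
    return hmac.new(key, seq.to_bytes(8, "big") + top, hashlib.sha256).digest()


class Log:
    def __init__(self, key: bytes):
        self.key = key
        self.entries = []  # (seq, kind, payload, hash)
        self.top = ZERO

    def append(self, kind: str, payload: bytes):
        """Append an entry and return an authenticator for the new top."""
        seq = len(self.entries)
        self.top = entry_hash(self.top, seq, kind, payload)
        self.entries.append((seq, kind, payload, self.top))
        return (seq, self.top, sign(self.key, seq, self.top))


def valid_authenticator(auth, key: bytes) -> bool:
    seq, top, tag = auth
    return hmac.compare_digest(tag, sign(key, seq, top))


def verify_chain(entries, auth, key: bytes) -> bool:
    """Check that entries[0..seq] hash up to the authenticated top hash."""
    seq, top, _ = auth
    if not valid_authenticator(auth, key) or len(entries) <= seq:
        return False
    prev = ZERO
    for i in range(seq + 1):
        s, kind, payload, h = entries[i]
        if s != i or entry_hash(prev, s, kind, payload) != h:
            return False
        prev = h
    return prev == top


def proves_fork(a, b, key: bytes) -> bool:
    """Two valid authenticators for the same seq with different hashes."""
    return (valid_authenticator(a, key) and valid_authenticator(b, key)
            and a[0] == b[0] and a[1] != b[1])
```

Because each hash covers the previous one, changing or dropping an earlier entry changes every later hash, so the log no longer matches an authenticator the node already handed out. Two conflicting authenticators for the same sequence number are evidence that the node kept more than one log.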

  14. Fault Detection • Audit • Replay the inputs to a reference implementation • Output == Log ? • Evidence Transfer • Fetch evidence from witnesses [Diagram: logged inputs from the network are replayed through the state machine (Module A, Module B); the replayed output is compared with the logged output (=? / ≠)]
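The audit step can be sketched as a deterministic replay. The toy lock service below (request, grant, release) is a hypothetical reference implementation chosen only for illustration, and the log format of `("in" | "out", message)` pairs and the names `reference_step` and `audit` are assumptions, not taken from the talk. The sketch replays logged inputs through the reference state machine and checks that the logged outputs match what the reference produces.

```python
from collections import deque


def reference_step(holder, msg):
    """Toy deterministic reference: returns (new_holder, outputs)."""
    op, client = msg.split(":")
    if op == "request":
        if holder is None:
            return client, ["grant:" + client]
        return holder, ["busy:" + client]
    if op == "release" and holder == client:
        return None, []
    return holder, []


def audit(log):
    """Replay inputs; return the index of the first bad entry, or None.

    If outputs owed by the reference are missing at the end of the log,
    len(log) is returned.
    """
    holder = None
    expected = deque()  # outputs the reference says must still appear
    for i, (kind, msg) in enumerate(log):
        if kind == "in":
            holder, outs = reference_step(holder, msg)
            expected.extend(outs)
        elif not expected or expected.popleft() != msg:
            return i  # logged output differs from the reference output
    return None if not expected else len(log)
```

This only works because the state machine is assumed to be deterministic: given the same inputs, a correct node must have produced exactly the outputs the reference produces, so any mismatch points at the audited node.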

  15. PeerReview: Applications • Overlay Multicast • Large amounts of data • Freeloaders • Network File System • Latency-sensitive • Data tampering • Message loss in the network • Peer-to-peer email • DoS attack

  16. Results: Multicast with Freeloader

  17. Results: Throughput

  18. Results:

  19. Discussion • What if all witnesses are faulty? • How to choose T_trunc, T_audit, T_buf?
