
HQ Replication: Efficient Quorum Agreement for Reliable Distributed Systems

James Cowling¹, Daniel Myers¹, Barbara Liskov¹, Rodrigo Rodrigues², Liuba Shrira³. ¹MIT CSAIL, ²INESC-ID and Instituto Superior Técnico, ³Brandeis University.



Presentation Transcript


  1. HQ Replication: Efficient Quorum Agreement for Reliable Distributed Systems • James Cowling¹, Daniel Myers¹, Barbara Liskov¹, Rodrigo Rodrigues², Liuba Shrira³ • ¹MIT CSAIL • ²INESC-ID and Instituto Superior Técnico • ³Brandeis University

  2. Byzantine Fault Tolerance • Reliable client-server distributed systems • Server replicated across group of replica machines • General operations • Bounded number f of Byzantine replicas • Must ensure correct system state • Consistent ordering of client operations

  3. State of the Art • Approaches: • State Machine Replication – BFT • 3f+1 replicas • Byzantine Quorums – Q/U • 5f+1 replicas • Increased performance • Degradation when writes contend
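The replica counts on this slide can be tabulated for small f. The following is a minimal sketch; the function name `replicas_required` and the protocol labels used as dictionary keys are illustrative assumptions, not part of either system.

```python
def replicas_required(f: int, protocol: str) -> int:
    """Group size needed to tolerate f Byzantine replicas, per the slide."""
    if f < 0:
        raise ValueError("f must be non-negative")
    # Multipliers taken from the slide: BFT uses 3f+1, Q/U uses 5f+1.
    multiplier = {"BFT": 3, "Q/U": 5}[protocol]
    return multiplier * f + 1


for f in (1, 2, 3):
    print(f, replicas_required(f, "BFT"), replicas_required(f, "Q/U"))
```

For f = 1 this gives 4 replicas for BFT and 6 for Q/U, which matches the number of replicas drawn in the two protocol diagrams later in the talk.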

  4. Contributions • Low overhead Byzantine Fault Tolerance • Performance of Byzantine Quorums without 5f+1 replicas or contention degradation • Hybrid Quorum scheme for Byzantine Fault Tolerance • Quorum approach in normal-case • Use Byzantine agreement to resolve write contention

  5. Outline • Current Approaches • HQ Replication • BFT Improvements • Performance Evaluation • Conclusions

  6. State Machine Replication • BFT - Castro and Liskov TOCS ’02 • Operations ordered by primary • Agreed upon by replicas • [Diagram: message phases Request, Pre-Prepare, Prepare, Commit, Reply between Client, Primary, Replica 2, Replica 3, Replica 4]

  7. Byzantine Quorums • Q/U - Abd-El-Malek et al. SOSP ’05 • Client controlled protocol • Replicas order operations independently • Optimistic • Best case one-phase protocol • Worst case unbounded • Randomized backoff • [Diagram: message phases Update, Reply between Client and Replicas 1 to 6]

  8. Advantages/Disadvantages • Q/U • Good: Best-case performance, One-phase write, Low replica load • Bad: 5f+1 replicas, Degraded performance when writes contend • BFT • Good: 3f+1 replicas, Bounded number of phases • Bad: Higher latency, Quadratic communication

  9. HQ Replication • 3f+1 replicas • Supports general operations • No all-to-all communication in normal-case • BFT used to resolve contention

  10. HQ Replication • One-phase read • Two-phase write • [Diagram: message phases Write1, Write1 OK, Write2, Write2 OK between Client and Replicas 1 to 4]

  11. High-level Write Protocol • Two-phase write protocol • Phase 1: • Client obtains timestamp grant from each replica • Phase 2: • Client forms certificate from 2f+1 matching grants • Sends to replicas to complete write
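The certificate-forming step of phase 2 can be sketched as follows. This is a simplified illustration, not the authors' implementation: the `Grant` tuple, its `replica_id` field, and the function `form_certificate` are hypothetical names, and signature checking is left out entirely.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Grant(NamedTuple):
    client_id: int
    seq_no: int
    op_hash: str
    replica_id: int  # hypothetical field: which replica issued the grant


def form_certificate(grants: List[Grant], f: int) -> Optional[List[Grant]]:
    """Return 2f+1 matching grants from distinct replicas, or None."""
    quorum = 2 * f + 1
    # Group grants by content; keep at most one grant per replica per group.
    by_content: Dict[Tuple[int, int, str], Dict[int, Grant]] = {}
    for g in grants:
        key = (g.client_id, g.seq_no, g.op_hash)
        by_content.setdefault(key, {})[g.replica_id] = g
    for per_replica in by_content.values():
        if len(per_replica) >= quorum:
            return list(per_replica.values())[:quorum]
    return None  # no quorum of matching grants: phase 2 cannot start


matching = [Grant(1, 1, "hash(A)", r) for r in (1, 2, 3)]
conflicting = [
    Grant(1, 1, "hash(A)", 1),
    Grant(1, 1, "hash(A)", 2),
    Grant(2, 1, "hash(B)", 3),
]
print(form_certificate(matching, f=1) is not None)
print(form_certificate(conflicting, f=1))
```

Counting distinct replicas rather than raw grants matters in this sketch, because a single faulty replica could otherwise be counted several times toward the quorum.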

  12. Grants • Promise to execute operation at given sequence number • Assuming agreement from quorum • Grant • Client ID • Object ID • Hash over requested operation • Sequence Number (timestamp) • Replica signature
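A grant with the fields listed above might be represented as shown here. This is only a sketch under stated assumptions: the slide does not say which signature scheme is used, so an HMAC over a per-replica key stands in for the replica signature, and `replica_id`, `make_grant`, and `verify_grant` are hypothetical names introduced for illustration.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    client_id: int
    object_id: int
    op_hash: str      # hash over the requested operation
    seq_no: int       # sequence number (timestamp)
    replica_id: int   # assumption: identifies the signing replica
    signature: str    # stand-in for the replica signature


def _payload(client_id: int, object_id: int, op_hash: str,
             seq_no: int, replica_id: int) -> bytes:
    return f"{client_id}|{object_id}|{op_hash}|{seq_no}|{replica_id}".encode()


def make_grant(replica_id: int, replica_key: bytes, client_id: int,
               object_id: int, operation: bytes, seq_no: int) -> Grant:
    op_hash = hashlib.sha256(operation).hexdigest()
    payload = _payload(client_id, object_id, op_hash, seq_no, replica_id)
    sig = hmac.new(replica_key, payload, hashlib.sha256).hexdigest()
    return Grant(client_id, object_id, op_hash, seq_no, replica_id, sig)


def verify_grant(grant: Grant, replica_key: bytes) -> bool:
    payload = _payload(grant.client_id, grant.object_id, grant.op_hash,
                       grant.seq_no, grant.replica_id)
    expected = hmac.new(replica_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, grant.signature)


g = make_grant(replica_id=1, replica_key=b"key-1", client_id=1,
               object_id=7, operation=b"write A", seq_no=1)
print(verify_grant(g, b"key-1"))
```

Because the signature covers every field, changing the sequence number or the operation hash after the fact makes verification fail in this sketch.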

  13. Certificates • Certificate • Quorum (2f+1) matching grants • Proves quorum of replicas agree to ordering of operation • Uniquely identify client, operation and sequential ordering • Existence of certificate precludes existence of conflicting certificate
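The claim that one certificate precludes a conflicting one follows from a standard quorum-intersection count. The derivation below is a sketch that assumes n = 3f+1 replicas, quorums of 2f+1, and that a correct replica never issues two different grants for the same sequence number of an object.

```latex
% Any two quorums Q_1, Q_2 of size 2f+1 drawn from n = 3f+1 replicas overlap:
\[
  |Q_1 \cap Q_2| \;\ge\; |Q_1| + |Q_2| - n
  \;=\; (2f+1) + (2f+1) - (3f+1)
  \;=\; f + 1 .
\]
% At most f replicas are Byzantine, so the intersection contains at least
% one correct replica. That replica would have had to grant the same
% sequence number to two different operations, which it never does.
```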

  14. Replica State • Multiple independent objects • State per-object • Certificate supporting most recent write • Operation status • Active • Write in progress, outstanding grant • Quiescent • No current write operation

  15. Write Phase 1 • Client sends write request to replicas • If quiescent, replica assigns new grant to client • If active, replica sends currently outstanding grant • Several Possibilities • All grants match • Grants for different client • Grants conflict
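The phase 1 behaviour and the three outcomes can be sketched together. This is an illustrative simplification rather than the authors' code: grants are plain tuples without signatures, a single object is assumed, and the names `Replica`, `handle_write_request`, and `classify`, along with the returned labels, are hypothetical.

```python
from typing import List, Optional, Tuple

Grant = Tuple[int, int, str]  # (client_id, seq_no, operation)


class Replica:
    def __init__(self) -> None:
        self.last_seq = 0                        # last executed sequence number
        self.outstanding: Optional[Grant] = None  # None means quiescent

    def handle_write_request(self, client_id: int, operation: str) -> Grant:
        if self.outstanding is None:
            # Quiescent: assign a new grant to this client and become active.
            self.outstanding = (client_id, self.last_seq + 1, operation)
        # Active: return the currently outstanding grant, whoever owns it.
        return self.outstanding


def classify(grants: List[Grant], client_id: int) -> str:
    """Decide the client's next step from the grants it collected."""
    if not grants:
        raise ValueError("no grants received")
    if len(set(grants)) > 1:
        return "resolve"    # grants conflict: request contention resolution
    if grants[0][0] != client_id:
        return "writeback"  # matching grants, but for a different client
    return "phase2"         # all grants match and name this client


replicas = [Replica() for _ in range(3)]
first = [r.handle_write_request(1, "A") for r in replicas]
second = [r.handle_write_request(2, "B") for r in replicas]
print(classify(first, 1), classify(second, 2))
```

In the usage lines, client 2 arrives while every replica is still active for client 1, so it receives client 1's grants and falls into the writeback case.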

  16. Isolated Write

  17. Isolated Write • Client 1 and replicas 1, 2, 3 • Each replica: State: Quiescent • Grant: Client ?, Seq No 0, Operation ?

  18. Isolated Write • Client 1 sends Write A to replicas 1, 2, 3 • Each replica: State: Quiescent • Grant: Client ?, Seq No 0, Operation ?

  19. Isolated Write • Write A reaches replicas 1, 2, 3 • Each replica: State: Active • Grant: Client 1, Seq No 1, Operation A

  20. Isolated Write • Replicas return Grant <1,1,A>1, Grant <1,1,A>2, Grant <1,1,A>3 • Each replica: State: Active • Grant: Client 1, Seq No 1, Operation A

  21. Isolated Write • Client 1 holds Grant <1,1,A>1, Grant <1,1,A>2, Grant <1,1,A>3 • Matching grants: Phase 2 write • Each replica: State: Active • Grant: Client 1, Seq No 1, Operation A

  22. Isolated Write • Client 1 sends Cert {G1,G2,G3} to replicas 1, 2, 3 • Matching grants: Phase 2 write • Each replica: State: Active • Grant: Client 1, Seq No 1, Operation A

  23. Isolated Write • Replicas 1, 2, 3 receive Cert {G1,G2,G3} • Each replica: execute A

  24. Isolated Write • Replicas return Result A to client 1 • Each replica: State: Quiescent • Grant: Client 1, Seq No 1, Operation A

  25. Isolated Write • Client 1 obtains the result from Result A, Result A, Result A • Write Complete • Each replica: State: Quiescent • Grant: Client 1, Seq No 1, Operation A

  26. Incomplete Write

  27. Incomplete Write • Clients 1 and 2, replicas 1, 2, 3 • Each replica: State: Quiescent • Grant: Client ?, Seq No 0, Operation ?

  28. Incomplete Write • Client 1 sends Write A to replicas 1, 2, 3 • Each replica: State: Quiescent • Grant: Client ?, Seq No 0, Operation ?

  29. Incomplete Write • Write A reaches replicas 1, 2, 3 • Each replica: State: Active • Grant: Client 1, Seq No 1, Operation A

  30. Incomplete Write • Replicas return Grant <1,1,A>1, Grant <1,1,A>2, Grant <1,1,A>3 • Each replica: State: Active • Grant: Client 1, Seq No 1, Operation A

  31. Incomplete Write • Grant <1,1,A>1, Grant <1,1,A>2, Grant <1,1,A>3 issued • Client 1 slow or failed • Each replica: State: Active • Grant: Client 1, Seq No 1, Operation A

  32. Incomplete Write • Client 2 sends Write B to replicas 1, 2, 3 • Each replica: State: Active • Grant: Client 1, Seq No 1, Operation A

  33. Incomplete Write • Replicas active: Return current grant • Client 2 receives Grant <1,1,A>1, Grant <1,1,A>2, Grant <1,1,A>3 • Each replica: State: Active • Grant: Client 1, Seq No 1, Operation A

  34. Incomplete Write • Client 2 holds Grant <1,1,A>1, Grant <1,1,A>2, Grant <1,1,A>3 • Grants for different client: Perform Writeback • Each replica: State: Active • Grant: Client 1, Seq No 1, Operation A

  35. Incomplete Write • Client 2 sends Cert {G1,G2,G3}, Write B to replicas 1, 2, 3 • Grants for different client: Perform Writeback • Each replica: State: Active • Grant: Client 1, Seq No 1, Operation A

  36. Incomplete Write • Replicas 1, 2, 3 receive Cert {G1,G2,G3}, Write B • Each replica: execute A

  37. Incomplete Write • Cert {G1,G2,G3}, Write B at replicas 1, 2, 3 • Each replica: State: Quiescent • Grant: Client 1, Seq No 1, Operation A

  38. Incomplete Write • Replicas return Grant <2,2,B>1, Grant <2,2,B>2, Grant <2,2,B>3 • Each replica: State: Active • Grant: Client 2, Seq No 2, Operation B

  39. Incomplete Write • Client 2 holds Grant <2,2,B>1, Grant <2,2,B>2, Grant <2,2,B>3 • Matching grants: Phase 2 write • Each replica: State: Active • Grant: Client 2, Seq No 2, Operation B
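The writeback path traced in the Incomplete Write slides can be simulated in a few lines. This is a sketch under simplifying assumptions: grants are unsigned tuples, the certificate is represented by the single grant it certifies and is assumed valid, and `Replica`, `write_request`, and `writeback` are hypothetical names.

```python
from typing import List, Optional, Tuple

Grant = Tuple[int, int, str]  # (client_id, seq_no, operation)


class Replica:
    def __init__(self) -> None:
        self.last_seq = 0
        self.outstanding: Optional[Grant] = None  # None means quiescent
        self.log: List[str] = []                  # executed operations

    def write_request(self, client_id: int, operation: str) -> Grant:
        if self.outstanding is None:
            self.outstanding = (client_id, self.last_seq + 1, operation)
        return self.outstanding

    def writeback(self, certified: Grant, client_id: int,
                  operation: str) -> Grant:
        """Finish the certified write, then handle the attached request."""
        if self.outstanding == certified:
            self.log.append(certified[2])   # execute the stalled operation
            self.last_seq = certified[1]
            self.outstanding = None         # back to quiescent
        return self.write_request(client_id, operation)


replicas = [Replica() for _ in range(3)]
grants_a = [r.write_request(1, "A") for r in replicas]   # client 1, then stalls
stale = [r.write_request(2, "B") for r in replicas]      # client 2 sees A's grants
fresh = [r.writeback(grants_a[0], 2, "B") for r in replicas]
print(fresh)
```

After the writeback, each replica has executed A and issued client 2 a grant at sequence number 2, matching the Grant <2,2,B> shown on the last two slides of this sequence.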

  40. Write Contention

  41. Write Contention • Clients 1 and 2, replicas 1, 2, 3 • Client 1 sends Write A • Each replica: State: Quiescent • Grant: Client ?, Seq No 0, Operation ?

  42. Write Contention • Write A, Write A in flight • One replica: State: Active • Grant: Client 1, Seq No 1, Operation A • Other two replicas: State: Quiescent • Grant: Client ?, Seq No 0, Operation ?

  43. Write Contention • Write A, Write A, Write A and Write B in flight • Two replicas: State: Active • Grant: Client 1, Seq No 1, Operation A • One replica: State: Quiescent • Grant: Client ?, Seq No 0, Operation ?

  44. Write Contention • Write A, Write A, Write A and Write B delivered • All replicas: State: Active • Two replicas: Grant: Client 1, Seq No 1, Operation A • One replica: Grant: Client 2, Seq No 1, Operation B

  45. Write Contention • Replicas return Grant <1,1,A>1, Grant <1,1,A>2, Grant <2,1,B>3 • All replicas: State: Active

  46. Write Contention • Grant <1,1,A>1, Grant <1,1,A>2, Grant <2,1,B>3 • Conflicting grants: Request resolution • All replicas: State: Active

  47. Write Contention • Resolve Request with Cert {G1,G2,G3} sent to replicas 1, 2, 3 • Conflicting grants: Request resolution • All replicas: State: Active

  48. Write Contention • Resolve Request with Cert {G1,G2,G3} at replicas 1, 2, 3 • Contention Resolution • All replicas: State: Active

  49. Write Contention • Resolve Request with Cert {G1,G2,G3} • Replicas 1, 2, 3: execute A

  50. Write Contention • Resolve Request with Cert {G1,G2,G3} • Replicas 1, 2, 3: execute B
