1 / 34

Byzantine fault tolerance

Byzantine fault tolerance. Jinyang Li With PBFT slides from Liskov. What we’ve learnt so far: tolerate fail-stop failures. Traditional RSM tolerates benign failures Node crashes Network partitions A RSM w/ 2f+1 replicas can tolerate f simultaneous crashes. Req-x. Req-y. Reply-x. Reply-y.

keaton
Download Presentation

Byzantine fault tolerance

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Byzantine fault tolerance Jinyang Li With PBFT slides from Liskov

  2. What we’ve learnt so far:tolerate fail-stop failures • Traditional RSM tolerates benign failures • Node crashes • Network partitions • A RSM w/ 2f+1 replicas can tolerate f simultaneous crashes

  3. Req-x Req-y Reply-x Reply-y vid=0, seqno=1,reply vid=0, seqno=2,reply vid=0, seqno=2,req-y vid=0, seqno=1,req-x A traditional RSM client N0: primary N1: backup N2: backup What must N2 check before executing <vid=0, seqno=1, req>?

  4. A reconfigurable RSM (lab 6) client Req-x Req-x N0: primary vid=1, seqno=2,req-x N1: backup vid=0, seqno=1,req-x Prepare vid=1, myh=N0:1 decide vid=1, N0:1, {N0,N1}, lastseq=1 Accept vid=1, N0:1, {N0,N1}, lastseq=1 N2: backup

  5. Byzantine fault tolerance • Nodes fail arbitrarily • they lie • they collude • Causes • Malicious attacks • Software errors • Seminal work is PBFT • Practical Byzantine Fault Tolerance, M. Castro and B. Liskov, SOSP 1999

  6. What does PBFT achieve? • Achieve sequential consistency (linearizability) if … • Tolerate f faults in a 3f+1-replica RSM • What does that mean in a practical sense?

  7. Practical attacks that PBFT prevents (or not) • Prevent consistency attacks • E.g. a bad node fools clients into accepting at a stale bank balance • Protection is achieved only when <= f nodes fail • With enough bad nodes, attacker can fool clients into accept arbitrary bank balance, not just a stale one. • If an attacker hacks into one server, how likely is he to hack into the rest? • Does not prevent attacks like: • Turn a machine into a botnet node • Steal SSNs from servers

  8. Why doesn’t traditional RSM work with Byzantine nodes? • Cannot rely on the primary to assign seqno • Malicious primary can assign the same seqno to different requests! • Cannot use Paxos for view change • Paxos uses a majority accept-quorum to tolerate f benign faults out of 2f+1 nodes • Does the intersection of two quorums always contain one honest node? • Bad node tells different things to different quorums! • E.g. tell N1 accept=val1 and tell N2 accept=val2

  9. Paxos under Byzantine faults Prepare vid=1, myn=N0:1 OK val=null N2 N0 N1 nh=N0:1 nh=N0:1 Prepare vid=1, myn=N0:1 OK val=null

  10. Paxos under Byzantine faults accept vid=1, myn=N0:1, val=xyz OK N2 X N0 N1 N0 decides on Vid1=xyz nh=N0:1 nh=N0:1

  11. Paxos under Byzantine faults prepare vid=1, myn=N1:1, val=abc OK val=null N2 X N0 N1 N0 decides on Vid1=xyz nh=N0:1 nh=N0:1

  12. Agreement conflict! Paxos under Byzantine faults accept vid=1, myn=N1:1, val=abc OK N2 X N0 N1 N0 decides on Vid1=xyz nh=N0:1 nh=N1:1 N1 decides on Vid1=abc

  13. PBFT main ideas • Static configuration (same 3f+1 nodes) • To deal with malicious primary • Use a 3-phase protocol to agree on sequence number • To deal with loss of agreement • Use a bigger quorum (2f+1 out of 3f+1 nodes) • Need to authenticate communications

  14. BFT requires a 2f+1 quorum out of 3f+1 nodes … … … … 1. State: 2. State: 3. State: 4. State: A A A Servers X write A write A write A write A Clients For liveness, the quorum size must be at most N - f

  15. BFT Quorums … … … … 1. State: 2. State: 3. State: 4. State: A A B B B Servers X write B write B write B write B Clients For correctness, any two quorums must intersect at least one honest node: (N-f) + (N-f) - N >= f+1 N >= 3f+1

  16. PBFT Strategy Primary runs the protocol in the normal case Replicas watch the primary and do a view change if it fails

  17. Replica state A replica id i (between 0 and N-1) Replica 0, replica 1, … A view number v#, initially 0 Primary is the replica with id i = v# mod N A log of <op, seq#, status> entries Status = pre-prepared or prepared or committed

  18. Normal Case Client sends request to primary or to all

  19. Normal Case Primary sends pre-prepare message to all Pre-prepare contains <v#,seq#,op> Records operation in log as pre-prepared Keep in mind that primary might be malicious Send different seq# for the same op to different replicas Use a duplicate seq# for op

  20. Normal Case Replicas check the pre-prepare and if it is ok: Record operation in log as pre-prepared Send prepare messages to all Prepare contains <i,v#,seq#,op> All to all communication

  21. Normal Case Replicas wait for 2f+1 matching prepares Record operation in log as prepared Send commit message to all Commit contains <i,v#,seq#,op> Trust the group, not the individuals

  22. Normal Case • Replicas wait for 2f+1 matching commits • Record operation in log as committed • Execute the operation • Send result to the client

  23. Normal Case • Client waits for f+1 matching replies

  24. Request Pre-Prepare Prepare Commit Reply Client Primary Replica 2 Replica 3 Replica 4 BFT

  25. View Change Replicas watch the primary Request a view change Commit point: when 2f+1 replicas have prepared

  26. View Change Replicas watch the primary Request a view change send a do-viewchange request to all new primary requires 2f+1 requests sends new-view with this certificate Rest is similar

  27. Additional Issues State transfer Checkpoints (garbage collection of the log) Selection of the primary Timing of view changes

  28. Possible improvements • Lower latency for writes (4 messages) • Replicas respond at prepare • Client waits for 2f+1 matching responses • Fast reads (one round trip) • Client sends to all; they respond immediately • Client waits for 2f+1 matching responses

  29. BFT Performance Table 2: Andrew 100: elapsed time in seconds M. Castro and B. Liskov, Proactive Recovery in a Byzantine-Fault-Tolerant System, OSDI 2000

  30. PBFT inspires much follow-on work • BASE: Using abstraction to improve fault tolerance, R. Rodrigo et al, SOSP 2001 • R.Kotla and M. Dahlin, High Throughput Byzantine Fault tolerance. DSN 2004 • J. Li and D. Mazieres, Beyond one-third faulty replicas in Byzantine fault tolerant systems, NSDI 07 • Abd-El-Malek et al, Fault-scalable Byzantine fault-tolerant services, SOSP 05 • J. Cowling et al, HQ replication: a hybrid quorum protocol for Byzantine Fault tolerance, OSDI 06 • Zyzzyva: Speculative Byzantine fault tolerance SOSP 07 • Tolerating Byzantine faults in database systems using commit barrier scheduling SOSP 07 • Low-overhead Byzantine fault-tolerant storage SOSP 07 • Attested append-only memory: making adversaries stick to their word SOSP 07

  31. A2M’s main insights Main worry in PBFT is that malicious nodes lie differently to different replicas Decide on Seq1=req-x Decide on Seq1=req-y

  32. A2M’s proposal • Introduce a trusted abstraction: attested append-only-memory • A2M properties: • Trusted (Attacker can corrupt the RSM implementation, but not A2M itself) • Prevent malicious nodes from making different lies to different replicas

  33. A2M’s abstraction • A2M implements a trusted log • Append: append a value to log • Lookup: lookup the value at position i • End: lookup the value at the end of log • Truncate: garbage collection old values • Advance: skip positions in log

  34. What A2M achieves • Smaller quorum size for BFT • Achieve correctness & liveness if <= f Byzantine faults with 2f+1 nodes Or • Achieve correctness if <= 2f nodes fail. Achieve liveness if <=f nodes fail

More Related