ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE. R. Kotla, L. Alvisi, M. Dahlin, A. Clement and E. Wong, U. T. Austin. Best Paper Award at SOSP 2007.
Motivation • Why implement Byzantine Fault-Tolerant (BFT) replication? • Increasing value of data and decreasing cost of hardware • More non-fail-stop behaviors than commonly believed • BFT is becoming cheaper • Cost of 3-way non-BFT replication is close to the cost of BFT replication
Zyzzyva (I) • Uses speculation to reduce the cost of BFT replication • Primary replica proposes order of client requests to all secondary replicas (standard) • Secondary replicas speculatively execute the request without going through an agreement protocol to validate that order (new idea)
Zyzzyva (II) • As a result • States of correct replicas may diverge • Replicas may send diverging replies to client • Zyzzyva’s solution • Clients detect inconsistencies • Help convergence of correct replicas to a single total ordering of requests • Reject inconsistent replies
How? • Clients observe a replicated state machine • Replies contain enough information to let clients ascertain if the replies and the history are stable and guaranteed to be eventually committed • Replicas have checkpoints
Byzantine agreement (I) • No solution for fewer than four entities
Byzantine agreement (II) • To achieve agreement in the presence of f failed nodes (“traitors”) we need • 3f + 1 entities
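The 3f + 1 bound is simple arithmetic. As a minimal sketch (the function names are illustrative and do not come from the slides), the relation can be written in both directions:

```python
def min_replicas(f: int) -> int:
    """Smallest number of entities that can tolerate f Byzantine nodes."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    if n < 1:
        raise ValueError("n must be positive")
    return (n - 1) // 3
```

With a single traitor (f = 1), `min_replicas` gives four entities, which matches the "no solution for fewer than four" statement on the previous slide.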
Practical BFT (I) • Practical Byzantine Fault-Tolerant protocol (PBFT) [Castro and Liskov 1999]
Practical BFT (II) Replicas decide on correct ordering
Practical BFT (III) • Client sends signed request to primary replica • Primary assigns a sequence number to the request and sends a PRE-PREPARE message to all other replicas • Secondary replicas validate the message and send a PREPARE message to all replicas • Replicas that can collect 2f PREPARE messages send a COMMIT message to all replicas • Replicas that can collect 2f + 1 COMMIT messages send a REPLY to the client
A shortened version • Faster agreement is achieved thanks to a more complex view change protocol
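The two thresholds in the PBFT steps above can be sketched as per-request bookkeeping at one replica. This is only an illustration: the class `PbftSlot` and its method names are hypothetical, and the sketch assumes every message passed in has already been validated and refers to the same request, view and sequence number.

```python
from dataclasses import dataclass, field


@dataclass
class PbftSlot:
    """Bookkeeping for one sequence number at one replica (hypothetical)."""
    f: int
    prepares: set = field(default_factory=set)  # senders of PREPARE messages
    commits: set = field(default_factory=set)   # senders of COMMIT messages

    def on_prepare(self, sender: int) -> bool:
        """Record a PREPARE; True once 2f distinct senders are collected,
        at which point the replica may send its COMMIT."""
        self.prepares.add(sender)
        return len(self.prepares) >= 2 * self.f

    def on_commit(self, sender: int) -> bool:
        """Record a COMMIT; True once 2f + 1 distinct senders are collected,
        at which point the replica may send its REPLY to the client."""
        self.commits.add(sender)
        return len(self.commits) >= 2 * self.f + 1
```

Using sets of sender ids means a duplicate message from the same replica is never counted twice.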
The explanation (I) • "No replicated service that uses the traditional view change protocol can be live without an agreement protocol that includes both the prepare and commit full exchanges" • "The traditional view change protocol lets correct replicas commit to a view change and become silent in a view without any guarantee that their action will lead to the view change."
The explanation (II) • Zyzzyva • Adds an extra phase to its view change protocol • Guarantees that a correct replica will not abandon a view unless every other correct replica is guaranteed to do so as well
Zyzzyva Agreement (I) • Common case: no faulty replicas
Explanations • Secondary replicas assume that • Primary replica gave the right ordering • All secondary replicas will participate in transaction • Initiate speculative execution • Client receives 3f + 1 mutually consistent responses
Zyzzyva Agreement (II) • With a faulty replica
Explanations (I) • Client receives fewer than 3f + 1 mutually consistent responses • If it gathers at least 2f + 1 mutually consistent responses, it distributes a commit certificate to the replicas • Once at least 2f + 1 replicas acknowledge receiving a commit certificate, the client considers the request completed
Explanations (II) • If enough secondary replicas suspect that the primary replica is faulty, a view change is initiated and a new primary is elected
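The client-side rule for the two agreement cases can be sketched as follows. The function names and the string outcomes are hypothetical, and each reply is reduced to an opaque digest standing in for "reply plus history". The "insufficient" outcome (fewer than 2f + 1 matching replies) is not described on these slides and is included only so the function is total.

```python
from collections import Counter


def client_decision(replies: dict, f: int) -> str:
    """Classify the speculative replies received for one request.

    `replies` maps a replica id to a digest of that replica's reply.
    """
    if not replies:
        return "insufficient"
    # Size of the largest group of mutually consistent replies.
    matching = Counter(replies.values()).most_common(1)[0][1]
    if matching >= 3 * f + 1:
        return "complete"                 # common case: all replicas agree
    if matching >= 2 * f + 1:
        return "send_commit_certificate"  # case with a faulty replica
    return "insufficient"                 # assumption: not covered by the slides


def request_committed(acks, f: int) -> bool:
    """True once 2f + 1 distinct replicas acknowledged the commit certificate."""
    return len(set(acks)) >= 2 * f + 1
```

For f = 1, four matching replies complete the request immediately, while three matching replies lead the client to distribute a commit certificate and wait for three acknowledgements.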
Explanations (I) • Each replica maintains • A history of the requests it has executed • A copy of the max commit certificate it has received • Together, these let it distinguish between committed history and speculative history
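As a minimal sketch of this replica state, assume the max commit certificate can be summarized by the highest sequence number it covers. The class `ReplicaHistory` and its fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaHistory:
    history: list = field(default_factory=list)  # executed requests, in order
    max_cc_seq: int = 0  # sequence number covered by the max commit certificate

    def execute(self, request) -> int:
        """Speculatively execute a request; return its 1-based sequence number."""
        self.history.append(request)
        return len(self.history)

    def on_commit_certificate(self, seq: int) -> None:
        """Keep only the highest commit certificate seen so far."""
        if not 0 <= seq <= len(self.history):
            raise ValueError("certificate covers a request not in the history")
        self.max_cc_seq = max(self.max_cc_seq, seq)

    def committed(self) -> list:
        return self.history[:self.max_cc_seq]

    def speculative(self) -> list:
        return self.history[self.max_cc_seq:]
```

Everything up to `max_cc_seq` is committed history; everything after it is still speculative.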
Explanations (II) • Each replica constructs a checkpoint every CP_INTERVAL requests • It maintains one stable checkpoint with a corresponding stable application state snapshot • It might also have up to one speculative checkpoint with its corresponding speculative application state snapshot
Explanations (III) • Checkpoints and application state become committed through a process similar to that of earlier BFT agreement protocols • Replicas send signed checkpoint messages to all replicas when they generate a tentative checkpoint • Replicas commit the checkpoint after they collect f + 1 signed matching checkpoint messages
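The checkpoint rules can be sketched as below. The value 128 for `CP_INTERVAL` is an assumed placeholder (the slides only name the constant), the class `CheckpointTracker` is hypothetical, and signature verification is assumed to have happened before a message reaches the tracker.

```python
from collections import defaultdict

CP_INTERVAL = 128  # assumed value for illustration


def is_checkpoint_seq(seq: int, interval: int = CP_INTERVAL) -> bool:
    """True when a checkpoint is due after executing request number `seq`."""
    return seq > 0 and seq % interval == 0


class CheckpointTracker:
    def __init__(self, f: int):
        self.f = f
        self.votes = defaultdict(set)  # (seq, state_digest) -> sender ids
        self.stable_seq = 0            # sequence number of the stable checkpoint

    def on_checkpoint_message(self, sender: int, seq: int, state_digest: str) -> bool:
        """Record a checkpoint message; True if that checkpoint just became stable."""
        key = (seq, state_digest)
        self.votes[key].add(sender)
        if seq > self.stable_seq and len(self.votes[key]) >= self.f + 1:
            self.stable_seq = seq
            # Votes for this and older checkpoints are no longer needed.
            newer = {k: v for k, v in self.votes.items() if k[0] > seq}
            self.votes = defaultdict(set, newer)
            return True
        return False
```

Messages only count toward the same checkpoint when both the sequence number and the state digest match, which is what "matching" means in this sketch.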
Explanations • Two-phase protocol • Elects a new primary • Guarantees that it will not introduce any changes in a history that has already completed at a correct client
Comments • Zyzzyva5 is a special version of Zyzzyva that requires more replicas (5f + 1 instead of 3f + 1) but has a lower overhead
CONCLUSIONS • Systematically exploiting speculative execution results in a protocol much faster than conventional BFT agreement protocols. • Observe that Zyzzyva is optimized for the most frequent case but provides the correct result in all cases • A good rule to follow