Byzantine fault-tolerance COMP 413 Fall 2002
Overview • Models • Synchronous vs. asynchronous systems • Byzantine failure model • Secure storage with self-certifying data • Byzantine quorums • Byzantine state machines
Models • Synchronous system: bounded message delays (implies a reliable network!) • Asynchronous system: message delays are unbounded • In practice (Internet): reasonable to assume that network failures are eventually fixed (weak synchrony assumption)
Model (cont’d) • Data and services (state machines) can be replicated on a set of nodes R • Each node in R fails independently (i.i.d. failure probability) • Can specify a bound f on the number of nodes that can fail simultaneously
Model (cont’d) Byzantine failures • no assumption about the nature of faults • failed nodes can behave in arbitrary ways • may act as an intelligent adversary (compromised node), with full knowledge of the protocols • failed nodes may conspire (act as one)
Byzantine quorums • Data is not self-certifying (multiple writers without shared keys) • Idea: replicate data on a sufficient number of replicas (relative to f) so that we can rely on a majority vote
Byzantine quorums: r/w variable • Representative problem: implement a read/write variable • Assume no concurrent reads or writes, for now • Assume trusted clients, for now
Byzantine quorums: r/w variable How many replicas do we need? • clearly, we need at least 2f+1, so that we have a majority of good nodes • write(x): send x to all replicas, wait for acknowledgments (must get at least f+1) • read(x): request x from all replicas, wait for responses, take a majority vote (if there are no concurrent writes, we must get f+1 identical votes!)
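A minimal local sketch of this naive 2f+1 scheme (not from the slides; real replicas would be remote processes reached over the synchronous, authenticated channels discussed next, and faulty ones could behave arbitrarily):

```python
from collections import Counter

F = 1                                               # max simultaneous Byzantine failures
replicas = [{"x": None} for _ in range(2 * F + 1)]  # toy in-memory "replicas"

def write(x):
    # Send x to all replicas and count acknowledgments (every replica acks here).
    acks = 0
    for r in replicas:
        r["x"] = x
        acks += 1
    assert acks >= F + 1          # need a majority of good nodes to hold x

def read():
    # Collect one vote per replica and take the majority value.
    votes = Counter(r["x"] for r in replicas)
    value, count = votes.most_common(1)[0]
    assert count >= F + 1, "no value has f+1 identical votes"
    return value

write(42)
replicas[0]["x"] = "garbage"      # one Byzantine replica lies on reads
assert read() == 42               # the f+1 good votes still win
```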
Byzantine quorums: r/w variable Does this work? Yes, but only if • the system is synchronous (bounded message delay) • faulty nodes cannot forge messages (messages are authenticated!)
Byzantine quorums: r/w variable Now, assume • Weak synchrony (network failures are fixed eventually) • messages are authenticated (e.g., signed with sender’s private key)
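As a concrete illustration of such authenticated messages, here is a small sketch using Ed25519 signatures from the third-party `cryptography` package (the slides do not prescribe a particular scheme; the key and message names are made up for illustration):

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# Hypothetical sender: signs a request with its private key.
signing_key = Ed25519PrivateKey.generate()
message = b"WRITE x=42 ts=17"
signature = signing_key.sign(message)

# Any replica holding the sender's public key can check the message,
# so a faulty node cannot forge requests from good senders.
public_key = signing_key.public_key()
try:
    public_key.verify(signature, message)
except InvalidSignature:
    print("forged or corrupted message")
```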
Byzantine quorums: r/w variable Let’s try 3f+1 replicas (the known lower bound) • write(x): send x to all replicas, wait for 2f+1 responses (so at least f+1 good replicas hold the correct value) • read(x): request x from all replicas, wait for 2f+1 responses, take a majority vote (if no concurrent writes, must we get f+1 identical votes? – no: the f replicas that did not respond may all have been good nodes holding the new value!)
Byzantine quorums: r/w variable Let’s try 4f+1 replicas • write(x): send x to all replicas, wait for 3f+1 responses (so at least 2f+1 good replicas hold the correct value) • read(x): request x from all replicas, wait for 3f+1 responses, take a majority vote (if no concurrent writes, must the correct value win? – no: the f faulty nodes may vote together with good nodes that still hold an old value of x!)
Byzantine quorums: r/w variable Let’s try 5f+1 replicas • write(x): send x to all replicas, wait for 4f+1 responses (so at least 3f+1 good replicas hold the correct value) • read(x): request x from all replicas, wait for 4f+1 responses, take a majority vote (if no concurrent writes, at least 2f+1 responses carry the correct value – more than any other value can collect – so we certainly get the f+1 identical votes we need!) • Actually, 5f replicas suffice if data is written with monotonically increasing timestamps
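The worst-case counting behind the three cases above can be checked mechanically. A back-of-the-envelope sketch (the function name and loop values are illustrative, not part of any protocol):

```python
def read_is_safe(n, f):
    # The write waited for n-f acks, so at least n-2f good replicas are fresh.
    # A read that waits for n-f responses may miss f of those fresh good
    # replicas, leaving at least n-3f fresh votes among the responses.
    fresh_votes = n - 3 * f
    # Up to f good-but-stale replicas plus the f faulty ones may all vote
    # for the same old value, giving a rival value up to 2f votes.
    rival_votes = 2 * f
    return fresh_votes >= f + 1 and fresh_votes > rival_votes

for f in (1, 2, 3):
    for n in (3 * f + 1, 4 * f + 1, 5 * f + 1):
        print(f"f={f}, n={n}: safe read = {read_is_safe(n, f)}")
# Only n = 5f+1 passes, matching the argument on the slides.
```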
Byzantine quorums: r/w variable Still relies on trusted clients • A malicious client could send different values to different replicas, or send a value to fewer than a full quorum • To fix this, we need a Byzantine agreement protocol among the replicas • Still doesn’t handle concurrent accesses • Still doesn’t handle group changes
Byzantine state machine BFT (Castro, 2000) • Can implement any service that behaves like a deterministic state machine • Can tolerate malicious clients • Safe with concurrent requests • Requires 3f+1 replicas • 5 rounds of messages
Byzantine state machine • Clients send requests to one replica (the primary) • Correct replicas execute all requests in the same order • An atomic multicast protocol among the replicas ensures that all replicas receive and execute all requests in the same order • Since all replicas start in the same state, correct replicas produce identical results • The client waits for f+1 identical results from different replicas
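A minimal sketch of the client-side check in the last bullet (the reply format and names are hypothetical; the point is simply that a result is accepted once f+1 distinct replicas report the same value, so at least one good replica vouches for it):

```python
from collections import Counter

def accept_result(replies, f):
    # replies: (replica_id, result) pairs received so far, at most one per replica
    counts = Counter(result for _, result in replies)
    if not counts:
        return None
    value, count = counts.most_common(1)[0]
    return value if count >= f + 1 else None   # None = keep waiting

f = 1
replies = [(0, "ok:42"), (3, "ok:42"), (2, "bad")]   # replica 2 is faulty
assert accept_result(replies, f) == "ok:42"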
BFT: Protocol overview • Client c sends m = <REQUEST, o, t, c>_σc to the primary (o = operation, t = monotonic timestamp) • Primary p assigns sequence number n to m and sends <PRE-PREPARE, v, n, m>_σp to the other replicas (v = current view, i.e., replica set) • If replica i accepts the message, it sends <PREPARE, v, n, d, i>_σi to the other replicas (d is the hash of the request). This signals that i agrees to assign n to m in v.
BFT: Protocol overview • Once replica i has the pre-prepare and 2f+1 matching PREPARE messages, it sends <COMMIT, v, n, d, i>_σi to the other replicas. At this point, correct replicas agree on the order of requests within a view. • Once replica i is prepared and has 2f+1 matching COMMIT messages, it executes m and sends <REPLY, v, t, c, i, r>_σi to the client. (The need for this last commit phase has to do with view changes.)
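A sketch of the two quorum checks a replica applies in the protocol above. The message representation (tuples) and log structure are made up for illustration; the counts follow the slides:

```python
def prepared(log, v, n, d, f):
    # Prepared: replica has the pre-prepare plus 2f+1 matching PREPARE
    # messages from distinct replicas for (view v, seq n, digest d).
    has_pre_prepare = ("PRE-PREPARE", v, n, d) in log
    prepare_senders = {m[4] for m in log
                       if m[0] == "PREPARE" and m[1:4] == (v, n, d)}
    return has_pre_prepare and len(prepare_senders) >= 2 * f + 1

def committed(log, v, n, d, f):
    # Committed: prepared, plus 2f+1 matching COMMIT messages; only then
    # does the replica execute the request and reply to the client.
    commit_senders = {m[4] for m in log
                      if m[0] == "COMMIT" and m[1:4] == (v, n, d)}
    return prepared(log, v, n, d, f) and len(commit_senders) >= 2 * f + 1

f = 1
log = [("PRE-PREPARE", 0, 7, "d"),
       ("PREPARE", 0, 7, "d", 1), ("PREPARE", 0, 7, "d", 2), ("PREPARE", 0, 7, "d", 3)]
assert prepared(log, 0, 7, "d", f)        # 2f+1 = 3 matching prepares
assert not committed(log, 0, 7, "d", f)   # no commits yet
```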
BFT • More complexity comes from view changes and garbage collection of the message logs • Public-key signatures are the bottleneck: a variation of the protocol uses symmetric crypto (MACs) to provide authenticated channels. (Not easy: MACs are less powerful – a MAC cannot prove authenticity to a third party!)
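For contrast with the signature example earlier, a minimal sketch of MAC-based authentication using Python's standard hmac module (key and message contents are hypothetical). Note that, unlike a signature, the tag only convinces the one peer that shares the key, which is exactly the limitation mentioned above:

```python
import hashlib
import hmac

# Pairwise secret shared by one sender/receiver pair (hypothetical).
shared_key = b"secret shared by replica i and replica j"
message = b"PREPARE v=1 n=7 d=..."

# Sender attaches a MAC tag to the message.
tag = hmac.new(shared_key, message, hashlib.sha256).digest()

# Receiver recomputes the MAC and compares in constant time.
expected = hmac.new(shared_key, message, hashlib.sha256).digest()
assert hmac.compare_digest(tag, expected)
```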