Abstracting out Byzantine Behavior
Peter Druschel, Andreas Haeberlen, Petr Kouznetsov
Max Planck Institute for Software Systems
P. Kouznetsov, 2006
Why Distributed ≠ Centralized?
• Failures: a process can deviate from its specification
• There are problems that cannot be solved in a fault-tolerant way (even if just one process might fail)
Crash failures • Crash fault-tolerant consensus cannot be achieved in an asynchronous system [FLP85] • A process crashes = prematurely halts all its activities
Abstracting out crash failures • Failure detectors [Chandra and Toueg, 1996] • Engineering side: can be specified and implemented independently of algorithms • Theory side: can be used for comparing and classifying problems (the weakest failure detectors)
Using failure detectors
Eventually strong FD <>S [Chandra and Toueg, 1996]: outputs a list of suspected processes. There is a time after which:
• every crashed process is suspected by every correct process
• some correct process is never suspected by any correct process
Consensus is solvable with <>S and a majority of correct processes
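As a concrete illustration, here is a minimal sketch (not from the talk) of how <>S is commonly approximated in practice: heartbeats plus adaptive per-process timeouts, which satisfy the two properties only under partial-synchrony assumptions. The class and method names are hypothetical.

import time

class EventuallyStrongFD:
    def __init__(self, processes, initial_timeout=1.0):
        self.timeout = {p: initial_timeout for p in processes}   # per-process timeout
        self.last_heard = {p: time.time() for p in processes}    # last heartbeat seen
        self.suspected = set()

    def on_heartbeat(self, p):
        self.last_heard[p] = time.time()
        if p in self.suspected:
            # We wrongly suspected a live process: back off its timeout so that
            # eventually some correct process is never suspected again.
            self.suspected.discard(p)
            self.timeout[p] *= 2

    def output(self):
        now = time.time()
        for p, last in self.last_heard.items():
            if now - last > self.timeout[p]:
                self.suspected.add(p)      # crashed processes stop sending heartbeats
        return set(self.suspected)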
Using failure detectors, contd.
Abstracting out the majority assumption: Quorum failure detector Σ [DFG, 2004]: outputs a list of processes, called a quorum
• Every two quorums (output at any processes at any times) intersect
• There is a time after which every output quorum contains only correct processes
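A minimal sketch, assuming a majority of correct processes, of how Σ can be realized: output the majority of processes heard from most recently. Any two majorities intersect, and eventually only correct processes keep sending messages, so the quorum eventually contains only correct processes. The names used here are hypothetical.

import time

class QuorumFD:
    def __init__(self, processes):
        self.n = len(processes)
        self.last_heard = {p: 0.0 for p in processes}

    def on_message(self, p):
        self.last_heard[p] = time.time()

    def output(self):
        # The (n // 2 + 1) most recently heard-from processes form the quorum.
        ranked = sorted(self.last_heard, key=self.last_heard.get, reverse=True)
        return set(ranked[: self.n // 2 + 1])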
The weakest failure detector • <>S is necessary to solve consensus [CHT, 1996] • Σ is the weakest FD to implement a RW register [DFG, 2004] => (<>S, Σ) is the weakest FD to solve consensus
State machine replication [Lamport, 1984; Schneider, 1993; …]
(figure: clients send requests to a set of server replicas; the servers send back responses)
State machine replication
Client:
  broadcast request to all servers
  wait until a response is received
Server:
  repeat forever
    if there are unserved requests
      use consensus (<>S, Σ) to agree on the order in which the requests are served
      send the results of served requests to the clients
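A minimal sketch (not from the talk) of the server loop above, with consensus treated as a black box. The names pending_requests, consensus_propose, reply_to_client, and the request fields are hypothetical placeholders.

def server_loop(state_machine, pending_requests, consensus_propose, reply_to_client):
    executed = set()
    while True:
        batch = [r for r in pending_requests() if r.id not in executed]
        if not batch:
            continue
        # All servers propose their pending batch; consensus returns the same
        # ordered batch at every correct server, so replica states stay identical.
        ordered = consensus_propose(batch)
        for request in ordered:
            if request.id in executed:
                continue
            result = state_machine.apply(request)
            executed.add(request.id)
            reply_to_client(request.client, request.id, result)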
Useful abstractions • SMR (Totally ordered broadcast) = reliable broadcast + consensus [Toueg, Hadzilacos, 1993] • Consensus = (<>S, Σ)
Detectable Byzantine failures
(figure: failure classes — crash, mute, ignorant, detectable Byzantine, all contained within the class of Byzantine failures)
Byzantine failure detectors
• BFDs are parameterized with the specification of the correct system behavior
• The output of a BFD depends solely on detectable failures: no information about steps performed by correct processes can be extracted (necessary to distinguish algorithms from BFDs)
Byzantine FD abstraction
(figure: layered view — Application on top; automaton Ai and the BFD in the middle, with monitoring algorithms (PeerReview, HotDep 2006) and enforcing algorithms (SMR); Network at the bottom)
State machine replication: classics
Client:
  broadcast requests to all servers
  wait until a response is received
Server:
  repeat forever
    if there are unserved requests
      use consensus to agree on the order in which the requests are served
      send the results of served requests to the clients
(!) a single malicious process can ignore correct requests and inject bogus requests
BFT state machine replication
[Doudou et al., 2005]: reliable broadcast + weak interactive consistency
WIConsistency: every correct process proposes a value and decides on a set of values
• the decided set contains at least one value proposed by a correct process
• no two correct processes decide differently
SMR can be implemented using RB and WIConsistency
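A minimal sketch (not from the talk) of how one SMR round could use a WIC primitive: each replica proposes its pending batch of requests, WIC returns the same set of proposals at every correct replica, and a deterministic merge and ordering of that set yields the next requests to execute. wic_propose_and_decide is a hypothetical black box; requests are assumed hashable and to carry (client, seqno) fields.

def smr_round(pending_batch, wic_propose_and_decide, state_machine):
    decided_proposals = wic_propose_and_decide(tuple(pending_batch))
    merged = {req for proposal in decided_proposals for req in proposal}
    # Deterministic order, so every correct replica executes the same sequence.
    ordered = sorted(merged, key=lambda req: (req.client, req.seqno))
    return [state_machine.apply(req) for req in ordered]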
The question • SMR = RB + WIConsistency? • No: (<>SB, ΣB) can implement SMR but cannot implement WIConsistency => WIConsistency > SMR
<>SB [MR97, DS98, KMM03]
Outputs a list of processes suspected to be mute. There is a time after which:
• every mute process is suspected by every correct process
• some correct process is never suspected by any correct process
Byzantine quorum FD ΣB
Outputs a list of processes, called a quorum
• Every two quorums (output at any two correct processes at any times) share at least one correct process
• There is a time after which every output quorum contains only correct processes
SMR using (<>SB, ΣB)
• (<>SB, ΣB) can be used to implement a BFT replication system
• Adaptation of BFT [Castro, Liskov, 1999]:
  • wait until acks are received from 2f+1 processes => wait until acks are received from every process in the quorum output by ΣB
  • if the primary replica times out, initiate a view change => if the primary replica is output by <>SB, initiate a view change
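A minimal sketch (not from the talk) of the two substitutions above: BFT's 2f+1 ack counting is replaced by "heard from every process in the current ΣB quorum", and the primary timeout is replaced by "the primary is suspected by <>SB". sigma_b.output() and diamond_s_b.output() are hypothetical FD query calls; acks is the set of processes whose acknowledgments have arrived.

def quorum_reached(acks, sigma_b):
    # Instead of counting 2f+1 acks, wait until the ΣB quorum is covered.
    return sigma_b.output().issubset(acks)

def should_change_view(primary, diamond_s_b):
    # Instead of a local timeout on the primary, trust the <>SB output.
    return primary in diamond_s_b.output()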
WIConsistency using (<>SB, ΣB)?
Assume such an algorithm exists
• Let the processes in Q be correct and let the rest crash initially
• E: the processes in Q decide on V (a set of values proposed by Q)
• E': an extension of E in which some pi not in Q decides V
• E'': an extension of E in which all processes in Q are faulty and pi is correct; pi cannot distinguish E'' from E', so it still decides V, yet V contains no value proposed by a correct process
=> contradiction
Related work • State machine replication [Lamport 84, 89; Schneider, 1990; Doudou et al., 2005;…] • Failure detectors [Chandra, Toueg, 1991; Chandra et al., 1992; Delporte et al., 2003;…] • Byzantine quorum systems [Malkhi, Reiter, 1997] • Byzantine failure detection [MR97; DS98; KMM03; AMPR01; BAR, 2005; …]
Conclusions Byzantine FD abstraction does make sense! • BFT state machine replication using (<>SB, ΣB) • BFT SMR is strictly weaker than WIConsistency • Is the lower bound tight? • How to implement Byzantine FDs?
Monitoring: PeerReview [HKD06]
The BFD produces three types of indications for the application layer: trusted, suspected, and exposed.
Completeness:
• Eventually, every detectably ignorant node is forever suspected by every correct node
• Eventually, every detectably malicious node is exposed by every correct node
Accuracy:
• No correct node is forever suspected by a correct node
• No node is exposed by a correct node unless it is detectably malicious
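A minimal sketch (not from the talk) of the three-valued indication interface described above; the indication names come from PeerReview, but the class and method names here are hypothetical.

from enum import Enum

class Indication(Enum):
    TRUSTED = "trusted"      # default: no evidence against the node
    SUSPECTED = "suspected"  # e.g. the node has an unanswered challenge
    EXPOSED = "exposed"      # verifiable evidence of misbehavior exists

class ByzantineFD:
    def __init__(self, nodes):
        self.status = {n: Indication.TRUSTED for n in nodes}

    def on_unanswered_challenge(self, node):
        if self.status[node] is not Indication.EXPOSED:
            self.status[node] = Indication.SUSPECTED

    def on_challenge_answered(self, node):
        if self.status[node] is Indication.SUSPECTED:
            self.status[node] = Indication.TRUSTED   # accuracy: suspicion is revocable

    def on_verified_evidence(self, node):
        self.status[node] = Indication.EXPOSED       # exposure is permanent

    def output(self, node):
        return self.status[node]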
PeerReview approach
Nodes locally observe message traffic and classify other nodes as trusted, suspected, or exposed
Quick overview:
• Every node keeps a log of all its local inputs and outputs
• Cryptographic techniques ensure that the log is accurate and linear
• Nodes can audit each others' logs at any time
• To check for faulty behavior, auditors replay the contents of the log
• In case of misbehavior, produce evidence that can be verified independently by other nodes
• Eventually complete and accurate!
(figure: Application / PeerReview detector / State machine (e.g. NFS) / Network; the detector outputs {trusted, suspected, exposed})
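A minimal sketch (not the exact PeerReview design) of a hash-chained, tamper-evident log of the kind hinted at above: each entry commits to the previous one, so an auditor replaying the log can detect any retroactive modification. PeerReview additionally signs entries; signing is omitted here for brevity, and all names are hypothetical.

import hashlib
import json

class TamperEvidentLog:
    def __init__(self):
        self.entries = []          # list of (seq, type, payload, hash)
        self.top_hash = b"\x00" * 32

    def append(self, entry_type, payload):
        seq = len(self.entries)
        digest = hashlib.sha256(
            self.top_hash + json.dumps([seq, entry_type, payload]).encode()
        ).digest()
        self.entries.append((seq, entry_type, payload, digest))
        self.top_hash = digest
        return digest              # would be signed and sent as an authenticator

    def verify(self):
        # An auditor recomputes the hash chain to check that the log is linear
        # and has not been rewritten.
        running = b"\x00" * 32
        for seq, entry_type, payload, digest in self.entries:
            expected = hashlib.sha256(
                running + json.dumps([seq, entry_type, payload]).encode()
            ).digest()
            if expected != digest:
                return False
            running = expected
        return True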
Typical consensus algorithm
repeat
  round++
  c = round mod n
  if p = c then try to "lock" the current estimate
  help in locking until a decided value is received from c, or c is suspected by <>S
until a decided value is received
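A minimal sketch (not from the talk) of the rotating-coordinator skeleton above. try_to_lock, help_locking, and fd are hypothetical: try_to_lock runs at the coordinator of the round, help_locking runs at the other processes and returns a decided value or None, and fd.output() is the <>S query used to abandon a suspected coordinator.

def rotating_coordinator_consensus(p, n, estimate, fd, try_to_lock, help_locking):
    decided = None
    round_no = 0
    while decided is None:
        round_no += 1
        c = round_no % n                      # coordinator rotates with the round
        if p == c:
            decided = try_to_lock(round_no, estimate)
        else:
            # Keep helping the coordinator until a decision arrives or <>S suspects it.
            while decided is None and c not in fd.output():
                decided = help_locking(round_no, c)
    return decided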