Impossibility of Distributed Consensus with One Faulty Process

Impossibility of Distributed Consensus with One Faulty Process Michael J. Fischer Nancy A. Lynch Michael S. Paterson

Introduction • Problem • Reach agreement among remote processes • Example: transaction commit problem • Easy if the participating processes and network are completely reliable • Real systems are subject to possible faults • Process Crash • Unreliable Communication • Byzantine failure

Result • No completely asynchronous consensus protocol can tolerate even a single unannounced process death

Assumption • Don’t consider Byzantine failure • Reliable message system: • messages are delivered correctly and exactly once • Asynchronous: • No assumption the relative speeds of processes or the delay time in delivering a message • No synchronized clock • Algorithms based on time-out can’t be used • No ability to detect the death of a process

The weak consensus problem • Initial state: 0 or 1 • Decision state: • Nonfaulty process decides on a value in {0, 1} • Requirement: • All nonfaulty processes that make a decision must choose the same value. • Some processes eventually make a decision(for proof) • Trivial solution is ruled out

System model • Processes are modeled as automata and communicate by messages • One atomic step • receive a message, perform local computation, send arbitrary but finite messages • Atomic broadcast • send a message to all other processes in one step • all nonfaulty processes or none will receive it

Consensus Protocols • A consensus protocol P is an asynchronous system of N processes • Each process p has a one-bit input register xp, an output register yp, with values in {b, 0, 1} • Internal state: • input/output register, pc, internal storage • Initial state: • all fixed starting values except the input register • output register starts with b

Decision state • output register has the value 0 or 1 • Once a decision is made, the value can’t be changed • p acts deterministically according to a transition function • Processes communicate by messages • Message system • send(p, m) • receive(p) • nondeterministically

A configuration consists of • All internal state of each process, the contents of message buffer • A step is a transition of one configuration C to another e(C), including 2 phases: • First, receive(p) to get a message m • Based on p’s internal state and m, p enters a new internal state and sends finite messages to other • e = (p, m) is called an event and said e can be applied to C

Schedule, run, reachable and accessible • A schedule from C • a finite or infinite sequence σ of events that can be applied, in turn, starting from C • The associated sequence of steps is called a run • σ(C) denotes the reulting configuration and is said to be reachable from C • An accessible configuration C • If C is reachable from some initial configuration

Lemma 1 • Suppose that from some configuration C, the schedules σ1 and σ2 lead to configuration C1 ,C2 , respectively. If the sets of Processes taking steps in σ1 and σ2 respectively, are disjoint, then σ2 can be applied to C1 and σ1 can be applied to C2 , and both lead to the same configuration.

Decision value, partially correct, nonfaulty • A configuration C has decision value v if some process p is in a decision state with yp =v. • P is partially correct if • No accessible configuration has more than one decision value • For each v {0, 1}, some accessible configuration has decision value v. • A process is nonfaulty • If it takes infinitely many steps

Admissible and deciding run, totally correct • Admissible run • At most one process is faulty and all messages sent to nonfaulty processes are eventually received • Deciding run • Some process reaches a decision state • A consensus protocol P is totally correct in spite of one fault if it is partially correct and every admissible run is a deciding run

Theorem 1 • No consensus protocol is totally correct in spite of one fault.

Bivalent, 0-valent/1-valent • Let C be a configuration, V the set of decision values of configurations reachable from C. C is bivalent if |V| = 2. C is univalent if |V| = 1. • 0-valent or 1-valent according to the corresponding decision value.

Lemma 2 • P has a bivalent initial configuration

Lemma 3 • Let C be a bivalent configuration of P, and let e = (p, m) be an event that is applicable to C. Let C be the set of configurations reachable from C without applying e, and let D= e(C) ={e(E) | E C and e is applicable to E}. Then, D contains a bivalent configuration.

Theorem 2 • There is a partially correct consensus protocol in which all nonfaulty processes always reach a decision, provided no processes die during its execution and a strict majority of the processes are alive initially

Conclusion • The problem of fault-tolerant cooperative computing cannot be solved in a totally asynchronous model of computation.. • To solve this problem in practice, more refined models of distributed computing is needed

Impossibility of Distributed Consensus with One Faulty Process

Impossibility of Distributed Consensus with One Faulty Process

Presentation Transcript

Consensus, impossibility results and Paxos

Distributed Algorithms 6. Distributed Consensus with Process failure 7. More Consensus Problems 8. Modeling II: Asynchro

Distributed systems Consensus

CS5412: Consensus and the FLP Impossibility Result

Distributed Consensus with Limited Communication Data Rate

Impossibility of Consensus

Consensus Impossibility

CS5412: Consensus and the FLP Impossibility Result

Impossibility of Consensus in Asynchronous Systems (FLP)

Distributed Consensus

Distributed Consensus (continued)

Impossibility of Distributed Consensus with One Faulty Process

Consensus and Its Impossibility in Asynchronous Systems

Distributed Consensus

Distributed Consensus

Distributed Systems : Raft Consensus Alg.

Distributed Consensus

Distributed Consensus

Minimizing Faulty Executions of Distributed Systems