Consistency without consensus: Linearizable Resilient Data Types (LRDT). Kaushik Rajan, Sagar Chordia, Kapil Vaswani, Ganesan Ramalingam, Sriram Rajamani
Consistency & consensus • Example operations: Add(The Hobbit), Add(Kindle), GetCart(), GetCart() • Processes agree on ordering of operations • Consensus has no deterministic algorithm in an asynchronous system in the presence of failures [FLP]
Commuting updates • What if all update operations commute? • Ordering of updates doesn’t matter! • Eventual consistency reduces to eventual message delivery • Single round trip latency • What if we desire linearizability? • Updates don’t commute with arbitrary reads • Reads must be consistently ordered with updates • Semantics of queries like the current top(k) elements well understood
Commuting updates • Example: updates Add(The Hobbit) and Add(Kindle); one GetCart() returns {}, another returns {The Hobbit, Kindle} • Reads must observe comparable sets of operations
Linearizable resilient data types • P1 : commutes(s,op1,op2) • P2 : nullify(s,op1,op2) • [Figure: state diagrams over states S, S’, S1, S2 and operations op1, op2 for the Possible, Impossible and Don’t know cases]
Examples • Read write register : every pair of writes nullify • Read write memory : writes to the same location nullify, writes to different locations commute
Examples • Set : add, remove and read the whole set • Add(u), Remove(v) commute • Add(u), Remove(u) nullify • Add(*), Add(*) commute • Remove(*), Remove(*) commute • Counter : IncrBy(x), DecrBy(x), SetTo(v), Read() • SetTo(v) nullifies all other operations • Other pairs of updates commute • Other examples: heaps, union-find, atomic snapshot objects…
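The pairwise claims in these examples can be checked mechanically on small states. What follows is a minimal sketch, assuming operations are modelled as pure functions from state to state, and assuming that "nullify" means running the earlier operation and then the later one gives the same state as running the later one alone. That reading and all helper names are assumptions, not definitions taken from the slides.

```python
def commutes(s, op1, op2):
    """True if op1 and op2 reach the same state in either order from s."""
    return op2(op1(s)) == op1(op2(s))

def nullifies(s, later, earlier):
    """Assumed reading: `later` applied after `earlier` equals `later` alone."""
    return later(earlier(s)) == later(s)

# Set operations over frozensets (hypothetical helpers).
def add(u):
    return lambda s: s | {u}

def remove(u):
    return lambda s: s - {u}

# Counter operations over ints (hypothetical helpers).
def incr_by(x):
    return lambda c: c + x

def decr_by(x):
    return lambda c: c - x

def set_to(v):
    return lambda c: v

set_states = [frozenset(), frozenset({"u"}), frozenset({"v"}), frozenset({"u", "v"})]
counter_states = range(-3, 4)

set_checks = {
    "Add(u), Remove(v) commute": all(
        commutes(s, add("u"), remove("v")) for s in set_states),
    "Add(u), Remove(u) nullify": all(
        nullifies(s, remove("u"), add("u")) and nullifies(s, add("u"), remove("u"))
        for s in set_states),
    "Add(*), Add(*) commute": all(
        commutes(s, add("u"), add("v")) for s in set_states),
    "Remove(*), Remove(*) commute": all(
        commutes(s, remove("u"), remove("v")) for s in set_states),
}

counter_checks = {
    "SetTo nullifies IncrBy": all(
        nullifies(c, set_to(5), incr_by(2)) for c in counter_states),
    "SetTo nullifies DecrBy": all(
        nullifies(c, set_to(5), decr_by(2)) for c in counter_states),
    "SetTo nullifies SetTo": all(
        nullifies(c, set_to(5), set_to(7)) for c in counter_states),
    "IncrBy, DecrBy commute": all(
        commutes(c, incr_by(2), decr_by(3)) for c in counter_states),
}
```

Under this reading every listed pair passes, while SetTo(v) and IncrBy(x) do not commute, which is why the nullify property is needed for the counter.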
Lattice agreement • Consistency reduces to lattice agreement • Weaker problem than consensus • Solvable in an asynchronous distributed system • Assumptions • t < n/2 failures • Eventual message delivery
Lattice agreement • n processes, each process starts with a value belonging to a join semilattice • Each non-faulty process outputs a value • (Validity) Each process’ output is a join of one or more input values including its own • (Consistency) Any two output values are comparable • (Liveness) Every correct process eventually outputs a value
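To make Validity and Consistency concrete, here is a small checker for one particular join semilattice: finite sets ordered by inclusion, with union as join. The choice of lattice, the function names and the sample inputs are illustrative assumptions. Liveness is a property of executions and is not checked here.

```python
from itertools import combinations

def join(values):
    """Join in the power-set lattice is set union."""
    result = frozenset()
    for v in values:
        result |= v
    return result

def comparable(x, y):
    """Two sets are comparable if one contains the other."""
    return x <= y or y <= x

def is_valid(process, inputs, output):
    """Output must include the process' own input and be a join of inputs."""
    contained = [v for v in inputs.values() if v <= output]
    return inputs[process] <= output and join(contained) == output

def check_outputs(inputs, outputs):
    """Return (validity, consistency) for dicts mapping process -> frozenset."""
    validity = all(is_valid(p, inputs, out) for p, out in outputs.items())
    consistency = all(
        comparable(x, y) for x, y in combinations(outputs.values(), 2))
    return validity, consistency

inputs = {"p1": frozenset({"a"}), "p2": frozenset({"b"}), "p3": frozenset({"c"})}

# Outputs that form a chain: {a} <= {a, b} <= {a, b, c}.
chain = {
    "p1": frozenset({"a"}),
    "p2": frozenset({"a", "b"}),
    "p3": frozenset({"a", "b", "c"}),
}

# Each process outputs only its own input: valid, but not comparable.
diverged = dict(inputs)

chain_result = check_outputs(inputs, chain)
diverged_result = check_outputs(inputs, diverged)
```

The second assignment shows why Consistency is the hard part: every output is individually valid, yet {a} and {b} are incomparable.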
Lattice agreement • Example values: a = Add(The Hobbit), b = Add(Kindle), c = Add(Lumia)
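[Figure: flowchart of the lattice agreement algorithm with PROPOSERS and ACCEPTORS lanes] • Proposers: initially, send to all acceptors and wait for a majority of acceptors to respond • All acks? Y: output; N: send again • Acceptors: act on receiving a message from a proposer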
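The proposer/acceptor flow can be sketched as a small sequential simulation. The slide's formulas did not survive, so the acceptor rule used here (ack when the proposal contains everything accepted so far, otherwise nack and reply with the join) is an assumption based on the usual shape of such algorithms, and the class and function names are hypothetical. Values are sets and join is union. Each proposer runs to completion before the next starts, so this illustrates the message pattern rather than real concurrency or failures.

```python
class Acceptor:
    def __init__(self):
        self.accepted = frozenset()

    def receive(self, proposed):
        """Ack if the proposal covers the accepted value, otherwise nack.

        In both cases the accepted value becomes the join of the two.
        """
        ack = self.accepted <= proposed
        self.accepted = self.accepted | proposed
        return ack, self.accepted


def propose(value, acceptors, quorum):
    """Run one proposer against a majority `quorum` of acceptor indices."""
    proposed = frozenset(value)
    round_trips = 0
    while True:
        round_trips += 1
        replies = [acceptors[i].receive(proposed) for i in quorum]
        if all(ack for ack, _ in replies):
            return proposed, round_trips
        # On any nack, refine the proposal with everything the acceptors hold.
        for _, accepted in replies:
            proposed |= accepted


acceptors = [Acceptor() for _ in range(3)]

# Three proposers, each talking to a different majority (2 of 3).
out_a, trips_a = propose({"a"}, acceptors, [0, 1])
out_b, trips_b = propose({"b"}, acceptors, [1, 2])
out_c, trips_c = propose({"c"}, acceptors, [0, 2])
outputs = [out_a, out_b, out_c]
```

Because any two majorities intersect, a later proposer always hears about earlier accepted values from at least one acceptor, which is what forces the outputs into a chain.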
Safety and liveness • Safety always guaranteed • Lattice agreement is t-resilient • Liveness guaranteed if a quorum of processes is non-faulty and communication is reliable • Processes output a value in at most n round trips, where n is the number of processes
Generalized lattice agreement • Generalization of lattice agreement • Processes receive a sequence of values • Values belong to an infinite lattice • Processes output a sequence of values • (Validity) Every output value is a join of some received values • (Consistency) Any two output values are comparable (i.e. output values form a chain) • (Liveness) Every value received by a correct process is eventually included in an output value
GLA algorithm • Liveness (t-resilient) • Every received value is eventually included in some output in n round trips • Adaptive, complexity depends on contention • Fast path • Received values output in one round trip • Reconfigurable • Replicas can be added/removed dynamically
From GLA to linearizability • Update commands form a power set lattice • Updates return once a majority of processes have learnt a command set that includes the update command • Reads are performed by an ABD-style algorithm • Reading the learnt command set from a quorum of processes • Writing back the largest among these to a quorum • Constructing the state corresponding to the largest command set by exploiting commutativity and nullification • Multi-master replication • Does not require a single primary/leader
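The three read steps can be sketched for the simplest case, an add-only cart where every update commutes, so the state can be rebuilt by applying the learnt commands in any order. The replica representation (a list of frozensets of commands), the function names and the use of the same quorum for the write-back are all illustrative assumptions. Nullification is not modelled here.

```python
def largest(command_sets):
    """Learnt command sets form a chain, so the largest contains all others."""
    top = max(command_sets, key=len)
    assert all(s <= top for s in command_sets), "learnt sets must be comparable"
    return top

def build_cart(commands):
    """Every command is ("Add", item); they commute, so order is irrelevant."""
    cart = set()
    for kind, item in commands:
        assert kind == "Add"
        cart.add(item)
    return cart

def read(replicas, quorum):
    """ABD-style read over `replicas`, a list of learnt command sets."""
    # Step 1: read the learnt command set from a quorum.
    learnt = [replicas[i] for i in quorum]
    top = largest(learnt)
    # Step 2: write the largest set back to a quorum.
    for i in quorum:
        replicas[i] = replicas[i] | top
    # Step 3: construct the state for the largest command set.
    return build_cart(top)

hobbit = ("Add", "The Hobbit")
kindle = ("Add", "Kindle")
replicas = [frozenset({hobbit}), frozenset({hobbit, kindle}), frozenset()]

first = read(replicas, [0, 1])    # sees both commands via replica 1
second = read(replicas, [0, 2])   # replica 0 now holds the written-back set
```

The write-back is what keeps a later read, even one served by a different quorum, from observing a smaller command set than an earlier read.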
Impossibility • Consensus reduction, given a pair of idempotent update operations op1, op2 that neither commute nor nullify at some state S0 • Consensus(b): if (b) then op1 else op2; s = read(); if (s = S1 or s = S12) return true else return false • [Figure: state diagram: Si reaches S0 via Op*; from S0, op1 leads to S1 and then op2 to S12; from S0, op2 leads to S2 and then op1 to S21]
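The reduction can be exercised on a concrete pair of operations. The sketch below assumes a sequence with an "append if absent" update, chosen here only because it is idempotent and neither commutes nor nullifies at the empty sequence. It is not taken from the slides. A single shared variable updated one step at a time stands in for the linearizable object, and all names are hypothetical.

```python
from itertools import permutations, product

A, B = "a", "b"

def append_if_absent(x):
    """Idempotent update: append x unless it is already in the sequence."""
    def op(seq):
        return seq if x in seq else seq + (x,)
    return op

OP1, OP2 = append_if_absent(A), append_if_absent(B)

def decide(state):
    """S1 = (a,) and S12 = (a, b) both mean op1 was applied first."""
    return state in {(A,), (A, B)}

def run(inputs, schedule):
    """Simulate Consensus(b) for each process under one interleaving.

    `schedule` lists process ids; a process' first appearance is its
    update step, its second appearance is its read step.
    """
    state = ()          # S0: the empty sequence
    updated = set()
    decisions = {}
    for p in schedule:
        if p not in updated:
            state = (OP1 if inputs[p] else OP2)(state)
            updated.add(p)
        else:
            decisions[p] = decide(state)
    return decisions

schedules = set(permutations([0, 0, 1, 1]))
results = [
    (inputs, run(inputs, schedule))
    for inputs in product([True, False], repeat=2)
    for schedule in schedules
]
```

In every interleaving both processes decide the same bit, and that bit is the input of whichever process updated first, which is the agreement and validity a consensus object must provide.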
Implications for designing ADTs Most commands commute
Implications for designing ADTs neither commute nor nullify at ;
The Gap : Open problems • Doubly saturating counter with states 0, 1, 2, …, n and operations Incr() and Decr() • Incr() and Decr() commute at 1 … n-1 • Incr() and Decr() nullify at 0 and n • Don’t know if this is possible or impossible
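The state-dependent behaviour of the doubly saturating counter is easy to verify by enumeration. This sketch assumes Incr() saturates at n, Decr() saturates at 0, and "nullify" means the later operation alone gives the same state as both operations in sequence. The function names are hypothetical.

```python
def make_ops(n):
    """Return Incr and Decr for a counter that saturates at 0 and n."""
    def incr(s):
        return min(s + 1, n)

    def decr(s):
        return max(s - 1, 0)

    return incr, decr

def classify(n):
    """Label each state 0..n by how Incr() and Decr() interact there."""
    incr, decr = make_ops(n)
    result = {}
    for s in range(n + 1):
        commute = decr(incr(s)) == incr(decr(s))
        # Assumed reading of nullify, checked in both directions.
        nullify = decr(incr(s)) == decr(s) and incr(decr(s)) == incr(s)
        if commute:
            result[s] = "commute"
        elif nullify:
            result[s] = "nullify"
        else:
            result[s] = "neither"
    return result

labels = classify(3)
```

For n = 3 the interior states 1 and 2 are labelled "commute" and the boundary states 0 and 3 are labelled "nullify", so no single property holds at every state.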
Summary • Possible: graph, RW mem… • Impossible: queues, sequences • ??: saturating counter