Brewer’s Conjecture and the feasibility of CAP web services (Eric Brewer) • Seth Gilbert • Nancy Lynch • Presented by Kfir Lev-Ari
Introduction • Brewer’s Conjecture (at PODC 2000): it is impossible for a web service to provide all three of the following guarantees at once: • Consistency • Availability • Partition-tolerance
Story time The story of 2 servers 1 CAP
Motivation (1) • In order to make some [image] you decided to start your own company in the “cloud”. • And you have a brilliant idea! You’ll create a web service named: [image]
Motivation (2) • You’ll give your users the following API: • SetValue(Key, Value) • GetValue(Key) • And you’ll promise them two basic things: • 1. To be available 24/7 • 2. GetValue will return the last value that was set for a given key.
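The API above can be sketched for a single server as follows. This is only an illustration: the class name `KeyValueService`, the snake_case method names, and returning `None` for a key that was never set are assumptions, not part of the slides.

```python
class KeyValueService:
    """Single-server sketch of the SetValue/GetValue API (hypothetical names)."""

    def __init__(self):
        self._store = {}

    def set_value(self, key, value):
        # Overwrite any earlier value for this key and acknowledge the write.
        self._store[key] = value
        return "ack"

    def get_value(self, key):
        # Return the last value set for the key.
        # Assumption: None for a key that was never set.
        return self._store.get(key)


service = KeyValueService()
service.set_value("color", "red")
service.set_value("color", "blue")
latest = service.get_value("color")  # the last value that was set
```

With one server, both promises are easy to keep while that server is up. The rest of the talk is about what happens once the data is replicated on two servers.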
Formal model (1) • Atomic / Linearizable Consistency (of a web service): • There must exist a total order on all operations such that each operation looks as if it were completed at a single instant. • i.e. each server returns the right response to each request. • Equivalent to having a single up-to-date copy of the data.
Formal model (2) • Availability (of a web service): • Every request received by a non-failing node in the system must result in a response. • In other words, any algorithm used by the service must eventually terminate. • Note that there is no bound on how long the algorithm may run before terminating, and therefore the theorem allows unbounded computation. • On the other hand, even when severe network failures occur, every request must terminate.
Formal model (3) • Partition Tolerance (of a web service) – • When a network is partitioned, all messages sent from nodes in one component of the partition to nodes in another component are lost. • Note that unlike the previous two requirements, partition tolerance is really a statement about the underlying system rather than the service itself : it is the communication among the servers that is unreliable.
NOTE: This CAP isn’t made of ACID • The ACID properties (Atomicity, Consistency, Isolation, Durability) focus on consistency and are the traditional approach of databases. The CAP properties describe what is desirable in a networked shared-data system. • In ACID, C means that a transaction preserves all the database rules, such as unique keys. (ACID consistency cannot be maintained across partitions.) • The C in CAP refers only to single-copy consistency (request/response operation sequence), a strict subset of ACID consistency.
Asynchronous networks (1) • Theorem 1: It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: • Availability • Atomic consistency • in all fair executions (including those in which messages are lost).
Asynchronous networks (2) • Proof: We prove this by contradiction. • Assume an algorithm A exists that meets the three criteria: atomicity, availability, and partition tolerance. • We construct an execution of A in which there exists a request that returns an inconsistent response. • Assume that the network consists of at least two nodes. • Thus it can be divided into two disjoint, non-empty sets: {G1, G2}. • The basic idea of the proof is to assume that all messages between G1 and G2 are lost. • If a write occurs in G1, and later a read occurs in G2, then the read operation cannot return the results of the earlier write operation.
Asynchronous networks (3) • Setup: nodes N1 and N2 each hold a copy of a value V; A runs on N1 and B runs on N2. • The good scenario: 1. A writes a new value of V, which we'll call V1. 2. Then a message (M) is passed from N1 to N2, which updates the copy of V there. 3. Now any read by B of V will return V1. • If the network partitions (that is, messages from N1 to N2 are not delivered), then N2 still holds the old, inconsistent value of V when step (3) occurs.
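A minimal simulation of the two scenarios, assuming the write lands on N1 and the read is served by N2. The `Node` class, the `partitioned` flag, and the value names `"V0"`/`"V1"` are hypothetical and only stand in for the picture on the slide.

```python
class Node:
    """One server holding its own copy of the value V."""

    def __init__(self, name, initial):
        self.name = name
        self.value = initial
        self.peer = None


def write(node, new_value, partitioned):
    # Step 1: the write is applied locally.
    node.value = new_value
    # Step 2: message M updates the peer, unless the partition drops it.
    if not partitioned:
        node.peer.value = new_value
    # The write is acknowledged either way (availability).
    return "ack"


def read(node):
    # Step 3: a read is answered from the local copy.
    return node.value


def run_scenario(partitioned):
    n1, n2 = Node("N1", "V0"), Node("N2", "V0")
    n1.peer, n2.peer = n2, n1
    write(n1, "V1", partitioned)
    return read(n2)


good = run_scenario(partitioned=False)  # N2 saw message M
bad = run_scenario(partitioned=True)    # M was lost, N2 answers with the old value
```

Both nodes stay available in the partitioned run, and that is exactly why the read on N2 comes back stale.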
Asynchronous networks (4) • Corollary 1.1 It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: • • Availability, in all fair executions. • • Atomic consistency, in fair executions in which no messages are lost.
Asynchronous networks (5) • Proof: The main idea is that in the asynchronous model an algorithm has no way of determining whether a message has been lost, or has been arbitrarily delayed in the transmission channel. • Therefore if there existed an algorithm that guaranteed atomic consistency in executions in which no messages were lost, then there would exist an algorithm that guaranteed atomic consistency in all executions. This would violate Theorem 1.
From a transactional perspective • Say we have a transaction called α: in α1, A writes new values of V, and in α2, B reads values of V. • On a local system this would be easily handled by a database with some simple locking, isolating any attempt to read in α2 until α1 completes safely. • In the distributed model though, with nodes N1 and N2 to worry about, the intermediate synchronizing message also has to complete. • Unless we can control when α2 happens, we can never guarantee it will see the same data values α1 writes. • All methods to add control (blocking, isolation, centralized management, etc.) will impact either partition tolerance or the availability of α1 (A) and/or α2 (B).
Solutions in the asynchronous model (1) “2 of 3”: • CP (Atomic consistency, Partition Tolerant): Availability is given up, but a liveness criterion stronger than “never respond” can still be offered. Many distributed databases provide this type of guarantee, especially algorithms based on distributed locking or quorums: if there are no failures, liveness is guaranteed; if certain failure patterns occur, the liveness condition is weakened and the service no longer returns responses. • CA (Atomic consistency, Available): Systems that run on intranets and LANs are an example of these types of algorithms. • AP (Available, Partition Tolerant): Web caches are one example of a weakly consistent network.
Circumvent the impossibility? (1) • Partially Synchronous Model • In the real world, most networks are not purely asynchronous. • If we allow each node in the network to have a clock, it is possible to build a more powerful service. • In the partially synchronous model, every node has a clock and all clocks increase at the same rate. • However, the clocks themselves are not synchronized, in that they might display different values at the same real time. • In effect, the clocks act as timers: local state variables that a process can observe to measure how much time has passed. • A local timer can be used to schedule an action to occur a certain interval of time after some other event. • Furthermore, assume that every message is either delivered within a given, known time tmsg or it is lost. • Also, every node processes a received message within a given, known local processing time tlocal.
Circumvent the impossibility? (2) • Theorem 2: It is impossible in the partially synchronous network model to implement a read/write data object that guarantees the following properties: • Availability • Atomic consistency • in all executions (even those in which messages are lost).
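To make the CP case concrete, here is a toy majority-quorum register. It is a sketch under simplifying assumptions: replicas live in one process, a `reachable` flag stands in for the network, and the names `Replica`, `QuorumStore`, and `Unavailable` are invented for illustration. It is not taken from any particular database.

```python
class Unavailable(Exception):
    """Raised when no majority of replicas can be reached."""


class Replica:
    def __init__(self):
        self.version = 0       # 0 means "nothing written yet"
        self.value = None
        self.reachable = True  # False models being cut off by a partition


class QuorumStore:
    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.majority = n // 2 + 1

    def _live_majority(self):
        live = [r for r in self.replicas if r.reachable]
        if len(live) < self.majority:
            # Refuse to answer rather than risk an inconsistent response.
            raise Unavailable("no majority reachable")
        return live

    def write(self, value):
        live = self._live_majority()
        # Any two majorities overlap, so `live` contains the newest version.
        version = max(r.version for r in live) + 1
        for r in live:
            r.version, r.value = version, value
        return "ack"

    def read(self):
        live = self._live_majority()
        # Return the value carrying the highest version among reachable replicas.
        return max(live, key=lambda r: r.version).value


store = QuorumStore(3)
store.write("a")
store.replicas[0].reachable = False  # a minority is cut off
store.write("b")                     # still succeeds with 2 of 3
value_during_partition = store.read()
```

If a second replica becomes unreachable, `read` and `write` raise `Unavailable`: the service stays consistent and stops responding, which is the weakened liveness the CP bullet describes.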
Circumvent the impossibility? (3) • Proof: • Same methodology as in case of Theorem 1 is used. • We divide the network into two components {G1, G2} and construct an admissible execution in which a write happens in one component, followed by a read operation in the other component. This read operation can be shown to return inconsistent data.
Circumvent the impossibility? (4) • (Reminder from the asynchronous model) Corollary 1.1: It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: • Availability, in all fair executions, • Atomic consistency, in fair executions in which no messages are lost. • In the partially synchronous model, the analogue of Corollary 1.1 does not hold, because the proof of the corollary depends on nodes being unaware of when a message is lost. • There are partially synchronous algorithms that will: 1. return atomic data when all messages in an execution are delivered (i.e. there are no partitions); 2. return inconsistent data only when messages are lost.
Circumvent the impossibility? (5) • An example of such an algorithm is the centralized protocol, in which a single node stores the object's state, modified to time out lost messages: • On a read (or write) request, a message is sent to the central node. • If a response from the central node is received, then the node delivers the requested data (or an acknowledgement). • If no response is received within 2 ∗ tmsg + tlocal, then the node concludes that the message was lost. • The client is then sent a response: either the best known value of the local node (for a read operation) or an acknowledgement (for a write operation). In this case, atomic consistency may be violated.
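The protocol can be sketched as below. Several things are assumptions made for the sake of a runnable example: the bounds are given made-up millisecond values, the timeout is taken to be 2 ∗ tmsg + tlocal, the `link_up` flag simulates "no reply arrived before the timeout" instead of using real timers, and the class names are hypothetical.

```python
T_MSG_MS = 50     # assumed bound on message delivery time
T_LOCAL_MS = 10   # assumed bound on local processing time
TIMEOUT_MS = 2 * T_MSG_MS + T_LOCAL_MS  # request + reply + processing

LOST = object()   # sentinel: no reply arrived within TIMEOUT_MS


class CentralNode:
    """Holds the single authoritative copy of the object."""

    def __init__(self, initial):
        self.value = initial

    def handle(self, op, value=None):
        if op == "write":
            self.value = value
            return "ack"
        return self.value


class EdgeNode:
    """A node that forwards client requests to the central node."""

    def __init__(self, central, initial):
        self.central = central
        self.best_known = initial
        self.link_up = True  # False models a partition (messages are lost)

    def _round_trip(self, op, value=None):
        # Simulation only: a real node would wait up to TIMEOUT_MS here.
        if not self.link_up:
            return LOST
        return self.central.handle(op, value)

    def read(self):
        reply = self._round_trip("read")
        if reply is not LOST:
            self.best_known = reply
        # After a timeout the best known local value is returned (may be stale).
        return self.best_known

    def write(self, value):
        # The client is acknowledged even if the message was lost.
        self._round_trip("write", value)
        return "ack"


central = CentralNode("v0")
a, b = EdgeNode(central, "v0"), EdgeNode(central, "v0")
a.write("v1")
fresh = b.read()    # all messages delivered: atomic
b.link_up = False
a.write("v2")
stale = b.read()    # b timed out and answers with its best known value
```

Every request terminates within the timeout, so availability holds. Inconsistent answers such as `stale` appear only in executions where messages were lost.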
Why is “2 of 3” misleading? (1) • Partitions are rare, and there is little reason to forfeit C or A when the system is not partitioned. • The choice between C and A can occur many times within the same system at very fine granularity. Not only can subsystems make different choices, but the choice can change according to the operation or even the specific data or user involved. • All three properties are more continuous than binary: 1. Availability is obviously continuous from 0% to 100%. 2. There are many levels of consistency. 3. Partitions have nuances, including disagreement within the system about whether a partition exists.
CAP-Latency connection • The essence of CAP takes place during a timeout, a period when the program must make a fundamental decision, the partition decision: • cancel the operation and thus decrease availability, or • proceed with the operation and thus risk inconsistency. • In its classic interpretation, the CAP theorem ignores latency, although in practice latency and partitions are deeply related. • Viewed this way, a partition is a time bound on communication. • Failing to achieve consistency within the time bound (due to high latency) implies a partition and thus a choice between C and A for this operation. • In addition, some systems (for example Yahoo’s PNUTS) give up consistency not to improve availability, but to lower latency.
More problems with CAP? • We saw that there is no real use in CP systems (systems that aren’t available?!), so the real meaning is that availability is only sacrificed when there is a network partition. • In practice, this means that the roles of A and C in CAP are asymmetric: systems that sacrifice consistency (AP systems) tend to do so all the time, not just when there is a network partition. • Is there any practical difference between CA and CP systems? • As written above, a CP system sacrifices availability when there is a network partition. • CA systems are not tolerant of network partitions, thus they won’t be available if there is a partition. • So practically speaking, CA and CP are identical. • The only real question is: what are you going to give up on a partition, C or A? “2 out of 3” is just confusing.
What’s the real story here? (1) • The tradeoff between consistency and availability in a partition-prone system is an example of the general tradeoff between safety and liveness in an unreliable system: • Atomic consistency is a safety property: in every execution, every response is correct with respect to the “prior” operations (nothing bad happens). • Availability is a liveness property: if an execution continues for long enough, then eventually we get a response (something desirable happens). • Understanding the relationship between safety and liveness has been a long-standing challenge in distributed computing.
What’s the real story here? (2) • The replicated state machine paradigm is one of the most common approaches for building reliable distributed services. • This paradigm achieves availability by replicating the service across a set of servers. The servers then agree (i.e. reach consensus) on every operation performed by the service. • The impossibility of fault-tolerant consensus implies that services built according to the replicated state machine paradigm cannot achieve both availability and consistency in an asynchronous network. (The consensus impossibility result was proved in 1985.)
Conclusion • We have shown that it is impossible to reliably provide atomic, consistent data when there are partitions in the network. • It is feasible, however, to achieve any two of the three properties: consistency, availability and partition tolerance. • In an asynchronous model, when no clocks are available, the impossibility result is fairly strong: it is impossible to provide consistent data, even allowing stale data to be returned when messages are lost. • However, in partially synchronous models it is possible to achieve a practical compromise between consistency and availability.
References • Seth Gilbert and Nancy Lynch, “Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services”, SIGACT News, June 2002 • Eric Brewer, “CAP Twelve Years Later: How the “Rules” Have Changed”, IEEE Computer, vol. 45, no. 2, Feb. 2012 • Seth Gilbert and Nancy Lynch, “Perspectives on the CAP Theorem”, IEEE Computer, vol. 45, no. 2, Feb. 2012
APPENDIX A: Proof of Theorem 1 • Let v0 be the initial value of the atomic object. • Let α1 be the prefix of an execution of A in which a single write of a value not equal to v0 occurs in G1, ending with the termination of the write operation. • Assume that no other client requests occur in either G1 or G2. • Further, assume that no messages from G1 are received in G2, and no messages from G2 are received in G1. We know that this write completes, by the availability requirement. • Similarly, let α2 be the prefix of an execution in which a single read occurs in G2, and no other client requests occur, ending with the termination of the read operation. • During α2, no messages from G2 are received in G1, and no messages from G1 are received in G2. • Again, we know that the read returns a value, by the availability requirement. • The value returned by this execution must be v0, as no write operation has occurred in α2. • Let α be an execution beginning with α1 and continuing with α2. To the nodes in G2, α is indistinguishable from α2, as all the messages from G1 to G2 are lost (in both α1 and α2, which together make up α), and α1 does not include any client requests to nodes in G2. • Therefore in the execution α, the read request (from α2) must still return v0. • However, the read request does not begin until after the write request (from α1) has completed. • This contradicts the atomicity property, proving that no such algorithm exists.
APPENDIX B: Proof of Corollary 1.1 • Assume for the sake of contradiction that there exists an algorithm A that always terminates, and guarantees atomic consistency in fair executions in which all messages are delivered. • Theorem 1 implies that A does not guarantee atomic consistency in all fair executions, so there exists some fair execution α of A in which some response is not atomic. • At some finite point in execution α, the algorithm A returns a response that is not atomic. • Let α′ be the prefix of α ending with the invalid response. • Next, extend α′ to a fair execution α″ in which all messages are delivered. • The execution α″ is now a fair execution in which all messages are delivered. • However, this execution is not atomic. Therefore no such algorithm A exists.
APPENDIX C: Proof of Theorem 2 • We construct execution α1 as in Theorem 1: a single write request and acknowledgement occur in G1, and all messages between the two components {G1, G2} are lost. • Let α2′ be an execution that begins with a long interval of time during which no client requests occur. This interval must be at least as long as the entire duration of α1. • Then append to α2′ the events of α2, as in Theorem 1: a single read request and response in G2, again assuming all messages between the two components are lost. • Finally, we construct α by superimposing the two executions α1 and α2′. • The long interval of time in α2′ ensures that the write request completes before the read request begins. However, the read request returns the initial value, rather than the new value written by the write request, violating atomic consistency.
Weaker Consistency Conditions (1) • While it is useful to guarantee that atomic data will be returned in executions in which all messages are delivered, it is equally important to specify what happens in executions in which some of the messages are lost. • We discuss a possible weaker consistency condition that allows stale data to be returned when there are partitions, yet places formal requirements on the quality of the stale data returned. • This consistency guarantee requires availability and atomic consistency in executions in which no messages are lost, and is therefore impossible to guarantee in the asynchronous model as a result of Corollary 1.1. • In the partially synchronous model it often makes sense to base guarantees on how long an algorithm has had to rectify a situation. • This consistency model ensures that if messages are delivered, then eventually some notion of atomicity is restored.
Weaker Consistency Conditions (2) • In an atomic execution, we define a partial order of the read and write operations and then require that if one operation begins after another one ends, the former does not precede the latter in the partial order. • We define a weaker guarantee, t-Connected Consistency, which defines a partial order in a similar manner, but only requires that one operation not precede another if there is an interval between the operations in which all messages are delivered.
Weaker Consistency Conditions (3) • A timed execution α of a read-write object is t-Connected Consistent if two criteria hold. First, in executions in which no messages are lost, the execution is atomic. Second, in executions in which messages are lost, there exists a partial order P on the operations in α such that: • 1. P orders all write operations, and orders all read operations with respect to the write operations. • 2. The value returned by every read operation is exactly the one written by the previous write operation in P, or the initial value if there is no such previous write in P. • 3. The order in P is consistent with the order of read and write requests submitted at each node. • 4. Assume that there exists an interval of time longer than t in which no messages are lost. Further, assume an operation θ completes before the interval begins, and another operation φ begins after the interval ends. Then φ does not precede θ in the partial order P.
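Condition 4 can be checked mechanically for a candidate order. The sketch below makes simplifying assumptions: P is given as a list (a total order, which is a special case of a partial order), operations carry explicit start and end times, and lossless periods are given as `(start, end)` pairs. The names `Op` and `violates_condition_4` are hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Op:
    name: str
    start: int  # time the request was issued
    end: int    # time the response was returned


def violates_condition_4(order, lossless_intervals, t):
    """Return True if some phi precedes theta in `order` although a lossless
    interval longer than t lies between the end of theta and the start of phi."""
    position = {op.name: i for i, op in enumerate(order)}
    long_enough = [(s, e) for s, e in lossless_intervals if e - s > t]
    for theta, phi in permutations(order, 2):
        for s, e in long_enough:
            separated = theta.end < s and phi.start > e
            if separated and position[phi.name] < position[theta.name]:
                return True
    return False


w = Op("write", start=0, end=1)
r = Op("read", start=10, end=11)
healed = [(2, 8)]  # no messages lost between time 2 and time 8

stale_read_allowed = not violates_condition_4([r, w], healed, t=7)
stale_read_forbidden = violates_condition_4([r, w], healed, t=5)
```

With t = 7 the lossless interval (length 6) is too short to force anything, so ordering the read before the write, i.e. a stale read, is still permitted. With t = 5 the same order violates condition 4.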
Weaker Consistency Conditions (4) • t-Connected Consistency • This guarantee allows for some stale data when messages are lost, but provides a time limit on how long it takes for consistency to return, once the partition heals. • This definition can be generalized to provide consistency guarantees when only some of the nodes are connected and when connections are available only some of the time.
Weaker Consistency Conditions (5) • A variant of the “centralized algorithm” is t-Connected Consistent. Assume node C is the centralized node. The algorithm behaves as follows: • Read at node A: A sends a request to C for the most recent value. If A receives a response from C within time 2 ∗ tmsg + tlocal, it saves the value and returns it to the client. • Otherwise, A concludes that a message was lost and returns the value with the highest sequence number that has ever been received from C, or the initial value if no value has yet been received from C. (When a client read request occurs at C, it acts like any other node, sending messages to itself.)
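The read path at node A might look as follows. This is a sketch with assumed details: values from C are modelled as `(seq, value)` records, sequence number 0 is reserved for the initial value, and the caller passes `None` when no reply arrived within 2 ∗ tmsg + tlocal. The names `Versioned` and `NodeA` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Versioned:
    seq: int    # sequence number assigned by the central node C
    value: str


class NodeA:
    def __init__(self, initial_value):
        # Assumption: sequence number 0 marks the initial value.
        self.latest = Versioned(0, initial_value)

    def receive_from_c(self, msg):
        # Keep only the value with the highest sequence number seen so far.
        if msg.seq > self.latest.seq:
            self.latest = msg

    def read(self, reply_from_c: Optional[Versioned]):
        """`reply_from_c` is C's reply if it arrived in time, else None."""
        if reply_from_c is not None:
            self.receive_from_c(reply_from_c)  # save the value
            return reply_from_c.value          # and return it to the client
        # Timeout: fall back to the highest-sequence value ever received from C.
        return self.latest.value


a = NodeA("v0")
before_any_reply = a.read(None)           # nothing received yet
in_time = a.read(Versioned(2, "v2"))      # reply arrived within the bound
a.receive_from_c(Versioned(1, "v1"))      # an older value shows up late
after_timeout = a.read(None)              # partition: answer from local state
```

Tracking the sequence number rather than the arrival order means a late, older message cannot roll the node back to an earlier value.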