Replicated data consistency explained through baseball. Landon Cox, April 4, 2018. Lecture by Doug Terry, distributed systems guru (PARC, MSR, now at Amazon) and baseball fan.
Replicated-data consistency • A set of invariants on each read operation • Which writes are guaranteed to be reflected in the data that’s returned? • i.e., what write orders are guaranteed? • Consistency is an application-level concern • When consistency is too weak, applications break • Example: auction site must not tell two people they won • What are the negative consequences of too-strong consistency? • Worse performance (for reads and writes) • Worse availability (for reads and writes)
Assumptions for our discussion • Clients perform reads and writes • Data is replicated among a set of servers • Writes are serialized (one logical writer) • Eventually performed in the same order at all servers • Write order consistent with write-request order • Reads reflect one or more past writes
Consistency models • Strong consistency • Reader sees effect of all prior writes • Eventual consistency • Reader sees effect of some subset of prior writes • What this really means: if writes stop, reads eventually see the effect of all writes • Major caveats: writes may never stop, and “eventually” could be very far into the future • Consistent prefix • Reader sees effect of initial sequence of writes • Bounded staleness • Reader sees effect of all “old” writes • Monotonic reads • Reader sees effect of increasing subset of writes • Read my writes • Reader sees effect of all writes performed by reader
Baseball rules • Time is measured in innings • Games are normally nine innings long • Can be longer if the score is tied after nine innings • Points are called runs • During an inning • Visiting team bats until it is out three times • Home team bats until it is out three times • Go to the next inning
Pseudo-code baseball game (primary game thread; the only thread that issues writes)
    Write (“visitors”, 0);
    Write (“home”, 0);
    for inning = 1..9
        outs = 0;
        while outs < 3
            visiting player bats;
            for each run scored
                score = Read (“visitors”);
                Write (“visitors”, score + 1);
        outs = 0;
        while outs < 3
            home player bats;
            for each run scored
                score = Read (“home”);
                Write (“home”, score + 1);
    end game;
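The game loop can be made runnable with a small sketch. `ScoreStore` is a hypothetical single-copy stand-in for the replicated store, and the runs per half inning are passed in by the caller instead of being produced by batters; neither detail comes from the slides.

```python
class ScoreStore:
    """Hypothetical single-copy key-value store standing in for the replicated store."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        return self._data[key]

    def write(self, key, value):
        self._data[key] = value


def play_game(store, runs_by_inning):
    """Replay a game. runs_by_inning holds one (visitor_runs, home_runs) pair per inning."""
    store.write("visitors", 0)
    store.write("home", 0)
    for visitor_runs, home_runs in runs_by_inning:
        # Top of the inning: every run is a read followed by a write.
        for _ in range(visitor_runs):
            score = store.read("visitors")
            store.write("visitors", score + 1)
        # Bottom of the inning: same pattern for the home team.
        for _ in range(home_runs):
            score = store.read("home")
            store.write("home", score + 1)
    return store.read("visitors"), store.read("home")
```

Each run is a read-modify-write, so the score only stays correct if every read returns the value this same thread wrote last.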
Baseball applications • Application: entity that accesses the score • Examples: umpire, radio reporter, score keeper, game recapper, statistician
Baseball applications • Applications have different requirements • Some must have up-to-date score • Others are more tolerant of stale scores • Nearly all need some kind of guarantee across score accesses
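[Diagram: servers S1-S3 replicate the visitors’ score (V) and servers S4-S6 replicate the home score (H). Three clients issue reads (R) and writes (W): a reader, a writer that also reads, and another reader.]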
Example 1: score keeper
    score = Read (“visitors”);
    Write (“visitors”, score + 1);
    …
    score = Read (“home”);
    Write (“home”, score + 1);
Example 1: score keeper. The score keeper makes sure that both scores increase monotonically. Writes: Write (“home”, 1); Write (“visitors”, 1); Write (“home”, 2); Write (“home”, 3); Write (“visitors”, 2); Write (“home”, 4); Write (“home”, 5); Final score: Visitors = 2, Home = 5.
Example 1: score keeper. What invariant must the store provide so the score keeper can ensure monotonically increasing scores? Reads must show the effect of all prior writes (strong consistency).
Example 1: score keeper. Under strong consistency, what possible scores can the score keeper read after the final write, Write (“home”, 5), completes? Only 2-5.
Example 1: score keeper. Under read-my-writes, what possible scores can the score keeper read after the final write, Write (“home”, 5), completes? Only 2-5.
[Diagram: servers S1-S3 replicate the visitors’ score (V) and servers S4-S6 replicate the home score (H). Clients: two writers (which also read) and one reader; three writes (W) are issued.]
[Diagram: replica contents are S1 = V, S2 = V1, S3 = V2, S4 = H, S5 = H1, S6 = H; a read (R) arrives at S3.] Under strong consistency, who must S3 have spoken to (directly or indirectly) to satisfy the read request? S2 and S5.
[Same diagram.] When does S3 have to talk to S2 and S5? Before the writes return or before the read returns? The implementation can be flexible. The guarantee is that the exchange occurs before the read completes.
[Same diagram.] Under read-my-writes, who must S3 have spoken to (directly or indirectly) to satisfy the read request? S5.
[Same diagram, with clients reader, writer (also reads), reader.] For the score keeper, why is read-my-writes equivalent to strong consistency, even though it is “weaker”? Because the application has only one writer. This is not true in general.
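A minimal sketch of why read-my-writes is enough for a single writer. The `Replica` and `Session` classes are hypothetical, and the sketch assumes each replica holds both scores and applies a prefix of one shared write log, which is simpler than the per-score replicas in the diagram.

```python
class Replica:
    """Hypothetical replica that applies a prefix of the shared write log."""

    def __init__(self):
        self.applied = 0  # number of log entries applied so far
        self.data = {"visitors": 0, "home": 0}

    def catch_up(self, log, upto):
        # Apply any missing log entries up to position `upto`, in log order.
        for key, value in log[self.applied:upto]:
            self.data[key] = value
        self.applied = max(self.applied, upto)


class Session:
    """Client session that enforces read-my-writes."""

    def __init__(self, log):
        self.log = log
        self.last_write = 0  # log position just after this client's latest write

    def write(self, replica, key, value):
        self.log.append((key, value))
        replica.catch_up(self.log, len(self.log))
        self.last_write = len(self.log)

    def read(self, replica, key):
        # The replica must first reflect every write made in this session.
        if replica.applied < self.last_write:
            replica.catch_up(self.log, self.last_write)
        return replica.data[key]


log = []
a, b = Replica(), Replica()
keeper = Session(log)
keeper.write(a, "home", 1)
stale = b.data["home"]          # a plain read at replica b still sees 0
fresh = keeper.read(b, "home")  # the session read forces b to catch up first
```

Because the score keeper is the only writer, `last_write` always equals the length of the log, so every session read sees all prior writes. With a second writer the log could grow past `last_write`, and the two guarantees would differ.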
Example 1: score keeper. Writes: Write (“home”, 1); Write (“visitors”, 1); Write (“home”, 2); Write (“home”, 3); Write (“visitors”, 2); Write (“home”, 4); Write (“home”, 5); Final score: Visitors = 2, Home = 5. Common theme: consider the application’s invariants, then reason about what the store must ensure to support them.
Example 2: umpire
    if first half of 9th inning complete then
        vScore = Read (“visitors”);
        hScore = Read (“home”);
        if vScore < hScore
            end game;
Idea: the home team doesn’t need another chance to bat if it is already ahead going into the final half inning.
Umpire invariant: the game should end if the home team leads going into the final half inning.
What subset of writes must be visible to the umpire to ensure the game ends appropriately? Reads must show the effect of all prior writes (strong consistency).
Would read-my-writes work (as it did for the score keeper)? No, since the umpire doesn’t issue any writes.
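The umpire's check can be sketched as a function of whatever `read` returns. The two score tables below are made-up values for illustration: one holds the latest scores and the other is a replica that has missed two home runs.

```python
def umpire_should_end_game(read):
    """Decision taken once the first half of the 9th inning is complete."""
    v_score = read("visitors")
    h_score = read("home")
    return v_score < h_score


# Hypothetical scores: the true state and a replica that is behind on "home".
latest = {"visitors": 2, "home": 3}
stale_replica = {"visitors": 2, "home": 1}

strong_decision = umpire_should_end_game(latest.__getitem__)        # True
stale_decision = umpire_should_end_game(stale_replica.__getitem__)  # False
```

With the stale replica the home team is made to bat again even though it already leads. The umpire has written nothing, so a read-my-writes session places no constraint on which replica answers.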
Consistency models • Strong consistency • Reader sees effect of all prior writes • Eventual consistency • Reader sees effect of subset of prior writes • Consistent prefix • Reader sees effect of initial sequence of writes • Bounded staleness • Reader sees effect of all “old” writes • Monotonic reads • Reader sees effect of increasing subset of writes • Read my writes • Reader sees effect of all writes performed by reader • For monotonic reads and read my writes, the reader’s prior accesses (reads or writes) affect the guarantees
[Diagram: servers S1-S3 replicate the visitors’ score (V) and servers S4-S6 replicate the home score (H). The writer issues W1, W2, and W3. Clients: reader, writer, reader.]
[Diagram: W2 has been applied at S2, W1 at S4, and W3 at S5; S1, S3, and S6 still hold the initial values. Each read fetches both scores.] Strong consistency: which writes could be reflected in the answer to R1? (V, H) = (W2, W3)
[Same diagram.] Strong consistency: which writes could be reflected in the answer to R2? (V, H) = (W2, W3)
[Same diagram.] Eventual consistency: which writes could be reflected in the answer to R1? (0, 0), (0, W1), (0, W3), (W2, 0), (W2, W1), (W2, W3)
[Same diagram.] Eventual consistency: which writes could be reflected in the answer to R2? (0, 0), (0, W1), (0, W3), (W2, 0), (W2, W1), (W2, W3)
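The eventual-consistency answers can be reproduced by enumerating every subset of the writes. This sketch assumes, as the slide's answers suggest, that W1 and W3 update the home score, W2 updates the visitors' score, and the writes are ordered W1, W2, W3.

```python
from itertools import combinations

# Assumed from the slide: write name and the key it updates, in write order.
WRITES = [("W1", "home"), ("W2", "visitors"), ("W3", "home")]


def visible_state(subset):
    """Return (V, H) after applying a subset of the writes in write order."""
    state = {"visitors": "0", "home": "0"}
    for name, key in subset:
        state[key] = name
    return (state["visitors"], state["home"])


def eventual_outcomes():
    """A read may reflect any subset of the prior writes."""
    outcomes = set()
    for size in range(len(WRITES) + 1):
        # combinations() keeps the input order, so each subset stays in write order.
        for subset in combinations(WRITES, size):
            outcomes.add(visible_state(subset))
    return outcomes
```

The eight subsets collapse to six distinct answers because W3 overwrites W1 whenever both are visible.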
[Diagram: W2 applied at S2, W1 at S4, W3 at S5.] Consistent prefix: which writes could be reflected in the answer to R1? (0, 0), (0, W1), (W2, W1), (W2, W3)
[Same diagram.] Consistent prefix: which writes could be reflected in the answer to R2? (0, 0), (0, W1), (W2, W1), (W2, W3)
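Under consistent prefix, only initial sequences of the write order are visible. A minimal sketch, again assuming W1 and W3 update the home score, W2 updates the visitors' score, and the order is W1, W2, W3:

```python
# Assumed from the slide: write name and the key it updates, in write order.
WRITES = [("W1", "home"), ("W2", "visitors"), ("W3", "home")]


def prefix_outcomes():
    """Return the (V, H) answer for every prefix of the write order."""
    state = {"visitors": "0", "home": "0"}
    outcomes = [(state["visitors"], state["home"])]  # the empty prefix
    for name, key in WRITES:
        state[key] = name
        outcomes.append((state["visitors"], state["home"]))
    return outcomes
```

Answers such as (W2, 0) and (0, W3) are excluded because no prefix of the write order produces them.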
Consistent prefix • Provides a guarantee across variables • Similar to “snapshot isolation” • Must see version of data store that existed at some point • e.g., must see score that really occurred during the game • Assumes that reads are logically grouped • Via what mechanism are reads grouped? (DB question) • Reads (and writes) are logically grouped via transactions • Reads/writes are treated together, rather than individually • Normally provided with additional guarantees • e.g., all reads/writes in transaction succeed/fail together (atomicity)
[Diagram: W2 applied at S2, W1 at S4, W3 at S5.] Monotonic reads: which writes could be reflected in the answer to R1? (0, 0), (0, W1), (0, W3), (W2, 0), (W2, W1), (W2, W3)
[Same diagram.] Monotonic reads: which writes could be reflected in the answer to R2?
    If R1 = (0, 0), then (0, 0), (0, W1), (0, W3), (W2, 0), (W2, W1), (W2, W3)
    If R1 = (0, W1), then (0, W1), (0, W3), (W2, W1), (W2, W3)
    If R1 = (0, W3), then (0, W3), (W2, W3)
    If R1 = (W2, 0), then (W2, 0), (W2, W1), (W2, W3)
    If R1 = (W2, W1), then (W2, W1), (W2, W3)
    If R1 = (W2, W3), then (W2, W3)
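The case analysis for R2 follows one rule: neither score may move backwards relative to R1. A sketch of that rule, assuming the per-score version order 0 < W2 for visitors and 0 < W1 < W3 for home:

```python
# Assumed version order for each score, oldest first.
VISITOR_ORDER = ["0", "W2"]
HOME_ORDER = ["0", "W1", "W3"]

# Every answer allowed by eventual consistency, as (V, H) pairs.
ALL_ANSWERS = [(v, h) for v in VISITOR_ORDER for h in HOME_ORDER]


def allowed_r2(r1):
    """Answers R2 may return under monotonic reads, given the earlier answer R1."""
    v1, h1 = r1
    return [
        (v, h)
        for (v, h) in ALL_ANSWERS
        if VISITOR_ORDER.index(v) >= VISITOR_ORDER.index(v1)
        and HOME_ORDER.index(h) >= HOME_ORDER.index(h1)
    ]
```

For example, `allowed_r2(("0", "W1"))` returns `[("0", "W1"), ("0", "W3"), ("W2", "W1"), ("W2", "W3")]`, matching the second case on the slide.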
[Diagram: W2 applied at S2, W1 at S4, W3 at S5.] Read my writes: which writes could be reflected in the answer to R1? (0, 0), (0, W1), (0, W3), (W2, 0), (W2, W1), (W2, W3)
[Same diagram.] Read my writes: which writes could be reflected in the answer to R2? (0, 0), (0, W1), (0, W3), (W2, 0), (W2, W1), (W2, W3)
[Diagram: W2 applied at S2, W1 at S4, W3 at S5.] Read my writes: which writes could be reflected in the answer to R1 when the read is issued by the writer? (W2, W3)
[Diagram: W2 applied at S2, W1 at S4, W3 at S5.] Bounded staleness: which writes could be reflected in the answer to R1? R1 must see all writes that occurred more than the bound earlier than R1, and could also see more recent writes.
[Same diagram.] Bounded staleness: which writes could be reflected in the answer to R2? R2 must see all writes that occurred more than the bound earlier than R2, and could also see more recent writes.
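Bounded staleness splits the prior writes into those a read must reflect and those it may reflect. A minimal sketch; the timestamps and the bound are hypothetical numbers, since the slides give no concrete times.

```python
def classify_writes(writes, read_time, bound):
    """Split (name, timestamp) writes into must-see and may-see lists for a read."""
    must_see, may_see = [], []
    for name, timestamp in writes:
        if timestamp < read_time - bound:
            must_see.append(name)  # older than the bound: guaranteed visible
        elif timestamp <= read_time:
            may_see.append(name)   # recent: visible only if it has propagated
    return must_see, may_see


# Hypothetical timestamps for W1, W2, W3.
writes = [("W1", 10), ("W2", 20), ("W3", 30)]
must_see, may_see = classify_writes(writes, read_time=35, bound=10)
```

Here `must_see` is `["W1", "W2"]` and `may_see` is `["W3"]`. Shrinking the bound moves the guarantee toward strong consistency, and growing it moves the guarantee toward eventual consistency.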
Example 3: radio reporter
    do {
        vScore = Read (“visitors”);
        hScore = Read (“home”);
        report vScore, hScore;
        sleep (30 minutes);
    }
Idea: periodically read the score and broadcast it to listeners.
Invariant: the reporter should only report scores that actually occurred, and the score should monotonically increase.
Do we need strong consistency? No, since listeners can accept slightly old scores.
Can we get away with eventual consistency (some subset of writes is visible)? No, eventual consistency can return scores that never occurred.
Example 3: radio reporter. Under eventual consistency, what possible scores could the radio reporter read after the final write completes? Writes: Write (“home”, 1); Write (“visitors”, 1); Write (“home”, 2); Write (“home”, 3); Write (“visitors”, 2); Write (“home”, 4); Write (“home”, 5); Final score: Visitors = 2, Home = 5. Possible reads: 0-0, 0-1, 0-2, 0-3, 0-4, 0-5, 1-0, … 2-4, 2-5
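The full set of scores can be enumerated by applying every subset of the seven writes in write order. This sketch assumes both scores start at 0, as in the game pseudo-code, and formats each result as visitors-home.

```python
from itertools import combinations

# The score keeper's writes from the slide, in order.
WRITES = [
    ("home", 1), ("visitors", 1), ("home", 2), ("home", 3),
    ("visitors", 2), ("home", 4), ("home", 5),
]


def eventual_scores():
    """Every score a read may return when it reflects any subset of the writes."""
    scores = set()
    for size in range(len(WRITES) + 1):
        # combinations() keeps the input order, so writes are applied in write order.
        for subset in combinations(WRITES, size):
            state = {"visitors": 0, "home": 0}
            for key, value in subset:
                state[key] = value
            scores.add(f"{state['visitors']}-{state['home']}")
    return sorted(scores)
```

The result has 18 entries: every pairing of a visitors' score in 0..2 with a home score in 0..5. Most of them, such as 1-0 or 0-5, never occurred during the game.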
[Diagram: servers S1-S3 replicate the visitors’ score (V) and servers S4-S6 replicate the home score (H). The score keeper issues W1, W2, and W3. Clients: reader, score keeper, radio reporter.]
[Diagram: replica contents are S1: V=0, S2: V=1, S3: V=0, S4: H=1, S5: H=2, S6: H=0, with W2 applied at S2, W1 at S4, and W3 at S5.] How could the reporter read a score of 1-0 (eventual consistency)?
[Same diagram.] The reporter’s reads (R) return V=1 from S2 and H=0 from S6, so it reports 1-0.
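The 1-0 read can be found by trying every pairing of a visitors' replica with a home replica. The replica contents are taken from the diagram; the game history assumes the write order W1 (home = 1), W2 (visitors = 1), W3 (home = 2).

```python
from itertools import product

# Replica contents from the diagram.
VISITOR_REPLICAS = {"S1": 0, "S2": 1, "S3": 0}
HOME_REPLICAS = {"S4": 1, "S5": 2, "S6": 0}

# Scores that really occurred, assuming the write order W1, W2, W3.
HISTORY = {(0, 0), (0, 1), (1, 1), (1, 2)}


def possible_reads():
    """Map each readable (visitors, home) score to the replica pairs that yield it."""
    results = {}
    pairs = product(VISITOR_REPLICAS.items(), HOME_REPLICAS.items())
    for (v_server, v), (h_server, h) in pairs:
        results.setdefault((v, h), []).append((v_server, h_server))
    return results


never_occurred = sorted(s for s in possible_reads() if s not in HISTORY)
```

`never_occurred` is `[(0, 2), (1, 0)]`, and the only replica pair that yields 1-0 is S2 with S6.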
Example 3: radio reporter
    do {
        vScore = Read (“visitors”);
        hScore = Read (“home”);
        report vScore, hScore;
        sleep (30 minutes);
    }
How about only consistent prefix (some initial sequence of writes is visible)? No. It would give us scores that occurred, but not necessarily monotonically increasing ones.
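A sketch of why consistent prefix alone can go backwards, and how pairing it with monotonic reads avoids that. `MonotonicPrefixReader` is a hypothetical client-side session, and each replica is modeled only by the length of the write prefix it has applied.

```python
# The score keeper's writes, in order.
WRITES = [
    ("home", 1), ("visitors", 1), ("home", 2), ("home", 3),
    ("visitors", 2), ("home", 4), ("home", 5),
]


def snapshot(prefix_len):
    """Score (visitors, home) after the first prefix_len writes."""
    state = {"visitors": 0, "home": 0}
    for key, value in WRITES[:prefix_len]:
        state[key] = value
    return state["visitors"], state["home"]


class MonotonicPrefixReader:
    """Hypothetical session that never accepts an older prefix than one already seen."""

    def __init__(self):
        self.min_prefix = 0

    def read(self, replica_prefix_len):
        # Assumption: a replica that is behind the session is brought forward first.
        prefix = max(replica_prefix_len, self.min_prefix)
        self.min_prefix = prefix
        return snapshot(prefix)


# Consistent prefix only: a fresher replica, then a staler one.
first, second = snapshot(5), snapshot(3)  # (2, 3) then (1, 2): the score went backwards

reporter = MonotonicPrefixReader()
reports = [reporter.read(5), reporter.read(3)]  # (2, 3) both times
```

Both `first` and `second` are scores that really occurred, yet the second is older than the first. The session version reports `(2, 3)` twice instead.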