CS 603 Data Replication

CS 603 Data Replication
February 25, 2002

Data Replication: Why?
• Fault tolerance
  • Hot backup
  • Catastrophic failure
• Performance
  • Parallelism
  • Decreased reliance on network
This is a two-edged sword.

Data Replication: What?
• Correctness criterion: replication is invisible
  • Results indistinguishable from a one-copy database
  • One-copy serializability (1SR)
• Alternatives
  • Bounded inconsistency
  • User selection of real/copy
More discussion Friday.

Data Replication: How?
• Goal: ensure one-copy serializability
• Write-all solution: all copies identical
  • Write goes to every site
  • Read from any site
  • Standard single-copy concurrency control
• Guarantees 1SR
  • Single-copy concurrency control gives a serializable execution
  • Equivalent to a serial execution in which all copies of an item are written within one transaction
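The write-all rule above can be sketched in a few lines. This is a minimal illustration only: the `Site`, `SiteDown`, and `WriteAllManager` names are hypothetical, concurrency control and the commit protocol are left out, and a write to a failed site raises an exception where a real system would block while holding its locks.

```python
class SiteDown(Exception):
    """Raised where a real write-all system would block (assumption)."""


class Site:
    """One site holding a local copy of some data items."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.store = {}


class WriteAllManager:
    """Hypothetical transaction manager: write every copy, read any copy."""

    def __init__(self, directory):
        # directory maps a logical item to the sites that hold a copy of it
        self.directory = directory

    def write(self, item, value):
        copies = self.directory[item]
        down = [site.name for site in copies if not site.up]
        if down:
            # Write-all cannot finish until every copy is reachable.
            raise SiteDown(f"cannot write {item}: sites {down} are down")
        for site in copies:
            site.store[item] = value

    def read(self, item):
        # Any single available copy is enough, since all copies are identical.
        for site in self.directory[item]:
            if site.up:
                return site.store[item]
        raise SiteDown(f"no available copy of {item}")
```

Reads stay available as long as one copy is up, while a single failed site is enough to stop every write to the items it holds.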

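Write All Approach
[Figure: a Writer updates every copy and a Reader reads from a single copy; the copies hold the values 3 and 5 at different points in the diagram.]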

Problem: Site Failure
• Failure causes write to block
  • Must maintain locks
  • Clogs up entire system
Is this fault tolerance?
• What about “write all available”?
  • T0: w0[xA] w0[xB] w0[yC] c0
  • B fails
  • T1: r1[yC] w1[xA] c1
  • B recovers
  • T2: r2[xB] w2[yC] c2
• What is the serial equivalent order?

Model for Replicated Data
• Data and Transaction Managers at each site
• Data Manager: local concurrency control to guarantee local serializability
• Transaction Manager: distributed actions
  • Turns reads/writes into multi-site reads/writes
  • Runs commit protocol
  • Directory to get sites of each copy

Failure Assumptions
• Communications failure: Site A does not receive reads/writes on xA issued by B
• Site failure: Site A is unable to process reads/writes on xA issued by B
• Communications failure: Site A processes but does not acknowledge reads/writes on xA issued by B
• Fail-stop model, detectable by timeout

Types of Write
• Write(x): all copies of x will eventually be written
• Immediate write
  • Send write to all sites on request
  • Quick detection of conflict
• Delayed write
  • Delays non-local writes until commit
  • Minimizes message traffic
  • Abort is cheap
• Primary copy write
  • Quick detection of conflict
  • Lower message traffic than immediate write

Distributed Serializability
• A complete replicated data (RD) history $H$ over $T = \{T_0, \ldots, T_n\}$ is a partial order with ordering relation $<$ where
  • $H = h\left(\bigcup_{i=0}^{n} T_i\right)$ for some translation function $h$
  • for each $T_i$ and all operations $p_i, q_i$ in $T_i$, if $p_i <_i q_i$, then every operation in $h(p_i)$ is related by $<$ to every operation in $h(q_i)$
  • for every $r_j[x_A]$, there is at least one $w_i[x_A] < r_j[x_A]$
  • if $w_i[x] \in H$ and $r_j[x] \in H$, then $w_i[x] < r_j[x]$ or $r_j[x] < w_i[x]$
  • if $w_i[x] <_i r_i[x]$ and $h(r_i[x]) = r_i[x_A]$, then $w_i[x_A] \in h(w_i[x])$
• Theorem: If reads-from relationships same as serial history, RD history is 1-copy serializable
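The theorem suggests a brute-force check for small histories: compute the reads-from relationships of the replicated history, then look for a serial one-copy order that produces the same ones. The sketch below applies this to the “write all available” history from the Site Failure slide. The tuple encoding and all function names are assumptions made for illustration, and only reads-from relationships are compared, as in the theorem statement above.

```python
from itertools import permutations

# Copy-level operations: (transaction, 'r' or 'w', data item, copy site).
# This is the "write all available" history: T1 skips xB because B is down.
HISTORY = [
    (0, "w", "x", "A"), (0, "w", "x", "B"), (0, "w", "y", "C"),
    (1, "r", "y", "C"), (1, "w", "x", "A"),
    (2, "r", "x", "B"), (2, "w", "y", "C"),
]


def replicated_reads_from(history):
    """Set of (reader, item, writer) triples, tracked per physical copy."""
    last_writer = {}  # (item, copy) -> transaction that last wrote it
    result = set()
    for txn, kind, item, copy in history:
        if kind == "r":
            result.add((txn, item, last_writer.get((item, copy))))
        else:
            last_writer[(item, copy)] = txn
    return result


def logical_transactions(history):
    """Collapse copy operations to logical ones, keeping program order."""
    txns = {}
    for txn, kind, item, _copy in history:
        ops = txns.setdefault(txn, [])
        if (kind, item) not in ops:
            ops.append((kind, item))
    return txns


def serial_reads_from(order, txns):
    """Reads-from triples when the transactions run serially on one copy."""
    last_writer = {}  # item -> transaction that last wrote it
    result = set()
    for txn in order:
        for kind, item in txns[txn]:
            if kind == "r":
                result.add((txn, item, last_writer.get(item)))
            else:
                last_writer[item] = txn
    return result


def equivalent_serial_orders(history):
    """All serial one-copy orders with the same reads-from relationships."""
    txns = logical_transactions(history)
    target = replicated_reads_from(history)
    return [order for order in permutations(sorted(txns))
            if serial_reads_from(order, txns) == target]
```

For `HISTORY`, T1 reads y from T0 and T2 reads x from T0. In a one-copy serial order, whichever of T1 and T2 runs second would have to read from the other, so `equivalent_serial_orders(HISTORY)` comes back empty.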

Write All Available Fails
Even if no recovery!

Solutions
• Validate availability on commit
  • Check whether any site whose write failed is now available
  • Check that all sites read or written are still available
• Enforces serializability for site failures
Doesn’t work with communication failures!
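The two commit-time checks can be written as a single predicate. This is a sketch under assumptions: the function name, its parameters, and the `is_up` callback are hypothetical, and the availability test stands in for whatever failure detector (for example, a timeout) the system actually uses.

```python
def validate_at_commit(accessed_sites, missed_sites, is_up):
    """Return True if the transaction may commit.

    accessed_sites: sites whose copies the transaction read or wrote
    missed_sites:   sites whose copies it could not write because they were down
    is_up:          callable reporting whether a site is currently available
    """
    # Every copy that was read or written must still be available.
    access_ok = all(is_up(site) for site in accessed_sites)
    # Every copy that was skipped must still be unavailable; if one has come
    # back, it now holds a stale value that another transaction might read.
    missed_ok = not any(is_up(site) for site in missed_sites)
    return access_ok and missed_ok
```

A transaction that wrote the copies at A and C while B was down passes only if A and C are still up and B is still down when it commits.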

Communication Failures
• Available copies fails on network partition
  • Each side succeeds in validation
• Write all blocks
• Write n-k, read k+1
  • Generalization of the “write all” approach (k = 0 gives write all, read one)
  • Writes tolerate up to k failed sites; reads tolerate up to n-k-1
  • Tradeoff read vs. write performance
• Partition effect based on size of the smaller partition:
  • <k+1: small partition acts as if all its sites failed, large continues
  • Otherwise entire system becomes read-only
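The arithmetic behind “write n-k, read k+1” is easy to check directly. The helper names below are hypothetical; the sketch only counts sites and does not model versions or locking.

```python
def quorums(n, k):
    """Return (write quorum, read quorum) for n copies and parameter k."""
    return n - k, k + 1


def quorums_intersect(n, k):
    """A read always sees the latest write if the two quorums must overlap."""
    write_q, read_q = quorums(n, k)
    return write_q + read_q > n  # (n - k) + (k + 1) = n + 1


def capabilities(n, k, size):
    """What a group of `size` mutually reachable sites can still do."""
    write_q, read_q = quorums(n, k)
    return {"read": size >= read_q, "write": size >= write_q}


def partition_effect(n, k, small_size):
    """Capabilities of both sides when the network splits in two."""
    return {
        "small": capabilities(n, k, small_size),
        "large": capabilities(n, k, n - small_size),
    }
```

With n = 5 and k = 1 (write 4, read 2), a 1/4 split leaves the single site unable to do anything while the other four continue normally. A 2/3 split leaves neither side with four sites, so both can read but nobody can write.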

Other approaches: Don’t enforce Serializability!
• Master copy
  • Writes must update master copy
  • Reads can be consistent or inconsistent
• Bounded inconsistency
  • Time bound on update of copies
  • Value bound: write all if difference too great
• Dumps consistency on the application
  • Added complexity
  • Better performance
