This review discusses centralized and distributed DBMS, data fragmentation, transaction processing, serializability theorem, locking protocols, and reliability in replicated databases. It covers concepts like local and mutual consistency, advantages of replication, risks, transaction correctness, replica control, and one-copy serializability. It compares mutual consistency and transaction consistency, and explores different replication control methods. The text delves into concepts like read-one/write-all replica control, primary site approach, majority approach, and quorum consensus. It also discusses pessimistic and optimistic replica control strategies, with an example of optimistic algorithm implementation.
Reading • Textbook: Ch.13 CSCE 824 - Spring 2011
Review • Centralized DBMS • Distributed DBMS • Data fragmentation and allocation • Top-down design • Bottom-up design • Transaction processing • Serializability theorem • Locking protocols • Reliability
Replicated Databases • Multiple copies of the same data items (databases) • Consistency: • Local consistency • Mutual consistency
Why Replication? • System availability • Performance • Scalability • Application requirements
Risk of Replication • Worse performance: updates must be applied to all replicas and synchronized • Worse availability: some algorithms require multiple replicas to be operational for any of them to be used
Transaction Correctness • 2-Phase Locking – serializability • 2-Phase Commit – reliability • Replica control – mutual consistency • Database design: local vs. global transactions • Database consistency: strong consistency vs. weak consistency • Location of updates: master vs. distributed • Update propagation: eager vs. lazy • Degree of transparency: limited vs. full
Mutual Consistency vs. Transaction Consistency • Transaction consistency: global serializability • Mutual consistency: replicas having the same values • Strong: all replicas have the same value at the end of the execution of an update transaction • Quorum: a quorum of replicas have the same value • Weak: eventually the values of all replicas become identical
Replica Control • Hides replication from transaction • Knows location of all replicas • Translates transaction’s request to access an item into request to access particular replica(s) • Maintains some form of mutual consistency
One-Copy Serializability (1SR) • Extension of the serializability theory • Effects of transactions on replicated data items should be the same as if they had been performed one at a time on a single set of data items
Example Replication • [Figure: a transaction and replicas x1, x2, x3] • Issues • May reduce performance (complex operations) • Too expensive • Can’t control when replicas are updated
Replica Control • Pessimistic replica control: at most one group can make an update, so mutual consistency holds at all times • Optimistic replica control: the system must be available at all times; violations of mutual consistency are detected and corrected afterwards
Read One / Write All Replica Control • Pessimistic approach • Read the nearest replica • Write all replicas • Synchronous case: before the transaction commits • Asynchronous case: eventually • Advantages: • Mutual consistency • Performance benefit for read transactions • Disadvantage: availability is not always guaranteed • E.g., primary site approach
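The read-one/write-all rule above can be sketched in a few lines. This is a minimal in-memory sketch of the synchronous case only, not an implementation from the course: the class `RowaItem`, the exception `ReplicaUnavailable`, and the `up` reachability flags are hypothetical names introduced for illustration.

```python
class ReplicaUnavailable(Exception):
    """Raised when an operation cannot reach the replicas it needs."""


class RowaItem:
    """One logical data item stored as several replicas (hypothetical model)."""

    def __init__(self, n_replicas, initial=0):
        self.values = [initial] * n_replicas  # one copy per site
        self.up = [True] * n_replicas         # assumed site reachability flags

    def read(self, preferred=0):
        # Read one: try the preferred ("nearest") site, then any other live site.
        order = [preferred] + [i for i in range(len(self.values)) if i != preferred]
        for i in order:
            if self.up[i]:
                return self.values[i]
        raise ReplicaUnavailable("no replica reachable")

    def write(self, value):
        # Write all, synchronous case: refuse unless every replica is reachable.
        if not all(self.up):
            raise ReplicaUnavailable("write-all needs every replica")
        self.values = [value] * len(self.values)
```

Reads succeed as long as one replica is reachable, while a single unreachable site blocks every write, which is the availability disadvantage listed on the slide.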
Primary Site – static • Primary site: most recent copy • What happens if the network is partitioned? • [Figure: sites DB0 to DB6, one marked Primary, split into partitions 1 and 2]
Majority Approach • The group that contains the majority of the sites, i.e., at least (N+1)/2 of the N sites, can process an update • [Figure: sites DB0 to DB6 split into partitions 1 and 2] Farkas CSCE 824 - Spring 2011
Majority Approach • Advantages: more flexible than primary site • Disadvantages: zero availability may still happen • Who has the most recent copy? • Version number: • Each site assigns a version number to the copy (initially VN=0) • After an update, the VN is incremented by 1
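The majority test and the version-number bookkeeping can be sketched as follows. This is a hedged illustration under simple assumptions: sites are plain identifiers, `version` is a dict from site to VN, and the function names `has_majority`, `latest_copy`, and `try_update` are hypothetical.

```python
def has_majority(group, n_sites):
    """True if the group holds a strict majority of all n_sites sites."""
    return 2 * len(group) > n_sites


def latest_copy(group, version):
    """Return a site in the group whose copy has the highest version number."""
    return max(group, key=lambda site: version[site])


def try_update(group, n_sites, version):
    """Apply one update inside the group if it has a majority.

    Every site in the group ends up with the newest VN plus one.
    Returns True if the update was processed, False otherwise.
    """
    if not has_majority(group, n_sites):
        return False
    new_vn = version[latest_copy(group, version)] + 1
    for site in group:
        version[site] = new_vn
    return True
```

Because any two majorities share at least one site, the highest VN inside a majority group always identifies the most recent copy.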
Quorum Consensus • Sites are not equal: each site is assigned a weight W • The majority approach is the special case in which all weights are equal • [Figure: sites DB0 to DB6 with weights; labels shown: W=5, W=3, W=2, W=1, W=1, W=1, W=15]
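A weighted vote check might look like the sketch below. It assumes the simplest quorum rule, that a group may update when it holds more than half of all votes; the slide does not state the exact rule, and the function name `has_quorum` and the site names and weights in the usage note are made up for illustration (they are not the assignment shown in the figure).

```python
def has_quorum(group, weight):
    """True if the group's votes exceed half of the total votes.

    weight: dict mapping each site to its (assumed) vote weight.
    """
    total = sum(weight.values())
    votes = sum(weight[site] for site in group)
    return 2 * votes > total
```

With weights such as `{"s0": 5, "s1": 3, "s2": 2, "s3": 1, "s4": 1, "s5": 1}`, the two heaviest sites alone form a quorum, while the four lightest sites together do not. Setting every weight to 1 reduces the check to the plain majority approach.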
Other Approaches • Dynamic Linear: order sites linearly to calculate majority • Token-based primary site (moving token): change the location of the primary site
Pessimistic Replica Control • Advantages: • Mutual consistency at all times • The latest version is known (the groups of any two consecutive updates have a site in common) • Disadvantage: • May result in zero availability
Optimistic Replica Control • Goal: availability at all times • Issue: consistency may not be guaranteed • Need an algorithm to detect whether an inconsistency occurred • Take actions to fix any inconsistencies
Example Optimistic Alg. • Two partitions P1, P2 • Assumption: separately, P1 and P2 produce serializable histories • Need: after P1 and P2 join again, detect which transactions violate global serializability
Example cont. • Items read by transaction T: read(T) • Items written by transaction T: write(T) • Assume: write(T) ⊆ read(T) • Transactions in P1: T1i, in P2: T2i
Example cont. • Precedence graph: G • Nodes: {T11, …, T1n, T21, …, T2m} • Edges: • Dependency edges (ripple effect): there is an edge Tij → Tik if j<k and there is a data item d s.t. d ∈ write(Tij) ∩ read(Tik) and there is no l s.t. j<l<k and d is in the write set of Til (to consider dirty reads within the same partition)
Example cont. • Precedence edges: there is an edge Tij → Tik if j<k and there is a data item d s.t. d ∈ read(Tij) ∩ write(Tik) and there is no l s.t. j<l<k and d is in the write set of Til (to consider the first transaction to write a data item after a read within the same partition)
Example cont. • Interference edges: there is an edge T1i → T2j if there is a data item d s.t. d ∈ read(T1i) ∩ write(T2j), or vice versa (to consider when T1i reads an item that T2j writes in the other partition)
Example cont. • Theorem: The combined histories are correct iff the precedence graph is acyclic • Correct inconsistencies: remove (undo) transactions that make the graph cyclic
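The graph construction and the acyclicity test from the example can be sketched as below. This is an illustrative sketch, not the course's implementation: each partition's history is assumed to be a list of `(name, read_set, write_set)` tuples in serialization order, all function names are hypothetical, and the sketch only detects a cycle; it does not choose which transactions to undo.

```python
from itertools import product


def local_edges(history):
    """Dependency and precedence edges inside one partition."""
    edges = set()
    for j, (tj, rj, wj) in enumerate(history):
        for k in range(j + 1, len(history)):
            tk, rk, wk = history[k]
            # Items written by any transaction strictly between Tj and Tk.
            between = set().union(*(w for _, _, w in history[j + 1:k]))
            # Dependency edge: Tk reads an item last written by Tj.
            if (wj & rk) - between:
                edges.add((tj, tk))
            # Precedence edge: Tk is the first to write an item Tj read.
            if (rj & wk) - between:
                edges.add((tj, tk))
    return edges


def interference_edges(h1, h2):
    """Edges between partitions: reader -> writer of the same item."""
    edges = set()
    for (ta, ra, wa), (tb, rb, wb) in product(h1, h2):
        if ra & wb:
            edges.add((ta, tb))
        if rb & wa:
            edges.add((tb, ta))
    return edges


def has_cycle(nodes, edges):
    """Depth-first search for a cycle in a directed graph."""
    succ = {n: [] for n in nodes}
    for a, b in edges:
        succ[a].append(b)
    state = {n: 0 for n in nodes}  # 0 = unvisited, 1 = on stack, 2 = done

    def visit(n):
        state[n] = 1
        for m in succ[n]:
            if state[m] == 1 or (state[m] == 0 and visit(m)):
                return True
        state[n] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in nodes)


def globally_serializable(h1, h2):
    """True iff the combined precedence graph is acyclic."""
    nodes = [t for t, _, _ in h1 + h2]
    edges = local_edges(h1) | local_edges(h2) | interference_edges(h1, h2)
    return not has_cycle(nodes, edges)
```

For instance, if each partition runs one transaction that reads and writes the same item x, the two interference edges form a cycle and the merged history is rejected; if the second partition's transaction reads x but writes only y, there is a single edge and the histories can be merged.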
Summary • Correctness: if the transactions are ACID, local execution is serializable, distributed transactions are reliable, and update replication is synchronous, then distributed transactions are globally atomic and serializable • Performance: • Applications: transactions are not always serializable (e.g., WS-transactions) • Replication: update propagation is not always synchronous • Compensating transactions
Next Class • Review distributed databases • Design • Concurrency control • Reliability • Replication