Replicated Databases

mkuhn

Presentation Transcript


  1. Replicated Databases

  2. Reading • Textbook: Ch.13 CSCE 824 - Spring 2011

  3. Review • Centralized DBMS • Distributed DBMS • Data fragmentation and allocation • Top-down design • Bottom-up design • Transaction processing • Serializability theorem • Locking protocols • Reliability

  4. Replicated Databases • Multiple copies of the same data items (databases) • Consistency: • Local consistency • Mutual consistency

  5. Why Replication? • System availability • Performance • Scalability • Application requirements

  6. Risks of Replication • Worse performance: updates must be applied to all replicas and synchronized • Worse availability: some algorithms require multiple replicas to be operational before any of them can be used

  7. Transaction Correctness • 2-Phase Locking – serializability • 2-Phase Commit – reliability • Replica control – mutual consistency • Database design: local vs. global transactions • Database consistency: strong consistency vs. weak consistency • Location of updates: master vs. distributed • Update propagation: eager vs. lazy • Degree of transparency: limited vs. full

  8. Mutual Consistency vs. Transaction Consistency • Transaction consistency: global serializability • Mutual consistency: replicas have the same values • Strong: all replicas have the same value at the end of the execution of an update transaction • Quorum: a quorum of replicas have the same value • Weak: the values of all replicas eventually become identical
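
The three degrees of mutual consistency can be illustrated with a small check over a snapshot of replica values taken right after an update commits. This is only a sketch: the function name `mutual_consistency`, the `quorum_size` parameter, and the label `"not_converged"` are assumptions made for illustration and do not come from the slides. Weak consistency cannot be confirmed from a single snapshot, since it only promises eventual agreement.

```python
from collections import Counter

def mutual_consistency(values, quorum_size):
    """Classify one snapshot of replica values (one value per replica).

    quorum_size is a hypothetical threshold chosen by the system designer.
    """
    # Size of the largest group of replicas that agree on a value.
    agreeing = Counter(values).most_common(1)[0][1]
    if agreeing == len(values):
        return "strong"          # every replica holds the same value
    if agreeing >= quorum_size:
        return "quorum"          # at least a quorum of replicas agree
    return "not_converged"       # weak consistency may still converge later

print(mutual_consistency([5, 5, 5], quorum_size=2))
print(mutual_consistency([5, 5, 4], quorum_size=2))
```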

  9. Replica Control • Hides replication from transactions • Knows the location of all replicas • Translates a transaction’s request to access an item into a request to access particular replica(s) • Maintains some form of mutual consistency

  10. One-Copy Serializability (1SR) • Extension of the serializability theory • Effects of transactions on replicated data items should be the same as if they had been performed one at a time on a single set of data items

  11. Example Replication • [Figure: a transaction accessing replicas x1, x2, x3] • Issues • May reduce performance (complex operations) • Too expensive • Can’t control when replicas are updated

  12. Replica Control • Pessimistic replica control: at most one group of sites can make an update, so mutual consistency holds at all times • Optimistic replica control: the system must be available at all times; any violation of mutual consistency is detected and corrected afterwards

  13. Read One / Write All Replica Control • Pessimistic approach • Read the nearest replica • Write all replicas • Synchronous case: before the transaction commits • Asynchronous case: eventually • Advantages: • Mutual consistency • Performance benefits for read transactions • Disadvantage: availability is not always guaranteed • E.g., primary site approach
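
A minimal sketch of the synchronous read-one/write-all idea follows. The class names `Replica` and `RowaController`, the `up` flag, and the convention that replicas are listed nearest first are all assumptions made for illustration; a real system would also need locking and an atomic commit protocol, which are omitted here.

```python
class Replica:
    """Hypothetical in-memory replica; 'up' models site availability."""
    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

class RowaController:
    """Sketch of a replica controller for synchronous read-one/write-all."""
    def __init__(self, replicas):
        self.replicas = replicas  # assumed to be ordered nearest first

    def read(self, key):
        # Read one: the nearest available replica is enough.
        for replica in self.replicas:
            if replica.up:
                return replica.data.get(key)
        raise RuntimeError("no replica available")

    def write(self, key, value):
        # Write all: refuse the update unless every replica can apply it.
        if not all(replica.up for replica in self.replicas):
            raise RuntimeError("write rejected: a replica is unavailable")
        for replica in self.replicas:
            replica.data[key] = value

sites = [Replica("DB0"), Replica("DB1"), Replica("DB2")]
controller = RowaController(sites)
controller.write("x", 1)
sites[0].up = False            # the nearest site fails
print(controller.read("x"))    # reads still succeed from another replica
```

The sketch shows the trade-off named on the slide: reads stay cheap and available, while a single unavailable replica blocks every write.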

  14. Primary Site – static • Primary site: most recent copy • What happens if the network is partitioned? • [Figure: sites DB0 to DB6, one marked Primary, split into partitions 1 and 2]

  15. Majority Approach • The group that contains the majority of the sites can process an update • [Figure: sites DB0 to DB6 in a single group labeled 1]

  16. Majority Approach • The group that contains the majority of the sites can process an update • [Figure: sites DB0 to DB6 split into partitions 1 and 2; majority threshold labeled (N+1)/2] Farkas, CSCE 824 - Spring 2011

  17. Majority Approach • Advantage: more flexible than the primary site approach • Disadvantage: zero availability may still happen (when no partition contains a majority) • Who has the most recent copy? • Version number: • Each site assigns a version number (VN) to its copy (initially VN=0) • After an update, the VN is incremented by 1
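
The majority rule and the version-number bookkeeping can be sketched together as below. The function names, the `copies` dictionary of `(version_number, value)` pairs, and the site names are assumptions made for illustration. Because any two majorities share at least one site, the highest VN inside a majority partition identifies the most recent copy.

```python
def has_majority(partition_size, total_sites):
    """True if the partition holds strictly more than half of all sites."""
    return partition_size > total_sites // 2

def latest_copy(copies):
    """Return (site, (vn, value)) for the copy with the highest version number."""
    site = max(copies, key=lambda s: copies[s][0])
    return site, copies[site]

def update(partition, copies, total_sites, new_value):
    """Apply an update inside one partition; returns False if it lacks a majority."""
    if not has_majority(len(partition), total_sites):
        return False
    # Find the most recent version visible inside this partition.
    _, (vn, _) = latest_copy({s: copies[s] for s in partition})
    for s in partition:
        copies[s] = (vn + 1, new_value)  # increment VN by 1 on every update
    return True

copies = {f"DB{i}": (0, "a") for i in range(7)}   # initially VN=0 everywhere
print(update(["DB0", "DB1", "DB2", "DB3"], copies, 7, "b"))  # 4 of 7 sites
print(update(["DB4", "DB5", "DB6"], copies, 7, "c"))         # 3 of 7 sites
```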

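  18. Quorum Consensus • Sites are not equal: each site is assigned a weight W • Generalizes the majority approach (the majority approach is the special case in which all weights are equal) • [Figure: sites DB0 to DB6 with weights; labels shown: W=5, W=3, W=2, W=1, W=1, W=1, W=15]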
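
A weighted version of the majority test can be sketched as follows. The site names and weights in the example are hypothetical and are not taken from the slide's figure, whose site-to-weight assignment is not recoverable from the transcript; the rule assumed here is that a partition may update only if it holds more than half of the total weight.

```python
def can_update(partition, weights):
    """True if the partition holds more than half of the total weight.

    weights: dict mapping site name -> voting weight (hypothetical values).
    """
    total = sum(weights.values())
    held = sum(weights[site] for site in partition)
    return 2 * held > total

weights = {"A": 5, "B": 3, "C": 2, "D": 1}   # illustrative weights, total 11
print(can_update({"A", "D"}, weights))        # holds 6 of 11
print(can_update({"B", "C"}, weights))        # holds 5 of 11
```

With all weights set to 1 the same test reduces to the plain majority rule.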

  19. Other Approaches • Dynamic Linear: order sites linearly to calculate majority • Token-based primary site (moving token): change the location of the primary site

  20. Pessimistic Replica Control • Advantages: • Mutual consistency at all times • The latest version is known (the groups performing any two consecutive updates have at least one site in common) • Disadvantage: • May result in zero availability

  21. Optimistic Replica Control • Goal: availability at all times • Issue: consistency may not be guaranteed • Need an algorithm to detect whether an inconsistency occurred • Take actions to fix any inconsistencies

  22. Example Optimistic Alg. • Two partitions P1, P2 • Assumption: separately, P1 and P2 each produce serializable histories • Need: after P1 and P2 rejoin, detect which transactions violate global serializability

  23. Example cont. • Items read by transaction T: read(T) • Items written by transaction T: write(T) • Assume: write(T) ⊆ read(T) • Transactions in P1: T1i, in P2: T2i

  24. Example cont. • Precedence graph: G • Nodes: {T11, …, T1n, T21, …, T2m} • Edges: • Dependency edges (ripple effect): there is an edge Tij → Tik if j<k and there is a data item d such that d ∈ write(Tij) ∩ read(Tik) and there is no l with j<l<k and d ∈ write(Til) (Tik reads the value that Tij wrote within the same partition)

  25. Example cont. • Precedence edges: there is an edge Tij → Tik if j<k and there is a data item d such that d ∈ read(Tij) ∩ write(Tik) and there is no l with j<l<k and d ∈ write(Til) (Tik is the first transaction to write the data item after Tij read it within the same partition)

  26. Example cont. • Interference edges: there is an edge T1i → T2j if there is a data item d such that d ∈ read(T1i) ∩ write(T2j), or vice versa (T1i reads an item that T2j writes in the other partition)

  27. Example cont. • Theorem: the combined histories are correct iff the precedence graph is acyclic • Correcting inconsistencies: remove (undo) transactions that make the graph cyclic
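
The graph construction and the acyclicity test from the example can be sketched as below. This is an illustrative reading of the three edge rules, not the course's reference implementation: the function names, the `(name, read_set, write_set)` tuple layout, and the use of depth-first search for cycle detection are assumptions. Each partition's history is given in its serialization order.

```python
from itertools import product

def build_graph(p1, p2):
    """p1, p2: lists of (name, read_set, write_set), one list per partition."""
    edges = set()
    for part in (p1, p2):
        for j, (tj, rj, wj) in enumerate(part):
            for k in range(j + 1, len(part)):
                tk, rk, wk = part[k]
                # Items written by some Til with j < l < k.
                between = set().union(*(w for _, _, w in part[j + 1:k]))
                if (wj & rk) - between:   # dependency (ripple) edge
                    edges.add((tj, tk))
                if (rj & wk) - between:   # precedence edge
                    edges.add((tj, tk))
    for (a, ra, wa), (b, rb, wb) in product(p1, p2):
        if ra & wb:                       # interference edge T1i -> T2j
            edges.add((a, b))
        if rb & wa:                       # interference edge T2j -> T1i
            edges.add((b, a))
    return edges

def has_cycle(edges):
    """Depth-first search; a back edge to an in-progress node means a cycle."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])
    state = {}  # node -> 1 while on the DFS stack, 2 when finished
    def visit(node):
        state[node] = 1
        for nxt in graph[node]:
            if state.get(nxt) == 1 or (nxt not in state and visit(nxt)):
                return True
        state[node] = 2
        return False
    return any(node not in state and visit(node) for node in graph)

# Both partitions read and write x independently: the histories conflict.
p1 = [("T11", {"x"}, {"x"})]
p2 = [("T21", {"x"}, {"x"})]
print(has_cycle(build_graph(p1, p2)))
```

When a cycle is found, the slide's remedy is to undo transactions until the graph becomes acyclic; choosing which transactions to undo is left out of this sketch.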

  28. Summary • Correctness: if the transactions are ACID, local execution is serializable, distributed transactions are reliable, and update replication is synchronous, then distributed transactions are globally atomic and serializable • Performance: • Applications: transactions are not always serializable (e.g., WS-transactions) • Replication: update propagation is not always synchronous • Compensating transactions

  29. Next Class • Review distributed databases • Design • Concurrency control • Reliability • Replication
