Concurrency Control in Distributed Databases. By Rishikesh Mandvikar, rmandvik[at]engr.smu.edu, May 1, 2004.
Topics
• Serializability Theory
  • Centralized Databases
  • Distributed Databases
• Lock Based Concurrency Control Algorithms
  • Centralized (2PL, S2PL)
  • Distributed (C2PL, PC2PL, D2PL)
• Optimistic Concurrency Control
Serializability Theory Extended to Distributed Databases [14]
• Fragmentation
  • Horizontal
  • Vertical
  • Hybrid
• Replication
  • Synchronous Replication
    • ROWA (Read-One-Write-All) Protocol
    • Voting
  • Asynchronous Replication
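
The two synchronous replication schemes listed above can be compared as quorum rules. This is a minimal sketch, assuming one vote per replica and the usual quorum-voting conditions (read quorum + write quorum > number of replicas, and two write quorums always overlap). The function names are hypothetical and do not come from the slides.

```python
def rowa_quorums(n_replicas):
    """ROWA (Read-One-Write-All): read any one copy, write every copy."""
    return 1, n_replicas  # (read quorum, write quorum)


def is_valid_voting_quorum(n_replicas, read_quorum, write_quorum):
    """Check the usual quorum-voting conditions (assumed: one vote per replica)."""
    # Every read quorum must overlap every write quorum.
    reads_see_latest_write = read_quorum + write_quorum > n_replicas
    # Any two write quorums must overlap.
    writes_are_ordered = 2 * write_quorum > n_replicas
    return reads_see_latest_write and writes_are_ordered


read_q, write_q = rowa_quorums(5)
rowa_is_a_quorum_scheme = is_valid_voting_quorum(5, read_q, write_q)
majority_is_valid = is_valid_voting_quorum(5, 3, 3)
too_small_is_valid = is_valid_voting_quorum(5, 2, 3)
```

Seen this way, ROWA is the extreme case of voting with the cheapest reads and the most expensive writes.
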
Locking Based CC Algorithms
• Centralized
  • 2PL (a relaxed form of S2PL)
  • S2PL
• Distributed
  • C2PL (Centralized 2PL)
  • PC2PL (Primary Copy 2PL)
  • D2PL (Distributed 2PL)
2 Phase Locking (2PL) [13]
Rules:
• Growing phase: “A txn that has to read/write a data object first has to request a read/write lock on it.”
• Shrinking phase: “A txn can't request additional locks once it releases a lock.”
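
The two rules above can be sketched as a small state machine for a single txn. This is an illustrative sketch only: the class `TwoPhaseTxn` is hypothetical, and it checks only the two-phase rule, not lock conflicts between txns.

```python
class TwoPhaseTxn:
    """Tracks one txn's locks and enforces the two-phase rule."""

    def __init__(self, name):
        self.name = name
        self.locks = {}          # object id -> "read" or "write"
        self.shrinking = False   # becomes True after the first release

    def lock(self, obj, mode):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock {obj!r} after releasing a lock")
        # A write lock subsumes a read lock on the same object.
        if self.locks.get(obj) != "write":
            self.locks[obj] = mode

    def unlock(self, obj):
        del self.locks[obj]
        self.shrinking = True    # the growing phase is over


t = TwoPhaseTxn("T1")
t.lock("x", "read")
t.lock("y", "write")
t.unlock("x")            # shrinking phase starts here
try:
    t.lock("z", "read")  # violates 2PL
    violated = False
except RuntimeError:
    violated = True
```
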
Strict 2 Phase Locking (S2PL) [13]
Rules:
• Growing phase: “A txn that has to read/write a data object first has to request a read/write lock on it.”
• No shrinking phase: “Txn releases all locks only when it completes.”
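
The strict rule can be sketched with a shared lock table whose only release operation runs at commit or abort. This is a sketch under assumptions: `LockTable` and its methods are hypothetical helpers, and a conflicting request simply returns `False` instead of blocking.

```python
class LockTable:
    """Shared/exclusive lock table; a hypothetical helper, not from the slides."""

    def __init__(self):
        self.holders = {}  # object id -> {txn name: "read" or "write"}

    def acquire(self, txn, obj, mode):
        """Grant the lock and return True, or return False on a conflict."""
        others = {t: m for t, m in self.holders.get(obj, {}).items() if t != txn}
        if mode == "write" and others:
            return False                      # writers need exclusive access
        if mode == "read" and "write" in others.values():
            return False                      # cannot read under another txn's write lock
        held = self.holders.setdefault(obj, {})
        if held.get(txn) != "write":          # never downgrade a write lock
            held[txn] = mode
        return True

    def release_all(self, txn):
        """Called only at commit or abort: the strict part of S2PL."""
        for held in self.holders.values():
            held.pop(txn, None)


table = LockTable()
granted_t1 = table.acquire("T1", "x", "write")
blocked_t2 = table.acquire("T2", "x", "read")    # conflicts with T1's write lock
table.release_all("T1")                           # T1 commits
granted_t2 = table.acquire("T2", "x", "read")    # succeeds once T1 has completed
```

Because T2 cannot read `x` until T1 has completed, T2 never sees uncommitted data, which is why S2PL avoids cascading aborts.
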
2PL, S2PL: Differences
• 2PL
  • Cascading aborts are possible
  • Produces conflict-serializable schedules, but does not admit every conflict-serializable schedule
  • Higher concurrency
• S2PL
  • No cascading aborts
  • Serializable schedules
  • Lower concurrency
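Centralized 2PL [14]
• Cons
  • Failure of the primary (central) site stops all lock management
  • The central site becomes a bottleneck
  • Heavy dependence on the communication links to the central site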
Primary Copy 2PL [14]
• A lock on the primary copy of a data object is required
• Lock management happens only at the primary-copy sites
• Pros
  • Reduces load at the central site
• Cons
  • Deadlock handling is partially centralized
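
The load-reduction point can be illustrated by routing the same lock requests under both schemes. This is a toy sketch: the site names, the `PRIMARY_SITE` placement, and both routing functions are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical placement: which site holds the primary copy of each object.
PRIMARY_SITE = {"x": "site_A", "y": "site_B", "z": "site_A"}


def lock_site_centralized(obj, central_site="site_A"):
    """C2PL: every lock request goes to the single central site."""
    return central_site


def lock_site_primary_copy(obj):
    """PC2PL: the request goes to the site holding the object's primary copy."""
    return PRIMARY_SITE[obj]


requests = ["x", "y", "y", "z", "y"]
central_load = Counter(lock_site_centralized(o) for o in requests)
primary_load = Counter(lock_site_primary_copy(o) for o in requests)
```

With this placement, `site_A` handles all five requests under C2PL but only two under PC2PL.
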
Distributed 2PL [14]
• Pros
  • Each site manages its locks independently
• Cons
  • Complex deadlock handling is required
  • Higher communication cost
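
One reason deadlock handling gets complex is that a deadlock can span sites without being visible at any single site. The sketch below assumes each site keeps a local wait-for graph and that some detector can collect and merge them. The merging approach and all names are illustrative assumptions, not a scheme prescribed by the slides.

```python
def has_cycle(wait_for):
    """Depth-first search for a cycle in a wait-for graph {txn: set of txns it waits on}."""
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:
            return False
        if txn in visiting:
            return True          # back edge: txn waits on itself indirectly
        visiting.add(txn)
        found = any(visit(nxt) for nxt in wait_for.get(txn, ()))
        visiting.discard(txn)
        done.add(txn)
        return found

    return any(visit(txn) for txn in list(wait_for))


def merge_graphs(*local_graphs):
    """Union the local wait-for graphs into one global graph."""
    merged = {}
    for graph in local_graphs:
        for txn, waits_on in graph.items():
            merged.setdefault(txn, set()).update(waits_on)
    return merged


site_a = {"T1": {"T2"}}   # at site A, T1 waits for a lock held by T2
site_b = {"T2": {"T1"}}   # at site B, T2 waits for a lock held by T1

local_deadlock = has_cycle(site_a) or has_cycle(site_b)
global_deadlock = has_cycle(merge_graphs(site_a, site_b))
```

Neither local graph contains a cycle, but the merged graph does, so detection needs extra communication between sites.
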
Optimistic Concurrency Control [13][14]
• Txns are assumed to have no conflicts
• Each txn works in a private workspace
• Txns are validated before the write phase
Optimistic Concurrency Control [13][14]
• Txn phases:
  • Read and Compute
    • Read from the database and write into the private workspace
  • Validate
    • Timestamps are assigned here
    • Check for conflicts with concurrent txns
  • Write
    • Copy the workspace into the database if validation succeeds
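
The three phases above can be sketched as a txn object that buffers its writes privately. This is a minimal sketch, not a full implementation: `OptimisticTxn` is a hypothetical class, timestamps are not modeled, and the validation rule is passed in by the caller.

```python
class OptimisticTxn:
    """One txn working on a private copy; the validation rule is supplied by the caller."""

    def __init__(self, database):
        self.db = database
        self.workspace = {}      # private writes, invisible to other txns
        self.read_set = set()
        self.write_set = set()

    def read(self, key):
        self.read_set.add(key)
        # A txn sees its own uncommitted writes first.
        return self.workspace[key] if key in self.workspace else self.db[key]

    def write(self, key, value):
        self.write_set.add(key)
        self.workspace[key] = value

    def commit(self, validate):
        """Validate, then copy the workspace into the database."""
        if not validate(self):
            self.workspace.clear()   # restart: discard private changes
            return False
        self.db.update(self.workspace)
        return True


db = {"balance": 100}
txn = OptimisticTxn(db)
txn.write("balance", txn.read("balance") - 30)
before_commit = db["balance"]             # the write is still private here
committed = txn.commit(lambda t: True)    # placeholder validator that always passes
```
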
Optimistic Concurrency Control [13][14]
For Ti and Tj where TS(Ti) < TS(Tj)
• Validation Criteria (at least one must hold)
  • All phases of Ti complete before Tj begins
  • Ti completes before the write phase of Tj begins, and Ti doesn’t modify any data read by Tj
  • Ti finishes its read phase before Tj finishes its read phase, and Ti doesn’t modify any data read or written by Tj
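
The validation criteria can be written as a single predicate over two txn records. The sketch assumes the textbook formulation, in which the third condition requires that Ti writes nothing that Tj reads or writes. The `TxnRecord` fields and the use of integer logical times are hypothetical bookkeeping.

```python
from dataclasses import dataclass, field


@dataclass
class TxnRecord:
    # Logical times marking the phase boundaries (hypothetical bookkeeping).
    start: int
    read_end: int
    write_start: int
    finish: int
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def validates(ti, tj):
    """True if Tj can be serialized after Ti, given TS(Ti) < TS(Tj)."""
    # 1. Ti finishes all of its phases before Tj begins.
    if ti.finish < tj.start:
        return True
    # 2. Ti finishes before Tj's write phase starts, and Ti wrote nothing Tj read.
    if ti.finish < tj.write_start and not (ti.write_set & tj.read_set):
        return True
    # 3. Ti finishes reading before Tj does, and Ti wrote nothing Tj read or wrote.
    if ti.read_end < tj.read_end and not (ti.write_set & (tj.read_set | tj.write_set)):
        return True
    return False


ti = TxnRecord(1, 3, 4, 5, read_set={"x"}, write_set={"x"})
tj_later = TxnRecord(6, 8, 9, 10, read_set={"x"}, write_set={"x"})
tj_overlap = TxnRecord(2, 6, 7, 8, read_set={"y"}, write_set={"x"})
tj_conflict = TxnRecord(2, 6, 7, 8, read_set={"x"}, write_set={"y"})
```

Here `tj_later` passes by the first condition, `tj_overlap` passes by the second, and `tj_conflict` fails all three because it read `x` while Ti was still active and Ti then wrote `x`.
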
Optimistic Concurrency Control [13][14]
• Validation
  • Tj is validated against each committed txn Ti where TS(Ti) < TS(Tj)
  • Maintain lists of the objects read and written by Tj
  • No other txn can commit while Tj is being validated
  • Once validated, Tj's write phase is allowed to finish
  • This critical section creates a bottleneck
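
The "no other txn can commit during validation" rule can be sketched with a single lock around validation and the write phase. This is one possible concrete form (serial backward validation), chosen here as an assumption. `SerialValidator` and its parameters are hypothetical names.

```python
import threading


class SerialValidator:
    def __init__(self):
        self._lock = threading.Lock()   # only one txn validates and writes at a time
        self.committed_writes = []      # write sets of committed txns, in commit order

    def validate_and_commit(self, read_set, write_set, seen_commits, apply_writes):
        """seen_commits: how many commits had happened when the txn started reading."""
        with self._lock:
            # Only txns that committed after this one started can conflict with it.
            for other_writes in self.committed_writes[seen_commits:]:
                if other_writes & read_set:
                    return False        # stale read: the caller should restart the txn
            apply_writes()              # the write phase runs inside the critical section
            self.committed_writes.append(set(write_set))
            return True


validator = SerialValidator()
db = {}
ok_first = validator.validate_and_commit({"x"}, {"x"}, 0, lambda: db.update(x=1))
ok_stale = validator.validate_and_commit({"x"}, {"y"}, 0, lambda: db.update(y=2))
ok_fresh = validator.validate_and_commit({"x"}, {"y"}, 1, lambda: db.update(y=2))
```

Holding one lock across both steps is what makes this simple, and it is also the source of the bottleneck noted above.
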
Optimistic Concurrency Control [13][14]
• Advantages
  • Increased concurrency with a good “mix” of txns
  • Can perform better than lock-based systems when conflicts are rare
• Disadvantages
  • Validation is a bottleneck
  • A read/write list must be maintained for every txn
  • The private workspace must be copied to the database
  • Long txns
Optimistic Concurrency Control [13][14]
• Disadvantages
  • Long txns
    • The read/write lists become very long
    • The chance of a restart is proportional to the square of the txn's size [9]
Research
• Optimistic CC algorithm
• IBM’s IMS FASTPATH (Centralized DBMS)
• OCC in Distributed DBMS
Conclusion
• Serializability Theory
• Lock Based Systems
• Optimistic CC algorithms
• Timestamp Ordering