This paper discusses concurrency control models, deadlock management, recovery mechanisms, and reliable commit protocols in distributed database systems. It also explores issues in DDBMS such as data planning, query optimization, and fault tolerance.
Concurrency Control and Reliable Commit Protocol in Distributed Database Systems Jian Jia Chen 2002/05/09 Real-time and Embedded System Lab., CSIE, National Taiwan University.
Outline • Distributed Database Management System (DDBMS) • Concurrency Control models (CC) • Deadlock Management in DDBMS • Recovery and Reliable Mechanisms in DDBMS. • Mobile Database
Distributed Database Management System (DDBMS) • A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network. • A distributed database management system is the software system that manages the distributed database and makes the distribution transparent to the users.
Architectural Models for Distributed Database Management Systems • Autonomy (A): control • 0: tight integration, 1: semiautonomous system, 2: isolation • Heterogeneity (H): • 0: homogeneous, 1: heterogeneous • Distribution (D): data arrangement • 0: no distribution, 1: client/server architecture, 2: peer-to-peer
Issues in DDBMS • Data Planning (NP-complete) • Query Optimization and Decomposition (NP-complete) • Distributed Transaction Management • Fault Tolerance and Reliability • Networking.
Transaction and Transaction Management • The ACID properties must still be guaranteed in a DDBMS. • Transaction structures: flat and nested. • Nested: Begin_transaction; Begin_transaction T1; Begin_transaction T2; T3(); ……; End_transaction T2; End_transaction T1; End_transaction • Flat: Begin_transaction; T1(); T2(); ……; End_transaction
Concurrency Control Algorithms • Pessimistic: two-phase locking protocol (mutual exclusion), timestamp ordering protocol, hybrid • Optimistic: locking based, timestamp ordering based
Locking and Timestamp Ordering • 2PL is simple and guarantees serializability, but locking may reduce the throughput of the system and may cause deadlock. • Timestamp Ordering (TO) protocols do not maintain serializability by mutual exclusion, so they cannot cause deadlock. • TO rule: Given two conflicting operations Oij and Okl belonging to Ti and Tk respectively, Oij is executed before Okl if and only if ts(Ti) < ts(Tk); that is, Ti is the older transaction and Tk the younger one.
Basic TO Algorithm • Each transaction Ti is assigned a globally unique timestamp ts(Ti). • Transaction managers attach the timestamp to all operations issued by the transaction. • Each data item x is assigned a read timestamp rts(x) and a write timestamp wts(x). • For Ri(x): if ts(Ti) < wts(x) then reject Ri(x), else accept Ri(x) and set rts(x) <- max(rts(x), ts(Ti)). • For Wi(x): if ts(Ti) < rts(x) or ts(Ti) < wts(x) then reject Wi(x), else accept Wi(x) and set wts(x) <- ts(Ti).
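The accept/reject rules of the basic TO algorithm can be sketched in a few lines of Python. This is a minimal single-site sketch, not a definitive implementation: the names `DataItem`, `Rejected`, `read`, and `write` are hypothetical, timestamps are plain integers, and the write rule is taken in its usual form (reject when the transaction is older than either the last reader or the last writer).

```python
from dataclasses import dataclass


@dataclass
class DataItem:
    """A data item x with its read and write timestamps (hypothetical type)."""
    value: object = None
    rts: int = 0  # largest timestamp of any accepted read
    wts: int = 0  # timestamp of the last accepted write


class Rejected(Exception):
    """The operation arrived too late; the issuing transaction must restart."""


def read(item: DataItem, ts: int):
    # A read is too late if a younger transaction has already written x.
    if ts < item.wts:
        raise Rejected(f"read ts={ts} < wts={item.wts}")
    item.rts = max(item.rts, ts)
    return item.value


def write(item: DataItem, ts: int, value) -> None:
    # A write is too late if a younger transaction has already read or written x.
    if ts < item.rts or ts < item.wts:
        raise Rejected(f"write ts={ts} is older than rts={item.rts} or wts={item.wts}")
    item.wts = ts
    item.value = value
```

For example, after `write(x, 5, "a")` and `read(x, 7)`, a later `write(x, 6, "b")` raises `Rejected` because a younger transaction (timestamp 7) has already read x.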
Basic TO Algorithm (2) cont. • The basic TO algorithm is simple and deadlock-free. The penalty of this mechanism is that a transaction may be restarted many times. • Take an example. • Assigning globally unique timestamps is not an easy problem either.
Conservative TO Algorithms • The previous example shows that the restart penalty may be serious; conservative TO algorithms attempt to reduce the number of restarts. • These algorithms delay each operation until there is an assurance that it will not be restarted.
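One common way to obtain that assurance is to buffer operations per transaction manager and release the oldest one only when every manager has something pending. The sketch below assumes this queue-based scheme, which the slides do not spell out; `ConservativeScheduler`, `submit`, and `next_operation` are hypothetical names, and each manager is assumed to submit its operations in increasing timestamp order.

```python
from collections import deque


class ConservativeScheduler:
    """Buffers operations per transaction manager (TM) and releases them in
    timestamp order only when no older operation can still arrive."""

    def __init__(self, tm_ids):
        self.queues = {tm: deque() for tm in tm_ids}

    def submit(self, tm, ts, op):
        # Assumption: each TM sends its operations in increasing timestamp order.
        self.queues[tm].append((ts, op))

    def next_operation(self):
        # If some TM has nothing pending, it might still send an older
        # operation, so executing anything now could force a restart later.
        if any(not q for q in self.queues.values()):
            return None
        oldest_tm = min(self.queues, key=lambda tm: self.queues[tm][0][0])
        return self.queues[oldest_tm].popleft()
```

The cost of this design is concurrency: a manager that stays silent stalls the scheduler, which is why practical variants need some way for idle managers to signal that they have nothing older to send.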
Deadlock Management • There are several ways to handle the deadlock problem: prevention, avoidance, detection, and resolution. • Deadlock prevention is not easy to achieve, since it requires the complete serialization graph. • A well-known deadlock avoidance approach, similar to that used in operating systems, is the pair of Wait-Die and Wound-Wait rules.
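The two timestamp-based avoidance rules (the second is commonly called wound-wait) differ only in who is aborted when a lock request conflicts. A minimal sketch, assuming a smaller timestamp means an older transaction; the function names and return strings are illustrative, not part of any library:

```python
def wait_die(requester_ts: int, holder_ts: int) -> str:
    """Non-preemptive rule: an older requester waits, a younger one dies."""
    return "wait" if requester_ts < holder_ts else "abort requester"


def wound_wait(requester_ts: int, holder_ts: int) -> str:
    """Preemptive rule: an older requester wounds (aborts) the holder,
    a younger requester waits."""
    return "abort holder" if requester_ts < holder_ts else "wait"
```

In both rules the younger transaction is always the one aborted, so waiting only ever goes in one direction of age and no wait-for cycle can form.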
Deadlock Detection Approach • Finding the minimum-cost set of transactions to abort in order to break the deadlock cycles is an NP-complete problem. • Local wait-for graph and global wait-for graph. We are concerned only with deadlocks that span sites. • Topologies for deadlock detection algorithms: • Centralized • Distributed • Hierarchical
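The relation between local and global wait-for graphs can be illustrated with a small centralized detector: each site reports its local graph, the detector takes the union, and a cycle in the union is a global deadlock. This is a sketch under simple assumptions (graphs are dictionaries mapping a transaction to the set of transactions it waits for; `merge_wait_for_graphs` and `find_cycle` are hypothetical helpers).

```python
def merge_wait_for_graphs(local_graphs):
    """Union the local wait-for graphs into one global wait-for graph."""
    global_graph = {}
    for graph in local_graphs:
        for txn, waits_for in graph.items():
            global_graph.setdefault(txn, set()).update(waits_for)
    return global_graph


def find_cycle(graph):
    """Return one cycle as a list of transactions, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            state = color.get(nxt, WHITE)
            if state == GREY:  # back edge: nxt is on the current path
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

With `site1 = {"T1": {"T2"}}` and `site2 = {"T2": {"T1"}}`, neither local graph contains a cycle, but the merged graph does, which is exactly the kind of deadlock that only global detection can see.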
Distributed Reliability Protocols • Commit protocols: • How is the commit command executed for distributed transactions? How are atomicity and durability ensured? • Termination protocols: • If a failure occurs, how do the remaining operational sites deal with it? • Recovery protocols: • If a failure occurs, how does the site where the failure occurred deal with it? • My focus is on the first issue.
Two-Phase Commit Protocol • I described this protocol briefly in my previous presentation. • Global Commit Rule: all or nothing. • Phase 1: The coordinator gets the participants ready to write the results to physical storage. • Phase 2: Everyone writes its results into the database. • Coordinator: • Participants:
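The two phases and the all-or-nothing rule can be sketched as follows. This is a failure-free, in-process sketch: there is no logging, no timeout handling, and no real messaging, and the class, state, and message names (`Participant`, `vote-commit`, `global-commit`, and so on) are assumptions chosen for illustration.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "INITIAL"

    def prepare(self):
        # Phase 1: vote. A yes vote moves the participant to READY,
        # where it must wait for the coordinator's decision.
        if self.can_commit:
            self.state = "READY"
            return "vote-commit"
        self.state = "ABORT"
        return "vote-abort"

    def decide(self, decision):
        # Phase 2: apply the coordinator's global decision.
        if self.state == "READY":
            self.state = "COMMIT" if decision == "global-commit" else "ABORT"


def two_phase_commit(participants):
    """Coordinator: commit only if every participant voted to commit."""
    votes = [p.prepare() for p in participants]
    all_yes = all(v == "vote-commit" for v in votes)
    decision = "global-commit" if all_yes else "global-abort"
    for p in participants:
        p.decide(decision)
    return decision
```

A single `vote-abort` is enough to abort everywhere: `two_phase_commit([Participant("A"), Participant("B", can_commit=False)])` returns `"global-abort"` and leaves both participants in `ABORT`.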
Site Failures and Recovery • The goal is to develop non-blocking termination protocols and independent recovery protocols. • A proof shows that such protocols exist when a single site fails. • However, it is not possible to design independent recovery protocols when multiple sites fail.
Problems with 2PC • Blocking • Ready implies that the participant waits for the coordinator. • If the coordinator fails, the site is blocked until the coordinator recovers. • Blocking reduces availability. • Independent recovery is not possible. • The 3PC protocol was proposed to solve the blocking problem. 3PC is non-blocking (non-realistic, but it reduces blocking).
Three Phase Commit Protocol • A proof gives necessary and sufficient conditions for designing non-blocking atomic commitment protocols: • No state is adjacent to both a commit and an abort state (2PC violates this). • No non-committable state is adjacent to a commit state. (Abort is not adjacent to Commit.)
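The two conditions can be checked mechanically against a protocol's state diagram. The sketch below simplifies "adjacent" to "reachable in one transition" within a single site's diagram, and the state names (`READY`, `PRECOMMIT`) and the choice of committable states are assumptions based on the usual textbook presentation of 2PC and 3PC.

```python
def nonblocking_violations(transitions, committable):
    """List the states that break either non-blocking condition.

    transitions: dict mapping a state to the set of its successor states.
    committable: set of states in which every site is known to have voted yes.
    """
    violations = []
    for state in sorted(transitions):
        successors = transitions[state]
        if "COMMIT" in successors and "ABORT" in successors:
            violations.append((state, "adjacent to both COMMIT and ABORT"))
        if "COMMIT" in successors and state not in committable:
            violations.append((state, "non-committable but adjacent to COMMIT"))
    return violations


# Participant state diagrams (assumed, simplified).
TWO_PC = {
    "INITIAL": {"READY", "ABORT"},
    "READY": {"COMMIT", "ABORT"},
}
THREE_PC = {
    "INITIAL": {"READY", "ABORT"},
    "READY": {"PRECOMMIT", "ABORT"},
    "PRECOMMIT": {"COMMIT"},
}
```

Calling `nonblocking_violations(TWO_PC, {"COMMIT"})` flags `READY` on both conditions, while `nonblocking_violations(THREE_PC, {"PRECOMMIT", "COMMIT"})` returns an empty list: the extra `PRECOMMIT` state is what separates `READY` from `COMMIT`.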
Mobile Databases • A mobile database is an extension of a distributed database system. • A mobile database may contain databases connected by wire-line networks and databases built on mobile stations. • The characteristics: • The wireless network has restricted bandwidth. • The power supplies in mobile stations have limited lifetimes. • Because of the power restrictions, mobile stations are not always available. • The mobile stations move at different speeds and through different areas.
Mobile Databases cont.(2) • Based on the previous characteristics, the CC problem in mobile databases is harder than that in distributed databases. • Stations may be disconnected for long periods, so locking protocols and timestamp ordering protocols are not suitable. 2PC is not suitable either, since availability is reduced. • Different transaction models have been proposed for the mobile database environment, e.g. relaxation of the ACID properties and relaxation of serializability.
Mobile Databases cont.(3) • It is much harder to design real-time mobile database systems. • Because of the unpredictability of the environment, it is hard for hard real-time transactions to meet their deadlines. Most papers discuss firm and soft real-time transactions in this environment.
Happy Birthday to Me :^^ • It’s my birthday ^)^