360 likes | 380 Views
This article discusses reading uncommitted data, resolving deadlocks, and replication issues in advanced transaction topics in distributed databases.
E N D
ADVANCEDTRANSACTIONTOPICS Spring 2015
Warning This is a first draft I welcome your corrections
Reading uncommitted data • Can create cascading rollbacks • solutions • strict locking: T will not release any write lock (or increment lock) until the transaction has aborted or committed and the commit/abort record has been flushed to disk • recoverable schedule • Must still fix data in buffers
Deadlock detection by timeouts • Simplest method • A transaction is assumed to be in a deadlock with other transactions if it fails to make progress over a certain time interval • Must then decide which transaction to abort
Deadlock detection by wait-for graphs (I) • We represent transactions by the summits of a graph that has an arc from summit T to summit U iff transactionT is waiting for a lock that is currently held by transaction U T U
Deadlock detection by wait-for graphs (II) • A sufficient and necessary condition for a deadlock to exist is the existnce of a cycle in the wait for graph T U V W U, V and W are involved in a deadlock
Limitations • Maintaining the wait-for graph is time-consuming • Especially for distributed DB • Can use cheaper heuristics • Prevent all deadlocks • Will abort from time to time transactions that do not need to be aborted
Deadlock elimination heuristics • Attach to each transaction an immutable time stamp • Does not change when a transaction is rolled back • Two possible schemes • Wait-Die • Wound-Wait
Wait-die scheme • When transaction T has to wait for a lock that is held by a transaction u • If T has a smaller timestampthan U • T is older than U • T is allowed to wait for the lock held by U • If U has a smaller timestampthan T • U is older than T • T is aborted and restarted Deadlocks are avoided because no transaction will ever wait for a lock held by an older transaction
Wound-wait scheme • When transaction T has to wait for a lock that is held by a transaction u • If T has a smaller timestampthan U • T "wounds" U: it forces it to roll back unless U is very close to releasing its locks • If U has a smaller timestampthan T • T is allowed to wait for the lock held by U Deadlocks are avoided because no transaction will ever wait for a lock held by an newer transaction
The idea • Storing relations or fragment of relations • on different servers • at different sites with the possibility of replacing some relations • Still maintaining central control • Not the same as federated databases
Motivations (I) • Keeping data closer to their users • An industrial concern that has several plants might want information about its personnel to be kept at the plant where each employee works • EMPLOYEE relation will be horizontally partitioned
Motivations (II) • Splitting data according to their primary usages • May want to split vertically EMPLOYEE relation by keeping some of its attributes in one table and moving the others to a different table
Motivations (III) • Replicating data at different sites in order to: • Increase DB availability • Reduce risk of data loss • Speed up read-only queries • But not update queries
Replication issues • Maintaining the replicated data in a "consistent" state • That is, keeping replicas identical • Selecting the proper level of replication and the locations of the replicas • Tradeoffs between faster read access and slower updates • Storage costs • Handling network partitions
Availability and Reliability • Availability • Fraction of time the DBMS is operational • Often expressed in nines • 99.9 percent is 3 nines • 99.999 percent is five nines • Reliability • Function R(t) representing the probability the system will not fail over the time interval [0, t]
Tradeoffs • Must distinguish between read availability and write availability • Will make tradeoff between • Availability and data replication cost • Higher storage cost • Higher update costs • Read availability and write availability
Example CAN SKIPTHE MATH • Let A be the availability of a single server • To increase the read availability of a DB we can replicate it on two servers • Use write all/read any • Writes must update both replicas • Can read from either of them • Read availability = A×A + 2A(1-A) = 2A – A2 • Write availability = A×A= A2 • A2 < A !!!
Example CAN SKIPTHE MATH • Can also replicate the DB on three servers and use write all/read any • Read availability =A3 + 3A2(1 – A) + 3A(1 – A)2 • Write availability = A3 • A3 < A !!!
Example CAN SKIPTHE MATH • With three servers, we can use majority consensus voting • Two servers out of 3 • Required for all accesses • Read availability = write availability =A3 + 3A2(1 – A) = 3A2 – 2A3 • Majority consensus voting improves both read and write availabilities
Updating distributed DBs • Updates can now involve several local updates at the different sites where the DB data are stored • Challenge is to handle partial failures that could result in partial updates that would leave the DB in an inconsistent state • Solution is two-phase commit protocol
Two-phase commit protocol • Has nothing to do with two-phase locking. • Applies of distributed entities consisting of • A leader L that initiates the updates • Also called the coordinator • Followers F, which participate in the update and its validation • Lead and followers each have their own log
First phase • Leader • Writes on its log <PREPARE T> • Sends to all followers a PREPARE T message • Followers that are ready to finalize the transaction • Write on their logs <READY T> • Send to leader a READY T message • Other followers • Write on their logs <DON'T COMMIT T> • Send to leader a DON'T COMMIT T message
Second phase • If leader has received READY T messages from all sites involved in the transaction • Writes to its log < COMMIT T> • Sends to all followers a COMMIT T message • Followers commit T and write on their logs <COMMIT T> • Otherwise leader • Writes to its log < ABORT T> • Sends to all followers an ABORT T message • Followers abort T and write on their logs <ABORT T>
Recovering from a crash • Sites that find on their log • <COMMIT T> can assume that the T is committed and replay it if needed • <ABORT T> or <DON'T COMMIT T> should abort T • <READY T>must check with all other sites in order to know the status of T • no mention of T can safely abort it
Recovering from a leader failure • Followers should elect a new leader • Use arbitrary rule, such as site with lowest IP address • The rule is arbitrary but must result in the election of a singleleader • Compare with succession rule inWestern monarchies
Recovering from a leader failure • New leader will poll its followers (and itself) about the status of transactions and will • Commit all transactions that are committed at some sites • Abort all transactions that are aborted or in <DON'T COMMIT T> state at some sites • Commit all transactions that in <READY T> state at all sites and abort them otherwise
The three approaches (I) • Using centralized locking • Lock server • Use a primary copy approach • All updates are performed first on a single primary copies then distributed to secondary copies • Secondary copies are read-only and can be only used for operations that do not modify the data. • Locks only apply to primary copy
The three approaches (II) • Implementing global locking with local locks • Useful for managing replicated data • Each replica has its own locks • In write all/read any • Global read lock requires acquiring a single local lock for the replicated entity • Global write lock requires acquiring all local locks for the replicated entity • In majority consensus voting • Global lock requires acquiring a majority of the local lock for the replicated entity
Long-running transactions • Pessimistic concurrency control require transactions to maintain locks for most of their duration • Two-phase locking • Not possible for long running transactions • Especially those involving human interactions • Approving travel plans and expenses
The best solution • Decompose long-running transactions into smaller transactions that • Correspond to steps of long-running transactions • Can individually commit as soon as step is completed • Rely on compensating transactions—anti-transactions—to undo them if the long-running transaction must be aborted