ADVANCED TRANSACTION TOPICS

ADVANCEDTRANSACTIONTOPICS Spring 2015

Warning This is a first draft I welcome your corrections

Transactions that read uncommitted data

Reading uncommitted data • Can create cascading rollbacks • solutions • strict locking: T will not release any write lock (or increment lock) until the transaction has aborted or committed and the commit/abort record has been flushed to disk • recoverable schedule • Must still fix data in buffers

Resolving deadlocks

Deadlock detection by timeouts • Simplest method • A transaction is assumed to be in a deadlock with other transactions if it fails to make progress over a certain time interval • Must then decide which transaction to abort

Deadlock detection by wait-for graphs (I) • We represent transactions by the summits of a graph that has an arc from summit T to summit U iff transactionT is waiting for a lock that is currently held by transaction U T U

Deadlock detection by wait-for graphs (II) • A sufficient and necessary condition for a deadlock to exist is the existnce of a cycle in the wait for graph T U V W U, V and W are involved in a deadlock

Limitations • Maintaining the wait-for graph is time-consuming • Especially for distributed DB • Can use cheaper heuristics • Prevent all deadlocks • Will abort from time to time transactions that do not need to be aborted

Deadlock elimination heuristics • Attach to each transaction an immutable time stamp • Does not change when a transaction is rolled back • Two possible schemes • Wait-Die • Wound-Wait

Wait-die scheme • When transaction T has to wait for a lock that is held by a transaction u • If T has a smaller timestampthan U • T is older than U • T is allowed to wait for the lock held by U • If U has a smaller timestampthan T • U is older than T • T is aborted and restarted Deadlocks are avoided because no transaction will ever wait for a lock held by an older transaction

Wound-wait scheme • When transaction T has to wait for a lock that is held by a transaction u • If T has a smaller timestampthan U • T "wounds" U: it forces it to roll back unless U is very close to releasing its locks • If U has a smaller timestampthan T • T is allowed to wait for the lock held by U Deadlocks are avoided because no transaction will ever wait for a lock held by an newer transaction

Distributed databases

The idea • Storing relations or fragment of relations • on different servers • at different sites with the possibility of replacing some relations • Still maintaining central control • Not the same as federated databases

Motivations (I) • Keeping data closer to their users • An industrial concern that has several plants might want information about its personnel to be kept at the plant where each employee works • EMPLOYEE relation will be horizontally partitioned

Motivations (II) • Splitting data according to their primary usages • May want to split vertically EMPLOYEE relation by keeping some of its attributes in one table and moving the others to a different table

Motivations (III) • Replicating data at different sites in order to: • Increase DB availability • Reduce risk of data loss • Speed up read-only queries • But not update queries

Replication issues • Maintaining the replicated data in a "consistent" state • That is, keeping replicas identical • Selecting the proper level of replication and the locations of the replicas • Tradeoffs between faster read access and slower updates • Storage costs • Handling network partitions

Availability and Reliability • Availability • Fraction of time the DBMS is operational • Often expressed in nines • 99.9 percent is 3 nines • 99.999 percent is five nines • Reliability • Function R(t) representing the probability the system will not fail over the time interval [0, t]

Tradeoffs • Must distinguish between read availability and write availability • Will make tradeoff between • Availability and data replication cost • Higher storage cost • Higher update costs • Read availability and write availability

Example CAN SKIPTHE MATH • Let A be the availability of a single server • To increase the read availability of a DB we can replicate it on two servers • Use write all/read any • Writes must update both replicas • Can read from either of them • Read availability = A×A + 2A(1-A) = 2A – A2 • Write availability = A×A= A2 • A2 < A !!!

Example CAN SKIPTHE MATH • Can also replicate the DB on three servers and use write all/read any • Read availability =A3 + 3A2(1 – A) + 3A(1 – A)2 • Write availability = A3 • A3 < A !!!

Example CAN SKIPTHE MATH • With three servers, we can use majority consensus voting • Two servers out of 3 • Required for all accesses • Read availability = write availability =A3 + 3A2(1 – A) = 3A2 – 2A3 • Majority consensus voting improves both read and write availabilities

Updating distributed DBs • Updates can now involve several local updates at the different sites where the DB data are stored • Challenge is to handle partial failures that could result in partial updates that would leave the DB in an inconsistent state • Solution is two-phase commit protocol

Two-phase commit protocol • Has nothing to do with two-phase locking. • Applies of distributed entities consisting of • A leader L that initiates the updates • Also called the coordinator • Followers F, which participate in the update and its validation • Lead and followers each have their own log

First phase • Leader • Writes on its log <PREPARE T> • Sends to all followers a PREPARE T message • Followers that are ready to finalize the transaction • Write on their logs <READY T> • Send to leader a READY T message • Other followers • Write on their logs <DON'T COMMIT T> • Send to leader a DON'T COMMIT T message

Second phase • If leader has received READY T messages from all sites involved in the transaction • Writes to its log < COMMIT T> • Sends to all followers a COMMIT T message • Followers commit T and write on their logs <COMMIT T> • Otherwise leader • Writes to its log < ABORT T> • Sends to all followers an ABORT T message • Followers abort T and write on their logs <ABORT T>

Recovering from a crash • Sites that find on their log • <COMMIT T> can assume that the T is committed and replay it if needed • <ABORT T> or <DON'T COMMIT T> should abort T • <READY T>must check with all other sites in order to know the status of T • no mention of T can safely abort it

Recovering from a leader failure • Followers should elect a new leader • Use arbitrary rule, such as site with lowest IP address • The rule is arbitrary but must result in the election of a singleleader • Compare with succession rule inWestern monarchies

Recovering from a leader failure • New leader will poll its followers (and itself) about the status of transactions and will • Commit all transactions that are committed at some sites • Abort all transactions that are aborted or in <DON'T COMMIT T> state at some sites • Commit all transactions that in <READY T> state at all sites and abort them otherwise

Distributed locking

The three approaches (I) • Using centralized locking • Lock server • Use a primary copy approach • All updates are performed first on a single primary copies then distributed to secondary copies • Secondary copies are read-only and can be only used for operations that do not modify the data. • Locks only apply to primary copy

The three approaches (II) • Implementing global locking with local locks • Useful for managing replicated data • Each replica has its own locks • In write all/read any • Global read lock requires acquiring a single local lock for the replicated entity • Global write lock requires acquiring all local locks for the replicated entity • In majority consensus voting • Global lock requires acquiring a majority of the local lock for the replicated entity

Long-running transactions

Long-running transactions • Pessimistic concurrency control require transactions to maintain locks for most of their duration • Two-phase locking • Not possible for long running transactions • Especially those involving human interactions • Approving travel plans and expenses

The best solution • Decompose long-running transactions into smaller transactions that • Correspond to steps of long-running transactions • Can individually commit as soon as step is completed • Rely on compensating transactions—anti-transactions—to undo them if the long-running transaction must be aborted

ADVANCED TRANSACTION TOPICS

ADVANCED TRANSACTION TOPICS

Presentation Transcript

Advanced Transaction Management

Advanced Topics

HDF5 Advanced Topics

Advanced topics

Advanced Topics

Advanced Topics

Advanced Topics and Recent Topics

Advanced Topics:

Advanced Topics

Advanced Topics

REDCap Advanced Topics

Advanced Transaction Management Concepts

Advanced Topics

Advanced Topics

Advanced Transaction Models

Advanced Topics

ADVANCED TOPICS

Advanced Topics

Advanced Topics

DCM: Advanced Topics

Advanced Topics

Advanced Topics