1 / 36

Transactions

Transactions. A transaction is a complete and consistent series of database operations to perform some logical task that is part of a database application. a database application usually consists of many such transactions

Download Presentation

Transactions

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Transactions • A transaction is a complete and consistent series of database operations to perform some logical task that is part of a database application. • a database application usually consists of many such transactions • A transactions starts with the first SELECT, INSERT, UPDATE and DELETE statement and ends with either: • “commit work” – save the work done by the transaction • “rollback work” – undo the work done by the transaction Database Systems

  2. Properties of a Transaction • Transaction management in a DBMS allows a large number of transactions to execute concurrently while maintaining: • Atomicity:a transaction either finishes all operations successfully or has no effect on the database • Consistency:when a transaction completes successfully, it leaves the database in a consistent state • Isolation:transactions run as if no other process is accessing the database while they execute • Durability:when a transaction completes successfully, its effects are written to disk and are never lost! Database Systems

  3. Transactions • A transaction is defined as a series of read and write operations • select is a read type operation, insert is a write operation, delete and update are a read operation followed by a write operation • data items (usually tuples) are operands for the operations in a transaction • most often we disregard or simplify the program logic that takes place in between the database operations • A transaction: R(A) R(B) A := A + 500 W(A) B:= B - 500 W(B) Database Systems

  4. T1 is overwriting a change T2 made T2 is reading something T1 changed (DIRTY READ) As a result, T2 used a value for A that should have never been in the database Transaction Management T1 T2 R(A) R(B) R(B) A := A + 500 W(A) B := B+200 W(B) R(A) B:= B - 500 W(B) A:= A-200 W(A) ABORT Time Database Systems

  5. Transaction Management • What does it mean to correctly execute a series of transactions? When are interleaved read/write operations from transactions valid? • a schedule is a global sequence of database operations from multiple transactions • serializable schedules are equivalent to executing all transactions one by one in some order T1,T2,…,Tk (i.e. T1 first, then T2, etc.) • serializable schedules guarantee isolation and consistency Database Systems

  6. Achieving the ACID Properties • How do we guarantee all schedules generated by a DBMS are serializable? [Concurrency Control] • locking can be used to guarantee serializable schedules as we will see (consistency and isolation) • locking also guarantees atomicity • How do we make sure that if a transaction succeeds, its effects are never lost? [Recovery] • write-ahead logging, achieves durability Database Systems

  7. Concurrency Control • Concurrency control – activity of coordinating the actions of transactions that operate in parallel and access shared data. • The operations of multiple transactions must be interleaved in such a way that the transactions do not interfere with one another. • The results of the execution of the transactions must be the same as if they were run one after the other in a serial fashion – serializability. • A database system must find serializable executions of the transactions it receives. • For simplicity, we will assume transactions are composed of the operations: read(x), write(x), commit and abort. Database Systems

  8. Concurrency Control Problems T1 T2 • Lost update problem: • Dirty read problem: read(x) x = x + 1 read(x) The final value of x is incremented by one rather than by two, as it should be. time x = x + 1 write(x) write(x) T1 T2 read(x) T2 reads a value for x that should not have been part of the database. x = x + 1 write(x) time read(x) abort Database Systems

  9. More Problems • Incorrect summary problem: T1 T2 sum = 0 T1 incorrectly computes the sum of x, y, and z. read(x) sum = sum + x read(y) y = y - 1 write(y) time If T2 is moving money From one account to another and T1 is computing total assets, this can be a serious problem. read(y) sum = sum + y read(z) sum = sum + z read(z) z = z + 1 write(z) Database Systems

  10. Transaction Histories • A history is a partial ordering of the operations in a set of transactions recording how the transactions executed. • Two operations conflict if they operate on the same data item and at least one of them is a write. • in this case it is important to know which of the two operations went first • Let T = {T1, T2, ..., Tn}. A history H over T is a partial order <H that includes all the operations in the {Ti } and that preserves the order of the operations in each Ti. It must also order all pairs of conflicting operations. Database Systems

  11. History Example • Consider the following three transactions: • T1 = r1[x] ---> w1[x] ---> c1 • T2 = r2[x] ---> w2[y] ---> w2[x] ---> c2 • T3 = r3[y] ---> w3[x] ---> w3[y] ---> w3[z] ---> c3 • A history over these transactions is: • r2[x] ---> w2[y] ---> w2[x] ---> c2 • r3[y] ---> w3[x] ---> w3[y] ---> w3[z] ---> c3 H = • r1[x] ---> w1[x] ---> c1 Database Systems

  12. Serialization Graph • A serialization graph of a history H is a directed graph whose nodes are the transactions in H and whose edges are all of the form Ti ---> Tj such that an operation in Ti precedes and conflicts with an operation in Tj. • Example • The serialization graph for this example is: • r3[x] ---> w3[x] ---> c3 • r1[x] ---> w1[x] ---> w1[y] ---> c1 H = • r2[x] ---> w2[y] ---> c2 T2 T1 T3 Database Systems

  13. Serializability Theory • Two histories are equivalent if they order conflicting operations of non-aborted transactions in the same way. • Goal -- determine if there is a serial history equivalent to a given history H. • if there is, then the transactions in H did not interfere with one another • Therefore, we need a serial history consistent with the edges in the serialization graph for the history H. • Such a serial history exists if the serialization graph for history H is acyclic. Database Systems

  14. Example - A Serializable History • Example: • The serialization graph for this example is: • An equivalent serial history is: • r3[x] ---> w3[x] ---> c3 • r1[x] ---> w1[x] ---> w1[y] ---> c1 H = • r2[x] ---> w2[y] ---> c2 T2 T1 T3 T2 T1 T3 Database Systems

  15. Example: Non-Serializable History • r2[x] ---> w2[y] ---> w2[x] ---> c2 • Example: • The serialization graph for this history is: • There are no equivalent serial histories because the serialization graph has a cycle. • these transactions interfere with one another • r3[y] ---> w3[x] ---> w3[y] ---> w3[z] ---> c3 H = • r1[x] ---> w1[x] ---> c1 T1 T2 T3 Database Systems

  16. Serialization Graphs • Given a serialization graph that does not contain a cycle, an equivalent serial schedule can always be constructed: • create an empty schedule S • while the graph is not empty: • find all nodes N in the graph with no incoming edges • order transactions in N in any arbitrary serial order and add them to the end of S for each transaction in N, select the operations in the current schedule and write them in the same order to the new schedule • delete nodes N and all edges from nodes in N to the remaining graph Database Systems

  17. Schedulers • A database system must have a scheduler that orders operations in concurrently executing transactions. • Aggressive (optimistic) scheduler -- assume the best and abort if discover a problem later • the abort undoes operations that the transaction completed before the abort • good when conflicts are rare • Conservative scheduler -- delay operations so that they can be reordered to avoid problems. • less concurrency • good when conflicts are likely Database Systems

  18. Locking to Ensure Serializability • A transaction should have a lock in its possession for any data objects it accesses. • read/shared lock is required to read an item • write/exclusive lock is required to write an item • Many transactions may have read locks on the same data item. • If a transaction holds a write lock on an item, no other transaction may obtain a lock of any kind for this item. Database Systems

  19. Two Phase Locking (2PL) • Before each read and write operation, the transaction manager attempts to obtain the necessary lock for the appropriate data item • if the lock is granted, then the operation is performed • if the lock is not granted, then the whole transaction waits in a queue until the lock can be obtained • A transaction can release a lock at any time • after a transaction releases a lock for the first time, it cannot obtain any new locks on any data items • The COMMIT command in SQL releases all locks a transaction holds, which is a stronger condition than required by 2PL. [Strict Two Phase Locking] Database Systems

  20. 2PL THEOREM: All schedules generated by enforcing the 2PL protocol are serializable. The actual read/write events that take place in the database correspond to some serial execution of transactions. PROOF: The conflict graph induced by 2PL compliant schedulers are acyclic. By contradiction, assume the serialization graph has a cycle of the form: Then there must be at least two conflicts. A sample scenario for conflicts 1 and 2: R1(A)…W2(A)...R2(B)…W1(B) conflict1 T1 T2 conflict2 Database Systems

  21. 2PL conflict1 T1 T2 R1(A)…W2(A)...R2(B)…W1(B) • Such a cycle is not possible. T2 could only obtain a write lock for “A” if T1 first releases it’s read lock and enters its shrinking phase. • Consequently, T1 will be able to write “B” only if it already holds a write lock for “B”. • But is T1 already holds a write lock for “B”, then T2 will not be able to get the read lock on B that it needs. • Hence, this schedule cannot happen. conflict2 Database Systems

  22. But There is a Problem ! • Two-phase locking is prone to deadlocks. • Example: • T1: r1 [x] ---> w1 [y] ---> c1 • T2: w2 [y] ---> w2 [x] ---> c2 • Lock conversion from read locks to write locks can also cause a problem. • T1: r1 [x] ---> w1 [x] ---> c1 • T2: r2 [x] ---> w2 [x] ---> c2 Database Systems

  23. Deadlocks in 2PL • If a transaction requests a lock of any kind and cannot obtain it, it enters a wait state • this can lead to deadlocks • A wait-for graph to detect deadlocks is constructed as follows: • if transaction Ti is waiting for a lock that transaction Tj holds, draw an edge from Ti to Tj • If the wait-for graph has a cycle, then no transaction in the cycle will ever exit the “wait” state. [DEADLOCK] • One of the transactions participating in a deadlock can be aborted to remove the deadlock Database Systems

  24. Example • T1: R(A) W(B) • T2: R(B) W(A) • Schedule: R1(A) R2(B) • T1 is waiting for T2 to release the lock on B, T2 is waiting for T1 to release the lock on A • Neither of the transactions can complete  there is a cycle in the wait-for graph  deadlock T1 T2 Database Systems

  25. Deadlocks • Time-outs -- if a transaction has waited too long for a lock, the scheduler “guesses” that it is deadlocked and aborts it • Maintain wait-for graphs • things to consider when aborting transactions: • effort already invested • cost to abort • other potential deadlocks • work remaining • priority and locks already held Database Systems

  26. Implementation Issues • The scheduler in a database system is usually implemented as a combination of a lock manager and a transaction manager. • The lock manager maintains a table of locks and supports the following operations: • lock (transaction id, data item, mode) • unlock (transaction id, data item) • these must be atomic operations • The lock manager also maintains a queue for each data item of transactions waiting for locks on that item. Database Systems

  27. Locking Granularity • Possible lock granularities: • attribute value in a tuple • tuple • disk block • file • whole database • Larger granularities allow less concurrency • Smaller granularities require more overhead • Many database systems provide a small set of choices to pick from. • Row (tuple level) locking only in Oracle. Database Systems

  28. Timestamp Ordering • A unique timestamp ts(Ti) is given to each transaction. • If operations oi[x] from Ti and oj[x] from Tj conflict, then the system processes oi[x] before oj[x] iff ts(Ti) < ts(Tj). • Basic Timestamp Ordering: (aggressive) • operations executed in first come first serve order • if an operation arrives to late, its transaction is aborted • the scheduler must maintain the maximum timestamp of previous reads and writes for each data item to do this • Timestamp Ordering establishes an ordering on the transactions so that the serialization graph will be acyclic. Database Systems

  29. Cascading Abort • By definition, if a transaction aborts, its effects must be completely erased from the database • Example: • if T1 aborts, since T2 has seen the changes caused by T1, then T2 has to be aborted as well -> cascading abort • schedules generated by 2PL may require cascading abort Database Systems

  30. Strict 2PL • A transaction obtains all locks it requires as in normal 2PL. • A transaction holds all locks until it commits • Since all transactions see only the changes made by committed (not aborted) transactions, these transactions do not require cascading aborts • the example on the previous slide could not happen • Remember that most commercial database systems use strict 2PL • this is one of the reasons why Database Systems

  31. Levels of Isolation • 2PL ensures serializability, but at the expense of lower concurrency • “Consistency” may not be vital to some types of transactions • In all the isolations levels, all locks are kept until commit time! Database Systems

  32. Read Uncommitted • Read all tuples regardless of the locks that are on them. • When reading, a short lock (latch) is used to ensure atomicity. • Transactions that are set to be “read uncommitted” are read only and cannot make database updates • Results of such transactions are not guaranteed to be correct! SET TRANSACTION READ UNCOMMITTED ; SELECT count(F.reserved_seats) FROM Flights F WHERE F.flight_id = “AA112” ; Database Systems

  33. Read committed (cursor stability) • There are no read locks (except for latches) ! But, transactions can only read items with no locks. • A transaction needs a write lock to write a data item. • Write locks are not released until the transaction holding them commits. • Avoids all W1(A) -> R2(A) and W1(A) -> W2(A) type conflicts since transaction T2 will only execute after transaction T1 commits • Problem with R1(A) -> … -> W1(A) type transactions Another transaction T2 may read and alter A in between these operations Database Systems

  34. Cursor Stability • Cursor Stability • declare a cursor for update when using “read committed” • a read lock is held on each row the cursor points to until the cursor is moved from that location exec sql set transaction read committed ; exec sql declare cursor deposit for select balance from accounts where branch_id = ‘Alb’ for update of balance ; exec sql open deposit ; (loop through rows in cursor) exec sql fetch deposit into :balance ; balance = balance + 10 ; exec sql update account set balance = :balance where current of deposit ; (end of loop) exec sql close deposit ; exec sql commit work ; Database Systems

  35. Repeatable reads • Hold read locks on rows as well as write locks until commit time • similar to the 2PL mechanism we have seen • Suffers from “phantom update” problems. exec sql set transaction repeatable reads ; exec sql select count(F.reserved_seats) into :num; from Flights F where F.flight_id = “AA112” exec sql update current set status = :num where current.flight_id = “AA112” ; Another transaction may insert a reservation tuple in between these operations making the count incorrect Database Systems

  36. Serializable • Serializable transactions do not have the phantom update problem. • Serializable transactions lock a “predicate”, limiting all read/write access to operations that would satisfy a predicate such as F.flight_id = “AA112” • In Oracle Database Systems

More Related