480 likes | 603 Views
Foundations of Database Systems. Concurrency Controls Instructor: Zhijun Wang. Announcement. Quiz #2 will be given in the last hour today Project Demo has been scheduled on May 8, 20:00-22:00pm at M402. You need to submit your project report at your Demo time. Concurrency Control.
E N D
Foundations of Database Systems Concurrency Controls Instructor: Zhijun Wang Database Systems
Announcement • Quiz #2 will be given in the last hour today • Project Demo has been scheduled on May 8, 20:00-22:00pm at M402. You need to submit your project report at your Demo time. Database Systems
Concurrency Control • Concurrency control ensures that one user’s work does not inappropriately influence another user’s work • No single concurrency control technique is ideal for all circumstances • Trade-offs need to be made between level of protection and throughput Database Systems
Atomic Transactions • A transaction, or logical unit of work (LUW), is a series of actions taken against the database that occurs as an atomic unit • Either all actions in a transaction occur or none of them do Database Systems
Errors Introduced Without Atomic Transaction Database Systems
Errors Prevented With Atomic Transaction Database Systems
Concurrent Transaction • Concurrent transactions refer to two or more transactions that appear to users as they are being processed against a database at the same time • In reality, CPU can execute only one instruction at a time • Transactions are interleaved meaning that the operating system quickly switches CPU services among tasks so that some portion of each of them is carried out in a given interval • Concurrency problems: lost update and inconsistent reads Database Systems
Concurrent Transaction Processing Database Systems
Lost-Update Problem Database Systems
Resource Locking • Resource locking prevents multiple applications from obtaining copies of the same record when the record is about to be changed Database Systems
Lock Terminology • Implicit locks are locks placed by the DBMS • Explicit locks are issued by the application program • Lock granularity refers to size of a locked resource • Rows, page, table, and database level • Large granularity is easy to manage but frequently causes conflicts • Types of lock • An exclusive lock prohibits other users from reading the locked resource • A shared lock allows other users to read the locked resource, but they cannot update it Database Systems
Locking Granularity Database Systems
Concurrent Processing with Explicit Locks Database Systems
Serializable Transactions • Serializable transactions refer to two transactions that run concurrently and generate results that are consistent with the results that would have occurred if they had run separately • Two-phased locking is one of the techniques used to achieve serializability Database Systems
Two-phased Locking • Two-phased locking • Transactions are allowed to obtain locks as necessary (growing phase) • Once the first lock is released (shrinking phase), no other lock can be obtained • A special case of two-phased locking • Locks are obtained throughout the transaction • No lock is released until the COMMIT or ROLLBACK command is issued • This strategy is more restrictive but easier to implement than two-phased locking Database Systems
2PL Implementation Database Systems
Deadlock • Deadlock, or the deadly embrace, occurs when two transactions are each waiting on a resource that the other transaction holds • Preventing deadlock • Allow users to issue all lock requests at one time • Require all application programs to lock resources in the same order • Breaking deadlock • Almost every DBMS has algorithms for detecting deadlock • When deadlock occurs, DBMS aborts one of the transactions and rollbacks partially completed work Database Systems
Deadlock Database Systems
Optimistic versus Pessimistic Locking • Optimistic locking assumes that no transaction conflict will occur: • DBMS processes a transaction; checks whether conflict occurred: • If not, the transaction is finished • If so, the transaction is repeated until there is no conflict • Pessimistic locking assumes that conflict will occur: • Locks are issued before a transaction is processed, and then the locks are released • Optimistic locking is preferred for the Internet and for many intranet applications Database Systems
Optimistic Locking Database Systems
Pessimistic Locking Database Systems
Declaring Lock Characteristics • Most application programs do not explicitly declare locks due to its complication • Instead, they mark transaction boundaries and declare locking behavior they want the DBMS to use • Transaction boundary markers: BEGIN, COMMIT, and ROLLBACK TRANSACTION • Advantage • If the locking behavior needs to be changed, only the lock declaration need be changed, not the application program Database Systems
Marking Transaction Boundaries Database Systems
ACID Transactions • Acronym ACID transaction is one that is Atomic, Consistent, Isolated, and Durable • Atomic means either all or none of the database actions occur • Durable means database committed changes are permanent Database Systems
ACID Transactions • Consistency means either statement level or transaction level consistency • Statement level consistency: each statement independently processes rows consistently • Transaction level consistency: all rows impacted by either of the SQL statements are protected from changes during the entire transaction • With transaction level consistency, a transaction may not see its own changes Database Systems
ACID Transactions • Isolation means application programmers are able to declare the type of isolation level and to have the DBMS manage locks so as to achieve that level of isolation • SQL-92 defines four transaction isolation levels: • Read uncommitted • Read committed • Repeatable read • Serializable Database Systems
Database Recovery • In the event of system failure, that database must be restored to a usable state as soon as possible • Two recovery techniques: • Recovery via reprocessing • Recovery via rollback/rollforward Database Systems
Recovery via Reprocessing • Recovery via reprocessing: the database goes back to a known point (database save) and reprocesses the workload from there • Unfeasible strategy because • The recovered system may never catch up if the computer is heavily scheduled • Asynchronous events, although concurrent transactions, may cause different results Database Systems
Rollback/Rollforward • Recovery via rollback/rollforward: • Periodically save the database and keep a database change log since the save • Database log contains records of the data changes in chronological order • When there is a failure, either rollback or rollforward is applied • Rollback: undo the erroneous changes made to the database and reprocess valid transactions • Rollforward: restored database using saved data and valid transactions since the last save Database Systems
Rollback Before-image: a copy of every database record (or page) before it was changed. Database Systems
Rollforward After-image: a copy of every database record (or page) after it was changed Database Systems
Checkpoint • A checkpoint is a point of synchronization between the database and the transaction log • DBMS refuses new requests, finishes processing outstanding requests, and writes its buffers to disk • The DBMS waits until the writing is successfully completed the log and the database are synchronized • Checkpoints speed up database recovery process • Database can be recovered using after-images since the last checkpoint • Checkpoint can be done several times per hour • Most DBMS products automatically checkpoint themselves Database Systems
Transaction Log Database Systems
Database Recovery:A Processing Problem Occurs Database Systems
Database Recovery: Recovery Processing Database Systems
Schedules • Schedule: An interleaving of actions from a set of transactions, where the actions of any transaction are in the original order. • Represents some actual sequence of database actions. • Example: R1(A), W1(A), R2(B), W2(B), R1(C), W1(C) • In a complete schedule, each transaction ends in commit or abort. • Initial State + Schedule Final State Database Systems
Acceptable Schedules • One sensible “isolated, consistent” schedule: • Run transactions one at a time, in a series. • This is called a serial schedule. • NOTE: Different serial schedules can have different final states; all are “OK” • DBMS makes no guarantees about the order in which concurrently submitted transactions are executed. • Serializable schedules: • Final state is what some serial schedule would have produced. • Aborted transactions are not part of schedule • they are made to ‘disappear’ by using logging. Database Systems
Serializability Violations • Two actions conflict when 2 transactions access the same item: • W-R conflict: T2 reads something T1 wrote. • R-W and W-W conflicts: Similar. • WR conflict (dirty read): • Result is not equal to any serial execution! transfer $100 from A to B add 6% interest to A & B Database is inconsistent! Database Systems
More Conflicts • RW Conflicts (Unrepeatable Read) • T2 overwrites what T1 read. • If T1 reads it again, it will see something new! • Example when this would happen? • The increment(T1)/decrement(T2) example. • Again, not equivalent to a serial execution. • WW Conflicts (Overwriting Uncommited Data) • T2 overwrites what T1 wrote. • Example: 2 transactions (T1 to increase, T2 to decrease) to update 2 items to be kept equal. • Usually occurs in conjunction w/other anomalies. • Unless you have “blind writes”. Database Systems
RW conflict A = 5 T1: A = A + 1, T2: A = A – 1 T1 T2T1 T2T1 T2 R(A) R(A) R(A) W(A) W(A) R(A) R(A) R(A) W(A) W(A) W(A) W(A) Database Systems
WW Conflict T1: H=1000, L=1000 T2: H=2000, L=2000 T1 T2 W(H) W(L) W(L) W(H) Database Systems
Aborted Transactions • Serializable schedule: A schedule which is equivalent to a serial schedule of committed transactions. • as if aborted transactions never happened. • Two Issues: • How does one undo the effects of an transaction? • We’ll cover this in logging/recovery • What if another transaction sees these effects?? • Must undo that transaction as well! Database Systems
Cascading Aborts • Abort of T1 requires abort of T2! • Cascading Abort • What about WW conflicts & aborts? • T2 overwrites a value that T1 writes. • T1 aborts: its “remembered” values are restored. • Lose T2’s write! We will see how to solve this, too. • An ACA (avoids cascading abort)schedule is one in which cascading abort cannot arise. • A transaction only reads/writes data from committed transactions. Database Systems
Recoverable Schedules • Abort of T1 requires abort of T2! • But T2 has already committed! • A recoverable schedule is one in which this cannot happen. • i.e. a transaction commits only after all the transactions it “depends on” (i.e. it reads from or overwrites) commit. • Recoverable implies ACA (but not vice-versa!). • Real systems typically ensure that only recoverable schedules arise (through locking). Database Systems
Precedence Graph T1 T2 • A Precedence (or Serializability) graph: • Node for each committed transaction. • Arc from Ti to Tj if an action of Ti precedes and conflicts with an action of Tj. • T1 transfers $100 from A to B, T2 adds 6% to both • R1(A), W1(A), R2(A), W2(A), R2(B), W2(B), R1(B), W1(B) Database Systems
Conflict Serializability • 2 schedules are conflict equivalentif: • they have the same sets of actions, and • each pair of conflicting actions is ordered in the same way. • (they have the same effect on a DB) • A schedule is conflict serializableif it is conflict equivalent to a serial schedule. • It is serializable if the set of items in the DB does not grow or shrink • Note: Some serializable schedules are not conflict serializable! Database Systems
Conflict Serializability & Graphs • Theorem: A schedule is conflict serializable iff its precedence graph is acyclic. • Theorem: 2PLensures that the precedence graph will be acyclic! • Strict 2PLimproves on this by avoiding cascading aborts, problems with undoing WW conflicts; i.e., ensuring recoverable schedules. Database Systems
Summary • Concurrency control key to a DBMS. • More than just mutexes! • Transactions and the ACID properties: • C & I are handled by concurrency control. • A & D coming soon with logging & recovery. • Conflicts arise when two transactions access the same object, and one of the transactions is modifying it. • Serial execution is our model of correctness. Database Systems