560 likes | 681 Views
Database Transactions. Transaction Management and Concurrency Control. Svetlin Nakov. Telerik Software Academy. http://academy.telerik.com. Manager Technical Training. http://www.nakov.com. Table of Contents. What is a Transaction? ACID Transactions Managing Transactions in SQL
E N D
Database Transactions Transaction Management and Concurrency Control Svetlin Nakov Telerik Software Academy http://academy.telerik.com Manager Technical Training http://www.nakov.com
Table of Contents • What is a Transaction? • ACID Transactions • Managing Transactions in SQL • Concurrency Problems in DBMS • Concurrency Control Techniques • Locking Strategies:Optimistic vs. Pessimistic Locking • Transaction Isolation Levels • Transaction Log and Recovery • When and How to Use Transactions? 2
Transactions • Transactionsis a sequence of actions (database operations)executed as a whole: • Either all of them complete successfully • Or none of the them • Example of transaction: • A bank transfer from one account into another (withdrawal + deposit) • If either the withdrawal or the deposit failsthe whole operation is cancelled 4
A Transaction Read Write Durable starting state Sequenceof reads and writes Durable, consistent, ending state Commit Write Rollback 5
Transactions Behavior • Transactions guarantee the consistency and the integrity of the database • All changes in a transaction are temporary • Changes are persisted when COMMIT is executed • At any time all changes can be canceled by ROLLBACK • All of the operations are executed as a whole • Either all of them or none of them 6
Withdraw $100 Transactions: Example Transfer $100 • Read current balance • New balance = current - $100 • Write new balance • Dispense cash • Read savings • New savings =current - $100 • Read checking • New checking =current + $100 • Write savings • Write checking 7
What Can Go Wrong? • Some actions fail to complete • For example, the application software or database server crashes • Interference from another transaction • What will happen if several transfers run for the same account in the same time? • Some data lost after actions complete • Database crashes after withdraw is complete and all other actions are lost 8
Transactions Properties • Modern DBMS servers have built-in transaction support • Implement “ACID” transactions • E.g. MS SQL Server, Oracle, MySQL, … • ACID means: • Atomicity • Consistency • Isolation • Durability 10
Atomicity • Atomicity means that • Transactions execute as a whole • DBMS to guarantee that either all of the operations are performed or none of them • Atomicity example: • Transfer funds between bank accounts • Either withdraw + deposit both execute successfully or none of them • In case of failure the DB stays unchanged 11
Consistency • Consistency means that • The database is in a legal state when the transaction begins and when it ends • Only valid data will be written in the DB • Transaction cannot break the rules of the database, e.g. integrity constraints • Primary keys, foreign keys, alternate keys • Consistency example: • Transaction cannot end witha duplicate primary key in a table 12
Isolation • Isolation means that • Multiple transactions running at the same time do not impact each other’s execution • Transactions don’t see othertransaction’s uncommitted changes • Isolation level defines how deeptransactions isolate from one another • Isolation example: • Manager can see the transferred funds on one account or the other, but never on both 13
Durability • Durability means that • If a transaction is committedit becomes persistent • Cannot be lost or undone • Ensured by use of database transaction logs • Durability example: • After funds are transferred and committed the power supply at the DB server is lost • Transaction stays persistent (no data is lost) 14
ACID Transactions and RDBMS • Modern RDBMS servers are transactional: • Microsoft SQL Server, Oracle Database, PostgreSQL, FirebirdSQL, … • All of the above servers support ACID transactions • MySQL can also run in ACID mode (InnoDB) • Most cloud databases are transactional as well • Amazon SimpleDB, AppEngine Datastore, Azure Tables, MongoDB, … 15
Transactions and SQL • Start a transaction • BEGINTRANSACTION • Some RDBMS use implicit start, e.g. Oracle • Ending a transaction • COMMIT • Complete a successful transaction and persist all changes made • ROLLBACK • “Undo” changes from an aborted transaction • May be done automatically when failure occurs 17
Transactions inSQL Server: Example • We have a table with bank accounts: • We use a transaction to transfer money from one account into another CREATE TABLE Accounts( Id int NOT NULL PRIMARY KEY, Balance decimal NOT NULL) CREATE PROCEDURE sp_Transfer_Funds( @from_account INT, @to_account INT, @amount MONEY) AS BEGIN BEGIN TRAN; (example continues) 18
Transactions inSQL Server: Example (2) UPDATE Accounts SET Balance = Balance - @amount WHERE Id = @from_account; IF @@ROWCOUNT <> 1 BEGIN ROLLBACK; RAISERROR('Invalid src account!', 16, 1); RETURN; END; UPDATE Accounts SET Balance = Balance + @amount WHERE Id = @to_account; IF @@ROWCOUNT <> 1 BEGIN ROLLBACK; RAISERROR('Invalid dest account!', 16, 1); RETURN; END; COMMIT; END; 19
Transfer Funds Live Demo
Scheduling Transactions • Serial schedule – the ideal case • Transactions execute one after another • No overlapping: users wait one another • Not scalable: doesn’t allow much concurrency • Conflicting operations • Two operations conflict if they: • are performed in different transactions • access the same piece of data • at least one of the transactions does a write operation to that piece of data 22
Serial Schedule – Example • T1: Adds $50to the balance • T2: Subtracts $25from the balance • T1 completes before T2begins • No concurrency problems Time Trans. Step Value 1 T1 Readbalance 100 2 T1 Balance= 100 + 50 T1 3 Write balance 150 4 T2 Read balance 150 5 T2 Balance = 150 - 25 6 T2 Write balance 125 23
Serializable Transactions • Serializability • Want to get the effect of serial schedules,but allow for more concurrency • Serializable schedules • Equivalent to serial schedules • Produce same final result as serial schedule • Locking mechanisms can ensure serializability • Serializability is too expensive • Optimistic locking allows better concurrency 24
Concurrency Problems • Problems from conflicting operations: • Dirty Read • A transaction updates an item, then fails • The item is accessed by another transaction before the rollback • The second transaction reads invalid data • Non-Repeatable Read • A transaction reads the same item twice • And gets different values • Due to concurrent change in another transaction 25
Concurrency Problems (2) • Problems from conflicting operations: • Phantom Read • A transaction executes a query twice • And gets a different number of rows • Due to another transaction inserted new rows in the meantime • Lost Update • Two transactions update the same item • The second update overwrites the first • Last update wins 26
Concurrency Problems (3) • Problems from conflicting operations: • Incorrect Summary • One transaction is calculating an aggregate function on some records • While another transaction is updating them • The result is incorrect • Some records are aggregated before the updates • Some after the updates 27
Dirty Read – Example Time Trans. Step Value 1 T1 Readbalance 100 2 T1 Balance= 100 + 50 T1 3 Write balance 150 Uncommitted 4 T2 Read balance 150 5 T2 Balance = 150 - 25 Undoes T1 6 T1 Rollback 125 7 T2 Write balance • Update from T1 was rolled back, but T2 doesn’t know about it, so finally the balance is incorrect T2 writes incorrect balance 28
Lost Update – Example Time Trans. Step Value 1 T1 Readbalance 100 2 T2 Read balance 100 T1 3 Balance = Balance + 50 4 T2 Balance = Balance - 25 5 T2 150 Write balance 6 T1 Write balance 75 • Update from T1 is lost because T2reads the balance before T1was completed Lost update 29
Concurrency Control • The problem • Conflicting operations in simultaneous transactions may produce an incorrect results • What is concurrency control? • Managing the execution of simultaneous operations inthe database • Preventing conflicts when two or more users access database simultaneously • Ensuring the results are correct like when all operations are executed sequentially 31
Locking Strategies • Optimistic concurrency control (no locking) • No locks – all operations run in parallel • Conflicts are possible • Can be resolved before commit • High concurrency – scale very well • Pessimistic concurrency control (locking) • Use exclusive and shared locks • Transactions wait for each other • Low concurrency – does not scale well 32
Optimistic Concurrency • Optimistic concurrency control(optimistic locking)means no locking • Based on assumption that conflicts are rare • Transactions proceed without delays to ensure serializability • At commit, checks are made to determine whether a conflict has occurred • Conflicts can be resolved by last wins / first wins strategy • Or conflicted transaction can be restarted • Allows greater concurrency than pessimistic locking 33
Optimistic Concurrency: Phases • Three phases of optimistic concurrency: • Read • Reads DB, perform computations, store the results in memory • Validate • Check for conflicts in the database • In case of conflict resolve it / discard changes • Write • Changes are made persistent to DB 34
Optimistic Concurrency Example • Read the data from DB: • Remember the state and perform some changes: • Update the original database record: • Check for conflicts happened during the update: SELECT @fname = FirstName FROM Persons WHERE PersonId = 7 This could take some time (e.g. wait for user action) @old_fname = @fname @fname = "Some new name" UPDATE Persons SET FirstName = @fname WHERE PersonId = 7 AND FirstName = @old_fname IF @@ROWCOUNT = 0 RAISERROR ('Conflicting update: row changed. ', 16, 1); 35
Pessimistic Concurrency • Pessimistic concurrency control(pessimistic locking) • Assume conflicts are likely • Lock shared data to avoid conflicts • Transactions wait each other does not scale well • Use shared and exclusive locks • Transactions must claim a read (shared) or write (exclusive) lock on a data item before read or write • Locks prevents another transaction from modifying item or even reading it, in the case of a write lock 36
Locking – Basic Rules • If transaction has read lock on an item • The item can be read but not modified • If transaction has write lock on an item • The item can be both read and modified • Reads are not conflicting • Multiple transactions can hold read locks simultaneously on the same item • Write lock gives exclusive access to the locked item • Transaction can upgrade a read lock to a write lock • Or downgrade a write lock to a read lock • Commits and rollbacks release the locks 37
Deadlock • What is deadlock? • When two (or more) transactions are each waiting for locks held by the others • Deadlock example: • A locks the "Authors" table • And tries to modify the "Books" table • B locks the "Books" table • And tries to modify the "Authors" table • Breaking a deadlock • Only one way: abort some of the transactions 38
Dealing with Deadlock • Deadlock prevention • Transaction can’t obtain a new lock if the possibility of a deadlock exists • Deadlock avoidance • Transaction must obtain all the locks it needs upfront (before it starts) • Deadlock detection and recovery • DB checks for possible deadlocks • If deadlock is detected, one of the transactions is killed and an exception is thrown 39
Locking Granularity • What is locking granularity? • The size of data items chosen as unit of protection by concurrency control • Ranging from coarse to fine: • Entire database • Single data file • Data page (block) • Table record • Field value of a record 40
Coarse vs. Fine Granularity • Coarse granularity • Small number of locks protecting large segments of data, e.g. DB, file, page locks • Small overhead, small concurrency • Fine granularity • Large number of locks over small areas of data, e.g. table row of field in a row • More overhead, better concurrency • DBMS servers are “smart” and use both 41
Transactions can define different isolationlevels for themselves Stronger isolation Ensures better consistency Has less concurrency The data is locked longer Transactions and Isolation 43
Isolation levels • Uncommitted Read • Reads everything, even data not committed by some other transaction • No data is locked • Not commonly used • Read Committed • Current transaction sees only committed data • Records retrieved by a query are not prevented from modification by some other transaction • Default behavior in most databases 44
Isolation levels • Repeatable Read • Records retrieved cannot be changed from outside • The transaction acquires read locks on all retrieved data, but does not acquire range locks (phantom reads may occur) • Deadlocks can occur • Serializable • Acquires a range lock on the data • Simultaneous transactions are actually executed one after another 45
Snapshot Isolation in SQL Server • By default MS SQL Server applies pessimistic concurrency control • When some transaction updates some data, the other transactions wait it to complete • A special SNAPSHOT isolation level in MS SQL • It enables optimistic concurrency control • When some transaction updates some data, all other transactions see the old data (snapshot) • Nolocking is applied no waiting transactions SET TRANSACTION ISOLATION LEVEL SNAPSHOT 46
Transaction Log • What is transaction log (REDO log)? • Keep a log of all database writes ON DISK (so that it is still available after crash) • <transaction ID>; <data item>; <new value> • (Tj; x=125) (Ti; y=56) • Actions must be idempotent (undoable / redoable) • But don't write to the database yet • At the end of transaction execution • Add "commit <transaction ID>" to the log • Do all the writes to the database • Add "complete <transaction ID>" to the log 49