1 / 37

3. Transaction processing concepts

Explore the fundamentals of transactions, concurrent processing, types of failures, and recovery mechanisms in database management systems. Learn the principles of transaction viewpoints and interleaved executions. Discover OLTP and OLAP processing categories.

jcarmela
Download Presentation

3. Transaction processing concepts

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 3. Transaction processing concepts What is a transaction? • Logical unit of work • Execution of a program that accesses or changes the database Fundamental property of a database: • The data are shared among several users and applications  concurrent processing, possibly conflicting interests. AdvDB-3 J. Teuhola 2015

  2. Transaction viewpoints • A transaction should lead the database from one consistent state to another. • Partial execution not allowed (principle:all or nothing) • Concurrent access to data by multiple transactions should be supported. • Transactions may end prematurely, due to system (hardware/software) failure;recovery mechanisms are needed. AdvDB-3 J. Teuhola 2015

  3. Categories of database processing • OLTP = On-Line Transaction Processing • Needed for the everyday functioning of the organization’s main activities, accessing the ’operative’ databases • Short transactions and response times • OLAP = On-Line Analytic Processing • Decision support, data mining • Long read-only transactions • ’Data warehouse’ • The data need not be exactly up-to-date; refreshed periodically AdvDB-3 J. Teuhola 2015

  4. Interleaved execution • Concurrenttransactions may be executed in an interleaved fashion. • Transactions consist of steps, between which the data may be inconsistent. • Arbitrary interleaving of steps will cause problems (interference): • Lost update problem • Temporary update problem (‘dirty read’) • Incorrect summary problem AdvDB-3 J. Teuhola 2015

  5. Time Transaction 1: Transaction 2:Withdraw 200 from account 123 Withdraw 100 from account 123 bal = read_balance(123) = 1000 € bal = bal – 200 € bal = read_balance(123) = 1000 € bal = bal - 100 € write_balance(123, bal) = 800 € write_balance(123, bal) = 900 € Lost update problem Example: Bank account, two parallel withdrawals • The first update is lost. Note that the transactions are not aware of each other’s internal program variables (bal). AdvDB-3 J. Teuhola 2015

  6. Time Transaction 1 Transaction 2Transfer 100 € from accountWithdraw 200 € from account 123123 to account 789 bal = read_balance(123)=1000€ bal = bal - 100€ write_balance(123, bal)=900€ bal = read_balance(123) = 900€ bal = read_balance(789)=2000€abort; recover_balance(123)=1000€ bal = bal - 200 € write_balance(123, bal)=700 €// Should be 800 € Temporary update (‘dirty read’) problem Example transactions on bank accounts: AdvDB-3 J. Teuhola 2015

  7. Time Transaction 1 Transaction 2Sum of balances Transfer 100 € from 789 to 123 sum = 0 bal = read_balance(123) = 1000 € sum = sum + bal = 1000 € bal = read_balance(456) = 2000 € sum = sum + bal = 3000€ bal = read_balance(789) = 3000 € bal = bal – 100 € write_balance(789, bal) = 2900 € bal = read_balance(123) = 1000 € bal = bal + 100€ write_balance(123, bal) = 1100 € bal = read_balance(789) = 2900 € sum = sum + 2900 € = 5900 € // Should be 6000 € Incorrect summary problem AdvDB-3 J. Teuhola 2015

  8. Reasons for recovery • DBMS must ensure that each transaction is either(1) executed in totality (all operations completed successfully; updates made permanent), or(2) has no effect on the database contents nor on other transactions. • In case of failure, the DBMS must roll backthe (partially executed) transaction so that (2) holds. AdvDB-3 J. Teuhola 2015

  9. Types of failures • Hardware/software failure: contents of main memory may be lost. • Transaction error: e.g. illegal operation or user interrupt. • Logical error: Violation of database consistency constraints. • Concurrency control error: Interference of transactions, e.g. deadlock. • Disk failure, accidents, etc.: Some data are usually lost. AdvDB-3 J. Teuhola 2015

  10. Transaction events • begin_transaction • read/write • commit: successful end; fix the changes • rollback (abort): unsuccessful ending • undo effects of a failed transaction • redo effects of a successful transaction • end_transaction AdvDB-3 J. Teuhola 2015

  11. System log • On disk (permanent storage) • Enables recovery from failures • Stores data about transaction states and performed updates (before-/after-images) • Transactions identified by a unique id. • Log entries:start, [read,] write, commit, abort • Undo needs before-image (old value) • Redo needs after-image (new value) AdvDB-3 J. Teuhola 2015

  12. Commit point • Database operations performed successfully • Effects recorded in the log. • At commit the log page in buffer must be force-written to disk. • Writing other buffer pages may be postponed. • At failure: • Uncommitted transactions are undone. • Committed transactions may have to be redone. AdvDB-3 J. Teuhola 2015

  13. Checkpoint • Effects of write operations of committed transactions are forced to disk. • A transaction need not be redone at failure,if its commit-entry in the log is before the last checkpoint. • Checkpoint is taken periodically. AdvDB-3 J. Teuhola 2015

  14. Checkpoint actions • Suspend execution. • Force-write committed updates. • Write the checkpoint info to the log: • active transactions • pointers to the first and last log entries of transactions (to speed-up recovery) • Resume execution. AdvDB-3 J. Teuhola 2015

  15. Main memory Program 1 DBMS Program 2 DB buffer manages Logbuffer ‘Page frames’ Force-write at checkpoint Force-write at commit Before-/after-imagesof changedtuples Disk pages Log pages DB Transaction processing scenario AdvDB-3 J. Teuhola 2015

  16. Desirable (ACID) properties of transactions • Atomicity: All or nothing. • Consistency: From one consistent state of the database to another. • Isolation: Updates are invisible before commit; no need for cascading rollbacks. • Durability: Committed changes are not lost (responsibility of the recovery method), exceptin catastrophic failures. AdvDB-3 J. Teuhola 2015

  17. Levels of isolation 0: No overwrite of dirty reads. 1: No lost updates. 2. No dirty reads & no lost updates. 3. ‘True isolation’: No dirty reads & no lost updates & repeatable read. AdvDB-3 J. Teuhola 2015

  18. Schedule • Assume a set of transactions {T1, ..., Tn}with ordered operations: Ti = <op1>i ; <op2>i ; ... • A schedule S is an ordered set of operations, where each Ti-sequence is a subsequenceof S. • Notice: Operations of different transactions may be interleaved. AdvDB-3 J. Teuhola 2015

  19. Example schedule • Operation sequences of three transactions: T1: r1(x); r1(y); w1(x); c1; T2: r2(y); w2(y); c2; T3: r3(x); r3(y); w3(x); c3; • One possible interleaved schedule: r1(x); r2(y); r1(y); w1(x); r3(x); r3(y); w2(y); w3(x); c2; c1; c3; AdvDB-3 J. Teuhola 2015

  20. Conflict of operations • Notation for operations: <op><trans-id>(<data item>)where <op> is one of {r, w, c, a} = {read, write, commit, abort} • Conflict (= danger): Two operations <op1>i(X) and <op2>j(X)may cause trouble if i  j and (<op1> = w or <op2> = w) AdvDB-3 J. Teuhola 2015

  21. Example of conflicts • Schedule:r1(x); r1(y); r2(x); w1(x); r2(y); w2(x); c1; c2; • Conflicting pairs of operations: • r1(x) and w2(x) • r2(x) and w1(x) • w1(x) and w2(x) • Not in conflict: • r1(y) and r2(y) AdvDB-3 J. Teuhola 2015

  22. Complete schedule S is a complete schedule of T1, T2, ..., Tn, if (1) S contains operations of T1, T2, ..., Tn (but no others). (2) Operation order of each Ti is preserved. (3) Order of operations is fixed for each conflicting pair; otherwise ordering may be partial. (4) Commit/abort is the last operation of each Ti. AdvDB-3 J. Teuhola 2015

  23. Committed projection • Committed projection of S =All operations of S belonging to committed transactions. • Reason for this concept:New transactions appear all the time;difficult to find a moment when no active transactions exist. AdvDB-3 J. Teuhola 2015

  24. Recoverable schedule • Transaction T2 that reads X does not commit until T1 that wrote X (before the read) has committed. • Explanation:T2 can be rolled back as long as it has not committed. If T1 aborts, T2 should be rolled back to avoid consequences of dirty read. AdvDB-3 J. Teuhola 2015

  25. Cascading rollback Cascade phenomenon: T2 has to be aborted because T1 aborted;T3 has to be aborted because T2 aborted; ... Prevention: Read an item only after the writing transaction has committed (cascadeless schedule). Strict schedule: Read/write X only after the last writer transaction of X has committed.Advantage: Simpler undo.Disadvantage: Lower level of concurrency AdvDB-3 J. Teuhola 2015

  26. Serializability Aim: • Correct (but not necessarily serial) schedulesfor interfering transactions Means: • Sufficient isolation Serial schedule: • All operations of each transaction Ti are executed consecutively, without interleaving. AdvDB-3 J. Teuhola 2015

  27. Serializability (cont.) Axiom: For independent transactions everyserial schedule is correct (n! alternativesfor n transactions). Problem: Limited concurrency; if one transaction waits for I/O, another cannot take over; the processor isidle. Serializable schedule: Equivalent to some serial schedule (same result, though interleaved execution); means logical correctness. AdvDB-3 J. Teuhola 2015

  28. Forms of schedule equivalence • Result equivalence:May be accidental; it should hold for allpossible database states. • Conflict equivalence:The order of any twoconflicting operations is the same in both schedules. • View equivalence: A read operation reads the result of the same write operation (or none) in both schedules. AdvDB-3 J. Teuhola 2015

  29. Forms of serializability Corresponding to the forms of equivalence: • Conflict serializability:Simple testing algorithm • View serializability:Less restrictive than conflict serializability.Testing is anNP-complete problem. Note. Every conflict-serializable schedule isalso view-serializable, but not vice versa. AdvDB-3 J. Teuhola 2015

  30. Testing conflict serializability 1. For each transaction, create a node in a precedence graph. 2. For each pair of conflicting operations <op>Ti(X) and <op>Tj(X), occurring in this order, create an edge from Ti to Tj. 3. The schedule is serializable if the precedence graph has no cycles. AdvDB-3 J. Teuhola 2015

  31. Serializable schedule equivalent serial schedule 1. Derive the precedence graph of transactionsas above. 2. Perform a topological sort of the graph nodes. 3. The resulting sequence is the solution(which may not be unique, in general). AdvDB-3 J. Teuhola 2015

  32. Example of a precedence graph Schedule: r1(x), r2(y), w1(x), r3(x), r3(y), w3(x), w3(y), w2(y), c1, c2, c3 1 2 y x The schedule is not serializable, due to the cycle. y 3 AdvDB-3 J. Teuhola 2015

  33. Problems of serializability • Difficult to determine the order of operations beforehand. • Scheduling must be done on-the-fly, based on waiting, priorities, system load, etc. • Testing serializability afterwards may be too optimistic (cf. optimistic concurrency control). • Serializability can be applied only to the committed projection. AdvDB-3 J. Teuhola 2015

  34. Solution to maintaining serializability • Control the interleaving of operations by obeying rules (a protocol) that guarantee serializability without checking it.(E.g. two-phase locking protocol, see later) • Conclusion: Serializability theory helps to gain better understanding of protocols, to obtain correct interleaving and improved concurrency. AdvDB-3 J. Teuhola 2015

  35. Debit-credit transactions • Less restrictive form of equivalence • Addition to and subtraction from anitem can be done in any order(commutativity). • A schedule is correct if<read, add/subtract, write> triplesare not interrupted. AdvDB-3 J. Teuhola 2015

  36. Granularity • To be decided: Unit of data considered in scheduling • A smaller unit enables a higher level of concurrency (but involves also more effort to manage) • A (disk) page is often the simplest unit of operation. AdvDB-3 J. Teuhola 2015

  37. Transaction support in SQL • Implicit BEGIN_TRANSACTION • Explicit COMMIT or ROLLBACK • SET TRANSACTION characteristics: • Access mode: READ ONLY or READ WRITE • Diagnostic area size: Number of feedback conditions held simultaneously. • Isolation level (from weakest to strongest): • READ UNCOMMITTED (allows dirty reads) • READ COMMITTED (allows non-repeatable reads) • REPEATABLE READ (allows ‘phantoms’) • SERIALIZABLE (actually stronger than defined earlier) AdvDB-3 J. Teuhola 2015

More Related