1 / 142

What we have covered?

What we have covered?. Indexing and Hashing Data warehouse and OLAP Data Mining Information Retrieval and Web Mining XML and XQuery Spatial Databases Transaction Management. Lecture 7: Transactions. The Setting.

minnie
Download Presentation

What we have covered?

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. What we have covered? • Indexing and Hashing • Data warehouse and OLAP • Data Mining • Information Retrieval and Web Mining • XML and XQuery • Spatial Databases • Transaction Management

  2. Lecture 7: Transactions

  3. The Setting • Database systems are normally being accessed by many users or processes at the same time. • Both queries and modifications. • Unlike operating systems, which support interaction of processes, a DMBS needs to keep processes from troublesome interactions.

  4. Example: Bad Interaction • You and your domestic partner each take $100 from different ATM’s at about the same time. • The DBMS better make sure one account deduction doesn’t get lost. • Compare: An OS allows two people to edit a document at the same time. If both write, one’s changes get lost.

  5. ACID Transactions • A DBMS is expected to support “ACID transactions,” processes that are: • Atomic : Either the whole process is done or none is. • Consistent : Database constraints are preserved. • Isolated : It appears to the user as if only one process executes at a time. • Durable: Effects of a process do not get lost if the system crashes.

  6. Transactions in SQL • SQL supports transactions, often behind the scenes. • Each statement issued at the generic query interface is a transaction by itself. • In programming interfaces like Embedded SQL or PSM, a transaction begins the first time a SQL statement is executed and ends with the program or an explicit transaction-end.

  7. COMMIT • The SQL statement COMMIT causes a transaction to complete. • It’s database modifications are now permanent in the database.

  8. ROLLBACK • The SQL statement ROLLBACK also causes the transaction to end, but by aborting. • No effects on the database. • Failures like division by 0 or a constraint violation can also cause rollback, even if the programmer does not request it.

  9. An Example: Interacting Processes • Assume the usual Sells(bar,beer,price) relation, and suppose that Joe’s Bar sells only Bud for $2.50 and Miller for $3.00. • Sally is querying Sells for the highest and lowest price Joe charges. • Joe decides to stop selling Bud and Miller, but to sell only Heineken at $3.50.

  10. Sally’s Program • Sally executes the following two SQL statements, which we call (min) and (max), to help remember what they do. (max) SELECT MAX(price) FROM Sells WHERE bar = ’Joe’’s Bar’; (min) SELECT MIN(price) FROM Sells WHERE bar = ’Joe’’s Bar’;

  11. Joe’s Program • At about the same time, Joe executes the following steps, which have the mnemonic names (del) and (ins). (del) DELETE FROM Sells WHERE bar = ’Joe’’s Bar’; (ins) INSERT INTO Sells VALUES(’Joe’’s Bar’, ’Heineken’, 3.50);

  12. Interleaving of Statements • Although (max) must come before (min), and (del) must come before (ins), there are no other constraints on the order of these statements, unless we group Sally’s and/or Joe’s statements into transactions.

  13. 2.50, 3.00 2.50, 3.00 3.50 (max) (del) (min) 3.00 3.50 Example: Strange Interleaving • Suppose the steps execute in the order (max)(del)(ins)(min). Joe’s Prices: Statement: Result: • Sally sees MAX < MIN! (ins)

  14. Fixing the Problem by Using Transactions • If we group Sally’s statements (max)(min) into one transaction, then she cannot see this inconsistency. • She sees Joe’s prices at some fixed time. • Either before or after he changes prices, or in the middle, but the MAX and MIN are computed from the same prices.

  15. Another Problem: Rollback • Suppose Joe executes (del)(ins), not as a transaction, but after executing these statements, thinks better of it and issues a ROLLBACK statement. • If Sally executes her statements after (ins) but before the rollback, she sees a value, 3.50, that never existed in the database.

  16. Solution • If Joe executes (del)(ins) as a transaction, its effect cannot be seen by others until the transaction executes COMMIT. • If the transaction executes ROLLBACK instead, then its effects can never be seen.

  17. Isolation Levels • SQL defines four isolation levels = choices about what interactions are allowed by transactions that execute at about the same time. • How a DBMS implements these isolation levels is highly complex, and a typical DBMS provides its own options.

  18. Choosing the Isolation Level • Within a transaction, we can say: SET TRANSACTION ISOLATION LEVEL X where X = • SERIALIZABLE • REPEATABLE READ • READ COMMITTED • READ UNCOMMITTED

  19. Serializable Transactions • If Sally = (max)(min) and Joe = (del)(ins) are each transactions, and Sally runs with isolation level SERIALIZABLE, then she will see the database either before or after Joe runs, but not in the middle. • It’s up to the DBMS vendor to figure out how to do that, e.g.: • True isolation in time. • Keep Joe’s old prices around to answer Sally’s queries.

  20. Isolation Level Is Personal Choice • Your choice, e.g., run serializable, affects only how you see the database, not how others see it. • Example: If Joe Runs serializable, but Sally doesn’t, then Sally might see no prices for Joe’s Bar. • i.e., it looks to Sally as if she ran in the middle of Joe’s transaction.

  21. Read-Commited Transactions • If Sally runs with isolation level READ COMMITTED, then she can see only committed data, but not necessarily the same data each time. • Example: Under READ COMMITTED, the interleaving (max)(del)(ins)(min) is allowed, as long as Joe commits. • Sally sees MAX < MIN.

  22. Repeatable-Read Transactions • Requirement is like read-committed, plus: if data is read again, then everything seen the first time will be seen the second time. • But the second and subsequent reads may see more tuples as well.

  23. Example: Repeatable Read • Suppose Sally runs under REPEATABLE READ, and the order of execution is (max)(del)(ins)(min). • (max) sees prices 2.50 and 3.00. • (min) can see 3.50, but must also see 2.50 and 3.00, because they were seen on the earlier read by (max).

  24. Read Uncommitted • A transaction running under READ UNCOMMITTED can see data in the database, even if it was written by a transaction that has not committed (and may never). • Example: If Sally runs under READ UNCOMMITTED, she could see a price 3.50 even if Joe later aborts.

  25. Concurrency Control T1 T2 … Tn DB (consistency constraints)

  26. Review • Why do we need transaction? • What’s ACID? • What’s SQL support for transaction? • What’s the four isolation level • SERIALIZABLE • REPEATABLE READ • READ COMMITTED • READ UNCOMMITTED

  27. Example: T1: Read(A) T2: Read(A) A  A+100 A  A2 Write(A) Write(A) Read(B) Read(B) B  B+100 B  B2 Write(B) Write(B) Constraint: A=B

  28. A B 25 25 125 125 250 250 250 250 Schedule A T1 T2 Read(A); A  A+100 Write(A); Read(B); B  B+100; Write(B); Read(A);A  A2; Write(A); Read(B);B  B2; Write(B);

  29. A B 25 25 50 50 150 150 150 150 Schedule B T1 T2 Read(A);A  A2; Write(A); Read(B);B  B2; Write(B); Read(A); A  A+100 Write(A); Read(B); B  B+100; Write(B);

  30. A B 25 25 125 250 125 250 250 250 Schedule C T1 T2 Read(A); A  A+100 Write(A); Read(A);A  A2; Write(A); Read(B); B  B+100; Write(B); Read(B);B  B2; Write(B);

  31. A B 25 25 125 250 50 150 250 150 Schedule D T1 T2 Read(A); A  A+100 Write(A); Read(A);A  A2; Write(A); Read(B);B  B2; Write(B); Read(B); B  B+100; Write(B);

  32. A B 25 25 125 125 25 125 125 125 Same as Schedule D but with new T2’ Schedule E T1 T2’ Read(A); A  A+100 Write(A); Read(A);A  A1; Write(A); Read(B);B  B1; Write(B); Read(B); B  B+100; Write(B);

  33. Want schedules that are “good”, regardless of • initial state and • transaction semantics • Only look at order of read and writes Example: Sc=r1(A)w1(A)r2(A)w2(A)r1(B)w1(B)r2(B)w2(B)

  34. Example: Sc=r1(A)w1(A)r2(A)w2(A)r1(B)w1(B)r2(B)w2(B) Sc’=r1(A)w1(A) r1(B)w1(B)r2(A)w2(A)r2(B)w2(B) T1 T2

  35. Review • Why do we need transaction? • What’s ACID? • What’s SQL support for transaction? • What’s the four isolation level • SERIALIZABLE • REPEATABLE READ • READ COMMITTED • READ UNCOMMITTED

  36. Example: T1: Read(A) T2: Read(A) A  A+100 A  A2 Write(A) Write(A) Read(B) Read(B) B  B+100 B  B2 Write(B) Write(B) Constraint: A=B

  37. A B 25 25 125 250 50 150 250 150 Schedule D T1 T2 Read(A); A  A+100 Write(A); Read(A);A  A2; Write(A); Read(B);B  B2; Write(B); Read(B); B  B+100; Write(B);

  38. However, for Sd: Sd=r1(A)w1(A)r2(A)w2(A) r2(B)w2(B)r1(B)w1(B) • as a matter of fact, T2 must precede T1 in any equivalent schedule, i.e., T2 T1

  39. T2 T1 • Also, T1 T2 T1 T2 Sd cannot be rearranged into a serial schedule Sd is not “equivalent” to any serial schedule Sd is “bad”

  40. A B 25 25 125 250 125 250 250 250 Schedule C T1 T2 Read(A); A  A+100 Write(A); Read(A);A  A2; Write(A); Read(B); B  B+100; Write(B); Read(B);B  B2; Write(B);

  41. Returning to Sc Sc=r1(A)w1(A)r2(A)w2(A)r1(B)w1(B)r2(B)w2(B) T1 T2 T1 T2  no cycles  Sc is “equivalent” to a serial schedule (in this case T1,T2)

  42. Concepts Transaction: sequence of ri(x), wi(x) actions Conflicting actions: r1(A) w2(A) w1(A) w2(A) r1(A) w2(A) Schedule: represents chronological order in which actions are executed Serial schedule: no interleaving of actions or transactions

  43. Definition S1, S2 are conflict equivalentschedules if S1 can be transformed into S2 by a series of swaps on non-conflicting actions.

  44. Definition A schedule is conflict serializable if it is conflict equivalent to some serial schedule.

  45. Precedence graph P(S) (S is schedule) Nodes: transactions in S Arcs: Ti  Tj whenever - pi(A), qj(A) are actions in S - pi(A) <S qj(A) - at least one of pi, qj is a write

  46. Exercise: • What is P(S) forS = w3(A) w2(C) r1(A) w1(B) r1(C) w2(A) r4(A) w4(D) • Is S serializable?

  47. Another Exercise: • What is P(S) forS = w1(A) r2(A) r3(A) w4(A) ?

  48. Proof: Assume P(S1)  P(S2)  Ti: Ti  Tj in S1 and not in S2  S1 = …pi(A)... qj(A)… pi, qj S2 = …qj(A)…pi(A)... conflict  S1, S2 not conflict equivalent Lemma S1, S2 conflict equivalent  P(S1)=P(S2)

  49. Counter example: S1=w1(A) r2(A) w2(B) r1(B) S2=r2(A) w1(A) r1(B) w2(B) Note: P(S1)=P(S2)  S1, S2 conflict equivalent

  50. Theorem P(S1) acyclic  S1 conflict serializable () Assume S1 is conflict serializable  Ss: Ss, S1 conflict equivalent  P(Ss)=P(S1)  P(S1) acyclic since P(Ss) is acyclic

More Related