1 / 37

Advanced Topics: Indexes & Transactions

Advanced Topics: Indexes & Transactions. Instructor: Mohamed Eltabakh meltabakh@cs.wpi.edu. Indexes. Why Indexes. With or without indexes, the query answer should be the same Indexes are needed for efficiency and fast access of data. Without index, we check all 10,000 students.

sovann
Download Presentation

Advanced Topics: Indexes & Transactions

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Advanced Topics: Indexes & Transactions Instructor: Mohamed Eltabakh meltabakh@cs.wpi.edu cs3431

  2. Indexes cs3431

  3. Why Indexes • With or without indexes, the query answer should be the same • Indexes are needed for efficiency and fast access of data Without index, we check all 10,000 students Assume we have 10,000 students SELECT* FROMStudent WHEREsNumber = 76544357; With index, we can reach that student directly cs3431

  4. Direct Access vs. Sequential Access SELECT* FROMStudent WHEREsNumber = 76544357; Without index, we check all 10,000 students (sequential access) With index, we can reach that student directly (direct access) cs3431

  5. What is an Index • A index is an auxiliary file that makes it more efficient to search for a record in the data file • The index is usually specified on one field of the file • Although it could be specified on several fields • The index is stored separately from the base table • Each table may have multiple indexes Can create an index on sNumber Student Can create a second index on sName cs3431

  6. Example: Index on sNumber Student • Index file is always sorted • Index size is much smaller than the table size • Now any query (equality or range) on sNumber can be efficiently answered (Binary search on the index) Index on sNumber

  7. Example: Index on sName Student • Duplicates values have duplicate entries in the index • Now any query (equality or range) on sName can be efficiently answered (Binary search on the index) Index on sName

  8. Creating an Index Create Index <name> On <tablename>(<colNames>); Student DB System knows how to: 1- create the index 2- when and how to use it Create Index sNumberIndexOn Student(sNumber); Create Index sNameIndexOn Student(SName);

  9. Multiple Predicates 1- The best the DBMS can do is using addressIndex  ‘320FL’ 2- From those tuples, check sName = ‘Dave’ Student SELECT* FROM Student WHERE address = ‘320FL’ AND sName = ‘Dave’; Create Index addessIndexOn Student(address); cs3431

  10. Multi-Column Indexes • Columns X, Y are frequently queried together (with AND) • Each column has many duplicates • Then, consider creating a multi-column index on X, Y SELECT* FROM Student WHERE address = ‘320FL’ AND sName = ‘Dave’; Directly returns this record only Create Index nameAddOn Student(sName, address);

  11. Using an Index • DBMS automatically figures out which index to use based on the query SELECT* FROMStudent WHEREsNumber = 76544357; Student Automatically uses SNumberIndex Create Index sNumberIndexOn Student(sNumber); Create Index sNameIndexOn Student(SName); cs3431

  12. How Do Indexes Work? cs3431

  13. Types of Indexes • Primary vs. Secondary • Single-Level vs. Multi-Level (Tree Structure) • Clustered vs. Non-Clustered cs3431

  14. Primary vs. Secondary Indexes • Index on the primary key of a relation is called primary index (only one) • Index on any other column is called secondary index (can be many) • In primary index, all values are unique • In secondary indexes, values may have duplicates Student Index on SSN is a Primary Index Index on sNumberis a Secondary Index Index on sNameis a Secondary Index

  15. Single-Level Indexes • Index is one-level sorted list • Given a value v to query • Perform a binary search in the index to find it (Fast) • Follow the link to reach the actual record Student Index on sNumber

  16. Multi-Level Index • Build index on top of the index (can go multiple levels) • When searching for value v: • Find the largest entry ≤ v, and follow its pointer Student 2nd level 1st level cs3431 Index on sNumber

  17. Clustered vs. Non-Clustered Assume there is index X on column C • If the records in the table are stored sorted based on C • X  Clustered index • Otherwise, X  Non-Clustered index • Primary index is a clustered index Student • Non-Clustered • index • Clustered index

  18. Index Maintenance • Indexes are used in queries • But, need to be maintained when data change • Insert, update, delete • DBMS automatically handles the index maintenance • When insert new records  the indexed field is added to the index • When delete records  their values are deleted from the index • When update an indexed value  delete the old value from index & insert the new value • There is a cost for maintaining an index, however its benefit is usually more (if used a lot) cs3431

  19. Summary of Indexes • Indexes are auxiliary structures for efficient searching and querying • Query answer is the same with or without index • What to index depends on which columns are frequently queried (in Where clause) • Main operations Create Index <name> On <tablename>(<colNames>); Drop Index <name>; cs3431

  20. Transactions cs3431

  21. What is a Transaction • A set of operations on a database that are treated as one unit • Execute Allor None • Transactions have semantics at the application level • Want to reserve two seats in a flight • Transfer money from account A to account B • … • What if two users are reserving the same flight seat at the same time??? Transactions solve these problems

  22. Transactions • By default, each SQL statement is a transaction • Can change the default behavior SQL > Start transaction; SQL > Insert …. SQL > Update … SQL > Delete .. SQL > Select … SQL> Commit | Rollback; All of these statements are now one unit (either all succeed all fail) End transaction successfully Cancel the transaction

  23. Transaction Properties • Four main properties • Atomicity– A transaction if one atomic unit • Consistency– A transaction ensures DB is consistent • Isolation– A transaction is considered as if no other transaction was executing simultaneously • Durability– Changes made by a transaction must persist • ACID: Atomicity, Consistency, Isolation, Durability • ACID properties are enforced by the DBMS cs3431

  24. Consistency Issue • Many users may update the data at the same time • How to ensure the result is consistent 2 1 Update T Set x = x * 3; Update T Set x = x + 2; 3 What is the right answer??? Wrong, Inconsistent data

  25. Serial Order of Transactions • Given N concurrent transactions T1, T2, …TN • Serial order is any permutation of these transactions (N!) • T1, T2, T3, …TN • T2, T3, T1, …, TN • … • DBMS will ensure that the end-result from executing the N transactions (concurrently) matches one of the serial order execution • That is called Serializability • As if transactions are executed in serial order cs3431

  26. Serializable Execution • Given N concurrent transactions T1, T2, …TN • DBMS will execute them concurrently (at the same time) • But, the final effect matches one of the serial order executions Update T Set x = x * 3; Update T Set x = x + 2;

  27. Isolation Levels • Read Uncommitted • Read Committed • Repeatable Read • Serializable Gets stronger & avoids problems That is the default in DBMS cs3431

  28. 1- READ UNCOMMITTED Session 2 -------BEGIN TRANSACTION----- select color from cust where id=500; color ------ red select color from cust where id=500; color ----- blue select color from cust where id=500; color ----- blue -----------COMMIT------------ Session 1 -------BEGIN TRANSACTION----- update cust set color='blue' where id=500; -----------COMMIT------------ | | | | V Time NonRepeatable read (bad) Dirty read (bad)

  29. 2- READ COMMITTED Dirty Read  Solved Session 2 -------BEGIN TRANSACTION----- select color from cust where id=500; color ------ red select color from cust where id=500; color ----- red select color from cust where id=500; color ----- blue -----------COMMIT------------ Session 1 -------BEGIN TRANSACTION----- update cust set color='blue' where id=500; -----------COMMIT------------ | | | | V Time NonRepeatable read (bad)

  30. 2- READ COMMITTED Session 2 -------BEGIN TRANSACTION----- select color from cust where id=500; color ------ red select color from cust where id=500; color ----- red select color from cust where id=500; color ----- -----------COMMIT------------ Session 1 -------BEGIN TRANSACTION----- delete cust where id=500; -----------COMMIT------------ | | | | V Time Phantom (bad)

  31. 3- REPEATABLE READ NonRepeatable Read Solved Session 2 -------BEGIN TRANSACTION----- select color from cust where id=500; color ------ red select color from cust where id=500; color ----- red select color from cust where id=500; color ----- red -----------COMMIT------------ Session 1 -------BEGIN TRANSACTION----- update cust set color='blue' where id=500; -----------COMMIT------------ | | | | V Time

  32. 3- REPEATABLE READ Phantom (For Delete)  Solved Session 2 -------BEGIN TRANSACTION----- select color from cust where id=500; color ------ red select color from cust where id=500; color ----- red select color from cust where id=500; color ----- red -----------COMMIT------------ Session 1 -------BEGIN TRANSACTION----- delete cust where id=500; -----------COMMIT------------ | | | | V Time

  33. 3- REPEATABLE READ Session 2 -------BEGIN TRANSACTION----- select id from cust where color=‘blue’; id -- select id from cust where color=‘blue’; id -- select id from cust where color=‘blue’; id -- 500 -----------COMMIT------------ Session 1 -------BEGIN TRANSACTION----- Insert into cust(id, color) values (500, ‘blue’); -----------COMMIT------------ | | | | V Time Phantom Insert (bad)

  34. 4- SERIALIZABLE Phantom  Solved Session 2 -------BEGIN TRANSACTION----- select id from cust where color=‘blue’; id -- select id from cust where color=‘blue’; id -- select id from cust where color=‘blue’; id -- -----------COMMIT------------ Session 1 -------BEGIN TRANSACTION----- Insert into cust(id, color) values (500, ‘blue’); -----------COMMIT------------ | | | | V Time

  35. Summary of Transactions • Unit of work in DBMS • Either executed All or None • Ensures consistency among many concurrent transactions • Ensures persistent data once committed (using recovery techniques) • Main ACID properties • Atomicity, Consistency, Isolation, Durability cs3431

  36. END !!! cs3431

  37. Final Exam • Dec. 13, at 8:15am – 9:30am (75 mins) • Closed book, open sheet • Answer in the same exam sheet • Material Included • ERD • SQL (Select, Insert, Update, Delete) • Views, Triggers, Assertions • Cursors, Stored Procedures/Functions • Material Excluded • Relational Model & Algebra • Normalization Theory • ODBC/JDBC • Indexes and Transactions Friday’s Lecture (Revision + short Quiz)

More Related