1 / 29

CS422 Principles of Database Systems Indexes

CS422 Principles of Database Systems Indexes. Chengyu Sun California State University, Los Angeles. Indexes. Auxiliary structures that speed up operations that are not supported efficiently by the basic file organization. A Simple Index Example. 10. 10. 20. 20. 30. 30. 40. 40. 50.

Download Presentation

CS422 Principles of Database Systems Indexes

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS422 Principles of Database SystemsIndexes Chengyu Sun California State University, Los Angeles

  2. Indexes • Auxiliary structures that speed up operations that are not supported efficiently by the basic file organization

  3. A Simple Index Example 10 10 20 20 30 30 40 40 50 50 60 60 70 70 80 80 Index blocks Data blocks

  4. Entries in an Index • <key, rid> • <key, list of rid> • Data records

  5. Organization of Index Entries • Tree-structured • B-tree, R-tree, Quad-tree, kd-tree, … • Hash-based • Static, dynamic • Other • Bitmap, VA-file, …

  6. From BST to BBST to B • Binary Search Tree • Worst case?? • Balance Binary Search Tree • E.g. AVL, Red-Black • B-tree • Why not use BBST in databases??

  7. B-tree (B+-tree) Example 100 root internal nodes 30 120 150 leaf nodes 3 5 11 30 35 100 101 110 120 130 150 156 179

  8. B-tree Properties • Each node occupies one block • Order n • n keys, n+1 pointers • Nodes (except root) must be at least half full • Internal node: (n+1)/2 pointers • Leaf node: (n+1)/2 pointers • All leaf nodes are on the same level

  9. B-tree Operations • Search • Insert • Delete

  10. B-tree Insert • Find the appropriate leaf • Insert into the leaf • there’s room  we’re done • no room • split leaf node into two • insert a new <key,pointer> pair into leaf’s parent node • Recursively apply previous step if necessary • A split of current ROOT leads to a new ROOT

  11. B-tree Insert Examples • (a) simple case • space available in leaf • (b) leaf overflow • (c) non-leaf overflow • (d) new root HGM Notes

  12. 32 n=3 100 (a) Insert key = 32 30 3 5 11 30 31 HGM Notes

  13. 7 3 5 7 n=3 100 (b) Insert key = 7 30 3 5 11 30 31 HGM Notes

  14. 160 180 160 179 n=3 100 (c) Insert key = 160 120 150 180 180 200 150 156 179 HGM Notes

  15. 30 new root 40 40 45 n=3 (d) New root, insert 45 10 20 30 1 2 3 10 12 20 25 30 32 40 HGM Notes

  16. B-tree Delete • Find the appropriate leaf • Delete from the leaf • still at least half full  we’re done • below half full – coalescing • borrow a <key,pointer> from one sibling node, or • merge with a sibling node, and delete from a parent node • Recursively apply previous step if necessary

  17. B-tree Delete in Practice • Coalescing is usually not implemented because it’s too hard and not worth it

  18. Static Hash Index record hash function bucket blocks overflow blocks bucket directory Hash Index

  19. Hash Function • A commonly used hash function: K%B • K is the key value • B is the number of buckets

  20. Static Hash Index Example … • 4 buckets • Hash function: key%4 • 2 index entries per bucket block

  21. … Static Hash Index Example bucket directory bucket blocks • Insert the records with the following keys: 4, 3, 7, 17, 22, 10, 25, 33 key%4 0 1 2 record 3

  22. Dynamic Hashing • Problem of static hashing?? • Dynamic hashing • Extendable Hash Index

  23. Extendable Hash Index … • Maximum 2M buckets • M is maximum depth of index • Multiple buckets can share the same block • Inserting a new entry to a block that is already full would cause the block to split

  24. … Extendable Hash Index • Each block has a local depth L, which means that the hash values of the records in the block has the same rightmost L bit • The bucket directory keeps a global depthd, which is the highest local depth

  25. Extendable Hash Index Example • M=4 (i.e. could have at most 16 buckets) • Hash function: key%24 • 2 index entries per block

  26. Extendable Hashing (I) Bucket directory Bucket blocks d=0 L=0 insert 8 (i.e. 1000) Insert 11 (i.e. 1011) d=0 1000 L=0 1011 insert 4 (i.e. 0100)

  27. Extendable Hashing (II) Bucket directory Bucket blocks 1000 L=1 d=1 0 0100 1 1011 L=1 insert 14 (i.e. 1110)

  28. Extendable Hashing (III) Bucket directory Bucket blocks 1000 L=2 d=2 00 0100 10 01 1110 L=2 11 1011 L=1

  29. Readings • Textbook Chapter 21.1 – 21.4

More Related