B-Trees, Part 2: Hash-Based Indexes (R&G Chapter 10, Lecture 10)
Administrivia • Homework 3 is now available • Due 1 week from Sunday • Homework 4 available the week after • Midterm exams available here
Review • Last time discussed File Organization • Unordered heap files • Sorted Files • Clustered Trees • Unclustered Trees • Unclustered Hash Tables • Indexes • B-Trees – dynamic, good for changing data, range queries • Hash tables – fastest for equality queries, useless for range queries
Review (2) • For any index, 3 alternatives for data entries k*: • Data record with key value k • <k, rid of data record with search key value k> • <k, list of rids of data records with search key k> • Choice orthogonal to the indexing technique
Today: • Indexes • Composite Keys, Index-Only Plans • B-Trees • details of insertion and deletion • Hash Indexes • How to implement with changing data sets
Inserting a Data Entry into a B+ Tree • Find correct leaf L. • Put data entry onto L. • If L has enough space, done! • Else, must split L (into L and a new node L2) • Redistribute entries evenly, copy up middle key. • Insert index entry pointing to L2 into parent of L. • This can happen recursively • To split index node, redistribute entries evenly, but push up middle key. (Contrast with leaf splits.) • Splits “grow” tree; root split increases height. • Tree growth: gets wider or one level taller at top.
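The copy-up versus push-up distinction above can be sketched in a few lines. This is an illustrative sketch, not the homework's required implementation; the function names and the convention that the copied-up key stays in the right leaf are assumptions.

```python
# Sketch of B+-tree node splits (illustrative, not the assigned code).

def split_leaf(entries):
    """Split a sorted, overfull leaf; return (left, right, copy_up_key).

    Leaf split: the middle key is COPIED up to the parent, and the
    entry itself remains in the right leaf.
    """
    mid = len(entries) // 2
    left, right = entries[:mid], entries[mid:]
    return left, right, right[0]

def split_index(keys):
    """Split an overfull index node; return (left, right, push_up_key).

    Index split: the middle key is PUSHED up; it leaves both halves
    and lives only in the parent.
    """
    mid = len(keys) // 2
    return keys[:mid], keys[mid + 1:], keys[mid]
```

Note how `split_index` drops `keys[mid]` from both halves, while `split_leaf` keeps it in the right half; this is exactly the contrast between leaf and index splits on the slide.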
Indexes with Composite Search Keys • Composite search keys: search on a combination of fields. • Equality query: every field value is equal to a constant. E.g., w.r.t. a <sal,age> index: age=20 and sal=75 • Range query: some field value is not a constant. E.g.: age=20; or age=20 and sal>10 • Data entries in the index are sorted by search key to support range queries: • Lexicographic order, or • Spatial order. [Figure: data records (bob, 12, 10), (cal, 11, 80), (joe, 12, 20), (sue, 13, 75) sorted by name, with data entries shown in four indexes in lexicographic order: <age,sal> (11,80; 12,10; 12,20; 13,75), <age> (11; 12; 12; 13), <sal,age> (10,12; 20,12; 75,13; 80,11), and <sal> (10; 20; 75; 80).]
Composite Search Keys • To retrieve Emp records with age=30 AND sal=4000, an index on <age,sal> would be better than an index on age or an index on sal. • Choice of index key orthogonal to clustering etc. • If condition is: 20<age<30 AND 3000<sal<5000: • Clustered tree index on <age,sal> or <sal,age> is best. • If condition is: age=30 AND 3000<sal<5000: • Clustered <age,sal> index much better than <sal,age> index! • Composite indexes are larger, updated more often.
Index-Only Plans • A number of queries can be answered without retrieving any tuples from one or more of the relations involved, if a suitable index is available. Examples (each query followed by the index that enables an index-only plan): • SELECT D.mgr FROM Dept D, Emp E WHERE D.dno=E.dno: <E.dno> • SELECT D.mgr, E.eid FROM Dept D, Emp E WHERE D.dno=E.dno: <E.dno,E.eid> • SELECT E.dno, COUNT(*) FROM Emp E GROUP BY E.dno: <E.dno> • SELECT E.dno, MIN(E.sal) FROM Emp E GROUP BY E.dno: <E.dno,E.sal> (tree index!) • SELECT AVG(E.sal) FROM Emp E WHERE E.age=25 AND E.sal BETWEEN 3000 AND 5000: <E.age,E.sal> or <E.sal,E.age> (tree index!)
Index-Only Plans (Contd.) SELECT E.dno, COUNT (*) FROM Emp E WHERE E.age=30 GROUP BY E.dno • Index-only plans are possible if the key is <dno,age> or we have a tree index with key <age,dno> • Which is better? • What if we consider the second query? SELECT E.dno, COUNT (*) FROM Emp E WHERE E.age>30 GROUP BY E.dno
B-Trees: Insertion and Deletion • Insertion • Find leaf where new record belongs • If leaf is full, redistribute* • If siblings too full, split, copy middle key up • If index too full, redistribute* • If index siblings full, split, push middle key up • Deletion • Find leaf where record exists, remove it • If leaf is less than 50% empty, redistribute* • If siblings too empty, merge, remove key above • If index node above too empty, redistribute* • If index siblings too empty, merge, move above key down
B-Trees: For Homework 2 • Insertion • Find leaf where new record belongs • If leaf is full, redistribute* • If siblings too full, split, copy middle key up • If index too full, redistribute* • If index siblings full, split, push middle key up • Deletion • Find leaf where record exists, remove it • If leaf is less than 50% empty, redistribute* • If siblings too empty, merge, remove key above • If index node above too empty, redistribute* • If index siblings too empty, merge, move above key down This means that after deletion, nodes will often be < 50% full
B-Trees: For Homework 2 (cont) • Splits • When splitting nodes, choose the middle key • If there is an even number of keys, choose the middle • Your code must handle duplicate keys • We promise that there will never be more than 1 page of duplicate values. • Thus, when splitting, if the middle key is identical to the key to its left, you must find the closest splittable key to the middle.
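The "closest splittable key" rule can be sketched as a small search outward from the middle. This is a hedged illustration of one reasonable reading of the rule, not the homework's reference solution; the function name and tie-breaking order are assumptions.

```python
# Sketch: pick a split position that does not separate duplicate keys.
# A position i is "splittable" when keys[i] != keys[i-1].

def choose_split(keys):
    """Return an index i so keys[:i] / keys[i:] never splits duplicates."""
    mid = len(keys) // 2
    if keys[mid] != keys[mid - 1]:
        return mid                      # middle already splittable
    # Scan outward from the middle for the nearest boundary between
    # two distinct keys (left candidate checked before right here).
    for off in range(1, len(keys)):
        for i in (mid - off, mid + off):
            if 0 < i < len(keys) and keys[i] != keys[i - 1]:
                return i
    return mid  # all keys identical: no splittable point exists
```

For example, `choose_split([1, 2, 2, 2, 3])` avoids cutting inside the run of 2s and splits at the 1/2 boundary instead.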
Hashing • Static and dynamic hashing techniques exist; the trade-offs depend on how much the data changes over time • Static Hashing • Good if the data never changes • Extendible Hashing • Uses a directory to handle changing data • Linear Hashing • Avoids a directory, usually faster
Static Hashing • # primary pages fixed, allocated sequentially, never de-allocated; overflow pages if needed. • h(key) mod N = bucket to which data entry with key k belongs. (N = # of buckets) [Figure: h maps each key to one of the primary bucket pages 0 .. N-1; full primary pages chain to overflow pages.]
Static Hashing (Contd.) • Buckets contain data entries. • Hash fn works on search key field of record r. Must distribute values over range 0 ... N-1. • h(key) = (a * key + b) usually works well. • a and b are constants; lots known about how to tune h. • Long overflow chains can develop and degrade performance. • Extendible and Linear Hashing: dynamic techniques to fix this problem.
Extendible Hashing • Situation: Bucket (primary page) becomes full. • Why not re-organize file by doubling # of buckets? • Reading and writing all pages is expensive! • Idea: Use directory of pointers to buckets, • double # of buckets by doubling the directory, • splitting just the bucket that overflowed! • Directory much smaller than file, doubling much cheaper. • No overflow pages! • Trick lies in how hash function is adjusted!
Extendible Hashing Details • Need directory with pointer to each bucket • Need hash function to incrementally double range • can just use increasingly more LSBs of h(key) • Must keep track of global “depth” • how many times directory doubled so far • Must keep track of local “depth” • how many times each bucket has been split
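Using increasingly more least-significant bits of h(key) is just a mask; a one-line sketch (the function name is illustrative):

```python
# Directory lookup in extendible hashing: use the last `global_depth`
# bits of the hashed key as the index into the directory array.

def dir_index(hashed, global_depth):
    """Return the directory slot for a hashed key."""
    return hashed & ((1 << global_depth) - 1)
```

With global depth 2, h(r) = 5 = binary 101 masks to 01; after the directory doubles to depth 3, the same value masks to 101, so the range doubles without changing h itself.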
Example • Directory is array of size 4. • To find bucket for r, take last `global depth’ # bits of h(r); we denote r by h(r). • If h(r) = 5 = binary 101, it is in bucket pointed to by 01. • Insert: If bucket is full, split it (allocate new page, re-distribute). • If necessary, double the directory. (As we will see, splitting a bucket does not always require doubling; we can tell by comparing global depth with local depth for the split bucket.) [Figure: global depth 2; directory entries 00, 01, 10, 11 point to data pages bucket A (4*, 12*, 32*, 16*), bucket B (1*, 5*, 21*, 13*), bucket C (10*), bucket D (15*, 7*, 19*), each with local depth 2.]
Insert h(r)=20 (Causes Doubling) [Figure: 20 = binary 10100 lands in full bucket A, so A splits into bucket A (32*, 16*) and `split image' A2 (4*, 12*, 20*), each with local depth 3. Because A's local depth equaled the global depth, the directory doubles from entries 00..11 to 000..111 and global depth becomes 3.]
Now Insert h(r)=9 (Causes Split Only) [Figure: 9 = binary 1001 lands in full bucket B, which splits into bucket B (1*, 9*) and split image B2 (5*, 21*, 13*), each with local depth 3. B's local depth (2) was less than the global depth (3), so the directory does not double; only directory entry 101 is redirected to B2.]
Points to Note • 20 = binary 10100. Last 2 bits (00) tell us r belongs in A or A2. Last 3 bits are needed to tell which. • Global depth of directory: max # of bits needed to tell which bucket an entry belongs to. • Local depth of a bucket: # of bits used to determine membership in that bucket. • When does a bucket split cause directory doubling? • If, before the insert, local depth of the bucket = global depth. • The insert causes local depth to become > global depth; • the directory is doubled by copying it over and `fixing’ the pointer to the split image page. • (Use of least significant bits enables efficient doubling via copying of the directory!)
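The doubling decision on this slide reduces to a single comparison; a sketch (illustrative names, tracking only the depths):

```python
# Extendible hashing split bookkeeping: the directory doubles only when
# the overflowing bucket's local depth equals the global depth.

def handle_split(global_depth, local_depth):
    """Return (new_global_depth, new_local_depth, directory_doubled)."""
    doubled = (local_depth == global_depth)
    if doubled:
        global_depth += 1      # directory doubles (copy + fix one pointer)
    # The split bucket and its split image both get local depth + 1.
    return global_depth, local_depth + 1, doubled
```

Inserting 20 above hit a bucket with local depth = global depth = 2, so the directory doubled; inserting 9 hit local depth 2 < global depth 3, so it did not.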
Directory Doubling • Why use least significant bits in directory? • Allows for doubling via copying! [Figure: the 2-bit directory 00, 01, 10, 11 becomes the 3-bit directory 000 .. 111 simply by appending a second copy of itself; every old pointer remains valid for both of its new prefixes until a split redirects one of them.]
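Why LSBs make doubling cheap can be shown in one line: the new directory is literally two copies of the old one, since slot i and slot i + 2^depth share the same low-order bits. A sketch with an illustrative function name:

```python
# Doubling an extendible-hashing directory when LSBs index the directory:
# slot i and slot i + len(directory) have identical low-order bits, so
# the doubled directory is just the old array concatenated with itself.

def double_directory(directory):
    """Return the doubled directory; pointers are unchanged."""
    return directory + directory
```

After the copy, only the single pointer for the freshly split bucket's image needs to be fixed; with most-significant bits this cheap copy would not work, because adjacent slots would have to be interleaved instead.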
Deletion • Delete: If removal of data entry makes bucket empty, it can be merged with its `split image’. If each directory element points to the same bucket as its split image, the directory can be halved. [Figure: deleting 10 empties bucket C, which is merged with its split image; the merged bucket's local depth drops from 2 to 1.]
Deletion (cont) [Figure, continuing the example: deleting 15 empties the bucket holding 15*, which merges with its split image. Now every directory element points to the same bucket as its split image, so the directory is halved from size 4 to size 2 and global depth drops from 2 to 1.]
Comments on Extendible Hashing • If directory fits in memory, equality search answered with one disk access; else two. • 100MB file, 100 bytes/rec, 4K pages contains 1,000,000 records (as data entries) and 25,000 directory elements; chances are high that directory will fit in memory. • Directory grows in spurts, and, if the distribution of hash values is skewed, directory can grow large. • Biggest problem: • Multiple entries with the same hash value cause problems! • If a bucket is already full of entries with the same hash value, splitting will keep doubling the directory forever! So overflow buckets must be used when there are duplicates.
Linear Hashing • This is another dynamic hashing scheme, an alternative to Extendible Hashing. • LH handles the problem of long overflow chains without using a directory, and handles duplicates. • Idea: Use a family of hash functions h0, h1, h2, ... • h_i(key) = h(key) mod (2^i * N); N = initial # buckets • h is some hash function (range is not 0 to N-1) • If N = 2^d0, for some d0, h_i consists of applying h and looking at the last d_i bits, where d_i = d0 + i. • h_{i+1} doubles the range of h_i (similar to directory doubling)
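The hash family is mechanical enough to sketch directly; here the base hash h is taken to be the identity on integer keys purely for illustration.

```python
# The family of hash functions used by linear hashing:
#   h_i(key) = h(key) mod (2**i * N)
N = 4                # initial number of buckets (N = 2**d0 with d0 = 2)

def h(key):
    """Illustrative base hash; its range is wider than 0 .. N-1."""
    return key       # identity, for demonstration only

def h_i(i, key):
    """i-th member of the family; each step doubles the range."""
    return h(key) % (2 ** i * N)
```

With N = 4, h_0 keeps the last 2 bits of h(key) and h_1 the last 3, matching the d_i = d0 + i formulation on the slide.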
Linear Hashing Example • Let’s start with N = 4 buckets • Start at “round” 0, “next” 0; have N * 2^round buckets • Each time any bucket fills, split the “next” bucket • If 0 ≤ h_round(key) < Next, use h_{round+1}(key) instead [Figure: Start: buckets {32*, 16*, 4*, 12*}, {1*, 5*, 21*, 13*}, {10*}, {15*}, Next = 0. Add 9: bucket 1 fills, so bucket Next = 0 is split with h1 (mod 8) into {32*, 16*} and split image {4*, 12*}; Next advances to 1. Add 20: h0(20) = 0 < Next, so h1(20) = 4 places it in the split image: {4*, 12*, 20*}.]
Linear Hashing Example (cont) • Overflow chains do exist, but eventually get split • Instead of doubling, new buckets are added one-at-a-time [Figure: Add 6: h0(6) = 2, so bucket 2 becomes {10*, 6*}. Add 17: h0(17) = 1 overflows bucket 1, triggering the split of bucket Next = 1 with h1: its entries redistribute into bucket 1 ({1*, 9*, 17*}) and split image bucket 5 ({5*, 21*, 13*}); Next advances to 2.]
Linear Hashing (Contd.) • Directory avoided in LH by using overflow pages, and choosing bucket to split round-robin. • Splitting proceeds in `rounds’. A round ends when all N_R initial buckets (for round R) are split. Buckets 0 to Next-1 have been split; Next to N_R yet to be split. • Current round number also called Level. • Search: To find bucket for data entry r, find h_round(r): • If h_round(r) is in range `Next to N_R’, r belongs there. • Else, r could belong to bucket h_round(r) or bucket h_round(r) + N_R; • must apply h_{round+1}(r) to find out.
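The search rule above can be sketched as a short lookup function. As in the earlier sketch, the base hash is taken to be the identity on integer keys, and the function name is illustrative.

```python
# Bucket lookup for linear hashing: buckets in [Next, N_R) are unsplit,
# so h_round suffices; buckets below Next were split this round, so
# h_{round+1} decides between the bucket and its split image.

def find_bucket(key, rnd, next_ptr, n0=4):
    """Return the bucket number for `key` (base hash = identity)."""
    n_r = (2 ** rnd) * n0          # buckets at the start of round `rnd`
    b = key % n_r                  # h_round(key)
    if b < next_ptr:               # this bucket was already split
        b = key % (2 * n_r)        # h_{round+1}(key): bucket or image
    return b
```

In the running example (round 0, Next = 1), key 20 maps to bucket 0 under h0 but 0 < Next, so h1 sends it to split image bucket 4, while key 9 maps to the unsplit bucket 1 directly.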
Overview of LH File • In the middle of a round: [Figure: buckets 0 to Next-1 were split earlier in this round; bucket Next is the next to be split; buckets from Next up to N_Level existed at the beginning of the round (this is the range of h_Level); the `split image' buckets at the end were created by splits in this round. If h_Level(search key value) falls in the range Next to N_Level, the entry is in that bucket; otherwise h_{Level+1}(search key value) must be used to decide between the bucket and its `split image' bucket.]
Linear Hashing (Contd.) • Insert: Find bucket by applying h_round / h_{round+1}: • If bucket to insert into is full: • Add overflow page and insert data entry. • (Maybe) Split Next bucket and increment Next. • Can choose any criterion to `trigger’ a split. • Since buckets are split round-robin, long overflow chains don’t develop! • Doubling of directory in Extendible Hashing is similar; switching of hash functions is implicit in how the # of bits examined is increased.
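One round-robin split step, including the end-of-round reset, can be sketched as follows. This is a simplified model (base hash = identity, buckets as plain lists, no page-size bookkeeping), with illustrative names.

```python
# One split step in linear hashing: redistribute bucket Next with
# h_{round+1}, append its split image, advance Next, and end the round
# when Next reaches N_R.

def split_next(buckets, rnd, next_ptr, n0=4):
    """Split bucket `next_ptr` in place; return (round, next_ptr)."""
    n_r = (2 ** rnd) * n0
    old = buckets[next_ptr]
    # h_{round+1} = key mod 2*n_r sends each entry to the old bucket
    # or to its split image at next_ptr + n_r.
    buckets[next_ptr] = [k for k in old if k % (2 * n_r) == next_ptr]
    buckets.append([k for k in old if k % (2 * n_r) == next_ptr + n_r])
    next_ptr += 1
    if next_ptr == n_r:            # all original buckets split:
        next_ptr, rnd = 0, rnd + 1  # round ends, Next resets
    return rnd, next_ptr
```

Running one step on the example's starting state splits bucket 0 ({32, 16, 4, 12}) into {32, 16} and split image {4, 12}, with Next advancing to 1.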
Another Example of Linear Hashing • On split, h_{Level+1} is used to re-distribute entries. [Figure: Round = 0, N = 4. Before: buckets 00 {32*, 44*, 36*}, 01 {9*, 25*, 5*}, 10 {30*, 10*, 14*, 18*}, 11 {31*, 35*, 7*, 11*}, Next = 0; data entry r with h(r) = 5 goes to bucket 01. After inserting 43*: bucket 11 is full, so 43* goes on an overflow page, triggering the split of bucket Next = 0 via h1: bucket 000 keeps 32*, split image 100 gets {44*, 36*}; Next = 1.]
Example: End of a Round [Figure: Round = 0, Next = 3: buckets 000 {32*}, 001 {9*, 25*}, 010 {66*, 18*, 10*, 34*}, 011 {31*, 35*, 7*, 11*} + overflow {43*}, 100 {44*, 36*}, 101 {5*, 37*, 29*}, 110 {14*, 30*, 22*}. Inserting 50* (h1(50) = 2, since bucket 2 is already split) fills bucket 010, adding an overflow page and triggering the split of bucket Next = 3 via h1: 011 becomes {35*, 11*, 43*} with split image 111 {31*, 7*}. Next now equals the number of buckets at the start of the round, so the round ends: Round becomes 1 and Next resets to 0.]
LH Described as a Variant of EH • The two schemes are actually quite similar: • Begin with an EH index where directory has N elements. • Use overflow pages, split buckets round-robin. • First split is at bucket 0. (Imagine directory being doubled at this point.) But elements <1,N+1>, <2,N+2>, ... are the same. So, need only create directory element N, which now differs from element 0. • When bucket 1 splits, create directory element N+1, etc. • So, the directory can double gradually. Also, primary bucket pages are created in order. If they are allocated in sequence too (so that finding the i’th is easy), we actually don’t need a directory! Voila, LH.
Summary • Hash-based indexes: best for equality searches, cannot support range searches. • Static Hashing can lead to long overflow chains. • Extendible Hashing avoids overflow pages by splitting a full bucket when a new data entry is to be added to it. (Duplicates may require overflow pages.) • Directory to keep track of buckets, doubles periodically. • Can get large with skewed data; additional I/O if this does not fit in main memory.
Summary (Contd.) • Linear Hashing avoids directory by splitting buckets round-robin, and using overflow pages. • Overflow pages not likely to be long. • Duplicates handled easily. • Space utilization could be lower than Extendible Hashing, since splits not concentrated on `dense’ data areas. • Can tune criterion for triggering splits to trade-off slightly longer chains for better space utilization. • For hash-based indexes, a skewed data distribution is one in which the hash values of data entries are not uniformly distributed!