Indexing CS157B Lecture 9
Contents • Basic Concepts • Ordered Indices • B+-Tree Index Files • B-Tree Index Files
B+-Tree Node Structure • Typical node • Ki are the search-key values • Pi are pointers to children (for non-leaf nodes) or pointers to records or buckets of records (for leaf nodes). • The search keys in a node are ordered: K1 < K2 < K3 < … < Kn-1
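A minimal Python sketch of the node layout just described — this class and its names are illustrative, not from the lecture: a node of order n holds up to n−1 sorted search keys and up to n pointers.

```python
class BPlusNode:
    """Sketch of a B+-tree node: up to n-1 search keys and n pointers."""

    def __init__(self, order, is_leaf):
        self.order = order      # n: maximum number of pointers in the node
        self.is_leaf = is_leaf
        self.keys = []          # search-key values K1 < K2 < ... < K(n-1)
        self.pointers = []      # children (non-leaf) or records/buckets (leaf)

    def keys_sorted(self):
        # The search keys in a node must be in strictly ascending order.
        return all(a < b for a, b in zip(self.keys, self.keys[1:]))


node = BPlusNode(order=5, is_leaf=True)
node.keys = [10, 20, 30]
print(node.keys_sorted())  # True
```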
Example of B+-tree • Leaf nodes must have between 2 and 4 values (⌈(n−1)/2⌉ and n−1, with n = 5). • Non-leaf nodes other than the root must have between 3 and 5 children (⌈n/2⌉ and n, with n = 5). • The root must have at least 2 children.
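The occupancy bounds above follow directly from the order n; a quick sketch of the arithmetic:

```python
import math

# Occupancy bounds for a B+-tree of order n, as stated above:
# leaves hold ceil((n-1)/2) .. n-1 values, internal nodes have
# ceil(n/2) .. n children.

def leaf_bounds(n):
    return math.ceil((n - 1) / 2), n - 1

def internal_bounds(n):
    return math.ceil(n / 2), n

print(leaf_bounds(5))      # (2, 4)
print(internal_bounds(5))  # (3, 5)
```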
Queries on B+-Trees (Cont.) • The above difference is significant, since every node access may need a disk I/O, costing around 20 milliseconds!
B+-Tree File Organization • Index file degradation problem is solved by using B+-Tree indices. Data file degradation problem is solved by using B+-Tree File Organization. • The leaf nodes in a B+-tree file organization store records, instead of pointers. • Since records are larger than pointers, the maximum number of records that can be stored in a leaf node is less than the number of pointers in a nonleaf node. • Leaf nodes are still required to be half full. • Insertion and deletion are handled in the same way as insertion and deletion of entries in a B+-tree index.
B+-Tree File Organization (Cont.) Example of B+-tree File Organization
B-Tree Index Files • Similar to a B+-tree, but a B-tree allows search-key values to appear only once; this eliminates redundant storage of search keys. • Search keys in nonleaf nodes appear nowhere else in the B-tree; an additional pointer field for each search key in a nonleaf node must be included. • Generalized B-tree leaf node and nonleaf node: the pointers Bi are the bucket or file-record pointers.
B-Tree Index Files (Cont.) Advantages of B-Tree indices: • May use fewer tree nodes than a corresponding B+-Tree. • Sometimes possible to find a search-key value before reaching a leaf node. Disadvantages of B-Tree indices: • Only a small fraction of all search-key values are found early • Non-leaf nodes are larger, so fan-out is reduced. Thus B-Trees typically have greater depth than the corresponding B+-Tree • Insertion and deletion are more complicated than in B+-Trees • Implementation is harder than for B+-Trees. Typically, the advantages of B-Trees do not outweigh the disadvantages.
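The fan-out/depth trade-off can be made concrete with a rough calculation; the entry counts and fan-outs below are made-up numbers, and the model assumes every node is full (a best-case sketch, not exact B-tree math):

```python
import math

def levels(num_entries, fanout):
    # Number of node accesses needed to reach an entry in a tree with
    # the given fan-out, assuming every node is completely full.
    return math.ceil(math.log(num_entries, fanout))

# A reduced fan-out (B-tree nodes are larger) means a deeper tree than
# a B+-tree with a bigger fan-out over the same data.
print(levels(1_000_000, 128))  # 3
print(levels(1_000_000, 16))   # 5
```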
Index Sequential • [Figure: a sparse index maps the last key value in each block (Dumpling → 1, Harty → 2, Texaci → 3) into a data file stored in sorted order: Block 1 (Adams, Becker, Dumpling), Block 2 (Getta, Harty, Mobile), Block 3 (Sunoci, Texaci).]
Indexed Sequential: Two Levels • [Figure: a two-level index; a top-level index (key values 385, 678, 805) points into a second-level index of key value/address pairs (150, 385, 536, 678, 785, 805 → blocks 1–6), which in turn points to data blocks holding sorted key runs 001–150, 251–385, 455–536, 605–678, 705–785, 791–805.]
Indexed Random • Key values of the physical records are not necessarily in logical sequence • The index may be stored and accessed with the Indexed Sequential Access Method • The index has an entry for every database record, in ascending order; the index keys are in logical sequence, while the database records are not necessarily in ascending sequence • The access method may be used for storage and retrieval
Indexed Random • [Figure: an index in ascending key order (Adams → block 2, Becker → 1, Dumpling → 3, Getta → 2, Harty → 1) pointing into data blocks whose records are not in logical sequence: Block 1 (Becker, Harty), Block 2 (Adams, Getta), Block 3 (Dumpling).]
Btree • [Figure: a B-tree with root keys (F, P, Z), second-level nodes (B, D, F), (H, L, P), (R, S, Z), and leaf records Aces, Boilers, Cars, Devils, Flyers, Hawkeyes, Hoosiers, Minors, Panthers, Seminoles.]
Inverted • Key values of the physical records are not necessarily in logical sequence • Access Method is better used for retrieval • An index for every field to be inverted may be built • Access efficiency depends on number of database records, levels of index, and storage allocated for index
Inverted • [Figure: an inversion on Course Number: CH 145 → records 101, 103, 104; CS 201 → 102; CS 623 → 105, 106; the student records (Adams, Becker, Dumpling, Getta, Harty, Mobile) are stored in blocks 1, 2, 3, …, each listing the student's courses.]
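Building an inversion on one field can be sketched in a few lines; the record numbers and courses below are the ones from the figure, but the dict-based layout is just an illustration:

```python
from collections import defaultdict

# Each record address maps to the value of the field being inverted
# (here, Course Number).
records = {101: "CH 145", 102: "CS 201", 103: "CH 145",
           104: "CH 145", 105: "CS 623", 106: "CS 623"}

# The inverted index: field value -> list of record addresses.
inverted = defaultdict(list)
for address, course in sorted(records.items()):
    inverted[course].append(address)

print(dict(inverted))
```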
Direct • Key values of the physical records are not necessarily in logical sequence • There is a one-to-one correspondence between a record key and the physical address of the record • May be used for storage and retrieval • Access efficiency always 1 • Storage efficiency depends on density of keys • No duplicate keys permitted
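The one-to-one key-to-address property can be sketched with a plain map standing in for the address computation; the key and address values here are hypothetical:

```python
# Direct access sketch: exactly one physical address per record key,
# no duplicates, and retrieval is always a single lookup.
addresses = {}

def store(key, address):
    if key in addresses:
        raise ValueError("no duplicate keys permitted")
    addresses[key] = address

def fetch(key):
    return addresses[key]   # access efficiency is always 1

store(1001, "block 7, offset 2")
print(fetch(1001))  # block 7, offset 2
```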
So Far • So far, when we do runtime analysis, we give each operation one time unit • Actually, we've been assuming that they are close enough in run time that every operation can be considered the same • This, of course, is not realistic, as things like hard drives and networking are much slower than anything we can calculate in the computer
Why the Processor and Main Memory Are Good • A processor can do about 2.5 billion instructions per second on a higher-end home PC these days • Data stored in main memory is accessible at a speed that matches the processor's • Imagine that we are storing a tree of 100 million elements (say, the number of bank transactions in a given month or year)
Continued • Even if it takes 20 CPU instructions (a generous estimate) to traverse a single node of a binary search tree (accessing data and processing it), we can still access 125 million nodes per second • That is all of the elements of a completely linear tree 1.25 times per second • Imagine that it takes 32 bytes to represent a key into the tree (what we order on) and 1 KB to store the data; then we need to store roughly 100,000,000,000 bytes, or about 100 GB of RAM, to run this procedure
Continued • Of course, 100 GB of RAM is not physically impossible, but it is far beyond a typical PC, and it leaves no RAM to run anything else (like the OS, perhaps), among other things • So, the processor/main memory is fast but not very practical; storing 100 GB of data on a hard drive, however, is nothing these days
Hard Drives • While the speed of processors goes up rapidly, hard drive capacity goes up too • But what about speed? • Most drives today run at 7,200 RPM • To get data, we have to wait on average half a rotation, about 4.2 ms • So we can do about 240 accesses per second • Remember processors? That was 125 million accesses per second
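The back-of-the-envelope numbers above come straight from the rotation speed:

```python
# 7,200 RPM drive; average rotational latency is half a rotation.
rotations_per_sec = 7200 / 60          # 120 rotations per second
avg_latency = 0.5 / rotations_per_sec  # ~4.17 ms per access
accesses_per_sec = 1 / avg_latency     # ~240 accesses per second

print(round(avg_latency * 1000, 2))  # 4.17
print(round(accesses_per_sec))       # 240
```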
Rough Comparisons • Based on our rough comparisons, a piece of data in main memory can be accessed about 500,000 times faster than data on a hard drive • But the reality is that main memory is very, very expensive, while hard drives are cheap with lots of storage • So we want to go with hard drives, but we need a way to improve the parts of the runtime that are slow for hard drives
Past BSTs • If you look at our past BSTs, even the good balanced trees, we'd have to do, at best, an average of O(log n) node accesses to find an element • log2 100,000,000 ≈ 26.6 • So, in a balanced BST, we would have to do about 27 node accesses in the worst case • This is a nearly immeasurable amount of time if we are using only main memory, but it's about 1/10th of a second from a hard drive
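Checking that 1/10th-of-a-second figure, using the roughly 4 ms per disk access worked out earlier:

```python
import math

# Worst-case node accesses in a balanced BST over 100 million keys,
# and what they cost if each access is a ~4 ms disk read.
accesses = math.ceil(math.log2(100_000_000))  # 27
disk_time = accesses * 0.004                  # seconds

print(accesses, disk_time)  # ~0.1 second from a hard drive
```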
Goal • Our goal is to make a tree such that the number of accesses to find a node is greatly reduced • If we can do a lot of heavy, fast calculations in order to do very few disk operations, we will have improved the runtime greatly • It's okay to do a lot of calculations -- in the time it takes for one disk access, we can do about 10 million instructions in the processor
Height • The biggest problem is tree height • Think about what happens in a BST: access a node, decide which way to go, then repeat in the chosen direction • We have to do this for every node on the path, which can cause problems • We want a tree with a smaller height; at each level, we'll access the disk only once • To reduce height, we have a relatively simple solution: increase branching
The Good Ol' N-ary Tree • We almost always talk about binary trees, for a good reason: in main memory, who cares what the height is? We have plenty of speed, and O(log n) is good enough • Hidden in that is a detail: log n is actually log2 n • In a trinary tree, the runtime is also O(log n), but (as with constants) we left out the base, and it really runs in log3 n time • This may not make a huge difference for main memory, but for disk access it's about 10 fewer accesses, or roughly 0.04 sec saved (a huge improvement)
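The "about 10 fewer accesses" claim is just a change of logarithm base, easy to check for our 100-million-element tree:

```python
import math

n = 100_000_000
binary = math.log2(n)     # ~26.6 levels in a binary tree
ternary = math.log(n, 3)  # ~16.8 levels in a trinary tree

# About 10 fewer node accesses; at ~4 ms per disk access, that is
# roughly 0.04 seconds saved per search.
print(round(binary - ternary, 1))  # 9.8
```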
The B-Tree • The B-tree is an n-ary tree with all of the data collected at the leaf level • An n-ary tree works like a BST, only we do a slightly more complicated calculation to figure out which path to take • We also need to guarantee that the n-ary tree stays balanced, or it may degenerate like an unbalanced BST; this is the first property of a B-tree: good balancing
B-Tree Properties • Like a red-black tree, we have certain rules to follow in our n-ary tree: 1) All data is stored at the leaf level 2) Non-leaf nodes store up to M−1 keys to guide the search along the right path 3) The root is a leaf or has 2 to M children 4) All non-leaf nodes (except the root) have between M/2 and M children 5) All leaves are at the same depth and are at least half-full, except if the root is a leaf
What It Looks Like • [Figure: a two-level B-tree whose leaves hold (1, 10, 12), (16, 21), (30, 41, 42, 43), (50, 51).] • Note that even though we have 11 elements in the tree, a search involves only 2 node accesses, compared with about log2 11 ≈ 3.5 for a binary tree
Better? • This is much better • Now we've decreased the number of nodes to be accessed and decreased the height, making everything shallower • 4-ary isn't really the way to go here; what ends up happening in practice is a much bigger branching factor, say 200-ary, but either way it's still better than binary • Note: we have to assume that all the information in one node is stored contiguously on disk, so the hard drive doesn't have to reposition itself (slow!) just for one unfinished node
B-Tree • Now we're ready to get into B-Trees • We need to know how to add to and remove from the B-Tree, and as always, we'll focus on insertion first
Summary of Why • Writing to the hard disk is an expensive operation • In a large system, we'll have to read and write from the disk often, and we want to minimize the number of expensive operations for a good runtime • We need a structure that will take into account the cost of accessing a disk and do that as little as possible while using the structure
Summary of How • When the HD is accessed, a fixed-size block of memory is read • We'll make each node of a tree one of these blocks • For efficiency's sake, we want to maximize the amount of data in these nodes/blocks • With this setup, we can define quite a few algorithms that deal with data in large blocks, such as the ones governing the structure of a B-tree
The Rules • 1) All data is stored at the leaf level 2) Non-leaf nodes store keys that guide the search down the path to the leaf holding the element 3) The root is a leaf or has 2 to M children 4) All non-leaf nodes have between M/2 and M children (at least half-full) 5) All leaves are at the same level 6) All leaves are always at least half-full, unless the root is a leaf itself
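Rules 3 and 4 can be expressed as a small invariant check; this helper is a sketch of my own, not part of the lecture, and it rounds M/2 up as the earlier B+-tree example did:

```python
import math

def node_ok(num_children, M, is_root=False):
    """Check the child-count rule for a non-leaf node (M = max children)."""
    if is_root:
        return 2 <= num_children <= M          # rule 3
    return math.ceil(M / 2) <= num_children <= M  # rule 4: at least half-full

print(node_ok(3, 5))                # True: at least half-full
print(node_ok(2, 5))                # False for a non-root node
print(node_ok(2, 5, is_root=True))  # True: the root may have only 2
```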
Starting Out • So what do we really have if we have an empty B-tree? • Essentially, we have one empty storage container, which is the size of a block on the HD (say, b bytes) and can store b/s elements of size s bytes • So, the root starts out as a leaf, which just stores data
Inserting the First Elements • The root/leaf has a set amount of space that we can keep adding elements to • We may want to keep it sorted for easier searching • In fact, is there any reason not to keep a leaf sorted? • No, because the time it takes to order the elements is miniscule compared to how long it takes to write the result to the hard drive • Okay, so, while we have space in the leaf, we add each new element in sorted order
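The in-memory sorted insert described above is one line with Python's `bisect` module, and as the slide says, it is cheap next to the disk write:

```python
import bisect

# Keep a leaf's keys sorted as elements arrive.
leaf = []
for key in [42, 7, 19, 3]:
    bisect.insort(leaf, key)

print(leaf)  # [3, 7, 19, 42]
```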
Notes: Element? • This is more of a databases topic, but what is normally stored in the B-tree is a reference to another object or piece of data, which may be, for instance, a location somewhere else on the disk • We also need to store the key that something is being searched on • We are keeping, then, a {key, object} pair, where the key is simply a search item (like an ID number) and the object is just a pointer to something else • We'll learn lots more about {key, object} pairs when we start talking about hashtables
Uh Oh! Full! • It works fine to keep adding elements, but as we said, each block is a set amount of space; what happens when we fill it up with key-object pairs? • Recall one of the rules of a B-tree: all leaves are always at least half-full • If we know this, we know we can take that original, full leaf node and split it into two nodes
Notes: Split? • Every block on an HD is the same size, and each element (key, object) occupies the same amount of space • If we split the number of elements down the middle, then we'll create two half-full nodes
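Splitting down the middle, as just described, is a simple list operation; the (key, object) pairs below are made-up placeholders:

```python
# Split a full leaf's sorted (key, object) pairs into two half-full leaves.
def split_leaf(entries):
    mid = len(entries) // 2
    return entries[:mid], entries[mid:]


full = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
left, right = split_leaf(full)
print(left, right)  # [(1, 'a'), (2, 'b')] [(3, 'c'), (4, 'd')]
```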
Notes: Full? • Just when does a node become full, or, more importantly, when do we do a split of a leaf node? • The common way is: whenever you try to add an element to a completely full node, you split the node and then add the element again • Another way I like is to check whether a split is needed after an addition; this way, you don't have to keep track of the element to be added during the split -- it's already there
The Split • When the node gets full, we'll split it into two nodes • But what happens then? This is a tree; we need connections to our leaf nodes • In the first case of a split, when the root is the leaf and it splits, we end up with two leaves, which gives us reason to make a new, non-leaf node that connects the two leaf nodes together
Notes: The Non-leaf Node • A leaf node stores key-object pairs in sorted order... what would the non-leaf node store? • One of our rules: non-leaf nodes store keys that guide the search down the path to a leaf • So, the non-leaf nodes will store keys; they also need to store links to other nodes
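How a non-leaf node's keys guide the search can be sketched with `bisect`; the convention used here (follow the child indexed by the number of keys less than or equal to the target) is one common choice, not something the lecture fixes:

```python
import bisect

def choose_child(keys, target):
    # With sorted keys [k1, ..., km] and m+1 child links, follow child i
    # where i = number of keys <= target.
    return bisect.bisect_right(keys, target)


keys = [10, 20, 30]
print(choose_child(keys, 5))   # 0 -> leftmost child
print(choose_child(keys, 25))  # 2
```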