CSC 213 – Large Scale Programming Lecture 39: Greek Tragedy & Balanced Trees
Today’s Goals • Review why a new search tree algorithm is needed • What real-world problems occur with the old tree? • Why does garbage collection make the problem worse? • What was the ideal approach? How could we force this? • Consider how to create other search tree types • What could happen if nodes are not limited to 1 element? • How to perform insertions on multi-nodes? • What about withdrawal? How can we remove data? • Can this sound dirtier? And do I hear banjos playing?
Dictionary ADT • Dictionary and Map map keys to values • O(1) time with hash, but only if hash is good • Balanced BST can guarantee a better worst case: O(log n) • Assumes data fits in memory, since tree nodes have poor locality • But, honestly, how big can a tree be? • Library of Congress: 20 TB in text database • Amazon.com: 42 TB of combined data • ChoicePoint: 250 TB of data on everyday Americans • World Data Center for Climate: 4 PB of climate data • (Numbers gathered from Feb. 2007 article)
Optimal Tree Partition But no GC algorithm produces this!
Real-World Big Search Trees • Excellent way to test roommate’s system
(a,b) Trees to the Rescue! • General solution to frequent hikes to Germany • Linux & MacOS to track files & directories • MySQL & other databases use this to hold all the data • Found in many other places where paging occurs • Simple rules define working of any (a,b) tree • Grows upward so that all leaves found at same level • At least a children for each internal node • Every internal node has at most b children
What is “the BTree?” • Common multi-way tree implementation • Describe B-Tree using order (“BTree of order m”) • m/2 to m children per internal node • Root node can have m or fewer elements • Many variants exist to improve some failing • Each variant is specialized for some niche use • Minor differences only between each variant • Will just describe most basic B-Tree during lecture
BTree Order • Select order minimizing paging when created • Elements & references to kids in a full node fill a page • Nodes have at least m/2 elements, even at their smallest • In memory, guarantees each page is at least 50% full • How many pages touched by each operation?
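The "full node fills a page" rule can be turned into a small calculation. The sketch below is only an illustration: the class name `BTreeOrderSketch`, the method `maxOrder`, and the page, reference, and element sizes are all assumptions, not values from the lecture.

```java
// Sketch: pick the largest B-tree order m whose full node still fits in one page.
// All sizes passed in are assumed values chosen for illustration.
class BTreeOrderSketch {
    /** Largest m such that m child references plus (m - 1) elements fit in a page. */
    static int maxOrder(int pageBytes, int referenceBytes, int elementBytes) {
        // Solve m * referenceBytes + (m - 1) * elementBytes <= pageBytes for m.
        return (pageBytes + elementBytes) / (referenceBytes + elementBytes);
    }

    public static void main(String[] args) {
        // Hypothetical machine: 4 KiB pages, 8-byte references, 16-byte elements.
        int m = maxOrder(4096, 8, 16);
        System.out.println("order m = " + m); // order m = 171
    }
}
```

With these assumed sizes a full node uses 171 * 8 + 170 * 16 = 4088 of the 4096 bytes, and order 172 would no longer fit.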
Multi-Way Search Tree • Nodes contain multiple elements • Tree grows up with leaves always at same level • Each internal node: • At least 2 children • 1 fewer Entry than children • Entries sorted from smallest to largest • (Figure: root [11 24] with children [2 6 8], [15], [27 30])
Multi-Way Search Tree • Children v_1, v_2, v_3, …, v_d & keys k_1, k_2, …, k_{d-1} • Keys in subtree v_1 smaller than k_1 • Keys in subtree v_i between k_{i-1} and k_i • Keys in subtree v_d greater than k_{d-1} • (Figure: root [11 24] with children [2 6 8], [15], [27 30])
Inorder Traversal • Visit each child, v_i, before visiting Entry e_i • As with BST, visits keys in increasing order • (Figure: root [11 24] with children [2 6 8], [15], [27 30]; visit order 2, 6, 8, 11, 15, 24, 27, 30)
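A minimal Java sketch of this traversal, using the tree with root keys 11 and 24 from the slides. The class `MultiWayNode` and its fields are hypothetical names, and the bottom level is modeled as nodes with an empty child list rather than explicit external nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of a multi-way node: d children and d - 1 sorted keys.
// MultiWayNode and its members are hypothetical names, not the lecture's code.
class MultiWayNode {
    final List<Integer> keys = new ArrayList<>();
    final List<MultiWayNode> children = new ArrayList<>(); // empty at the bottom level

    MultiWayNode(Integer... ks) { keys.addAll(Arrays.asList(ks)); }

    /** Appends keys in increasing order: child i is visited before key i. */
    void inorder(List<Integer> out) {
        for (int i = 0; i < keys.size(); i++) {
            if (!children.isEmpty()) children.get(i).inorder(out);
            out.add(keys.get(i));
        }
        // The last child holds the keys greater than every key in this node.
        if (!children.isEmpty()) children.get(children.size() - 1).inorder(out);
    }

    /** Builds the slide's tree: root [11 24] over [2 6 8], [15], [27 30]. */
    static MultiWayNode example() {
        MultiWayNode root = new MultiWayNode(11, 24);
        root.children.add(new MultiWayNode(2, 6, 8));
        root.children.add(new MultiWayNode(15));
        root.children.add(new MultiWayNode(27, 30));
        return root;
    }

    public static void main(String[] args) {
        List<Integer> out = new ArrayList<>();
        example().inorder(out);
        System.out.println(out); // [2, 6, 8, 11, 15, 24, 27, 30]
    }
}
```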
Multi-Way Searching • Similar to BST treeSearch finding a key
  for i = 0 to e.length - 1 do
    if k < e[i].getKey() then return search(child[i])
    if k == e[i].getKey() then return e[i]
  endfor
  if k > e[e.length-1].getKey() then return search(child[child.length-1])
Multi-Way Searching • Example: find(8) • At root [11 24]: 8 < 11, so search child [2 6 8] • At [2 6 8]: 8 > 2, 8 > 6, 8 == 8, so return that Entry • (Figure: root [11 24] with children [2 6 8], [15], [27 30])
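The search pseudocode could look roughly like this in Java. This is a sketch under assumptions: `SearchNode` and `contains` are hypothetical names, keys are plain integers instead of Entry objects, the method returns a boolean rather than the Entry, and reaching the bottom level without a match counts as "not found".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of multi-way search; SearchNode is a hypothetical class name.
class SearchNode {
    final List<Integer> keys = new ArrayList<>();
    final List<SearchNode> children = new ArrayList<>(); // empty at the bottom level

    SearchNode(Integer... ks) { keys.addAll(Arrays.asList(ks)); }

    /** Returns true if k is stored in the subtree rooted at this node. */
    boolean contains(int k) {
        for (int i = 0; i < keys.size(); i++) {
            if (k == keys.get(i)) return true;          // found in this node
            if (k < keys.get(i)) return descend(i, k);  // k belongs in child i
        }
        return descend(keys.size(), k);                 // k is larger than every key here
    }

    /** Continues in child i, or reports failure when there is no child level. */
    private boolean descend(int i, int k) {
        return !children.isEmpty() && children.get(i).contains(k);
    }

    /** Builds the slide's tree: root [11 24] over [2 6 8], [15], [27 30]. */
    static SearchNode example() {
        SearchNode root = new SearchNode(11, 24);
        root.children.add(new SearchNode(2, 6, 8));
        root.children.add(new SearchNode(15));
        root.children.add(new SearchNode(27, 30));
        return root;
    }

    public static void main(String[] args) {
        System.out.println(example().contains(8)); // true
        System.out.println(example().contains(9)); // false
    }
}
```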
(2,4) Trees • Multi-way search tree with 2 properties: • Node-Size Property: Internal nodes have at most 4 children • Depth Property: All external nodes at same depth • Nodes are either 2-node, 3-node or 4-node • Node’s number of children used as basis for name • (Figure: root [10 15 24] with children [2 8], [12], [18], [27 32])
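One way to make the two properties concrete is a small checker. The sketch below is an assumption-laden illustration, not the lecture's code: `PropertyCheckSketch`, `Node`, and `check` are hypothetical names, and the bottom level is modeled as nodes with no children instead of explicit external nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: verify the Node-Size and Depth properties of a (2,4) tree.
class PropertyCheckSketch {
    static class Node {
        final List<Integer> keys = new ArrayList<>();
        final List<Node> children = new ArrayList<>(); // empty at the bottom level
        Node(Integer... ks) { keys.addAll(Arrays.asList(ks)); }
        Node add(Node... kids) { children.addAll(Arrays.asList(kids)); return this; }
    }

    /** Returns the number of levels below n, or -1 if a property is violated. */
    static int check(Node n) {
        if (n.keys.isEmpty() || n.keys.size() > 3) return -1;  // Node-Size: 1 to 3 entries
        if (n.children.isEmpty()) return 0;
        if (n.children.size() != n.keys.size() + 1) return -1; // 1 fewer entry than children
        int depth = check(n.children.get(0));
        for (Node c : n.children) {
            if (depth < 0 || check(c) != depth) return -1;     // Depth: all subtrees equal
        }
        return depth + 1;
    }

    /** Builds the slide's tree: root [10 15 24] over [2 8], [12], [18], [27 32]. */
    static Node example() {
        return new Node(10, 15, 24).add(
                new Node(2, 8), new Node(12), new Node(18), new Node(27, 32));
    }

    public static void main(String[] args) {
        System.out.println(check(example())); // 1
    }
}
```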
Insertion • Start by searching for key k • Entry added to last internal node searched • Depth property preserved by enforcing this • Example: insert(30) • (Figure before insertion: root [10 15 24] with children [2 8], [12], [18], [27 32])
Insertion • Example: insert(30), after insertion • (Figure: root [10 15 24] with children [2 8], [12], [18], [27 30 32])
Insertion • Insertion may cause overflow! • 5-node created by the insertion • This violates the Node-Size property • (Figure before 30 is inserted: root [15 24] with children [12], [18], [27 32 35]) • (Figure after 30 is inserted: root [15 24] with children [12], [18], [27 30 32 35])
In Case Of Overflow Split Node • Split 5-node into 2 new nodes • Entries e_1, e_2 & children v_1, v_2, v_3 become a 3-node • 2-node created with Entry e_4 & children v_4, v_5 • Promote e_3 to parent node • If overflow occurs in root node, create new root • Overflow can cascade when parent already was 4-node • (Figure before split: root [15 24] with children [12], [18], [27 30 32 35]) • (Figure after split: root [15 24 32] with children [12], [18], [27 30], [35])
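The insertion and split rules can be sketched in Java as follows. Everything named here (`TwoFourSketch`, `Node`, `insert`, `split`, `childIndex`) is hypothetical, keys are plain integers, duplicates are sent to the right, and the bottom level is modeled as nodes with no children. Treat it as one possible reading of the slides rather than the course's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of (2,4)-tree insertion with overflow handling.
class TwoFourSketch {
    static class Node {
        final List<Integer> keys = new ArrayList<>();
        final List<Node> children = new ArrayList<>(); // empty at the bottom level
        Node parent;
    }

    Node root = new Node();

    void insert(int k) {
        Node n = root;
        // Search down to the last node on the path for k.
        while (!n.children.isEmpty()) n = n.children.get(childIndex(n, k));
        n.keys.add(childIndex(n, k), k);
        // A node with 4 keys is a 5-node: split it, cascading upward if needed.
        while (n != null && n.keys.size() > 3) n = split(n);
    }

    /** Index of the first key greater than k (duplicates go to the right). */
    private static int childIndex(Node n, int k) {
        int i = 0;
        while (i < n.keys.size() && k >= n.keys.get(i)) i++;
        return i;
    }

    /** Splits 5-node n around e3 and returns the parent that received e3. */
    private Node split(Node n) {
        Node right = new Node();               // becomes the 2-node holding e4
        int e3 = n.keys.get(2);
        right.keys.add(n.keys.get(3));
        n.keys.subList(2, 4).clear();          // n keeps e1, e2 and is now a 3-node
        if (!n.children.isEmpty()) {           // move v4, v5 to the new node
            right.children.addAll(n.children.subList(3, 5));
            n.children.subList(3, 5).clear();
            for (Node c : right.children) c.parent = right;
        }
        Node p = n.parent;
        if (p == null) {                       // overflow at the root: create new root
            p = new Node();
            p.children.add(n);
            n.parent = p;
            root = p;
        }
        int pos = p.children.indexOf(n);
        p.keys.add(pos, e3);                   // promote e3 to the parent
        p.children.add(pos + 1, right);
        right.parent = p;
        return p;
    }

    public static void main(String[] args) {
        TwoFourSketch t = new TwoFourSketch();
        for (int k : new int[] {27, 30, 32, 35}) t.insert(k);
        System.out.println(t.root.keys + " " + t.root.children.get(0).keys
                + " " + t.root.children.get(1).keys); // [32] [27, 30] [35]
    }
}
```

The usage in `main` reproduces the slide's split of the 5-node holding 27, 30, 32, 35: 32 is promoted, 27 and 30 stay together, and 35 gets its own 2-node. Because `insert` keeps calling `split` on the returned parent, a parent that was already a 4-node is split in turn.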
Parent Overflow • In case of cascade, repeat overflow process • Works identically to when children are external • Example: insert(29) • (Figure before insertion: root [15 24 26] with children [12], [18], [25], [27 32 35])
Parent Overflow • Example: insert(29), after insertion • (Figure: root [15 24 26] with children [12], [18], [25], [27 29 32 35]; the last child is now a 5-node and its parent is already a 4-node)