250 likes | 372 Views
More Trees. Multiway Trees and 2-4 Trees. Motivation of Multi-way Trees. Main memory vs. disk Assumptions so far: We have assumed that we can store an entire data structure in the main memory of a computer. What if we have more data than can fit in main memory?
E N D
More Trees Multiway Trees and 2-4 Trees
Motivation of Multi-way Trees • Main memory vs. disk • Assumptions so far: • We have assumed that we can store an entire data structure in the main memory of a computer. • What if we have more data than can fit in main memory? • Meaning that we must have the data structure reside on disk. • The rules of the game change, because the Big-Oh model doesn’t apply if all operations are not equal.
Motivation of Multi-way Trees • Main memory vs. disk • Disk Accesses are incredibly expensive. • 1 disk access is worth about 4,000,000 instructions. • (See the book for derivation) • So we’re willing to do lots of calculations just to save disk accesses.
Motivation of Multi-way Trees • For example: • Suppose we want to access the driving records of the citizens of Florida. • 10 million items. • Assume doesn’t fit in main memory. • Assume in 1 sec, can execute 25 million instructions or perform 6 disk accesses. • The Unbalanced Binary tree would be a disaster. • In the worst case, it has linear depth and could require 10 mil disk accesses. • An AVL Tree • In the typical case, it has a depth close to log N, log 10 mil ≈ 24 disk accesses, requiring 4 sec.
The point is… • Reduce the # of disk accesses to a very small constant, • Such as 3 or 4 • And we are willing to write complicated code to do this, because in comparison machine instructions are essentially free. • As long as we’re not ridiculous. • We cannot go below log N using a BST • Even an AVL • Solution?? • More branching, less height. • Multiway Tree
Multiway Tree (or B-tree in book) • Multiway tree is similar to BST • In a BST • We need 1 key to decide which of 2 branches to take. • In an Multiway tree • We need M-1 keys to decide which branch to take, where M is the order of the Multiway tree. • Also need to balance, • Otherwise like a BST it could degenerate into a linked list. • Here is a Multiway tree of Order 5:
Specifications of a MultiwayTree If a node contains k items (I1, I2.. Ik), it will contain k+1 subtrees. Specifically subtrees S1 thru Sk+1 I1 I2 I3 10 20 30 • All values in Sk+1 >Ik. • All values in S2 < I2, but ≥ to I1. • All values in S3 < I3, but ≥ I2. • All values in S1 < I1. S1 S2 S3 Sk+1 3 7 _ 15 _ _ 22 28 _ 40 50 60 1___ 345_ 78__ 1012 _ _ 1517_ _ 2021__ 22 26 _ _ 2829_ _ 30___ 4045 _ _ 55_ _ _ 6570_ _
2-4 Trees • 2-4 Trees • Specific type of multitree. • Every node must have in between 2 and 4 children. (Thus each internal node must store in between 1 and 3 keys) • All external nodes (the null children of leaf nodes) and leaf nodes have the same depth.
Example of a 2-4 tree 10 20 30 3 7 _ 15 _ _ 22 28 _ 40 50 60 1___ 345_ 78__ 1012 _ _ 1517_ _ 2021__ 22 26 _ _ 2829_ _ 30___ 4045 _ _ 55_ _ _ 6570_ _
Insert • Insert 4 into the 2-4 tree below. Compare 4 to vals in root node. 4 < 10, 4 goes in the subtree to left of 10. 10 20 30 4 > 3 and 4 < 7, 4 goes in the subtree to right of 3 and left of 7 3 7 _ 15 _ _ 22 28 _ 40 50 60 1__ 5__ 45_ 8__ 12 _ _ 17_ _ 21____ 26 _ _ 2829__ _ _ _ 45 _ _ 55_ _ 6570_
Problems with Insert • What if the node that a value gets inserted into is full? • We could just insert the value into a new level of the tree. • BUT then not ALL of the external nodes will have the same depth after this. • Insert 18 into the following tree: 10 20 _ 37_ 131517 22_ _
Problems with Insert • What if the node that a value gets inserted into is full? • We could just insert the value into a new level of the tree. • BUT then not ALL of the external nodes will have the same depth after this. • Insert 18 into the following tree: 10 20 _ 37_ 13151718 22_ _ • The node has too many values. • You can send one of the values to the parent (the book’s convention is to sent the 3rd value.
Problems with Insert • What if the node that a value gets inserted into is full? • We could just insert the value into a new level of the tree. • BUT then not ALL of the external nodes will have the same depth after this. • Insert 18 into the following tree: 10 17 20 37_ 1315_ 18____ 22_ _ • Moving 17, forces you to “split” the other three values.
Other Problems with Insert • In the last example, the parent node was able to “accept” the 17. • What if the parent root node becomes full? • Insert 12 into the 2-4 Tree below: 10 20 30 5_ 111417_ 25____ 3237_ _
Other Problems with Insert • In the last example, the parent node was able to “accept” the 17. • What if the parent root node becomes full? • Insert 12 into the 2-4 Tree below: 10 20 30 5_ 11 121417 25____ 3237_ _ • Using the rule from before, let’s move 12 up.
Other Problems with Insert • In the last example, the parent node was able to “accept” the 17. • What if the parent root node becomes full? • Insert 12 into the 2-4 Tree below: 10 14 20 30 5_ 11 12 17____ 25____ 3237_ _ • Using the rule from before, let’s move 14 up.
Other Problems with Insert • In the last example, the parent node was able to “accept” the 17. • What if the parent root node becomes full? • Insert 12 into the 2-4 Tree below: 10 14 20 30 5_ 11 12 25____ 17____ 3237_ _ • Now this has too many parent nodes AND subtrees! • We can just repeat the process and make a new root.
Other Problems with Insert • In the last example, the parent node was able to “accept” the 17. • What if the parent root node becomes full? • Insert 12 into the 2-4 Tree below: 20 30 10 14 5_ 11 12 17____ 25____ 3237_ _ • Now this has too many parent nodes AND subtrees! • We can just repeat the process and make a new root.
Deletion from a 2-4 Tree • Delete a non-leaf value • Replace that value with the largest value in its left subtree • Or smallest value in its right subtree.
Deletion from a 2-4 Tree • Delete a leaf node • In the standard case: • a value can simply be removed from a leaf node that stores more than one value. • Requires no structural change. 10 20 30 10 20 30 Delete 5 57_ 1217__ 232735 35_ _ 7_ 1217__ 232735 35_ _
Deletion from a 2-4 Tree • BUT what if the leaf node has ONLY one value? • If you get rid of it, • then it would violate the 2-4 property that all leaf nodes MUST be on the same height of the tree. • Break up into 2 cases: • An adjacent sibling has more than one value stored in its node. • An adjacent sibling does NOT have more than one value stored in its node, and a fusion operation MUST be performed.
Deletion from a 2-4 Tree • Case 1: • Consider deleting 5 from the following tree: • An adjacent sibling has more than one value stored in its node. • Take the 10 to replace the 5, • And then simply replace the 10 with the smallest value in its right subtree. • This is okay, since there is more than one value at this subtree. 10 20 30 12 20 30 Delete 5 5__ 1217__ 2327_ 35_ _ 10_ __ 17_ __ 2327_ 35_ _
Deletion from a 2-4 Tree • Case 2: • If an adjacent sibling does NOT have more than one value stored in its node, and a fusion operation MUST be performed. • The fusion operation is a little more difficult since it may result in needing another fusion at a parent node. • We have 3 child nodes when we should have 4. Thus we can drop a value, we will drop 10. 10 20 30 10 20 30 10 20 30 Delete 5 Fuse empty node with 15 5__ 15_ __ 25_ _ 35_ _ __ 15_ __ 25_ _ 35_ _ 15_ __ 25_ _ 35_ _
Deletion from a 2-4 Tree • Case 2: • If an adjacent sibling does NOT have more than one value stored in its node, and a fusion operation MUST be performed. • The fusion operation is a little more difficult since it may result in needing another fusion at a parent node. • We have 3 child nodes when we should have 4. Thus we can drop a value, we will drop 10. 10 20 30 20 30 Drop one parent into fused node 15_ __ 25_ _ 35_ _ 1015 __ 25_ _ 35_ _