CSE 326: Data Structures: Binary Search Trees
Today’s Outline • Dictionary ADT / Search ADT • Quick Tree Review • Binary Search Trees
ADTs Seen So Far • Stack: Push, Pop • Queue: Enqueue, Dequeue • Priority Queue: Insert, DeleteMin • Then there is decreaseKey, which needs a pointer to the element. Why? Because find is not efficient in a priority queue.
The Dictionary ADT (also called the “Map ADT”) • Data: a set of (key, value) pairs • Operations: Insert (key, value), Find (key), Remove (key) • Example entries: jfogarty → James Fogarty, CSE 666; phenry → Peter Henry, CSE 002; boqin → Bo Qin, CSE 002 • insert(jfogarty, ….) adds an entry; find(boqin) returns the boqin entry (Bo, Qin, …)
A Modest Few Uses • Sets • Dictionaries • Networks : Router tables • Operating systems : Page tables • Compilers : Symbol tables Probably the most widely used ADT!
Implementations (cost of insert / find / delete) • Unsorted linked list: insert Θ(1), find Θ(n), delete Θ(n) • Unsorted array: insert Θ(1), find Θ(n), delete Θ(n) • Sorted array: insert log n + n, find Θ(log n), delete log n + n • SO CLOSE! What limits the performance? The time to move elements. Can we mimic binary search with a BST?
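To make the sorted-array row concrete, here is a minimal sketch of a dictionary kept in two parallel sorted lists. The class name `SortedArrayDict` and its layout are assumptions for illustration, not part of the slides. The search uses the standard-library `bisect` module, and the list insertion is where the element-moving cost comes from.

```python
import bisect


class SortedArrayDict:
    """Dictionary ADT backed by two parallel lists kept sorted by key."""

    def __init__(self):
        self.keys = []
        self.values = []

    def find(self, key):
        # Binary search: Theta(log n) comparisons.
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None

    def insert(self, key, value):
        # log n to locate the slot, then up to n element moves.
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value  # overwrite an existing key
        else:
            self.keys.insert(i, key)
            self.values.insert(i, value)

    def remove(self, key):
        # Same cost profile as insert: search, then shift.
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            del self.keys[i]
            del self.values[i]
            return True
        return False


d = SortedArrayDict()
d.insert("jfogarty", "James Fogarty")
d.insert("boqin", "Bo Qin")
d.insert("phenry", "Peter Henry")
print(d.find("boqin"))  # Bo Qin
print(d.keys)           # ['boqin', 'jfogarty', 'phenry']
```

The find is fast because the keys stay sorted; the insert and remove are slow for exactly the same reason, since keeping the order means shifting elements.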
Tree Calculations • Recall: height is the max number of edges from the root to a leaf • Find the height of the tree t... • Runtime:
Tree Calculations Example • How high is this tree? (figure: a tree with nodes labeled A through N)
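The height calculation can be sketched recursively as follows. The `Node` class and the small four-node tree are assumptions made for the example (the slide's own figure is not reproduced here); an empty tree is given height -1 so that a single node has height 0.

```python
class Node:
    def __init__(self, element, left=None, right=None):
        self.element = element
        self.left = left
        self.right = right


def height(t):
    # An empty tree has height -1, so a lone node has height 0.
    if t is None:
        return -1
    # One edge down to the taller of the two subtrees.
    return 1 + max(height(t.left), height(t.right))


# Hypothetical tree: A has children B and C; B has a left child D.
tree = Node("A", Node("B", Node("D")), Node("C"))
print(height(tree))  # 2
```

Each node is visited exactly once, so this sketch runs in time linear in the number of nodes.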
More Recursive Tree Calculations: Tree Traversals • A traversal is an order for visiting all the nodes of a tree • Three types: • Pre-order: root, left subtree, right subtree • In-order: left subtree, root, right subtree • Post-order: left subtree, right subtree, root (figure: an expression tree with nodes +, *, 5, 2, 4)
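Inorder Traversal
void traverse(BNode t) {
  if (t != NULL) {
    traverse(t.left);    // visit everything in the left subtree first
    process(t.element);  // then the node itself
    traverse(t.right);   // then everything in the right subtree
  }
}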
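As an illustration of the three orders, the sketch below runs them on a small expression tree. It assumes the tree encodes (2 * 4) + 5, which is one plausible reading of the expression-tree figure; the `Node` class and function names are hypothetical.

```python
class Node:
    def __init__(self, element, left=None, right=None):
        self.element = element
        self.left = left
        self.right = right


def pre_order(t):
    # Root, left subtree, right subtree.
    if t is None:
        return []
    return [t.element] + pre_order(t.left) + pre_order(t.right)


def in_order(t):
    # Left subtree, root, right subtree.
    if t is None:
        return []
    return in_order(t.left) + [t.element] + in_order(t.right)


def post_order(t):
    # Left subtree, right subtree, root.
    if t is None:
        return []
    return post_order(t.left) + post_order(t.right) + [t.element]


# Assumed shape: '+' at the root, '*' (over 2 and 4) on the left, 5 on the right.
expr = Node("+", Node("*", Node(2), Node(4)), Node(5))
print(pre_order(expr))   # ['+', '*', 2, 4, 5]
print(in_order(expr))    # [2, '*', 4, '+', 5]
print(post_order(expr))  # [2, 4, '*', 5, '+']
```

For an expression tree, the three orders correspond to prefix, infix, and postfix notation respectively.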
Binary Trees • Binary tree is • a root • left subtree (maybe empty) • right subtree (maybe empty) • Representation: each node holds data, a left pointer, and a right pointer (figure: a binary tree with nodes A through J)
Binary Tree: Representation (figure: nodes A through F, each drawn as a record with a data field, a left pointer, and a right pointer)
Binary Tree: Special Cases • Complete Tree (figure: nodes A through F) • Perfect Tree (figure: nodes A through G) • Full Tree (figure: nodes A through I)
Binary Tree: Some Numbers! For binary tree of height h: • max # of leaves: $2^h$, for perfect tree • max # of nodes: $2^{h+1} - 1$, for perfect tree • min # of leaves: 1, for “list” tree • min # of nodes: $h + 1$, for “list” tree • Average depth for N nodes?
Binary Search Tree Data Structure • Structural property • each node has at most 2 children • result: • storage is small • operations are simple • average depth is small • Order property • all keys in left subtree smaller than root’s key • all keys in right subtree larger than root’s key • result: easy to find any given key • What must I know about what I store? (figure: BST with keys 8, 5, 11, 2, 6, 10, 12, 4, 7, 9, 14, 13)
Example and Counter-Example • Binary search trees? (figure: two candidate trees, one a valid BST and one not)
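One way to check the order property mechanically is to pass down the open interval in which each subtree's keys must lie. The sketch below assumes distinct keys; the `Node` class, the function name `is_bst`, and both example trees are made up for illustration rather than taken from the figure.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def is_bst(t, low=None, high=None):
    """True if every key in t lies strictly between low and high
    and the order property holds at every node."""
    if t is None:
        return True
    if low is not None and t.key <= low:
        return False
    if high is not None and t.key >= high:
        return False
    # Left keys must stay below t.key, right keys above it.
    return is_bst(t.left, low, t.key) and is_bst(t.right, t.key, high)


good = Node(8, Node(5, Node(2), Node(6)), Node(11))
# 9 is larger than its parent 5, but it sits in the left subtree of 8.
bad = Node(8, Node(5, Node(2), Node(9)), Node(11))
print(is_bst(good))  # True
print(is_bst(bad))   # False
```

The counter-example shows why comparing each node only with its own children is not enough: the property is about whole subtrees.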
Find in BST, Recursive
Node Find(Object key, Node root) {
  if (root == NULL) return NULL;
  if (key < root.key) return Find(key, root.left);
  else if (key > root.key) return Find(key, root.right);
  else return root;
}
Runtime:
(figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30)
Find in BST, Iterative
Node Find(Object key, Node root) {
  while (root != NULL && root.key != key) {
    if (key < root.key) root = root.left;
    else root = root.right;
  }
  return root;
}
Runtime:
(figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30)
Insert in BST • Insert(13) • Insert(8) • Insert(31) • Insertions happen only at the leaves: easy! • Runtime: (figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30)
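A minimal recursive insert might look like the sketch below. It assumes the figure's tree has 10 at the root, 5 and 15 below it, then 2, 9, 20, and finally 7, 17, 30, and it ignores duplicate keys; the `Node` class and function name are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    # Falling off the tree marks the spot for the new leaf.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    # Equal keys are ignored in this sketch.
    return root


root = None
for k in [10, 5, 15, 2, 9, 20, 7, 17, 30]:
    root = insert(root, k)
for k in [13, 8, 31]:
    root = insert(root, k)

print(root.right.left.key)              # 13 (left child of 15)
print(root.left.right.left.right.key)   # 8  (right child of 7)
print(root.right.right.right.right.key) # 31 (right child of 30)
```

The work done is one comparison per level on the search path, so the cost is proportional to the depth at which the new leaf lands.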
BuildTree for BST • Suppose keys 1, 2, 3, 4, 5, 6, 7, 8, 9 are inserted into an initially empty BST. Runtime depends on the order! • in given order • in reverse order • median first, then left median, right median, etc.
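The effect of insertion order can be checked with a small experiment. The helper names below (`build`, `median_first`) are assumptions for this sketch; `median_first` emits each median before the medians of its two halves, which yields the same tree as the level-by-level order described on the slide.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def height(t):
    return -1 if t is None else 1 + max(height(t.left), height(t.right))


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


def median_first(keys):
    # Median, then the same ordering applied to each half.
    if not keys:
        return []
    mid = len(keys) // 2
    return [keys[mid]] + median_first(keys[:mid]) + median_first(keys[mid + 1:])


keys = list(range(1, 10))
print(height(build(keys)))                # 8: a right-leaning chain
print(height(build(keys[::-1])))          # 8: a left-leaning chain
print(median_first(keys))                 # [5, 3, 2, 1, 4, 8, 7, 6, 9]
print(height(build(median_first(keys))))  # 3
```

With sorted or reverse-sorted input every insert walks the whole chain, giving quadratic total work, while the median-first order keeps every insert at logarithmic depth.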
Bonus: FindMin/FindMax • Find minimum • Find maximum (figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30)
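Because of the order property, the minimum is reached by following left pointers and the maximum by following right pointers. The sketch below assumes the same tree shape as the running example (10 at the root; 5 and 15; 2, 9, 20; 7, 17, 30); the names are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def find_min(t):
    # Keep going left until there is no left child.
    if t is None:
        return None
    while t.left is not None:
        t = t.left
    return t


def find_max(t):
    # Keep going right until there is no right child.
    if t is None:
        return None
    while t.right is not None:
        t = t.right
    return t


tree = Node(10,
            Node(5, Node(2), Node(9, Node(7))),
            Node(15, None, Node(20, Node(17), Node(30))))
print(find_min(tree).key)  # 2
print(find_max(tree).key)  # 30
```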
Deletion in BST • Why might deletion be harder than insertion? (figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30)
Lazy Deletion • Instead of physically deleting nodes, just mark them as deleted • Advantages: • simpler • physical deletions done in batches • some adds just flip deleted flag • Disadvantages: • extra memory for “deleted” flag • many lazy deletions = slow finds • some operations may have to be modified (e.g., min and max) (figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30)
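A rough sketch of lazy deletion is shown below. The `deleted` field and the function names are assumptions for illustration; batch cleanup and the modified min/max operations are left out.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.deleted = False  # the extra per-node flag


def locate(root, key):
    """Return the node holding key (deleted or not), or None."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root


def find(root, key):
    # Marked nodes still guide the search but never count as found.
    node = locate(root, key)
    return node if node is not None and not node.deleted else None


def lazy_remove(root, key):
    node = locate(root, key)
    if node is None or node.deleted:
        return False
    node.deleted = True  # no restructuring needed
    return True


def insert(root, key):
    if root is None:
        return Node(key)
    node = root
    while node.key != key:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
            node = node.right
    node.deleted = False  # re-adding a deleted key just flips the flag
    return root


root = None
for k in [10, 5, 15]:
    root = insert(root, k)
lazy_remove(root, 5)
print(find(root, 5))            # None: 5 is marked as deleted
print(root.left.key)            # 5: the node is still physically present
insert(root, 5)
print(find(root, 5) is root.left)  # True: the same node was revived
```

The marked nodes still occupy space and still lengthen search paths, which is the cost the slide warns about when many deletions accumulate.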
Non-lazy Deletion • Removing an item disrupts the tree structure. • Basic idea: find the node that is to be removed. Then “fix” the tree so that it is still a binary search tree. • Three cases: • node has no children (leaf node) • node has one child • node has two children
Non-lazy Deletion: The Leaf Case • Delete(17) (figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30)
Deletion: The One Child Case • Delete(15) (figure: BST with keys 10, 5, 15, 2, 9, 20, 7, 30)
Deletion: The Two Child Case • Delete(5) • What can we replace 5 with? • A value guaranteed to be between the two subtrees! • succ from right subtree • pred from left subtree (figure: BST with keys 10, 5, 20, 2, 9, 30, 7)
Deletion – The Two Child Case Idea: Replace the deleted node with a value guaranteed to be between the two child subtrees Options: • succ from right subtree: findMin(t.right) • pred from left subtree : findMax(t.left) Now delete the original node containing succ or pred • Leaf or one child case – easy!
Finally… • 7 replaces 5 • Original node containing 7 gets deleted (figure: BST with keys 10, 7, 20, 2, 9, 30)
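The three deletion cases can be sketched in one recursive function. This version picks the successor option (`find_min` on the right subtree); choosing the predecessor would work symmetrically. The tree is assumed to be the one from the Delete(5) example, with 10 at the root, 5 and 20 below it, then 2, 9, 30, and 7 under 9.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def find_min(t):
    while t.left is not None:
        t = t.left
    return t


def delete(root, key):
    if root is None:
        return None  # key not present
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    elif root.left is None:
        return root.right  # leaf case or only a right child
    elif root.right is None:
        return root.left   # only a left child
    else:
        # Two children: copy the successor's key here, then delete
        # the successor's original node (a leaf or one-child case).
        succ = find_min(root.right)
        root.key = succ.key
        root.right = delete(root.right, succ.key)
    return root


def in_order(t):
    return [] if t is None else in_order(t.left) + [t.key] + in_order(t.right)


tree = Node(10, Node(5, Node(2), Node(9, Node(7))), Node(20, None, Node(30)))
tree = delete(tree, 5)
print(tree.left.key)   # 7
print(in_order(tree))  # [2, 7, 9, 10, 20, 30]
```

Copying the key rather than relinking nodes keeps the sketch short; an implementation that stores large values or external pointers to nodes might prefer to relink instead.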
Balanced BST Observation • BST: the shallower the better! • For a BST with n nodes • Average height is O(log n) • Worst case height is O(n) • Simple cases such as insert(1, 2, 3, ..., n) lead to the worst case scenario • Solution: Require a Balance Condition that • ensures depth is O(log n): strong enough! • is easy to maintain: not too strong!
Potential Balance Conditions 1. Left and right subtrees of the root have equal number of nodes. Too weak! Do height mismatch example. 2. Left and right subtrees of the root have equal height. Too weak! Do example where there’s a left chain and a right chain, no other nodes.
Potential Balance Conditions 3. Left and right subtrees of every node have equal number of nodes. Too strong! Only perfect trees. 4. Left and right subtrees of every node have equal height. Too strong! Only perfect trees.
Potential Balance Conditions 1. Left and right subtrees of the root have equal number of nodes 2. Left and right subtrees of the root have equal height 3. Left and right subtrees of every node have equal number of nodes 4. Left and right subtrees of every node have equal height
The AVL Balance Condition (Adelson-Velskii and Landis) • AVL balance property: left and right subtrees of every node have heights differing by at most 1 • Ensures small depth • Will prove this by showing that an AVL tree of height h must have a lot of nodes (exponentially many in h) • Easy to maintain • Using single and double rotations
The AVL Tree Data Structure • Structural properties • Binary tree property (0, 1, or 2 children) • Heights of left and right subtrees of every node differ by at most 1 • Result: worst case depth of any node is O(log n) • Ordering property • Same as for BST (figure: AVL tree with keys 8, 5, 11, 2, 6, 10, 12, 4, 7, 9, 13, 14, 15)
AVL trees or not? (figure: two candidate trees, one with keys 6, 4, 8, 1, 5, 7, 11, 3, 2 and one with keys 6, 4, 8, 1, 11, 7, 12, 10)
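The balance property can be tested with one bottom-up pass that returns both a verdict and a height. The sketch below checks only the balance property, not the ordering property; the `Node` class, the function name `check_balance`, and both example trees are assumptions for illustration and are not claimed to match the figure.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def check_balance(t):
    """Return (balanced, height) for the subtree rooted at t."""
    if t is None:
        return True, -1  # empty trees are balanced and have height -1
    left_ok, lh = check_balance(t.left)
    right_ok, rh = check_balance(t.right)
    ok = left_ok and right_ok and abs(lh - rh) <= 1
    return ok, 1 + max(lh, rh)


balanced = Node(6, Node(4, Node(1)), Node(8, Node(7), Node(11)))
# Node 1 has an empty left subtree (height -1) and a right subtree of height 1.
lopsided = Node(6,
                Node(4, Node(1, None, Node(3, Node(2))), Node(5)),
                Node(8, Node(7), Node(11)))
print(check_balance(balanced))  # (True, 2)
print(check_balance(lopsided))  # (False, 4)
```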
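Proving Shallowness Bound • Let S(h) be the min # of nodes in an AVL tree of height h • Claim: $S(h) = S(h-1) + S(h-2) + 1$ • Solution of recurrence: S(h) grows exponentially in h, like the Fibonacci numbers ($S(h) = \Theta(\phi^h)$ with $\phi \approx 1.618$), so an AVL tree with n nodes has height O(log n) (figure: trees of height h = 1, 2, 3, … and an AVL tree of height h = 4 with the min # of nodes (12), keys 8, 5, 11, 2, 6, 10, 12, 7, 9, 13, 14, 15)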
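The recurrence is easy to evaluate numerically. The sketch below assumes the base cases S(-1) = 0 (empty tree) and S(0) = 1 (single node), which match the convention that NULLs have height -1; the function name `min_nodes` is made up for the example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def min_nodes(h):
    """Minimum number of nodes in an AVL tree of height h."""
    if h < 0:
        return 0  # empty tree, height -1
    if h == 0:
        return 1  # single node
    # A root, one subtree of height h-1, and the sparsest allowed sibling.
    return min_nodes(h - 1) + min_nodes(h - 2) + 1


print([min_nodes(h) for h in range(5)])  # [1, 2, 4, 7, 12]
```

The value 12 at h = 4 agrees with the minimal tree on the slide, and each value is one less than a Fibonacci number, which is where the exponential growth comes from.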
Testing the Balance Property • We need to be able to: 1. Track Balance 2. Detect Imbalance 3. Restore Balance • NULLs have height -1 (figure: BST with keys 10, 5, 15, 2, 9, 20, 7, 17, 30)
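An AVL Tree • Each node stores its data, its height, and its children • Track height at all times. Why? (figure: AVL tree with keys 10, 5, 20, 2, 9, 15, 30, 7, 17, each node annotated with its height; the root 10 has height 3)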
AVL trees: find, insert • AVL find: • same as BST find. • AVL insert: • same as BST insert, except may need to “fix” the AVL tree after inserting new value.
AVL tree insert • Let x be the node where an imbalance occurs. • Four cases to consider. The insertion is in the 1. left subtree of the left child of x. 2. right subtree of the left child of x. 3. left subtree of the right child of x. 4. right subtree of the right child of x. • Idea: Cases 1 & 4 are solved by a single rotation. Cases 2 & 3 are solved by a double rotation.
Bad Case #1 Insert(6) Insert(3) Insert(1) Where is AVL property violated?
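Fix: Apply Single Rotation • AVL property violated at this node (x) • Single Rotation: 1. Rotate between x and child (figure: before, 6 (height 2) with left child 3 (height 1), which has left child 1 (height 0); after, 3 (height 1) with children 1 and 6 (both height 0))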
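Single rotation in general • Ordering: X < b < Y < a < Z • Height of tree before? Height of tree after? Effect on Ancestors? (figure: before, a at the root with left child b, subtrees X and Y under b, and subtree Z under a; after, b at the root with subtree X on the left and a on the right, with subtrees Y and Z under a; subtree heights are labeled in terms of h)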
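The single rotation for the left-left case can be sketched as below, using the slide's names a, b, X, Y, Z. The `Node` class with a stored `height` field and the function name `rotate_with_left_child` are assumptions for this illustration; the right-right case is the mirror image.

```python
def height(t):
    # NULLs have height -1.
    return -1 if t is None else t.height


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.height = 1 + max(height(left), height(right))


def update(t):
    t.height = 1 + max(height(t.left), height(t.right))


def rotate_with_left_child(a):
    """Case 1: b (a's left child) becomes the root of this subtree."""
    b = a.left
    a.left = b.right  # subtree Y moves from b to a
    b.right = a
    update(a)         # a is now lower, so fix its height first
    update(b)
    return b


# Bad Case #1: Insert(6), Insert(3), Insert(1) gives a left chain.
x = Node(6, Node(3, Node(1)))
root = rotate_with_left_child(x)
print(root.key, root.left.key, root.right.key)  # 3 1 6
print(root.height)                              # 1
```

Only two pointers change, so the rotation takes constant time, and in this example the subtree returns to the height it had before the insertion.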
Bad Case #2 Insert(1) Insert(6) Insert(3)
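Fix: Apply Double Rotation • AVL property violated at this node (x) • Intuition: 3 must become root • Double Rotation: 1. Rotate between x’s child and grandchild 2. Rotate between x and x’s new child (figure: before, 1 with right child 6, which has left child 3; after step 1, 1 with right child 3, which has right child 6; after step 2, 3 with children 1 and 6)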
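Double rotation in general • Ordering: W < b < X < c < Y < a < Z • Height of tree before? Height of tree after? Effect on Ancestors? (figure: before, a at the root with left child b and right subtree Z; b has left subtree W and right child c; c has subtrees X and Y; after, c at the root with children b and a; b has subtrees W and X, and a has subtrees Y and Z; subtree heights are labeled in terms of h)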
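Putting the pieces together, an AVL insert can be sketched as a BST insert followed by a rebalancing step on the way back up, with a double rotation expressed as two single rotations. All names below are assumptions for illustration, duplicates are ignored, and this is one possible arrangement rather than the course's reference code.

```python
def height(t):
    return -1 if t is None else t.height  # NULLs have height -1


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 0


def update(t):
    t.height = 1 + max(height(t.left), height(t.right))


def rotate_with_left_child(a):
    b = a.left
    a.left, b.right = b.right, a
    update(a)
    update(b)
    return b


def rotate_with_right_child(a):
    b = a.right
    a.right, b.left = b.left, a
    update(a)
    update(b)
    return b


def insert(t, key):
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
        if height(t.left) - height(t.right) == 2:
            if key < t.left.key:   # case 1: left-left, single rotation
                t = rotate_with_left_child(t)
            else:                  # case 2: left-right, double rotation
                t.left = rotate_with_right_child(t.left)
                t = rotate_with_left_child(t)
    elif key > t.key:
        t.right = insert(t.right, key)
        if height(t.right) - height(t.left) == 2:
            if key > t.right.key:  # case 4: right-right, single rotation
                t = rotate_with_right_child(t)
            else:                  # case 3: right-left, double rotation
                t.right = rotate_with_left_child(t.right)
                t = rotate_with_right_child(t)
    update(t)
    return t


# Bad Case #2: Insert(1), Insert(6), Insert(3) triggers a double rotation.
root = None
for k in [1, 6, 3]:
    root = insert(root, k)
print(root.key, root.left.key, root.right.key)  # 3 1 6

# Sorted input no longer degenerates into a chain.
chain = None
for k in range(1, 8):
    chain = insert(chain, k)
print(chain.key, chain.height)  # 4 2
```

The second example inserts 1 through 7 in order, the worst case for a plain BST, and ends with a perfect tree of height 2.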
Double rotation, step 1 (figure: a tree with keys 15, 8, 17, 16, 4, 10, 6, 3, 5, shown before and after the first rotation)