15-211 Fundamental Data Structures and Algorithms Fast Binary Search Trees Peter Lee January 28, 2003
Announcements • Homework 2 available! • Due next Monday, Feb. 3, 11:59pm! • More involved than hw1 • Start early! • It is very important that you read the book: • Section 19.2-19.4 on AVL trees • Next time: Chapter 20 on hashing
Binary trees • A binary tree is either • empty (we'll write nil for clarity), or • looks like (x,L,R) where • x is an object, and • L, R are binary subtrees
Binary search trees (BSTs) A binary tree T is a binary search tree iff flat(T) is an ordered sequence. Equivalently, in (x,L,R) all the nodes in L are less than x, and all the nodes in R are larger than x.
Example T = (5, (3, (2,nil,nil), (4,nil,nil)), (7, (6,nil,nil), (9,nil,nil))) flat(T) = 2,3,4,5,6,7,9
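The definition of flat(T) and the BST condition can be sketched in Java. This is only an illustration: the class `FlatDemo`, its `Node` type with `int` values, and the reading of the example as a tree with root 5, children 3 and 7, and leaves 2, 4, 6, 9 are assumptions, not part of the slides.

```java
import java.util.ArrayList;
import java.util.List;

class FlatDemo {
    /** A binary tree (x, L, R); null plays the role of nil. Hypothetical type. */
    static class Node {
        final int x;
        final Node left, right;
        Node(int x, Node left, Node right) {
            this.x = x;
            this.left = left;
            this.right = right;
        }
    }

    /** flat(T): the values of T listed left subtree, root, right subtree. */
    static List<Integer> flat(Node t) {
        List<Integer> out = new ArrayList<>();
        flatInto(t, out);
        return out;
    }

    private static void flatInto(Node t, List<Integer> out) {
        if (t == null) return;
        flatInto(t.left, out);
        out.add(t.x);
        flatInto(t.right, out);
    }

    /** T is a BST iff flat(T) is strictly increasing. */
    static boolean isBST(Node t) {
        List<Integer> seq = flat(t);
        for (int i = 1; i < seq.size(); i++) {
            if (seq.get(i - 1) >= seq.get(i)) return false;
        }
        return true;
    }

    /** The example tree, assuming root 5, children 3 and 7, leaves 2, 4, 6, 9. */
    static Node example() {
        return new Node(5,
                new Node(3, new Node(2, null, null), new Node(4, null, null)),
                new Node(7, new Node(6, null, null), new Node(9, null, null)));
    }

    public static void main(String[] args) {
        System.out.println(flat(example()));   // [2, 3, 4, 5, 6, 7, 9]
        System.out.println(isBST(example()));  // true
    }
}
```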
Binary search How does one search in a BST? Inductively:
search(x, nil) = false
search(x, (x,L,R)) = true
search(x, (a,L,R)) = search(x, L) if x < a
search(x, (a,L,R)) = search(x, R) if x > a
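The inductive definition translates almost line for line into a recursive Java method. A minimal sketch, assuming a hypothetical `SearchDemo.Node` type with `int` keys rather than the course's node class:

```java
class SearchDemo {
    /** A BST node (a, L, R); null plays the role of nil. Hypothetical type. */
    static class Node {
        final int a;
        final Node left, right;
        Node(int a, Node left, Node right) {
            this.a = a;
            this.left = left;
            this.right = right;
        }
    }

    static boolean search(int x, Node t) {
        if (t == null) return false;              // search(x, nil) = false
        if (x == t.a) return true;                // search(x, (x,L,R)) = true
        return x < t.a ? search(x, t.left)        // x < a: look in L
                       : search(x, t.right);      // x > a: look in R
    }
}
```

Each call descends one level, so the number of comparisons is bounded by the depth of the tree.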
Log-time vs linear time • $\log^k(N) = O(N)$ for any constant k. • I.e., logarithms grow very slowly.
A bad tree [Figure: a degenerate chain 2, 3, 4, 5, 6, 7, 9 in which every node has only a right child.] flat(T) = 2,3,4,5,6,7,9
Balanced trees • Intuitively, one way to ensure good behavior is to make “balanced” search trees. • If always balanced, then operations such as search and insert will require only O(log(N)) comparisons.
Example (balanced) T = (5, (3, (2,nil,nil), (4,nil,nil)), (7, (6,nil,nil), (9,nil,nil))) flat(T) = 2,3,4,5,6,7,9
Forcing good behavior It is clear (?) that for any n inputs, there is always a BST of logarithmic depth containing these elements. But if we just insert the standard way, we may build a very unbalanced, deep tree. Can we somehow force the tree to remain shallow? At low cost?
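Both halves of this observation can be checked with a small experiment. The sketch below is an illustration under assumptions: `DepthDemo`, its `Node` type, and the median-splitting builder `fromSorted` are hypothetical names, and depth is counted so that a single node has depth 0.

```java
class DepthDemo {
    /** Hypothetical BST node with int keys. */
    static class Node {
        final int x;
        Node left, right;
        Node(int x) { this.x = x; }
    }

    /** Standard BST insertion with no rebalancing; duplicates are ignored. */
    static Node insert(int x, Node t) {
        if (t == null) return new Node(x);
        if (x < t.x) t.left = insert(x, t.left);
        else if (x > t.x) t.right = insert(x, t.right);
        return t;
    }

    /** Builds a shallow BST from sorted[lo..hi) by putting the median at the root. */
    static Node fromSorted(int[] sorted, int lo, int hi) {
        if (lo >= hi) return null;
        int mid = lo + (hi - lo) / 2;
        Node t = new Node(sorted[mid]);
        t.left = fromSorted(sorted, lo, mid);
        t.right = fromSorted(sorted, mid + 1, hi);
        return t;
    }

    /** Depth of the deepest node: -1 for the empty tree, 0 for a single node. */
    static int depth(Node t) {
        return t == null ? -1 : 1 + Math.max(depth(t.left), depth(t.right));
    }

    public static void main(String[] args) {
        int[] keys = {2, 3, 4, 5, 6, 7, 9};
        Node chain = null;
        for (int k : keys) chain = insert(k, chain);  // sorted input: worst case
        System.out.println(depth(chain));                             // 6
        System.out.println(depth(fromSorted(keys, 0, keys.length)));  // 2
    }
}
```

Building from a sorted array needs all the keys up front, which is why it does not answer the question of keeping the tree shallow under ongoing insertions.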
AVL trees • Adelson-Velskii-Landis (AVL) trees are binary search trees that impose an additional representation invariant on BSTs: • Every node must have left and right subtrees of similar height. • “Similar height” means the difference is at most 1.
AVL-Trees An AVL-tree is a BST with the property that at every node the difference in the depth of the left and right subtrees is at most 1. [Figure: two example trees, one labeled "OK" and one labeled "not OK".]
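The balance property can be checked directly from the definition. The following is a simple sketch, assuming a hypothetical `AvlCheck.Node` type; it recomputes heights on every call, so it is meant for testing rather than for use inside insert, and it checks only the balance condition, not the BST ordering.

```java
class AvlCheck {
    /** Hypothetical tree node with int keys. */
    static class Node {
        final int x;
        final Node left, right;
        Node(int x, Node left, Node right) {
            this.x = x;
            this.left = left;
            this.right = right;
        }
    }

    /** Height of a subtree: -1 for empty, 0 for a single node. */
    static int height(Node t) {
        return t == null ? -1 : 1 + Math.max(height(t.left), height(t.right));
    }

    /** True iff at every node the subtree heights differ by at most 1. */
    static boolean isBalanced(Node t) {
        if (t == null) return true;
        return Math.abs(height(t.left) - height(t.right)) <= 1
                && isBalanced(t.left)
                && isBalanced(t.right);
    }
}
```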
AVL implies shallow Claim: Any AVL-tree of depth d has at least $F_{d+3} - 1$ nodes, where $F_n$ is the nth Fibonacci number. Why? By induction: we may assume it's true for the left and right subtrees. And $F_n$ is approximately $\phi^n / \sqrt{5}$, where $\phi = (1+\sqrt{5})/2 \approx 1.618$. So the depth of an AVL-tree with n nodes is $< 1.44 \log_2(n+2) - 1.328 = O(\log n)$.
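The claim can be sanity-checked numerically. The sparsest AVL tree of depth d has a root, a sparsest subtree of depth d-1, and a sparsest subtree of depth d-2, which gives the recurrence S(d) = S(d-1) + S(d-2) + 1. The sketch below assumes the conventions that a single node has depth 0 and that F_1 = F_2 = 1; `AvlMinNodes` is a hypothetical name.

```java
class AvlMinNodes {
    /** Fewest nodes in an AVL tree of depth d, with S(-1) = 0 and S(0) = 1. */
    static long minNodes(int d) {
        if (d < 0) return 0;
        long a = 0, b = 1;                 // S(-1), S(0)
        for (int i = 1; i <= d; i++) {
            long c = a + b + 1;            // root + sparsest subtrees of depth i-1 and i-2
            a = b;
            b = c;
        }
        return b;
    }

    /** The nth Fibonacci number, with F_0 = 0 and F_1 = F_2 = 1. */
    static long fib(int n) {
        if (n == 0) return 0;
        long a = 0, b = 1;                 // F_0, F_1
        for (int i = 1; i < n; i++) {
            long c = a + b;
            a = b;
            b = c;
        }
        return b;
    }

    public static void main(String[] args) {
        for (int d = 0; d <= 5; d++) {
            System.out.println(d + ": " + minNodes(d) + " = " + (fib(d + 3) - 1));
        }
    }
}
```

Under these conventions the two columns agree (1, 2, 4, 7, 12, 20 for depths 0 through 5), so the node count grows exponentially in the depth.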
Implementing AVL trees • The main complications are insertion and deletion of nodes. • For deletion: • Don’t actually delete nodes. • Just mark nodes as deleted. • Called lazy deletion.
Is Lazy Deletion OK? • Yes: • If the number of deleted nodes is about the same as the number of non-deleted nodes, then the depth is only a small constant greater (on average). • Helpful note: See the log calculator at http://www.math.utah.edu/~alfeld/math/Log.html
Lazy deletion [Figure: the example tree with nodes 5, 3, 7, 2, 4, 6, 9.] On average, deleting even half of the nodes by marking leads to a depth penalty of only 1.
Nodes • AVL nodes contain the following information. • Node value. • Left and right children. • Lazy deletion flag. • Height of this tree.
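The lazy deletion flag changes how search and delete behave, which a short sketch can make concrete. Everything here is an assumption for illustration: the class `LazyBst`, its `int` keys, the omission of rebalancing, and in particular the choice that re-inserting a marked key revives the existing node.

```java
class LazyBst {
    /** Hypothetical node carrying a lazy deletion flag. */
    static class Node {
        final int element;
        Node left, right;
        boolean deleted;
        Node(int element) { this.element = element; }
    }

    Node root;

    /** Finds the node holding x, whether or not it is marked deleted. */
    private static Node find(int x, Node t) {
        while (t != null && x != t.element) {
            t = x < t.element ? t.left : t.right;
        }
        return t;
    }

    /** A marked node still guides the search but does not count as present. */
    boolean contains(int x) {
        Node n = find(x, root);
        return n != null && !n.deleted;
    }

    /** Lazy deletion: mark the node instead of unlinking it. */
    boolean remove(int x) {
        Node n = find(x, root);
        if (n == null || n.deleted) return false;
        n.deleted = true;
        return true;
    }

    void insert(int x) { root = insert(x, root); }

    private static Node insert(int x, Node t) {
        if (t == null) return new Node(x);
        if (x < t.element) t.left = insert(x, t.left);
        else if (x > t.element) t.right = insert(x, t.right);
        else t.deleted = false;   // assumption: re-inserting revives a marked node
        return t;
    }
}
```

Because no node is ever unlinked, the shape of the tree, and therefore its balance, is unaffected by deletions.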
BST nodes in Java Based on Weiss, pg 607:
class BinaryNode {
    Comparable element;        // node value
    BinaryNode left, right;    // left and right children
    int height;                // height of this tree
    boolean deleted;           // lazy deletion flag

    public BinaryNode(Comparable theElement) {
        element = theElement;
        left = right = null;
    }
    …
}
The Comparable interface • Objects that are Comparable support the method compareTo(). • v.compareTo(w) returns • a negative integer if v < w, • 0 if v equals w, • a positive integer if v > w, • or throws ClassCastException if w's type prevents it from being compared with v.
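In practice, code should test only the sign of a compareTo() result, since the Comparable contract promises a negative, zero, or positive integer rather than a particular value. A small sketch, where `CompareDemo` and its helper `less` are hypothetical names:

```java
class CompareDemo {
    /** True iff v orders strictly before w; only the sign of compareTo is used. */
    static <T extends Comparable<T>> boolean less(T v, T w) {
        return v.compareTo(w) < 0;
    }

    public static void main(String[] args) {
        // String.compareTo returns the difference of the first differing
        // characters here, so the result is negative but not -1.
        System.out.println("apple".compareTo("cherry"));   // -2
        System.out.println(less("apple", "cherry"));        // true
        System.out.println(less(9, 2));                      // false
    }
}
```

This is why the insert method on the slides compares the result against 0 instead of against -1 or 1.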
Naïve insert method
public BinaryNode insert(Comparable x, BinaryNode t) {
    if (t == null)
        t = new BinaryNode(x);
    else if (x.compareTo(t.element) < 0)
        t.left = insert(x, t.left);
    else if (x.compareTo(t.element) > 0)
        t.right = insert(x, t.right);
    else
        throw new DuplicateItemException();
    return t;
}
The insertion problem [Figure: an example tree with nodes 5, 2, 9, 1, 4, 8, 7, 3.] Insertions can violate the balance condition. Nodes from the insertion point up to the root might be out of balance.
Maintaining balance • In order to maintain balance, we will maintain an integer height in each node. • This allows us to detect when a node goes out of balance. • When we detect an out-of-balance condition, we rebalance the deepest out-of-balance node. • This will be enough to regain balance. • But can we do this fast enough?
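A stored height makes the out-of-balance test a constant-time comparison at each node. The sketch below assumes a hypothetical `HeightDemo.Node` type and the convention that an empty tree has height -1 and a leaf has height 0.

```java
class HeightDemo {
    /** Hypothetical node that caches the height of its subtree. */
    static class Node {
        final int element;
        Node left, right;
        int height;                        // 0 for a leaf
        Node(int element) { this.element = element; }
    }

    /** Height lookup that also works for the empty tree. */
    static int height(Node t) {
        return t == null ? -1 : t.height;
    }

    /** Recomputes t.height from its children; call after a child changes. */
    static void updateHeight(Node t) {
        t.height = 1 + Math.max(height(t.left), height(t.right));
    }

    /** True iff the AVL invariant is violated at t itself. */
    static boolean outOfBalance(Node t) {
        return Math.abs(height(t.left) - height(t.right)) > 1;
    }
}
```

After an insertion, only the nodes on the path back to the root can have changed height, so updating and checking them on the way out of the recursion costs O(depth) in total.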
Manipulating BSTs • Write two Java methods. First: • static BinaryNode rotate1 (BinaryNode t); [Figure: a tree rooted at t with subtrees X, Y, Z, before and after rotate1.]
Part 2 • Write two Java methods. Second: • static BinaryNode rotate2 (BinaryNode t); [Figure: a tree rooted at t with subtrees X, Y1, Y2, Z, before and after rotate2.]
Red-green quiz static BinaryNode rotate1 (BinaryNode t); static BinaryNode rotate2 (BinaryNode t); class BinaryNode { Comparable element; BinaryNode left, right; … } [Figure: the before-and-after trees for rotate1 (subtrees X, Y, Z) and rotate2 (subtrees X, Y1, Y2, Z).]
How trees lose balance • Suppose that a node n goes out of balance. • Before: node n has subtrees of heights a and b with abs(a-b) ≤ 1. This is the AVL invariant!!!
How trees lose balance • Suppose that a node n goes out of balance. • After: node n has subtrees of heights a and b with abs(a-b) > 1. AVL invariant violated.
How trees lose balance, cont’d • In fact, in this case the left subtree must not have been a leaf (unless the right subtree was empty). • The deepest node in the left subtree has depth 2 greater than the deepest node in the right subtree.
How trees lose balance, cont’d • Two cases to consider: • Inserted into left subtree of left child. • Inserted into right subtree of left child.
How trees lose balance, cont’d • Note that there are two symmetric cases for the right child.
Regaining balance • In an AVL tree, the balance invariant is regained by performing a rotation on out-of-balance nodes. • The rotations require O(d) time, where d is the depth of the deepest out-of-balance node. • Thus, regaining balance via rotations requires O(log N) time, since the depth of an AVL tree with N nodes is O(log N).
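The single rotation • For the case of insertion into left subtree of left child: • Deepest node of X has depth 2 greater than deepest node of Z. • The rotation makes the left child the new root: the depth of X is reduced by 1 and the depth of Z is increased by 1. [Figure: the tree with subtrees X, Y, Z before and after the single rotation.]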
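One plausible reading of the single rotation, written as the rotate1 method from the earlier quiz, is sketched below. The `SingleRotation.Node` type with `int` keys and cached heights is an assumption; the slides leave the actual solution as an exercise.

```java
class SingleRotation {
    /** Hypothetical node that caches the height of its subtree. */
    static class Node {
        final int element;
        Node left, right;
        int height;
        Node(int element, Node left, Node right) {
            this.element = element;
            this.left = left;
            this.right = right;
            this.height = 1 + Math.max(height(left), height(right));
        }
    }

    static int height(Node t) {
        return t == null ? -1 : t.height;
    }

    /**
     * Rotates t with its left child:
     * (k2, (k1, X, Y), Z) becomes (k1, X, (k2, Y, Z)).
     * Assumes t and t.left are non-null.
     */
    static Node rotate1(Node t) {
        Node k1 = t.left;
        t.left = k1.right;      // Y moves under the old root
        k1.right = t;           // the old root becomes the right child
        t.height = 1 + Math.max(height(t.left), height(t.right));
        k1.height = 1 + Math.max(height(k1.left), height(k1.right));
        return k1;              // new root of this subtree
    }
}
```

Only two links change, and the in-order sequence X, k1, Y, k2, Z is the same before and after, so the result is still a BST. The heights must be updated bottom-up, old root first.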
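Double rotation • For the case of insertion into the right subtree of the left child. • Single rotation fails! A single rotation leaves Y at the same depth, so the tree is still out of balance. [Figure: the tree with subtrees X, Y, Z before and after a single rotation.]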
Double rotation • For the case of insertion into the right subtree of the left child. • Right subtree is definitely non-empty and has depth 2 greater than Z, so it can be drawn as a node with subtrees Y1 and Y2. [Figure: the tree with subtrees X, Y1, Y2, Z.]
Double rotation • For the case of insertion into the right subtree of the left child. [Figure: the tree with subtrees X, Y1, Y2, Z before, during, and after the double rotation. Afterwards the root of the old right subtree is the new root, with X and Y1 below its left child and Y2 and Z below its right child.]
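A double rotation can be expressed as two single rotations, first at the left child and then at the root. The sketch below is one plausible reading of the rotate2 quiz method; `DoubleRotation`, its `Node` type, and the helper names are assumptions, and height bookkeeping is left out to keep the link changes visible.

```java
class DoubleRotation {
    /** Hypothetical node; heights are omitted in this sketch. */
    static class Node {
        final int element;
        Node left, right;
        Node(int element, Node left, Node right) {
            this.element = element;
            this.left = left;
            this.right = right;
        }
    }

    /** (k2, (k1, A, B), C) becomes (k1, A, (k2, B, C)). */
    static Node rotateWithLeftChild(Node t) {
        Node k = t.left;
        t.left = k.right;
        k.right = t;
        return k;
    }

    /** (k1, A, (k2, B, C)) becomes (k2, (k1, A, B), C). */
    static Node rotateWithRightChild(Node t) {
        Node k = t.right;
        t.right = k.left;
        k.left = t;
        return k;
    }

    /**
     * (k3, (k1, X, (k2, Y1, Y2)), Z) becomes (k2, (k1, X, Y1), (k3, Y2, Z)).
     * Assumes t, t.left, and t.left.right are non-null.
     */
    static Node rotate2(Node t) {
        t.left = rotateWithRightChild(t.left);  // lift k2 above k1
        return rotateWithLeftChild(t);          // lift k2 above k3
    }
}
```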
Symmetry • And don’t forget that there are two symmetric cases for insertions into the left and right subtrees of the right child.
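Putting the four cases together gives an insert routine along the following lines. This is a sketch under assumptions, not the course's reference solution: `AvlTree`, its `int` keys, the helper names, and the choice to ignore duplicates (instead of throwing DuplicateItemException) are all illustrative, and lazy deletion is left out.

```java
class AvlTree {
    /** Hypothetical AVL node that caches the height of its subtree. */
    static class Node {
        final int element;
        Node left, right;
        int height;                          // 0 for a leaf
        Node(int element) { this.element = element; }
    }

    static int height(Node t) {
        return t == null ? -1 : t.height;
    }

    private static void update(Node t) {
        t.height = 1 + Math.max(height(t.left), height(t.right));
    }

    /** Single rotation for the left-left case. */
    static Node rotateWithLeftChild(Node t) {
        Node k = t.left;
        t.left = k.right;
        k.right = t;
        update(t);
        update(k);
        return k;
    }

    /** Mirror image: single rotation for the right-right case. */
    static Node rotateWithRightChild(Node t) {
        Node k = t.right;
        t.right = k.left;
        k.left = t;
        update(t);
        update(k);
        return k;
    }

    /** Inserts x and rebalances on the way back up; duplicates are ignored. */
    static Node insert(int x, Node t) {
        if (t == null) return new Node(x);
        if (x < t.element) {
            t.left = insert(x, t.left);
            if (height(t.left) - height(t.right) == 2) {
                if (x > t.left.element) {
                    t.left = rotateWithRightChild(t.left);  // right subtree of left child
                }
                t = rotateWithLeftChild(t);
            }
        } else if (x > t.element) {
            t.right = insert(x, t.right);
            if (height(t.right) - height(t.left) == 2) {
                if (x < t.right.element) {
                    t.right = rotateWithLeftChild(t.right); // left subtree of right child
                }
                t = rotateWithRightChild(t);
            }
        }
        update(t);
        return t;
    }
}
```

Inserting 1 through 7 in increasing order, the sequence that turns a naive BST into a chain, yields a tree with root 4 and height 2 here.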
Examples • We don’t need examples! • The rotations clearly restore the AVL representation invariant! • But try to play with the demo at http://www.seanet.com/users/arsen/avltree.html
Binary search trees • Simple binary search trees can have bad behavior for some insertion sequences. • Average case O(log N), worst case O(N). • AVL trees maintain a balance invariant to prevent this bad behavior. • Accomplished via rotations during insert. • Splay trees achieve amortized running time of O(log N). • Accomplished via rotations during find.