DCO20105 Data structures and algorithms • Lecture 9: More on BST • Removal from a BST • Some advanced balanced search trees: AVL tree, 2-3-4 tree, Red-Black tree -- By Rossella Lau
Re-visit on BST
• A BST is a binary tree in which, for every node, all values in the left sub-tree are less than the node's value and all values in the right sub-tree are greater
• Search and insert run in O(log n) time in the optimal case, when the BST is dense (close to complete) and its height is about log n
• In the worst case, when the BST is sparse (it degenerates toward a linked list), both take O(n) time
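The optimal and worst cases above can be observed with a small experiment. The following is a minimal sketch, not the lecture's BSTree<T>: Node, contains, insert and height are hypothetical names introduced only for illustration, and nodes are never freed to keep the sketch short.

```cpp
// Hypothetical minimal node; the lecture's BSTNode<T> is not shown in full.
template <typename T>
struct Node {
    T item;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(const T& v) : item(v) {}
};

// Returns true if target is stored in the tree rooted at root.
// One comparison per level, so the cost is proportional to the height.
template <typename T>
bool contains(const Node<T>* root, const T& target) {
    while (root) {
        if (target < root->item) root = root->left;
        else if (root->item < target) root = root->right;
        else return true;
    }
    return false;
}

// Inserts value at the empty link where the search ends.
// Returns false if the value is already present.
template <typename T>
bool insert(Node<T>*& root, const T& value) {
    Node<T>** link = &root;
    while (*link) {
        if (value < (*link)->item) link = &(*link)->left;
        else if ((*link)->item < value) link = &(*link)->right;
        else return false;
    }
    *link = new Node<T>(value);
    return true;
}

// Height counted in nodes: 0 for an empty tree.
template <typename T>
int height(const Node<T>* n) {
    if (!n) return 0;
    int l = height(n->left);
    int r = height(n->right);
    return 1 + (l > r ? l : r);
}
```

Inserting 4, 2, 6, 1, 3, 5, 7 in that order gives a dense tree of height 3, while inserting 1 to 7 in ascending order gives a "linear" tree of height 7.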
Some facts of a BST
• The in-order traversal of a binary search tree visits the values in sorted order
• Inserting n items into a BST and then traversing it in-order can therefore be treated as a sort method (tree sort); this is another O(n log n) sort algorithm in the optimal case
• The minimum value of a BST is at the left-most node, which has no left child but may have a right child
• BSTNode<T> *cur = root; // assume size() >= 1
• while (cur->left) cur = cur->left;
• return cur->item;
• The maximum value of a BST is at the right-most node
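The tree sort idea above can be sketched as follows. This is an illustrative sketch only: Node, insert, inorder, destroy and treeSort are hypothetical names, the item type is fixed to int, and duplicates are sent to the right sub-tree so that they survive the sort.

```cpp
#include <vector>

struct Node {
    int item;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : item(v) {}
};

// Plain (unbalanced) BST insertion; equal values go to the right.
void insert(Node*& root, int value) {
    if (!root) root = new Node(value);
    else if (value < root->item) insert(root->left, value);
    else insert(root->right, value);
}

// In-order traversal: left sub-tree, node, right sub-tree.
void inorder(const Node* n, std::vector<int>& out) {
    if (!n) return;
    inorder(n->left, out);
    out.push_back(n->item);
    inorder(n->right, out);
}

void destroy(Node* n) {
    if (!n) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}

// Builds a BST from the input, then reads it back in sorted order.
std::vector<int> treeSort(const std::vector<int>& values) {
    Node* root = nullptr;
    for (int v : values) insert(root, v);
    std::vector<int> sorted;
    sorted.reserve(values.size());
    inorder(root, sorted);
    destroy(root);
    return sorted;
}
```

Because the tree is not balanced, input that is already sorted drives this sketch to its O(n^2) worst case.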
BST removal
• After a node is removed, the resulting tree must still be a BST
• No node may have more than two children
• left sub-tree < root < right sub-tree must hold at every node
• Three situations must be considered for the node (or sub-tree) being removed
• A leaf
• A node with a single child
• A full node, which has two children
Deletion of an item which is a leaf
• Delete 22: When the item is found, delete it!
[Figure: the BST with nodes 50, 28, 75, 22, 35, 40, 87, 95, 90, before and after deleting the leaf 22]
The algorithm for deletion of a leaf
bool BSTree<T>::remove (T const & target) {
  BSTNode<T> *& contentAt (find (target)); // reference to the link that points to the target node
  if (! contentAt ) return false; // target is not in the tree
  BSTNode<T> *forDelete (contentAt);
  if (contentAt->isLeaf()) contentAt = 0; // detach the leaf from its parent
  forDelete->left = forDelete->right = 0;
  delete forDelete;
  countNodes--;
  return true;
}
bool BSTNode<T>::isLeaf(void) {return !left && !right;}
Deletion of an item which has one child
• Delete 75: When the item is found, put its only child in its place
[Figure: the BST before and after deleting 75]
The algorithm for deletion of single child node
bool BSTree<T>::remove (T const & target) {
  BSTNode<T> *& contentAt (find (target)); // reference to the link that points to the target node
  if (! contentAt ) return false; // target is not in the tree
  BSTNode<T> *forDelete (contentAt);
  if (!contentAt->isLeaf() && !contentAt->isFull() ) // with single child
    contentAt = contentAt->left ? contentAt->left : contentAt->right; // the only child takes the node's place
  forDelete->left = forDelete->right = 0;
  delete forDelete;
  countNodes--;
  return true;
}
bool BSTNode<T>::isFull(void) {return left && right; }
Deletion of an item which has two children
• Delete 50:
• Theory: the in-order successor of a node with two children has at most one child, on its right hand side; the in-order predecessor has at most one child, on its left hand side
• When the item is found at node n, replace n's data with the data of n's in-order successor s or predecessor p, then delete s or p instead; s or p is either a leaf or a node with a single child!
[Figure: the BST before deleting 50, and the two alternative results, using the in-order successor or the in-order predecessor]
The algorithm for deletion of an internal node
BSTNode<T> *& BSTree<T>::prepareRemoval( BSTNode<T> *& contentAt) {
  if (contentAt->isFull()) { // two children: move the target item down into its successor
    BSTNode<T> *& succ ( successor ( contentAt) );
    swap ( succ->getItem(), contentAt->getItem() ); // assumes getItem() returns a reference
    return succ;
  }
  else return contentAt;
}
bool BSTree<T>::remove (T const & target) {
  BSTNode<T> *& contentAt (find (target));
  if (! contentAt ) return false;
  BSTNode<T> *& forDelete(prepareRemoval(contentAt));
  BSTNode<T> *realDelete (forDelete);
  …… // deletion of a leaf or a single child’s parent
}
The algorithm for finding an inorder successor
BSTNode<T> *& BSTree<T>::successor (BSTNode<T> const *p) {
  // Assume that the input node (p) has two children
  BSTNode<T> *it (const_cast<BSTNode<T>*> (p));
  if (it->right->left) { // successor is the left-most node of the right sub-tree
    it = it->right;
    while (it->left->left) it = it->left; // stop at the parent of the left-most node
    return it->left;
  }
  else // successor is the right child
    return it->right;
}
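The three removal cases can be put together in one routine. The following is a minimal sketch under stated assumptions, not the lecture's BSTree<T>: Node, findLink, insert, removeItem and inorder are hypothetical names, the item type is fixed to int, and the successor's item is copied into the target node rather than swapped.

```cpp
#include <vector>

struct Node {
    int item;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : item(v) {}
};

// Returns a reference to the link (the root pointer or a child pointer)
// that points to target, or to the null link where the search ended.
Node*& findLink(Node*& link, int target) {
    if (!link || link->item == target) return link;
    return findLink(target < link->item ? link->left : link->right, target);
}

bool insert(Node*& root, int value) {
    Node*& link = findLink(root, value);
    if (link) return false;  // duplicate
    link = new Node(value);
    return true;
}

bool removeItem(Node*& root, int target) {
    Node*& link = findLink(root, target);
    if (!link) return false;  // not found
    Node* victim = link;
    if (victim->left && victim->right) {
        // Two children: locate the link to the in-order successor,
        // the left-most node of the right sub-tree.
        Node** succ = &victim->right;
        while ((*succ)->left) succ = &(*succ)->left;
        victim->item = (*succ)->item;  // overwrite the target's data
        Node* s = *succ;
        *succ = s->right;              // successor has at most a right child
        delete s;
    } else {
        // Leaf or single child: the only child (or null) takes its place.
        link = victim->left ? victim->left : victim->right;
        delete victim;
    }
    return true;
}

void inorder(const Node* n, std::vector<int>& out) {
    if (!n) return;
    inorder(n->left, out);
    out.push_back(n->item);
    inorder(n->right, out);
}
```

Working on a reference to the parent's link, as the lecture's contentAt does, means the root needs no special case.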
Notes on const_cast
• C++ supports the following type cast operators:
• const_cast: to cast away the constant attribute
• In the previous example, p is passed as a pointer pointing to a constant object
• However, successor() returns a modifiable reference to a child pointer stored in a node below p, and the compiler does not allow such a reference to be obtained through a pointer to a constant object (a read-only step such as it = it->left would be allowed)
• To return that reference, const_cast is needed to temporarily cast away the constant attribute of p
• static_cast: the new way to do the former type cast
• Former way: doubleResult = (double) intA / intB;
• C++: doubleResult = static_cast<double> (intA) / intB;
• The other two, which are not encouraged: dynamic_cast, reinterpret_cast
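The two casts can be seen side by side in a small sketch. Node, minimumItem, leftLink and ratio are hypothetical names used only for illustration; they are not part of the lecture's classes.

```cpp
struct Node {
    int item;
    Node* left;
    Node* right;
};

// Read-only traversal through a pointer to const needs no cast.
int minimumItem(const Node* p) {
    while (p->left) p = p->left;
    return p->item;
}

// Handing out a modifiable reference to a child link does need one.
// Writing through the result is only well defined if the node itself
// was not declared const.
Node*& leftLink(const Node* p) {
    Node* writable = const_cast<Node*>(p);
    return writable->left;
}

// static_cast converts a to double first, so the division is not an
// integer division.
double ratio(int a, int b) {
    return static_cast<double>(a) / b;
}
```

With these definitions, ratio(1, 2) yields 0.5, whereas the plain integer expression 1 / 2 yields 0.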
Exercises on BST removal • BST removal: • Ford’s exercise: 10:26: delete 30, 80, 25; 10:32 • Other BST removal related functions • find a predecessor
Complexity for remove()
• The main logic of remove() is still find()
• In addition, it requires the function successor() to search for an in-order successor; successor() only walks down from the node that was found, so its complexity is less than or equal to that of find()
• Therefore, the big-O of remove() is the same as that of find(): O(log n) in optimal cases and O(n) in the worst case
Balanced Binary Tree
• To avoid a "linear" BST and keep the optimal complexity, the problem becomes how to keep the binary tree balanced
• A binary search tree balanced in this way is called an AVL tree
• It was invented by two Soviet mathematicians: Adel'son-Vel'skii and Landis
• First, the height of a tree is defined as its depth, the length of the longest path from the root down to a leaf
• Then, a balanced binary tree is a binary tree in which the heights of the two sub-trees of every node never differ by more than 1
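The balance condition can be checked in a single pass over the tree. This is a minimal sketch with hypothetical names (Node, checkedHeight, isAVL); height is counted in nodes, with 0 for an empty sub-tree, and only the height condition is tested, not the BST ordering.

```cpp
#include <algorithm>
#include <cstdlib>

struct Node {
    int item;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : item(v) {}
};

// Returns the height of the sub-tree, or -1 as soon as some node is found
// whose two sub-trees differ in height by more than 1.
int checkedHeight(const Node* n) {
    if (!n) return 0;
    int l = checkedHeight(n->left);
    if (l < 0) return -1;
    int r = checkedHeight(n->right);
    if (r < 0) return -1;
    if (std::abs(l - r) > 1) return -1;
    return 1 + std::max(l, r);
}

bool isAVL(const Node* root) {
    return checkedHeight(root) >= 0;
}
```

A root with one child on each side passes the check, while a chain of three nodes linked only through right pointers fails it at the top node.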
[Figure: two example trees, labelled "AVL tree" and "Non AVL tree"]
Examples of AVL BST and non-AVL BST
Efficiency concerns on an AVL BST
• There are efficient algorithms to maintain a binary tree as an AVL tree
• Once the position has been found, restoring the AVL property after inserting or removing a node uses rotations that cost O(1) each: an insertion needs at most one single or double rotation, while a removal may need one at each level on the way back to the root, so the whole operation stays O(log n)
• Ford: Supplementary in the book web site
• Goodrich et al.: Chapter 9
• Collins: Chapter 9
• It requires more information in each node: the height of the node
• With an AVL BST, the search process on the BST is always optimal
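How the stored height drives rebalancing can be sketched for insertion. This is one common recursive formulation, not necessarily the one in the cited textbooks; AvlNode, heightOf, update, rotateRight, rotateLeft, rebalance and insert are hypothetical names, and removal is left out.

```cpp
#include <algorithm>

struct AvlNode {
    int item;
    int height = 1;  // height of the sub-tree rooted here, in nodes
    AvlNode* left = nullptr;
    AvlNode* right = nullptr;
    explicit AvlNode(int v) : item(v) {}
};

int heightOf(const AvlNode* n) { return n ? n->height : 0; }

void update(AvlNode* n) {
    n->height = 1 + std::max(heightOf(n->left), heightOf(n->right));
}

// Each rotation rewires a constant number of pointers: O(1).
void rotateRight(AvlNode*& link) {
    AvlNode* pivot = link->left;
    link->left = pivot->right;
    pivot->right = link;
    update(link);
    update(pivot);
    link = pivot;
}

void rotateLeft(AvlNode*& link) {
    AvlNode* pivot = link->right;
    link->right = pivot->left;
    pivot->left = link;
    update(link);
    update(pivot);
    link = pivot;
}

// Restores the balance condition at one node, using a single rotation
// or, for the inner (left-right / right-left) cases, a double rotation.
void rebalance(AvlNode*& link) {
    update(link);
    int balance = heightOf(link->left) - heightOf(link->right);
    if (balance > 1) {
        if (heightOf(link->left->left) < heightOf(link->left->right))
            rotateLeft(link->left);
        rotateRight(link);
    } else if (balance < -1) {
        if (heightOf(link->right->right) < heightOf(link->right->left))
            rotateRight(link->right);
        rotateLeft(link);
    }
}

// Ordinary BST insertion, followed by rebalancing on the way back up.
void insert(AvlNode*& link, int value) {
    if (!link) { link = new AvlNode(value); return; }
    if (value < link->item) insert(link->left, value);
    else if (link->item < value) insert(link->right, value);
    else return;  // duplicate: nothing to do
    rebalance(link);
}
```

Inserting 1 to 7 in ascending order, which produces a "linear" tree in a plain BST, here ends with 4 at the root and a height of 3.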
B-Tree
• A node storing only one item is not efficient, especially considering that I/O is based on “blocks” and a block usually stores about 512 bytes
• A B-Tree is an extension of a balanced binary tree
• A B-tree of order n allows a node to have up to n children and to store up to n-1 items
• When each node is treated as an I/O block, searching a B-tree involves only as many block I/Os as the tree has levels, plus a search within each node; the items of a node are stored in a vector, so binary search can be applied
A sample B-Tree of Order 5
A: [367]
  B: [103, 218]
    D: [17, 87]
    E: [119, 165, 198]
    F: [245, 272, 330]
  C: [492, 661, 815, 912]
    G: [408, 435]
    H: [524, 602]
    I: [686, 770, 799]
    J: [832, 871]
    K: [956, 968, 975, 991]
Searching on a B-Tree
• Search for 832
1. Get block A; linear or binary search on the key values; 832 > 367, so go to block C along the right pointer of 367
2. Get block C; 832 is between 815 and 912, so go to block J along the pointer between 815 and 912
3. Get block J; search for 832: found!
• Search for 65
Get block A, then B, then D; 65 does not exist in D: not found!
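The search walk-through above can be sketched in code. BTreeNode and search are hypothetical names, each node stands for one I/O block, and blocksRead only counts the nodes visited; no real disk I/O is modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct BTreeNode {
    std::vector<int> keys;             // sorted keys stored in this block
    std::vector<BTreeNode*> children;  // empty for a leaf, else keys.size() + 1
};

// Returns true if target is found; blocksRead receives the number of
// nodes (blocks) fetched on the way.
bool search(const BTreeNode* node, int target, int& blocksRead) {
    blocksRead = 0;
    while (node) {
        ++blocksRead;
        // Binary search within the block for the first key >= target.
        auto it = std::lower_bound(node->keys.begin(), node->keys.end(), target);
        if (it != node->keys.end() && *it == target) return true;
        if (node->children.empty()) return false;  // leaf: nowhere to descend
        // Follow the pointer that sits just before that key.
        std::size_t index = static_cast<std::size_t>(it - node->keys.begin());
        node = node->children[index];
    }
    return false;
}
```

On the sample tree of order 5, a search for 832 reads blocks A, C and J, and a search for 65 reads A, B and D before reporting that the key is absent; both cost three block reads, one per level.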
2-3-4 Tree
• A special case of a B-Tree is the 2-3-4 tree, a B-tree of order 4, in which a node can have up to four children and store up to 3 items
• Ford’s slides: Chapter 12: 10-15
• Ford’s exercises: Chapter 12: 26(b)
• Draw the 2-3-4 tree built when you insert the keys from E A S Y Q U T I O N into an initially empty tree.
Red-Black Tree
• Implementing a B-Tree is complicated; implementing a 2-3-4 tree is easier but still complicated
• Using a Red-Black tree to implement (represent) a 2-3-4 tree is easier
• A Red-Black tree is a binary search tree in which every node is coloured RED or BLACK
• The root is BLACK
• A RED parent never has a RED child
• Every path from the root to an empty sub-tree has the same number of BLACK nodes
• It is close to a balanced tree and easier to construct
• Ford’s slides: 12:16-17; exercises: 12:26(c)
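The three colour rules listed above can be verified with a short recursive check. This is a sketch with hypothetical names (Color, RBNode, blackHeight, isRedBlack); it tests only the colour rules, not the BST ordering, and it does not show insertion.

```cpp
enum class Color { Red, Black };

struct RBNode {
    int item;
    Color color;
    RBNode* left = nullptr;
    RBNode* right = nullptr;
    RBNode(int v, Color c) : item(v), color(c) {}
};

bool isRed(const RBNode* n) { return n && n->color == Color::Red; }

// Returns the number of BLACK nodes on every path from n down to an
// empty sub-tree, or -1 if a rule is violated somewhere below n.
int blackHeight(const RBNode* n) {
    if (!n) return 0;
    // A RED parent never has a RED child.
    if (isRed(n) && (isRed(n->left) || isRed(n->right))) return -1;
    int l = blackHeight(n->left);
    int r = blackHeight(n->right);
    // Both sides must be valid and contain the same number of BLACK nodes.
    if (l < 0 || r < 0 || l != r) return -1;
    return l + (n->color == Color::Black ? 1 : 0);
}

bool isRedBlack(const RBNode* root) {
    if (!root) return true;
    return root->color == Color::Black && blackHeight(root) >= 0;
}
```

A BLACK root with two RED children passes; hanging a RED child under one of the RED nodes, or recolouring only one child BLACK, makes the check fail.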
Summary
• Construction of a BST is also a sorting method, which is O(n log n) in optimal cases
• The in-order successor/predecessor of a node with two children must be either a leaf or a node with a single child
• Erasing a node from a BST can therefore be reduced to two cases: deleting a leaf and deleting a node with a single child
• To avoid the worst case of a BST, construction should ensure that the BST stays balanced (AVL)
• An extension of a BST is the B-Tree, and a special case of it is the 2-3-4 tree
• Using a Red-Black tree to implement/represent a 2-3-4 tree greatly reduces the complexity of the implementation
Reference
• Ford: 10.5-6, 12.6-7
• Data Structures and Algorithms in C++ by Michael T. Goodrich, Roberto Tamassia, David M. Mount: Chapter 9