CHAPTER 4
Trees
• The linear access time of linked lists is prohibitive for large amounts of input.
• The tree ADT supports most operations in O(log n) time on average. A simple modification to this ADT guarantees the O(log n) bound in the worst case, and a second modification gives O(log n) running time per operation over a long sequence of operations.
• The goal of this chapter is to see:
  • How trees are used to implement the file system of several popular operating systems.
  • How trees can be used to evaluate arithmetic expressions.
  • How trees are used to support searching operations in O(log n) worst-case bounds.
  • How to implement these operations when data is stored on a disk.
A tree can be defined in several ways. One natural way is to define it recursively. A tree is a collection of nodes. The collection can be empty; otherwise, it consists of a distinguished node r, called the root, and zero or more (sub)trees T1, T2, T3, …, Tk, each of whose roots is connected by a directed edge from r. The root of each subtree is said to be a child of r, and r is the parent of each subtree root.
Tree ADT: Mixing Dynamic Ordering and Recursion
• A generic tree, using the recursive definition.
• A tree is a collection of n nodes, one of which is the root, connected by n-1 edges (every node except the root has exactly one edge to its parent).
• Leaves: nodes with no children.
• Siblings: nodes with the same parent.
• Path: a sequence of nodes n1, n2, n3, …, nk such that ni is the parent of ni+1 for 1 ≤ i < k.
• Length: the length of a path is the number of edges on the path, namely k-1. There is a path of length zero from every node to itself.
• Depth: the depth of ni is the length of the unique path from the root to ni. Thus, the root is at depth 0.
• Height: the height of ni is the length of the longest path from ni to a leaf. Thus, all leaves are at height 0.
Example (for the tree in the figure): E is at depth 1 and height 2; F is at depth 1 and height 1. The height of the tree is 3.
The height of a tree is the height of its root, and the depth of a tree is the depth of its deepest leaf, so the height of a tree always equals its depth.
[Figure: example tree annotated with depth and height, marking the siblings and the children of E]
Ancestor and descendant: If there is a path from n1 to n2, then n1 is an ancestor of n2 and n2 is a descendant of n1. If n1 is not the same as n2, then n1 is a proper ancestor of n2 and n2 is a proper descendant of n1.
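The definitions of height and depth translate directly into short recursive functions. The following is a minimal sketch, not code from the slides: the `Node` type (a node that simply owns a vector of children) and the function names `height` and `deepestLeaf` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical general tree node: only the shape matters here.
struct Node
{
    std::vector<Node> children;
};

// Height: number of edges on the longest path from node down to a leaf.
int height(const Node &node)
{
    int tallestChild = -1;  // chosen so that a leaf ends up with height 0
    for (const Node &child : node.children)
        tallestChild = std::max(tallestChild, height(child));
    return tallestChild + 1;
}

// Depth of the deepest leaf below node, where node itself sits at `depth`.
int deepestLeaf(const Node &node, int depth = 0)
{
    assert(depth >= 0);
    int deepest = depth;
    for (const Node &child : node.children)
        deepest = std::max(deepest, deepestLeaf(child, depth + 1));
    return deepest;
}
```

Called on the root, both functions return the same number, which is the statement that the height of a tree equals its depth.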
Implementation of Trees
One way of implementing a tree is to have in each node, besides its data, a pointer to each child of the node.
Problem? The number of children per node can vary greatly and is not known in advance. Making the children direct links in the data structure would waste too much space.
Solution: Keep the children of each node in a linked list of tree nodes.

    struct TreeNode
    {
        Object    element;
        TreeNode *firstChild;   // leftmost child, or NULL if this node is a leaf
        TreeNode *nextSibling;  // next child of the same parent, or NULL
    };
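As a rough illustration of how the two links are used, the sketch below builds a small tree and walks the list of children of a node. It assumes `Object` is `std::string`, and the helpers `addChild` and `childrenOf` are hypothetical names that do not appear in the slides.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same layout as the slide's TreeNode, with Object fixed to std::string.
struct TreeNode
{
    std::string element;
    TreeNode *firstChild = nullptr;   // leftmost child, or nullptr for a leaf
    TreeNode *nextSibling = nullptr;  // next child of the same parent, or nullptr
};

// Append child as the last child of parent by walking the sibling list.
void addChild(TreeNode &parent, TreeNode &child)
{
    assert(child.nextSibling == nullptr);  // child must not already be linked
    if (parent.firstChild == nullptr) {
        parent.firstChild = &child;
        return;
    }
    TreeNode *last = parent.firstChild;
    while (last->nextSibling != nullptr)
        last = last->nextSibling;
    last->nextSibling = &child;
}

// Collect the elements of all children of node, left to right.
std::vector<std::string> childrenOf(const TreeNode &node)
{
    std::vector<std::string> names;
    for (const TreeNode *c = node.firstChild; c != nullptr; c = c->nextSibling)
        names.push_back(c->element);
    return names;
}
```

Each node stores exactly two pointers no matter how many children it has, which is the space saving the slide is after; the price is that reaching the k-th child takes k steps along the sibling list.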
[Figure: first child/next sibling representation of a tree. Legend: firstChild link, nextSibling link, node with neither child nor sibling]
Exercise: Find the first child/next sibling representation of the following tree.
[Figure: tree with nodes 12, 7, 18, 2, 15, 21, 22, 8, 1]
Tree Traversals with an Application
There are many applications of trees. One of the most popular is the directory structure in many common operating systems, such as UNIX, VAX/VMS, and DOS.
Two files in two different directories can share the same name. Why? They have different paths from the root and thus different pathnames.
Suppose we want to list the names of all of the files in the directory.

    void FileSystem::listAll( int depth = 0 ) const
    {
        printName( depth );                 // Print the name of the object
        if( isDirectory( ) )
            for each file C in this directory (for each child)
                C.listAll( depth + 1 );
    }

The recursive function listAll needs to be started with a depth of 0, to signify no indenting for the root. The name of the file object is printed with depth tabs in front of it. If the entry is a directory, then we process all of its children recursively, one by one. These children are one level deeper and thus are indented by one extra tab.
This traversal strategy is known as a preorder traversal. In a preorder traversal, work at a node is performed before (pre) its children are processed.
Suppose we have n file names to be output. What is the running time? O(n), because a constant amount of work is done for each name.
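The pseudocode above can be made concrete with an in-memory stand-in for the file system. This is only a sketch under assumptions: the `FileEntry` struct and the choice to append to a string instead of printing are illustrative and are not part of the slides' `FileSystem` class.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a file-system entry.
struct FileEntry
{
    std::string name;
    bool isDirectory;
    std::vector<FileEntry> children;  // empty for ordinary files
};

// Preorder: emit this entry's name first, then recurse into its children.
void listAll(const FileEntry &entry, std::string &out, int depth = 0)
{
    assert(depth >= 0);
    out += std::string(depth, '\t') + entry.name + '\n';
    if (entry.isDirectory)
        for (const FileEntry &child : entry.children)
            listAll(child, out, depth + 1);
}
```

Every entry is visited once and does a constant amount of work apart from writing its own indentation, which mirrors the O(n) argument for n names.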
Postorder traversal method
In this method, the work at a node is performed after (post) its children are evaluated.
[Figure: directory tree labeled with the number of disk blocks taken by each file]
Directories are files, so they have sizes too. What is the total number of blocks used by all the files in the tree?
The most natural way is to first find the total number of blocks contained in each subdirectory. For instance: /usr/mark (30), /usr/alex (9), and /usr/bill (32). The total is then the sum of these (71) plus the one block used by /usr itself, for a total of 72.
The following function implements this strategy (total number of blocks):

    int FileSystem::size( ) const
    {
        int totalSize = sizeOfThisFile( );
        if( isDirectory( ) )
            for each file C in this directory (for each child)
                totalSize += C.size( );
        return totalSize;
    }

If the current object is not a directory, then size merely returns the number of blocks the object itself uses. Otherwise, the number of blocks used by the directory is added to the number of blocks (recursively) found in all of its children.
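A runnable version of the same postorder idea, again using a hypothetical in-memory `FileEntry` type in place of the slides' `FileSystem` class; the function name `totalBlocks` is also an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical entry: `blocks` is the space used by the entry itself.
struct FileEntry
{
    std::string name;
    int blocks;
    std::vector<FileEntry> children;  // empty for ordinary files
};

// Postorder: the total for an entry is known only after all of its
// children have been totalled.
int totalBlocks(const FileEntry &entry)
{
    assert(entry.blocks >= 0);
    int total = entry.blocks;
    for (const FileEntry &child : entry.children)
        total += totalBlocks(child);
    return total;
}
```

If the three subdirectories from the example are collapsed into single entries of 30, 9, and 32 blocks under a one-block /usr, the function returns 72, matching the hand computation.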
Binary Trees
A binary tree is a tree in which no node can have more than two children. A binary tree consists of a root and two subtrees, TL and TR, both of which could be empty.
A property of binary trees that is sometimes important is that the depth of an average binary tree is considerably smaller than n. An analysis shows that the average depth is O(√n). For a special type of binary tree, namely the binary search tree, the average value of the depth is O(log n).
Unfortunately, in the worst case the depth can be as large as n-1.
Implementation
Since a node in a binary tree has at most two children, we can keep direct pointers to them. The declaration of tree nodes is similar in structure to that for doubly linked lists: a node consists of the key information plus two pointers (left and right) to other nodes.

    // Binary tree declarations
    struct BinaryNode
    {
        Object      element;  // The data in the node
        BinaryNode *left;     // Left child
        BinaryNode *right;    // Right child
    };
Many of the rules that apply to linked lists apply to trees as well.
• An insertion is performed by making a call to new to create a node.
• A node can be freed after deletion by calling delete.
• Trees are drawn as circles connected by lines because trees are actually graphs.
• We do not explicitly draw NULL pointers when referring to trees, because every binary tree with n nodes would require n+1 NULL pointers.
• One of the most important uses of trees (in addition to searching) is in the design of compilers.
Expression Trees
Following is an example of an expression tree. The leaves of an expression tree are operands, such as constants or variable names. The other nodes contain operators.
This particular tree happens to be binary. Why? All the operations are binary.
This is a simple case. In general, is it possible for a node to have a different number of children? Yes: a node can have only one child, or more than two children.
Examples: x + y + z (one operator with three operands) and -n (unary minus).
We can evaluate an expression tree, T, by applying the operator at the root to the values obtained by recursively evaluating the left and right subtrees.
In our example:
The left subtree evaluates: a + (b*c)
The right subtree evaluates: ((d*e) + f) * g
The entire tree represents: (a + (b*c)) + (((d*e) + f) * g)
Three strategies:
1) Inorder traversal (left, node, right): recursively produce a parenthesized left expression, then print the operator at the root, and finally recursively produce a parenthesized right expression. In our example: (a + (b*c)) + (((d*e) + f) * g)
2) Postorder traversal (left, right, node): recursively print the left subtree, then the right subtree, and then the operator. In our example: a b c * + d e * f + g * +
3) Preorder traversal (node, left, right): print the operator first and then recursively print the left and right subtrees. In our example: + + a * b c * + * d e f g
How do we construct expression trees?
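The three traversals can be sketched as three small recursive functions over the example tree. The `ExprNode` type and the helpers `leaf` and `op` are assumptions made for this illustration, and the sketch handles binary operators only. Note that this simple `infix` also wraps the whole expression in one extra pair of parentheses.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression-tree node; leaves hold operands, others operators.
struct ExprNode
{
    std::string token;
    std::unique_ptr<ExprNode> left, right;
};

std::unique_ptr<ExprNode> leaf(const std::string &name)
{
    std::unique_ptr<ExprNode> n(new ExprNode);
    n->token = name;
    return n;
}

std::unique_ptr<ExprNode> op(const std::string &symbol,
                             std::unique_ptr<ExprNode> l,
                             std::unique_ptr<ExprNode> r)
{
    assert(l && r);  // binary operators only in this sketch
    std::unique_ptr<ExprNode> n = leaf(symbol);
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

bool isLeaf(const ExprNode &t) { return !t.left && !t.right; }

// Left subtree, right subtree, then the operator.
std::string postfix(const ExprNode &t)
{
    if (isLeaf(t)) return t.token;
    return postfix(*t.left) + " " + postfix(*t.right) + " " + t.token;
}

// Operator first, then left and right subtrees.
std::string prefix(const ExprNode &t)
{
    if (isLeaf(t)) return t.token;
    return t.token + " " + prefix(*t.left) + " " + prefix(*t.right);
}

// Parenthesized left, operator, parenthesized right.
std::string infix(const ExprNode &t)
{
    if (isLeaf(t)) return t.token;
    return "(" + infix(*t.left) + " " + t.token + " " + infix(*t.right) + ")";
}
```

Building the example as `op("+", op("+", leaf("a"), op("*", leaf("b"), leaf("c"))), op("*", op("+", op("*", leaf("d"), leaf("e")), leaf("f")), leaf("g")))` and passing it to `postfix` and `prefix` reproduces the two strings given in strategies 2 and 3.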
Algorithm to convert a postfix expression into an expression tree
Suppose the input is: a b + c d e + * *
The first two symbols are operands, so we create one-node trees and push pointers to them onto a stack.
Next, a '+' is read, so the two pointers to trees are popped, a new tree is formed with '+' as its root, and a pointer to it is pushed onto the stack.
Next, c, d, and e are read. A one-node tree is created for each, and a pointer to the corresponding tree is pushed onto the stack.
Now a '+' is read, so the top two trees (d and e) are merged.
Continuing, a '*' is read, so we pop two tree pointers and form a new tree with '*' as its root.
Finally, the last symbol is read, the two remaining trees are merged, and a pointer to the final tree is left on the stack.
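The stack-based construction walked through above can be sketched as follows. The assumptions are that tokens are separated by whitespace, that the only operators are the binary +, -, *, and /, and that the input is well formed; `ExprNode`, `isOperator`, `buildFromPostfix`, and `infix` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <stack>
#include <string>
#include <utility>

struct ExprNode
{
    std::string token;
    std::unique_ptr<ExprNode> left, right;
};

bool isOperator(const std::string &tok)
{
    return tok == "+" || tok == "-" || tok == "*" || tok == "/";
}

// Operands become one-node trees; an operator pops two trees and
// becomes their new root. The first tree popped is the right operand.
std::unique_ptr<ExprNode> buildFromPostfix(const std::string &postfix)
{
    std::stack<std::unique_ptr<ExprNode>> trees;
    std::istringstream in(postfix);
    std::string tok;
    while (in >> tok) {
        std::unique_ptr<ExprNode> node(new ExprNode);
        node->token = tok;
        if (isOperator(tok)) {
            assert(trees.size() >= 2);  // well-formed input assumed
            node->right = std::move(trees.top());
            trees.pop();
            node->left = std::move(trees.top());
            trees.pop();
        }
        trees.push(std::move(node));
    }
    assert(trees.size() == 1);  // exactly one tree must remain
    std::unique_ptr<ExprNode> root = std::move(trees.top());
    trees.pop();
    return root;
}

// Fully parenthesized infix form, used here only to inspect the result.
std::string infix(const ExprNode &t)
{
    if (!t.left && !t.right) return t.token;
    return "(" + infix(*t.left) + " " + t.token + " " + infix(*t.right) + ")";
}
```

For the input `a b + c d e + * *`, the resulting tree has '*' at the root and reads back in infix form as (a + b) * (c * (d + e)).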
The Search Tree ADT: Binary Search Trees
A very important application of binary trees is their use in searching. Let's assume that each node in the tree is assigned a key value and that the keys are distinct.
What makes a binary tree a binary search tree? For every node X in the tree, the values of all the keys in its left subtree are smaller than the key value in X, and the values of all the keys in its right subtree are larger than the key value in X.
This means that all the elements in the tree can be ordered in some consistent manner.
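The ordering property can be checked mechanically. The sketch below assumes integer keys and a plain `BinaryNode` struct; the function `isSearchTree` is a hypothetical helper, not part of the class shown later. It passes down the tightest lower and upper bounds that every key in a subtree must respect, because comparing a node only with its immediate children is not sufficient.

```cpp
#include <cassert>

struct BinaryNode
{
    int element;
    BinaryNode *left;
    BinaryNode *right;
};

// low and high point to the keys that bound this subtree from below and
// above; a null pointer means "no bound on that side".
bool isSearchTree(const BinaryNode *t,
                  const int *low = nullptr,
                  const int *high = nullptr)
{
    if (t == nullptr)
        return true;
    if (low != nullptr && !(*low < t->element))
        return false;
    if (high != nullptr && !(t->element < *high))
        return false;
    return isSearchTree(t->left, low, &t->element)
        && isSearchTree(t->right, &t->element, high);
}
```

A tree in which 7 is the right child of 4, while 4 lies in the left subtree of 6, is rejected even though 7 is larger than its own parent.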
[Figure: two binary trees, one labeled "Binary search tree" and the other "Not a binary search tree"]
Binary Search Tree Class Skeleton

template <class Comparable>
class BinarySearchTree
{
  public:
    explicit BinarySearchTree( const Comparable & notFound );
    BinarySearchTree( const BinarySearchTree & rhs );
    ~BinarySearchTree( );

    const Comparable & findMin( ) const;
    const Comparable & findMax( ) const;
    const Comparable & find( const Comparable & x ) const;
    bool isEmpty( ) const;
    void printTree( ) const;

    void makeEmpty( );
    void insert( const Comparable & x );
    void remove( const Comparable & x );

    const BinarySearchTree & operator=( const BinarySearchTree & rhs );

  private:
    BinaryNode<Comparable> *root;
    const Comparable ITEM_NOT_FOUND;

    const Comparable & elementAt( BinaryNode<Comparable> *t ) const;

    void insert( const Comparable & x, BinaryNode<Comparable> * & t ) const;
    void remove( const Comparable & x, BinaryNode<Comparable> * & t ) const;
    BinaryNode<Comparable> * findMin( BinaryNode<Comparable> *t ) const;
    BinaryNode<Comparable> * findMax( BinaryNode<Comparable> *t ) const;
    BinaryNode<Comparable> * find( const Comparable & x, BinaryNode<Comparable> *t ) const;
    void makeEmpty( BinaryNode<Comparable> * & t ) const;
    void printTree( BinaryNode<Comparable> *t ) const;
    BinaryNode<Comparable> * clone( BinaryNode<Comparable> *t ) const;
};
/**
 * Construct the tree.
 */
template <class Comparable>
BinarySearchTree<Comparable>::BinarySearchTree( const Comparable & notFound )
  : root( NULL ), ITEM_NOT_FOUND( notFound )
{
}

/**
 * Copy constructor.
 */
template <class Comparable>
BinarySearchTree<Comparable>::
BinarySearchTree( const BinarySearchTree<Comparable> & rhs )
  : root( NULL ), ITEM_NOT_FOUND( rhs.ITEM_NOT_FOUND )
{
    *this = rhs;
}

/**
 * Destructor for the tree.
 */
template <class Comparable>
BinarySearchTree<Comparable>::~BinarySearchTree( )
{
    makeEmpty( );
}

/**
 * Insert x into the tree; duplicates are ignored.
 */
template <class Comparable>
void BinarySearchTree<Comparable>::insert( const Comparable & x )
{
    insert( x, root );
}

/**
 * Remove x from the tree. Nothing is done if x is not found.
 */
template <class Comparable>
void BinarySearchTree<Comparable>::remove( const Comparable & x )
{
    remove( x, root );
}

/**
 * Find the smallest item in the tree.
 * Return smallest item or ITEM_NOT_FOUND if empty.
 */
template <class Comparable>
const Comparable & BinarySearchTree<Comparable>::findMin( ) const
{
    return elementAt( findMin( root ) );
}
/**
 * Find the largest item in the tree.
 * Return the largest item or ITEM_NOT_FOUND if empty.
 */
template <class Comparable>
const Comparable & BinarySearchTree<Comparable>::findMax( ) const
{
    return elementAt( findMax( root ) );
}

/**
 * Find item x in the tree.
 * Return the matching item or ITEM_NOT_FOUND if not found.
 */
template <class Comparable>
const Comparable & BinarySearchTree<Comparable>::
find( const Comparable & x ) const
{
    return elementAt( find( x, root ) );
}
/**
 * Make the tree logically empty.
 */
template <class Comparable>
void BinarySearchTree<Comparable>::makeEmpty( )
{
    makeEmpty( root );
}

/**
 * Test if the tree is logically empty.
 * Return true if empty, false otherwise.
 */
template <class Comparable>
bool BinarySearchTree<Comparable>::isEmpty( ) const
{
    return root == NULL;
}

/**
 * Print the tree contents in sorted order.
 */
template <class Comparable>
void BinarySearchTree<Comparable>::printTree( ) const
{
    if( isEmpty( ) )
        cout << "Empty tree" << endl;
    else
        printTree( root );
}

/**
 * Deep copy.
 */
template <class Comparable>
const BinarySearchTree<Comparable> &
BinarySearchTree<Comparable>::
operator=( const BinarySearchTree<Comparable> & rhs )
{
    if( this != &rhs )
    {
        makeEmpty( );
        root = clone( rhs.root );
    }
    return *this;
}
/**
 * Internal method to get element field in node t.
 * Return the element field or ITEM_NOT_FOUND if t is NULL.
 */
template <class Comparable>
const Comparable & BinarySearchTree<Comparable>::
elementAt( BinaryNode<Comparable> *t ) const
{
    if( t == NULL )
        return ITEM_NOT_FOUND;
    else
        return t->element;
}
/**
 * Internal method to insert into a subtree.
 * x is the item to insert.
 * t is the node that roots the tree.
 * Set the new root.
 */
template <class Comparable>
void BinarySearchTree<Comparable>::
insert( const Comparable & x, BinaryNode<Comparable> * & t ) const
{
    if( t == NULL )
        t = new BinaryNode<Comparable>( x, NULL, NULL );
    else if( x < t->element )
        insert( x, t->left );
    else if( t->element < x )
        insert( x, t->right );
    else
        ;  // Duplicate; do nothing
}
/**
 * Internal method to remove from a subtree.
 * x is the item to remove.
 * t is the node that roots the tree.
 * Set the new root.
 */
template <class Comparable>
void BinarySearchTree<Comparable>::
remove( const Comparable & x, BinaryNode<Comparable> * & t ) const
{
    if( t == NULL )
        return;   // Item not found; do nothing
    if( x < t->element )
        remove( x, t->left );
    else if( t->element < x )
        remove( x, t->right );
    else if( t->left != NULL && t->right != NULL ) // Two children
    {
        // Copy the smallest item of the right subtree here, then remove it there
        t->element = findMin( t->right )->element;
        remove( t->element, t->right );
    }
    else                                           // Zero or one child
    {
        BinaryNode<Comparable> *oldNode = t;
        t = ( t->left != NULL ) ? t->left : t->right;
        delete oldNode;
    }
}
/**
 * Internal method to find the smallest item in a subtree t.
 * Return node containing the smallest item.
 */
template <class Comparable>
BinaryNode<Comparable> *
BinarySearchTree<Comparable>::findMin( BinaryNode<Comparable> *t ) const
{
    if( t == NULL )
        return NULL;
    if( t->left == NULL )
        return t;
    return findMin( t->left );
}
/**
 * Internal method to find the largest item in a subtree t.
 * Return node containing the largest item.
 */
template <class Comparable>
BinaryNode<Comparable> *
BinarySearchTree<Comparable>::findMax( BinaryNode<Comparable> *t ) const
{
    if( t != NULL )
        while( t->right != NULL )
            t = t->right;
    return t;
}
/**
 * Internal method to find an item in a subtree.
 * x is item to search for.
 * t is the node that roots the tree.
 * Return node containing the matched item.
 */
template <class Comparable>
BinaryNode<Comparable> *
BinarySearchTree<Comparable>::
find( const Comparable & x, BinaryNode<Comparable> *t ) const
{
    if( t == NULL )
        return NULL;
    else if( x < t->element )
        return find( x, t->left );
    else if( t->element < x )
        return find( x, t->right );
    else
        return t;    // Match
}
/****** NONRECURSIVE VERSION *************************
template <class Comparable>
BinaryNode<Comparable> *
BinarySearchTree<Comparable>::
find( const Comparable & x, BinaryNode<Comparable> *t ) const
{
    while( t != NULL )
        if( x < t->element )
            t = t->left;
        else if( t->element < x )
            t = t->right;
        else
            return t;    // Match

    return NULL;   // No match
}
*****************************************************/
/**
 * Internal method to make subtree empty.
 */
template <class Comparable>
void BinarySearchTree<Comparable>::
makeEmpty( BinaryNode<Comparable> * & t ) const
{
    if( t != NULL )
    {
        makeEmpty( t->left );
        makeEmpty( t->right );
        delete t;
    }
    t = NULL;
}
/**
 * Internal method to print a subtree rooted at t in sorted order.
 */
template <class Comparable>
void BinarySearchTree<Comparable>::printTree( BinaryNode<Comparable> *t ) const
{
    if( t != NULL )
    {
        printTree( t->left );
        cout << t->element << endl;
        printTree( t->right );
    }
}
/**
 * Internal method to clone subtree.
 */
template <class Comparable>
BinaryNode<Comparable> *
BinarySearchTree<Comparable>::clone( BinaryNode<Comparable> * t ) const
{
    if( t == NULL )
        return NULL;
    else
        return new BinaryNode<Comparable>( t->element, clone( t->left ), clone( t->right ) );
}
Example: Suppose we want to insert 5 into the tree on the left below. Act as if you are searching for 5. Once you get to key 4, you look to the right to see whether 5 is there. There is nothing on the right, so that is where 5 belongs, and 5 is inserted there.
How about duplicates? What if we wanted to insert 3, which is already in the tree?
Duplicates are handled differently. We can keep an extra field in each node recording the frequency of occurrence. This adds some extra space to the entire tree, but it is better than putting duplicates in the tree, which tends to make the tree very deep.
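The frequency-field idea can be sketched as a small variation of the recursive insert. This is an illustration under assumptions, not the class from the slides: keys are integers, and `CountedNode`, `countOf`, and the free functions `insert` and `makeEmpty` are hypothetical.

```cpp
#include <cassert>

// Hypothetical node with an extra field recording how often the key occurs.
struct CountedNode
{
    int element;
    int count;
    CountedNode *left;
    CountedNode *right;
};

// Search for x; create a node where the search falls off the tree,
// or bump the counter if x is already present.
void insert(int x, CountedNode *&t)
{
    if (t == nullptr)
        t = new CountedNode{x, 1, nullptr, nullptr};
    else if (x < t->element)
        insert(x, t->left);
    else if (t->element < x)
        insert(x, t->right);
    else
        ++t->count;  // duplicate: no new node, so the tree gets no deeper
}

// Number of times x has been inserted (0 if absent).
int countOf(int x, const CountedNode *t)
{
    while (t != nullptr) {
        if (x < t->element)      t = t->left;
        else if (t->element < x) t = t->right;
        else {
            assert(t->count >= 1);
            return t->count;
        }
    }
    return 0;
}

// Release all nodes of the subtree and reset the pointer.
void makeEmpty(CountedNode *&t)
{
    if (t != nullptr) {
        makeEmpty(t->left);
        makeEmpty(t->right);
        delete t;
    }
    t = nullptr;
}
```

Inserting 6, 2, 8, 1, 4, 3 and then 5 places 5 as the right child of 4, as in the example; inserting 3 a second time only raises its counter to 2.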
Remove
The hardest operation is deletion, as is the case with most data structures. In order to remove a node, we first need to find it, and then we have to consider several possibilities.
1. If the node is a leaf, it can be deleted immediately.
2. If the node has one child, the node can be removed after its parent adjusts a pointer to bypass it. Once bypassed, the target node is no longer referenced by the tree, so it is deleted through a temporary pointer (temp) that was saved beforehand and still points to it.
3. If the node has two children, replace the data of the target node with the smallest data in its right subtree, and then remove that smallest node from the right subtree. Because the smallest node in the right subtree cannot have a left child, this second removal falls under case 1 or case 2.
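Remove
[Figure: removing the node with key 2, which has two children]
Replace key 2 with 3, the smallest key in its right subtree, and then remove the node that originally held 3.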
AVL Trees
• An AVL (Adelson-Velskii and Landis) tree is a binary search tree with a balance condition.
• The balance condition must ensure that the depth of the tree is O(log n). The simplest idea is to require that the left and right subtrees have the same height.
  (Remarks)
  • One version of this idea insists that every node has left and right subtrees of the same height.
  • If the height of an empty subtree is defined to be -1, then only perfectly balanced trees of 2^k - 1 nodes satisfy this condition.
  • This guarantees trees of small depth, but it is too rigid and must be relaxed to be useful.
• An AVL tree is identical to a binary search tree, except that for every node in the tree, the heights of the left and right subtrees can differ by at most 1.
  (Remarks)
  • The height of an AVL tree is at most roughly 1.44 log(N+2) - 0.328, but in practice it is only slightly more than log N.
• In the following figures we see two examples of binary trees. One of these trees (the one on the right) is not an AVL tree. Why?
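The AVL balance condition can be tested with one recursive pass that computes heights bottom-up. The sketch below assumes integer keys and a plain `BinaryNode` struct; `height`, `checkedHeight`, `isBalanced`, and the sentinel `UNBALANCED` are hypothetical names. It checks only the height condition at every node, not the search-tree ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

struct BinaryNode
{
    int element;
    BinaryNode *left;
    BinaryNode *right;
};

// Height with the convention that an empty subtree has height -1.
int height(const BinaryNode *t)
{
    return t == nullptr ? -1 : 1 + std::max(height(t->left), height(t->right));
}

const int UNBALANCED = -2;  // sentinel; never a valid height

// Returns the height of t if every node in t satisfies the AVL
// condition, and UNBALANCED otherwise.
int checkedHeight(const BinaryNode *t)
{
    if (t == nullptr)
        return -1;
    int hl = checkedHeight(t->left);
    if (hl == UNBALANCED) return UNBALANCED;
    int hr = checkedHeight(t->right);
    if (hr == UNBALANCED) return UNBALANCED;
    if (std::abs(hl - hr) > 1) return UNBALANCED;
    assert(hl >= -1 && hr >= -1);
    return 1 + std::max(hl, hr);
}

bool isBalanced(const BinaryNode *t)
{
    return checkedHeight(t) != UNBALANCED;
}
```

A root whose left subtree is a three-node chain going left and whose right subtree is a three-node chain going right has equal subtree heights at the root, yet `isBalanced` rejects it, because the condition fails one level down.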
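Figure 4.32: A bad binary tree. Requiring balance at the root is not enough.
[Figure: binary tree with nodes 12, 7, 18, 2, 22, 1, 35, annotated to show that the heights of the left and right subtrees do not satisfy the differ-by-at-most-1 condition]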
Figure 4.31: [Figure: AVL tree of height 9, with labels 9, 8, 7, 6, 5, 4, 3, 2, 1]
This AVL tree has a height of 9. Its right subtree is an AVL tree of height 8 of minimum size.
The minimum number of nodes, N(h), in an AVL tree of height h is given by:
N(h) = N(h-1) + N(h-2) + 1
Examples:
For h = 0, N(0) = 1.
For h = 1, N(1) = 2.
How about N(2)?
For h = 2, N(2) = N(0) + N(1) + 1 = 1 + 2 + 1 = 4.
For h = 3, N(3) = N(1) + N(2) + 1 = 2 + 4 + 1 = 7.
[Figure: minimum-size AVL trees of heights 0 through 3, with nodes labeled A through G]
Have you seen the relation N(h) = N(h-1) + N(h-2) + 1 before? It is the Fibonacci recurrence with an extra +1, and its values are shifted Fibonacci numbers minus 1: with F(1) = F(2) = 1, N(h) = F(h+3) - 1.
Question: Using the relation for N(h), can you find the minimum number of nodes in an AVL tree of height 9?
All tree operations can be performed in O(log n) time, except possibly insertion.
Difficulties associated with insertion:
After inserting a node, we need to update the balancing information for all the nodes on the path back to the root.
An insertion can violate the AVL tree property. For example, inserting 6.5 into the AVL tree in Figure 4.32 would destroy the balance condition at the node with key 8.
We need to restore the AVL tree property before the insertion step is considered to be over.
[Figure 4.32: AVL tree into which 6.5 is inserted]
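One way to answer the question is to iterate the recurrence from its two base cases. The following is a minimal sketch; the function name `minNodes` is an assumption, and the base values N(0) = 1 and N(1) = 2 are the ones given in the examples.

```cpp
#include <cassert>

// Minimum number of nodes in an AVL tree of height h, using
// N(h) = N(h-1) + N(h-2) + 1 with N(0) = 1 and N(1) = 2.
long minNodes(int h)
{
    assert(h >= 0);
    long prev = 1;  // N(0)
    if (h == 0)
        return prev;
    long cur = 2;   // N(1)
    for (int i = 2; i <= h; ++i) {
        long next = cur + prev + 1;
        prev = cur;
        cur = next;
    }
    return cur;
}
```

The sequence runs 1, 2, 4, 7, 12, 20, 33, 54, 88, 143, so an AVL tree of height 9 has at least 143 nodes.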
A simple modification of the tree, called a rotation, can restore the balance.
Suppose a node needs to be rebalanced. A violation might occur in four cases:
1) An insertion into the left subtree of the left child of that node
2) An insertion into the right subtree of the left child of that node
3) An insertion into the left subtree of the right child of that node
4) An insertion into the right subtree of the right child of that node
How do we fix these violations?
The outside cases (cases 1 and 4) are handled by a single rotation.
The inside cases (cases 2 and 3) are handled by a double rotation.
Single Rotation
Both trees are binary search trees. In both trees, k1 < k2. All elements in subtree X are smaller than k1 in both trees. All elements in subtree Z are larger than k2. All elements in subtree Y are between k1 and k2.
The conversion of one of the trees to the other is known as a rotation. A rotation involves only a few pointer changes, which alter the structure of the tree while preserving the search tree property.
The rotation does not have to be done at the root of the tree. It can be done at any node, since that node is the root of some subtree anyway.
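The few pointer changes involved in a single rotation can be sketched as follows. The labels k1, k2, X, Y, and Z follow the description above; the `BinaryNode` struct with integer keys and the function names `rotateWithLeftChild` and `rotateWithRightChild` are assumptions made for this example, and no height bookkeeping is done.

```cpp
#include <cassert>

struct BinaryNode
{
    int element;
    BinaryNode *left;
    BinaryNode *right;
};

// Before: k2 is the subtree root, k1 its left child, with subtrees
// X = k1->left, Y = k1->right, Z = k2->right.
// After:  k1 is the subtree root, k2 its right child, and Y hangs
// under k2 on the left.
void rotateWithLeftChild(BinaryNode *&k2)
{
    assert(k2 != nullptr && k2->left != nullptr);
    BinaryNode *k1 = k2->left;
    k2->left = k1->right;  // Y moves from k1 to k2
    k1->right = k2;
    k2 = k1;               // the caller's link now points at the new root
}

// Mirror image: turns the second tree back into the first.
void rotateWithRightChild(BinaryNode *&k1)
{
    assert(k1 != nullptr && k1->right != nullptr);
    BinaryNode *k2 = k1->right;
    k1->right = k2->left;  // Y moves from k2 to k1
    k2->left = k1;
    k1 = k2;
}
```

Because the subtree root is passed by reference to pointer, the same call works whether the rotated node is the root of the whole tree or a child of some other node. Only Y changes parent, and it stays between k1 and k2, which is why the search tree property is preserved.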