Chapter 6: Binary Trees
Objectives
Looking ahead: in this chapter, we’ll consider
• Trees, Binary Trees, and Binary Search Trees
• Implementing Binary Trees
• Searching a Binary Search Tree
• Tree Traversal
• Insertion
• Deletion
Data Structures and Algorithms in C++, Fourth Edition
Objectives (continued)
• Balancing a Tree
• Self-Adjusting Trees
• Heaps
• Treaps
• k-d Trees
• Polish Notation and Expression Trees
Trees, Binary Trees, and Binary Search Trees
• While linked lists, stacks, and queues are useful data structures, they do have limitations
• Linked lists are linear in form and cannot reflect hierarchically organized data
• Stacks and queues are one-dimensional structures and have limited expressiveness
• To overcome these limitations, we’ll consider a new data structure, the tree
• Trees consist of two components, nodes and arcs (or edges)
• Trees are drawn with the root at the top, and “grow” down
• The leaves of the tree (also called terminal nodes) are at the bottom of the tree
Trees, Binary Trees, and Binary Search Trees (continued)
• Trees can be defined recursively as follows:
  1. A tree with no nodes or arcs (an empty structure) is an empty tree
  2. If we have a set $t_1, \ldots, t_k$ of disjoint trees, the tree whose root has the roots of $t_1, \ldots, t_k$ as its children is a tree
  3. Only structures generated by rules 1 and 2 are trees
• Every node in the tree must be accessible from the root through a unique sequence of arcs, called a path
• The number of arcs in the path is the path’s length
• A node’s level is the length of the path to that node, plus 1
Trees, Binary Trees, and Binary Search Trees (continued)
• The maximum level of a node in a tree is the tree’s height
• An empty tree has height 0, and a tree of height 1 consists of a single node that is both the tree’s root and its only leaf
• The level of a node must be between 1 and the tree’s height
• Some examples of trees are shown in Figure 6.1
Fig. 6.1 Some examples of trees
Trees, Binary Trees, and Binary Search Trees (continued)
• The number of children of a given node can be arbitrary
• Using trees may also improve the process of searching for elements
• In order to find a particular element in a linked list of n elements, we have to examine all those before that element in the list
• This holds even if the list is ordered
• On the other hand, if the elements of a list are stored in a tree that is organized in a predetermined fashion, the number of elements that must be looked at can be substantially reduced
Trees, Binary Trees, and Binary Search Trees (continued)
• The order of nodes in the figure below doesn’t achieve anything, because there is no consideration of searching incorporated into its design
• However, by applying a consistent ordering to the nodes, considerable savings in searching can be achieved
Fig. 6.3 Transforming (a) a linked list into (b) a tree
Trees, Binary Trees, and Binary Search Trees (continued)
• A binary tree is a tree where each node has at most two nonempty children, designated the left child and the right child
• Either child can be empty; Figure 6.4 shows examples of binary trees
Fig. 6.4 Examples of binary trees
• An important attribute of binary trees is the number of leaves
• This is useful in assessing the efficiency of algorithms
Trees, Binary Trees, and Binary Search Trees (continued)
• As specified earlier, the level of a node is the number of arcs between it and the root, plus 1
• The root is at level 1, its children at level 2, etc.
• So if each node at any given level (except the last) had two children, there would be $2^0$ nodes at level 1, $2^1$ nodes at level 2, etc.
• In general, there would be $2^i$ nodes at level $i + 1$
• A tree that exhibits this is called a complete binary tree
• In such a tree, all nonterminal nodes have both children, and all leaves are on the same level
Trees, Binary Trees, and Binary Search Trees (continued)
• Because leaves can occur at any level of an arbitrary binary tree (except at level 1 when the tree has more than one node), there is no general formula to calculate the number of nodes
• However, an exact relation holds for a restricted class of trees:
• For all the nonempty binary trees whose nonterminal nodes have exactly two nonempty children, the number of leaves m is greater than the number of nonterminal nodes k, and $m = k + 1$
• This holds trivially for the tree consisting of only the root (m = 1, k = 0)
Trees, Binary Trees, and Binary Search Trees (continued)
• For any given tree for which the condition holds, attaching two leaves to an existing leaf will make it nonterminal
• This decreases the number of leaves by 1 and increases the number of nonterminals by 1
• However, the two new leaves increase the number of leaves by 2, so the relation becomes $(m - 1) + 2 = (k + 1) + 1$
• This simplifies to $m = k + 1$, which is the desired result
• This means that a complete binary tree with $i + 1$ levels has $2^i$ leaves and $2^i - 1$ nonterminal nodes, totaling $2^{i+1} - 1$ nodes
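The counts above can be checked mechanically. The following is a minimal sketch, assuming a bare `Node` struct and the helper names `countNodes`, `buildComplete`, and `destroy`, none of which come from the book. It counts leaves and nonterminal nodes so that the relation $m = k + 1$ can be verified on a complete binary tree.

```cpp
#include <utility>

// Hypothetical minimal node type used only for this illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Returns {number of leaves, number of nonterminal nodes} for the subtree at p.
std::pair<int, int> countNodes(const Node* p) {
    if (p == nullptr) return {0, 0};
    if (p->left == nullptr && p->right == nullptr) return {1, 0};  // a leaf
    std::pair<int, int> l = countNodes(p->left);
    std::pair<int, int> r = countNodes(p->right);
    return {l.first + r.first, l.second + r.second + 1};           // p is nonterminal
}

// Builds a complete binary tree with the given number of levels (keys are irrelevant here).
Node* buildComplete(int levels) {
    if (levels <= 0) return nullptr;
    Node* left = buildComplete(levels - 1);
    Node* right = buildComplete(levels - 1);
    return new Node{0, left, right};
}

// Frees every node of the subtree at p.
void destroy(Node* p) {
    if (p == nullptr) return;
    destroy(p->left);
    destroy(p->right);
    delete p;
}
```

For a tree with 4 levels (i = 3), `countNodes` should report 8 leaves and 7 nonterminal nodes, 15 nodes in total.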
Trees, Binary Trees, and Binary Search Trees (continued)
Fig. 6.5 (a) Adding a leaf to a tree, (b) preserving the relation of the number of leaves to the number of nonterminal nodes
Trees, Binary Trees, and Binary Search Trees (continued)
• In a binary search tree (or ordered binary tree), values stored in the left subtree of a given node are less than the value stored in that node, and values stored in the right subtree of a given node are greater than the value stored in that node
• The values stored are considered unique; attempts to store duplicate values can be treated as an error
• The meanings of the expressions “less than” and “greater than” will depend on the types of values stored
Fig. 6.6 Examples of binary search trees
Implementing Binary Trees
• We can use arrays or linked structures to implement binary trees
• If using an array, each element stores a structure that has an information field and two “pointer” fields containing the indexes of the array locations of the left and right children
• The root of the tree is always in the first cell of the array, and a value of -1 indicates an empty child
Fig. 6.7 Array representation of the tree in Figure 6.6c
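A minimal sketch of the array idea, assuming a hypothetical `ArrayNode` cell type and one possible cell ordering (not necessarily the layout drawn in Figure 6.7). The keys are those of the tree in Figure 6.6c; the root sits in cell 0 and -1 marks an empty child.

```cpp
#include <vector>

// Hypothetical cell layout: a key plus the array indexes of both children.
struct ArrayNode {
    int key;
    int left;   // index of the left child, or -1 if empty
    int right;  // index of the right child, or -1 if empty
};

// Searches an array-based binary search tree whose root is in cell 0.
// Returns the index of the cell holding target, or -1 if it is absent.
int searchArrayTree(const std::vector<ArrayNode>& tree, int target) {
    int i = tree.empty() ? -1 : 0;
    while (i != -1) {
        if (target == tree[i].key) return i;
        i = (target < tree[i].key) ? tree[i].left : tree[i].right;
    }
    return -1;
}

// One possible layout for the keys 13, 10, 25, 2, 12, 20, 31, 29.
std::vector<ArrayNode> sampleTree() {
    return {
        {13, 1, 2},    // cell 0: root
        {10, 3, 4},    // cell 1
        {25, 5, 6},    // cell 2
        {2, -1, -1},   // cell 3
        {12, -1, -1},  // cell 4
        {20, -1, -1},  // cell 5
        {31, 7, -1},   // cell 6
        {29, -1, -1}   // cell 7
    };
}
```

Because children are found through stored indexes, the cells may appear in any order as long as the root stays in cell 0.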
Implementing Binary Trees (continued)
• Implementing binary trees as arrays does have drawbacks
• We need to keep track of the locations of each node, and these have to be located sequentially
• Deletions are also awkward, requiring tags to mark empty cells, or moving elements around, requiring updating values
• Consequently, while arrays are convenient, we’ll usually use a linked implementation
• In a linked implementation, the node is defined by a class, and consists of an information data member and two pointer data members
• The node is manipulated by methods defined in another class that represents the tree
• The code for this is shown in Figure 6.8 on pages 220-222
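The linked layout described above might look roughly like the following sketch. It is not the code of Figure 6.8; the names `TreeNode` and `BinaryTree` and the small set of member functions are assumptions chosen for illustration.

```cpp
#include <algorithm>

// Hypothetical node class: one information member and two pointer members.
template <typename T>
struct TreeNode {
    T info;
    TreeNode* left;
    TreeNode* right;
    explicit TreeNode(const T& value, TreeNode* l = nullptr, TreeNode* r = nullptr)
        : info(value), left(l), right(r) {}
};

// Hypothetical tree class that owns the nodes and manipulates them.
template <typename T>
class BinaryTree {
public:
    // Takes ownership of an already linked set of nodes, or starts empty.
    explicit BinaryTree(TreeNode<T>* r = nullptr) : root(r) {}
    ~BinaryTree() { clear(root); }
    BinaryTree(const BinaryTree&) = delete;
    BinaryTree& operator=(const BinaryTree&) = delete;

    bool isEmpty() const { return root == nullptr; }

    // Height as defined earlier: 0 for an empty tree, 1 for a single node.
    int height() const { return height(root); }

private:
    TreeNode<T>* root;

    static int height(const TreeNode<T>* p) {
        if (p == nullptr) return 0;
        return 1 + std::max(height(p->left), height(p->right));
    }

    static void clear(TreeNode<T>* p) {
        if (p == nullptr) return;
        clear(p->left);
        clear(p->right);
        delete p;
    }
};
```

Copying is disabled in this sketch so that two tree objects never try to delete the same nodes.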
Searching a Binary Search Tree
• Locating a specific value in a binary search tree is easy:
Fig. 6.9 A function for searching a binary search tree
• For each node, compare the value to the target value; if they match, the search is done
• If the target is smaller, we branch to the left subtree; if larger, we branch to the right
• If at any point we cannot proceed further, then the search has failed and the target isn’t in the tree
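The steps above can be sketched as a short loop. This is an illustrative version rather than the function of Figure 6.9; the `Node` struct and the optional `comparisons` counter are assumptions added here to make the cost of a search visible.

```cpp
// Hypothetical minimal node type used only for this illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Returns the node holding target, or nullptr if the search fails.
// If comparisons is non-null, it receives the number of nodes examined.
const Node* search(const Node* p, int target, int* comparisons = nullptr) {
    int count = 0;
    while (p != nullptr) {
        ++count;
        if (target == p->key) break;                      // match: done
        p = (target < p->key) ? p->left : p->right;       // branch left or right
    }
    if (comparisons != nullptr) *comparisons = count;
    return p;                                             // nullptr: cannot proceed further
}
```

On a tree with the keys 13, 10, 25, 2, 12, 20, 31, 29 inserted in that order, a search for 31 examines three nodes and a failed search for 27 examines four.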
Searching a Binary Search Tree (continued)
• Using this approach and referring to Figure 6.6c, we can find the value 31 in only three comparisons
• Finding (or not finding) the values 26 to 30 requires the maximum of four comparisons; all other values require fewer than four
• This also demonstrates why a value should occur only once in a tree; allowing duplicates requires additional searches:
  • If there is a duplicate, we must either locate the first occurrence and ignore the others, or
  • We must locate each duplicate, which involves searching until we can guarantee that no path contains another instance of the value
• This search will always terminate at a leaf node
Searching a Binary Search Tree (continued)
• The number of comparisons performed during the search determines the complexity of the search
• This in turn depends on the number of nodes encountered on the path from the root to the target node
• So the complexity is the length of the path plus 1, and is influenced by the shape of the tree and the location of the target
• Searching in a binary search tree is quite efficient, even if the tree isn’t perfectly balanced
• However, this only holds for randomly created trees; trees that are highly unbalanced or elongated resemble linear linked lists, and searching them approaches sequential search times
Tree Traversal
• Tree traversal is the process of visiting each node in a tree data structure exactly one time
• This definition only specifies that each node is visited, but does not indicate the order of the process
• Hence, there are numerous possible traversals; in a tree of n nodes there are n! traversals
• Two especially useful traversals are depth-first traversals and breadth-first traversals
Tree Traversal (continued)
Breadth-First Traversal
• Breadth-first traversal proceeds level-by-level from top-down or bottom-up, visiting each level’s nodes left-to-right or right-to-left
• This gives us four possibilities; a top-down, left-to-right breadth-first traversal of Figure 6.6c yields 13, 10, 25, 2, 12, 20, 31, 29
• This can be easily implemented using a queue
• If we consider a top-down, left-to-right breadth-first traversal, we start by placing the root node in the queue
• We then remove the node at the front of the queue, and after visiting it, we place its children (if any) in the queue
• This is repeated until the queue is empty
Tree Traversal (continued)
Breadth-First Traversal (continued)
• An implementation of this is shown in Figure 6.10
Fig. 6.10 Top-down, left-to-right, breadth-first traversal implementation
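The queue-based procedure can be sketched as follows. This is an illustration under assumptions, not the code of Figure 6.10: the `Node` struct is hypothetical, and "visiting" a node simply appends its key to a result vector.

```cpp
#include <queue>
#include <vector>

// Hypothetical minimal node type used only for this illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Top-down, left-to-right breadth-first traversal; returns keys in visit order.
std::vector<int> breadthFirst(const Node* root) {
    std::vector<int> out;
    std::queue<const Node*> pending;
    if (root != nullptr) pending.push(root);      // start with the root in the queue
    while (!pending.empty()) {
        const Node* p = pending.front();          // remove the node at the front
        pending.pop();
        out.push_back(p->key);                    // visit it
        if (p->left != nullptr) pending.push(p->left);    // enqueue its children, if any
        if (p->right != nullptr) pending.push(p->right);
    }
    return out;
}
```

Swapping the two `push` calls would give the top-down, right-to-left variant.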
Tree Traversal (continued)
Breadth-First Traversal (continued)
• The following diagram shows a traversal of the tree from Figure 6.6c, using the queue-based breadth-first traversal
Figure: the queue (middle) and output (right) from a breadth-first traversal of the tree from Figure 6.6c (left)
Tree Traversal (continued)
Depth-First Traversal
• Depth-first traversal proceeds by following left- (or right-) hand branches as far as possible
• The algorithm then backtracks to the most recent fork and takes the right- (or left-) hand branch to the next node
• It then follows branches to the left (or right) again as far as possible
• This process continues until all nodes have been visited
• While this process is straightforward, it doesn’t indicate at what point the nodes are visited; there are variations that can be used
• We are interested in three activities: traversing to the left, traversing to the right, and visiting a node
• These activities are labeled L, R, and V, for ease of representation
Tree Traversal (continued)
Depth-First Traversal (continued)
• Based on earlier discussions, we want to perform the traversal in an orderly manner, so there are six possible arrangements:
  • VLR, VRL, LVR, LRV, RVL, and RLV
• Generally, we follow the convention of traversing from left to right, which narrows this down to three traversals:
  • VLR, known as preorder traversal
  • LVR, known as inorder traversal
  • LRV, known as postorder traversal
• These can be implemented straightforwardly, as seen in Figure 6.11
Tree Traversal (continued)
Fig. 6.11 Depth-first traversal implementations
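A minimal recursive sketch of the three left-to-right traversals, written under assumptions rather than copied from Figure 6.11: the `Node` struct is hypothetical and a visit appends the key to an output vector. The three functions differ only in where the visit (V) sits relative to the left (L) and right (R) recursive calls.

```cpp
#include <vector>

// Hypothetical minimal node type used only for this illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// VLR: visit the node, then the left subtree, then the right subtree.
void preorder(const Node* p, std::vector<int>& out) {
    if (p == nullptr) return;
    out.push_back(p->key);
    preorder(p->left, out);
    preorder(p->right, out);
}

// LVR: left subtree, then the node, then the right subtree.
void inorder(const Node* p, std::vector<int>& out) {
    if (p == nullptr) return;
    inorder(p->left, out);
    out.push_back(p->key);
    inorder(p->right, out);
}

// LRV: left subtree, then the right subtree, then the node.
void postorder(const Node* p, std::vector<int>& out) {
    if (p == nullptr) return;
    postorder(p->left, out);
    postorder(p->right, out);
    out.push_back(p->key);
}
```

On a binary search tree, the inorder version produces the keys in ascending order.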
Tree Traversal (continued)
Depth-First Traversal (continued)
• While the code is simple, the power lies in the recursion supported by the run-time stack, which places a heavy burden on the system
• To gain more insight into the behavior of these algorithms, let’s consider the inorder routine
• In this traversal, if the tree is nonempty, we traverse the left subtree of the node, then visit the node, then traverse the right subtree
Fig. 6.12 Inorder tree traversal
Tree Traversal (continued)
Depth-First Traversal (continued)
• Because of the order of the recursion in the code, the V and R steps are held pending until the L step completes
• This is the function of the stack, to “remember” the backtrack point, so that after a left traversal ends, the routine can back up to visit the branch point node, and then proceed to the right
• This is illustrated in Figure 6.13, where each node is labeled with the activities “LVR”, and they are scratched out as they are performed for a given node
• To see how this works, we can observe the operation of the run-time stack shown in Figure 6.14 on page 230; the numbers in parentheses refer to return addresses indicated in the code on page 228
Tree Traversal (continued)
Fig. 6.13 Details of several of the first steps of inorder traversal
Tree Traversal (continued)
Depth-First Traversal (continued)
• Now let’s consider nonrecursive implementations of the traversal algorithms
• As we’ve learned, recursive algorithms tend to be less efficient than their nonrecursive versions
• So we need to determine if it is useful to pursue nonrecursive versions of the traversal algorithms
• Let’s first consider a nonrecursive version of the preorder algorithm, shown in Figure 6.15
• While still readable, it makes extensive use of the stack, and the number of calls in the processing loop is actually twice the number in the recursive version of the code, which is hardly an improvement
Tree Traversal (continued)
Fig. 6.15 A nonrecursive implementation of preorder tree traversal
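A stack-based preorder traversal might be sketched like this. It is an illustration using `std::stack` and a hypothetical `Node` struct, not the listing of Figure 6.15. The right child is pushed before the left one so that the left subtree is popped, and therefore visited, first.

```cpp
#include <stack>
#include <vector>

// Hypothetical minimal node type used only for this illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Nonrecursive VLR traversal with an explicit stack; returns keys in visit order.
std::vector<int> preorderIterative(const Node* root) {
    std::vector<int> out;
    std::stack<const Node*> pending;
    if (root != nullptr) pending.push(root);
    while (!pending.empty()) {
        const Node* p = pending.top();
        pending.pop();
        out.push_back(p->key);                              // visit
        if (p->right != nullptr) pending.push(p->right);    // pushed first, popped later
        if (p->left != nullptr) pending.push(p->left);      // pushed last, popped next
    }
    return out;
}
```

Each iteration performs one pop and up to two pushes, which matches the observation that the explicit stack does not reduce the amount of bookkeeping.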
Tree Traversal (continued)
Depth-First Traversal (continued)
• Recursive algorithms can easily be derived from one another by simply transposing the function calls
• This is not the case with the nonrecursive algorithms, however; the order of the calls and their interaction with the stack is critical
• So the inorder and postorder nonrecursive algorithms have to be developed separately
• Fortunately, creating a postorder algorithm can be accomplished easily by noting that an LRV traversal is simply a reversed VRL traversal
• VRL is a right-to-left preorder traversal, so we can adapt the preorder algorithm to create the postorder one
• This will require two stacks to handle the reversal process from preorder to postorder
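The two-stack idea can be sketched as follows, assuming a hypothetical `Node` struct and function name. The first stack drives a VRL (right-to-left preorder) traversal; instead of visiting nodes immediately, it pushes them onto a second stack, and popping that second stack reverses the order into LRV.

```cpp
#include <stack>
#include <vector>

// Hypothetical minimal node type used only for this illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Postorder (LRV) obtained by reversing a VRL traversal with a second stack.
std::vector<int> postorderTwoStacks(const Node* root) {
    std::vector<int> out;
    std::stack<const Node*> pending;   // drives the VRL traversal
    std::stack<const Node*> reversed;  // collects nodes in VRL order
    if (root != nullptr) pending.push(root);
    while (!pending.empty()) {
        const Node* p = pending.top();
        pending.pop();
        reversed.push(p);                                   // "visit" in VRL order
        if (p->left != nullptr) pending.push(p->left);      // pushed first, popped after right
        if (p->right != nullptr) pending.push(p->right);
    }
    while (!reversed.empty()) {                             // popping reverses VRL into LRV
        out.push_back(reversed.top()->key);
        reversed.pop();
    }
    return out;
}
```

The second stack holds every node before any key is emitted, so this version uses memory proportional to the size of the tree.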
Tree Traversal (continued)
Depth-First Traversal (continued)
• We can utilize a single stack, however, if we push the node based on the number of descendants it has
• We can push the node once before traversing its left subtree, and then again before traversing its right subtree
• An auxiliary pointer is used to keep track of the two cases
• Nodes with one descendant get pushed only once, and leaf nodes are not put on the stack
• This approach is the basis for the code in Figure 6.16
• Inorder traversal is also complicated; the algorithm in Figure 6.17 is both hard to follow and hard to understand without documentation
Tree Traversal (continued)
Fig. 6.16 A nonrecursive implementation of postorder tree traversal
Tree Traversal (continued)
Fig. 6.17 A nonrecursive implementation of inorder tree traversal
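For comparison, a commonly used stack-based inorder formulation is sketched below. It is not the algorithm of Figure 6.17 and may be structured quite differently; the `Node` struct and function name are assumptions. The stack remembers each branch point while the left subtree is explored.

```cpp
#include <stack>
#include <vector>

// Hypothetical minimal node type used only for this illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Nonrecursive LVR traversal with an explicit stack; returns keys in visit order.
std::vector<int> inorderIterative(const Node* root) {
    std::vector<int> out;
    std::stack<const Node*> branchPoints;
    const Node* p = root;
    while (p != nullptr || !branchPoints.empty()) {
        while (p != nullptr) {          // L: go left as far as possible,
            branchPoints.push(p);       // remembering each node passed
            p = p->left;
        }
        p = branchPoints.top();         // back up to the most recent branch point
        branchPoints.pop();
        out.push_back(p->key);          // V: visit it
        p = p->right;                   // R: continue in its right subtree
    }
    return out;
}
```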
Tree Traversal (continued)
Stackless Depth-First Traversal: Threaded Trees
• The previous algorithms were all characterized by the use of a stack, either implicitly through the system, or explicitly in code
• In both cases, additional processing time is required to handle stack operations, and memory has to be allocated for the stack
• In extreme cases where the tree is highly skewed, this can be a serious processing concern
• A more efficient implementation can be achieved if the stack is incorporated into the design of the tree itself
• This is done by using threads, pointers to the predecessor and successor of a node based on an inorder traversal
• Trees using threads are called threaded trees
Tree Traversal (continued)
Stackless Depth-First Traversal: Threaded Trees (continued)
• To implement threads, four pointers would be needed for each node, but this can be reduced by overloading the existing pointers
• The left pointer can be used to point to the left child or the predecessor, and the right pointer can point to the right child or successor
• This is illustrated in Figure 6.18a
• The figure suggests that threads to both predecessors and successors need to be used, but this is not always true
• Figure 6.18b shows the inorder traversal of a threaded tree, using only successor threads
Tree Traversal (continued)
Stackless Depth-First Traversal: Threaded Trees (continued)
Fig. 6.18 (a) A threaded tree and (b) an inorder traversal’s path in a threaded tree with right successors only
• The implementation of this is relatively simple; the traversal is indicated by the dashed lines in Figure 6.18b
• Only a single variable is needed for this; no stack is required
• However, the memory savings will be highly dependent on the implementation, shown in Figure 6.19 on pages 235 and 236
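A sketch of inorder traversal over a tree with successor threads only, written under assumptions rather than taken from Figure 6.19. The `ThreadedNode` layout, with a boolean flag marking whether `right` is a thread, is hypothetical. A single pointer walks the tree and no stack is used.

```cpp
#include <vector>

// Hypothetical threaded node: `successor` tells how to interpret `right`.
struct ThreadedNode {
    int key;
    ThreadedNode* left;
    ThreadedNode* right;
    bool successor;  // true: `right` is a thread to the inorder successor
};

// Stackless inorder traversal that follows right children and successor threads.
std::vector<int> inorderThreaded(const ThreadedNode* root) {
    std::vector<int> out;
    const ThreadedNode* p = root;
    if (p == nullptr) return out;
    while (p->left != nullptr) p = p->left;     // the leftmost node comes first
    while (p != nullptr) {
        out.push_back(p->key);                  // visit
        const ThreadedNode* prev = p;
        p = p->right;                           // either a child or a thread
        if (p != nullptr && !prev->successor) { // a real right child:
            while (p->left != nullptr) p = p->left;   // go to its leftmost node
        }
    }
    return out;
}
```

When `right` is a thread, the next node in inorder is reached in one step; only a real right child requires the descent to the left.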
Tree Traversal (continued)
Stackless Depth-First Traversal: Threaded Trees (continued)
• We can also use threads to support preorder and postorder traversal
• In preorder, the existing threads can be used to determine the appropriate successors
• Postorder requires somewhat more work, but is only slightly more complicated to accomplish
Tree Traversal (continued)
Stackless Depth-First Traversal: Tree Transformation
• The approaches to traversal thus far considered have used stacks to support the traversal or incorporated the stack into the tree
• Both of these have memory overhead that can impact the efficiency of the algorithms
• However, it is possible to carry out traversals without using stacks or threads
• These algorithms rely on making temporary changes in the tree structure during traversal, and restoring the structure when done
• One elegant algorithm to accomplish this was developed by Joseph M. Morris in 1979 and is shown here for inorder traversal
Tree Traversal (continued)
Stackless Depth-First Traversal: Tree Transformation (cont’d)
• The algorithm is based on the observation that inorder traversal is very simple for trees that have no left children (see Figure 6.1e)
• Since no left subtree has to be considered, the LVR traversal reduces to VR
• Morris’s algorithm utilizes this observation by modifying the tree so that the node being processed has no left child
• This allows the node to be visited and then the right subtree can be investigated
• Since this changes the tree’s structure, the traversal can only be done once, and information must be kept to restore the original tree
Tree Traversal (continued)
Stackless Depth-First Traversal: Tree Transformation (cont’d)
• The algorithm can be described as follows:
  MorrisInorder()
      while not finished
          if node has no left descendant
              visit it;
              go to the right;
          else make this node the right child of the rightmost node in its left descendant;
              go to this left descendant;
Tree Traversal (continued)
Stackless Depth-First Traversal: Tree Transformation (cont’d)
• An implementation of this algorithm is shown in Figure 6.20
Fig. 6.20 Implementation of the Morris algorithm for inorder traversal
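A sketch of the Morris inorder traversal in its commonly presented form is given below; it may differ in detail from Figure 6.20. The `Node` struct and function name are assumptions. The temporary link from the rightmost node of a left subtree back to the current node plays the role of the stack, and it is removed again the second time the node is reached, so the tree is left in its original shape.

```cpp
#include <vector>

// Hypothetical minimal node type used only for this illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Stackless, threadless inorder traversal; the tree is temporarily modified
// and restored before the function returns.
std::vector<int> morrisInorder(Node* root) {
    std::vector<int> out;
    Node* p = root;
    while (p != nullptr) {
        if (p->left == nullptr) {           // no left descendant: visit, go right
            out.push_back(p->key);
            p = p->right;
        } else {
            Node* tmp = p->left;            // find the rightmost node of the left subtree
            while (tmp->right != nullptr && tmp->right != p) tmp = tmp->right;
            if (tmp->right == nullptr) {    // first arrival at p: add the temporary link
                tmp->right = p;
                p = p->left;
            } else {                        // second arrival: left subtree is finished
                out.push_back(p->key);
                tmp->right = nullptr;       // remove the temporary link
                p = p->right;
            }
        }
    }
    return out;
}
```

Moving the visit from the second-arrival branch to the first-arrival branch turns this into a preorder traversal.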
Tree Traversal (continued)
Stackless Depth-First Traversal: Tree Transformation (cont’d)
• Details of the traversal are shown in Figure 6.21 (page 239); letters for the subfigures are referred to in the process steps on pages 237 and 238
• Preorder and postorder traversals can be implemented in a similar fashion
• The preorder traversal requires moving the visit() operation from the inner else to the inner if
• Postorder requires additional restructuring of the tree, as described on page 239
Insertion
• Searching a binary tree does not modify the tree
• Traversals may temporarily modify the tree, but it is usually left in its original form when the traversal is done
• Operations like insertions, deletions, modifying values, merging trees, and balancing trees do alter the tree structure
• We’ll look at how insertions are managed in binary search trees first
• In order to insert a new node in a binary search tree, we have to be at a node with a vacant left or right child
• This node is found in the same way as in searching:
  • Compare the value of the node to be inserted to the current node
  • If the value to be inserted is smaller, follow the left subtree; if it is larger, follow the right subtree
  • If the branch we are to follow is empty, we stop the search and insert the new node as that child
Insertion (continued)
• This process is shown in Figure 6.22; the code to implement this algorithm is shown in Figure 6.23
Fig. 6.22 Inserting nodes into binary search trees
Insertion (continued)
Fig. 6.23 Implementation of the insertion algorithm
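The insertion procedure can be sketched as follows. This is not the code of Figure 6.23: the `Node` struct, the function names, and the choice to reject duplicates (returning `false`) are assumptions made here, the last one following the earlier remark that duplicate values can be treated as an error.

```cpp
// Hypothetical minimal node type used only for this illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Inserts key into the binary search tree rooted at root.
// Returns false (and changes nothing) if the key is already present.
bool insert(Node*& root, int key) {
    Node* p = root;
    Node* prev = nullptr;                       // parent of the vacant position
    while (p != nullptr) {
        if (key == p->key) return false;        // duplicate: treated as an error here
        prev = p;
        p = (key < p->key) ? p->left : p->right;
    }
    Node* node = new Node{key, nullptr, nullptr};
    if (prev == nullptr) root = node;           // the tree was empty
    else if (key < prev->key) prev->left = node;
    else prev->right = node;
    return true;
}

// Frees every node of the subtree at p.
void destroy(Node* p) {
    if (p == nullptr) return;
    destroy(p->left);
    destroy(p->right);
    delete p;
}
```

Inserting 13, 10, 25, 2, 12, 20, 31, 29 in that order produces a tree whose level-by-level order is the same sequence.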
Insertion (continued)
• In looking at tree traversal, we considered three approaches: stack-based, thread-based, and via transformations
• Stack-based traversals don’t change the trees; transformations change the tree but restore it when done
• Threaded approaches, though, do modify the tree by adding threads to the structure
• While it may be possible to add and remove the threads as needed, if the tree is processed frequently, we might want to make the threads a permanent part of the tree
• This requires incorporating threads into the insertion process
Insertion (continued)
• The algorithm for inserting a node in a threaded tree is a simple modification of the original function that adjusts the threads where needed
• The implementation of this algorithm is shown in Figure 6.24 on page 242; the first insertions are shown in Figure 6.25
Fig. 6.25 Inserting nodes into a threaded tree
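A sketch of insertion that maintains successor threads is given below, under assumptions: the `ThreadedNode` layout and function names are hypothetical, duplicates are simply placed to the right, and the code is an illustration rather than a copy of Figure 6.24. A new left child gets a thread to its parent, while a new right child inherits whatever thread its parent held.

```cpp
// Hypothetical threaded node: `successor` tells how to interpret `right`.
struct ThreadedNode {
    int key;
    ThreadedNode* left;
    ThreadedNode* right;
    bool successor;  // true: `right` is a thread to the inorder successor
};

// Inserts key into a binary search tree with right (successor) threads.
void insertThreaded(ThreadedNode*& root, int key) {
    ThreadedNode* node = new ThreadedNode{key, nullptr, nullptr, false};
    if (root == nullptr) { root = node; return; }
    ThreadedNode* p = root;
    ThreadedNode* prev = nullptr;
    while (p != nullptr) {                 // find the parent without following a thread
        prev = p;
        if (key < p->key) p = p->left;
        else if (!p->successor) p = p->right;
        else break;                        // `right` is a thread: prev is the parent
    }
    if (key < prev->key) {                 // new left child: its successor is its parent
        prev->left = node;
        node->right = prev;
        node->successor = true;
    } else if (prev->successor) {          // new right child inherits the parent's thread
        node->right = prev->right;
        node->successor = true;
        prev->right = node;
        prev->successor = false;
    } else {                               // parent had no right link at all
        prev->right = node;
    }
}

// Frees every node, taking care never to follow a thread.
void destroyThreaded(ThreadedNode* p) {
    if (p == nullptr) return;
    destroyThreaded(p->left);
    if (!p->successor) destroyThreaded(p->right);
    delete p;
}
```

Only the rightmost node of the tree ends up with a null `right` pointer; every other node without a right child carries a thread.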
Deletion
• Deletion is another operation essential to maintaining a binary search tree
• This can be a complex operation depending on the placement of the node to be deleted in the tree
• The more children a node has, the more complex the deletion process
• This implies three cases of deletion that need to be handled:
  • The node is a leaf; this is the easiest case, because all that needs to be done is to set the parent’s link to the node to null and delete the node (Figure 6.26)
  • The node has one child; also easy, as we set the parent’s pointer to the node to point to the node’s child (Figure 6.27)