One more definition: A binary tree, T, is balanced if T is empty, or if abs(height(left subtree of T) - height(right subtree of T)) <= 1 and the left subtree of T and the right subtree of T are both balanced. That is, a binary tree, T, is balanced if for every node N of T, abs(height(left subtree of N) - height(right subtree of N)) <= 1.
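The recursive definition above translates almost directly into code. The following is a minimal sketch, not the course's implementation: the `Node` class and the function names are hypothetical, and it assumes the empty tree has height -1 so that a single node has height 0, which matches the level numbering used on the later slides.

```python
class Node:
    """Hypothetical binary tree node used only for this illustration."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(t):
    """Height of tree t; assumes the empty tree has height -1."""
    if t is None:
        return -1
    return 1 + max(height(t.left), height(t.right))


def is_balanced(t):
    """True if every node's subtrees differ in height by at most 1."""
    if t is None:
        return True
    return (abs(height(t.left) - height(t.right)) <= 1
            and is_balanced(t.left)
            and is_balanced(t.right))


bushy = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))
print(is_balanced(bushy), is_balanced(chain))  # True False
```

This version recomputes heights at every node, so it is quadratic in the worst case. It is meant to mirror the definition, not to be efficient.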
[Figure: two example trees, one labeled "balanced" and one labeled "Unbalanced at this node".]
The basic reason for using a binary search tree for storing data is to optimize search time. If a binary search tree, T, containing n nodes is full, this optimized search time is realized, and n = 2^k - 1 for some non-negative integer k. The height of T is k - 1. The tree T has k levels, with nodes at levels 0 (the root node), 1, 2, 3, . . ., k - 1, and all leaf nodes of T are at level k - 1. Solving the equation above for k gives k = log2(n + 1).
Example: n = 31. Since n = 2^k - 1, k = log2(n + 1) = 5. The height of the tree is k - 1 = 4. All leaf nodes are at level 4.
At the opposite extreme, a binary search tree with n nodes could be a chain, or vine. Such a tree has n levels and a height of n - 1. The maximum number of comparisons in a search for a particular node is n, one for each level.
The maximum number of comparisons in a search of a full binary search tree for a particular node is k = log2(n + 1), one comparison at each level. The maximum number of comparisons in a search of a binary search tree that is a chain is n, again one for each level. For example, let n = 262,143 = 2^18 - 1. If the n nodes are stored in a full binary search tree, a maximum of 18 comparisons are needed in a search for a particular node. If the n nodes are stored in a binary search tree that forms a chain, a maximum of 262,143 comparisons are needed.
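The two extremes can be checked with a few lines of arithmetic. This is an illustrative sketch with hypothetical function names. It uses integer `bit_length` instead of a floating-point logarithm so the result is exact when n + 1 is a power of two.

```python
def max_comparisons_full(n):
    """Worst-case comparisons in a full BST with n = 2**k - 1 nodes (that is, k)."""
    k = (n + 1).bit_length() - 1   # exact log2 when n + 1 is a power of two
    if 2 ** k - 1 != n:
        raise ValueError("a full tree must have 2**k - 1 nodes")
    return k


def max_comparisons_chain(n):
    """Worst-case comparisons in a chain of n nodes: one per level."""
    return n


n = 262_143
print(max_comparisons_full(n))   # 18
print(max_comparisons_chain(n))  # 262143
```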
Clearly a full binary search tree realizes the optimal search time. But a full tree has no room to grow, or shrink. Between these two extremes, a balanced tree or a complete tree yields close to the optimal search time, and still has room to grow.
The minimum height of a binary tree with n nodes is ceiling(log2(n + 1)) - 1, and the number of levels is ceiling(log2(n + 1)). To show this: Let k be the smallest integer for which n <= 2^k - 1. Then 2^(k-1) - 1 < n <= 2^k - 1. Add one to all three parts of this inequality and take log2 of all three parts: k - 1 < log2(n + 1) <= k
k - 1 < log2(n + 1) <= k. If equality holds, the tree is full, k = log2(n + 1) = the number of levels, and the height of the tree = log2(n + 1) - 1. Otherwise, log2(n + 1) is not an integer; round it up, so that k = ceiling(log2(n + 1)) = the number of levels, and ceiling(log2(n + 1)) - 1 = the height of the tree.
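As a check on the derivation, the smallest k with n <= 2^k - 1 can be computed without floating point. For a non-negative Python integer, `n.bit_length()` is that k. The sketch below uses hypothetical function names and verifies this against a brute-force search over the defining inequality.

```python
def min_levels(n):
    """Smallest k with n <= 2**k - 1, i.e. ceiling(log2(n + 1))."""
    return n.bit_length()


def min_height(n):
    """Minimum height of a binary tree with n nodes (n >= 1)."""
    return min_levels(n) - 1


def min_levels_by_search(n):
    """Direct search over the inequality, used only to cross-check."""
    k = 0
    while n > 2 ** k - 1:
        k += 1
    return k


for n in (1, 2, 3, 4, 7, 8, 31, 32):
    assert min_levels(n) == min_levels_by_search(n)

print(min_levels(31), min_height(31))  # 5 4
```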
Suppose T is a binary search tree with 300,000 nodes having minimal height; for instance, T may be a complete tree. The smallest integer, k, for which n <= 2^k - 1, where n = 300,000, is 19, since 2^19 - 1 = 524,287 and 2^18 - 1 = 262,143. So 2^(k-1) - 1 < n <= 2^k - 1, and the maximum number of comparisons in a search of this binary search tree for a particular node is 19. If T is a complete tree, there are 150,000 leaf nodes, and more than 112,000 of them are at the next-lowest level, so the tree can grow without degrading search times.
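The figures quoted for the 300,000-node complete tree can be reproduced as follows. This is a sketch with a hypothetical function name. It assumes n >= 1 and relies on the fact that a complete binary tree with n nodes has ceiling(n / 2) leaves.

```python
def complete_tree_profile(n):
    """Return (levels, bottom-level nodes, total leaves, leaves one level up)."""
    levels = n.bit_length()               # smallest k with n <= 2**k - 1
    above = 2 ** (levels - 1) - 1         # nodes in the full levels above the bottom
    bottom = n - above                    # nodes on the lowest level (all leaves)
    leaves = (n + 1) // 2                 # ceiling(n / 2) leaves in a complete tree
    leaves_above = leaves - bottom        # leaves on the next-lowest level
    return levels, bottom, leaves, leaves_above


print(complete_tree_profile(300_000))  # (19, 37857, 150000, 112143)
```

Each of the 112,143 leaves on the next-lowest level can still take children without adding a level.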
In practice, searching a set of data occurs MUCH MORE frequently than adding a new item of data or removing an existing item. The algorithm presented in the text follows the premise that whenever a node is added to, or removed from, a balanced tree, the tree is tested, and if it is unbalanced, the tree is rebalanced with one or more rotation operations.
A newer algorithm that will be presented in class follows a different premise. The tree is initially built as a complete binary search tree. As nodes are added and removed (following the algorithms illustrated in class), the tree may become closer to a chain and further from a complete tree. Consequently, search times degrade. A statistical utility tracks the search times, and when the average number of comparisons per search exceeds log2(n + 1) by some percentage, a rebalancing utility is called to re-form the binary search tree as a complete binary search tree. So rebalancing occurs only when performance is suffering.
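The slides do not specify how the statistical utility works, so the following is only one possible sketch. The class name `SearchStats`, its methods, and the 25% default tolerance are all assumptions made for illustration.

```python
import math


class SearchStats:
    """Hypothetical tracker: running average of comparisons per search."""

    def __init__(self, tolerance=0.25):
        self.tolerance = tolerance   # assumed threshold: 25% above log2(n + 1)
        self.searches = 0
        self.comparisons = 0

    def record(self, comparisons):
        """Call once per search with the number of comparisons it used."""
        self.searches += 1
        self.comparisons += comparisons

    def needs_rebalance(self, n):
        """True when the average exceeds log2(n + 1) by more than the tolerance."""
        if self.searches == 0:
            return False
        average = self.comparisons / self.searches
        return average > (1 + self.tolerance) * math.log2(n + 1)


stats = SearchStats()
for used in (5, 6):
    stats.record(used)
print(stats.needs_rebalance(31))   # False: average 5.5, threshold 6.25
stats.record(20)
print(stats.needs_rebalance(31))   # True: average is now above 6.25
```

A real utility would probably reset its counters after each rebalance so that old searches do not skew the average.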
The rebalancing algorithm:
1. Convert the tree to a vine. A vine is a binary tree in which the left child of every node is NULL.
2. Convert the vine to a complete binary search tree.
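Step One - converting the tree to a vine: For each node, N, in the tree, if N has a left child, rotate N and its left child to the right (clockwise). If the left child of N has a right subtree, that subtree becomes the left subtree of N. After the rotation, the former left child occupies N's position, so the test is repeated there before moving on down the chain of right children.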
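Step One can be sketched as a single pass down the right spine, rotating right wherever a left child is found. The `Node` class and function name are hypothetical. The temporary pseudo-root is a common implementation convenience that the slides do not mention, included here so that the real root needs no special case.

```python
class Node:
    """Hypothetical binary tree node used only for this illustration."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_to_vine(root):
    """Flatten a BST into a vine (every left child is None); returns the new root."""
    pseudo = Node(None, right=root)    # assumed helper node sitting above the root
    tail = pseudo                      # last node already known to be in the vine
    rest = tail.right                  # next node to examine
    while rest is not None:
        if rest.left is None:
            tail = rest                # rest is vine-ready; move down the chain
            rest = rest.right
        else:
            child = rest.left          # rotate rest and its left child to the right
            rest.left = child.right    # child's right subtree becomes rest's left
            child.right = rest
            rest = child               # re-test at the node that took rest's place
            tail.right = child
    return pseudo.right


root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
node = tree_to_vine(root)
keys = []
while node is not None:
    keys.append(node.key)
    node = node.right
print(keys)  # [1, 2, 3, 4, 5, 6, 7]
```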
Step Two - converting the vine to a complete binary tree. The sequence { 2^k - 1 : k >= 1 } = { 1, 3, 7, 15, 31, . . . } plays an important role in this step. Let n = the number of nodes in the vine. Let k be the smallest integer such that 2^(k-1) - 1 < n <= 2^k - 1.
Case I: n = 2^k - 1. The resulting complete tree will be a full tree.
1. At every second node, N, (and its parent), rotate to the left (counterclockwise). If N has a left subtree, it becomes the right subtree of N's parent. The number of rotations = 2^(k-1) - 1 (a value in the sequence above).
2. Repeat these rotations at every second node in the right chain, this time for 2^(k-2) - 1 nodes (the next smaller value in the sequence above), and continue with each smaller value in turn. At the last repetition, perform a single left rotation at the second node and its parent.
Case II: 2^(k-1) - 1 < n < 2^k - 1. The resulting tree will be complete, but not full.
1. Do a left rotation about every second node for a total of n - (2^(k-1) - 1) nodes. This is the number of nodes that will be in the lowest level of the complete tree. The resulting chain of right children will contain 2^(k-1) - 1 nodes.
2. Apply Case I.
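Both cases of Step Two can be sketched with one helper that performs a given number of left rotations at every second node of the right chain. All names here are hypothetical, and the pseudo-root is an implementation convenience not described on the slides. `make_vine`, `height`, and `inorder` exist only to exercise the example.

```python
class Node:
    """Hypothetical binary tree node used only for this illustration."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def compress(pseudo, count):
    """Do `count` left rotations, one at every second node of the right chain."""
    scanner = pseudo
    for _ in range(count):
        parent = scanner.right         # first node of the pair
        scanner.right = parent.right   # second node N moves up past its parent
        scanner = scanner.right
        parent.right = scanner.left    # N's left subtree becomes parent's right
        scanner.left = parent


def vine_to_tree(vine_root, n):
    """Turn a vine of n nodes into a complete BST; returns the new root."""
    pseudo = Node(None, right=vine_root)
    full = 2 ** ((n + 1).bit_length() - 1) - 1   # largest 2**j - 1 not exceeding n
    compress(pseudo, n - full)         # Case II step 1 (zero rotations in Case I)
    size = full
    while size > 1:                    # Case I: 2**(k-1) - 1, 2**(k-2) - 1, ..., 1
        size //= 2
        compress(pseudo, size)
    return pseudo.right


def make_vine(n):
    """Build a vine holding keys 1..n (helper for the demonstration only)."""
    root = None
    for key in range(n, 0, -1):
        root = Node(key, right=root)
    return root


def height(t):
    return -1 if t is None else 1 + max(height(t.left), height(t.right))


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


tree = vine_to_tree(make_vine(5), 5)
print(tree.key, inorder(tree), height(tree))  # 4 [1, 2, 3, 4, 5] 2
```

With n = 5, two rotations place keys 1 and 3 on the lowest level, and the remaining chain of three nodes is handled as Case I with a single rotation.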