Lecture 9
Tutoring • Tutoring for CS122, CS 120, and CS 201-203 is available for the rest of the term in the CS department office at the following times: • Mondays 11-3 • Wednesdays 10-1 • Thursdays 3-6 • Fridays 10-2
Schedule Weeks 8-9 • I will be out of town for a long Memorial Day weekend • Jeff Miller will sub this Wednesday. He will not have seen the lab assignment. • Next Monday, the 26th, is a holiday • My office hours are cancelled on Tuesday, May 27 • I will be back Weds, May 28.
Breadth First Traversal of a Binary Tree Breadth First Search is not a wise way to find a particular element in a binary search tree, since it does not take advantage of the O(log n) binary search. However, the term is often used loosely to describe breadth-first traversal of an entire data structure. I have assigned this task in lab 7 mostly because we happen to have a BST class to use. The most likely actual use case for BFS with a BST is to test whether your trees are being constructed correctly. Here is the algorithm: • Create a permanent queue to hold all elements. • Create another, temporary queue to hold elements that you are processing to find the BFS order. • Add the root to the temporary queue. • As long as there are node references in the temporary queue, poll them one at a time, add them to the permanent queue, and add any left or right references to the temporary queue. • When the last node is polled from the temporary queue, the permanent queue is complete.
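The two-queue steps above can be sketched roughly as follows. This is only an illustration, not the lab solution: the `Node` class, the `insert` helper, and the name `breadthFirst` are made up for this example, and the BST class used in lab 7 will look different.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class BreadthFirstSketch {
    // Hypothetical minimal node type; the lab's BST class will differ.
    static class Node {
        int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // Plain BST insert, used only to build a tree for the demo.
    static Node insert(Node root, int value) {
        if (root == null) return new Node(value);
        if (value < root.data) root.left = insert(root.left, value);
        else if (value > root.data) root.right = insert(root.right, value);
        return root;
    }

    // Returns the elements of the tree in breadth-first (level) order.
    static Queue<Integer> breadthFirst(Node root) {
        Queue<Integer> permanent = new ArrayDeque<>(); // final BFS order
        Queue<Node> temporary = new ArrayDeque<>();    // nodes still to process
        if (root != null) temporary.add(root);
        while (!temporary.isEmpty()) {
            Node current = temporary.poll();
            permanent.add(current.data);
            if (current.left != null) temporary.add(current.left);
            if (current.right != null) temporary.add(current.right);
        }
        return permanent;
    }

    public static void main(String[] args) {
        Node root = null;
        for (int v : new int[] {50, 30, 70, 20, 40, 60, 80}) root = insert(root, v);
        System.out.println(breadthFirst(root)); // [50, 30, 70, 20, 40, 60, 80]
    }
}
```

Printing the result level by level like this is a quick way to check that a tree was built the way you expected.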
BST Node Removal The algorithm for BST node removal from the last lecture is the one applied in the sample code from the Liang book. Horstmann describes a different but equivalent method: • If the node has no children, just reset the link from its parent to null (same as Liang) • If the node to be deleted has one child, whether left or right, change the link from the parent to point to the child instead • If the node to be deleted has two children, find the node with the smallest value in the right subtree (mirror image of part of Liang's method) • Move its data element into the node that holds the data to be removed • Unlink the smallest node: it cannot have a left child, so change the link from its parent so that it points to its right child (or to null if it has none)
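Here is a minimal sketch of the removal cases described above. It is not Horstmann's or Liang's actual code: the class `BstRemovalSketch`, its `Node` type, and the method names are assumptions made for this example, and it stores plain `int` values to stay short.

```java
import java.util.ArrayList;
import java.util.List;

class BstRemovalSketch {
    // Hypothetical node type for illustration only.
    static class Node {
        int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    Node root;

    void add(int value) {
        if (root == null) { root = new Node(value); return; }
        Node current = root;
        while (true) {
            if (value == current.data) return; // ignore duplicates
            if (value < current.data) {
                if (current.left == null) { current.left = new Node(value); return; }
                current = current.left;
            } else {
                if (current.right == null) { current.right = new Node(value); return; }
                current = current.right;
            }
        }
    }

    void remove(int value) {
        // Find the node holding the value, remembering its parent.
        Node parent = null;
        Node current = root;
        while (current != null && current.data != value) {
            parent = current;
            current = value < current.data ? current.left : current.right;
        }
        if (current == null) return; // value not in the tree

        if (current.left != null && current.right != null) {
            // Two children: find the smallest node in the right subtree.
            Node smallestParent = current;
            Node smallest = current.right;
            while (smallest.left != null) {
                smallestParent = smallest;
                smallest = smallest.left;
            }
            current.data = smallest.data; // move its data up
            // Unlink the smallest node; it has no left child.
            if (smallestParent == current) smallestParent.right = smallest.right;
            else smallestParent.left = smallest.right;
            return;
        }

        // Zero or one child: point the parent's link at the child (or null).
        Node child = current.left != null ? current.left : current.right;
        if (parent == null) root = child;
        else if (parent.left == current) parent.left = child;
        else parent.right = child;
    }

    List<Integer> inOrder() {
        List<Integer> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node n, List<Integer> out) {
        if (n == null) return;
        collect(n.left, out);
        out.add(n.data);
        collect(n.right, out);
    }

    public static void main(String[] args) {
        BstRemovalSketch tree = new BstRemovalSketch();
        for (int v : new int[] {50, 30, 70, 20, 40, 60, 80}) tree.add(v);
        tree.remove(20); // no children
        tree.remove(30); // one child (40)
        tree.remove(50); // two children: 60 moves into the root
        System.out.println(tree.inOrder()); // [40, 60, 70, 80]
    }
}
```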
BST Time Complexity Note that, if there are at least two elements to be placed in a BST, the same data can generate multiple different BSTs. If there are at least three nodes, it can generate BSTs with different heights. [Figure: two different BSTs holding the values 1, 2, and 3]
BST Time Complexity Consider what happens if a BST is created from already-sorted data without any attempts at optimization. Already-sorted data is not unusual! [Figure: a BST built by inserting 1, 2, 3, 4 in order, forming a chain of right children]
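To see the effect of insertion order on height, here is a small sketch with an unbalanced BST. The `Node`, `insert`, `build`, and `height` names are invented for this example; height is counted in edges, so an empty tree is given height -1.

```java
class BstHeightSketch {
    // Hypothetical node type for illustration only.
    static class Node {
        int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // Plain BST insert with no rebalancing.
    static Node insert(Node root, int value) {
        if (root == null) return new Node(value);
        if (value < root.data) root.left = insert(root.left, value);
        else if (value > root.data) root.right = insert(root.right, value);
        return root;
    }

    // Builds a tree by inserting the values in the order given.
    static Node build(int... values) {
        Node root = null;
        for (int v : values) root = insert(root, v);
        return root;
    }

    // Height in edges; an empty tree has height -1.
    static int height(Node n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }

    public static void main(String[] args) {
        System.out.println(height(build(1, 2, 3)));    // 2: sorted input gives a chain
        System.out.println(height(build(2, 1, 3)));    // 1: same data, shorter tree
        System.out.println(height(build(1, 2, 3, 4))); // 3: height is n - 1
    }
}
```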
BST Time Complexity A complete binary tree is one in which every level other than the lowest level is completely filled, and the nodes on the lowest level are as far to the left as possible. [Figure: an example of a complete binary tree]
BST Time Complexity If a tree is perfectly balanced, i.e., a complete binary tree, its height (number of levels minus 1) is $\lfloor \log_2 n \rfloor$. If it is completely unbalanced, its height is n - 1. The time complexity for search, insertion and deletion is proportional to the height of the tree. In the worst case, the height of the tree is n - 1. In the best case it is $\lfloor \log_2 n \rfloor$. We want our trees to be balanced, so that we get the O(log n) behavior.
Balancing Binary Search Trees We can keep trees in balance by rotating nodes in various ways after insertions and deletions. Rebalancing trees is expensive. In general, however, the more often we search the tree relative to the frequency with which we insert or delete nodes, the more advantageous it is to keep the tree balanced. It is possible to maintain a BST so that it is always perfectly balanced, but in most cases we can be a little bit lax about this in order to reduce the cost of rebalancing. The compromise is to maintain a well-balanced tree, i.e., one in which the heights of the two subtrees of every node are about the same. One way to accomplish this is by using a Red-Black Tree. The Java Collections Framework TreeSet and TreeMap are implemented this way.
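As a small illustration of the last point, a `java.util.TreeSet` can be fed already-sorted data without any special handling on our side. The class and method names below (`TreeSetSketch`, `fromSortedInput`) are made up for this example; the `TreeSet` calls themselves (`add`, `contains`, `first`, `last`, `ceiling`) are standard.

```java
import java.util.TreeSet;

class TreeSetSketch {
    // Inserts 1..n in sorted order, the worst case for an unbalanced BST.
    static TreeSet<Integer> fromSortedInput(int n) {
        TreeSet<Integer> set = new TreeSet<>();
        for (int i = 1; i <= n; i++) set.add(i);
        return set;
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = fromSortedInput(1000);
        System.out.println(set.contains(500));  // true
        System.out.println(set.first());        // 1
        System.out.println(set.last());         // 1000
        System.out.println(set.ceiling(1001));  // null: nothing >= 1001
    }
}
```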
Red-Black Trees Some of the following slides are from Liang; others are by Vicky Allen of Utah State University: digital.cs.usu.edu/~allanv/cs2420/RedBlackTrees.ppt
2-4 Trees Red-Black trees correspond to a type of tree called a 2-4 Tree or a B-Tree of degree 4. These are outside the scope of this class, and you don’t need to understand them to understand Red-Black trees. See Wikipedia for more on 2-4 trees.
Red-Black Trees A red-black tree is a binary search tree in which: • Every node is either red or black. • The root is black • If a node is red, then its children are black. • Every path from a node to a leaf or null reference contains the same number of black nodes.
Red-Black Trees • The black-height of a node, n, in a red-black tree is the number of black nodes on any path to a leaf / null reference. • Unlike the usual way of counting tree height, we are counting nodes, not edges. • Some ways of describing red-black trees refer to null references as black nodes; in this case, the black height does not count them.
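The color rules and the black-height definition can be checked with a short recursive routine. This is only a sketch under some assumptions: the `Node` type and the names `blackHeight`, `isRed`, and `isValid` are invented here, colors are stored as ints (0 for red, 1 for black), null references are not counted as black nodes, and the check does not verify the BST ordering of the data.

```java
class RedBlackCheckSketch {
    static final int RED = 0, BLACK = 1;

    // Hypothetical node type for illustration only.
    static class Node {
        int data, color;
        Node left, right;
        Node(int data, int color, Node left, Node right) {
            this.data = data;
            this.color = color;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.color == RED;
    }

    // Returns the black-height of the subtree rooted at n (counting n itself
    // if it is black, and not counting null references), or -1 if the subtree
    // has a red node with a red child or paths with unequal black counts.
    static int blackHeight(Node n) {
        if (n == null) return 0;
        if (n.color == RED && (isRed(n.left) || isRed(n.right))) return -1;
        int leftHeight = blackHeight(n.left);
        int rightHeight = blackHeight(n.right);
        if (leftHeight == -1 || rightHeight == -1 || leftHeight != rightHeight) return -1;
        return leftHeight + (n.color == BLACK ? 1 : 0);
    }

    // Checks the color rules only: black root, no double red, equal exit cost.
    static boolean isValid(Node root) {
        return root == null || (root.color == BLACK && blackHeight(root) != -1);
    }

    public static void main(String[] args) {
        Node ok = new Node(2, BLACK,
                new Node(1, RED, null, null),
                new Node(3, RED, null, null));
        System.out.println(isValid(ok));     // true
        System.out.println(blackHeight(ok)); // 1

        Node lopsided = new Node(2, BLACK, new Node(1, BLACK, null, null), null);
        System.out.println(isValid(lopsided)); // false: unequal black counts
    }
}
```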
[Figure: a valid red-black tree with black-height = 4, even though we would count the total height as only 3] An all-black tree must be full, with every level completely filled, or it would violate the equal-exit-cost property.
Red-Black Trees A tree with a given black height bh has at least $2^{bh} - 1$ nodes: $2^{bh} - 1 \le n$, so $2^{bh} \le n + 1$ and $bh \le \log_2(n + 1)$. Since no red node may have a red child, the height of the tree is no more than twice bh, so $h \le 2 \cdot bh \le 2 \log_2(n + 1)$. If you want to get mathy, work out how log(n+1) relates to log(n) for increasing n. If not, take my word for it that this is O(log n). Therefore, while the tree may not be perfectly balanced, it is sufficiently balanced to preserve the O(log n) behavior.
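For those who do want to get mathy, one way to finish the argument is the following short derivation. It assumes base-2 logarithms and n ≥ 1, and uses the bound h ≤ 2 log(n + 1) from above.

```latex
\begin{align*}
n + 1 &\le 2n && \text{for } n \ge 1 \\
\log_2(n + 1) &\le \log_2(2n) = 1 + \log_2 n \\
h \le 2\log_2(n + 1) &\le 2 + 2\log_2 n = O(\log n)
\end{align*}
```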
Node Insertion Two easy cases for node insertion: • If a node to be inserted will be the root of the tree, we just make the root reference point to it and color it black. • If the parent of the node to be inserted is black, we just make the parent's reference point to the new node and color the node red. This can't create a violation of any of the RBT properties.
Node Insertion If the parent node is red • We can’t just add the new node and color it black, because it will have a larger black height than other terminal nodes in the tree • We can’t just add the new node and color it red, because red parent nodes can’t have red child nodes
Node Insertion In the following images, the nodes and references are labeled in sort order. For example, n1 is earlier in sort order than n2, and t1 is earlier in sort order than t2. Here is the configuration we need to get to: [Figure: n2 at the top with children n1 and n3; subtrees t1 and t2 hang under n1, and t3 and t4 under n3] Since we are moving the black grandparent down, its new sibling must also be black in order to preserve the equal-exit-cost property.
Node Insertion There are four possible red-parent situations. In all cases, the grandparent is black, since the tree was a valid red-black tree before the insertion.
private void fixDoubleRed(Node child) {
    Node parent = child.parent;
    Node grandParent = parent.parent;
    // A red parent with no grandparent is the root: color it black and stop.
    if (grandParent == null) {
        parent.color = BLACK;
        return;
    }

    // Label the three nodes (n1..n3) and four subtrees (t1..t4) in sort order.
    Node n1, n2, n3, t1, t2, t3, t4;
    if (parent == grandParent.left) {
        n3 = grandParent;
        t4 = grandParent.right;
        if (child == parent.left) {
            n1 = child;
            n2 = parent;
            t1 = child.left;
            t2 = child.right;
            t3 = parent.right;
        } else {
            n1 = parent;
            n2 = child;
            t1 = parent.left;
            t2 = child.left;
            t3 = child.right;
        }
    } else {
        n1 = grandParent;
        t1 = grandParent.left;
        if (child == parent.left) {
            n2 = child;
            n3 = parent;
            t2 = child.left;
            t3 = child.right;
            t4 = parent.right;
        } else {
            n2 = parent;
            n3 = child;
            t2 = parent.left;
            t3 = child.left;
            t4 = child.right;
        }
    }

    // Rebuild the subtree with n2 on top and n1, n3 as its children.
    replaceWith(grandParent, n2);
    n1.setLeftChild(t1);
    n1.setRightChild(t2);
    n2.setLeftChild(n1);
    n2.setRightChild(n3);
    n3.setLeftChild(t3);
    n3.setRightChild(t4);

    // Colors are ints (red is 0, black is 1), so a black grandparent gives a red n2.
    n2.color = grandParent.color - 1;
    n1.color = BLACK;
    n3.color = BLACK;

    if (n2 == root) {
        root.color = BLACK;
    } else if (n2.color == RED && n2.parent.color == RED) {
        // The new red n2 may sit under a red parent: repeat one level up.
        fixDoubleRed(n2);
    }
}
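The code above relies on three helpers that are not shown on the slide: replaceWith, setLeftChild, and setRightChild. A plausible sketch of them follows. This is a guess at their behavior based on how they are called, not the textbook's actual code, and the surrounding `RestructureHelpersSketch` class and its `Node` type are made up for the example.

```java
class RestructureHelpersSketch {
    // Hypothetical node type with a parent reference.
    static class Node {
        int data;
        Node left, right, parent;
        Node(int data) { this.data = data; }

        // Links child on the left and keeps its parent reference in sync.
        void setLeftChild(Node child) {
            left = child;
            if (child != null) child.parent = this;
        }

        // Links child on the right and keeps its parent reference in sync.
        void setRightChild(Node child) {
            right = child;
            if (child != null) child.parent = this;
        }
    }

    Node root;

    // Makes replacement occupy the position that toBeReplaced had
    // under its parent (or as the root of the whole tree).
    void replaceWith(Node toBeReplaced, Node replacement) {
        Node parent = toBeReplaced.parent;
        if (parent == null) {
            root = replacement;
            if (replacement != null) replacement.parent = null;
        } else if (toBeReplaced == parent.left) {
            parent.setLeftChild(replacement);
        } else {
            parent.setRightChild(replacement);
        }
    }

    public static void main(String[] args) {
        // Left-left chain 3 -> 2 -> 1, restructured so that 2 ends up on top.
        RestructureHelpersSketch tree = new RestructureHelpersSketch();
        Node g = new Node(3), p = new Node(2), c = new Node(1);
        tree.root = g;
        g.setLeftChild(p);
        p.setLeftChild(c);

        tree.replaceWith(g, p);
        p.setLeftChild(c);
        p.setRightChild(g);
        g.setLeftChild(null);
        System.out.println(tree.root.data + " " + tree.root.left.data + " " + tree.root.right.data); // 2 1 3
    }
}
```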
Node Insertion This procedure may create a double-red situation farther up the tree, so the rebalancing is applied recursively, moving towards the root. If rebalancing ends up turning the root red, we just change it to black. This has the same effect on the black height of all nodes, and also preserves the other properties.
Node Deletion First, apply the BST deletion algorithm • Recall that, under Horstmann's method, if we removed data that was in a node with two children, we did not remove this node; we removed a node farther down instead • If we removed a red node, all the RBT properties are preserved • If we removed a black node, though, we need to rebalance to preserve the equal-exit-cost property
Node Deletion • Note that every black node other than the root has a sibling; otherwise there would be a violation of the equal-exit-cost rule. • The sibling may be black, but it is also possible to have a red sibling that has black child nodes
Node Deletion If the sibling is black and the parent is red, remove the node to be deleted, change the parent to black, and change the sibling to red • if this causes a double-red violation in one or both of the sibling's subtrees, rebalance to correct this
Black Parent It is not possible to have a red sibling and a red parent, since this would violate the double-red property. The node may have a black parent and either a red or black sibling. In this case, deleting the black node causes an exit-cost imbalance. Recall that red is defined as 0 and black as 1. "Bubble up" the black cost to the black parent, which temporarily holds a black cost of 2. A red sibling then becomes "negative red", with a cost of -1, and a black sibling becomes red.
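The cost bookkeeping can be sketched with colors stored as ints. This is an illustration of the arithmetic only, not a full deletion routine: the `Node` type and the names `bubbleUp` and `exitCost` are invented here, and the sketch assumes the parent has both children, which holds whenever one child is a black node.

```java
class BubbleUpSketch {
    static final int NEGATIVE_RED = -1, RED = 0, BLACK = 1, DOUBLE_BLACK = 2;

    // Hypothetical node type that stores only a color cost.
    static class Node {
        int color;
        Node left, right;
        Node(int color, Node left, Node right) {
            this.color = color;
            this.left = left;
            this.right = right;
        }
    }

    // Moves one unit of cost from each child up to the parent.
    // Every path through the parent keeps the same total cost.
    static void bubbleUp(Node parent) {
        parent.color++;
        parent.left.color--;
        parent.right.color--;
    }

    // Total cost along the path that always goes left (or always right).
    static int exitCost(Node n, boolean goLeft) {
        int cost = 0;
        while (n != null) {
            cost += n.color;
            n = goLeft ? n.left : n.right;
        }
        return cost;
    }

    public static void main(String[] args) {
        // Black parent, black node to delete (left), red sibling with black children.
        Node toDelete = new Node(BLACK, null, null);
        Node sibling = new Node(RED, new Node(BLACK, null, null), new Node(BLACK, null, null));
        Node parent = new Node(BLACK, toDelete, sibling);

        bubbleUp(parent);
        System.out.println(parent.color);            // 2 (double black)
        System.out.println(toDelete.color);          // 0 (red, safe to remove)
        System.out.println(sibling.color);           // -1 (negative red)
        System.out.println(exitCost(parent, false)); // 2, same as before bubbling up
    }
}
```

After bubbling up, the node to be deleted is red, so removing it does not change any exit cost; what remains is to get rid of the double-black and negative-red nodes.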
Red Sibling-Black Parent This creates the situation shown on the left. The cure is much like the two-child BST deletion. Note that n4 below donates its extra unit of cost to n1, which becomes a regular red node. It may be necessary to correct a double-red situation in n1's subtrees.
Double-Black And Double-Red Bubbling up the cost of a black sibling to a black parent results in a double-black parent and a now-red sibling. When this creates a double-red configuration below, we can rotate nodes and color all three black. Note that this does not change the exit cost for any path. [Figure: the configuration before the rotation]
Double-Black Without Double-Red If we don’t have a double-red situation, we bubble up the extra cost again until we either • Have a black sibling-red parent situation, in which case we turn the parent black and turn the sibling red, preserving equal-exit costs • Have a red sibling-black parent situation, in which case we can correct by rotation as shown several slides above • Reach the root, in which case we can reduce all the costs in the tree by one by simply making it a regular black node