Chapter 10: Search Structures Instructors: C. Y. Tang and J. S. Roger Jang. All the material is integrated from the textbook "Fundamentals of Data Structures in C", with some supplements from the slides of Prof. Hsin-Hsi Chen (NTU).
Balance Matters • Binary search trees can be degenerate. • If you insert keys in sorted order using the insertion algorithm introduced before, you’ll obtain a degenerate BST (a chain of n nodes). • Search takes O(n) time in such cases.
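A minimal sketch of the degenerate case, assuming a plain unbalanced BST with integer keys. The names `BstNode`, `bst_insert`, and `bst_height` are illustrative and are not taken from the textbook.

```c
#include <stdlib.h>

typedef struct BstNode {
    int key;
    struct BstNode *left, *right;
} BstNode;

/* Plain (unbalanced) BST insertion; duplicate keys are ignored. */
BstNode *bst_insert(BstNode *root, int key) {
    if (root == NULL) {
        BstNode *n = malloc(sizeof *n);
        if (n == NULL) abort();          /* out of memory */
        n->key = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->key)
        root->left = bst_insert(root->left, key);
    else if (key > root->key)
        root->right = bst_insert(root->right, key);
    return root;
}

/* Height counted in nodes: the empty tree has height 0. */
int bst_height(const BstNode *root) {
    if (root == NULL) return 0;
    int hl = bst_height(root->left);
    int hr = bst_height(root->right);
    return 1 + (hl > hr ? hl : hr);
}
```

Inserting 1, 2, ..., 7 in that order gives a chain of height 7, while inserting the same keys in the order 4, 2, 6, 1, 3, 5, 7 gives height 3.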
Balanced Binary Search Trees • There are binary search trees that guarantee balance. • Balance factor of a node: (height of left subtree) – (height of right subtree) • An AVL tree is a binary search tree where each node has a balance factor of -1, 0, or 1.
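The balance factor definition can be sketched in C as follows. This is only an illustration: `TreeNode`, `height`, `balance_factor`, and `is_avl` are assumed names, heights are recomputed recursively rather than cached, and `is_avl` checks only the balance condition, not the search-tree ordering.

```c
#include <stddef.h>

typedef struct TreeNode {
    int key;
    struct TreeNode *left, *right;
} TreeNode;

/* Height counted in nodes: the empty tree has height 0. */
int height(const TreeNode *t) {
    if (t == NULL) return 0;
    int hl = height(t->left);
    int hr = height(t->right);
    return 1 + (hl > hr ? hl : hr);
}

/* (height of left subtree) - (height of right subtree); t must be non-NULL. */
int balance_factor(const TreeNode *t) {
    return height(t->left) - height(t->right);
}

/* Returns 1 if every node has balance factor -1, 0, or 1, and 0 otherwise. */
int is_avl(const TreeNode *t) {
    if (t == NULL) return 1;
    int bf = balance_factor(t);
    return bf >= -1 && bf <= 1 && is_avl(t->left) && is_avl(t->right);
}
```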
AVL Trees • Balance is maintained by the insertion and deletion algorithms. Both take O(log n) time. • For example, if an insertion unbalances the tree, a rotation is performed to restore balance. • For details please refer to your textbook.
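As one illustration of such a rotation, here is a sketch of a single right rotation, the usual fix when the left subtree of a node's left child has grown too tall. The names `AvlNode` and `rotate_right` are assumptions, and the height bookkeeping a full AVL implementation needs is omitted; the textbook's version may differ.

```c
#include <stddef.h>

typedef struct AvlNode {
    int key;
    struct AvlNode *left, *right;
} AvlNode;

/* Right rotation about y. y->left must be non-NULL.
   Returns the new root of the rotated subtree. */
AvlNode *rotate_right(AvlNode *y) {
    AvlNode *x = y->left;
    y->left = x->right;   /* x's right subtree lies between x and y */
    x->right = y;
    return x;
}
```

Applied to the left-leaning chain 3, 2, 1, the rotation about 3 yields a subtree rooted at 2 with children 1 and 3.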
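2-3 Trees • Each internal node has a degree of 2 or 3. • External nodes are all at the same level. (Figure: an example 2-3 tree in which the subtrees on either side of key 40 are labeled "< 40" and "> 40".)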
2-3 Trees • The number of elements is between $2^h - 1$ and $3^h - 1$, where h is the height of the tree. • So if there are n elements, the height is between $\log_3(n+1)$ and $\log_2(n+1)$. • Hence to search in a 2-3 tree, you need O(log n) time.
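The bounds follow from the two extreme shapes of a 2-3 tree of height h: every node is a 2-node (one key per node), or every node is a 3-node (two keys per node). A short derivation, using the assumed symbols $n_{\min}$ and $n_{\max}$ for the smallest and largest number of elements:

```latex
\begin{align*}
n_{\min} &= \sum_{i=0}^{h-1} 2^i = 2^h - 1, \\
n_{\max} &= 2 \sum_{i=0}^{h-1} 3^i = 2 \cdot \frac{3^h - 1}{2} = 3^h - 1, \\
2^h - 1 \le n \le 3^h - 1 &\iff \log_3(n+1) \le h \le \log_2(n+1).
\end{align*}
```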
Search in A 2-3 Tree • The search algorithm is similar to that for a BST. • At each node with keys k1 and k2 (suppose we are searching for x): • Go to the left child if x < k1. • Go to the middle child if x > k1 and x < k2. • Go to the right child if x > k2. • If x equals k1 or k2, the search succeeds.
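The search rule can be sketched as below. The node layout `Node23` (a key count plus small arrays) is an assumption made for illustration and differs from the textbook's declaration; in a 2-node only `key[0]`, `child[0]`, and `child[1]` are used.

```c
#include <stddef.h>

typedef struct Node23 {
    int nkeys;                 /* 1 for a 2-node, 2 for a 3-node */
    int key[2];                /* key[0] < key[1] when nkeys == 2 */
    struct Node23 *child[3];   /* all NULL in a leaf */
} Node23;

/* Returns the node containing x, or NULL if x is not in the tree. */
const Node23 *search23(const Node23 *t, int x) {
    while (t != NULL) {
        if (x == t->key[0]) return t;
        if (t->nkeys == 2 && x == t->key[1]) return t;
        if (x < t->key[0])
            t = t->child[0];                       /* x < k1 */
        else if (t->nkeys == 1 || x < t->key[1])
            t = t->child[1];                       /* k1 < x (< k2) */
        else
            t = t->child[2];                       /* x > k2 */
    }
    return NULL;
}
```

Each loop iteration descends one level, so the number of iterations is bounded by the height of the tree.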
Insertion Into A 2-3 Tree • To insert 70, first search for 70. • It is not in the tree, and the leaf node encountered during the search is node C. • Since C holds only one key, we insert 70 directly into node C.
• Now we want to insert 30. The leaf we encounter is B. • B is full, so we must create a new node D. • The keys involved in this operation are 10 and 20 (the elements in B) and 30 (the element to be inserted). • Largest (30): put into D. Smallest (10): remains in B. Median (20): inserted into A, the parent of B. • Add a link from A to D.
• Now we want to insert 60. We encounter leaf node C when we search for 60. • Node C is full, so we: • Create node E to hold max{70, 80, 60} = 80. • min{70, 80, 60} = 60 will remain in C. • The median, 70, will be inserted into A. • But A is also full, so a new node F will be created. • F has children C (which holds 60) and E (which holds 80). (Figure: A = 20 | 40 with children B = 10, D = 30, and C = 60; the new node E = 80; 70 will be inserted into A.)
• But A is also full, so: • Create node F to hold max{20, 40, 70} = 70. • F has children C and E. • min{20, 40, 70} = 20 will remain in A. • med{20, 40, 70} = 40 should be inserted into the parent of A. But A has no parent, so create G to hold 40. • G has children A and F. (Figure: the tree before A is split: A = 20 | 40 with children B = 10, D = 30, and C = 60; E = 80; 70 will be inserted into A.)
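Split A 3-Node • Inserting y into a 3-node B (holding x | z) causes a split. • B keeps min, a new node G holds max, and med will be inserted into A, the parent of B. • min, max, and med are the minimum, maximum, and median of {x, y, z}, respectively. (Figure: before the split, B = x | z has children C, D, E, and F is a node that does not have a parent yet; after the split, B = min and the new node G = max share the subtrees C, D, E, F.)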
Split • Observe that this pattern repeats. • Before: p = x | z has children ch1(p), ch2(p), ch3(p), and q is a subtree waiting to be linked. • After: p holds min, a new node holds max and becomes the next q, med becomes the next y, and parent(p) becomes the next p. • The position to insert the link to q depends on the situation. • q is initialized to be null. At that time p is a leaf.
Split • Split is simpler when p is the root. • p = x | z is split into p = min and a new node holding max; a new root is created to hold med, with these two nodes as its children. • The position to insert the link to q (among ch1(p), ch2(p), ch3(p)) depends on the situation.
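The key step of every split is ordering the three keys involved. A small sketch, assuming integer keys; the function name `min_med_max` is hypothetical:

```c
/* Orders the keys {x, y, z} so that *mn <= *md <= *mx. */
void min_med_max(int x, int y, int z, int *mn, int *md, int *mx) {
    int t;
    if (x > y) { t = x; x = y; y = t; }   /* now x <= y */
    if (y > z) { t = y; y = z; z = t; }   /* now z is the maximum */
    if (x > y) { t = x; x = y; y = t; }   /* now x <= y <= z */
    *mn = x;
    *md = y;
    *mx = z;
}
```

For the keys {70, 80, 60} this gives min = 60, med = 70, max = 80, matching the split of node C in the earlier example.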
Insertion Algorithm • We are to insert key y into tree t. • First, search for y in t. As you visit each node, push it onto a stack to make it easy to find parents later. • Assume that y is not in t (otherwise we need not insert). Let p be the leaf node we encountered in the search. • So, if we pop a node from the stack, we’ll obtain the parent of p (assuming that p itself is not pushed onto the stack).
Insertion Algorithm • Initialize q to be null. • If p is a 2-node, then simply insert y into p. • Put q immediately to the right of y. That is, if w is originally in p, then we have two cases: • If w < y, p becomes w | y with links (nil, nil, q = nil). • If y < w, p becomes y | w with links (nil, q = nil, nil). • And we’re done!
Insertion Algorithm • If p is a 3-node (x | z), then split p: p keeps min, a new node holds max, and med moves up. All child links of both nodes are nil, since p is a leaf and q = nil. • Then, let p = parent(p), q be the new node holding max, and y = med. We’ll now consider the insertion of the new y into the new p.
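Insertion Algorithm • In the remaining process, if p is a 2-node, then simply insert y into p, and update the links as follows (w is the key originally in p, with children a and b): • If w < y, p becomes w | y with links (a, b, q). • If y < w, p becomes y | w with links (a, q, b). • And we’re done!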
Insertion Algorithm • If p is a 3-node, then split. Then we’ll continue to insert the new y into the new p. • As before: p (x | z) keeps min, the new node holding max becomes the next q, med becomes the next y, and parent(p) becomes the next p. • The position to insert the link to q depends on the situation.
Insertion Algorithm • If p (3-node) is the root, then the split is done in the manner stated before: a new root is created to hold med. We’re done after this. • The position to insert the link to q depends on the situation.
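The whole insertion procedure can be sketched as follows. This is one possible reading of the slides, not the textbook's code: the node layout `Node23`, the helpers `new_node`, `put_in`, and `split`, the function `insert23`, and the fixed stack bound `MAX_DEPTH` are all assumptions made for illustration.

```c
#include <stdlib.h>

#define MAX_DEPTH 64   /* assumed bound on tree height; ample since h <= log2(n+1) */

typedef struct Node23 {
    int nkeys;                 /* 1 for a 2-node, 2 for a 3-node */
    int key[2];
    struct Node23 *child[3];   /* all NULL in a leaf */
} Node23;

static Node23 *new_node(int key, Node23 *c0, Node23 *c1) {
    Node23 *n = malloc(sizeof *n);
    if (n == NULL) abort();    /* out of memory */
    n->nkeys = 1;
    n->key[0] = key;
    n->key[1] = 0;
    n->child[0] = c0;
    n->child[1] = c1;
    n->child[2] = NULL;
    return n;
}

/* p is a 2-node: add y, placing q immediately to the right of y. */
static void put_in(Node23 *p, int y, Node23 *q) {
    if (y > p->key[0]) {               /* w | y with links (a, b, q) */
        p->key[1] = y;
        p->child[2] = q;
    } else {                           /* y | w with links (a, q, b) */
        p->key[1] = p->key[0];
        p->key[0] = y;
        p->child[2] = p->child[1];
        p->child[1] = q;
    }
    p->nkeys = 2;
}

/* p is a 3-node x | z: split it around y, where q is the subtree immediately
   to the right of y. p keeps min, the returned node holds max, and *med
   receives the median. */
static Node23 *split(Node23 *p, int y, Node23 *q, int *med) {
    Node23 *r;
    if (y < p->key[0]) {               /* order: a y q x b z c */
        *med = p->key[0];
        r = new_node(p->key[1], p->child[1], p->child[2]);
        p->key[0] = y;
        p->child[1] = q;
    } else if (y < p->key[1]) {        /* order: a x b y q z c */
        *med = y;
        r = new_node(p->key[1], q, p->child[2]);
    } else {                           /* order: a x b z c y q */
        *med = p->key[1];
        r = new_node(y, p->child[2], q);
    }
    p->nkeys = 1;
    p->child[2] = NULL;
    return r;
}

/* Inserts y and returns the (possibly new) root. Duplicates are ignored. */
Node23 *insert23(Node23 *root, int y) {
    Node23 *stack[MAX_DEPTH];          /* ancestors of the leaf p */
    int top = 0;
    Node23 *p = root, *q = NULL;
    if (root == NULL) return new_node(y, NULL, NULL);
    for (;;) {                         /* search, pushing visited internal nodes */
        if (y == p->key[0] || (p->nkeys == 2 && y == p->key[1])) return root;
        if (p->child[0] == NULL) break;            /* p is the leaf; not pushed */
        stack[top++] = p;
        if (y < p->key[0]) p = p->child[0];
        else if (p->nkeys == 1 || y < p->key[1]) p = p->child[1];
        else p = p->child[2];
    }
    for (;;) {
        if (p->nkeys == 1) {           /* 2-node: insert and stop */
            put_in(p, y, q);
            return root;
        }
        q = split(p, y, q, &y);        /* 3-node: y becomes med, q holds max */
        if (top == 0) return new_node(y, p, q);    /* p was the root */
        p = stack[--top];              /* continue with parent(p) */
    }
}
```

Inserting 40, 10, 80, 20 builds the starting tree of the worked example (root 40 with leaves 10 | 20 and 80); inserting 70, 30, and 60 afterwards reproduces the splits described above and ends with root 40 over the nodes 20 and 70.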
Correctness of Insertion • Note that all keys in part B, including y and the keys in q, lie between u and v. • Because we followed the middle link of parent(p) during the search in the pictured example, the key to be inserted falls between u and v. • Besides the key to insert, all keys in B were originally there and fall between u and v. (Figure: parent(p) = u | v has three subtrees A, B, C; p = ? | ?, with children ch1(p), ch2(p), ch3(p) and the pending subtree q, is the root of the middle part B, and y is to be inserted into p.)
Correctness of Insertion • So the global relationship is preserved. As to the local relationship among the keys, the insertion actions clearly maintain this property. (Figure: parent(p) = u | v; p = w | y with children ch1(p), ch2(p), and q.)
Correctness of Insertion • The subtrees rooted at p and q are always 2-3 trees after each iteration. • You should use induction together with these observations to give a more rigorous proof.
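While a rigorous argument needs induction, the invariants themselves can be checked mechanically. The following is a sketch of such a checker, assuming the illustrative node layout `Node23` with integer keys; `check23` is a hypothetical helper, not part of the textbook.

```c
#include <limits.h>
#include <stddef.h>

typedef struct Node23 {
    int nkeys;                 /* 1 for a 2-node, 2 for a 3-node */
    int key[2];
    struct Node23 *child[3];   /* all NULL in a leaf */
} Node23;

/* Returns the height of t if t is a valid 2-3 tree whose keys all lie
   strictly between lo and hi, and -1 otherwise. */
int check23(const Node23 *t, long lo, long hi) {
    if (t == NULL) return 0;                       /* external node */
    if (t->nkeys != 1 && t->nkeys != 2) return -1;
    long b[4];                                     /* lo, keys..., hi */
    b[0] = lo;
    for (int i = 0; i < t->nkeys; i++) b[i + 1] = t->key[i];
    b[t->nkeys + 1] = hi;
    for (int i = 0; i <= t->nkeys; i++)
        if (b[i] >= b[i + 1]) return -1;           /* out of order or out of range */
    int h = check23(t->child[0], b[0], b[1]);
    for (int i = 1; i <= t->nkeys; i++)            /* all subtrees: same height */
        if (h < 0 || check23(t->child[i], b[i], b[i + 1]) != h) return -1;
    return h + 1;
}
```

Calling `check23(root, LONG_MIN, LONG_MAX)` after every insertion in a test is a cheap way to catch a misplaced link to q.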
Time Complexity of Insertion • At each level, the algorithm takes O(1) time. • There are O(log n) levels. • So insertion takes O(log n) time.
Deletion From A 2-3 Tree • Deletion of any element can be transformed into deletion of a leaf element. • To delete 50, we replace 50 by 60 or 20, then delete the corresponding leaf element (60 or 20). • 60 is the leftmost leaf element in the right subtree of 50. • 20 is the rightmost leaf element in the left subtree of 50. • Use the algorithm presented later to delete 20 in the leaf. (Figure: the result when 20 is used: root 20 | 80 with leaves 10, 60 | 70, and 90 | 95.)
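Finding the replacement element can be sketched as below, assuming the illustrative node layout `Node23`, where the subtrees to the left and right of `key[i]` are `child[i]` and `child[i + 1]`. The names `successor_leaf` and `predecessor_leaf` are hypothetical.

```c
#include <stddef.h>

typedef struct Node23 {
    int nkeys;                 /* 1 for a 2-node, 2 for a 3-node */
    int key[2];
    struct Node23 *child[3];   /* all NULL in a leaf */
} Node23;

/* Leaf holding the leftmost element of the subtree to the right of t->key[i];
   that element is key[0] of the returned leaf. t must be an internal node. */
Node23 *successor_leaf(Node23 *t, int i) {
    Node23 *s = t->child[i + 1];
    while (s->child[0] != NULL) s = s->child[0];
    return s;
}

/* Leaf holding the rightmost element of the subtree to the left of t->key[i];
   that element is key[nkeys - 1] of the returned leaf. */
Node23 *predecessor_leaf(Node23 *t, int i) {
    Node23 *s = t->child[i];
    while (s->child[0] != NULL) s = s->child[s->nkeys];   /* rightmost child */
    return s;
}
```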
Deletion From A 2-3 Tree • Delete 70 (in C). This case is straightforward, as the resulting C is non-empty.
Deletion From A 2-3 Tree • Delete 90 (in D). This is also simple; a shift of 95 in D suffices.
Deletion From A 2-3 Tree • Delete 60 (in C). C becomes empty. • The left sibling of C, node B, is a 3-node. Hence (rotation): • Move 50 from A to C. • Move 20 from B to A.
Deletion From A 2-3 Tree • Delete 95 (from D): D becomes empty. Its left sibling C is a 2-node, hence (combine): • Move 80 from A to C. • Delete D.
Deletion From A 2-3 Tree • Delete 50 (in C). Simply shift.
Deletion From A 2-3 Tree • Delete 10 (in B): B becomes empty. The right sibling of B is a 2-node, hence (combine): • Move 20 from A to B. Move 80 from C to B. • The parent A, which is also the root, is empty. Hence simply let B be the new root.
Rotation and Combine • When a deletion in node p leaves p empty, then: • Let r be the parent of p. • If p is the left child of r, then let q be the right sibling of p. • Otherwise, let q be the left sibling of p. • If q is a 3-node, then rotation. • If q is a 2-node, then combine.
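The choice of sibling and of repair step can be sketched as follows. The node layout `Node23` (with `nkeys == 0` marking the empty node), the enum `Fix`, and the functions `sibling_index` and `choose_fix` are assumptions for illustration; `pi` denotes the index of p among the children of r.

```c
#include <stddef.h>

typedef struct Node23 {
    int nkeys;                 /* 0 = empty, 1 = 2-node, 2 = 3-node */
    int key[2];
    struct Node23 *child[3];
} Node23;

typedef enum { ROTATE, COMBINE } Fix;

/* Index of the sibling q of the child at index pi: the right sibling when p
   is the left child, the left sibling otherwise. */
int sibling_index(int pi) {
    return pi == 0 ? 1 : pi - 1;
}

/* A 3-node sibling can spare a key (rotation); a 2-node sibling cannot,
   so p and q are merged (combine). */
Fix choose_fix(const Node23 *r, int pi) {
    const Node23 *q = r->child[sibling_index(pi)];
    return q->nkeys == 2 ? ROTATE : COMBINE;
}
```

This agrees with the worked examples: deleting 60 leaves C empty next to the 3-node B (rotation), and deleting 95 leaves D empty next to the 2-node C (combine).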
Rotation • If p is the left child of r: • (“?” means don’t care) • Observe the correctness.
Rotation • If p is the middle child of r.
Rotation • If p is the right child of r.
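The three rotation cases can be sketched in one function. This is an illustration under assumptions, not the textbook's code: the layout `Node23`, the convention that an empty node has `nkeys == 0` and keeps its single subtree in `child[0]`, and the name `rotate23` are all hypothetical.

```c
#include <stddef.h>

typedef struct Node23 {
    int nkeys;                 /* 0 = empty, 1 = 2-node, 2 = 3-node */
    int key[2];
    struct Node23 *child[3];
} Node23;

/* Repairs the empty child p = r->child[pi] by rotation. The sibling q
   (right sibling if pi == 0, left sibling otherwise) must be a 3-node. */
void rotate23(Node23 *r, int pi) {
    Node23 *p = r->child[pi];
    if (pi == 0) {                       /* borrow from the right sibling */
        Node23 *q = r->child[1];
        p->key[0] = r->key[0];           /* separator moves down into p */
        p->child[1] = q->child[0];       /* q's leftmost subtree follows it */
        r->key[0] = q->key[0];           /* q's smallest key moves up */
        q->key[0] = q->key[1];
        q->child[0] = q->child[1];
        q->child[1] = q->child[2];
    } else {                             /* borrow from the left sibling */
        Node23 *q = r->child[pi - 1];
        p->key[0] = r->key[pi - 1];      /* separator moves down into p */
        p->child[1] = p->child[0];
        p->child[0] = q->child[2];       /* q's rightmost subtree follows it */
        r->key[pi - 1] = q->key[1];      /* q's largest key moves up */
    }
    r->child[pi == 0 ? 1 : pi - 1]->child[2] = NULL;
    r->child[pi == 0 ? 1 : pi - 1]->nkeys = 1;
    p->nkeys = 1;
}
```

With A = 50 | 80, B = 10 | 20, and C empty, `rotate23(A, 1)` moves 50 into C and 20 into A, as in the deletion of 60 above.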
Combine • If p is the left child of r: • Case 1: If r is a 2-node. • If r is the root, then let p (the combined node) become the new root. • Otherwise r becomes empty, so we set p to be r and continue by applying rotation or combine to the new p.
Combine • If p is the left child of r. • Case 2: If r is a 3-node.
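Combine • If p is the middle child of r: • Case 1: If r is a 2-node. • Before: r = w, its left child q = y has subtrees a and b, and the empty node p has the single subtree c. • After: r is empty and its only child is p = y | w with subtrees a, b, c. • Continue to handle the empty r as before.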
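Combine • If p is the middle child of r: • Case 2: If r is a 3-node. • Before: r = w | x, its left child q = y has subtrees a and b, the empty middle child p has the single subtree c, and d is the right subtree of r. • After: r = x, with left child p = y | w (subtrees a, b, c) and right subtree d.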
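Combine • If p is the right child of r: • Before: r = w | x with left subtree a, middle child q = y (subtrees b and c), and empty right child p (single subtree d). • After: r = w, with left subtree a and right child p = y | x (subtrees b, c, d).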
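The combine cases can be sketched in one function. As before, this is an assumption-laden illustration: the layout `Node23`, the empty-node convention (`nkeys == 0`, single subtree in `child[0]`), and the name `combine23` are hypothetical. The sketch reuses the left node of the pair for the merged result, which matches the worked examples; reusing the other node would work equally well.

```c
#include <stddef.h>

typedef struct Node23 {
    int nkeys;                 /* 0 = empty, 1 = 2-node, 2 = 3-node */
    int key[2];
    struct Node23 *child[3];
} Node23;

/* Merges the empty child at index pi of r with its 2-node sibling and the
   key of r that separates them. The merged 3-node ends up at r->child[s],
   where s = (pi == 0 ? 0 : pi - 1). Returns the unlinked node so the caller
   can release it. r may become empty (nkeys == 0, one child in child[0]). */
Node23 *combine23(Node23 *r, int pi) {
    int s = (pi == 0) ? 0 : pi - 1;      /* index of the separating key */
    Node23 *left = r->child[s];
    Node23 *right = r->child[s + 1];
    if (pi == 0) {                       /* left is the empty p, right is q */
        left->key[0] = r->key[s];
        left->key[1] = right->key[0];
        left->child[1] = right->child[0];
        left->child[2] = right->child[1];
    } else {                             /* left is q, right is the empty p */
        left->key[1] = r->key[s];
        left->child[2] = right->child[0];
    }
    left->nkeys = 2;
    if (s == 0) {                        /* close the gap in r */
        r->key[0] = r->key[1];
        r->child[1] = r->child[2];
    }
    r->child[2] = NULL;
    r->nkeys--;
    return right;
}
```

If `r->nkeys` is 0 afterwards, the caller either makes `r->child[0]` the new root (when r is the root) or repeats rotation/combine one level up with r as the new p.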
Correctness of Deletion • Observe that if a combine results in a new empty node r, then r has no keys and exactly one child (r with one tail). • r will become p in the next iteration. • In the left-hand-side pictures we’ve seen, p has exactly this appearance, so the same rotation/combine cases apply.
Correctness of Deletion • We begin with p being a leaf. At that time, the children of p are all null, so rotation/combine as illustrated in the previous figures is also applicable. • Correctness of other parts should be clear.
Time Complexity of Deletion • At each level: O(1) time. • Rotation/combine need O(1) time. • #levels: O(log n). • Total: O(log n) time.