Introduction to Computer Science 2
Lecture 14: Balanced Binary Search Trees
“Trees with reorganization and limited path length difference”
Balanced Binary Trees
• Motivation: natural search trees can degenerate
• In the worst case a search costs O(n)
• Example: for n = 10^6 elements
  • Zavg,min = log2 n − 1 ≈ 19 (balanced tree)
  • Zavg,max = ½ (n + 1) ≈ 500,000 (degenerate tree)
• Result: degenerate trees should be avoided by restructuring the tree
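To make the degenerate case concrete, the following minimal Python sketch builds a plain (unbalanced) binary search tree twice: once from keys in sorted order and once from keys arriving "middle first". The names `Node`, `insert`, `height` and `build` are illustrative and not taken from the lecture; height is assumed to count the nodes on the longest path.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain BST insertion without any rebalancing (iterative)."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right


def height(root):
    """Number of nodes on the longest root-to-leaf path (empty tree: 0)."""
    best, stack = 0, [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if node is not None:
            best = max(best, depth)
            stack.append((node.left, depth + 1))
            stack.append((node.right, depth + 1))
    return best


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


sorted_tree = build(range(1, 201))         # keys arrive in sorted order
mixed_tree = build([4, 2, 6, 1, 3, 5, 7])  # keys arrive "middle first"
print(height(sorted_tree))  # 200: the tree has degenerated into a list
print(height(mixed_tree))   # 3: perfectly balanced
```

With sorted input every key becomes a right child, so a search has to walk the whole chain.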
Balanced Trees
• Consider a tree into which we want to insert Vienna or Athens
• Since the tree will then contain 7 nodes, we should be able to keep it balanced
[Figure: search tree, level order: Paris; Bonn, Rome; Bern, Oslo, Prague]
Balanced Trees
• Inserting Vienna is no problem: the tree is still balanced
[Figure: search tree, level order: Paris; Bonn, Rome; Bern, Oslo, Prague, Vienna]
Balanced Trees
• Inserting Athens forces a reorganization of the tree to keep it balanced
• No node is at the same place as before
[Figure: search tree, level order: Oslo; Bern, Prague; Athens, Bonn, Paris, Rome]
Costs for Reorganization
• Note:
  • The tree was “globally” reorganized to achieve balance
  • A global restructuring costs O(n)
• Questions:
  • Can the advantages of balanced trees be used without paying the cost of a global reorganization?
  • When is the best time to reorganize?
• Tradeoff between balance and cost:
  • “Almost balanced” binary trees are easy to reorganize
  • The definition of “almost balanced” should depend on locally testable criteria and algorithms
• Goal: search and reorganization in O(log2 n)
Measure of Balance
• Two possible definitions of “almost balanced”:
  • Constraint on the height difference of the respective subtrees (today)
  • Constraint on the weight difference of the subtrees, where weight = number of leaves (next lecture)
• Now we consider the height difference |x − y|
[Figure: a node whose two subtrees have heights x and y]
Height Balanced Trees
• Constraint on the height difference of the subtrees
• Let TL(x) and TR(x) be the left and right subtrees of a node x, and let h(T) be the height of tree T
• Definition: a k-balanced binary tree is either empty or a tree in which the following holds for each node x: | h(TL(x)) − h(TR(x)) | ≤ k
• The height difference of the two subtrees must not exceed k
• The parameter k trades search cost against reorganization cost
• k = 0 is optimal for searching, but a 0-balanced tree exists only for some n; for k ≥ 1 a k-balanced tree exists for every n
• Larger k means higher search costs, but lower reorganization costs
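The definition can be checked with one recursive pass that returns the height of each subtree together with a flag. This is a minimal sketch; the names `Node`, `check` and `is_k_balanced` are illustrative, and it assumes that the empty tree has height 0 and a single node has height 1.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def check(node, k):
    """Return (height, ok) for the subtree rooted at node.

    Assumption: the empty tree has height 0, a single node has height 1.
    """
    if node is None:
        return 0, True
    hl, ok_l = check(node.left, k)
    hr, ok_r = check(node.right, k)
    ok = ok_l and ok_r and abs(hl - hr) <= k
    return 1 + max(hl, hr), ok


def is_k_balanced(root, k):
    return check(root, k)[1]


# Root 1 with left child 0 and a right chain 2 -> 3.
tree = Node(1, Node(0), Node(2, None, Node(3)))
print(is_k_balanced(tree, 0))  # False: node 2 has subtree heights 0 and 1
print(is_k_balanced(tree, 1))  # True
```

Each node is visited once, so the check is linear in the number of nodes; only the two subtree heights are needed at each node, which is the locality the criterion relies on.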
Example
• | h(TL(Sofia)) − h(TR(Sofia)) | = 1
• | h(TL(Rome)) − h(TR(Rome)) | = 1
• | h(TL(Paris)) − h(TR(Paris)) | = 3
• | h(TL(London)) − h(TR(London)) | = 2
• …
• This tree is 3-balanced
[Figure: search tree with the nodes London, Bonn, Paris, Rome, Bern, Dublin, Sofia, Prague, Vienna]
Locality of the Decision and Logarithmic Costs
• The criterion can be checked locally
  • At each node only its own (local) subtrees are relevant for the decision
• The criterion can be “corrected” locally
  • When the criterion is violated at node x, the structure of the subtree rooted at x can be changed so that it holds again
• Idea: correct along a path starting from the node: cost = O(log2 n)
[Figure: tree marking the node that violates the criterion and the “area” affected by the change]
AVL Trees
• Introduced by Adelson-Velskii and Landis (1962)
• Definition: an AVL tree is a 1-balanced binary search tree, i.e. the AVL criterion | h(TL(x)) − h(TR(x)) | ≤ 1 holds for all nodes x
• Consequences:
  • The left and right subtrees of every node are also AVL trees
  • AVL trees are binary search trees: the search algorithms can be applied directly
  • The maintenance operations (insert, delete) must be changed so that the AVL criterion still holds afterwards
  • The AVL criterion guarantees “globally” balanced trees (more on that later)
AVL Trees
• Recursive definition:
  • If TL and TR are AVL trees of height m or m+1, then the tree formed by a new root with TL and TR as its subtrees is also an AVL tree, of height m+1 or m+2
• Now consider the operations that require reorganization
[Figure: three cases for the heights of (TL, TR): (m, m) gives height m+1; (m+1, m) and (m, m+1) give height m+2]
Insert for AVL Trees
• Definition: the balancing factor BF(x) of a node x is BF(x) = h(TL(x)) − h(TR(x))
• Store the balancing factor for each node in the AVL tree
• Example: insert the key “Bonn” into an empty AVL tree
  • The tree fulfills the AVL criterion
• Now what happens when we add more keys?
  • Example: insert “Bern” and “Athens”
[Figure: single node Bonn with BF 0]
Insert for AVL Trees
• Inserting “Bern” requires recalculating the balancing factor of “Bonn”
• BF(x) = h(TL(x)) − h(TR(x))
[Figure: Bonn (BF +1) with left child Bern (BF 0)]
Insert for AVL Trees
• Now we insert “Athens”
• The criterion no longer holds: reorganization!
• BF(x) = h(TL(x)) − h(TR(x))
[Figure: Bonn (BF +2) with left child Bern (BF +1), which has left child Athens (BF 0); the path from Bonn is L, L]
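Insert for AVL Trees
• Right rotation
• BF(x) = h(TL(x)) − h(TR(x))
[Figure: before the rotation Bonn (BF +2), left child Bern (BF +1), left grandchild Athens (BF 0); after the rotation Bern (BF 0) is the root with children Athens (BF 0) and Bonn (BF 0)]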
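The right rotation for the Bonn, Bern, Athens example can be sketched as follows. The helper names are illustrative; the balancing factor is recomputed from scratch here for clarity, whereas the lecture stores it in each node. Height is assumed to be 0 for the empty tree and 1 for a single node.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    # Assumption: empty tree has height 0, a single node has height 1.
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def bf(node):
    """Balancing factor BF(x) = h(TL(x)) - h(TR(x))."""
    return height(node.left) - height(node.right)


def rotate_right(x):
    """Right rotation: the left child of x becomes the new subtree root."""
    y = x.left
    x.left = y.right   # y's right subtree moves under x
    y.right = x
    return y


# Bonn, Bern, Athens inserted in this order into a plain search tree.
root = Node("Bonn", Node("Bern", Node("Athens")))
print(bf(root), bf(root.left))                  # 2 1 (LL case)

root = rotate_right(root)
print(root.key, root.left.key, root.right.key)  # Bern Athens Bonn
print(bf(root), bf(root.left), bf(root.right))  # 0 0 0
```

Recomputing heights recursively costs time linear in the subtree size per call; keeping the balancing factor (or the height) in each node avoids that.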
Insert for AVL Trees
• Now insert “Sofia”
• How about “Vienna”?
[Figure: Bern (BF −1) with children Athens (BF 0) and Bonn (BF −1); Bonn has right child Sofia (BF 0)]
Insert for AVL Trees
• Reorganization needed
• Left rotation
[Figure: after inserting Vienna: Bern (BF −2), Athens (BF 0), Bonn (BF −2), Sofia (BF −1), Vienna (BF 0); the path from Bonn is R, R]
Insert for AVL Trees
• Now we insert Paris
[Figure: Bern (BF −1) with children Athens (BF 0) and Sofia (BF 0); Sofia has children Bonn (BF 0) and Vienna (BF 0)]
Insert for AVL Trees
[Figure: after inserting Paris: Bern (BF −2), Athens (BF 0), Sofia (BF +1), Bonn (BF −1), Vienna (BF 0), Paris (BF 0); the path from Bern is R, L]
Insert for AVL Trees
• Oops! One rotation was not enough!
[Figure: after the first rotation: Bern (BF −2), Athens (BF 0), Bonn (BF −2), Sofia (BF 0), Paris (BF 0), Vienna (BF 0)]
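Insert for AVL Trees
• And now Oslo and Rome …
[Figure: after the second rotation: Bonn (BF 0) is the root with children Bern (BF +1) and Sofia (BF 0); Bern has left child Athens (BF 0); Sofia has children Paris (BF 0) and Vienna (BF 0)]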
Insert for AVL Trees
• And now Prague…
[Figure: after inserting Oslo and Rome: Bonn (BF −1), Bern (BF +1), Sofia (BF +1); Athens, Paris, Vienna, Oslo and Rome all have BF 0]
Insert for AVL Trees
[Figure: after inserting Prague: Bonn (BF −2), Bern (BF +1), Sofia (BF +2), Paris (BF −1), Rome (BF +1); Athens, Vienna, Oslo and Prague have BF 0; the path from Sofia is L, R]
Insert for AVL Trees
[Figure: after the first rotation: Bonn (BF −2), Bern (BF +1), Sofia (BF +2), Rome (BF +2); all other nodes have BF 0]
Insert for AVL Trees
[Figure: after the second rotation: Bonn (BF −1), Bern (BF +1), Rome (BF 0), Sofia (BF −1); all other nodes have BF 0]
Systematic Rotations
Note:
• Changes are only made on the path from the root to the inserted node (here: Vienna)
• The pivot for the rotation is always the closest ancestor of the inserted node with BF = ±2
• Before the insertion, this closest ancestor had BF = ±1
[Figure: Bern (BF −2), Athens (BF 0), Bonn (BF −2), Sofia (BF −1), Vienna (BF 0)]
Systematic Rotations
• Starting at the closest ancestor with BF = ±2, consider the path to the insertion point:
  • RR: Right-Right → left rotation
  • LL: Left-Left → right rotation
  • RL: Right-Left → double rotation “right”
  • LR: Left-Right → double rotation “left”
[Figure: the four path shapes RR, LL, RL, LR]
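The four cases can be told apart from two balancing factors alone: that of the pivot and that of its child on the heavy side. The function below is a small sketch under the lecture's convention BF(x) = h(TL(x)) − h(TR(x)); the name `rotation_case` and the `REPAIR` table are illustrative.

```python
def rotation_case(bf_pivot, bf_child):
    """Name the rotation case at a pivot whose balancing factor became +/-2.

    bf_child is the balancing factor of the pivot's child on the heavy side
    (left child if bf_pivot == +2, right child if bf_pivot == -2).
    Convention: BF(x) = h(TL(x)) - h(TR(x)).
    """
    if bf_pivot == 2:
        return "LL" if bf_child >= 0 else "LR"
    if bf_pivot == -2:
        return "RR" if bf_child <= 0 else "RL"
    return None  # AVL criterion holds at this node, nothing to do


REPAIR = {
    "LL": "right rotation",
    "RR": "left rotation",
    "LR": "double rotation (left at the child, then right at the pivot)",
    "RL": "double rotation (right at the child, then left at the pivot)",
}

print(rotation_case(2, 1), rotation_case(-2, -1))  # LL RR
print(rotation_case(-2, 1), rotation_case(2, -1))  # RL LR
```

After an insertion the heavy child never has BF 0 at the pivot; the `>= 0` and `<= 0` comparisons are written that way so the same function also covers the situation after a deletion, where a child with BF 0 can occur.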
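RR: left rotation
[Figure: before the insertion the pivot has BF −1 and its right child BF 0; after an insertion into the right subtree of the right child (path R, R) the pivot has BF −2 and the child BF −1; after the left rotation both have BF 0; nodes labelled a and b]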
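LL: right rotation
[Figure: before the insertion the pivot has BF +1 and its left child BF 0; after an insertion into the left subtree of the left child (path L, L) the pivot has BF +2 and the child BF +1; after the right rotation both have BF 0; nodes labelled a and b]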
LR: double rotation
[Figure: LR case: the pivot has BF +2 and its left child BF −1 (path L, R); subtrees a, b, c, d; marker “Added node here”]
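LR: double rotation
• Step 1: RR (left rotation) at the left child of the pivot
• Step 2: LL (right rotation) at the pivot
[Figure: LR case before and after the double rotation; before, the pivot has BF +2 and its left child BF −1; after, the new subtree root has BF 0; subtrees a, b, c, d; marker “Added node here”]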
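LR: double rotation
• Step 1: RR (left rotation) at the left child of the pivot
• Step 2: LL (right rotation) at the pivot
• It doesn’t make any difference (in terms of rotations) whether the node is added on the left or on the right (compare with the previous slide)
[Figure: LR case before and after the double rotation, with the node added in the other subtree; subtrees a, b, c, d; marker “Added node here”]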
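The two steps of the LR double rotation can be sketched as a left rotation at the left child followed by a right rotation at the pivot. The function names are illustrative. The sample tree is the subtree below Sofia after Prague has been inserted in the running example, as far as it can be read from the slides.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_left(x):
    """Left rotation: the right child of x becomes the new subtree root."""
    y = x.right
    x.right = y.left
    y.left = x
    return y


def rotate_right(x):
    """Right rotation: the left child of x becomes the new subtree root."""
    y = x.left
    x.left = y.right
    y.right = x
    return y


def rotate_left_right(x):
    """LR double rotation at pivot x."""
    x.left = rotate_left(x.left)   # step 1: left rotation at the left child
    return rotate_right(x)         # step 2: right rotation at the pivot


# Sofia is the pivot (BF +2); the new key Prague sits below Paris -> Rome.
sofia = Node("Sofia",
             Node("Paris", Node("Oslo"), Node("Rome", Node("Prague"))),
             Node("Vienna"))
top = rotate_left_right(sofia)
print(top.key)                           # Rome
print(top.left.key, top.right.key)       # Paris Sofia
print(top.left.left.key, top.left.right.key, top.right.right.key)  # Oslo Prague Vienna
```

The symmetric RL case swaps the roles of the two rotations: a right rotation at the right child, then a left rotation at the pivot.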
Example
Try this out at home:
• Add the following numbers to an AVL tree: 50, 30, 40, 35, 37, 60, 55
• You should get this AVL tree:
[Figure: 40 is the root with children 35 and 55; 35 has children 30 and 37; 55 has children 50 and 60; all balancing factors are 0]
• Does the order of insertion matter?
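The exercise can be checked with a compact AVL insertion. This is a sketch, not the lecture's implementation: it stores the height in each node instead of the balancing factor (an assumption made to keep the updates short), and all names are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # assumption: single node has height 1, empty tree 0


def h(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(h(node.left), h(node.right))


def bf(node):
    return h(node.left) - h(node.right)


def rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    update(x)
    update(y)
    return y


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def rebalance(node):
    update(node)
    if bf(node) == 2:
        if bf(node.left) < 0:             # LR: rotate the left child first
            node.left = rotate_left(node.left)
        return rotate_right(node)         # LL
    if bf(node) == -2:
        if bf(node.right) > 0:            # RL: rotate the right child first
            node.right = rotate_right(node.right)
        return rotate_left(node)          # RR
    return node


def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return rebalance(node)                # checked on the way back to the root


def level_order(root):
    """Keys level by level, for a non-empty tree."""
    out, level = [], [root]
    while level:
        out.append([n.key for n in level])
        level = [c for n in level for c in (n.left, n.right) if c]
    return out


root = None
for key in [50, 30, 40, 35, 37, 60, 55]:
    root = insert(root, key)
print(level_order(root))  # [[40], [35, 55], [30, 37, 50, 60]]
```

On the question of insertion order: inserting 1, 2 gives root 1, while inserting 2, 1 gives root 2, so the order can change the shape of the tree, although different orders may also lead to the same tree.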
Correctness and Completeness of the Rotations
• The missing RL rotation is symmetric to LR
• Note:
  • The set of rotation types is complete (i.e., there are no other cases starting at the closest ancestor with BF = ±2)
  • Before the insertion the pivot has BF = ±1, after the rotation BF = 0 → all subtrees are balanced
  • The rebalanced subtree with x as root has the same height as before the insertion
  • After a rotation the tree is again a correct AVL tree
  • The AVL criterion holds after repeated rotations, i.e., the tree remains a balanced binary search tree!
Deletion in AVL Trees
• Insertion needs at most one rotation (a double rotation LR/RL counts as one)
• Deletion is more complex, as more than one rotation may be needed on the path to the root
• Strategy:
  • Deletion is done as for binary search trees, i.e., by “deleting leaves”
  • To delete a node with two children, either the “biggest left” or the “smallest right” descendant is used as the replacement
  • The balancing factor in the new subtree must be recalculated
  • The balancing factors on the path to the root must be (recursively) checked and updated!
• How many rotations are needed at most?
Deletion in Binary Search Trees
• We distinguish three cases:
• Case 1: node x is a leaf. The leaf can be deleted; no additional operations are needed.
[Figure: y with children x and z becomes y with the single child z]
• Case 2: node x has one child. Delete node x and redirect the reference to x so that it points to the only subtree of x.
[Figure: x with the single child z (subtrees Tl, Tr) is replaced by z (subtrees Tl, Tr)]
Deletion in Binary Search Trees
• Case 3: node x has two children. Search either for the smallest right (sr) descendant or for the greatest left (gl) descendant. Replace x with sr or gl, and delete sr or gl, respectively, from its original position.
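The three cases translate into a short recursive function. This sketch uses the “smallest right” variant for case 3; the names and the small tree of city names are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def delete(node, key):
    """Delete key from the search tree rooted at node; return the new root."""
    if node is None:
        return None                       # key not found
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    else:
        # Cases 1 and 2: at most one child, which (or None) replaces x.
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        # Case 3: two children. Copy the smallest key of the right subtree
        # into x, then delete that key from the right subtree.
        sr = node.right
        while sr.left is not None:
            sr = sr.left
        node.key = sr.key
        node.right = delete(node.right, sr.key)
    return node


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


#          Paris
#        /       \
#     Bonn       Rome
#    /    \     /    \
#  Bern  Oslo Prague Vienna
root = Node("Paris",
            Node("Bonn", Node("Bern"), Node("Oslo")),
            Node("Rome", Node("Prague"), Node("Vienna")))
root = delete(root, "Bern")    # case 1: leaf
root = delete(root, "Bonn")    # case 2: one child (Oslo) is left
root = delete(root, "Paris")   # case 3: two children, replaced by Prague
print(root.key, inorder(root))  # Prague ['Oslo', 'Prague', 'Rome', 'Vienna']
```

Case 3 always reduces to case 1 or 2, because the smallest key of the right subtree has no left child.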
Example
• Deleting Vienna causes rebalancing
[Figure: AVL tree with the nodes Prague, Athens, Vienna, Tokyo, Rome, Cairo, Sofia, Paris, Bern, Oslo, Lima and Bonn and their balancing factors]
Example: Deleting Vienna
• LL in Tokyo
• Idea: treat it as the case where Rome is added
• LL rotation for Tokyo
[Figure: the tree after Vienna has been deleted, with balancing factors; Tokyo has BF +2]
Example
• Idea: treat it as the case where Athens is added
• LL rotation for Prague
[Figure: the tree after the rotation at Tokyo, with balancing factors; Prague has BF +2]
Example
• Recursive testing of the balancing factors ends at the root
[Figure: the rebalanced tree after the rotation at Prague, with balancing factors]
General Deletion
• Deletion requires a more detailed case analysis than insertion, with additional rotation cases
• For each case, test whether the height of the subtree has changed
  • If yes: continue rebalancing recursively, starting with the parent node
  • If no: finished
• The cases are again complete (including the symmetric cases; please check it!)
• The rotations are the same as for insertion: a deletion corresponds to a “virtual” insertion in the opposite subtree
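A compact sketch of deletion with rebalancing on the way back to the root is shown below. It stores heights rather than balancing factors and simply re-checks every node on the path instead of stopping early, both simplifications that are not taken from the lecture. The insertion order in `CITIES` is an assumption, chosen so that no rotation happens while the tree is built and the result appears to match the tree of the Vienna example.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None
        self.height = 1  # assumption: empty tree has height 0

def h(n):
    return n.height if n else 0

def bf(n):
    return h(n.left) - h(n.right)

def update(n):
    n.height = 1 + max(h(n.left), h(n.right))

def rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    update(x)
    update(y)
    return y

def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y

def rebalance(n):
    update(n)
    if bf(n) == 2:
        if bf(n.left) < 0:               # LR: rotate the left child first
            n.left = rotate_left(n.left)
        return rotate_right(n)           # LL (also when bf(n.left) == 0)
    if bf(n) == -2:
        if bf(n.right) > 0:              # RL: rotate the right child first
            n.right = rotate_right(n.right)
        return rotate_left(n)            # RR (also when bf(n.right) == 0)
    return n

def insert(n, key):
    if n is None:
        return Node(key)
    if key < n.key:
        n.left = insert(n.left, key)
    elif key > n.key:
        n.right = insert(n.right, key)
    return rebalance(n)

def delete(n, key):
    if n is None:
        return None
    if key < n.key:
        n.left = delete(n.left, key)
    elif key > n.key:
        n.right = delete(n.right, key)
    elif n.left is None or n.right is None:
        return n.left or n.right         # leaf or single child
    else:
        s = n.right                      # smallest key in the right subtree
        while s.left:
            s = s.left
        n.key = s.key
        n.right = delete(n.right, s.key)
    return rebalance(n)                  # re-check every node on the way up

def is_avl(n):
    return n is None or (abs(bf(n)) <= 1 and is_avl(n.left) and is_avl(n.right))

def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)

CITIES = ["Prague", "Lima", "Tokyo", "Bonn", "Paris", "Sofia", "Vienna",
          "Bern", "Cairo", "Oslo", "Rome", "Athens"]
root = None
for city in CITIES:
    root = insert(root, city)
print(root.key)                  # Prague
root = delete(root, "Vienna")    # rotation at Tokyo, then at Prague
print(root.key, is_avl(root))    # Lima True
```

In this run a single deletion triggers two rotations, one at Tokyo and one at Prague, which illustrates why deletion may need more than one rotation on the path to the root.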
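Deletion: Case 1
• A node is deleted from the left subtree
• The AVL criterion holds: no rotation
• The height remains unchanged: no recursion
[Figure: before: BF 0, Tl and Tr of height h, tree height h+1; after: BF −1, Tl of height h−1, Tr of height h, tree height h+1]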
Deletion: Case 2a
• A node is deleted from the left subtree
• The AVL criterion holds: no rotation
• The height of the tree is reduced: recursively check the balancing factors of the parent nodes!
[Figure: before: BF +1, Tl of height h, Tr of height h−1, tree height h+1; after: BF 0, Tl and Tr of height h−1, tree height h]
Deletion: Case 2b
• A node is deleted from the right subtree
• The AVL criterion is violated: rotation needed
• Which rotation to use depends on the structure of the left subtree
• The rotation can lead to a height reduction...
[Figure: before: BF +1, Tl of height h, Tr of height h−1, tree height h+1; after: BF +2, Tl of height h, Tr of height h−2]
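Deletion: Case 3a
• LL or LR?
• The rotation absorbs the height reduction in c: the AVL criterion holds again
• The height of the tree remains unchanged: no recursion for the parent nodes needed
[Figure: before: root BF +2, tree height h+2; left child BF 0 with subtrees a and b of height h; right subtree c of height h−1; after the right rotation: root BF −1, tree height h+2; its right child has BF +1 with subtrees b (height h) and c (height h−1)]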
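Deletion: Case 3b
• A right rotation restores the AVL criterion at the root
• The height of the tree is reduced!
[Figure: before: root BF +2, tree height h+2; left child BF +1 with subtrees a (height h) and b (height h−1); right subtree c of height h−1; after the right rotation: root BF 0, tree height h+1; its right child has BF 0]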
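Deletion: Case 3c
• A double rotation restores the AVL criterion at the root
• The height is reduced: recursion on the path to the root
• No more cases
[Figure: before: root BF +2, tree height h+3; left child BF −1, height h+2; subtrees a, b, c, d of height h; after the double rotation: root BF 0, tree height h+2; both children have BF 0 and height h+1]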
Height of AVL Trees
• AVL trees are defined by the height difference of subtrees
• Original goal: the tree should be as “balanced” as possible
• How balanced is an AVL tree?
• The answer is given by the theorem on the height of an AVL tree:
Theorem: For the height h(T) of an AVL tree with n nodes the following holds: ⌊log2 n⌋ + 1 ≤ h(T) ≤ 1.44 log2(n + 1)
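The upper bound can be explored numerically. The sparsest AVL tree of height h consists of a root plus the sparsest AVL trees of heights h−1 and h−2, which gives a Fibonacci-like recurrence for the minimum number of nodes. The sketch below assumes that height counts the nodes on the longest path (empty tree: 0, single node: 1); the function name `min_nodes` is illustrative.

```python
import math


def min_nodes(height):
    """Fewest nodes an AVL tree of the given height can have.

    Assumption: the empty tree has height 0, a single node has height 1.
    """
    if height == 0:
        return 0
    a, b = 0, 1                 # min_nodes(0), min_nodes(1)
    for _ in range(height - 1):
        a, b = b, 1 + a + b     # root + sparsest subtrees of heights h-1, h-2
    return b


print([min_nodes(hh) for hh in range(6)])  # [0, 1, 2, 4, 7, 12]

# Compare the height of the sparsest tree with the bound 1.44 * log2(n + 1).
for hh in (1, 2, 5, 10, 20):
    n = min_nodes(hh)
    print(hh, n, 1.44 * math.log2(n + 1))
```

For every height printed, even the sparsest tree stays below the bound, so the height of an AVL tree grows only logarithmically with the number of nodes.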