Algorithms and data structures
Balanced trees: Red-Black trees, Context trees, B-trees
Balanced trees
• Time of BST operations is proportional to the height of the tree
• Perfectly balanced tree (by size): for any node, the sizes of the left and right subtrees are equal (with tolerance 1)
• Perfectly balanced tree (by path length): for each node, the lengths of any two paths from the node to a leaf differ by at most 1
• Approximately balanced tree: for each node, the lengths of any two paths from the node to a leaf differ by at most a factor of two
Examples of balanced trees • AVL trees • Red-Black Trees • B-Trees
Red-Black tree
• Rule 1: any node is black or red
• Rule 2: NULL is black
• Rule 3: a red node cannot have a red child
• Rule 4: for each node, every path from this node to a leaf contains exactly the same number of black nodes (black height)
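The four rules above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the `Node` class, the color constants and the function names are assumptions made for illustration. It computes the black height recursively and reports a violation of rule 3 or rule 4 as soon as one is found.

```python
RED, BLACK = "red", "black"

class Node:
    # hypothetical minimal node, only what the check needs
    def __init__(self, data, color, left=None, right=None):
        self.data = data
        self.color = color
        self.left = left
        self.right = right

def black_height(node):
    """Return the black height of the subtree rooted at node.

    Raises ValueError if one of the rules is broken in the subtree."""
    if node is None:                      # rule 2: NULL counts as black
        return 1
    if node.color not in (RED, BLACK):    # rule 1
        raise ValueError("rule 1 broken at %r" % node.data)
    if node.color == RED:                 # rule 3: no red child of a red node
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("rule 3 broken at %r" % node.data)
    lh = black_height(node.left)
    rh = black_height(node.right)
    if lh != rh:                          # rule 4: equal black counts
        raise ValueError("rule 4 broken at %r" % node.data)
    return lh + (1 if node.color == BLACK else 0)

def is_red_black(root):
    try:
        black_height(root)
        return True
    except ValueError:
        return False

good = Node(10, BLACK, Node(5, RED), Node(15, RED))
bad_red = Node(10, BLACK, Node(5, RED, Node(1, RED)))   # red child of a red node
bad_black = Node(10, BLACK, Node(5, BLACK))             # unequal black counts
```

Here `is_red_black(good)` holds, while the other two trees fail on rule 3 and rule 4 respectively.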
Red-Black Tree (RBT)
• Height of the tree: h(RBT) ≤ 2 lg(n+1)
[Figure: example red-black tree with root 30 and keys 1, 4, 8, 10, 15, 18, 20, 25, 26, 27, 28, 29, 30, 33, 35, 36, 37, 40, 41, 45; the NULL leaves are drawn explicitly]
Balance vs RBT
• Any path from a node to a leaf (or to NULL) contains the same number of black nodes
• There cannot be a node that has fewer than 2 children and has any black descendant
Properties of RBT
• h(RBT) ≤ 2 log2(n+1)
• Search: O(log n)
• Min: O(log n)
• Max: O(log n)
• Successor: O(log n)
• Predecessor: O(log n)
Node definition
class NODE:
    data = None
    color = BLACK
    left = None
    right = None
    parent = None
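Rotation
[Figure: RightRotate(T, y) turns a subtree with root y, left child x and subtrees A, B (under x) and C (under y) into a subtree with root x, right child y and subtrees A (under x) and B, C (under y); LeftRotate(T, x) is the inverse operation]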
LeftRotate
def LeftRotate(root, x):
    y = x.right
    if y is None:
        return root   # nothing to rotate, the root is unchanged
    x.right = y.left
    if y.left is not None:
        y.left.parent = x
    y.parent = x.parent
    if x.parent is None:
        root = y
    elif x == x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x
    x.parent = y
    return root
How does the code change for RightRotate?
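One possible answer to the question above: RightRotate is the mirror image of LeftRotate, with every `left` swapped for `right` and the roles of x and y exchanged. The sketch below is self-contained and uses a hypothetical `Node` class and snake_case names that are not in the slides; the final lines rotate a two-node tree right and then left, which restores the starting shape.

```python
class Node:
    # hypothetical minimal node with a parent pointer
    def __init__(self, data):
        self.data = data
        self.left = self.right = self.parent = None

def left_rotate(root, x):
    y = x.right
    if y is None:
        return root
    x.right = y.left
    if y.left is not None:
        y.left.parent = x
    y.parent = x.parent
    if x.parent is None:
        root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x
    x.parent = y
    return root

def right_rotate(root, y):
    # mirror image of left_rotate: left <-> right, x <-> y
    x = y.left
    if x is None:
        return root
    y.left = x.right
    if x.right is not None:
        x.right.parent = y
    x.parent = y.parent
    if y.parent is None:
        root = x
    elif y is y.parent.left:
        y.parent.left = x
    else:
        y.parent.right = x
    x.right = y
    y.parent = x
    return root

# y(2) with left child x(1)
y = Node(2)
x = Node(1)
y.left = x
x.parent = y
root = right_rotate(y, y)     # x becomes the root, y its right child
back = left_rotate(root, x)   # undoes the rotation
```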
Insertion of the node
[Figure captions]
• Cannot break the black-length-of-path rule (4)
• Can break the black-length-of-path rule (4)
• Can break the no-red-children rule (3)
Insertion of the node
• Insert node x as a leaf
• Set x.color = red
• Fix up the tree (rule #3 can be broken)
• In the fix-up cases, the side of x means which child of its parent x is (i.e. left or right)
RBT:INS Fix-up process - 1
[Figure: tree with keys 15, 4, 20, 1, 8, 25, 6, 10, 5 before and after case #1; x = 5 is the inserted node, uncle(x) = 10, and the new x is 8]
Case #1: uncle(x) is red
• recolor grandparent(x), parent(x), uncle(x)
• continue fix-up from the grandparent, i.e. x = grandparent(x)
RBT:INS Fix-up process - 2
[Figure: the tree before and after case #2; x = 8, uncle(x) = 20]
Case #2: x is the same side-child as uncle(x) (i.e. both are left children or both are right children)
• set x = parent(x)
• rotate the tree on the new x, in the direction opposite to the uncle's side
Note: after this operation x is the opposite side-child of its parent compared with uncle(x) as a child of parent(uncle(x))
RBT:INS Fix-up process - 3
[Figure: the tree before and after case #3; uncle(x) = 20, and after the rotation 8 is the root]
Case #3: x is the opposite side-child vs. uncle(x)
• recolor parent(x) and grandparent(x)
• rotate the tree on grandparent(x), towards the uncle's side
Fixing the tree after INS op.
# assumption: the root is black (why not?)
while x != root and parent(x).color == Color.RED:
    if uncle(x).color == Color.RED:
        recolor(grandparent(x), parent(x), uncle(x))   #1
        x = grandparent(x)   # continue from grandparent
    else:
        if x is the same side-child as uncle(x):
            x = parent(x)   #2
            rotate(root, x)   # on the new x, away from the uncle's side
        recolor(parent(x), grandparent(x))   #3
        rotate(root, grandparent(x))   # towards uncle(x)
root.color = Color.BLACK   # for one-element tree
The node insertion: implementation
def RBTInsertNode(root, x):
    root = BSTInsertNode(root, x)
    x.color = Color.RED
    # assumption: root.color == BLACK
    while x != root and x.parent.color == Color.RED:
        if x.parent == x.parent.parent.left:
            # father is a left child, so uncle is a right child
            fix-up_for_right-child_uncle
        else:
            fix-up_for_left-child_uncle
    root.color = Color.BLACK
    return root
Fix-up for right-child uncle
uncle = x.parent.parent.right
if GetColor(uncle) == Color.RED:
    x.parent.color = Color.BLACK   # case 1
    uncle.color = Color.BLACK
    x.parent.parent.color = Color.RED
    x = x.parent.parent
else:
    if x == x.parent.right:
        x = x.parent   # case 2
        root = LeftRotate(root, x)
    x.parent.color = Color.BLACK   # case 3
    x.parent.parent.color = Color.RED
    root = RightRotate(root, x.parent.parent)
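The pieces above (BST insertion, the three fix-up cases and the two rotations) can be assembled into one runnable sketch. Everything below is an illustration under assumptions: the class and function names are hypothetical, both rotations are folded into a single `rotate` helper with a direction flag, and the left-child-uncle branch is obtained by mirroring the right-child-uncle branch shown on the slide. The keys are the ones visible in the fix-up figures; the insertion order is an assumption.

```python
RED, BLACK = "red", "black"

class Node:
    def __init__(self, data):
        self.data = data
        self.color = RED                  # a new node is always red
        self.left = self.right = self.parent = None

def color_of(node):
    return BLACK if node is None else node.color

def rotate(root, x, left):
    """Left-rotate on x if left is True, otherwise right-rotate."""
    y = x.right if left else x.left
    inner = y.left if left else y.right   # subtree that changes its parent
    if left:
        x.right = inner
    else:
        x.left = inner
    if inner is not None:
        inner.parent = x
    y.parent = x.parent
    if x.parent is None:
        root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    if left:
        y.left = x
    else:
        y.right = x
    x.parent = y
    return root

def rb_insert(root, data):
    x = Node(data)
    parent, cur = None, root              # ordinary BST insertion as a leaf
    while cur is not None:
        parent, cur = cur, (cur.left if data < cur.data else cur.right)
    x.parent = parent
    if parent is None:
        root = x
    elif data < parent.data:
        parent.left = x
    else:
        parent.right = x
    while x is not root and x.parent.color == RED:
        grand = x.parent.parent
        parent_is_left = x.parent is grand.left
        uncle = grand.right if parent_is_left else grand.left
        if color_of(uncle) == RED:        # case 1: recolor and move up
            x.parent.color = uncle.color = BLACK
            grand.color = RED
            x = grand
        else:
            if x is (x.parent.right if parent_is_left else x.parent.left):
                x = x.parent              # case 2: x is on the uncle's side
                root = rotate(root, x, left=parent_is_left)
            x.parent.color = BLACK        # case 3
            x.parent.parent.color = RED
            root = rotate(root, x.parent.parent, left=not parent_is_left)
    root.color = BLACK
    return root

def check(node):
    """Return (black height, keys in order); assert rules 3 and 4."""
    if node is None:
        return 1, []
    if node.color == RED:
        assert color_of(node.left) == BLACK and color_of(node.right) == BLACK
    lh, lkeys = check(node.left)
    rh, rkeys = check(node.right)
    assert lh == rh
    return lh + (1 if node.color == BLACK else 0), lkeys + [node.data] + rkeys

root = None
for key in [15, 4, 20, 1, 8, 25, 6, 10, 5]:
    root = rb_insert(root, key)
height, keys = check(root)
```

With this insertion order the last key, 5, triggers case 1 followed by cases 2 and 3, and 8 ends up as the root.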
Getting color vs. NULL
def GetColor(node):
    if node != None:
        return node.color
    else:
        return Color.BLACK
Removing of the node
We move the additional black color to the child (or maybe children?)
Problems:
• What if the node has any children?
• Could the node have a black child (children)?
Removing of the node
• Remove the node as usual
• If the removed node is black, give an additional black color to its child
• If the child becomes doubly black, fix-up is required
RBT: DEL fix-up process - 1
[Figure: subtree before and after case #1, with x, brother(x) and subtrees A to F]
Case #1: brother(x) is red
• recolor the nodes brother(x) and parent(x)
• rotate the tree on parent(x) (i.e. towards x) and update brother
Note: after this procedure brother(x) is black
RBT: DEL fix-up process - 2
[Figure: subtree before and after case #2, with x, brother(x), the new x and subtrees A to F]
Case #2: both children of brother(x) are black
• set brother(x).color = RED
• set x = parent(x)
RBT: DEL fix-up process - 3
[Figure: subtree before and after case #3, with x, the old and the new brother(x) and subtrees A to F]
Case #3 (... at least one child of brother(x) is red ...): the farther child of brother(x) is black
• recolor brother(x) and the closer child of brother(x) (i.e. the child on the x side)
• rotate the tree on brother(x) (i.e. away from x) and update brother
Note: after this procedure the farther child of brother(x) is red
RBT: DEL fix-up process - 4
[Figure: subtree before and after case #4, with x, brother(x) and subtrees A to F]
Case #4: the farther child of brother(x) is red
• set brother(x).color = parent(x).color
• set parent(x).color = BLACK
• set the color of the farther child of brother(x) to BLACK
• rotate the tree on parent(x) (i.e. towards x)
• STOP, i.e. x = root
Implementation notes
To avoid checking whether a child is not None before reading its color or its left/right links:
• a function (similar to GetColor) could be defined
• a guard pattern could be implemented, i.e. all the None values could be replaced by a special node
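The guard pattern mentioned above could look roughly as follows. This is only a sketch: the names `NIL` and `new_node` are assumptions, not taken from the slides. A single shared black node stands in for every missing child, so colors and links can be read without any None checks.

```python
RED, BLACK = "red", "black"

class Node:
    def __init__(self, data=None, color=BLACK):
        self.data = data
        self.color = color
        self.left = self.right = self.parent = None

# one shared guard node replaces every None; it is black and points to itself
NIL = Node()
NIL.left = NIL.right = NIL.parent = NIL

def new_node(data):
    node = Node(data, RED)
    node.left = node.right = node.parent = NIL
    return node

leaf = new_node(7)
# no "is not None" test is needed before reading a color
both_children_black = leaf.left.color == BLACK and leaf.right.color == BLACK
# even a "grandchild" of a leaf can be inspected safely
grandchild_color = leaf.left.left.color
```

The price of the pattern is that every loop that used to stop at None has to stop at `NIL` instead.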
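Augmented RBT: ordinal stats.
Update of sizes:
def LeftRotate(root, x):
    .....
    y.size = getsize(x)   # y now roots the whole subtree that x used to root
    x.size = getsize(x.left) + getsize(x.right) + 1
    return root
[Figure: y = 93 (size 19) with left child x = 42 (size 11) and subtrees of sizes 6, 4 and 7; after T=RightRotate(T,y), x = 42 has size 19 and its right child y = 93 has size 12; T=LeftRotate(T,x) reverses the change]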
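To show why the sizes are worth maintaining, here is a small self-contained sketch (hypothetical `Node`, `get_size` and `select` names; colors are left out because they do not affect the size bookkeeping). The rotation updates the two affected sizes as described above, and `select` then uses the sizes to find the k-th smallest key without scanning the tree.

```python
def get_size(node):
    return 0 if node is None else node.size

class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left, self.right, self.parent = left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self
        self.size = 1 + get_size(left) + get_size(right)

def left_rotate(root, x):
    y = x.right
    if y is None:
        return root
    x.right = y.left
    if y.left is not None:
        y.left.parent = x
    y.parent = x.parent
    if x.parent is None:
        root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x
    x.parent = y
    y.size = x.size                                   # y takes over x's old subtree
    x.size = get_size(x.left) + get_size(x.right) + 1
    return root

def select(node, k):
    """Return the k-th smallest key in the subtree (k counted from 1)."""
    rank = get_size(node.left) + 1
    if k == rank:
        return node.data
    if k < rank:
        return select(node.left, k)
    return select(node.right, k - rank)

root = Node(42, Node(19), Node(93, Node(50)))
root = left_rotate(root, root)
ordered = [select(root, k) for k in range(1, 5)]
```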
B-tree
[Figure: B-tree with root (M), internal nodes (D H) and (Q T X), and leaves (B C) (F G) (J K L) (N P) (R S) (V W) (Y Z); for a node n, n.keys[0], n.keys[1] label its keys and n.sons[0], n.sons[1], n.sons[2] its child pointers]
• A node with i-1 keys has i children
• The i-th key is greater than (or equal to) all the keys in the i-th child
• The i-th key is smaller than (or equal to) all the keys in the (i+1)-th child
• Each node (except for the root) contains at least T-1 keys (i.e. T sons)
• Each node contains at most 2T-1 keys (i.e. 2T sons)
Minimal B-tree (h=3)
[Figure: minimal B-tree of height 3; the root holds 1 key, every other node holds T - 1 keys, and every internal node below the root has T sons; number of nodes per level: 1, 2, 2T, 2T^2]
For T = 2 we get the so-called "2-3-4 tree"
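The level counts in the figure (1, 2, 2T, 2T^2 nodes) give the smallest number of keys a B-tree of a given height can hold. The sketch below sums them level by level; `min_keys` is a hypothetical helper name, and height is taken as the number of edges from the root to a leaf, which is an assumption consistent with the four levels drawn for h = 3.

```python
def min_keys(T, h):
    """Fewest keys in a B-tree of minimum degree T and height h."""
    total = 1                      # the root may hold a single key ...
    nodes = 2                      # ... and then it has two children
    for _ in range(h):
        total += nodes * (T - 1)   # each non-root node holds at least T-1 keys
        nodes *= T                 # and each internal one has at least T sons
    return total

smallest_234_tree = min_keys(2, 3)
```

The sum collapses to 2*T**h - 1, so n keys force a height of at most log_T((n+1)/2), which is the reason for the small height of the tree.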
Properties of B-trees
• B-tree is perfectly balanced
• The number of keys (and children) varies
• All the leaves are at the same depth
• Small height of the tree
• Designed for minimizing the number of accesses to the storage (the root node is kept in memory)
• Most of the keys are stored in leaves
Node definition
T = 5

def Array(size, initVal=None):
    return [initVal for _ in range(size)]

class BNODE:
    def __init__(self):
        # set per instance, so that nodes do not share their arrays
        self.isLeaf = True
        self.cntKey = 0
        self.keys = Array(2*T-1, None)
        self.sons = Array(2*T, None)
        # position of the node in the storage
        self.thisNodeDiscPos = None
        # positions of data (for particular keys) in the storage
        self.dataDiscPos = Array(2*T-1, None)

class DISCPOS:
    ...
Helper functions
def LoadNode(nodeDiscPos):
    # allocation in memory + read from storage
    ...

def WriteNodeToDisc(node):
    # writing to storage -> node.thisNodeDiscPos
    ...

def AllocateNode():
    # allocation in memory and in the storage,
    # writing data to storage
    p = BNODE()   # p.isLeaf = True, p.cntKey = 0
    p.thisNodeDiscPos = AllocateSpaceOnDisc()
    WriteNodeToDisc(p)
    return p
Search in B-tree
def BTreeFind(p, k):
    if node_contains_key(p, k):
        return p
    elif p.isLeaf:
        return None
    else:   # p is not a leaf and doesn't contain k
        c = get_child_of_node_that_can_contain(p, k)
        ptmp = LoadNode(c)
        ret = BTreeFind(ptmp, k)
        # be sure that ptmp is freed if ret != ptmp
        return ret
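An in-memory sketch of the same search, under assumptions that differ from the slide: nodes keep plain Python lists instead of fixed-size arrays, there is no storage layer (so no LoadNode), and the recursion is written as a loop. The class `BNode` and the function `btree_find` are illustrative names; `bisect_left` from the standard library plays the role of both node_contains_key and get_child_of_node_that_can_contain. The sample tree is the one from the B-tree figure.

```python
from bisect import bisect_left

class BNode:
    def __init__(self, keys, sons=None):
        self.keys = keys              # sorted list of keys
        self.sons = sons or []        # empty list for a leaf

    @property
    def is_leaf(self):
        return not self.sons

def btree_find(node, k):
    """Return the node that holds k, or None if k is absent."""
    while True:
        i = bisect_left(node.keys, k)        # first position with keys[i] >= k
        if i < len(node.keys) and node.keys[i] == k:
            return node
        if node.is_leaf:
            return None
        node = node.sons[i]                  # the only child that can contain k

root = BNode(["M"], [
    BNode(["D", "H"], [BNode(["B", "C"]), BNode(["F", "G"]),
                       BNode(["J", "K", "L"])]),
    BNode(["Q", "T", "X"], [BNode(["N", "P"]), BNode(["R", "S"]),
                            BNode(["V", "W"]), BNode(["Y", "Z"])]),
])
found_leaf = btree_find(root, "K")
missing = btree_find(root, "E")
```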
Splitting the node
T = 4
Before: p = (N W) with keys[i-1] = N and keys[i] = W; sons[i] = w = (P Q R S T U V)
After: p = (N S W), with S inserted at keys[i]; sons[i] = w = (P Q R), sons[i+1] = y = (T U V)
Splitting the root node
T = 4
Before: root = w = (P Q R S T U V)
After: root = p = (S) with keys[0] = S; sons[0] = w = (P Q R), sons[1] = y = (T U V)
Splitting the node in B-tree
Split of the maximal node w, the i-th child of p:
• The center key of the 2*T-1 keys of w is moved into node p (before the i-th key)
• The pointer to the new node z is inserted into node p (right after the pointer to w, i.e. as child i+1)
• The last T-1 keys of w are moved into z
• The last T pointers of w are moved into z
• The new node z should be returned (if necessary, the receiver should free the memory after usage)
B-tree: Splitting the node
def BTreeSplit(p, i, w):
    # Assumption: p != w. If we want to split the root node,
    # the new node should be added first (above the old root)
    z = AllocateNode()
    z.isLeaf = w.isLeaf
    z.cntKey, w.cntKey = T-1, T-1
    # make room in p for the center key and for the pointer to z
    for j in range(p.cntKey, i, -1):
        p.keys[j] = p.keys[j-1]   # p.data[j] = p.data[j-1]
    for j in range(p.cntKey+1, i+1, -1):
        p.sons[j] = p.sons[j-1]
    p.keys[i] = w.keys[T-1]   # p.data[i] = w.data[T-1]
    p.sons[i+1] = z   # w stays at sons[i]
    p.cntKey = p.cntKey + 1
    # move the upper half of w into z
    for j in range(0, T-1):
        z.keys[j] = w.keys[T+j]   # z.data[j] = w.data[T+j]
    for j in range(0, T):
        z.sons[j] = w.sons[T+j]
    WriteNodeToDisc(p)
    WriteNodeToDisc(w)
    WriteNodeToDisc(z)
    return z
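The split can be tried out in memory with list slicing instead of index loops. This is a sketch under assumptions: `BNode` and `split_child` are hypothetical names, nodes hold plain Python lists, and nothing is written to storage. The data reproduces the T = 4 figure (p = (N W), w = (P Q R S T U V)); the two sibling leaves are made up to complete the parent.

```python
T = 4

class BNode:
    def __init__(self, keys=None, sons=None):
        self.keys = keys or []
        self.sons = sons or []        # empty list for a leaf

def split_child(p, i):
    """Split the full child p.sons[i]; return the new right-hand node."""
    w = p.sons[i]
    assert len(w.keys) == 2 * T - 1   # only a maximal node is split
    z = BNode(w.keys[T:], w.sons[T:]) # upper T-1 keys and upper T pointers
    median = w.keys[T - 1]
    w.keys, w.sons = w.keys[:T - 1], w.sons[:T]
    p.keys.insert(i, median)          # the center key moves up, before keys[i]
    p.sons.insert(i + 1, z)           # z becomes the right neighbour of w
    return z

w = BNode(list("PQRSTUV"))
p = BNode(["N", "W"], [BNode(list("ABC")), w, BNode(list("XYZ"))])
z = split_child(p, 1)
```

After the call p holds N, S, W, the old node w keeps P, Q, R and the new node z holds T, U, V.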
T=3
Start: root (G M P X); leaves (A C D E) (J K) (N O) (R S T U V) (Y Z)
+B: root (G M P X); leaves (A B C D E) (J K) (N O) (R S T U V) (Y Z)
+Q: root (G M P T X); leaves (A B C D E) (J K) (N O) (Q R S) (U V) (Y Z)
T=3
Start: root (G M P T X); leaves (A B C D E) (J K) (N O) (Q R S) (U V) (Y Z)
+L: root (P); internal nodes (G M), (T X); leaves (A B C D E) (J K L) (N O) (Q R S) (U V) (Y Z)
+F: root (P); internal nodes (C G M), (T X); leaves (A B) (D E F) (J K L) (N O) (Q R S) (U V) (Y Z)
B-tree: Insertion of the key
#1
w = root
if is_maximal_node(root):
    new_root = add_a_new_root(root)
    split_node(new_root, root)   # the old root is split as a child of the new root
    w = new_root
#2
c = get_child_of_node_that_can_contain(w, k)
if is_maximal_node(c):
    split_node(w, c)
    c = get_child_of_node_that_can_contain(w, k)
if c.isLeaf:
    add_to_node_a_key(c, k)
else:
    recursively_continue_#2_for_node(c)
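The insertion steps can be sketched in memory and run against the T = 3 example from the slides. The code below is an illustration under assumptions: `BNode`, `split_child`, `insert` and `leaves` are hypothetical names, nodes use plain Python lists, there is no storage layer, and the recursion of step #2 is written as a loop. Every full node met on the way down is split before it is entered, so the final leaf always has room.

```python
from bisect import bisect_left

T = 3

class BNode:
    def __init__(self, keys=None, sons=None):
        self.keys = keys or []
        self.sons = sons or []            # empty list for a leaf

def is_full(node):
    return len(node.keys) == 2 * T - 1

def split_child(p, i):
    w = p.sons[i]
    z = BNode(w.keys[T:], w.sons[T:])     # upper half goes to the new node
    p.keys.insert(i, w.keys[T - 1])       # center key moves up into p
    p.sons.insert(i + 1, z)
    w.keys, w.sons = w.keys[:T - 1], w.sons[:T]

def insert(root, k):
    if is_full(root):                     # step #1: grow the tree upwards
        root = BNode(sons=[root])
        split_child(root, 0)
    w = root
    while w.sons:                         # step #2: descend, splitting full children
        i = bisect_left(w.keys, k)
        if is_full(w.sons[i]):
            split_child(w, i)
            i = bisect_left(w.keys, k)    # the split may change the target child
        w = w.sons[i]
    w.keys.insert(bisect_left(w.keys, k), k)
    return root

def leaves(node):
    """Return the leaves from left to right, each as a string of its keys."""
    if not node.sons:
        return ["".join(node.keys)]
    return [leaf for son in node.sons for leaf in leaves(son)]

root = BNode(list("GMPX"),
             [BNode(list(s)) for s in ("ACDE", "JK", "NO", "RSTUV", "YZ")])
for key in "BQLF":
    root = insert(root, key)
```

Starting from the first tree of the example and inserting B, Q, L and F in that order ends with root (P), internal nodes (C G M) and (T X), matching the last tree of the example.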
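T=3
Start: root (P); internal nodes (C G M), (T X); leaves (A B) (D E F) (J K L) (N O) (Q R S) (U V) (Y Z)
-F: root (P); internal nodes (C G M), (T X); leaves (A B) (D E) (J K L) (N O) (Q R S) (U V) (Y Z)
-M: root (P); internal nodes (C G L), (T X); leaves (A B) (D E) (J K) (N O) (Q R S) (U V) (Y Z)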
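T=3
Start: root (P); internal nodes (C G L), (T X); leaves (A B) (D E) (J K) (N O) (Q R S) (U V) (Y Z)
-S, step 1 (a key is moved through the root into the minimal node on the path): root (L); internal nodes (C G), (P T X); leaves (A B) (D E) (J K) (N O) (Q R S) (U V) (Y Z)
-S, step 2: root (L); internal nodes (C G), (P T X); leaves (A B) (D E) (J K) (N O) (Q R) (U V) (Y Z)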
T=3
Start: root (P); internal nodes (C G L), (T X); leaves (A B) (D E) (J K) (N O) (Q R S) (U V) (Y Z)
-G: root (P); internal nodes (C L), (T X); leaves (A B) (D E J K) (N O) (Q R S) (U V) (Y Z)
-D: root (C L P T X); leaves (A B) (E J K) (N O) (Q R S) (U V) (Y Z)
B-tree: Removing of the key
# do not visit minimal nodes!
w = root
if w.isLeaf and node_contains_key(w, k):
    remove_from_node_the_key(w, k)
elif not w.isLeaf and node_contains_key(w, k):
    p = get_child_preceding_the_key(w, k)
    n = get_child_succeeding_the_key(w, k)
    if is_minimal_node(p) and is_minimal_node(n):
        new_node = merge_nodes(p, n)   # k moves down into the merged node
        recursively_remove_key_from_node(new_node, k)
    else:
        if not is_minimal_node(p):
            k1 = find_predecessor(p, k)
            recursively_remove_key_from_node(p, k1)
        else:
            k1 = find_successor(n, k)
            recursively_remove_key_from_node(n, k1)
        replace_k_with_k1
B-tree: Removing of the key
elif not w.isLeaf and not node_contains_key(w, k):
    p = get_child_of_node_that_can_contain(w, k)
    if is_minimal_node(p):
        l = get_left_brother(p)
        r = get_right_brother(p)
        if not is_minimal_node(l):
            move_key_from_node_to_node(l, w, p)
        elif not is_minimal_node(r):
            move_key_from_node_to_node(r, w, p)
        else:
            p = merge_nodes(p, l)
    continue_the_process_from(p)
else:   # i.e. w is a leaf and doesn't contain k
    return