Lecture X: Augmenting Data Structures
CS473-Algorithms I

Augmenting Data Structures
• When dealing with a new problem:
  • Data structures play an important role.
  • We must design or adopt a data structure.
• Only in rare situations do we need to create an entirely new type of data structure.
• More often, it suffices to augment a known data structure by storing additional information.
• We can then program new operations for the data structure to support the desired application.

Augmenting Data Structures (2)
• Not always easy, because the added information must be updated and maintained by the ordinary operations on the data structure.
• Operations:
  • The augmented data structure (ADS) inherits its operations from the underlying data structure (UDS).
  • UDS read/query operations are not a problem (e.g. the Minimum query on a min-heap).
  • UDS modify operations must update the additional information without adding too much cost (e.g. Extract-Min and Decrease-Key on a min-heap).

Dynamic Order Statistics
• Example problem: dynamic order statistics, where we need two operations:
  • OS-SELECT(x, i): returns the ith smallest key in the subtree rooted at x
  • OS-RANK(T, x): returns the rank (position) of x in the sorted (linear) order of tree T
• Other operations:
  • Query: Search, Min, Max, Successor, Predecessor
  • Modify: Insert, Delete

Dynamic Order Statistics (2)
• The sorted (linear) order of a binary search tree T is determined by an inorder tree walk of T.
• IDEA:
  • Use a red-black (R-B) tree as the underlying data structure.
  • Keep the subtree size in each node as additional information.

Dynamic Order Statistics (3)
• Relation between subtree sizes:
  • size[x] = size[left[x]] + size[right[x]] + 1 (the + 1 counts the node itself)
• Note on implementation:
  • For convenience, use a sentinel NIL[T] such that size[NIL[T]] = 0.
  • This is needed because high-level languages do not allow operations on NIL values (e.g. dereferencing null in Java raises a NullPointerException).

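The size relation and the sentinel can be sketched in Python. This is only an illustration under assumptions: the names `Node`, `NIL`, and `make_node` are hypothetical and not part of the lecture, and colors and parent pointers are left out.

```python
class Node:
    """Binary search tree node augmented with its subtree size."""

    def __init__(self, key=None, left=None, right=None, size=1):
        self.key = key
        self.left = left
        self.right = right
        self.size = size


# Sentinel playing the role of NIL[T]: its size is 0, so size lookups on a
# missing child never need a special case.
NIL = Node(size=0)
NIL.left = NIL.right = NIL


def make_node(key, left=NIL, right=NIL):
    """Create a node whose size obeys size[x] = size[left] + size[right] + 1."""
    node = Node(key, left, right)
    node.size = left.size + right.size + 1
    return node


leaf = make_node("A")                                   # 0 + 0 + 1
parent = make_node("C", make_node("A"), make_node("F"))  # 1 + 1 + 1
```

With the sentinel in place, `leaf.left.size` is a valid expression even though the leaf has no real children.
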
Dynamic Order Statistics Notation
• Node structure (each node is drawn with two fields, KEY and SUBTREE SIZE):
  • Key: as in any binary search tree (the tree is indexed by key)
  • Subtree size: additional data stored in the node

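Dynamic Order Statistics Example
[Figure: order-statistic tree on the keys A C D F G H K M P Q, each node labeled with its subtree size]
• Root M has size 10 = 7 + 2 + 1. Its left child is C (size 7); its right subtree holds P and Q (size 2).
• C has children A (size 1) and F (size 5 = 1 + 3 + 1).
• F has children D (size 1) and H (size 3 = 1 + 1 + 1).
• H has children G (size 1) and K (size 1).
• Each leaf has size 1 = size[NIL] + size[NIL] + 1 = 0 + 0 + 1.
• Linear order: A C D F G H K M P Q
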
Retrieving an Element With a Given Rank
OS-SELECT(x, i)
    r ← size[left[x]] + 1
    if i = r then
        return x
    elseif i < r then
        return OS-SELECT(left[x], i)
    else
        return OS-SELECT(right[x], i - r)

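OS-SELECT translates almost line for line into Python. The sketch below assumes a plain (uncolored) binary search tree built from a sorted list; `Node`, `size`, and `build` are illustrative helpers, not part of the lecture, and `None` is used in place of the sentinel.

```python
def size(node):
    """Subtree size, treating None as the empty tree (size 0)."""
    return node.size if node is not None else 0


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 + size(left) + size(right)


def build(sorted_keys):
    """Build a balanced BST from an already sorted list (illustration only)."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2
    return Node(sorted_keys[mid],
                build(sorted_keys[:mid]),
                build(sorted_keys[mid + 1:]))


def os_select(x, i):
    """Return the node holding the i-th smallest key in the subtree rooted at x.

    Assumes 1 <= i <= size(x).
    """
    r = size(x.left) + 1          # rank of x within its own subtree
    if i == r:
        return x
    if i < r:
        return os_select(x.left, i)
    return os_select(x.right, i - r)   # skip the r keys that precede the right subtree


keys = list("ACDFGHKMPQ")
root = build(keys)
selected = [os_select(root, i).key for i in range(1, len(keys) + 1)]
```

Selecting ranks 1 through n in turn reproduces the sorted order, which is a convenient sanity check for the `i - r` adjustment in the right-subtree case.
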
OS-SELECT Example
[Figure: four steps of OS-SELECT(root, 6) on the example tree]
• At the root M: i = 6, r = 8. Since r > i, go left.
• At C: i = 6, r = 2. Since r < i, go right with i = 6 - 2 = 4.
• At F: i = 4, r = 2. Since r < i, go right with i = 4 - 2 = 2.
• At H: i = 2, r = 2. Since r = i, found: H is the 6th smallest key.

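The trace can be replayed with an iterative variant of OS-SELECT that records (key, i, r) at every visited node. The tree shape below is inferred from the subtree sizes in the example; placing Q as the right child of P is an assumption, since the sizes alone do not determine how P and Q are arranged. `Node`, `size`, and `os_select_trace` are illustrative names.

```python
def size(node):
    return node.size if node is not None else 0


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 + size(left) + size(right)


def os_select_trace(x, i):
    """Iterative OS-SELECT that also returns the visited (key, i, r) triples.

    Assumes 1 <= i <= size(x).
    """
    steps = []
    while True:
        r = size(x.left) + 1
        steps.append((x.key, i, r))
        if i == r:
            return x, steps
        if i < r:
            x = x.left
        else:
            x, i = x.right, i - r


# Example tree; the P/Q arrangement is assumed.
root = Node("M",
            Node("C",
                 Node("A"),
                 Node("F", Node("D"), Node("H", Node("G"), Node("K")))),
            Node("P", None, Node("Q")))

found, steps = os_select_trace(root, 6)
```
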
OS-SELECT
• IDEA: Knowing the size of the left subtree (the number of nodes smaller than the current one) tells you which subtree the answer is in.
• Running time: O(lg n)
  • Each recursive call goes down one level in the OS-tree.
  • Running time = O(d), where d is the depth of the ith element.
  • The worst-case running time is proportional to the height of the tree.
  • Since the tree is an R-B tree, it is balanced and its height is O(lg n).

Determining the Rank of an Element
OS-RANK(T, x)
    r ← size[left[x]] + 1
    y ← x
    while y ≠ root[T] do
        if y = right[p[y]] then
            r ← r + size[left[p[y]]] + 1
        y ← p[y]
    return r

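A Python sketch of OS-RANK needs parent pointers, so the illustrative `Node` class below sets them when children are attached. The tree is the one from the Dynamic Order Statistics example, with the P/Q arrangement assumed; none of the helper names come from the lecture.

```python
def size(node):
    return node.size if node is not None else 0


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.parent = None
        self.size = 1 + size(left) + size(right)
        for child in (left, right):
            if child is not None:
                child.parent = self


def os_rank(root, x):
    """Rank of node x in the inorder (sorted) order of the tree rooted at root."""
    r = size(x.left) + 1              # rank of x within its own subtree
    y = x
    while y is not root:
        if y is y.parent.right:
            # The parent and its whole left subtree precede x.
            r += size(y.parent.left) + 1
        y = y.parent
    return r


h = Node("H", Node("G"), Node("K"))
root = Node("M",
            Node("C", Node("A"), Node("F", Node("D"), h)),
            Node("P", None, Node("Q")))   # P/Q arrangement is assumed

rank_of_h = os_rank(root, h)
```
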
Dynamic Order Statistics Example: OS-RANK
[Figure: four steps of OS-RANK on the example tree, starting at node H]
• Init at H: r = size[left[H]] + 1 = 2.
• H is a right child: r = 2 + 1 + 1 = 4.
• F is a right child: r = 4 + 1 + 1 = 6.
• C is a left child: r = 6 (unchanged).
• M is the root, so the answer is r = 6.

OS-RANK
• IDEA:
  • rank[x] = number of nodes preceding x in an inorder tree walk + 1 (for x itself)
  • Follow the simple upward path from node x to the root.
  • Every left subtree hanging off this path, together with its parent on the path, contributes to rank[x].
• Running time: O(lg n)
  • Each iteration of the while-loop takes O(1) time.
  • y goes up one level in the tree with each iteration.
  • Running time = O(d_x), where d_x is the depth of node x.
  • At worst, the running time is proportional to the height of the tree (if x is a deepest leaf).

Determining the Rank of an Element
[Figure: subtree T_p[y] rooted at p[y], with children y and z; x lies inside T_y]
• Follow the nodes y (initially y = x) on the path from x to the root.
• Consider the subtree T_p[y] rooted at p[y], where z is y's sibling.
• Important: r is the rank of x within T_y, i.e. the number of nodes in T_y that precede x, plus 1 for x itself.

Determining the Rank of an Element
[Figure: y is the right child of p[y], z is the left child; x lies inside T_y]
• Case 1: y is a right child, so z is a left child.
  • All nodes in T_z, and p[y] itself, precede x.
  • Thus we must add size[z] + 1 to r: r ← r + size[z] + 1

Determining the Rank of an Element
[Figure: y is the left child of p[y], z is the right child; x lies inside T_y]
• Case 2: y is a left child, so z is a right child.
  • Neither p[y] nor any node in T_z precedes x.
  • Do not update r: r ← r

Maintaining Subtree Sizes
• OS-SELECT and OS-RANK work only if the subtree sizes are kept up to date under modifications.
• Two operations, INSERT and DELETE, modify the contents of the tree. We should update the subtree sizes without extra traversals.
• Otherwise, we would have to make a pass over the whole tree to recompute the sizes whenever the tree is modified, at cost Θ(n).

Red-Black Tree Insertion
• Insertion is a two-phase process:
  • Phase 1: Insert the node. Go down the tree from the root and insert the new node as a child of an existing node (search in O(lg n) time).
  • Phase 2: Balance the tree and correct the colors. Go up the tree changing colors and, finally, performing rotations to maintain the R-B properties.

Maintaining Subtree Sizes in an Insert Operation
• Phase 1:
  • Increment size[x] for each node x on the downward path from the root toward the leaves.
  • The newly added node gets size 1.
  • O(lg n) operation
• Phase 2:
  • Only rotations cause structural changes.
  • At most two rotations are performed.
  • Good news: a rotation is a local operation. It invalidates only the size fields of the two nodes around which the rotation is performed.
  • Updating the sizes takes O(1) time per rotation.

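Phase 1 can be sketched on its own as follows. This is a simplified illustration, not the full red-black insertion: it assumes a plain unbalanced binary search tree with no colors and no Phase 2, and the names `Node`, `os_insert`, and `sizes_consistent` are hypothetical.

```python
def size(node):
    return node.size if node is not None else 0


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1            # a newly added node has size 1


def os_insert(root, key):
    """Insert key and return the root, incrementing sizes on the way down.

    Phase 1 only: no recoloring or rotations are performed.
    """
    new = Node(key)
    if root is None:
        return new
    x = root
    while True:
        x.size += 1              # the new node will end up inside x's subtree
        if key < x.key:
            if x.left is None:
                x.left = new
                break
            x = x.left
        else:
            if x.right is None:
                x.right = new
                break
            x = x.right
    return root


def sizes_consistent(node):
    """Check size[x] = size[left[x]] + size[right[x]] + 1 at every node."""
    if node is None:
        return True
    return (node.size == size(node.left) + size(node.right) + 1
            and sizes_consistent(node.left)
            and sizes_consistent(node.right))


root = None
for k in [8, 3, 10, 1, 6, 14, 4]:
    root = os_insert(root, k)
```

No second pass is needed: every node whose subtree gains the new key lies on the search path, so incrementing along that path is enough.
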
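Maintaining Subtree Sizes in an Insert Operation
[Figure: LEFT-ROTATE(T, x), each node labeled with its subtree size]
• Before: x (size 19) has a left subtree of size 6 and right child y (size 12); the subtrees of y have sizes 4 and 7.
• After: y (size 19) has left child x (size 11) and a right subtree of size 7; the subtrees of x have sizes 6 and 4.
• After the rotation:
    size[y] ← size[x]
    size[x] ← size[left[x]] + size[right[x]] + 1
• Note: only the size fields of x and y are modified.
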
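The two size updates can be checked against the numbers in the figure. The sketch below is a simplification under assumptions: it has no parent pointers and returns the new subtree root instead of relinking it into a tree T, and the three subtrees are represented by single stub nodes whose sizes are set by hand. All names are illustrative.

```python
def size_of(node):
    return node.size if node is not None else 0


class Node:
    def __init__(self, key, left=None, right=None, size=None):
        self.key = key
        self.left = left
        self.right = right
        # An explicit size lets one stub node stand in for a whole subtree.
        self.size = size if size is not None else 1 + size_of(left) + size_of(right)


def left_rotate(x):
    """Rotate left around x and return y, the new root of this subtree."""
    y = x.right
    x.right = y.left
    y.left = x
    # Only the size fields of x and y change.
    y.size = x.size                                  # y now roots the same set of nodes
    x.size = size_of(x.left) + size_of(x.right) + 1
    return y


alpha = Node("alpha", size=6)   # stub for the subtree of size 6
beta = Node("beta", size=4)     # stub for the subtree of size 4
gamma = Node("gamma", size=7)   # stub for the subtree of size 7

y = Node("y", beta, gamma)      # size 12
x = Node("x", alpha, y)         # size 19
sizes_before = (x.size, y.size)

top = left_rotate(x)
sizes_after = (x.size, y.size)
```
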
Red-Black Tree Deletion
• Deletion is a two-phase process:
  • Phase 1: Splices out one node y.
  • Phase 2: Performs at most 3 rotations and otherwise makes no structural changes.

Maintaining Subtree Sizes in a Delete Operation
• Phase 1:
  • Traverse the path from node y up to the root, decrementing the size field of each node on the path.
  • Length of this path = d_y = O(lg n)
  • O(lg n)-time operation
• Phase 2:
  • The O(1)-time rotations can be handled in the same manner as for insertion.

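The upward pass of Phase 1 can be sketched for the simplest case. The code below assumes the spliced-out node y is a leaf other than the root and ignores colors and rebalancing; the general splice (a node with one child) and Phase 2 are not covered. The tree is the one from the Dynamic Order Statistics example with the P/Q arrangement assumed, and all helper names are hypothetical.

```python
def size(node):
    return node.size if node is not None else 0


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.parent = None
        self.size = 1 + size(left) + size(right)
        for child in (left, right):
            if child is not None:
                child.parent = self


def splice_out_leaf(y):
    """Remove leaf y (assumed not to be the root) and fix sizes up to the root."""
    p = y.parent
    if p.left is y:
        p.left = None
    else:
        p.right = None
    y.parent = None
    # Every ancestor of y loses exactly one node from its subtree.
    while p is not None:
        p.size -= 1
        p = p.parent


g = Node("G")
h = Node("H", g, Node("K"))
f = Node("F", Node("D"), h)
c = Node("C", Node("A"), f)
root = Node("M", c, Node("P", None, Node("Q")))   # P/Q arrangement is assumed

splice_out_leaf(g)
sizes_after = (h.size, f.size, c.size, root.size)
```
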
Application: Counting the Number of Inversions
• Definition: Let A[1..n] be an array of n distinct numbers. The pair (i, j) is an inversion of A if i < j and A[i] > A[j].
• Example: A = <12, 13, 18, 16, 11>, with indices 1 2 3 4 5
  • I(A) = {(1,5), (2,5), (3,4), (3,5), (4,5)}

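The definition can be checked directly with a brute-force enumeration. This is only a quadratic-time illustration of the definition, not the tree-based method of this lecture; the function name `inversions` is illustrative, and the returned pairs use the 1-based indices of the definition.

```python
def inversions(a):
    """Return all pairs (i, j), 1-based, with i < j and a[i] > a[j]."""
    n = len(a)
    return [(i + 1, j + 1)
            for i in range(n)
            for j in range(i + 1, n)
            if a[i] > a[j]]


example = [12, 13, 18, 16, 11]
pairs = inversions(example)
```
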
Application: Counting the Number of Inversions
• Question: What array with elements from the set {1, ..., n} has the most inversions?
• A = <n, n-1, ..., 2, 1>, because every pair (i, j) with i < j is an inversion.
• Number of inversions: |I(A)| = (n-1) + (n-2) + ... + 1 = n(n-1)/2

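Application: Counting the Number of Inversions
• Question: What is the relationship between the running time of insertion sort and the number of inversions?
• |I(A)| = number of inversions = number of element moves (shifts) performed by insertion sort
• where |I(A)| = I_1 + I_2 + ... + I_n and I_j is the number of inversions whose second index is j,
• i.e. I_j = |{i : i < j and A[i] > A[j]}|
• Let r(j) be the rank of A[j] in A[1..j]; then I_j = j - r(j).
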
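The link between shifts and inversions can be observed by instrumenting a textbook insertion sort. The sketch below counts one shift each time an element is moved one position to the right; `insertion_sort_shifts` is an illustrative name.

```python
def insertion_sort_shifts(a):
    """Sort a copy of a with insertion sort; return (sorted list, number of shifts)."""
    a = list(a)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Every element larger than key to its left forms one inversion with it
        # and is shifted exactly once.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            shifts += 1
            i -= 1
        a[i + 1] = key
    return a, shifts


sorted_example, shifts_example = insertion_sort_shifts([12, 13, 18, 16, 11])
_, shifts_reversed = insertion_sort_shifts([6, 5, 4, 3, 2, 1])
```

For the reversed array of length 6, the shift count equals 6 · 5 / 2, the maximum possible number of inversions.
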
Number of Inversions
INVERSION(A)
    sum ← 0
    T ← ∅
    for j ← 1 to n do
        x ← MAKE-NEW-NODE()
        left[x] ← right[x] ← p[x] ← NIL
        key[x] ← A[j]
        OS-INSERT(T, x)
        r ← OS-RANK(T, x)
        I_j ← j - r
        sum ← sum + I_j
    return sum
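
A runnable approximation of INVERSION is sketched below under two stated simplifications. First, the tree is a plain unbalanced binary search tree with subtree sizes rather than a red-black tree, so the worst case is quadratic instead of O(n lg n). Second, the rank is accumulated during the insertion descent instead of by a separate OS-RANK call, which gives the same value of r. The names `Node`, `size`, and `count_inversions` are illustrative.

```python
def size(node):
    return node.size if node is not None else 0


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1


def count_inversions(a):
    """Count inversions of a list of distinct numbers with a size-augmented BST."""
    root = None
    total = 0
    for j, key in enumerate(a, start=1):
        node = Node(key)
        r = 1                      # rank of key among a[0..j-1], counting itself
        if root is None:
            root = node
        else:
            x = root
            while True:
                x.size += 1        # the new node joins x's subtree
                if key < x.key:
                    if x.left is None:
                        x.left = node
                        break
                    x = x.left
                else:
                    # x and its whole left subtree precede the new key.
                    r += size(x.left) + 1
                    if x.right is None:
                        x.right = node
                        break
                    x = x.right
        total += j - r             # I_j = j - r(j)
    return total


example_count = count_inversions([12, 13, 18, 16, 11])
reversed_count = count_inversions([6, 5, 4, 3, 2, 1])
```

Replacing the unbalanced tree with a red-black tree that maintains sizes through rotations would restore the O(lg n) bound per insertion.
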