330 likes | 362 Views
CS473-Algorithms I. Lecture X Augmenting Data Structures. CS 473. Lecture X. 1. Augmenting Data Structures. When dealing with a new problem Data structures play important role Must design or addopt a data structure. Only in rare situtations
E N D
CS473-Algorithms I Lecture X Augmenting Data Structures CS 473 Lecture X 1
Augmenting Data Structures • When dealing with a new problem • Data structures play important role • Must design or addopt a data structure. • Only in rare situtations • We need to create an entirely new type of data structure. • More Often • It suffices to augment a known data structure by storing additional information. • Then we can program new operations for the data structure to support the desired application CS 473 Lecture X 2
Augmenting Data Structures (2) • Not Always Easy • Because, added info must be updated and maintained by the ordinary operations on the data structure. • Operations • Augmented data structure (ADS) has operations inherited from underlying data structure (UDS). • UDS Read/Query operations are not a problem. (ie. Min-Heap Minimum Query) • UDS Modify operations should update additional information without adding too much cost. (ie. Min-Heap Extract Min, Decrease Key) CS 473 Lecture X 3
Dynamic Order Statistics • Example problem; • Dynamic Order Statistics, where we need two operations; OS-SELECT(x,i): returns ith smallest key in subtree rooted at x OS-RANK(T,x): returns rank (position) of x in sorted (linear) order of tree T. • Other operations • Query: Search, Min, Max, Successor, Predecessor • Modify: Insert, Delete CS 473 Lecture X 4
Dynamic Order Statistics (2) • Sorted or linear order of a binary search tree T is determined by inorder tree walk of T. IDEA: • Use Red-Black (R-B) tree as the underlying data structure. • Keep subtree size in nodes as additional information. CS 473 Lecture X 5
Dynamic Order Statistics (3) • Relation Between Subtree Sizes; size[x] = size[left[x]] + size[right[x]] + 1 • Note on implementation; • For convenience use sentinel NIL[T] such that; size[NIL[T]] = 0 • Since high level languages do not have operations on NIL values. (ie. Java has NullPointerException) The node itself CS 473 Lecture X 6
Dynamic Order Statistics Notation KEY SUBTREE SIZE • Node Structure: • Key as with any Binary Search Tree (Tree is indexed according to key) • Subtree Size as additional Data on Node. CS 473 Lecture X 7
M 10 C P 7 2 A F Q 1 5 1 D H 1 3 G K 1 1 Dynamic Order Statistics Example 10=7+2+1 5=1+3+1 1= size[NIL] + size[NIL] + 1 = 0+0+1 3=1+1+1 Linear Order A C D F G H K M P Q CS 473 Lecture X 8
Retrieving an Element With a Given Rank OS-SELECT(x, i) r size[left[x]] + 1 if i = r then return x elseif i < r then return OS-SELECT( left[x], i) else return OS-SELECT(right[x], i) CS 473 Lecture X 9
M 10 C P 7 2 A F Q 1 5 1 D H 1 3 G K 1 1 OS-SELECT Example i=6 r=8 r > i Go Left CS 473 Lecture X 10
M 10 C P 7 2 A F Q 1 5 1 D H 1 3 G K 1 1 OS-SELECT Example r < i Go Right i=6 r=8 i=6 r=2 CS 473 Lecture X 11
M 10 C P 7 2 A F Q 1 5 1 D H 1 3 G K 1 1 OS-SELECT Example i=6 r=8 i=6 r=2 r < i Go Right i=4 r=2 CS 473 Lecture X 12
M 10 C P 7 2 A F Q 1 5 1 D H 1 3 G K 1 1 OS-SELECT Example i=6 r=8 i=6 r=2 i=4 r=2 r = i Found!!! i=2 r=2 CS 473 Lecture X 13
OS-SELECT • IDEA: • Knowing size of left subtree (number of nodes smaller than current) tells you which subtree the answer is in • Running Time: O(lgn) • Each recursive call goes down one level in the OS-TREE • Running time = O(d) where d = depth of the ith element • Worst case running time is proportional to the height of the tree • Since tree is a R-B tree, its height is balanced and O(lgn) CS 473 Lecture X 14
Determining the Rank of an Element OS-RANK(T, x) r size[left[x]] + 1 yx while y root[T] do if y = right[p[y]] then r r + size[left[p[y]]] + 1 yp[y] return r CS 473 Lecture X 15
M 10 C P 7 2 A F Q 1 5 1 D H 1 3 G K 1 1 Dynamic Order Statistics Example init r=2 right child r = 2 + 1 + 1 = 4 CS 473 Lecture X 16
M 10 C P 7 2 A F Q 1 5 1 D H 1 3 G K 1 1 Dynamic Order Statistics Example right child r = 4 + 1 + 1 = 6 CS 473 Lecture X 17
M 10 C P 7 2 A F Q 1 5 1 D H 1 3 G K 1 1 Dynamic Order Statistics Example left child r = 6 CS 473 Lecture X 18
M 10 C P 7 2 A F Q 1 5 1 D H 1 3 G K 1 1 Dynamic Order Statistics Example Root and Answer!!! r = 6 CS 473 Lecture X 19
OS-RANK • IDEA: • rank[x] = # of nodes preceeding x in an inorder tree walk + 1 (itself) • Follow the simple upward path from node x to the root. All left subtrees in this path contribute to rank[x] • Running Time: O(lgn) • Each iteration of the while-loop takes O(1) time • y goes up one level in the tree with each iteration. • Running time = O(dx), where dx is the depth of node x • Running time, at worst proportional to the height of the tree. (if x is a leaf node) CS 473 Lecture X 20
Determining the Rank of an Element r • Follow the nodes y (y=x initially) on the path from x to root. • Consider Subtree Tp[y] rooted at p[y], where z is y’s sibling. z y x Tz Ty Important: r contains the number of nodes in Ty that preceedes x. CS 473 Lecture X 21
Determining the Rank of an Element r=r+size[z]+1 z y x Tz Ty Case 1: If y is a right child z is a left child then • All nodes in Tz and p[y] precedes x • Thus, must add size[z] + 1 to r CS 473 Lecture X 22
Determining the Rank of an Element r=r y z x Tz Ty Case 2: If y is a left child z is a right child then • Neither p[y] nor any node in Tz precedes x. • Don’t update r CS 473 Lecture X 23
Maintaining Subtree Sizes • OS-SELECT and OS-RANK works if we are able to update the subtree sizes with modifications. • Two operations INSERT and DELETE modifies the contents of the Tree. We should try to update subtree size without extra traversals. • If not, would have to make a pass over the tree to set up the sizes whenever the tree is modified, at cost (n) CS 473 Lecture X 24
Red Black Tree Insertion • Insertion is a two phase process; • Phase 1: Insert node. Go down the tree from the root. Inserting the new node as a child of an existing node. (Search in O(lgn) time) • Phase 2: Balance tree and correct colors. Go up the tree changing colors and ultimately performing rotations to maintain the R-B properties. CS 473 Lecture X 25
Maintaining Subtree Sizes in an Insert Operation • Phase 1: • Increment size[x] for each node x on the downward path from root to leaves • The new added node gets the size of 1 • O(lgn) operation • Phase 2: • Only rotations cause structural changes • At most two rotations • Good News!!! Rotation is a local operation. Invalidates only two size fields of the two nodes, around which the rotation is performed • O(1) time operation CS 473 Lecture X 26
Maintaining Subtree Sizes in an Insert Operation 19 x 19 y LEFT-ROTATE(T,x) 11 7 6 12 x y 6 4 4 7 After the rotation: size[y]size[x] size[x]size[left[x]] + size[right[x]] + 1 Note: only size fields of x and y are modified CS 473 Lecture X 27
Red Black Tree Deletion Deletion is a two phase process; • Phase 1: • Splice out one node y • Phase 2: • Performs at most 3 rotations • Otherwise performs no structural changes CS 473 Lecture X 28
Maintaining Subtree Sizes in an Delete Operation • Phase 1: • Traverse a path from node y up to the root. Decrementing the size field of each node on the path. • Length of this path = dy = O(lgn) • O(lgn) Time Operation • Phase 2: • The O(1) Time Rotations can be handled in the same manner as for insertion. CS 473 Lecture X 29
Application: Counting the Number of Inversions • Definition: Let A[1..n] be an array of n distinct numbers. (i,j) is an inversion of A if i<j and A[i] > A[j] Inversions of A = <12, 13, 18, 16, 11> 1 2 3 4 5 I(A) = {(1,5), (2,5), (3,4), (3,5), (4,5)} CS 473 Lecture X 30
Application: Counting the Number of Inversions • Question : What array with elements from the set {1...n} has the most inversions? • A = <n, n-1, ... 2, 1> • number of inversions; CS 473 Lecture X 31
Application: Counting the Number of Inversions • Question : What is the relationship between the running time of insertion sort and the number of inversions • |I(A)| = # of inversions = # of element moves (shifts) where; i.e; Let r(j) be the rank of A[j] in A[i...j], then; CS 473 Lecture X 32
Number of inversions INVERSION(A) sum0 T for j1 to n do x MAKE-NEW-NODE() left[x]right[x]p[x]NIL key[x]A[j] OS-INSERT(T,x) rOS-RANK(T,x) Ijj-r sumsum+Ij return sum CS 473 Lecture X 33