Lecture X: Augmenting Data Structures
CS473-Algorithms I
How to Augment a Data Structure
• FOUR-STEP PROCEDURE:
• 1. Choose an underlying data structure (UDS)
• 2. Determine the additional info to be maintained in the UDS
• 3. Verify that the additional info can be maintained for the modifying operations on the UDS
• 4. Develop new operations
• Note: The order of the steps may vary, and the steps may be intermixed, in real design.
Example
• Design of our order-statistic (OS) trees:
• 1. Choose red-black (R-B) trees
• 2. Additional info: subtree sizes
• 3. Modifying operations: INSERT and DELETE, including their ROTATIONS
• 4. New operations: OS-RANK, OS-SELECT
• Bad design choice for OS-trees:
• 2. Additional info: store in each node its rank in the tree
• OS-RANK and OS-SELECT would run quickly, but inserting a new minimum element would change this info in every node of the tree.
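The good design choice above can be sketched in a few lines. This is a minimal illustration only: it uses a plain, unbalanced binary search tree rather than a red-black tree, so rotations are not shown, and the names `Node`, `insert`, `os_select` and `os_rank` are hypothetical. The rank lookup here walks down from the root by key (assuming distinct keys), which is one possible way to use the size fields.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1  # number of nodes in the subtree rooted here


def size(node):
    """Subtree size, treating a missing child as an empty subtree."""
    return node.size if node else 0


def insert(root, key):
    """Plain BST insert (no rebalancing); sizes are recomputed on the way back up."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    root.size = 1 + size(root.left) + size(root.right)
    return root


def os_select(node, i):
    """Return the i-th smallest key (1-based) in the subtree rooted at node."""
    while node:
        r = size(node.left) + 1  # rank of node within its own subtree
        if i == r:
            return node.key
        if i < r:
            node = node.left
        else:
            i -= r
            node = node.right
    raise IndexError("rank out of range")


def os_rank(root, key):
    """Return the 1-based rank of key, assuming distinct keys and key present."""
    rank = 0
    node = root
    while node:
        if key < node.key:
            node = node.left
        elif key > node.key:
            rank += size(node.left) + 1  # skip node and its whole left subtree
            node = node.right
        else:
            return rank + size(node.left) + 1
    raise KeyError(key)
```

Inserting a new minimum only changes the size fields on one root-to-leaf path, which is the reason subtree sizes are preferred over stored ranks.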
Augmenting R-B Trees: Theorem
• Theorem:
• Let f be a field that augments an R-B tree T of n nodes
• Suppose that f[x] for a node x can be computed using only
  • the info in nodes x, left[x], right[x], including
  • f[left[x]] and f[right[x]]
• Then the values of f in all nodes of T can be maintained during insertion and deletion without asymptotically affecting their O(lg n) running time
• Proof main idea:
• Changing f[x] => update only f[p[x]] but nothing else
• Updating f[p[x]] => update only f[p[p[x]]] but nothing else
• And so on up the tree until f[root[T]] is updated
• When f[root[T]] is updated, no other node depends on the new value, so the process terminates
• Propagating a change to an f field therefore costs O(lg n) time, since the height of an R-B tree is O(lg n)
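The proof idea can be illustrated with a small sketch. The field used here (the sum of the values in a subtree) and the names `Node`, `compute_f` and `propagate_up` are assumptions chosen for illustration; any field that depends only on a node and its two children would behave the same way.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left, self.right, self.parent = left, right, None
        for child in (left, right):
            if child:
                child.parent = self
        self.f = compute_f(self)


def compute_f(x):
    """Example field: sum of the values in x's subtree, from x and its children only."""
    left_f = x.left.f if x.left else 0
    right_f = x.right.f if x.right else 0
    return x.value + left_f + right_f


def propagate_up(x):
    """Recompute f at x and at every ancestor of x; return how many nodes were touched."""
    touched = 0
    while x is not None:
        x.f = compute_f(x)
        touched += 1
        x = x.parent
    return touched
```

The number of nodes touched equals the length of the path from the changed node to the root, so in a tree of height O(lg n) the update costs O(lg n).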
INTERVALS: DEFINITIONS
• DEFINITION: A closed interval
  • An ordered pair of real numbers [t1,t2] with t1 ≤ t2
  • [t1,t2] = { t ∈ R : t1 ≤ t ≤ t2 }
• INTERVALS:
  • Used to represent events that each occupy a continuous period of time
  • We wish to query a database of time intervals to find out what events occurred during a given interval
  • Represent an interval [t1,t2] as an object i, with the fields low[i] = t1 & high[i] = t2
  • Intervals i & i' overlap if i ∩ i' ≠ ∅, that is, low[i] ≤ high[i'] AND low[i'] ≤ high[i]
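INTERVALS: DEFINITIONS
• Any two intervals i and i' satisfy the interval trichotomy
• That is, exactly one of the following 3 properties holds:
  • a) i and i' overlap
  • b) high[i] < low[i'] (i lies entirely to the left of i')
  • c) high[i'] < low[i] (i' lies entirely to the left of i)
• [Figure: four ways in which i and i' can overlap (case a), i to the left of i' (case b), and i' to the left of i (case c)]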
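The overlap test and the trichotomy translate directly into code. The following is a small sketch; the `Interval` type and the function names are illustrative choices, not part of the lecture.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    low: float
    high: float


def overlaps(i, j):
    """Closed intervals overlap iff low[i] <= high[j] and low[j] <= high[i]."""
    return i.low <= j.high and j.low <= i.high


def trichotomy(i, j):
    """Report which of the three mutually exclusive cases holds for i and j."""
    if overlaps(i, j):
        return "overlap"
    if i.high < j.low:
        return "i before j"
    return "j before i"
```

Because the intervals are closed, two intervals that share only an endpoint, such as [1,3] and [3,5], count as overlapping.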
INTERVAL TREES
• Maintain a dynamic set of elements, with each element x containing an interval int[x]
• Support the following operations:
  • INSERT(T,x): adds an element x, whose int field contains an interval, to the tree T
  • DELETE(T,x): removes the element x from the tree T
  • SEARCH(T,i): returns a pointer to an element x in T such that int[x] overlaps with i, or NIL if no such element is in the set
INTERVAL TREES (cont.)
• S1: Underlying data structure
  • Choose an R-B tree
  • Each node x contains an interval int[x]
  • Key of x = low[int[x]]
  • An inorder tree walk lists the intervals in sorted order by low endpoint
• S2: Additional information
  • Store in each node x the maximum endpoint "max[x]" of any interval in the subtree rooted at x
EXAMPLE
• Intervals in the set: [4,8], [5,11], [7,10], [15,18], [17,19], [21,23]
• Tree nodes, listed as int (max):
  • [17,19] (max 23)
  • [5,11] (max 18)
  • [21,23] (max 23)
  • [4,8] (max 8)
  • [15,18] (max 18)
  • [7,10] (max 10)
INTERVAL TREES (cont.)
• S3: Maintaining additional info (max[x])
• max[x] = maximum { high[int[x]], max[left[x]], max[right[x]] }
• max[x] depends only on x and its children; thus, by the theorem, INSERT & DELETE run in O(lg n) time
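As a check on the formula, the sketch below recomputes max for the six intervals of the example. The tree shape in `example_tree` is an assumption inferred from the listed max values and the ordering by low endpoint; `Node`, `subtree_max` and `update_max` are illustrative names.

```python
NEG_INF = float("-inf")


class Node:
    def __init__(self, low, high, left=None, right=None):
        self.low, self.high = low, high
        self.left, self.right = left, right
        self.max = NEG_INF
        update_max(self)  # children are built first, so their max is already set


def subtree_max(x):
    """max of a subtree, with -infinity standing in for an empty (NIL) subtree."""
    return x.max if x is not None else NEG_INF


def update_max(x):
    """max[x] = maximum of high[int[x]], max[left[x]], max[right[x]]."""
    x.max = max(x.high, subtree_max(x.left), subtree_max(x.right))


def example_tree():
    """Assumed shape for the example: [17,19] at the root, keyed by low endpoint."""
    return Node(17, 19,
                left=Node(5, 11,
                          left=Node(4, 8),
                          right=Node(15, 18, left=Node(7, 10))),
                right=Node(21, 23))
```

With this shape the computed values agree with the example: 23 at [17,19], 18 at [5,11] and [15,18], 8 at [4,8], 10 at [7,10], and 23 at [21,23].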
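INTERVAL TREES (cont.)
• INSERT OPERATION
• Fix the subtree max values on the way down:
  • Traverse the insertion path, comparing the new interval's low endpoint ("new low") with the low endpoints of the node intervals
  • Use the new interval's high endpoint ("new high") to update the "max" field of each node on the path as appropriate
• Restore balance with rotations, updating the "max" fields of the rotated nodes
• [Figure: right rotation. Before: node [11,35] (max 35) is the subtree root with left child [6,20] (max 20). After: [6,20] (max 35) is the subtree root with right child [11,35] (max 35). The max values of the subtrees hanging below the two nodes (14, 19, 30) do not change.]
• Thus, fixing the "max" fields for a rotation takes O(1) time.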
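The rotation step might look as follows. This is a sketch without parent pointers or colors; `right_rotate` returns the new subtree root, and all names are illustrative. Only the two rotated nodes have their max recomputed, lower node first.

```python
NEG_INF = float("-inf")


class Node:
    def __init__(self, low, high, left=None, right=None):
        self.low, self.high = low, high
        self.left, self.right = left, right
        self.max = NEG_INF
        update_max(self)


def update_max(x):
    """Recompute max[x] from x and its two children only."""
    left_max = x.left.max if x.left else NEG_INF
    right_max = x.right.max if x.right else NEG_INF
    x.max = max(x.high, left_max, right_max)


def right_rotate(x):
    """Rotate x down to the right; return y, the new root of this subtree."""
    y = x.left
    x.left = y.right
    y.right = x
    update_max(x)  # x is now the lower node, so fix it first
    update_max(y)  # y depends on the corrected x
    return y
```

For instance, with [11,35] above [6,20] and three hypothetical leaf subtrees whose max values are 14, 19 and 30, the rotation changes max of [6,20] from 20 to 35 and leaves every other max field as it was.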
INTERVAL TREES (cont.)
• S4: Developing new operations
INTERVAL-SEARCH(T,i)
  x ← root[T]
  while x ≠ NIL and i ∩ int[x] = ∅ do
    if left[x] ≠ NIL and max[left[x]] ≥ low[i]
      then x ← left[x]
      else x ← right[x]
  return x
INTERVAL TREES (cont.)
• Time: O(lg n)
• Starts with x at the root and proceeds downward on a single path, until
  • EITHER an overlapping interval is found
  • OR x becomes NIL
• Each iteration takes O(1) time
• Height of the tree = O(lg n)
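A Python rendering of the search may help when tracing it by hand. It is a sketch on a hand-built tree (no red-black balancing); the tree shape in `example_tree` is an assumption based on the six intervals of the earlier example, and the names are illustrative.

```python
NEG_INF = float("-inf")


class Node:
    def __init__(self, low, high, left=None, right=None):
        self.low, self.high = low, high
        self.left, self.right = left, right
        left_max = left.max if left else NEG_INF
        right_max = right.max if right else NEG_INF
        self.max = max(high, left_max, right_max)


def interval_search(root, low, high):
    """Return some node whose interval overlaps [low, high], or None."""
    x = root
    while x is not None and not (x.low <= high and low <= x.high):
        if x.left is not None and x.left.max >= low:
            x = x.left   # if there is any overlap at all, one exists on the left
        else:
            x = x.right  # nothing on the left can reach low
    return x


def example_tree():
    """Assumed shape: [17,19] at the root, keyed by low endpoint."""
    return Node(17, 19,
                left=Node(5, 11,
                          left=Node(4, 8),
                          right=Node(15, 18, left=Node(7, 10))),
                right=Node(21, 23))
```

Searching this tree for [22,25] goes right once and stops at [21,23]; searching for [12,14] walks a single path to a missing child and returns None.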
Correctness of the Search Procedure
• Key idea: need to check only 1 of the node's 2 children
• Theorem:
  • Case 1: If the search goes right, then either there is an overlap in the right subtree or there is no overlap at all
  • Case 2: If the search goes left, then either there is an overlap in the left subtree or there is no overlap at all
Correctness of the Search Procedure
Case 1: GO RIGHT
• If there is an overlap in RIGHT, then done
• Otherwise (if no overlap in RIGHT):
  • EITHER left[x] = NIL => no overlap in LEFT
  • OR left[x] ≠ NIL and max[left[x]] < low[i]
    • For each interval i'' in LEFT: high[i''] ≤ max[left[x]] < low[i]
    • Therefore, no overlap in LEFT
Correctness of the Search Procedure
Case 2: GO LEFT
• If there is an overlap in LEFT, then done
• Otherwise (if no overlap in LEFT):
  • low[i] ≤ max[left[x]] = high[i'] for some i' in LEFT
  • Since i & i' don't overlap and low[i] ≤ high[i'], we have high[i] < low[i'] (interval trichotomy)
  • Since the tree is sorted by low endpoints, high[i] < low[i'] ≤ low[i''] for every interval i'' in RIGHT
  • Therefore, no overlap in RIGHT
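Pictorial View of Case 1 & Case 2
• [Figure: intervals drawn along a time axis t]
• Case 1: i'' is any interval in LEFT; every such i'' ends at or before max[left[x]], which lies to the left of i
• Case 2: i' is an interval in LEFT such that high[i'] = max[left[x]]; i'' is any interval in RIGHT; i lies to the left of i', and every i'' starts at or after low[i']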
Interval Trees
• How to enumerate all intervals overlapping a given interval
• Can be done in O(k lg n) time, where k = # of overlapping intervals:
  • Find and delete the overlapping intervals one by one
  • When done, reinsert them
• The theoretical best is O(k + lg n)
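The sketch below enumerates all overlapping intervals with a different technique from the find-delete-reinsert method described above: a recursive walk that prunes subtrees using the max field and the ordering by low endpoint. It leaves the tree unmodified and does not attain the O(k + lg n) bound. The tree shape in `example_tree` and all names are assumptions for illustration.

```python
NEG_INF = float("-inf")


class Node:
    def __init__(self, low, high, left=None, right=None):
        self.low, self.high = low, high
        self.left, self.right = left, right
        left_max = left.max if left else NEG_INF
        right_max = right.max if right else NEG_INF
        self.max = max(high, left_max, right_max)


def all_overlaps(x, low, high, out=None):
    """Collect every interval in x's subtree that overlaps [low, high], sorted by low."""
    if out is None:
        out = []
    if x is None or x.max < low:
        return out  # no interval in this subtree reaches as far as low
    all_overlaps(x.left, low, high, out)
    if x.low <= high and low <= x.high:
        out.append((x.low, x.high))
    if x.low <= high:
        # lows in the right subtree are >= x.low, so skip it when x.low > high
        all_overlaps(x.right, low, high, out)
    return out


def example_tree():
    """Assumed shape: [17,19] at the root, keyed by low endpoint."""
    return Node(17, 19,
                left=Node(5, 11,
                          left=Node(4, 8),
                          right=Node(15, 18, left=Node(7, 10))),
                right=Node(21, 23))
```

On the example tree, a query for [8,16] reports [4,8], [5,11], [7,10] and [15,18], and never descends into the subtree of [21,23].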
How to maintain a dynamic set of numbers that supports min-gap operations
• MIN-GAP(Q): returns the magnitude of the difference of the two closest numbers in Q
• Example: Q = {1, 5, 9, 15, 18, 22}, MIN-GAP(Q) = 18 − 15 = 3
• 1. Underlying data structure:
  • An R-B tree containing the numbers, keyed on the numbers
• 2. Additional info at each node:
  • min-gap[x]: minimum gap value in the subtree TX rooted at x
  • min[x]: minimum value (key) in TX
  • max[x]: maximum value (key) in TX
  • At a leaf (NIL) node, min-gap and min are ∞ and max is −∞, so that a term involving a missing child never determines a minimum
3. Maintaining the additional info
• min[x] = min[left[x]] if left[x] ≠ NIL; key[x] otherwise
• max[x] = max[right[x]] if right[x] ≠ NIL; key[x] otherwise
• min-gap[x] = minimum { min-gap[left[x]], min-gap[right[x]], key[x] − max[left[x]], min[right[x]] − key[x] }
  • A term that refers to a NIL child is taken to be ∞
• Each field can be computed from the info in the node & its children
• Hence, by the theorem, the fields can be maintained during insert & delete operations without affecting the O(lg n) running time
How to maintain a dynamic set of numbers that supports min-gap operations (cont.)
• The reason for defining the min & max fields is to make it possible to compute min-gap from the info at the node & its children
• 4. Develop the new operation: MIN-GAP(Q)
  • MIN-GAP(Q) simply returns the min-gap value of the root
  • It is an O(1)-time operation
• It is also possible to find the two closest numbers in O(lg n) time
How to maintain a dynamic set of numbers that supports min-gap operations (cont.)
CLOSEST-NUMBERS(Q)   /* assumes Q contains at least two numbers */
  x ← root[Q]
  gapmin ← min-gap[x]
  while x ≠ NIL do
    if gapmin = min-gap[left[x]] then
      x ← left[x]
    elseif gapmin = min-gap[right[x]] then
      x ← right[x]
    elseif gapmin = key[x] − max[left[x]] then
      return { max[left[x]], key[x] }
    else
      return { key[x], min[right[x]] }
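The field definitions and the walk for the two closest numbers can be tried on the example set. This sketch builds a static balanced binary search tree from a sorted list as a stand-in for the red-black tree, so insertion and deletion are not shown; `Node`, `build` and `closest_numbers` are illustrative names, and missing children are handled with explicit checks instead of sentinel values.

```python
INF = float("inf")


class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.min = left.min if left else key
        self.max = right.max if right else key
        self.min_gap = min(
            left.min_gap if left else INF,
            right.min_gap if right else INF,
            key - left.max if left else INF,    # gap to the predecessor of key
            right.min - key if right else INF,  # gap to the successor of key
        )


def build(keys):
    """Balanced BST from a sorted list of distinct numbers."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid], build(keys[:mid]), build(keys[mid + 1:]))


def closest_numbers(root):
    """Return the two closest numbers as (smaller, larger), or None if fewer than two."""
    if root is None or root.min_gap == INF:
        return None
    x, gap = root, root.min_gap
    while True:
        if x.left and x.left.min_gap == gap:
            x = x.left
        elif x.right and x.right.min_gap == gap:
            x = x.right
        elif x.left and x.key - x.left.max == gap:
            return (x.left.max, x.key)
        else:
            return (x.key, x.right.min)
```

For Q = {1, 5, 9, 15, 18, 22} the root's min_gap is 3 and the walk returns (15, 18), following at most one root-to-leaf path.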
How to find the overlap of rectilinearly oriented rectangles
• Given a set R of n rectilinearly oriented rectangles
  • i.e., the sides of all rectangles are parallel to the x & y axes
• Each rectangle r ∈ R is represented with 4 values:
  • xmin[r], xmax[r], ymin[r], ymax[r]
• Give an O(n lg n)-time algorithm to decide whether R contains two rectangles that overlap
OVERLAP(R)
  TY ← ∅
  SORT the xmin & xmax values of the rectangles in R
  for each extremum x value in the sorted order do
    r ← rectangle[x]
    yint ← [ymin[r], ymax[r]]
    if x = xmin[r] then
      v ← INTERVAL-SEARCH(TY, yint)
      if v ≠ NIL then
        return TRUE
      else
        z ← MAKE-NEW-NODE()
        left[z] ← right[z] ← p[z] ← NIL
        int[z] ← yint
        INSERT(TY, z)
    else   /* x = xmax[r] */
      DELETE(TY, yint)
  return FALSE
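The sweep can be sketched in Python with one substitution, stated plainly: instead of an interval tree, the active y-intervals are kept in a sorted list. This works because, as long as no overlap has been reported, the active y-intervals are pairwise disjoint, so a new interval only needs to be compared with its two neighbours in sorted order. List insertion and removal are linear, so this sketch does not reach the O(n lg n) bound; it illustrates the logic only. Left edges are processed before right edges at equal x so that closed rectangles that merely touch count as overlapping. The function name `any_overlap` is an assumption.

```python
from bisect import bisect_left, insort


def any_overlap(rects):
    """rects: iterable of closed rectangles given as (xmin, xmax, ymin, ymax)."""
    events = []
    for xmin, xmax, ymin, ymax in rects:
        events.append((xmin, 0, ymin, ymax))  # 0 = left edge, sorts first on ties
        events.append((xmax, 1, ymin, ymax))  # 1 = right edge
    events.sort()

    active = []  # sorted (ymin, ymax) pairs of the rectangles cut by the sweep line
    for _, kind, ymin, ymax in events:
        pos = bisect_left(active, (ymin, ymax))
        if kind == 0:
            # predecessor starts at or below ymin: overlap iff it reaches ymin
            if pos > 0 and active[pos - 1][1] >= ymin:
                return True
            # successor starts at or above ymin: overlap iff it starts by ymax
            if pos < len(active) and active[pos][0] <= ymax:
                return True
            insort(active, (ymin, ymax))
        else:
            active.pop(pos)  # the interval is present and unique while disjoint
    return False
```

For example, `any_overlap([(0, 2, 0, 2), (1, 3, 1, 3)])` reports an overlap, while two rectangles that are separated in either x or y do not.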