Augmenting Data Structures Advanced Algorithms & Data Structures Lecture Theme 07 – Part II Prof. Dr. Th. Ottmann Summer Semester 2006
Examples for Augmenting DS
• Dynamic order statistics: augmenting binary search trees by size information
• d-dimensional range trees: recursive construction of (static) d-dim range trees
• Min-augmented dynamic range trees: augmenting 1-dim range trees by min-information
• Interval trees
• Priority search trees
Interval Trees (CLR Version)
Problem: Given a set R of intervals that changes under insertions and deletions, construct a data structure that stores R, can be updated in O(log n) time, and, for any query interval i, finds in O(log n) time an interval in R that overlaps i, or returns Nil if there is no such interval.
Idea: Store the set of intervals in an appropriately augmented balanced binary search tree and design an algorithm for the new operation.
Interval-search(T, i): Report an interval stored in T that overlaps the query interval i if such an interval exists, and Nil otherwise.
Observation
Let i = [low(i), high(i)] and i' = [low(i'), high(i')] be two closed intervals. Then i and i' overlap if and only if low(i) ≤ high(i') and low(i') ≤ high(i).
Any two intervals satisfy the interval trichotomy, that is, exactly one of the following holds:
• i and i' overlap
• high(i) < low(i') (i lies entirely to the left of i')
• high(i') < low(i) (i' lies entirely to the left of i)
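The overlap test and the trichotomy can be written down directly. The following is a minimal sketch for closed intervals; the names `Interval`, `overlaps`, and `trichotomy` are illustrative and do not come from the slides.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    low: float
    high: float


def overlaps(a: Interval, b: Interval) -> bool:
    # Closed intervals overlap iff each one starts no later than the other ends.
    return a.low <= b.high and b.low <= a.high


def trichotomy(a: Interval, b: Interval) -> str:
    # Exactly one of the three cases applies to any pair of intervals.
    if overlaps(a, b):
        return "overlap"
    if a.high < b.low:
        return "left"   # a lies entirely to the left of b
    return "right"      # b lies entirely to the left of a
```

Note that intervals that merely touch, such as [5,8] and [8,9], count as overlapping under this definition.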
Interval Trichotomy
[Figure: the cases in which intervals I1 = [x1, y1] and I2 = [x2, y2] overlap, and the cases in which they do not overlap.]
Interval tree (CLR Version)
The interval tree is a search tree on the lower endpoints of the intervals. Each node v stores an interval int(v) and the value max(v), the maximum upper (right) endpoint of all intervals stored in the subtree rooted at v.
[Figure: example interval tree with root [15,23] (max 33) storing the intervals [0,3], [5,8], [6,10], [8,9], [15,23], [16,21], [17,19], [19,20], [25,30], [26,26], [27,33], [29,30]; each node is labeled with its max value.]
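Maintaining max-information
Max-information can be maintained during updates and rebalancing operations (rotations), because it depends only on a node and its two children:
max(x) = max(high(int(x)), max(left(x)), max(right(x)))
A missing child contributes nothing to the maximum.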
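To illustrate why rotations are harmless, here is a hedged sketch of a left rotation that repairs the max fields with two local updates. The class `Node` and the helpers `subtree_max`, `update_max`, and `rotate_left` are assumptions made for this example, not names from the lecture.

```python
NEG_INF = float("-inf")


def subtree_max(v):
    # A missing child contributes nothing to the maximum.
    return v.max if v is not None else NEG_INF


def update_max(x):
    # max(x) = max(high(int(x)), max(left(x)), max(right(x)))
    x.max = max(x.high, subtree_max(x.left), subtree_max(x.right))


class Node:
    def __init__(self, low, high, left=None, right=None):
        self.low, self.high = low, high
        self.left, self.right = left, right
        update_max(self)


def rotate_left(x):
    # Standard left rotation: the right child y becomes the new subtree root.
    y = x.right
    x.right = y.left
    y.left = x
    update_max(x)   # x is now below y, so it must be recomputed first
    update_max(y)
    return y
```

Only the two rotated nodes change their subtrees, so two constant-time updates suffice.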
Finding an interval in T that overlaps interval i
Interval-search(T, i)
    x ← root(T)
    while x ≠ Nil and i does not overlap int(x) do
        if left(x) ≠ Nil and max(left(x)) ≥ low(i)
            then x ← left(x)
            else x ← right(x)
    return x
[Figure: the example interval tree with root [15,23] and max value 33.]
Observation: if int(x) does not overlap i, the search always proceeds in a safe direction!
Interval-search can be carried out in time O(height(T)).
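The search loop translates almost line by line into code. The sketch below assumes a plain, unbalanced binary search tree keyed on the lower endpoint (so the height bound of a balanced tree is not guaranteed here); `Node`, `insert`, and `interval_search` are illustrative names.

```python
class Node:
    def __init__(self, low, high):
        self.low, self.high = low, high
        self.max = high            # maximum upper endpoint in this subtree
        self.left = self.right = None


def insert(root, low, high):
    # Unbalanced BST insertion on the lower endpoint; max is fixed on the way back up.
    if root is None:
        return Node(low, high)
    if low < root.low:
        root.left = insert(root.left, low, high)
    else:
        root.right = insert(root.right, low, high)
    root.max = max(root.max, high)
    return root


def interval_search(root, low, high):
    # Returns a node whose interval overlaps [low, high], or None.
    x = root
    while x is not None and not (x.low <= high and low <= x.high):
        if x.left is not None and x.left.max >= low:
            x = x.left
        else:
            x = x.right
    return x


INTERVALS = [(15, 23), (6, 10), (17, 19), (5, 8), (8, 9), (16, 21),
             (27, 33), (0, 3), (25, 30), (29, 30), (19, 20), (26, 26)]
root = None
for lo, hi in INTERVALS:
    root = insert(root, lo, hi)
```

With the intervals of the example tree, a query such as [11,14] follows a single root-to-leaf path and ends at Nil, because no stored interval overlaps it.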
Interval Trees: Point-set Variant
Problem: Given a set R of intervals that changes under insertions and deletions, construct a data structure that stores R, can be updated in O(log n) time, and, for any query interval i, finds in O(log n) time an interval in R that overlaps i, or returns Nil if there is no such interval.
Solution: Map intervals to points and store the points in an appropriately augmented tree. An interval i is mapped to the point (l, r) with l = low[i] and r = high[i].
Max-augmented Range Tree
Store the intervals as points in sorted x-order in a leaf search tree. Store the maximum y-coordinate of each subtree at its internal node.
[Figure: a leaf search tree on the x-coordinates of the points, combined with a max-tournament tree on their y-coordinates, for the points (2, 5), (3, 4), (4, 5), (8, 13), (10, 15), (11, 12), (14, 17), (15, 22), (17, 18), (21, 28).]
Interval Search
Interval-Search(T, i)
/* Find an interval in tree T that overlaps i */
    p = root[T]
    while p is not a leaf do
        if max-y[left[p]] ≥ low[i]
            then p = left[p]
            else p = right[p]
    /* Now p is a leaf storing interval i' */
    if i and i' overlap then return "found" else return "not found"
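A small static version of this search can be sketched as follows. It assumes the points are given sorted by x and builds the tree once; the split keys on x are left out because the search itself only looks at the max-y values. The names `Leaf`, `Inner`, `build`, and `interval_search` are hypothetical.

```python
class Leaf:
    def __init__(self, l, r):
        self.point = (l, r)        # interval [l, r] stored as the point (l, r)
        self.max_y = r


class Inner:
    def __init__(self, left, right):
        self.left, self.right = left, right
        self.max_y = max(left.max_y, right.max_y)   # max-tournament on y


def build(points):
    # points: non-empty list of (l, r) pairs, sorted by l
    if len(points) == 1:
        return Leaf(*points[0])
    mid = len(points) // 2
    return Inner(build(points[:mid]), build(points[mid:]))


def interval_search(root, low, high):
    p = root
    while isinstance(p, Inner):
        p = p.left if p.left.max_y >= low else p.right
    l, r = p.point
    # p is a leaf; report its interval only if it really overlaps the query.
    return p.point if (l <= high and low <= r) else None


POINTS = [(2, 5), (3, 4), (4, 5), (8, 13), (10, 15),
          (11, 12), (14, 17), (15, 22), (17, 18), (21, 28)]
tree = build(POINTS)
```

The search always ends at exactly one leaf, so the final overlap test decides between "found" and "not found".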
Correctness Proof
Case 1: We go right, that is, low[i] > max-y[left[p]].
Every interval in the left subtree of p ends before i begins, so none of them can overlap i.
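Correctness Proof
Case 2: We go left, that is, low[i] ≤ max-y[left[p]].
Claim: If T[left[p]] does not contain an interval that overlaps i, then T[right[p]] cannot contain such an interval either.
Reason: Let i' be an interval in T[left[p]] with high[i'] = max-y[left[p]] ≥ low[i]. If i' does not overlap i, then high[i] < low[i']. Every interval in T[right[p]] has a lower endpoint of at least low[i'] and therefore starts after i ends. (Here we use the fact that the intervals are sorted according to their x-coordinates!)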
Interval Tree: Summary
An interval tree for a set of n intervals [l1, r1], …, [ln, rn] on the line is a dynamic max-augmented range tree for the set of points P = {(l1, r1), …, (ln, rn)}.
Interval trees can be used to carry out the following operations:
• Interval-Insert(T, i) inserts the interval i into the tree T.
• Interval-Delete(T, i) removes the interval i from the tree T.
• Interval-Search(T, i) returns a pointer to a node storing an interval i' that overlaps i, or Nil if no such interval is stored in T.
3-Sided Range Queries
[Figure: points plotted with the axes Age and Salary, together with a query range that is bounded on three sides.]
Goal: Report all k points in the query range in O(log n + k) time.
Priority Search Trees
Two data structures in one:
● Search tree on points' x-coordinates
● Heap on points' y-coordinates
Example point set: { (2, 12), (3, 4), (4, 11), (5, 3), (8, 5), (11, 21), (14, 7), (15, 2), (17, 30), (21, 8), (33, 33) }
[Figure: leaf search tree over the x-coordinates 2, 3, 4, 5, 8, 11, 14, 15, 17, 21, 33 with root key 11; the points are stored at the nodes in heap order on y, with (33, 33) at the root and (11, 21) and (17, 30) at its children.]
3-Sided Range Queries on a Priority Search Tree
Query procedure:
• Inspect all nodes on the two bounding paths and report the points that match the query.
• For every subtree between the two bounding paths, apply the following strategy:
  • Inspect the root.
  • If this reports a point, recursively visit the children of the root.
Cost: O(log n) time to query the bounding paths (red in the figure) and O(log n + k) time to query the subtrees between them (blue in the figure).
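The query strategy can be tried out on a simplified, static priority search tree. Note that this sketch is not the leaf-search-tree layout of the slides: each node stores the point with the largest y of its set and splits the remaining points at their median x. It assumes distinct x-coordinates and a north-grounded query x1 ≤ x ≤ x2, y ≥ y0; `PSTNode` and `query` are illustrative names.

```python
class PSTNode:
    def __init__(self, pts):
        # pts: non-empty list of (x, y) pairs sorted by x, with distinct x
        top = max(range(len(pts)), key=lambda k: pts[k][1])
        self.point = pts[top]                      # heap: largest y on top
        rest = pts[:top] + pts[top + 1:]
        mid = len(rest) // 2
        # Left subtree holds x < split, right subtree holds x >= split.
        self.split = rest[mid][0] if rest else self.point[0]
        self.left = PSTNode(rest[:mid]) if mid > 0 else None
        self.right = PSTNode(rest[mid:]) if rest else None


def query(root, x1, x2, y0):
    out = []

    def visit(v):
        # Heap order: if this point is too low, everything below it is too low.
        if v is None or v.point[1] < y0:
            return
        if x1 <= v.point[0] <= x2:
            out.append(v.point)
        if x1 < v.split:        # left subtree may contain x in [x1, x2]
            visit(v.left)
        if x2 >= v.split:       # right subtree may contain x in [x1, x2]
            visit(v.right)

    visit(root)
    return out


PTS = sorted([(2, 12), (3, 4), (4, 11), (5, 3), (8, 5), (11, 21),
              (14, 7), (15, 2), (17, 30), (21, 8), (33, 33)])
root = PSTNode(PTS)
```

The x-tests play the role of the two bounding paths, while the y-test is the "inspect the root, recurse only if it is reported" rule for the subtrees in between.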
Correctness of the Query Procedure
Observations:
• We never report a point that is not in the query range.
• Points in the yellow subtrees cannot match the query.
• Points in the blue subtrees that are not reported cannot match the query.
Insertion into a Priority Search Tree
Insertion procedure:
1. Insert a new leaf based on the point's x-coordinate.
2. Insert the point down the tree, based on its y-coordinate.
[Figure: the example priority search tree with (33, 33) at the root.]
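Step 2 of the insertion can be sketched in isolation. The code below makes a simplifying assumption that is not in the slides: the leaf search tree (the "skeleton") is built once over a fixed set of distinct x-coordinates, so step 1 is skipped. Points may occupy any node on the path to their leaf, including the leaf itself. All names are hypothetical.

```python
class Node:
    """Skeleton node of a leaf search tree over sorted, distinct x-values."""

    def __init__(self, xs):
        self.point = None                      # point stored here, if any
        self.left = self.right = None
        mid = len(xs) // 2
        # split = largest x-coordinate that belongs to the left subtree
        self.split = xs[mid - 1] if len(xs) > 1 else xs[0]
        if len(xs) > 1:
            self.left, self.right = Node(xs[:mid]), Node(xs[mid:])


def insert(root, p):
    # Push p down the search path of its x-coordinate, keeping max-heap order on y.
    node = root
    while node is not None:
        if node.point is None:
            node.point = p
            return
        if p[1] > node.point[1]:
            node.point, p = p, node.point      # p stays here, old point moves on
        node = node.left if p[0] <= node.split else node.right
    raise ValueError("x-coordinate missing from the skeleton or already used")


def stored_points(node):
    if node is None:
        return []
    own = [node.point] if node.point is not None else []
    return own + stored_points(node.left) + stored_points(node.right)


def heap_ordered(node):
    # Every stored child point must have a y no larger than its parent's point.
    for c in (node.left, node.right):
        if c is not None and c.point is not None:
            if node.point is None or c.point[1] > node.point[1]:
                return False
            if not heap_ordered(c):
                return False
    return True


PTS = [(2, 12), (3, 4), (4, 11), (5, 3), (8, 5), (11, 21),
       (14, 7), (15, 2), (17, 30), (21, 8), (33, 33)]
root = Node(sorted(x for x, _ in PTS))
for p in PTS:
    insert(root, p)
```

Each insertion touches one root-to-leaf path, which matches the O(height(T)) bound; the shape of this skeleton need not coincide with the tree drawn on the slide.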
Deletion from a Priority Search Tree
Deletion procedure:
1. Search for the point and delete it.
2. Fill the gap by pulling up points according to their y-values.
[Figure: the example priority search tree with (33, 33) at the root.]
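The pull-up step can be sketched in the same spirit. As a simplification that the slides do not make, the leaf search tree is a fixed skeleton over distinct x-coordinates, so no leaf is removed; a small insertion helper is included only to populate the tree. The names `Node`, `insert`, `delete`, and `stored_points` are illustrative.

```python
class Node:
    def __init__(self, xs):                    # xs: sorted, distinct x-values
        self.point = None
        self.left = self.right = None
        mid = len(xs) // 2
        self.split = xs[mid - 1] if len(xs) > 1 else xs[0]
        if len(xs) > 1:
            self.left, self.right = Node(xs[:mid]), Node(xs[mid:])


def insert(root, p):
    # Helper to fill the tree: push p down its search path in max-heap order.
    node = root
    while node.point is not None:
        if p[1] > node.point[1]:
            node.point, p = p, node.point
        node = node.left if p[0] <= node.split else node.right
    node.point = p


def delete(root, p):
    # Step 1: follow the search path of p's x-coordinate until p is found.
    node = root
    while node is not None and node.point != p:
        node = node.left if p[0] <= node.split else node.right
    if node is None:
        return False
    # Step 2: fill the gap by pulling up the child point with the larger y.
    while True:
        kids = [c for c in (node.left, node.right)
                if c is not None and c.point is not None]
        if not kids:
            node.point = None
            return True
        best = max(kids, key=lambda c: c.point[1])
        node.point = best.point
        node = best


def stored_points(node):
    if node is None:
        return []
    own = [node.point] if node.point is not None else []
    return own + stored_points(node.left) + stored_points(node.right)


PTS = [(2, 12), (3, 4), (4, 11), (5, 3), (8, 5), (11, 21),
       (14, 7), (15, 2), (17, 30), (21, 8), (33, 33)]
root = Node(sorted(x for x, _ in PTS))
for p in PTS:
    insert(root, p)
```

Deleting the root point (33, 33), for instance, pulls (17, 30) up into the root and then repairs the gap it leaves one level further down.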
Priority Search Trees: Observations
• Insertion and deletion of a point in a priority search tree T can be carried out in time O(height(T)).
• Priority search trees support north-grounded range reporting if the heap structure is a max-heap, and south-grounded range reporting if the heap structure is a min-heap.
• Keeping the height of the leaf search tree underlying a priority search tree at O(log n) for n stored points requires rebalancing!
• To obtain O(log n) algorithms for insertion and deletion of points, one must use a rebalancing scheme with constant restructuring cost per update!
• A PST storing n points requires O(n) space.
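Rotations in a Priority Search Tree
[Figure: a rotation of the nodes x and y; the points p1 and p2 stored at the rotated nodes are redistributed, and the slot marked "?" must be refilled.]
• Push p2 to the appropriate child of y.
• Store p1 at y.
• Propagate the point with maximal y-coordinate from the appropriate child of x.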
Priority Search Trees: Summary
Theorem: There exists a data structure to represent a dynamically changing set S of points in two dimensions with the following properties:
• The data structure can be updated in O(log n) time after every insertion into or deletion from S.
• The data structure allows us to answer 3-sided range queries in O(log n + k) time.
• The data structure occupies O(n) space.