Augmenting Data Structures Advanced Algorithms & Data Structures Lecture Theme 07 – Part I Prof. Dr. Th. Ottmann Summer Semester 2006
Augmentation Process Augmentation is a process of extending a data structure in order to support additional functionality. It consists of four steps: • Choose an underlying data structure. • Determine the additional information to be maintained in the underlying data structure. • Verify that the additional information can be maintained for the basic modifying operations on the underlying data structure. • Develop new operations.
Examples for Augmenting DS • Dynamic order statistics: Augmenting binary search trees by size information • D-dimensional range trees: Recursive construction of (static) d-dim range trees • Min-augmented dynamic range trees: Augmenting 1-dim range trees by min-information • Interval trees • Priority search trees
Dynamic Order Statistics • Problem: Given a set S of numbers that changes under insertions and deletions, construct a data structure to store S that can be updated in O(log n) time and that can report the k-th order statistic for any k in O(log n) time. (Figure: example set S = {5, 51, 85, 13, 48, 7, 34, 22, 14}.)
Binary Search Trees and Order Statistics (Figure: binary search tree on the keys 1, 5, 7, 13, 17, 18, 19, 25, 33, 37, 49.) • Retrieving an element with a given rank: • For a given i, find the i-th smallest key in the set. • Determining the rank of an element: • For a given (pointer to a) key k, determine the rank of k in the set of keys.
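Augmenting the Data Structure • Every node v stores two pieces of information: • Its key • The size of the subtree with root v (the number of nodes in that subtree, v included) (Figure: example tree with root 33; key and subtree size of each node: 33 (11), 17 (4), 51 (6), 9 (2), 21 (1), 48 (1), 92 (4), 4 (1), 73 (2), 124 (1), 81 (1).)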
How To Determine The Rank of an Element • Find the rank of key x in the tree with root node v:
Rank(v, x)
    if x = key(v)
        then return 1 + size(left(v))
    if x < key(v)
        then return Rank(left(v), x)
        else return 1 + size(left(v)) + Rank(right(v), x)
(Figure: the size-augmented example tree with root 33.)
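The Rank procedure can be sketched in Python as follows. This is a minimal illustration, not the lecture's code: the `Node` class, the `size` helper and the `KeyError` for missing keys are assumptions, and the shape of the example tree is reconstructed from the keys and subtree sizes listed on the slides.

```python
class Node:
    """BST node augmented with the size of its subtree (hypothetical helper class)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 + size(left) + size(right)


def size(v):
    """Size of the subtree rooted at v; an empty subtree has size 0."""
    return v.size if v is not None else 0


def rank(v, x):
    """Rank of key x in the subtree rooted at v (1 = smallest key)."""
    if v is None:
        raise KeyError(x)  # assumption: asking for an absent key is an error
    if x == v.key:
        return 1 + size(v.left)
    if x < v.key:
        return rank(v.left, x)
    # Everything in the left subtree and v itself are smaller than x.
    return 1 + size(v.left) + rank(v.right, x)


# Example tree with root 33 (shape reconstructed from the listed sizes).
root = Node(33,
            Node(17, Node(9, Node(4)), Node(21)),
            Node(51, Node(48), Node(92, Node(73, None, Node(81)), Node(124))))

print(rank(root, 33))  # 5: the keys 4, 9, 17, 21 are smaller
print(rank(root, 81))  # 9
```

Each call descends one level, so the running time is proportional to the height of the tree.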
How to Find the k-th Order Statistic • Find (a pointer to) the node containing the k-th smallest key in the subtree rooted at node v.
Select(v, k)
    if k = size(left(v)) + 1
        then return v
    if k ≤ size(left(v))
        then return Select(left(v), k)
        else return Select(right(v), k – 1 – size(left(v)))
(Figure: the size-augmented example tree with root 33.)
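A matching Python sketch of Select, under the same assumptions as before: the `Node` class and `size` helper are illustrative, the range check on k is an addition, and the example tree's shape is reconstructed from the slides' keys and sizes.

```python
class Node:
    """BST node augmented with the size of its subtree (hypothetical helper class)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 + size(left) + size(right)


def size(v):
    """Size of the subtree rooted at v; an empty subtree has size 0."""
    return v.size if v is not None else 0


def select(v, k):
    """Return the node holding the k-th smallest key in the subtree rooted at v."""
    if v is None or not 1 <= k <= v.size:
        raise IndexError(k)  # assumption: out-of-range k is an error
    left_size = size(v.left)
    if k == left_size + 1:
        return v
    if k <= left_size:
        return select(v.left, k)
    # Skip the left subtree and v itself.
    return select(v.right, k - 1 - left_size)


root = Node(33,
            Node(17, Node(9, Node(4)), Node(21)),
            Node(51, Node(48), Node(92, Node(73, None, Node(81)), Node(124))))

print(select(root, 5).key)  # 33
print(select(root, 9).key)  # 81
```

Select and Rank are inverse to each other: the node returned by `select(root, k)` holds the key of rank k.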
Maintaining Subtree Sizes Under Insertions • Insert operation • Insert the node as into a standard binary search tree. • Add 1 to the subtree size of every ancestor of the new node. (Figure: key 64 is inserted into the example tree as a child of 73; the subtree sizes of its ancestors 73, 92, 51 and 33 grow from 2, 4, 6 and 11 to 3, 5, 7 and 12.)
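The insertion rule can be sketched as below. This is an illustrative, unbalanced version: the `Node` class and `insert` function are assumptions, rebalancing is left out, and the insertion order is chosen so that the resulting tree matches the example on the slides.

```python
class Node:
    """BST node that also stores the size of its subtree (hypothetical helper class)."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1  # a new leaf is a subtree of size 1


def insert(root, key):
    """Insert key into an unbalanced BST and return the (unchanged) root."""
    if root is None:
        return Node(key)
    v = root
    while True:
        v.size += 1  # v is an ancestor of the node about to be added
        if key < v.key:
            if v.left is None:
                v.left = Node(key)
                break
            v = v.left
        else:
            if v.right is None:
                v.right = Node(key)
                break
            v = v.right
    return root


# Rebuild the example tree level by level, then insert 64 as on the slides.
root = None
for key in [33, 17, 51, 9, 21, 48, 92, 4, 73, 124, 81]:
    root = insert(root, key)
size_before = root.size   # 11
root = insert(root, 64)   # ancestors of 64: 33, 51, 92, 73
print(size_before, root.size, root.right.size)  # 11 12 7
```

Because the ancestors of the new node are exactly the nodes on the search path, the sizes can be updated during the descent without a second pass.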
Maintaining Subtree Sizes Under Deletions • Delete operation • Delete the node as from a standard binary search tree. • Subtract 1 from the subtree size of every ancestor of the removed node.
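Maintaining Subtree Sizes Under Rotations (Figure: a rotation between a node and its child, with subtree sizes labelled s1 to s5. Only the two rotated nodes change their size: the new subtree root takes over the size s1 of the old root, and the node rotated downwards gets the sum of the sizes of its two new subtrees plus one, labelled s5 + s3 + 1 in the figure.)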
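A rotation touches only two nodes, so their sizes can be repaired in constant time. The sketch below shows one way to do this; the `Node` class, the `size` helper and the function names are assumptions, not taken from the slides.

```python
class Node:
    """BST node augmented with the size of its subtree (hypothetical helper class)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 + size(left) + size(right)


def size(v):
    """Size of the subtree rooted at v; an empty subtree has size 0."""
    return v.size if v is not None else 0


def rotate_right(y):
    """Rotate y's left child x above y and return x, the new subtree root."""
    x = y.left
    y.left = x.right
    x.right = y
    x.size = y.size  # x now roots exactly the nodes y rooted before
    y.size = 1 + size(y.left) + size(y.right)  # recompute from y's new children
    return x


def rotate_left(x):
    """Mirror image of rotate_right."""
    y = x.right
    x.right = y.left
    y.left = x
    y.size = x.size
    x.size = 1 + size(x.left) + size(x.right)
    return y


y = Node(5, Node(3, Node(2), Node(4)), Node(6))
x = rotate_right(y)
print(x.key, x.size, x.right.key, x.right.size)  # 3 5 5 3
```

The order of the two assignments matters: the new root copies the old root's size before that size is recomputed.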
Dynamic Order Statistics—Summary • Theorem: There exists a data structure to represent a dynamically changing set S of numbers with the following properties: • The data structure can be updated in O(log n) time after every insertion or deletion into or from S. • The data structure allows us to determine the rank of an element or to find the element with a given rank in O(log n) time. • The data structure occupies O(n) space.
4-Sided Range Queries • Goal: Build a static data structure of size O(n log n) that can answer 4-sided range queries in O(log^2 n + k) time, where k is the number of reported points.
Orthogonal d-dimensional Range Search Build a static data structure for a set P of n points in d-space that supports d-dim range queries. d-dim range query: Let R be a d-dim orthogonal hyperrectangle, given by d ranges [x1, x1'], …, [xd, xd']. Find all points p = (p1, …, pd) ∈ P such that x1 ≤ p1 ≤ x1', …, xd ≤ pd ≤ xd'. Special cases: the 1-dim range query (an interval [x1, x1'] on the line) and the 2-dim range query (an axis-parallel rectangle [x1, x1'] × [x2, x2']).
1-dim Range Search Standard binary search trees also support 1-dim range queries. (Figure: binary search tree on the keys 12, 18, 21, 23, 30, 37, 42, 49, 55, 61, 68, 74, 80, 81, 90, 99.)
1-dim Range Search Leaf-search-tree: (Figure: the same keys stored at the leaves of a leaf-search tree; the internal nodes hold routers, and the rightmost leaf is a sentinel ∞.)
1-dim Range Tree A 1-dim range tree is a leaf-search tree for the x-values (points on the line). Internal nodes have routers guiding the search to the leaves; we choose the maximal x-value in the left subtree as router. Range search: In order to find all points in a given range [l, r], search for the boundary values l and r. The two search paths share a common prefix and then fork. Report all leaves of the subtrees that lie between the two search paths and whose roots are children of nodes on the search paths, and check the two leaves at which the paths end.
(Figure: the search paths for l and r fork at the split node; the selected subtrees hang between the two paths.)
Canonical Subsets The canonical subset of node v, P(v), is the subset of points of P stored at the leaves of the subtree rooted at v. If v is a leaf, P(v) consists of the point stored at this leaf. If v is the root, P(v) = P. Observations: For each query range [l, r], the set of points with x-coordinates falling into this range is the disjoint union of O(log n) canonical subsets of P. A node v is called an umbrella node for the range [l, r] if the x-coordinates of all points in its canonical subset P(v) fall into the range, but this does not hold for the parent of v. All k points stored at the leaves of a tree rooted at node v, i.e. the k points in a canonical subset P(v), can be reported in time O(k).
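The search for the umbrella nodes can be sketched as follows. The sketch assumes distinct keys and a static tree built from a sorted list; the names `Node`, `build`, `umbrella_nodes`, `report` and `range_query` are illustrative, and the ∞ sentinel leaf is omitted. Routers follow the convention above: an internal node stores the maximal key of its left subtree.

```python
class Node:
    """Leaf-search-tree node: a leaf stores a key, an internal node a router."""

    def __init__(self, router, left=None, right=None):
        self.router = router  # leaf: its key; internal: max key in left subtree
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None


def build(keys):
    """Build a balanced leaf-search tree from a non-empty sorted list of distinct keys."""
    if len(keys) == 1:
        return Node(keys[0])
    mid = (len(keys) + 1) // 2
    return Node(keys[mid - 1], build(keys[:mid]), build(keys[mid:]))


def umbrella_nodes(root, l, r):
    """Return the roots of the canonical subsets that together cover [l, r]."""
    v = root
    # Follow the common prefix of both search paths down to the split node.
    while not v.is_leaf() and (r <= v.router or l > v.router):
        v = v.left if r <= v.router else v.right
    if v.is_leaf():
        return [v] if l <= v.router <= r else []
    out = []
    u = v.left  # search path for l: right siblings of the path lie in the range
    while not u.is_leaf():
        if l <= u.router:
            out.append(u.right)
            u = u.left
        else:
            u = u.right
    if l <= u.router <= r:
        out.append(u)
    u = v.right  # search path for r: left siblings of the path lie in the range
    while not u.is_leaf():
        if r > u.router:
            out.append(u.left)
            u = u.right
        else:
            u = u.left
    if l <= u.router <= r:
        out.append(u)
    return out


def report(v, out):
    """Append all keys stored in the leaves below v to out, in O(k) time."""
    if v.is_leaf():
        out.append(v.router)
    else:
        report(v.left, out)
        report(v.right, out)


def range_query(root, l, r):
    """Report all keys in [l, r]; canonical subsets are not visited in sorted order."""
    out = []
    for v in umbrella_nodes(root, l, r):
        report(v, out)
    return out


KEYS = [12, 18, 21, 23, 30, 37, 42, 49, 55, 61, 68, 74, 80, 81, 90, 99]
tree = build(KEYS)
print(sorted(range_query(tree, 20, 60)))  # [21, 23, 30, 37, 42, 49, 55]
```

For the query [20, 60] the seven reported keys come from only four umbrella nodes, one of which is an internal node covering 30, 37, 42 and 49.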
1-dim Range Tree: Summary Let P be a set of n points in 1-dim space. P can be stored in a balanced binary leaf-search tree such that the following holds: • Construction time: O(n log n) • Space requirement: O(n) • Insertion of a point: O(log n) time • Deletion of a point: O(log n) time • 1-dim range query: Reporting all k points falling into a given query range can be carried out in time O(log n + k). The performance of 1-dim range trees does not depend on the chosen balancing scheme!
2-dim Range Tree: The Primary Structure • Static binary leaf-search tree over the x-coordinates of the points. • Every leaf represents a vertical slab of the plane. • Every internal node represents a slab that is the union of the slabs of its children.
Answering 2-dim Range Queries • Normalize queries to end on slab boundaries. • Query decomposes into O(log n) subqueries. • Every subquery is a 1-dimensional range query on the y-coordinates of all points in the slab of the corresponding node. (The x-coordinates do not matter!)
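2-dim Range Tree (Figure: a node v of the primary tree Tx covers the x-interval Ix(v) and points to its associated tree Ty(v) on the y-coordinates.)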
2-dim Range Tree A 2-dimensional range tree for storing a set P of n points in the x-y-plane consists of: • A 1-dim range tree Tx for the x-coordinates of the points. • For each node v of Tx, a pointer to a 1-dim range tree Ty(v) storing all points which fall into the interval Ix(v). That is, Ty(v) is a 1-dim range tree based on the y-coordinates of all points p ∈ P with px ∈ Ix(v). (Figure: the primary leaf-search tree on the x-coordinates of the points; its node v points to a leaf-search tree on the y-coordinates of the points in Ix(v).)
2-dim Range Tree A 2-dim range tree on a set of n points in the plane requires O(n log n) space. A point p is stored in the associated range trees Ty(v) of all nodes v on the search path to px in Tx. Hence, for each depth d, each point p occurs in exactly one associated search structure Ty(v) for a node v of depth d in Tx. Since Tx has depth O(log n), all associated structures together store O(n log n) points. The 2-dim range tree can be constructed in time O(n log n). (Presort the points on y-coordinates!) (Figure: a point p appears once on every level along its search path.)
The 2-Dimensional Range Tree • Primary structure: • Leaf-search tree on x-coordinates of points • Every node stores a secondary structure: • Balanced binary search tree on y-coordinates of points in the node’s slab. Every point is stored in secondary structures of O(log n) nodes. Space: O(n log n)
Answering Queries • Every 2-dimensional range query decomposes into O(log n) 1-dimensional range queries • Each such query takes O(log n + k′) time, where k′ is the number of points it reports • Total query complexity: • O(log^2 n + k)
2-dim Range Query Let P be a set of points in the plane stored in a 2-dim range tree, and let a 2-dim range R defined by the two intervals [x, x'], [y, y'] be given. Then all k points of P falling into the range R can be reported as follows:
1. Determine the O(log n) umbrella nodes for the range [x, x'], i.e. determine the canonical subsets of P that together contain exactly the points with x-coordinates in the range [x, x']. (This is a 1-dim range query on the x-coordinates.)
2. For each umbrella node v obtained in step 1, use the associated 1-dim range tree Ty(v) in order to report the points of P(v) with y-coordinates in the range [y, y']. (This is a 1-dim range query for each of the O(log n) canonical subsets obtained in step 1.)
Time to report all k points in the 2-dim range R: O(log^2 n + k). The query time can be reduced to O(log n + k) by a technique known as fractional cascading.
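The two query steps can be illustrated with a compact Python sketch. It deviates from the slides in two labelled ways: the secondary structure of each node is a y-sorted array searched with `bisect` (a stand-in for the 1-dim range tree Ty(v), with the same O(log n + k′) cost per subquery), and each primary node stores the smallest and largest x-coordinate below it instead of a router. All names are hypothetical. The O(log n) bound on visited nodes assumes distinct x-coordinates; the reported result is correct either way.

```python
from bisect import bisect_left, bisect_right
from heapq import merge


class Node:
    """Primary-tree node over points sorted by x, with a y-sorted secondary array."""

    def __init__(self, pts_by_x):
        self.xmin = pts_by_x[0][0]   # assumption: x-range stored instead of a router
        self.xmax = pts_by_x[-1][0]
        if len(pts_by_x) == 1:
            x, y = pts_by_x[0]
            self.left = self.right = None
            self.by_y = [(y, x)]
        else:
            mid = len(pts_by_x) // 2
            self.left = Node(pts_by_x[:mid])
            self.right = Node(pts_by_x[mid:])
            # Merging the children's arrays keeps construction at O(n log n).
            self.by_y = list(merge(self.left.by_y, self.right.by_y))


def build(points):
    """Build the 2-dim structure from a non-empty list of (x, y) pairs."""
    return Node(sorted(points))


def query(v, x1, x2, y1, y2, out):
    """Append all points with x1 <= x <= x2 and y1 <= y <= y2 to out."""
    if v is None or v.xmax < x1 or v.xmin > x2:
        return  # slab of v is disjoint from the x-range
    if x1 <= v.xmin and v.xmax <= x2:
        # v is an umbrella node: answer a 1-dim query on the y-coordinates.
        i = bisect_left(v.by_y, (y1, float("-inf")))
        j = bisect_right(v.by_y, (y2, float("inf")))
        out.extend((x, y) for y, x in v.by_y[i:j])
        return
    query(v.left, x1, x2, y1, y2, out)
    query(v.right, x1, x2, y1, y2, out)


points = [(1, 5), (2, 1), (3, 8), (4, 3), (6, 6), (7, 2), (8, 7)]
tree = build(points)
result = []
query(tree, 2, 7, 2, 6, result)
print(sorted(result))  # [(4, 3), (6, 6), (7, 2)]
```

The root's array holds all n points and every level of the primary tree holds n points in total, which mirrors the O(n log n) space bound.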
The 3-Dimensional Range Tree • Primary structure: • Search tree on x-coordinates of points • Every node stores a secondary structure: • 2-dimensional range tree on points in the node’s slab. Every point is stored in secondary structures of O(log n) nodes. Space: O(n log^2 n)
Answering Queries • Every 3-dimensional range query decomposes into O(log n) 2-dimensional range queries • Each such query takes O(log^2 n + k′) time • Total query complexity: • O(log^3 n + k)
d-Dimensional Range Queries • Primary structure: search tree on x-coordinates • Secondary structures: (d – 1)-dimensional range trees • Space requirement: O(n log^(d-1) n) • Query time: O(log^d n + k)
Updates are difficult! Insertion or deletion of a point p in a 2-dim range tree requires: • Insertion or deletion of p into or from the primary range tree Tx according to the x-coordinate of p. • For each node v on the search path to the leaf storing p in Tx, insertion or deletion of p in the associated secondary range tree Ty(v). Keeping the primary range tree balanced is difficult, except for the case d = 1! Rotations in the primary tree may require completely rebuilding the associated range trees along the search path!
Range Trees: Summary • Theorem: There exists a data structure to represent a static set S of n points in d dimensions with the following properties: • The data structure allows us to answer range queries in O(log^d n + k) time. • The data structure occupies O(n log^(d-1) n) space. • Note: The query complexity can be reduced to O(log^(d-1) n + k), for d ≥ 2, using a very beautiful technique called fractional cascading.
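The bounds of the theorem follow by induction on d from the 1-dim case. The recurrences below are a sketch of that argument, not taken from the slides: S_d and Q_d are assumed names for the space and query time of a d-dim range tree, d is treated as a constant, and k_i denotes the number of points reported by the i-th subquery, so the k_i sum to k.

```latex
\begin{align*}
S_1(n) &= O(n), &
S_d(n) &\le O(\log n) \cdot S_{d-1}(n)
  \;\Longrightarrow\; S_d(n) = O\bigl(n \log^{d-1} n\bigr), \\
Q_1(n) &= O(\log n + k), &
Q_d(n) &\le O(\log n) + \sum_{i=1}^{O(\log n)} \bigl( O(\log^{d-1} n) + k_i \bigr)
  = O\bigl(\log^{d} n + k\bigr).
\end{align*}
```

The space recurrence uses that the associated structures on one level of the primary tree store every point exactly once and that there are O(log n) levels. The query recurrence uses that a query visits O(log n) umbrella nodes and runs one (d – 1)-dim query at each of them.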