1 / 30

Multidimensional Range Search

This article explores the concept of multidimensional range search, specifically focusing on static collections of records and performing queries. It covers various data structures like k-d trees and range trees, and discusses their performance measures.

laurenc
Download Presentation

Multidimensional Range Search

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Multidimensional Range Search • Static collection of records. • No inserts, deletes, changes. • Only queries. • Each record has k key fields. • Multidimensional range query. • Given k ranges [li, ui], 1 <= i <= k. • Report all records in collection such that li <= ki <= ui, 1 <= i <= k.

  2. Multidimensional Range Search • All employees whose age is between 30 and 40 and whose salary is between $40K and $70K. • All cities with an annual rainfall between 40 and 60 inches, population between 100K and 200K, average temperature >= 70F, and number of horses between 1025 and 2500.

  3. Data Structures For Range Search • Unordered sequential list. • Sorted tables. • k tables. • Table i is sorted by i’th key. • Cells. • k-d trees. • Range trees. • k-fold trees. • k-ranges.

  4. Performance Measures • P(n,k). • Preprocessing time to construct search structure for n records, each has k key fields. • For many applications, this time needs only to be reasonable. • S(n,k). • Total space needed by the search structure. • Q(n,k). • Time needed to answer a query.

  5. k-d Tree • Binary tree. • At each node of tree, pick a key field to partition records in that subtree into two approximately equal groups. • Pick field i with max spread in values. • Select median key value, m. • Node stores i and m. • Records with ki <= m in left subtree. • Records with ki > m in right subtree. • Stop when partition size <= 8 or 16 (say).

  6. f a b c d g a e c b d e f g 2-d Example

  7. a c b d e f g Performance • P(n,k) = O(kn log n). • O(kn) time to select partition keys at each level. • O(n) time to find all medians and split at each level of the tree. • O(log n) levels.

  8. a c b d e f g Performance • S(n,k) = O(n). • O(n) needed for the n records. • Tree takes O(n) space.

  9. Performance • Q(n,k) depends on shape of query. • O(n1-1/k + s),k > 1,where s is number of records that satisfy the query. Bound on worst-case query time. • O(log n+ s), average time when query is almost cubical and a small fraction of the n records satisfy the query. • O(s), average time when query is almost cubical and a large fraction of the n records satisfy the query.

  10. 10 12 15 20 24 26 27 29 35 40 50 55 Range Trees—k=1 • Sorted array on single key. • P(n,1) = O(n log n). • S(n,1) = O(n). • Q(n,1) = O(log n + s).

  11. Range Trees—k=2 • Let the two key fields be x and y. • Binary search tree on x. • x value used in a node is the median x value for all records in that subtree. • Records with x value <= median are in left subtree. • Records with larger x value in right subtree.

  12. Range Trees—k=2 • Each node has a sorted array on y of all records in the subtree. • Root has sorted array of all n records. • Left and right subtrees, each have a sorted array of about n/2 records. • Stop partitioning when # records in a partition is small enough (say 8).

  13. a SA c b d e f g Example • a-g are x values. • x-range of a node begins at min x value in subtree and ends at max x value in subtree.

  14. a SA c b d e f g Example—Search • If x-range of root is contained in x-range of query, search SA for records that satisfy y-range of query. Done. query x-range root x-range

  15. a SA c b d e f g Example—Search • If entire x-range of query <= x(> x)value in root, recursively search left (right) subtree. query x-range root x-value

  16. a SA c b d e f g Example—Search • If x-range of query containsvalue in root, recursively search left and right subtrees. query x-range root x-value

  17. a SA c b d e f g Performance • P(n,2) = O(n log n). • O(n log n) – sort all records by y for the SAs. • O(n) time to find all medians at each level of the tree.

  18. a SA c b d e f g Performance • P(n,2) = O(n log n). • O(n) time to construct SAs at each level of the tree from SAs at preceding level. • O(log n) levels.

  19. a SA c b d e f g Performance • S(n,2) = O(n log n). • O(n) needed for the SAs and nodes at each level. • O(log n) levels.

  20. a SA c b d e f g Performance • Q(n,2) = O(log2 n + s). • At each level of the binary search tree, at most 2 SAs are searched. • O(log n) levels.

  21. Range Trees—k=3 • Let the three key fields be w, x and y. • Binary search tree on w. • w value used in a node is the median w value for all records in that subtree. • Records with w value <= median in left subtree. • Records with larger w value in right subtree.

  22. Range Trees—k=3 • Each node has a 2-d range tree on x and y of all records in the subtree. • Stop partitioning when # records in a partition is small enough (say 8).

  23. a 2-d c c b d e f g Example • a-g are w values. • w-range of a node begins at min w value in subtree and ends at max w value in subtree.

  24. a 2-d c c c b d e f g Example—Search • If w-range of root is contained in w-range of query, search 2-d range tree in root for records that satisfy x- and y-ranges of query. Done. • If entire w-range of query <= w(> w) value in root, recursively search left (right) subtree.

  25. a 2-d c c c b d e f g Example—Search • If w-range of query containsvalue in root, recursively search left and right subtrees.

  26. a 2-d c c c b d e f g Performance —3-d Range Tree • P(n,3) = O(n log2 n). • O(n) time to find all medians at each level of the tree.

  27. a 2-d c c c b d e f g Performance —3-d Range Tree • P(n,3) = O(n log2 n). • O(n log n) time to construct 2-d range trees at each level of the tree from data at preceding level. • O(log n) levels.

  28. a 2-d c c c b d e f g Performance —3-d Range Tree • S(n,3) = O(n log2 n). • O(n log n) needed for the 2-d range trees and nodes at each level. • O(log n) levels.

  29. Performance —3-d Range Tree • Q(n,3) = O(log3 n + s). • At each level of the binary search tree, at most 2 2-d range trees are searched. • O(log2 n + si) time to search each 2-d range tree. si is # records in the searched 2-d range tree that satisfy query. • O(log n) levels.

  30. Performance—k-d Range Tree • P(n,k) = O(n logk-1 n), k > 1. • S(n,k) = O(n logk-1 n). • Q(n,k) = O(logk n + s).

More Related