Improving Min/Max Aggregation over Spatial Objects

Donghui Zhang, Vassilis J. Tsotras. University of California, Riverside. ACM GIS’01.

Presentation Transcript


  1. Improving Min/Max Aggregation over Spatial Objects. Donghui Zhang, Vassilis J. Tsotras, University of California, Riverside. ACM GIS’01

  2. Outline • Problem Definition • Straightforward Solutions • Our Solution • Performance Results • By-Product: Optimized the MSB-tree • Conclusions

  3. Problem Definition • Consider a collection of spatial objects. • Each object has a rectangle r and a value v. • Spatial aggregation: find the aggregate value over all objects intersecting a given query rectangle. We focus on MAX. • E.g., in a database of rainfall over geographical areas, find the maximum rainfall in the Los Angeles area.
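As a baseline for the problem just defined, a linear scan can compute the MAX aggregate directly. This is only a minimal sketch: the names `Rect`, `SpatialObject`, `intersects`, and `max_over`, and the sample rainfall data, are illustrative assumptions and do not come from the presentation.

```python
from typing import NamedTuple, Optional, Sequence


class Rect(NamedTuple):
    x1: float
    y1: float
    x2: float
    y2: float


class SpatialObject(NamedTuple):
    box: Rect     # the rectangle r
    value: float  # the value v


def intersects(a: Rect, b: Rect) -> bool:
    """True when the two closed rectangles share at least one point."""
    return a.x1 <= b.x2 and b.x1 <= a.x2 and a.y1 <= b.y2 and b.y1 <= a.y2


def max_over(objects: Sequence[SpatialObject], query: Rect) -> Optional[float]:
    """Linear-scan MAX aggregate; returns None when no object intersects."""
    values = [o.value for o in objects if intersects(o.box, query)]
    return max(values) if values else None


# Made-up rainfall records and a made-up query rectangle.
rainfall = [
    SpatialObject(Rect(0, 0, 10, 10), 3.0),
    SpatialObject(Rect(5, 5, 20, 20), 7.5),
    SpatialObject(Rect(50, 50, 60, 60), 12.0),
]
los_angeles = Rect(8, 8, 12, 12)
result = max_over(rainfall, los_angeles)
```

The scan touches every object, which is the cost the index structures in the following slides try to avoid.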

  4. Straightforward Solutions • Use an R*-tree [BKS+90] to index the objects; the query then reduces to a range search. • Better approach: the aR-tree [PKZ+01, LM01], which stores the MAX of each sub-tree in its internal nodes. • If the query rectangle contains a sub-tree, there is no need to search it.
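The aR-tree idea on this slide can be sketched as a recursive search that stops as soon as the query rectangle contains a sub-tree's bounding box. The `Node` layout and function names below are assumptions made for illustration, not the data structure from [PKZ+01, LM01]; a node without children stands for a data object.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), closed rectangle


def intersects(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(outer: Box, inner: Box) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


@dataclass
class Node:
    box: Box          # bounding box of the sub-tree (or the object's own box)
    max_value: float  # MAX over all objects below this node
    children: List["Node"] = field(default_factory=list)


def query_max(node: Node, q: Box) -> Optional[float]:
    """MAX over objects intersecting q, or None if there are none."""
    if not intersects(node.box, q):
        return None
    # A data object that intersects q, or a sub-tree fully inside q:
    # the stored MAX is exact, so no further descent is needed.
    if not node.children or contains(q, node.box):
        return node.max_value
    hits = [r for r in (query_max(c, q) for c in node.children) if r is not None]
    return max(hits) if hits else None


n1 = Node((0, 0, 4, 4), 9.0, [Node((0, 0, 2, 2), 5.0), Node((3, 3, 4, 4), 9.0)])
n2 = Node((10, 10, 15, 15), 20.0,
          [Node((10, 10, 12, 12), 20.0), Node((14, 14, 15, 15), 1.0)])
root = Node((0, 0, 15, 15), 20.0, [n1, n2])
```

In this toy tree, a query such as `(-1, -1, 5, 5)` is answered from the MAX stored at `n1` without visiting its children, while `(13, 13, 20, 20)` only partially overlaps `n2` and forces a descent.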

  6. Our Solution: Overview • The MR-tree: a specialized index for Min/Max aggregation. It uses the R*-tree and four optimization techniques: • k-max: increase the chance for the search algorithm to stop at higher tree levels; • box-elimination: erase information from the tree that will not contribute to any query; • union: do not insert an object that will not contribute to any query; • area-reduction: reduce the area of the object to be inserted.

  7. The k-max Optimization • Motivation: The aR-tree is not efficient if the query rectangle intersects but does not fully contain a sub-tree rectangle.

  9. The k-max Optimization • Along with each index record r, store the k max-value objects in sub-tree(r). • Upon query, if the query rectangle intersects any of the k objects at r, omit sub-tree(r): the largest value among the intersected objects is the MAX for that sub-tree. • Trade-off: a larger k lets more sub-trees be omitted during queries, but costs more space and update time.
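The k-max shortcut can be sketched as a scan over the k stored objects in descending value order. The function name `kmax_shortcut` and the tuple layout are assumptions for illustration. The first stored object that intersects the query gives the exact MAX for the sub-tree, because every stored object with a higher value was already ruled out and every object outside the list has a value no larger than those in it.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), closed rectangle


def intersects(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def kmax_shortcut(top_k: List[Tuple[Box, float]], query: Box) -> Optional[float]:
    """Return the sub-tree's MAX if one of the k stored objects intersects
    the query; return None when the sub-tree still has to be searched."""
    for box, value in sorted(top_k, key=lambda entry: entry[1], reverse=True):
        if intersects(box, query):
            return value
    return None


# Hypothetical k = 3 objects kept with one index record.
top_k = [((0, 0, 1, 1), 9.0), ((5, 5, 6, 6), 7.0), ((2, 2, 8, 8), 4.0)]
answer = kmax_shortcut(top_k, (4, 4, 5.5, 5.5))
```

When the function returns None, the query falls back to the regular descent into the sub-tree.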

  10. The box-elimination Optimization • Motivation: if, for objects o1 and o2, o1.box contains o2.box and o1.value ≥ o2.value, then o2 is obsolete, i.e. it does not contribute to any query and can thus be deleted.

  11. The box-elimination Optimization • Similar for an object o1 and an index record r2: if o1.box contains r2.box and o1.value ≥ the max value in sub-tree(r2), the whole sub-tree is obsolete. • The optimization: at insertion, remove obsolete objects and sub-trees along the insertion path. • Ideally, remove all obsolete objects/sub-trees, but this is too expensive. Instead, pick c paths (c is a constant). • Trade-off: a larger c gives a smaller index and faster queries, but also more update time.
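The obsolescence test behind box-elimination can be sketched for a single leaf as follows. This simplified version works on a flat list of entries and does not model the choice of c insertion paths; the names `made_obsolete_by` and `insert_into_leaf` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), closed rectangle
Entry = Tuple[Box, float]                # (box, value or sub-tree max value)


def contains(outer: Box, inner: Box) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def made_obsolete_by(new_box: Box, new_value: float,
                     box: Box, max_value: float) -> bool:
    """True if an object (or a sub-tree summarized by box and max_value)
    can no longer affect any MAX query once the new object is present."""
    return contains(new_box, box) and new_value >= max_value


def insert_into_leaf(entries: List[Entry], new_box: Box,
                     new_value: float) -> List[Entry]:
    """Drop entries made obsolete by the new object, then add it."""
    kept = [(b, v) for b, v in entries
            if not made_obsolete_by(new_box, new_value, b, v)]
    kept.append((new_box, new_value))
    return kept


leaf = [((1, 1, 2, 2), 3.0), ((0, 0, 5, 5), 1.0), ((1, 1, 3, 3), 8.0)]
leaf_after = insert_into_leaf(leaf, (0, 0, 4, 4), 5.0)
```

Only the first entry is dropped here: the second is not contained in the new box, and the third has a larger value than the new object.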

  12. The union Optimization • Motivation 1: if a new object o1 is obsolete due to an existing object o2, o1 should not be inserted. • Motivation 2: a new object o1 may be obsolete due to the union of several existing objects.

  14. The union Optimization • Along with each index record r, store the union of the boxes of all objects in sub-tree(r); also store the MIN value of all these objects. • Do not perform the insertion of object o1 if: • o1.box is contained in r.union, and • o1.value ≤ r.min. • Question: how is the union computed and stored?

  15. The union Optimization • Store an approximate union representation using t boxes (t is a constant). • The approximation should be fully contained in the actual union, and should cover as much space as possible. • Def.: given a set of n boxes S = {s1, …, sn}, the covered t-union of S is a set of t boxes A = {a1, …, at} such that • the union of the si covers every ai, and • the ai together cover the maximum possible area.

  16. The union Optimization • Computing the exact covered t-union takes O(n^(2t+4)). • We propose a much faster approximate algorithm: O(n log n). • Idea of our algorithm: pick the t largest boxes and expand them.
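The skip-insertion test of the union optimization, applied to a union stored as a small list of boxes, might look like the sketch below. It does not reproduce the authors' covered t-union algorithm. It assumes the t boxes are already given and uses a deliberately conservative containment check (the new box must fit inside a single stored box). The name `can_skip_insertion` is hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), closed rectangle


def contains(outer: Box, inner: Box) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def can_skip_insertion(union_boxes: List[Box], min_value: float,
                       new_box: Box, new_value: float) -> bool:
    """True if the new object is certainly obsolete with respect to an
    index record whose approximate union is union_boxes and whose MIN
    object value is min_value."""
    if new_value > min_value:
        return False
    # Conservative simplification: a box covered only jointly by several
    # stored boxes is not detected, so it would simply be inserted.
    return any(contains(u, new_box) for u in union_boxes)


# Hypothetical record with t = 2 boxes and MIN value 4.0.
r_union = [(0, 0, 10, 10), (10, 0, 20, 5)]
r_min = 4.0
```

The test is safe because every point of the new box lies inside some existing object whose value is at least `r_min`, so any query that would see the new object already sees an object with an equal or larger value.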

  17. The area-reduction Optimization • Motivation: the box of a new object o1 can be reduced if an existing object o2 intersects it with a larger or equal value.

  19. The area-reduction Optimization • Reduce the area of a new object o1 when: • there exists an index record r such that r.union intersects o1.box and r.min ≥ o1.value, or • one of the k max-value objects intersects o1 with a larger or equal value, or • there exists a leaf object o2 such that o2.box intersects o1.box and o2.value ≥ o1.value.
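One way to picture area-reduction is to trim the new box whenever an existing box with a larger or equal value covers a full strip along one of its edges. The presentation does not say exactly how the box is shrunk, so the geometry below is an assumption, as are the names `reduce_box` and `area_reduce`.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), closed rectangle


def reduce_box(new: Box, other: Box) -> Optional[Box]:
    """Trim the part of `new` covered by `other` when the result is still a
    rectangle; return None if `new` is covered entirely."""
    nx1, ny1, nx2, ny2 = new
    ox1, oy1, ox2, oy2 = other
    spans_x = ox1 <= nx1 and ox2 >= nx2
    spans_y = oy1 <= ny1 and oy2 >= ny2
    if spans_x and spans_y:
        return None
    if spans_y:
        if ox1 <= nx1 <= ox2:    # other covers a strip on the left edge
            nx1 = ox2
        elif ox1 <= nx2 <= ox2:  # other covers a strip on the right edge
            nx2 = ox1
    elif spans_x:
        if oy1 <= ny1 <= oy2:    # other covers a strip on the bottom edge
            ny1 = oy2
        elif oy1 <= ny2 <= oy2:  # other covers a strip on the top edge
            ny2 = oy1
    return (nx1, ny1, nx2, ny2)


def area_reduce(new_box: Box, new_value: float,
                existing: List[Tuple[Box, float]]) -> Optional[Box]:
    """Apply reduce_box for every existing box whose value is >= new_value."""
    box: Optional[Box] = new_box
    for other_box, other_value in existing:
        if other_value >= new_value:
            box = reduce_box(box, other_box)
            if box is None:
                return None  # the new object is obsolete
    return box


existing = [((-1, -1, 4, 11), 6.0), ((8, -5, 20, 20), 5.0), ((0, 0, 10, 3), 1.0)]
reduced = area_reduce((0, 0, 10, 10), 5.0, existing)
```

Any query that intersects the trimmed-away part also intersects the existing object that covered it, so MAX results are unchanged by the reduction.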

  21. The area-reduction Optimization • Benefit 1: reduces overlap among sibling nodes. • Benefit 2: increases the chance of making new objects obsolete.

  22. Performance Results • Datasets: 5 million square objects, with sizes randomly chosen from 10 to 10,000 (the space in each dimension ranges from 1 to one million). • Implemented algorithms: • R*: the R*-tree [BKS+90]; • aR: the aR-tree [PKZ+01, LM01]; • kaR: the aR-tree with the k-max optimization; • MR: the MR-tree (with all the optimizations).

  23. Index Sizes

  24. Query Performance (log scale) • Query time is the total over 100 random queries with the same query rectangle size.

  25. Optimizing the MSB-tree • The MSB-tree [YW00] efficiently maintains and computes MIN/MAX aggregates over 1-dimensional interval data. • Insertion/Query: O(log_B m), where B is the page capacity and m is the number of leaf records. • [YW00] periodically reconstructs the whole tree to keep m small. During reconstruction, the index is off-line. • Reconstruction can be avoided by applying the box-elimination optimization. Idea: if a new interval with a larger value contains all intervals in a sub-tree, the sub-tree is obsolete.

  26. Conclusions • Addressed the MIN/MAX aggregation problem over spatial objects; • Four optimization techniques; • The MR-tree; • Much smaller index size and query time; • By-product: optimized the MSB-tree. Thank You!