
CSIS 7101: Spatial Data (Part 2) Efficient Processing of Spatial Joins Using R-trees


Presentation Transcript


  1. CSIS 7101: Spatial Data (Part 2) • Efficient Processing of Spatial Joins Using R-trees • Rollo Chan Chu Chung Man Mak Wai Yip Vivian Lee Eric Lo Sindy Shou Hugh Wang

  2. Efficient Processing of Spatial Joins Using R-trees • What is Spatial Data? • Consists of points, lines, rectangles, polygons, surfaces… • Two types of queries in DBS • Single-scan and multiple-scan queries • How to retrieve spatial objects in GIS efficiently? • Spatial Access Method (SAM) – e.g. R*-tree

  3. What is a Spatial Access Method? • Designed to support single-scan queries • e.g. window query • “Find all objects which intersect a given window” • Attempts to store objects which are close together in the data space on a common page • Reduces the number of disk accesses

  4. How is a window query processed by a SAM? • 1) Filter step • Find all objects whose minimum bounding rectangles intersect the query rectangle • 2) Refinement step • Check whether the candidate objects themselves fulfill the query condition
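
The two steps can be sketched in Python as follows. This is only an illustration under assumptions that are not in the presentation: the objects are circles (chosen because their exact test against a rectangle is short), the filter step is a linear scan rather than an index lookup, and the names `Circle`, `mbr`, and `window_query` are hypothetical.

```python
from typing import List, NamedTuple, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)

class Circle(NamedTuple):
    name: str
    cx: float
    cy: float
    radius: float

def mbr(c: Circle) -> Rect:
    """Minimum bounding rectangle of a circle."""
    return (c.cx - c.radius, c.cy - c.radius, c.cx + c.radius, c.cy + c.radius)

def rects_intersect(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def circle_intersects_rect(c: Circle, w: Rect) -> bool:
    """Exact test: distance from the centre to the nearest point of w."""
    nx = min(max(c.cx, w[0]), w[2])
    ny = min(max(c.cy, w[1]), w[3])
    return (nx - c.cx) ** 2 + (ny - c.cy) ** 2 <= c.radius ** 2

def window_query(objects: List[Circle], window: Rect):
    # 1) Filter step: keep objects whose MBR intersects the window.
    candidates = [o for o in objects if rects_intersect(mbr(o), window)]
    # 2) Refinement step: test the exact geometry of each candidate.
    results = [o for o in candidates if circle_intersects_rect(o, window)]
    return candidates, results

objects = [
    Circle("a", 1.0, 1.0, 0.5),
    Circle("b", 3.0, 3.0, 1.0),  # MBR touches the window, the circle does not
    Circle("c", 9.0, 9.0, 1.0),
]
window = (0.0, 0.0, 2.2, 2.2)
candidates, results = window_query(objects, window)
print([o.name for o in candidates], [o.name for o in results])
```

Object `b` passes the filter step but is discarded by the refinement step, which is why the second step is needed.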

  5. What is a Spatial Join? • Combines two sets of spatial objects according to some spatial property • It is an important type of multiple-scan query in spatial DBS

  6. Example of Spatial Join • Two relations: forests, cities (Assume an attribute in each relation represents the borders of forests and cities) • Example query would be: • “Find all forests which are in a city”

  7. Problems when performing Spatial Join • It is too expensive in terms of CPU time and I/O time • Traditional index structures are not efficient for spatial join • How to make it more efficient? • R*-tree

  8. Why use the R*-tree for Spatial Join? • To optimize CPU time and I/O time • Fewer comparisons than a simple nested loop • Other algorithms cannot be efficiently applied to spatial join

  9. R*-tree Approach for Spatial Join • Suppose there are two R*-trees • R, S • Idea: • Use the property that directory rectangles form the minimum bounding box of the data rectangles in the corresponding subtrees. • Only if the rectangles of two directory entries ER and ES have a common intersection can there be a pair (rectR, rectS) of intersecting data rectangles in their subtrees; otherwise both subtrees can be skipped
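
A minimal Python sketch of this idea, assuming both trees have the same height and using plain in-memory nodes instead of disk pages. The classes `Entry` and `Node` and the function `spatial_join` are illustrative names, not taken from the presentation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)

def rects_intersect(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

@dataclass
class Entry:
    rect: Rect
    child: Optional["Node"] = None  # set in directory nodes
    obj: Optional[str] = None       # set in leaf nodes

@dataclass
class Node:
    entries: List[Entry]
    leaf: bool

def spatial_join(r: Node, s: Node, out: List[Tuple[str, str]]) -> None:
    """Descend both trees in step (assumes equal height)."""
    for er in r.entries:
        for es in s.entries:
            if not rects_intersect(er.rect, es.rect):
                continue  # disjoint directory rectangles: skip both subtrees
            if r.leaf and s.leaf:
                out.append((er.obj, es.obj))  # candidate pair for refinement
            else:
                spatial_join(er.child, es.child, out)

leaf_r1 = Node([Entry((0, 0, 1, 1), obj="a"), Entry((3, 3, 4, 4), obj="b")], leaf=True)
leaf_r2 = Node([Entry((6, 6, 7, 7), obj="c"), Entry((8, 8, 9, 9), obj="d")], leaf=True)
tree_r = Node([Entry((0, 0, 4, 4), child=leaf_r1), Entry((6, 6, 9, 9), child=leaf_r2)], leaf=False)

leaf_s1 = Node([Entry((0.5, 0.5, 2, 2), obj="x"), Entry((3.2, 3.2, 3.5, 3.5), obj="y")], leaf=True)
leaf_s2 = Node([Entry((10, 10, 12, 12), obj="z")], leaf=True)
tree_s = Node([Entry((0.5, 0.5, 3.5, 3.5), child=leaf_s1), Entry((10, 10, 12, 12), child=leaf_s2)], leaf=False)

pairs: List[Tuple[str, str]] = []
spatial_join(tree_r, tree_s, pairs)
print(pairs)
```

Only one of the four directory-entry combinations intersects in this example, so only one pair of leaf nodes is ever compared.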

  10. Minimum Bounding Box

  11. Is there any way to be more efficient? • There are two areas we need to take into account in order to be more efficient • CPU – Time Tuning • I/O – Time Tuning

  12. CPU – Time Tuning • Two ways to improve CPU – time • Restricting the search space • Spatial sorting and plane sweep

  13. Restricting the search space • Idea: • Scan each of the two nodes and mark all entries which are required for performing the join (i.e. those which intersect the intersection rectangle of the two nodes). • Then, each marked entry of one node is tested against all marked entries of the other node.

  14. Restricting the search space (cont’d) • Original: 7 of R * 7 of S = 49 joins • Now: 3 of R * 2 of S = 6 joins • Plus scanning: 7 of R + 7 of S = 14 times • (Figure: two overlapping nodes with entries numbered 1 to 7 in each.)
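
The marking step can be sketched as below. The rectangles are made-up example data, and `restricted_join` is a hypothetical helper name; the sketch only shows how the pairwise tests shrink once entries outside the common intersection are dropped.

```python
from typing import Dict, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)

def rects_intersect(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    xl, yl = max(a[0], b[0]), max(a[1], b[1])
    xu, yu = min(a[2], b[2]), min(a[3], b[3])
    return (xl, yl, xu, yu) if xl <= xu and yl <= yu else None

def restricted_join(r_mbr: Rect, r_entries: Dict[str, Rect],
                    s_mbr: Rect, s_entries: Dict[str, Rect]):
    """Return (intersecting pairs, number of pairwise tests performed)."""
    common = intersection(r_mbr, s_mbr)
    if common is None:
        return [], 0
    # One linear scan per node marks the entries that can take part in the join.
    marked_r = [n for n, rect in r_entries.items() if rects_intersect(rect, common)]
    marked_s = [n for n, rect in s_entries.items() if rects_intersect(rect, common)]
    pairs: List[Tuple[str, str]] = [
        (r, s) for r in marked_r for s in marked_s
        if rects_intersect(r_entries[r], s_entries[s])
    ]
    return pairs, len(marked_r) * len(marked_s)

r_entries = {"r1": (0, 0, 2, 2), "r2": (3, 3, 5, 5), "r3": (7, 1, 9, 3), "r4": (8.5, 6, 10, 10)}
s_entries = {"s1": (8, 0, 9.5, 2), "s2": (12, 4, 14, 6), "s3": (15, 8, 18, 10)}
pairs, tests = restricted_join((0, 0, 10, 10), r_entries, (8, 0, 18, 10), s_entries)
print(pairs, tests)  # 2 pairwise tests instead of 4 * 3 = 12
```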

  15. Spatial sorting and plane sweep • Idea: • Sort the entries in a node of the R*-tree according to the spatial location of the corresponding rectangles. • Then move the Sweep-Line perpendicular to one of the axes from left to right to compute the intersections.

  16. Example of Sorted Intersection Test • t = r1 : r1 <--> s1 • t = s1 : s1 <--> r2 • t = r2 : r2 <--> s2, r2 <--> s3 • t = s2 : - • t = r3 : r3 <--> s3 • (Figure: Sweep-Line moving across the rectangles, with r1.xu and s1.xl marked and the condition s1.xl < r1.xu.)
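
The sweep can be sketched in Python as follows. The coordinates are invented so that the tests occur in the same order as in the example above; they are not read from the slide's figure, and `sorted_intersection_test` is an illustrative name.

```python
from typing import List, NamedTuple, Tuple

class Rect(NamedTuple):
    name: str
    xl: float
    yl: float
    xu: float
    yu: float

def y_overlap(a: Rect, b: Rect) -> bool:
    return a.yl <= b.yu and b.yl <= a.yu

def sorted_intersection_test(rs: List[Rect], ss: List[Rect]) -> List[Tuple[str, str]]:
    """Plane sweep along the x-axis over two lists of rectangles."""
    rs = sorted(rs, key=lambda r: r.xl)
    ss = sorted(ss, key=lambda s: s.xl)
    out: List[Tuple[str, str]] = []
    i = j = 0
    while i < len(rs) and j < len(ss):
        if rs[i].xl <= ss[j].xl:
            t = rs[i]  # the sweep line stops at the left edge of t
            k = j
            # Only unprocessed S rectangles starting before t ends can intersect t.
            while k < len(ss) and ss[k].xl <= t.xu:
                if y_overlap(t, ss[k]):
                    out.append((t.name, ss[k].name))
                k += 1
            i += 1
        else:
            t = ss[j]
            k = i
            while k < len(rs) and rs[k].xl <= t.xu:
                if y_overlap(t, rs[k]):
                    out.append((rs[k].name, t.name))
                k += 1
            j += 1
    return out

rs = [Rect("r1", 0, 0, 2, 3), Rect("r2", 3, 4, 9, 8), Rect("r3", 7, 0, 10, 2)]
ss = [Rect("s1", 1, 2, 4, 5), Rect("s2", 5, 6, 6, 7), Rect("s3", 8, 1, 11, 5)]
result = sorted_intersection_test(rs, ss)
print(result)
```

Because both lists are sorted by their lower x-coordinate, each rectangle is compared only with rectangles of the other list whose x-extent overlaps its own, and no sweep-line status structure is needed.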

  17. I/O Time Tuning • To achieve good I/O performance with a buffer size as small as possible • The R*-tree might occupy only a small portion of the LRU buffer • Compute a read schedule of the pages to minimize the number of disk accesses • Local optimization policy based on spatial locality • Idea of Read Schedule: If a frequently used page stays in the buffer while it is still needed, the number of disk accesses is reduced considerably
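
To illustrate why the order of page reads matters, the sketch below counts buffer misses for two schedules that read the same pairs of pages in a different order. The page names and the buffer size of 2 are made-up values, and `count_disk_accesses` is a hypothetical helper.

```python
from collections import OrderedDict
from typing import Iterable

def count_disk_accesses(schedule: Iterable[str], buffer_size: int) -> int:
    """Simulate an LRU buffer and count how many page requests miss it."""
    buffer: "OrderedDict[str, bool]" = OrderedDict()
    misses = 0
    for page in schedule:
        if page in buffer:
            buffer.move_to_end(page)        # hit: mark as most recently used
        else:
            misses += 1                     # miss: the page is read from disk
            buffer[page] = True
            if len(buffer) > buffer_size:
                buffer.popitem(last=False)  # evict the least recently used page
    return misses

# The same four page pairs (r1,s1), (r1,s2), (r2,s1), (r2,s2) in two orders.
scattered = ["r1", "s1", "r2", "s2", "r1", "s2", "r2", "s1"]
grouped = ["r1", "s1", "r1", "s2", "r2", "s2", "r2", "s1"]

scattered_cost = count_disk_accesses(scattered, buffer_size=2)
grouped_cost = count_disk_accesses(grouped, buffer_size=2)
print(scattered_cost, grouped_cost)
```

The grouped schedule keeps a page in the buffer while it is still needed, so it causes fewer disk accesses with the same buffer size.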

  18. Three such techniques • Local plane sweep • Local plane sweep with pinning • Local z-order

  19. Local Plane-Sweep Order • Idea: • Based on spatial ordering, the plane-sweep algorithm creates a sequence of pairs of intersecting rectangles. • This sequence can be used to determine the read schedule of the spatial join.

  20. Read schedule: Local Plane-Sweep Order (cont’d) • (Figure: rectangles r1, r2, r3, r4 and s1, s2 with labels 1 to 6 and a six-element read schedule < , , , , , >.)

  21. Local Plane-Sweep Order w/ Pinning • Idea: • Determine a pair (Er, Es) of entries with respect to the local plane-sweep order. Compute the degree of the rectangles of both entries • Deg(E.rect) = # of intersections between E.rect and the rectangles which belong to entries of the other tree that are not yet processed • Pin the page in the buffer whose corresponding rectangle has maximal degree • Perform the spatial join of the pinned page with all other pages it intersects
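
The degree computation and the choice of the page to pin can be sketched as follows. The rectangles are invented example data; the assumptions that the current pair itself is excluded from the unprocessed lists and that ties go to the R page are mine, and `degree` and `choose_pin` are hypothetical names.

```python
from typing import Dict, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)

def rects_intersect(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def degree(rect: Rect, unprocessed_other: Dict[str, Rect]) -> int:
    """Number of not-yet-processed rectangles of the other tree that intersect rect."""
    return sum(1 for other in unprocessed_other.values() if rects_intersect(rect, other))

def choose_pin(er: Rect, es: Rect,
               unprocessed_r: Dict[str, Rect],
               unprocessed_s: Dict[str, Rect]) -> Tuple[str, int]:
    """Return which side's page to pin ("R" or "S") and its degree."""
    deg_r = degree(er, unprocessed_s)
    deg_s = degree(es, unprocessed_r)
    # Assumption: ties are resolved in favour of the R page.
    return ("R", deg_r) if deg_r >= deg_s else ("S", deg_s)

er = (0, 0, 4, 4)  # rectangle of the current entry Er
es = (3, 3, 8, 8)  # rectangle of the current entry Es
unprocessed_r = {"r2": (5, 5, 6, 6), "r3": (7, 7, 9, 9), "r4": (20, 20, 21, 21)}
unprocessed_s = {"s2": (1, 1, 2, 2), "s3": (10, 10, 11, 11)}

pin = choose_pin(er, es, unprocessed_r, unprocessed_s)
print(pin)
```

Here Er intersects one unprocessed rectangle of S while Es intersects two unprocessed rectangles of R, so the S page is pinned and joined with those pages before the sweep continues.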

  22. Local Plane-Sweep Order w/ Pinning (cont’d) • Er.rect = r1 • Es.rect = s2 • (Figure: rectangles r1, r2, r3, r4 and s1, s2 with Er and Es marked, annotated with Deg(r1) and Deg(s2) and the degree labels 0, 1, 2, 2.)

  23. Local Z-Order • Idea: • Compute the intersections between each rectangle of one node and all rectangles of the other node • Sort the rectangles according to the spatial location of their centers • Decompose the underlying space into cells of equal size and provide an ordering on this set of cells

  24. Local Z-Order (cont’d) • (Figure: rectangles r1, r2, r3, r4 and s1, s2 with regions labelled I to IV.) • Read schedule: <s1, r2, r1, s2, r4, r3>
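
One common way to order equal-sized cells is bit interleaving (a Morton code). The sketch below sorts rectangles by the z-value of the cell that contains their center. The grid resolution, the choice of putting the x bit in the lower position, and the names `interleave_bits`, `z_value`, and `z_order` are assumptions for illustration, and the data does not reproduce the slide's figure.

```python
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)

def interleave_bits(col: int, row: int, bits: int = 16) -> int:
    """Morton code: x bits in even positions, y bits in odd positions."""
    z = 0
    for i in range(bits):
        z |= ((col >> i) & 1) << (2 * i)
        z |= ((row >> i) & 1) << (2 * i + 1)
    return z

def z_value(rect: Rect, space: Rect, cells_per_axis: int) -> int:
    """z-value of the grid cell that contains the center of rect."""
    cx = (rect[0] + rect[2]) / 2
    cy = (rect[1] + rect[3]) / 2
    col = min(int((cx - space[0]) / (space[2] - space[0]) * cells_per_axis), cells_per_axis - 1)
    row = min(int((cy - space[1]) / (space[3] - space[1]) * cells_per_axis), cells_per_axis - 1)
    return interleave_bits(col, row)

def z_order(rects: Dict[str, Rect], space: Rect, cells_per_axis: int = 2) -> List[str]:
    return sorted(rects, key=lambda name: z_value(rects[name], space, cells_per_axis))

space = (0, 0, 8, 8)
rects = {
    "d": (5, 5, 7, 7),  # center (6, 6)
    "b": (5, 0, 7, 2),  # center (6, 1)
    "a": (0, 0, 2, 2),  # center (1, 1)
    "c": (0, 5, 2, 7),  # center (1, 6)
}
order = z_order(rects, space)
print(order)
```

Rectangles whose centers fall into neighbouring cells end up close together in the resulting order, which is the spatial locality a read schedule tries to exploit.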

  25. Number of Disk Accesses • (Figure: number of disk accesses plotted against the size of the LRU buffer; values shown: 5384, 5290, 2392, 2373.)

  26. Number of Disk Accesses (cont’d) • (Figure: number of disk accesses plotted against the size of the LRU buffer.)

  27. Q & A That’s it for the Presentation Any Questions?

  28. Reference • Brinkhoff, T., Kriegel, H.-P., Seeger, B. (1993). Efficient Processing of Spatial Joins Using R-trees. Institute of Computer Science, University of Munich. ACM SIGMOD, Washington, DC, USA.
