R* Trees
Presenters: Twisha Surender
R* Trees
• A variant of the R-tree
• Supports point and spatial data efficiently at the same time
• Implementation cost is only slightly higher than that of other R-trees
• Supports the map-overlay operation (spatial join)
• Example of a spatial join query: given two spatial relations S1 and S2, find all pairs (x, y) with x in S1 and y in S2 such that x rel y is true, where rel is a spatial predicate such as intersects or inside
• Completely dynamic: insertions and deletions can be intermixed with queries, with no periodic global reorganization
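To make the spatial join definition concrete, here is a minimal sketch that evaluates it by brute force over two small lists of rectangles. The `Rect` type and the function names are illustrative assumptions, not part of the slides; an index such as the R*-tree exists precisely to avoid this all-pairs scan.

```python
from typing import Callable, List, NamedTuple, Tuple


class Rect(NamedTuple):
    """Axis-aligned rectangle (hypothetical helper type for this sketch)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


def inside(a: Rect, b: Rect) -> bool:
    """True if rectangle a lies completely within rectangle b."""
    return (b.xmin <= a.xmin and a.xmax <= b.xmax and
            b.ymin <= a.ymin and a.ymax <= b.ymax)


def spatial_join(s1: List[Rect], s2: List[Rect],
                 rel: Callable[[Rect, Rect], bool]) -> List[Tuple[Rect, Rect]]:
    """All pairs (x, y), x in s1 and y in s2, for which rel(x, y) holds."""
    return [(x, y) for x in s1 for y in s2 if rel(x, y)]


if __name__ == "__main__":
    s1 = [Rect(0, 0, 2, 2), Rect(5, 5, 6, 6)]
    s2 = [Rect(1, 1, 3, 3), Rect(0, 0, 10, 10)]
    print(spatial_join(s1, s2, intersects))
    print(spatial_join(s1, s2, inside))
```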
R*-tree
• Optimization criteria:
  • (O1) Minimize the area covered by an index MBR (minimum bounding rectangle)
  • (O2) Minimize the overlap between index MBRs
  • (O3) Minimize the margin (perimeter) of an index rectangle
  • (O4) Optimize storage utilization
• It is sometimes impossible to optimize all of these criteria at the same time!
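The four criteria can be written down as small measurements on rectangles. The sketch below assumes 2-D rectangles and uses hypothetical helper names (`Rect`, `area`, `margin`, `overlap`, `storage_utilization`); it only illustrates what each criterion measures.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle (hypothetical helper type for this sketch)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def area(r: Rect) -> float:
    """(O1) Area covered by a rectangle."""
    return (r.xmax - r.xmin) * (r.ymax - r.ymin)


def margin(r: Rect) -> float:
    """(O3) Margin of a rectangle, i.e. its perimeter in 2-D."""
    return 2 * ((r.xmax - r.xmin) + (r.ymax - r.ymin))


def overlap(a: Rect, b: Rect) -> float:
    """(O2) Area of the intersection of two rectangles (0 if disjoint)."""
    w = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    h = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return w * h if w > 0 and h > 0 else 0.0


def storage_utilization(num_entries: int, capacity: int) -> float:
    """(O4) Fraction of a node's capacity M that is actually used."""
    return num_entries / capacity


if __name__ == "__main__":
    long_thin = Rect(0, 0, 16, 1)
    square = Rect(0, 0, 4, 4)
    # Same area, but the square has the smaller margin.
    print(area(long_thin), margin(long_thin))
    print(area(square), margin(square))
```

For a fixed area, the square has the smallest margin, which is why minimizing margin pushes directory rectangles toward square shapes.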
Differences between R* Trees and R Trees
• Optimized ChooseSubtree step for nodes whose children are leaves
• Revised node-split algorithm
• Forced reinsertion at node overflow
ChooseSubtree for Insertion
• ChooseSubtree:
  • If the next node is a leaf node, choose the entry using the following criteria, in order (later criteria break ties):
    • Least overlap enlargement
    • Least area enlargement
    • Smaller area
  • Else:
    • Least area enlargement
    • Smaller area
• Performs better especially for queries with small query rectangles on data files with non-uniformly distributed small rectangles or points
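The selection rule above can be sketched as a lexicographic minimum over the candidate entries. This is an illustrative sketch under simplifying assumptions: entries are represented only by their 2-D MBRs, and `Rect`, `choose_subtree`, and the `children_are_leaves` flag are hypothetical names.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def area(r: Rect) -> float:
    return (r.xmax - r.xmin) * (r.ymax - r.ymin)


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle enclosing both a and b."""
    return Rect(min(a.xmin, b.xmin), min(a.ymin, b.ymin),
                max(a.xmax, b.xmax), max(a.ymax, b.ymax))


def overlap(a: Rect, b: Rect) -> float:
    w = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    h = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return w * h if w > 0 and h > 0 else 0.0


def choose_subtree(children: List[Rect], new: Rect,
                   children_are_leaves: bool) -> int:
    """Return the index of the child MBR that should receive `new`."""

    def area_enlargement(i: int) -> float:
        return area(union(children[i], new)) - area(children[i])

    def overlap_enlargement(i: int) -> float:
        # Growth in total overlap with the sibling MBRs if child i absorbs `new`.
        grown = union(children[i], new)
        others = children[:i] + children[i + 1:]
        return (sum(overlap(grown, o) for o in others)
                - sum(overlap(children[i], o) for o in others))

    if children_are_leaves:
        key = lambda i: (overlap_enlargement(i), area_enlargement(i),
                         area(children[i]))
    else:
        key = lambda i: (area_enlargement(i), area(children[i]))
    return min(range(len(children)), key=key)


if __name__ == "__main__":
    kids = [Rect(0, 0, 1.5, 2), Rect(4, 0, 6, 2), Rect(3.5, 1.5, 3.9, 20)]
    point = Rect(3, 1, 3, 1)
    print(choose_subtree(kids, point, children_are_leaves=True))
    print(choose_subtree(kids, point, children_are_leaves=False))
```

In the example, the two rules pick different entries: growing the second rectangle needs the least area but would make it overlap the third, so the overlap-based rule prefers the first.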
Split Node
• SplitNode
  • Choose the axis to split along (ChooseSplitAxis)
  • Choose the two groups along the chosen axis (ChooseSplitIndex)
• ChooseSplitAxis
  • Along each axis, sort the rectangles and break them into two groups; there are M-2m+2 possible ways in which each group contains at least m rectangles
  • For each axis, compute S, the sum of the margin-values (perimeters) of both groups over all of these groupings
  • Choose the axis that minimizes S
• ChooseSplitIndex
  • Along the chosen axis, choose the grouping with the minimum overlap-value; resolve ties by the minimum area-value
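The two split steps can be sketched as follows for 2-D rectangles. Assumptions in this sketch: the overflowing node holds M+1 entries given only as MBRs, each axis is sorted twice (by lower and by upper coordinate), and all names (`Rect`, `distributions`, `split`) are hypothetical.

```python
from typing import Iterator, List, NamedTuple, Sequence, Tuple


class Rect(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def area(r: Rect) -> float:
    return (r.xmax - r.xmin) * (r.ymax - r.ymin)


def margin(r: Rect) -> float:
    return 2 * ((r.xmax - r.xmin) + (r.ymax - r.ymin))


def overlap(a: Rect, b: Rect) -> float:
    w = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    h = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return w * h if w > 0 and h > 0 else 0.0


def mbr(rects: Sequence[Rect]) -> Rect:
    """Minimum bounding rectangle of a group."""
    return Rect(min(r.xmin for r in rects), min(r.ymin for r in rects),
                max(r.xmax for r in rects), max(r.ymax for r in rects))


Groups = Tuple[List[Rect], List[Rect]]


def distributions(ordered: List[Rect], m: int) -> Iterator[Groups]:
    """The M-2m+2 ways to cut a sorted list so each group has >= m entries."""
    for k in range(m, len(ordered) - m + 1):
        yield ordered[:k], ordered[k:]


def sorts_for_axis(rects: List[Rect], axis: int) -> List[List[Rect]]:
    # Field index `axis` is the lower bound, `axis + 2` the upper bound.
    return [sorted(rects, key=lambda r: r[axis]),
            sorted(rects, key=lambda r: r[axis + 2])]


def split(rects: List[Rect], m: int) -> Tuple[int, Groups]:
    """Return (chosen axis, (group1, group2)) for an overflowing node."""

    def margin_sum(axis: int) -> float:  # ChooseSplitAxis: the value S
        return sum(margin(mbr(g1)) + margin(mbr(g2))
                   for order in sorts_for_axis(rects, axis)
                   for g1, g2 in distributions(order, m))

    axis = min((0, 1), key=margin_sum)

    def cost(groups: Groups) -> Tuple[float, float]:  # ChooseSplitIndex
        b1, b2 = mbr(groups[0]), mbr(groups[1])
        return overlap(b1, b2), area(b1) + area(b2)

    candidates = [d for order in sorts_for_axis(rects, axis)
                  for d in distributions(order, m)]
    return axis, min(candidates, key=cost)


if __name__ == "__main__":
    entries = [Rect(0, 0, 1, 1), Rect(1, 1, 2, 2), Rect(0, 1, 1, 2),
               Rect(10, 0, 11, 1), Rect(10, 1, 11, 2)]
    print(split(entries, m=2))  # M = 4, so the node holds M + 1 = 5 entries
```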
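Forced Reinsert
• Forced reinsert:
  • Defer splits: instead of splitting an overflowing node immediately, temporarily delete some of its entries, shrink the node's MBR, and re-insert those entries
• Which entries to re-insert? Those whose rectangles lie farthest from the center of the node's MBR
• How many? About 30% of the maximum number of entries per node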
Forced Reinsert
• OverflowTreatment(level)
  • If the level is not the root level and this is the first call to OverflowTreatment for this level during the current insertion
    • Then invoke Reinsert
    • Else invoke Split
• Reinsert
  • Sort the entries in decreasing order of the distance between their rectangle's center and the center of the node's bounding rectangle
  • Remove the first p entries and adjust the bounding rectangle
  • Invoke Insert to re-insert the removed entries
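The decision logic and the choice of entries to remove can be sketched as below. Assumptions in this sketch: entries are 2-D MBRs, p is taken as 30% of the node capacity M using integer arithmetic, the set `levels_seen` stands in for "first call on this level during the current insertion", and all names are hypothetical. The actual re-insertion into the tree is left out.

```python
import math
from typing import List, NamedTuple, Set, Tuple


class Rect(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def mbr(rects: List[Rect]) -> Rect:
    return Rect(min(r.xmin for r in rects), min(r.ymin for r in rects),
                max(r.xmax for r in rects), max(r.ymax for r in rects))


def center(r: Rect) -> Tuple[float, float]:
    return (r.xmin + r.xmax) / 2, (r.ymin + r.ymax) / 2


def overflow_treatment(level: int, root_level: int,
                       levels_seen: Set[int]) -> str:
    """Decide between forced reinsert and split for an overflowing node.

    `levels_seen` must be reset (emptied) at the start of every insertion.
    """
    if level != root_level and level not in levels_seen:
        levels_seen.add(level)
        return "reinsert"
    return "split"


def pick_reinsert_entries(entries: List[Rect],
                          capacity: int) -> Tuple[List[Rect], List[Rect]]:
    """Split an overflowing node's entries into (kept, removed).

    The p entries farthest from the node's center are removed, where
    p is 30% of the capacity M (an assumption: rounded down, at least 1).
    """
    p = max(1, (3 * capacity) // 10)
    cx, cy = center(mbr(entries))

    def distance(r: Rect) -> float:
        ex, ey = center(r)
        return math.hypot(ex - cx, ey - cy)

    ordered = sorted(entries, key=distance, reverse=True)
    removed, kept = ordered[:p], ordered[p:]
    return kept, removed


if __name__ == "__main__":
    node = [Rect(i, 0, i + 1, 1) for i in range(11)]  # M = 10, one too many
    kept, removed = pick_reinsert_entries(node, capacity=10)
    print("shrunken MBR:", mbr(kept))
    print("to re-insert:", removed)
```

After the removal, the node's bounding rectangle is recomputed from the kept entries, and the removed entries go back through the normal insert path, where they may land in a neighboring node.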
Forced Reinsert
• Forced reinsert moves entries between neighboring nodes and thereby decreases the overlap
• Storage utilization is improved
• Because of the restructuring, fewer splits occur
• Since the outer rectangles of a node are re-inserted, the directory rectangles become more quadratic (closer to squares)
R* Trees – Why Robust?
• For every query file and every data file tested, fewer disk accesses are required than for any of the other variants
• Highly robust against ugly data distributions
Performance
• Significant improvement over the other R-tree variants
• In spite of forced reinsertion, the average insertion cost is not increased; it is essentially lower than that of the other R-tree variants
• Efficiently supports point and spatial data at the same time
References:
• Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger: The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles
• www.corelab.ntua.gr/courses/ds.grad/lect2NTUA07.ppt