File Processing : Multi-dimensional Index • 2018, Spring • Pusan National University • Ki-Joune Li
Multi-Dimensional Index • Multi-Attribute Query vs. Single-Attribute Query • Single Attribute : Only ONE attribute specifies the query condition • Example : Find students whose record is in [3.5, 4.5] • Multi-Attribute : Several attributes specify the query condition • Example : Find students whose height is greater than 180 cm and whose weight is less than 70 kg • Each attribute corresponds to a dimension • Multi-Attribute Query = Multi-Dimensional Query
Processing Multi-dimensional Queries • Example : Find students whose height > 180 cm and weight < 70 kg • Method 1 : Using a B+-tree • Step 1 : Use the B+-tree to find students taller than 180 cm • Step 2 : From the result of Step 1, select students lighter than 70 kg • Which order : height then weight, or weight then height? • [Figure: Height-Weight plane with the Result region bounded by height 180 and weight < 70]
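The two steps of Method 1 can be sketched as follows. This is a minimal illustration, not the lecture's implementation: a sorted Python list searched with `bisect` stands in for the B+-tree on height, and the `students` table and the `method1` function are hypothetical.

```python
import bisect

# Hypothetical sample data: (name, height_cm, weight_kg)
students = [
    ("Ann", 185, 68),
    ("Bob", 178, 65),
    ("Cho", 190, 82),
    ("Dae", 182, 69),
    ("Eun", 170, 55),
]

# A sorted list of (height, row_id) stands in for a B+-tree on height.
height_index = sorted((h, i) for i, (_, h, _) in enumerate(students))

def method1(min_height, max_weight):
    # Step 1: range search on the height index (height > min_height).
    start = bisect.bisect_right(height_index, (min_height, float("inf")))
    candidates = [row_id for _, row_id in height_index[start:]]
    # Step 2: scan the candidates and keep those with weight < max_weight.
    return sorted(students[i][0] for i in candidates
                  if students[i][2] < max_weight)

result = method1(180, 70)
print(result)
```

Step 2 touches every candidate from Step 1, so the cost depends on how selective the indexed attribute is. That is why the choice of which attribute to index matters.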
Processing Multi-dimensional Queries • Method 2 : Using Two B+-trees • Step 1 : Result1 ← Students taller than 180 cm, found by the B+-tree on height • Step 2 : Result2 ← Students lighter than 70 kg, found by the B+-tree on weight • Step 3 : Result ← Result1 ∩ Result2 • Comparison of Method 1 and Method 2 • [Figure: the height > 180 region and the weight < 70 region; their intersection is the Result]
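A sketch of Method 2 under the same assumptions: two sorted lists stand in for the two B+-trees, and the data and the `method2` function are hypothetical.

```python
import bisect

# Hypothetical sample data: (name, height_cm, weight_kg)
students = [
    ("Ann", 185, 68),
    ("Bob", 178, 65),
    ("Cho", 190, 82),
    ("Dae", 182, 69),
    ("Eun", 170, 55),
]

# Two sorted lists stand in for the two B+-trees.
height_index = sorted((h, i) for i, (_, h, _) in enumerate(students))
weight_index = sorted((w, i) for i, (_, _, w) in enumerate(students))

def method2(min_height, max_weight):
    # Step 1: Result1 <- row ids with height > min_height.
    start = bisect.bisect_right(height_index, (min_height, float("inf")))
    result1 = {i for _, i in height_index[start:]}
    # Step 2: Result2 <- row ids with weight < max_weight.
    end = bisect.bisect_left(weight_index, (max_weight, -1))
    result2 = {i for _, i in weight_index[:end]}
    # Step 3: Result <- Result1 intersected with Result2.
    return sorted(students[i][0] for i in result1 & result2)

result = method2(180, 70)
print(result)
```

Here both index searches return more rows than the final answer (3 and 4 row ids for a result of 2). This is the overhead to weigh against the sequential filtering of Method 1.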
Processing Multi-dimensional Queries • Method 3 : Unified Index for Several Attributes • One index for several attributes • Multi-Dimensional Space • Two approaches • Extending B+-tree • Extending Dynamic Hashing • [Figure: one index for Height and Weight over the Height-Weight plane]
Extending Hashing : Grid Approach • Fixed Grid Method : Fixed • Grid File : Variable • [Figure: Height-Weight grid; Block Pointer Array whose cells each hold a block pointer; Query]
Extending Hashing : Grid File • Directory • [Figure: Directory cell with corners (x1, y1) and (x2, y2) and its Block Pointer; Query]
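A minimal sketch of a grid-file directory, assuming two linear scales (the variable split points on each axis) and a dictionary that maps each cell to a block. The scale values, block names, and functions are hypothetical.

```python
import bisect

# Hypothetical linear scales: variable split points on each axis.
height_scale = [175, 185]   # cells: < 175, [175, 185), >= 185
weight_scale = [60, 70]     # cells: < 60,  [60, 70),   >= 70

def cell_of(height, weight):
    # Locate the directory cell that contains a point.
    return (bisect.bisect_right(height_scale, height),
            bisect.bisect_right(weight_scale, weight))

# Directory: cell -> block pointer. Several cells may share one block.
directory = {
    (0, 0): "B0", (0, 1): "B0", (0, 2): "B0",
    (1, 0): "B1", (1, 1): "B1", (1, 2): "B2",
    (2, 0): "B3", (2, 1): "B3", (2, 2): "B2",
}

def blocks_for_query(h_lo, h_hi, w_lo, w_hi):
    # A range query reads only the blocks of the cells it overlaps.
    cx_lo, cy_lo = cell_of(h_lo, w_lo)
    cx_hi, cy_hi = cell_of(h_hi, w_hi)
    return sorted({directory[(cx, cy)]
                   for cx in range(cx_lo, cx_hi + 1)
                   for cy in range(cy_lo, cy_hi + 1)})

print(cell_of(182, 69))                    # directory cell of one point
print(blocks_for_query(180, 195, 50, 65))  # blocks touched by a query
```

The query overlaps four cells but reads only two blocks, because neighbouring cells share a block pointer.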
Problem 1: Dead Space • Dead Space : Empty space with no objects • There are no objects in this query area, yet the query costs 5 block accesses • How to reduce dead space? • [Figure: Query rectangle lying in dead space]
Minimum Bounding Rectangle • MBR (Minimum Bounding Rectangle) • Query : Only 1 disk access
Problem 2: Non-Point Object • Where to store this object?
Minimum Bounding Rectangle • MBR (Minimum Bounding Rectangle) • Two-dimensional geometric simplification of objects, given by the corners (X1min, X2min) and (X1max, X2max) • Covers not the whole space, only the region occupied by objects
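Computing an MBR and testing it against a query rectangle can be sketched as below. The tuple layout `(x1min, x2min, x1max, x2max)` and the sample polygon are assumptions made for illustration.

```python
def mbr(points):
    # MBR of a point set as (x1min, x2min, x1max, x2max).
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def intersects(a, b):
    # Two rectangles intersect when they overlap on every axis.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

# Hypothetical non-point object: a polygon given by its vertices.
polygon = [(2, 1), (5, 3), (4, 6), (1, 4)]
box = mbr(polygon)
print(box)
print(intersects(box, (4, 5, 8, 8)))   # query overlapping the MBR
print(intersects(box, (6, 0, 7, 2)))   # query outside the MBR
```

An MBR hit only says that the object may match. The exact geometry still has to be checked afterwards.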
Extending B+-tree : R-tree • B+-tree vs. R-tree • B+-tree : Interval (1-D rectangle) • R-tree : Multi-Dimensional Interval (Rectangle) • R-tree : a B+-tree of rectangles • Each Node • MBR (Minimum Bounding Rectangle) instead of Interval (or Delimiter) • No linked list between external (leaf) nodes • A certain amount of overlap between MBRs is unavoidable
Extending B+-tree : R-tree • Example • [Figure: R-tree with its Root node and a Query rectangle]
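A range search over an R-tree can be sketched as a recursive descent that follows every child whose MBR intersects the query. The `Node` class and the small hand-built tree below are hypothetical, and rectangles are `(xmin, ymin, xmax, ymax)` tuples.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    mbr: tuple                                     # (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)   # child Nodes (internal node)
    entries: list = field(default_factory=list)    # (mbr, object_id) pairs (leaf)

def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def search(node, query):
    hits = []
    # Leaf level: report objects whose MBR intersects the query.
    for rect, obj_id in node.entries:
        if intersects(rect, query):
            hits.append(obj_id)
    # Internal level: descend into EVERY child whose MBR intersects the query.
    for child in node.children:
        if intersects(child.mbr, query):
            hits.extend(search(child, query))
    return hits

# Hand-built two-leaf tree; the two leaf MBRs overlap each other.
leaf_a = Node(mbr=(0, 0, 4, 4), entries=[((0, 0, 1, 1), "a1"), ((3, 3, 4, 4), "a2")])
leaf_b = Node(mbr=(3, 2, 8, 6), entries=[((3, 2, 5, 3), "b1"), ((7, 5, 8, 6), "b2")])
root = Node(mbr=(0, 0, 8, 6), children=[leaf_a, leaf_b])

found = search(root, (3.5, 2.5, 4.5, 3.5))
print(found)
```

Because the leaf MBRs overlap, this query has to visit both leaves. A B+-tree search, by contrast, follows a single path.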
Upward Split like B-tree • Split the MBR in the case of overflow • Line sweeping : Compare Cost-X and Cost-Y • [Figure: Splitting Line and the New MBR]
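One way to read the line-sweeping split: sort the entries along each axis, try every split position, and keep the axis whose best split is cheaper. The slide does not define Cost-X and Cost-Y, so the cost used below (the summed area of the two resulting MBRs) is an assumption, as are the function names.

```python
def mbr_of(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def best_split_on_axis(rects, axis, min_fill=1):
    # Sweep a splitting line along one axis (0 = x, 1 = y).
    ordered = sorted(rects, key=lambda r: r[axis])
    best = None
    for k in range(min_fill, len(ordered) - min_fill + 1):
        left, right = ordered[:k], ordered[k:]
        cost = area(mbr_of(left)) + area(mbr_of(right))  # assumed cost measure
        if best is None or cost < best[0]:
            best = (cost, left, right)
    return best

def sweep_split(rects, min_fill=1):
    # Compare Cost-X and Cost-Y and keep the cheaper split.
    cost_x = best_split_on_axis(rects, 0, min_fill)
    cost_y = best_split_on_axis(rects, 1, min_fill)
    return ("x",) + cost_x if cost_x[0] <= cost_y[0] else ("y",) + cost_y

# Hypothetical overflowing node: two squares on the left, two on the right.
rects = [(0, 0, 1, 1), (0, 2, 1, 3), (10, 0, 11, 1), (10, 2, 11, 3)]
axis, cost, left, right = sweep_split(rects)
print(axis, cost, left, right)
```

For this node the best x split costs 6 and the best y split costs 22, so the splitting line is vertical.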
Splitting Strategy • 50:50 Split • Instead of a 50:50 split, other cost measures • Area • Perimeter • Overlapping Area • 1. Make the MBRs as COMPACT as possible • 2. Preserve spatial proximity as much as possible • [Figure: Good Split vs. Bad Split]
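The three cost measures can be compared on a good and a bad split of the same four objects. This is an illustrative sketch with hypothetical data and helper names.

```python
def mbr_of(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def perimeter(r):
    return 2 * ((r[2] - r[0]) + (r[3] - r[1]))

def overlap_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def split_costs(group1, group2):
    m1, m2 = mbr_of(group1), mbr_of(group2)
    return {"area": area(m1) + area(m2),
            "perimeter": perimeter(m1) + perimeter(m2),
            "overlap": overlap_area(m1, m2)}

# Four hypothetical objects: two near the origin, two far away.
a, b, c, d = (0, 0, 1, 1), (1, 1, 2, 2), (5, 5, 6, 6), (6, 6, 7, 7)
good = split_costs([a, b], [c, d])   # keeps neighbours together
bad = split_costs([a, d], [b, c])    # mixes near and far objects
print(good)
print(bad)
```

Both splits are 50:50 by count, yet the second is worse on all three measures. This is why the entry count alone is not a sufficient criterion.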
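R*-tree: An Improvement of R-tree • Re-Insertion Strategy on Overflow : delete an object from the overflowing node and re-insert it, which leaves the node more compact • Most popular multi-dimensional index • [Figure: Overflow caused by a Newly Inserted Object; Delete and Re-Insert this; Re-Inserted Object; More Compact]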
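The re-insertion idea can be sketched as follows: on overflow, the entries farthest from the centre of the node's MBR are removed and handed back to the insertion routine. The fraction of 0.3 and the function names are assumptions (the slide gives no value), and the re-insertion itself is not shown.

```python
import math

def mbr_of(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def center(r):
    return ((r[0] + r[2]) / 2, (r[1] + r[3]) / 2)

def pick_for_reinsert(entries, fraction=0.3):
    # Rank entries by distance from the centre of the node's MBR, farthest first.
    node_center = center(mbr_of(entries))
    ranked = sorted(entries,
                    key=lambda r: math.dist(center(r), node_center),
                    reverse=True)
    k = max(1, int(len(entries) * fraction))   # assumed re-insert fraction
    return ranked[k:], ranked[:k]              # (entries kept, entries to re-insert)

# Hypothetical overflowing node: four clustered entries and one outlier.
entries = [(4, 4, 6, 6), (3, 4, 5, 6), (4, 3, 6, 5), (3, 3, 5, 5), (0, 0, 1, 1)]
keep, reinsert = pick_for_reinsert(entries)
print(reinsert)
print(mbr_of(entries), mbr_of(keep))
```

After the outlier is removed, the node's MBR shrinks from (0, 0, 6, 6) to (3, 3, 6, 6), which is the "more compact" effect the slide describes.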