
File Processing : Multi-dimensional Index

File Processing: Multi-dimensional Index. 2018, Spring. Pusan National University. Ki-Joune Li.


Presentation Transcript


  1. File Processing: Multi-dimensional Index • 2018, Spring • Pusan National University • Ki-Joune Li

  2. Multi-Dimensional Index • Multi-attribute query vs. single-attribute query • Single attribute: only ONE attribute specifies the query condition • Example: Find students whose record is in [3.5, 4.5] • Multi-attribute: several attributes • Example: Find students whose height is greater than 180 cm and weight is less than 70 kg • Each attribute corresponds to a dimension • Multi-attribute query = multi-dimensional query

  3. Processing Multi-dimensional Queries • Example: Find students whose height > 180 cm and weight < 70 kg • Method 1: Using a B+-tree • Step 1: Use the B+-tree to find students taller than 180 cm • Step 2: From the result of step 1, select students lighter than 70 kg • Height then weight, or weight then height? [Figure labels: 180, < 70, Result]
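
Method 1 can be sketched as follows. This is a minimal illustration, not the lecture's implementation: a sorted list searched with `bisect` stands in for the B+-tree on height, and the sample records and the name `method1` are hypothetical.

```python
from bisect import bisect_right

# Hypothetical sample data: (name, height_cm, weight_kg)
students = [
    ("A", 175, 65), ("B", 182, 68), ("C", 190, 85), ("D", 185, 72), ("E", 181, 60),
]

# Stand-in for a B+-tree on height: records kept sorted by the key.
height_index = sorted(students, key=lambda s: s[1])
height_keys = [s[1] for s in height_index]

def method1(min_height, max_weight):
    # Step 1: range scan on the height index for height > min_height.
    start = bisect_right(height_keys, min_height)
    candidates = height_index[start:]
    # Step 2: filter the candidates sequentially by weight.
    return [s for s in candidates if s[2] < max_weight]

result = method1(180, 70)
print(result)
```

Step 2 touches every candidate from step 1, so the cost depends on how selective the indexed attribute is. That is why the choice between indexing height or weight first matters.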

  4. Processing Multi-dimensional Queries • Method 2: Using two B+-trees • Step 1: Result1 ← students taller than 180 cm, by B+-tree • Step 2: Result2 ← students lighter than 70 kg, by B+-tree • Step 3: Result ← Result1 ∩ Result2 • Comparison of Method 1 and Method 2 [Figure labels: 180, < 70, Result]
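
A sketch of Method 2 under the same assumptions: two sorted lists stand in for the two B+-trees, and the record ids, sample values, and helper names (`ids_greater`, `ids_less`) are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical records: id -> (height_cm, weight_kg)
students = {1: (175, 65), 2: (182, 68), 3: (190, 85), 4: (185, 72), 5: (181, 60)}

# Stand-ins for two B+-trees: (key, record id) pairs sorted by key.
height_index = sorted((h, sid) for sid, (h, w) in students.items())
weight_index = sorted((w, sid) for sid, (h, w) in students.items())

def ids_greater(index, key):
    keys = [k for k, _ in index]
    return {sid for _, sid in index[bisect_right(keys, key):]}

def ids_less(index, key):
    keys = [k for k, _ in index]
    return {sid for _, sid in index[:bisect_left(keys, key)]}

result1 = ids_greater(height_index, 180)  # Step 1: taller than 180 cm
result2 = ids_less(weight_index, 70)      # Step 2: lighter than 70 kg
result = result1 & result2                # Step 3: intersection
print(sorted(result))
```

Compared with Method 1, both partial results come from index scans, but each may be much larger than the final answer, and the intersection is extra work.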

  5. Processing Multi-dimensional Queries • Method 3: Unified index for several attributes • One index for several attributes • Multi-dimensional space • Two approaches • Extending B+-tree • Extending dynamic hashing [Figure: index for height and weight; axes: Height, Weight]

  6. Extending Hashing: Grid Approach • Fixed grid method (fixed) • Grid file (variable) [Figure labels: Height, Weight, block pointer array, block pointer, query]

  7. Extending Hashing: Grid File [Figure labels: directory, (x1, y1), (x2, y2), block pointer, query]
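
The grid approach can be illustrated with a fixed grid. The sketch below assumes a uniform cell width (`CELL`, a hypothetical parameter) and uses a dictionary as the directory from cell to bucket. A real grid file uses variable cell boundaries and a directory of block pointers, which this sketch does not model.

```python
from collections import defaultdict

CELL = 10  # assumed cell width on both axes (hypothetical)

def cell_of(x, y):
    return (int(x // CELL), int(y // CELL))

# Directory: grid cell -> bucket of points (stand-in for a block pointer array).
grid = defaultdict(list)

def insert(point):
    grid[cell_of(*point)].append(point)

def range_query(xmin, ymin, xmax, ymax):
    cx0, cy0 = cell_of(xmin, ymin)
    cx1, cy1 = cell_of(xmax, ymax)
    hits = []
    # Visit only the cells that overlap the query rectangle.
    for cx in range(cx0, cx1 + 1):
        for cy in range(cy0, cy1 + 1):
            for (x, y) in grid.get((cx, cy), []):
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append((x, y))
    return hits

# Hypothetical (height, weight) points.
for p in [(182, 68), (190, 85), (175, 65), (181, 60)]:
    insert(p)

hits = range_query(180, 0, 200, 70)
print(hits)
```

The query visits every overlapping cell, including empty ones. With one block per cell, those empty cells are the wasted accesses.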

  8. Problem 1: Dead Space • Dead space: empty space with no objects • No objects in this query area, yet 5 block accesses • How to reduce dead space? [Figure labels: Query, Dead Space]

  9. Minimum Bounding Rectangle • MBR (Minimum Bounding Rectangle) • Only 1 disk access [Figure label: Query]

  10. Problem 2: Non-Point Object • Where to store this object?

  11. Minimum Bounding Rectangle • MBR (Minimum Bounding Rectangle) • Two-dimensional geometric simplification of objects • Covers not the whole space, only the region occupied by objects [Figure labels: (X1min, X2min), (X1max, X2max)]
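
As a small illustration, an MBR can be stored as the tuple (X1min, X2min, X1max, X2max). The function names and the sample polygon below are hypothetical.

```python
def mbr(points):
    """Minimum bounding rectangle of 2-D points as (x1min, x2min, x1max, x2max)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def intersects(a, b):
    """True if rectangles a and b share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

# Hypothetical non-point object given by its vertices.
polygon = [(2, 1), (5, 3), (4, 6), (1, 4)]
box = mbr(polygon)
print(box)
```

A query first tests its rectangle against `box`. Only objects whose MBR intersects the query need to be examined with their exact geometry.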

  12. Extending B+-tree: R-tree • B+-tree vs. R-tree • B+-tree: interval (1-D rectangle) • R-tree: multi-dimensional interval (rectangle) • R-tree: a rectangle B+-tree • Each node • MBR (Minimum Bounding Rectangle) instead of interval (or delimiter) • No linked list for external nodes • A certain amount of overlap is unavoidable

  13. Extending B+-tree: R-tree • Example [Figure labels: Root, Query]
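
A minimal sketch of an R-tree range search, assuming a hand-built two-level tree. The `Node` class, its field names, and the sample rectangles are hypothetical. Insertion and balancing are omitted.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    mbr: tuple                                    # (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)  # child nodes (internal node)
    entries: list = field(default_factory=list)   # (mbr, object id) pairs (leaf)

def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def search(node, query):
    # Prune the whole subtree if its MBR misses the query.
    if not intersects(node.mbr, query):
        return []
    if not node.children:  # leaf: test each stored object MBR
        return [oid for box, oid in node.entries if intersects(box, query)]
    hits = []
    for child in node.children:  # several children may qualify when MBRs overlap
        hits.extend(search(child, query))
    return hits

leaf1 = Node(mbr=(0, 0, 5, 5), entries=[((0, 0, 2, 2), "a"), ((3, 3, 5, 5), "b")])
leaf2 = Node(mbr=(4, 4, 9, 9), entries=[((4, 6, 6, 9), "c"), ((7, 4, 9, 6), "d")])
root = Node(mbr=(0, 0, 9, 9), children=[leaf1, leaf2])

found = search(root, (4.5, 4.5, 5.5, 6.5))
print(found)
```

In this example the two leaf MBRs overlap, so the query descends into both leaves. Unlike a B+-tree search, an R-tree search may follow more than one path from the root.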

  14. Upward split, like B-tree • Split the MBR in the case of overflow • Line sweeping: compare Cost-X and Cost-Y [Figure labels: New MBR, Splitting Line]
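
One way to read the line-sweeping split is sketched below. It assumes the cost of a split is the summed area of the two resulting MBRs. The slide does not define Cost-X and Cost-Y, so this cost function, the `min_fill` parameter, and the sample rectangles are assumptions.

```python
def mbr_of(boxes):
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))

def area(b):
    return (b[2] - b[0]) * (b[3] - b[1])

def best_split_on_axis(boxes, axis, min_fill):
    # Sweep: sort by lower coordinate on this axis, then try each cut position.
    ordered = sorted(boxes, key=lambda b: b[axis])
    best = None
    for k in range(min_fill, len(ordered) - min_fill + 1):
        left, right = ordered[:k], ordered[k:]
        cost = area(mbr_of(left)) + area(mbr_of(right))  # assumed cost measure
        if best is None or cost < best[0]:
            best = (cost, left, right)
    return best

def split(boxes, min_fill=2):
    cost_x = best_split_on_axis(boxes, 0, min_fill)  # sweep along x
    cost_y = best_split_on_axis(boxes, 1, min_fill)  # sweep along y
    return cost_x if cost_x[0] <= cost_y[0] else cost_y

# Hypothetical overflowing node with five entries.
overflowing = [(0, 0, 1, 1), (1, 0, 2, 1), (8, 0, 9, 1), (9, 0, 10, 1), (0, 1, 1, 2)]
cost, group_a, group_b = split(overflowing)
print(cost, group_a, group_b)
```

Here the sweep along x separates the left cluster from the right pair at a much lower cost than any cut along y, so the splitting line is vertical.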

  15. Splitting Strategy • 50:50 split • Instead of a 50:50 split, other cost measures • Area • Perimeter • Overlapping area • 1. Make them as COMPACT as possible • 2. Preserve spatial proximity as much as possible [Figure labels: Good Split, Bad Split]
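
The three cost measures can be computed directly from the two MBRs produced by a split. The rectangles below are hypothetical stand-ins for a good and a bad split, and `split_cost` is an illustrative helper.

```python
def area(b):
    return (b[2] - b[0]) * (b[3] - b[1])

def perimeter(b):
    return 2 * ((b[2] - b[0]) + (b[3] - b[1]))

def overlap_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0

def split_cost(a, b):
    # The three measures named on the slide, for the two MBRs after a split.
    return {
        "area": area(a) + area(b),
        "perimeter": perimeter(a) + perimeter(b),
        "overlap": overlap_area(a, b),
    }

good = split_cost((0, 0, 2, 2), (8, 0, 10, 1))  # compact, disjoint MBRs
bad = split_cost((0, 0, 9, 1), (0, 0, 10, 2))   # elongated, overlapping MBRs
print(good)
print(bad)
```

Smaller area and perimeter mean more compact nodes. Smaller overlap means fewer subtrees to visit per query.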

  16. R*-tree: An Improvement of R-tree • Re-insertion strategy on overflow • Most popular multi-dimensional index [Figure labels: Overflow, Newly Inserted Object, Delete and Re-Insert this, Re-Inserted Object, More Compact]
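
The re-insertion idea can be sketched as choosing which entries to remove from an overflowing node. The sketch assumes that the entries whose centers lie farthest from the center of the node's MBR are removed, and that a fixed fraction of entries (`fraction`, a hypothetical parameter) is taken. Re-inserting them from the root is not shown.

```python
def mbr_of(boxes):
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))

def center(b):
    return ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)

def pick_reinsert(entries, fraction=0.3):
    """Split entries into (to_reinsert, to_keep) by distance from the node center."""
    cx, cy = center(mbr_of(entries))

    def dist2(b):
        x, y = center(b)
        return (x - cx) ** 2 + (y - cy) ** 2

    ordered = sorted(entries, key=dist2, reverse=True)  # farthest first
    p = max(1, int(len(entries) * fraction))
    return ordered[:p], ordered[p:]

# Hypothetical overflowing node: four clustered entries and one distant entry.
entries = [(0, 1, 1, 2), (1, 0, 2, 1), (1, 1, 2, 2), (2, 1, 3, 2), (8, 8, 9, 9)]
reinsert, keep = pick_reinsert(entries)
print(reinsert, mbr_of(keep))
```

Removing the distant entry shrinks the node's MBR from (0, 0, 9, 9) to (0, 0, 3, 2), which is the "more compact" effect. The removed entry then goes through normal insertion and may land in a better-fitting node.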
