1 / 10

Index Variations

Index Variations. Indexed Files - Part Three Analyzing the Options. Second Level Index in Multiple Files. Must have space to allow each 1:1 Index File to grow. Main index points to multiple secondary index files, which are each one cluster in size (e.g. 8KB). Sparse Second Level Index.

Download Presentation

Index Variations

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Index Variations Indexed Files - Part Three Analyzing the Options

  2. Second Level Index in Multiple Files Must have space to allow each 1:1 Index File to grow. Main index points to multiple secondary index files, which are each one cluster in size (e.g. 8KB).

  3. Sparse Second Level Index • Q: Assuming 2-Levels, does the second level need to be 1:1? • A: It depends • if datafile is sorted, then no • if datafile is unsorted, then yes

  4. Multiple Indexes for Multiple Keys • If our key is Customer Name, then we cannot sort by Customer ID • please note that names are terrible keys, because names are not unique

  5. Multiple Indexes for Multiple "Views" • Suppose there are different types of users, with different levels of access. • Implemented in Linux via "set effective user id".

  6. Data File Level Three Three Level Index Level Two Level One

  7. When to Use 3 Levels • When the number of records in the datafile is very large. • Assume each index record for levels Two and Three take 28 bytes each • key = product ID = 20 bytes • RRN = long int = 8 bytes • BestMax Index section for internal sorting = 315 records • 8KB cluster / 28B = 315.01 • If N = 1,000,000 data records • then level three contains 1,000,000 records • and level two contains 3175 index records • 1,000,000 / 315 = 3174.6 • and level one contains 11 index records

  8. Timing Analysis 1 • Given a data file: • 100,000 records • no index • sorted • How much time does a searchtake on average? • Using a binary file search • log2N reads of datafile records • log2100,000 = 16.6 • = 17 reads of datafile records

  9. Timing Analysis 2 • Given an Indexed File with • 100,000 records • two index levels • top level index size = 1,000 elements • How much time does a search take on average? • Level One • linear search time = N / 2 • 1000 / 2 = 500 memory compares • Level Two • read the portion into memory = 1 read of 100 index records • binary search that portion = Log2 100 = 7 memory compares • Data File • 1 datafile record read

  10. To index, or not to index... • No Index File • 17 reads of datafile records • Two Levels of Index • 507 memory compares • 1 read of 100 index records • 1 datafile record read

More Related