Indexed Files



  1. Indexed Files, Part One: Simple Indexes
     All of this material is stolen from Dr. Foster's CSCI325 Course Notes.

  2. Non-Indexed Relative Files
     • Usage
       • Direct file manipulation is required when the data will not fit into memory.
     • Minor Problems:
       • Binary searching a file is a bit more difficult than binary searching an array.
       • Sorting a big file is difficult and slow.
     • Major Problems:
       • Time: disk operations take a long time!
       • Deleting a record from the middle of a file is more difficult than deleting an element from the middle of an array.
       • Adding must be done at end-of-file.

  3. Indexed Files
     • An Indexed File is actually two separate, but related, binary files:
       • the Index File
       • the Data File
     • The Index File contains information on how to find specific records in the Data File.
     • Our primary objective is fast searching.
       • Adding records gets easier too.

  4. Example
     This is a simple indexed file. The index-to-data relationship is 1:1.
     Question: Can we do this with just one file?

  5. Index File
     • Key field
       • uses a unique identifier
       • same idea as in databases
     • Arranged for fast searching
       • e.g., sorted by Key for binary searching
     • Notice that the Index File is much smaller than the Data File.
     • The Index File must fit into memory.
     Question: What if the Index will not fit into memory?
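As a rough illustration of the Index File described above, the sketch below stores each entry as a key plus a relative record number (RRN), keeps the array sorted by key, and reads or writes the whole array in one pass. The names `IndexEntry`, `saveIndex`, `loadIndex`, and `sortIndex`, and the choice of two 32-bit integers per entry, are assumptions made for this example; the slides do not specify a layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>   // std::stringstream can stand in for a file in experiments
#include <vector>

// Hypothetical index entry: a unique key plus the relative record
// number (RRN) of the matching record in the data file.
struct IndexEntry {
    std::int32_t key;
    std::int32_t rrn;
};

// Write the whole index array to the index stream as raw binary.
void saveIndex(std::ostream& out, const std::vector<IndexEntry>& index) {
    for (const IndexEntry& e : index) {
        out.write(reinterpret_cast<const char*>(&e), sizeof e);
    }
}

// Read every entry from the index stream into an array in memory.
std::vector<IndexEntry> loadIndex(std::istream& in) {
    std::vector<IndexEntry> index;
    IndexEntry e;
    while (in.read(reinterpret_cast<char*>(&e), sizeof e)) {
        index.push_back(e);
    }
    return index;
}

// Keep the array ordered by key so that it can be binary searched.
void sortIndex(std::vector<IndexEntry>& index) {
    std::sort(index.begin(), index.end(),
              [](const IndexEntry& a, const IndexEntry& b) {
                  return a.key < b.key;
              });
}
```

In practice the streams would be `std::fstream` objects opened with `std::ios::binary`. Each entry here is only 8 bytes, which is the sense in which the index is much smaller than a file of full data records.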

  6. Retrieval Algorithm
     1. Read the Index into an array in memory.
     2. Search the array for the Key.
     3. File Position = array[index].RRN * sizeof(data record)
     4. SeekG(datafile, File Position)
     5. Read the record from the datafile.
     Questions: Does step one need to happen for every search? What is the best search algorithm?
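The retrieval steps above can be sketched roughly as follows. This assumes the index has already been read into a vector sorted by key (step one), and it uses binary search via `std::lower_bound` as one reasonable choice of search algorithm. The `DataRecord` layout, the `IndexEntry` struct, and the function name `findRecord` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>   // std::stringstream can stand in for a file in experiments
#include <vector>

// Hypothetical fixed-size data record (32 bytes on typical platforms).
struct DataRecord {
    std::int32_t key;
    char name[28];
};

// Hypothetical index entry: key plus relative record number (RRN).
struct IndexEntry {
    std::int32_t key;
    std::int32_t rrn;
};

// Covers the search, position, seek, and read steps. The index must
// already be in memory and sorted by key. Returns false if the key is
// absent or the read fails; on success the record is copied into `out`.
bool findRecord(const std::vector<IndexEntry>& index,
                std::istream& dataFile,
                std::int32_t key,
                DataRecord& out) {
    // Search the array for the key (binary search on the sorted index).
    auto it = std::lower_bound(
        index.begin(), index.end(), key,
        [](const IndexEntry& e, std::int32_t k) { return e.key < k; });
    if (it == index.end() || it->key != key) {
        return false;
    }

    // File position = RRN * sizeof(data record).
    const std::streamoff pos = static_cast<std::streamoff>(it->rrn) *
                               static_cast<std::streamoff>(sizeof(DataRecord));

    // Seek to that position and read exactly one record.
    dataFile.clear();  // drop any stale EOF flag from an earlier read
    dataFile.seekg(pos);
    return static_cast<bool>(
        dataFile.read(reinterpret_cast<char*>(&out), sizeof out));
}
```

A successful lookup performs one seek and one read on the data file. The search itself happens entirely in memory.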

  7. Add Record
     1. Write the new record to the end of the data file.
     2. Add the Key and RRN to the end of the index array.
     3. Sort the index array.
     4. Write the index array to the index file.
     Questions: How do you know the RRN of the New Record? Does step 4 need to happen for every Add?
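A minimal sketch of the first three Add steps is shown below. It takes one possible answer to the RRN question: with fixed-size records, the new record's RRN is the data file's length divided by the record size, measured just before the append. Writing the index array back to the index file is left to the caller, so that it need not happen on every add. The struct layouts and the name `addRecord` are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iostream>
#include <sstream>   // std::stringstream can stand in for a file in experiments
#include <vector>

// Hypothetical fixed-size data record.
struct DataRecord {
    std::int32_t key;
    char name[28];
};

// Hypothetical index entry: key plus relative record number (RRN).
struct IndexEntry {
    std::int32_t key;
    std::int32_t rrn;
};

// Appends `rec` to the data file, updates the in-memory index, and
// returns the RRN assigned to the new record.
std::int32_t addRecord(std::vector<IndexEntry>& index,
                       std::iostream& dataFile,
                       const DataRecord& rec) {
    const std::streamoff recSize =
        static_cast<std::streamoff>(sizeof(DataRecord));

    // Write the new record to the end of the data file. The new RRN
    // is the number of records already stored there.
    dataFile.clear();
    dataFile.seekp(0, std::ios::end);
    const std::streamoff endPos = dataFile.tellp();
    assert(endPos >= 0 && endPos % recSize == 0);
    const std::int32_t rrn = static_cast<std::int32_t>(endPos / recSize);
    dataFile.write(reinterpret_cast<const char*>(&rec), sizeof rec);

    // Add the key and RRN to the end of the index array.
    index.push_back(IndexEntry{rec.key, rrn});

    // Sort the index array by key.
    std::sort(index.begin(), index.end(),
              [](const IndexEntry& a, const IndexEntry& b) {
                  return a.key < b.key;
              });
    return rrn;
}
```

Sorting after every append follows the slide literally. Inserting the new entry at the position found by a binary search would keep the array ordered without a full sort, at the cost of departing from the steps as written.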

  8. Delete Record
     • Locate the appropriate key in the index array.
       • Move all subsequent array elements up one space.
     • Mark the record in the Data File for deletion.
     • Clean up the Data File.
       • Create a new file with only non-deleted records.
       • Adjust RRNs in the Index Array.
     • Write the new index array into the Index File.
     Question: When?
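The deletion steps above might look something like the sketch below. It assumes each data record carries a one-byte `deleted` flag, which the slides do not specify. It also splits the work in two: `deleteRecord` handles the index and the mark, and a separate `compact` pass handles the cleanup, on the assumption that cleanup can be deferred and run occasionally. All names here are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <sstream>   // std::stringstream can stand in for a file in experiments
#include <vector>

// Hypothetical fixed-size record with a deletion marker byte.
struct DataRecord {
    std::int32_t key;
    char deleted;    // 0 = live, 1 = marked for deletion
    char name[27];
};

struct IndexEntry {
    std::int32_t key;
    std::int32_t rrn;
};

// Remove `key` from the sorted in-memory index and mark its record in
// the data file. Returns false if the key is not in the index.
bool deleteRecord(std::vector<IndexEntry>& index,
                  std::iostream& dataFile,
                  std::int32_t key) {
    auto it = std::lower_bound(
        index.begin(), index.end(), key,
        [](const IndexEntry& e, std::int32_t k) { return e.key < k; });
    if (it == index.end() || it->key != key) return false;

    const std::streamoff pos =
        static_cast<std::streamoff>(it->rrn) *
            static_cast<std::streamoff>(sizeof(DataRecord)) +
        static_cast<std::streamoff>(offsetof(DataRecord, deleted));

    index.erase(it);  // shifts all subsequent entries up one space

    // Overwrite only the marker byte of the record in the data file.
    dataFile.clear();
    dataFile.seekp(pos);
    dataFile.put(static_cast<char>(1));
    return static_cast<bool>(dataFile);
}

// Clean-up pass: copy live records into a new data file and rewrite
// the RRNs in the index to match their new positions.
void compact(std::istream& oldData, std::ostream& newData,
             std::vector<IndexEntry>& index) {
    std::vector<std::int32_t> newRrn;  // newRrn[old RRN] = new RRN, or -1
    std::int32_t next = 0;
    DataRecord rec;
    oldData.clear();
    oldData.seekg(0);
    while (oldData.read(reinterpret_cast<char*>(&rec), sizeof rec)) {
        if (rec.deleted) {
            newRrn.push_back(-1);
            continue;
        }
        newData.write(reinterpret_cast<const char*>(&rec), sizeof rec);
        newRrn.push_back(next++);
    }
    for (IndexEntry& e : index) {
        const std::size_t old = static_cast<std::size_t>(e.rrn);
        assert(old < newRrn.size() && newRrn[old] >= 0);
        e.rrn = newRrn[old];
    }
}
```

After `compact`, the caller would write the updated index array to the Index File and replace the old data file with the new one.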

  9. Analysis: Indexed v. Non-Indexed
     • Space
       • Indexed files use a big chunk of main memory for the index array.
       • One more (small) file.
     • Time
       • Searching an array in memory is much faster than searching a file.
       • It is not the comparisons, it is the disk operations.
       • Deletion is time consuming, but it is a rare operation.

  10. Limitations?
     • Adding Records
     • Deleting Records
     • Searching for Records
