From Unordered Files to Indexing
CSC 402 – File Management Techniques
Unordered Files
• Records are placed in the file in the order inserted (append only)
• Operations
  • Insertion
  • Deletion
  • Search
• A DB needs the file structure to support efficient access to records by content
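To make the cost of content-based access on an unordered file concrete, here is a minimal sketch. It assumes newline-terminated, `|`-delimited text records held in an in-memory `io.BytesIO` that stands in for a real file; the helper names `append_record` and `search_by_key` are illustrative and do not come from the slides.

```python
import io

def append_record(f, fields):
    """Append one '|'-delimited record at the end of the file; return its byte offset."""
    f.seek(0, io.SEEK_END)
    offset = f.tell()
    f.write(("|".join(fields) + "|\n").encode("utf-8"))
    return offset

def search_by_key(f, key):
    """Linear scan: read records from the start until the first field matches."""
    f.seek(0)
    for line in f:
        fields = line.decode("utf-8").rstrip("\n").split("|")
        if fields[0] == key:
            return fields[:-1]  # drop the empty string after the trailing '|'
    return None

data = io.BytesIO()  # assumption: stands in for a file opened in binary mode
append_record(data, ["240", "Blazing Saddles", "comedy"])
append_record(data, ["189", "Animal House", "comedy"])
append_record(data, ["150", "To Kill A Mockingbird", "drama"])
result = search_by_key(data, "150")
print(result)
```

Insertion is a single append, but every search may have to read the whole file, which is the motivation for the index introduced next.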
Indexes To Access by Content
• Index
• Key-Reference pair
• Simple Index

Simple index:
Key   Address
150   65
189   29
240   0

Data file:
Address   Record
0         32 240|Blazing Saddles|comedy|
29        25 189|Animal House|comedy|
65        33 150|To Kill A Mockingbird|drama|
Operations using an Index
• Search by key
• Update
[Figure: the simple index (150 → 65, 189 → 29, 240 → 0) over the three-record data file]
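Search by key through a simple index can be sketched as a binary search over the sorted keys followed by a single seek into the data file. The record layout here (a 2-byte little-endian length prefix) and the names `append_record`, `read_record`, and `search` are assumptions made for illustration, so the byte addresses the sketch computes are not meant to reproduce the ones on the slide.

```python
import bisect
import io
import struct

def append_record(f, text):
    """Append a record with an assumed 2-byte length prefix; return its address."""
    f.seek(0, io.SEEK_END)
    address = f.tell()
    payload = text.encode("utf-8")
    f.write(struct.pack("<H", len(payload)) + payload)
    return address

def read_record(f, address):
    """Seek straight to the address and read exactly one record."""
    f.seek(address)
    (length,) = struct.unpack("<H", f.read(2))
    return f.read(length).decode("utf-8")

def search(index_keys, index_addresses, f, key):
    """Binary-search the sorted keys, then make one seek into the data file."""
    i = bisect.bisect_left(index_keys, key)
    if i == len(index_keys) or index_keys[i] != key:
        return None
    return read_record(f, index_addresses[i])

data = io.BytesIO()  # assumption: stands in for the data file
entries = sorted(
    (int(rec.split("|")[0]), append_record(data, rec))
    for rec in ["240|Blazing Saddles|comedy|",
                "189|Animal House|comedy|",
                "150|To Kill A Mockingbird|drama|"]
)
keys = [k for k, _ in entries]       # index sorted by key
addresses = [a for _, a in entries]  # parallel list of data-file addresses
print(search(keys, addresses, data, 189))
```

The data file stays in insertion order; only the small index is kept sorted.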
Insert with a Simple Index
• Insert "Chicago" (a musical) with key 201
[Figure: the index (150 → 65, 189 → 29, 240 → 0) and the three-record data file before the insert]
Result of Insert

Simple index:
Key   Address
150   65
189   29
201   102
240   0

Data file:
Address   Record
0         32 240|Blazing Saddles|comedy|
29        25 189|Animal House|comedy|
65        33 150|To Kill A Mockingbird|drama|
102       21 201|Chicago|musical|
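The insert shown on these slides amounts to appending the record to the data file and placing its key in sorted position in the index. The sketch below assumes newline-terminated records in an in-memory file; the class `SimpleIndex` here is a hypothetical stand-in written for this example, not the course's own class of that name.

```python
import bisect
import io

class SimpleIndex:
    """Hypothetical index: parallel lists of keys and addresses, sorted by key."""
    def __init__(self):
        self.keys = []
        self.addresses = []

    def lookup(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.addresses[i]
        return None

    def insert(self, key, address):
        i = bisect.bisect_left(self.keys, key)  # position that keeps keys sorted
        self.keys.insert(i, key)
        self.addresses.insert(i, address)

def insert_record(data, index, record):
    """Append the record to the data file, then add (key, address) to the index."""
    key = int(record.split("|")[0])
    if index.lookup(key) is not None:
        raise KeyError(f"duplicate primary key {key}")
    data.seek(0, io.SEEK_END)
    address = data.tell()
    data.write(record.encode("utf-8") + b"\n")
    index.insert(key, address)
    return address

data, index = io.BytesIO(), SimpleIndex()
for rec in ["240|Blazing Saddles|comedy|",
            "189|Animal House|comedy|",
            "150|To Kill A Mockingbird|drama|"]:
    insert_record(data, index, rec)
chicago_address = insert_record(data, index, "201|Chicago|musical|")
print(index.keys)
```

The new record lands at the end of the data file, while its key slots in between 189 and 240 in the index.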
Deletion Using a Simple Index
• Delete key 150
[Figure: the index (150 → 65, 189 → 29, 201 → 102, 240 → 0) and the four-record data file before the delete]
Result of Deletion
• Not reclaiming space
• Reclaiming space
[Figure: two panels, "Not reclaiming space" and "Reclaiming space", each showing the index (keys 189, 201, 240; key 150 removed) and its data file addresses]
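The two outcomes can be sketched as follows. The slides do not say how space is reclaimed, so this sketch assumes two simple policies: overwriting the first byte of the record with `*` (space not reclaimed) and copying the live records into a new file (space reclaimed). A plain dict stands in for the index, and all function names are illustrative.

```python
import io

DELETED = b"*"  # assumed tombstone byte written over the start of a deleted record

def build(records):
    """Create an in-memory data file and a key -> address index."""
    data, index = io.BytesIO(), {}
    for key, text in records:
        index[key] = data.tell()
        data.write(text.encode("utf-8") + b"\n")
    return data, index

def delete_no_reclaim(data, index, key):
    """Drop the index entry and mark the record; the file keeps its size."""
    address = index.pop(key)
    data.seek(address)
    data.write(DELETED)

def compact(data):
    """Copy live records to a new file and rebuild the index with new addresses."""
    new_data, new_index = io.BytesIO(), {}
    data.seek(0)
    for line in data:
        if line.startswith(DELETED):
            continue  # skip records marked as deleted
        key = int(line.split(b"|")[0].decode("utf-8"))
        new_index[key] = new_data.tell()
        new_data.write(line)
    return new_data, new_index

data, index = build([
    (240, "240|Blazing Saddles|comedy|"),
    (189, "189|Animal House|comedy|"),
    (150, "150|To Kill A Mockingbird|drama|"),
    (201, "201|Chicago|musical|"),
])
size_before = len(data.getvalue())
delete_no_reclaim(data, index, 150)
size_after_mark = len(data.getvalue())
compacted, compacted_index = compact(data)
print(sorted(index), size_before, size_after_mark, len(compacted.getvalue()))
```

Marking is cheap but leaves dead space in the file; compaction recovers the space at the cost of rewriting the file and every address in the index.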
The Index File
• A file of SimpleIndex
• Normal file operations
  • When created vs. opened?
  • When read?
  • When written?
Adjusted File API
• bool Create(filename)
• bool Open(filename)
• bool Close(filename)
• int Read(data, key)
• int Write(data, key)
• int Append(data)
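One way the adjusted API could fit together is sketched below in Python rather than in the C++-style signatures of the slide. Everything beyond the six operation names is an assumption: the `.dat` and `.idx` file suffixes, the text format of the index file, taking the key from the first field on append, and the policy of loading the index on open and writing it back on close.

```python
import io
import os
import tempfile

class SimpleIndexedFile:
    """Hypothetical sketch: a data file plus a key -> address index held in memory."""

    def __init__(self):
        self.data = None
        self.index = {}
        self.index_path = None

    def create(self, filename):
        self.index_path = filename + ".idx"
        self.index = {}                      # a new file starts with an empty index
        self.data = io.open(filename + ".dat", "w+b")
        return True

    def open(self, filename):
        self.index_path = filename + ".idx"
        self.data = io.open(filename + ".dat", "r+b")
        with io.open(self.index_path, "r") as f:  # assumed policy: index read on open
            self.index = {int(k): int(a) for k, a in (line.split() for line in f)}
        return True

    def close(self):
        with io.open(self.index_path, "w") as f:  # assumed policy: index written on close
            for key in sorted(self.index):
                f.write(f"{key} {self.index[key]}\n")
        self.data.close()
        return True

    def append(self, record):
        key = int(record.split("|")[0])      # assumption: key is the first field
        self.data.seek(0, os.SEEK_END)
        address = self.data.tell()
        self.data.write(record.encode("utf-8") + b"\n")
        self.index[key] = address
        return address

    def read(self, key):
        address = self.index.get(key)
        if address is None:
            return None
        self.data.seek(address)
        return self.data.readline().decode("utf-8").rstrip("\n")

    def write(self, record, key):
        """Update: append the new version and repoint the index (old copy is left behind)."""
        if key not in self.index:
            return -1
        del self.index[key]
        return self.append(record)

with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "movies")
    f = SimpleIndexedFile()
    f.create(base)
    f.append("240|Blazing Saddles|comedy|")
    f.append("189|Animal House|comedy|")
    f.close()

    g = SimpleIndexedFile()
    g.open(base)                             # index is reloaded from the .idx file
    found = g.read(189)
    missing = g.read(150)
    g.close()
print(found)
```

Under this policy the index file is touched only at open and close, so a crash while the file is open would leave the index stale; writing the index on every change is the obvious alternative trade-off.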
Simple Indexed File Organization
• Simple Indexed File
  • Data File (a record file of rectype)
  • Index File (a record file of simple index)
  • Simple Index
Secondary Indexes
• Non-primary key accesses
• Not guaranteed to be unique

Primary index:
Key   Address
115   127
150   65
189   29
201   102
240   0

Secondary index:
Secondary key   Primary key
comedy          189
comedy          240
drama           115
drama           150
musical         201

Data file:
Address   Record
0         32 240|Blazing Saddles..
29        25 189|Animal House..
65        33 150|To Kill A Mockingbird..
102       21 201|Chicago..
127       37 115|The Good Shepherd…
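A lookup through the secondary index takes two steps: find every entry for the secondary key, then resolve each primary key through the primary index. The sketch below uses the key and address values from the slide; the function names and the choice of a sorted list of `(secondary key, primary key)` pairs are assumptions for illustration.

```python
import bisect

# Primary index from the slide: primary key -> data-file address.
primary_index = {115: 127, 150: 65, 189: 29, 201: 102, 240: 0}

# Secondary index from the slide, kept sorted; secondary keys may repeat.
secondary_index = sorted([
    ("comedy", 189), ("comedy", 240),
    ("drama", 115), ("drama", 150),
    ("musical", 201),
])

def primary_keys_for(genre):
    """Collect the primary keys of all entries with this secondary key."""
    # (genre,) sorts before every (genre, key) pair, so this finds the first match.
    i = bisect.bisect_left(secondary_index, (genre,))
    keys = []
    while i < len(secondary_index) and secondary_index[i][0] == genre:
        keys.append(secondary_index[i][1])
        i += 1
    return keys

def addresses_for(genre):
    """Second step: resolve each primary key through the primary index."""
    return [primary_index[k] for k in primary_keys_for(genre)]

print(primary_keys_for("comedy"), addresses_for("comedy"))
```

Because the secondary index refers to primary keys rather than addresses, moving a record in the data file only requires updating the primary index.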
Secondary Indexes with a Reference File

Secondary index (Reference points to the “head” of the list):
Secondary key   Reference
comedy          0
drama           2
musical         3

Reference file (Primary Key points to the record; a Link of -1 ends the list):
Rec.   Primary Key   Link
0      189           4
1      115           -1
2      150           1
3      201           -1
4      240           -1
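Following a reference list can be sketched as a walk over record numbers until a link of -1 is reached. The values below are the ones on the slide; the function names are illustrative, and both the insert-at-head policy in `add_reference` and the key 999 used to exercise it are hypothetical.

```python
END = -1  # link value that terminates a list

# Secondary index from the slide: secondary key -> record number of the list head.
heads = {"comedy": 0, "drama": 2, "musical": 3}

# Reference file from the slide: record number -> [primary key, link].
reference = [[189, 4], [115, END], [150, 1], [201, END], [240, END]]

def primary_keys_for(genre):
    """Follow the links from the head until the END marker."""
    keys, rec = [], heads.get(genre, END)
    while rec != END:
        key, link = reference[rec]
        keys.append(key)
        rec = link
    return keys

def add_reference(genre, key):
    """Assumed policy: append a new entry and make it the new head of the list."""
    reference.append([key, heads.get(genre, END)])
    heads[genre] = len(reference) - 1

comedy_keys = primary_keys_for("comedy")
drama_keys = primary_keys_for("drama")
add_reference("musical", 999)  # 999 is a made-up key, not from the slides
musical_keys = primary_keys_for("musical")
print(comedy_keys, drama_keys, musical_keys)
```

With this layout the secondary index holds one fixed-size entry per secondary key, and adding a reference never requires shifting existing entries.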
Using the Secondary Indexes

Primary index:
key   rrn
107   0
203   2
267   6
302   3
400   1
406   4
423   5

Star index:
Secondary key   Reference
Gene Wilder     0
Harrison Ford   1
Matt Damon      5
Michael J Fox   4

Star reference list:
rec   key   link
0     107   3
1     203   6
2     400   -1
3     406   -1
4     302   -1
5     423   -1
6     267   2

Genre index:
Secondary key   Reference
action          1
comedy          0
drama           6
sci fi          4

Genre reference list:
rec   key   link
0     107   3
1     203   -1
2     400   5
3     406   -1
4     302   -1
5     423   -1
6     267   2

Data file:
rrn   record
0     107|Blazing Saddles|comedy|Gene Wilder|
1     400|Air Force One|drama|Harrison Ford|
2     203|Indiana Jones|action|Harrison Ford|
3     302|Back to the Future|sci fi|Michael J Fox|
4     406|Young Frankenstein|comedy|Gene Wilder|
5     423|The Good Shepherd|drama|Matt Damon|
6     267|Witness|drama|Harrison Ford|
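A query that combines both secondary indexes can be sketched as an intersection of primary-key sets, followed by a lookup through the primary index. The records below are the seven from the slide. For brevity the sketch builds each secondary index as a dict of key lists instead of linked reference lists, which is a simplification, and the function name `find` is illustrative.

```python
# Data file from the slide, addressed by relative record number (rrn).
data_file = [
    "107|Blazing Saddles|comedy|Gene Wilder|",
    "400|Air Force One|drama|Harrison Ford|",
    "203|Indiana Jones|action|Harrison Ford|",
    "302|Back to the Future|sci fi|Michael J Fox|",
    "406|Young Frankenstein|comedy|Gene Wilder|",
    "423|The Good Shepherd|drama|Matt Damon|",
    "267|Witness|drama|Harrison Ford|",
]

primary_index = {}  # primary key -> rrn
genre_index = {}    # genre -> list of primary keys
star_index = {}     # star  -> list of primary keys
for rrn, record in enumerate(data_file):
    key, _title, genre, star = record.split("|")[:4]
    primary_index[int(key)] = rrn
    genre_index.setdefault(genre, []).append(int(key))
    star_index.setdefault(star, []).append(int(key))

def find(star, genre):
    """Intersect the two key sets, then fetch each record via the primary index."""
    keys = set(star_index.get(star, [])) & set(genre_index.get(genre, []))
    return [data_file[primary_index[k]] for k in sorted(keys)]

matches = find("Harrison Ford", "drama")
print(matches)
```

Only the records that satisfy both conditions are read from the data file; the intersection itself is done entirely on primary keys.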