1 / 42

Database structure & file organization

Database structure & file organization. LIS 670 Bair-Mundy. Electronic databases. EBSCO Host. ERIC. Governmental databases. Commercial databases. OPAC. Specialty databases. Online Public Access Catalog. Trust Territory. The online catalog. Title Publisher Date of publication

scout
Download Presentation

Database structure & file organization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database structure & file organization LIS 670 Bair-Mundy

  2. Electronic databases EBSCO Host ERIC Governmental databases Commercial databases OPAC Specialty databases Online Public Access Catalog Trust Territory

  3. The online catalog Title Publisher Date of publication Extent of item Caroll, Lewis Karol, Luis Kerroll, L. Dodgson, Charles Bibliographic data Name authority data Holdings data Circulation data Copies of a book; Issues of a journal Who checked out what; for how long; patron data

  4. Entity-relationship diagram NAME CHKED_OUT_ITEM TRANS_ID NAME_ID TR_PA_ID (FK) TR_DATE DUE_DATE TR_HLD_ID (FK) PERS_NAM CRP_NAM CNF_NAM AD_P_NAM BIBL-ENTITY LCN HOLDING ISBN PERS_NAM (FK) TITLE PLACE_PUBL PUBLISHER SUBJECT (FK) CP_NO HLD_LCN (FK) SUBJECT LOCATION CALL_NO HLD_DTE (FK) PATRON SUBJ_ID PATRON_ID SUBJECT USE_FOR REL_TERM NAR_TERM PAT_NAME STREET_1 STREET_2 CITY STATE

  5. Bibliographic Entity (Resource) Abstraction of a book, journal, video, photograph, or artifact BIBL-ENTITY Unique ID (Primary Key) LCN ISBN PERS_NAM (FK) TITLE EDITION PLACE_PUBL PUBLISHER DATE_PUBL EXTENT_OF_ITEM SERIES SUBJECT (FK) Attributes

  6. Name Entity Abstraction of a book, journal, video, photograph, or artifact BIBL-ENTITY Unique ID LCN NAME ISBN PERS_NAM (FK) TITLE EDITION PLACE_PUBL PUBLISHER DATE_PUBL EXTENT_OF_ITEM SERIES SUBJECT (FK) NAME_ID PERS_NAM CRP_NAM CNF_NAM AD_P_NAM Attributes

  7. Subject Entity Abstraction of a book, journal, video, photograph, or artifact BIBL-ENTITY NAME LCN NAME_ID ISBN PERS_NAM (FK) TITLE EDITION PLACE_PUBL PUBLISHER DATE_PUBL EXTENT_OF_ITEM SERIES SUBJECT (FK) PERS_NAM CRP_NAM CNF_NAM AD_P_NAM SUBJECT SUBJ_ID SUBJECT USE_FOR REL_TERM NAR_TERM

  8. Additional entities NAME CHKED_OUT_ITEM TRANS_ID NAME_ID TR_PA_ID (FK) TR_DATE DUE_DATE TR_HLD_ID (FK) PERS_NAM CRP_NAM CNF_NAM AD_P_NAM BIBL-ENTITY LCN HOLDING ISBN PERS_NAM (FK) TITLE PLACE_PUBL PUBLISHER SUBJECT (FK) CP_NO HLD_LCN (FK) SUBJECT LOCATION CALL_NO HLD_DTE (FK) PATRON SUBJ_ID PATRON_ID SUBJECT USE_FOR REL_TERM NAR_TERM PAT_NAME STREET_1 STREET_2 CITY STATE

  9. Entity Relationships NAME CHKED_OUT_ITEM TRANS_ID NAME_ID TR_PA_ID (FK) TR_DATE DUE_DATE TR_HLD_ID (FK) PERS_NAM CRP_NAM CNF_NAM AD_P_NAM BIBL-ENTITY LCN HOLDING ISBN PERS_NAM (FK) TITLE PLACE_PUBL PUBLISHER SUBJECT (FK) CP_NO HLD_LCN (FK) SUBJECT LOCATION CALL_NO HLD_DTE (FK) PATRON SUBJ_ID PATRON_ID SUBJECT USE_FOR REL_TERM NAR_TERM PAT_NAME STREET_1 STREET_2 CITY STATE

  10. Foreign Keys NAME CHKED_OUT_ITEM TRANS_ID NAME_ID TR_PA_ID (FK) TR_DATE DUE_DATE TR_HLD_ID (FK) PERS_NAM CRP_NAM CNF_NAM AD_P_NAM BIBL-ENTITY LCN HOLDING ISBN PERS_NAM (FK) TITLE PLACE_PUBL PUBLISHER SUBJECT (FK) CP_NO HLD_LCN (FK) SUBJECT LOCATION CALL_NO HLD_DTE (FK) PATRON SUBJ_ID PATRON_ID SUBJECT USE_FOR REL_TERM NAR_TERM PAT_NAME STREET_1 STREET_2 CITY STATE

  11. Relational databases The logical view: data and data relationships in the database. Main Author: Eco, Umberto. Uniform Title: Nome della rosa. English Title: The name of the rose / Umberto Eco ; translated from the Italian by William Weaver. Publisher: San Diego : Harcourt Brace, 1994. Description: 1st Harvest ed. 536 p. ill. ; 21 cm. Series: Harvest in translation A Harvest book Subject(s): Historical fiction Detective and mystery stories Call Number: PQ4865 .C6 N613 1994 Status: Not Checked Out

  12. Relational databases Main Author: Eco, Umberto. Uniform Title: Nome della rosa. English Title: The name of the rose / Umberto Eco ; translated from the Italian by William Weaver. Publisher: San Diego : Harcourt Brace, 1994. . . . NAF 19568 NAF 13789 Norton, Peter NAF 29563 Frost, Robert NAF 19568 Eco, Umberto

  13. Relational databases Author: Dalyrimple, Jen Title: Try your best / by Jen Dalyrimple. … Dalyrimple, Jan Dalyrimple, Jen Oops, Typo! Author: Dalyrimple, Jen Title: The name of my nose / by Jen Dalyrimple. … Dalyrimple, Jan Dalyrimple, Jen NAF 25793 Dalyrimple, Jan NAF 25793 Dalyrimple, Jen Dalyrimple, J.S. Author: Dalyrimple, Jen Title: My name and fame / by Jen Dalyrimple. … Dalyrimple, Jan Dalyrimple, Jen

  14. Cataloging record SUBJECT records Main Author: Eco, Umberto. Uniform Title: Nome della rosa. English Title: The name of the rose / Umberto Eco ; translated from the Italian by William Weaver. Publisher: San Diego : Harcourt Brace, 1994. Description: 1st Harvest ed. 536 p. ill. ; 21 cm. Series: Harvest in translation A Harvest book Subject(s): Historical fiction Detective and mystery stories Call Number: PQ4865 .C6 N613 1994 Status: Not Checked Out 12340005 Historical fiction 12341117 Historical fiction 12340005 Detective and mystery stories Detective and mystery stories 12341117

  15. Subject heading change SUBJECT record Title: This is how we flow : rhythm in Black cultures / edited by Angela M.S. Nelson. Publisher: Columbia, S.C. : University of South Carolina Press, c1999. Description: vi, 160 p. : ill., maps, music ; 24 cm. Subject(s): Afro-Americans. 12342001 Afro-Americans African Americans 12342001 African Americans

  16. Tuples BIBL-ENTITY LCN 00001 00002 00003 00004 00005 TITLE My life of crime / … Gone with the wind / … Life after library school Drudgery made fun /… Mudpies… PLACE_PUBL London Athens, Ga. Paris San Diego Fresno PUBLISHER Bugsy Press Tara Pub. Co. Beau Gens Bowring Press Earth Press

  17. Data descriptions for a table BIBL-ENTITY Type Integer Text Integer Text Text Text Text Integer Size 5 20 75 200 100 100 100 10 Req'd Yes No No Yes No No Yes No Attribute Counter Fixed length Variable length Variable length Variable length Variable length Variable length Fixed length Attribute name LCN ISBN PERS_NAM (FK) TITLE PLACE_PUBL PUBLISHER PUB_DATE SUB_HD (FK) Data Updateable False True True True True True True True

  18. Telephone numbers Room no.Extension no. 101 67321 102 69518 103 65835 104 69112 105 69345 106 68123 107 67721

  19. Fixed-length fields The same amount of space is allocated for every instance of the field. 6 7 3 2 1 1 0 1 6 9 5 1 8 1 0 2 6 5 8 3 5 1 0 3

  20. Records with fixed-length fields Phone directory: office number (3 digits) telephone extension (5 digits) office no. 10167321$10269518$10365835$  Beginning of file End of file extension no. End of Record markers

  21. Titles Godzilla / by Simian Amicus Voyage around the world in the vessel La Perouse under Captain Swashbuckler during the years 1887, 1888, and 1889 with the full blessings of Her Majesty the Queen of Elbonia / by A. Hoy Maytees

  22. Variable-length fields Length of field varies according to the amount of data stored within. G o d z I l l a V o y a g e a r o u n d t h e w o r l d i n t h e v e s s e l L a

  23. Records with variable-length fields Pos. 139 Header 008000110001392450154 …Maytees, A. HoyVoyage around the world in the vessel La Perouse under Captain Swashbuckler during the years 1887, 1888, and 1889 with the full blessings of Her Majesty the Queen of Elbonia / by A. Hoy Maytees…

  24. File structures Physical views of data

  25. Methods for organizing records Sequential files Indexed files Lists Balanced trees Direct access structures

  26. Sequential files (1) Records stored contiguously in order on a sort key Record no. 1 2 3 4 5 6 Publ Date (sort key) 1902 1928 1937 1978 1984 1999 Title Prospect for a new century My life as a flapper Why the market crashed Polyester pantsuit revolution Where is Big Brother? The end is near Sort key

  27. Sequential files (2) 1 2 3 1902 1928 1937 Prospect for a new century My life as a flapper Why the market crashed Slow - when add new record must re-sort file Good for high search/record addition ratio Requires less space than indexed files

  28. Searching sequential files Examine each record in sequence Binary search

  29. Examine each record in sequence Record # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Name Au Baker Chou Dietrich Doi Ing Kawamoto Liebowitz Marcuse Rowling Seuss Smith Tanaka Torrance Zeus Searching for Rowling Accession 1: Does Au = Rowling? Accession 2: Does Baker = Rowling? . . . Accession 10: Does Rowling = Rowling?

  30. Binary search: step one Record # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Name Au Baker Chou Dietrich Doi Ing Kawamoto Liebowitz Marcuse Rowling Seuss Smith Tanaka Torrance Zeus Searching for Rowling Accession 1: Does Liebowitz = Rowling? Is Rowling below Liebowitz in the alphabet?

  31. Binary search: step two Record # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Name Au Baker Chou Dietrich Doi Ing Kawamoto Liebowitz Marcuse Rowling Seuss Smith Tanaka Torrance Zeus Searching for Rowling Accession 2: Does Smith = Rowling? Is Rowling below Smith in the alphabet?

  32. Binary search: step three Record # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Name Au Baker Chou Dietrich Doi Ing Kawamoto Liebowitz Marcuse Rowling Seuss Smith Tanaka Torrance Zeus Searching for Rowling Accession 3: Does Rowling = Rowling? Maximum no. of accessions to find a record = log2n where n is number of records in the file

  33. Binary search: our bead game Left Right L R No. cups Questions 1 0 2 1 4 2 8 3 Records L R n log2n 20=1 23=8

  34. Index files LCN 00001 00002 00003 00004 00005 ISBN 7534678945 5675849246 1234567890 4378159721 4678591357 TITLE My life of crime / … Gone with the wind / … Life after library sch… Drudgery made fun /… Mudpies… PLACE_PUBL Las Vegas Athens, Ga. Paris San Diego Fresno PUBLISHER Bugsy Press Tara Pub. Co. Beau Gens Bowring Press Earth Press ISBN 1234567890 4378159721 4678591357 5675849246 7534678945 LCN 00003 00004 00005 00002 00001 ISBN Index

  35. Multiple indexes to main file Publisher Index ISBN Index Browse Title Index Bibliographic records Keyword Index Series Index Call no. Index

  36. Index files - advantages Fast searches • Index file smaller than main file • Index file sorted so can use sequential or binary search Good for system with high volume of searches

  37. Index files - disadvantages Use additional storage space • When add new records must re-index

  38. Lists Tell the computer where to find the text record. Record # 0 1 2 3 4 5 6 7 8 9 Name Baker Doi Rowling Drew Ing Chou Marcuse Kawamoto Liebowitz Au Forward Pointer 5 3 eol 4 7 1 2 8 6 0 Backward Pointer 9 5 6 1 3 0 8 4 7 bol

  39. Searching lists Follow the pointers Searching for Doi Record # 1 2 3 4 5 6 7 8 9 10 Name Baker Doi Rowling Drew Ing Chou Marcuse Kawamoto Liebowitz Au Forward Pointer 6 4 Eol 5 8 2 3 9 7 1 Backward Pointer 10 6 7 2 4 1 9 5 8 bol Accession 1: Does Baker = Doi? Doi after Baker? Use forward pointer Accession 2: Does Chou = Doi? Doi after Chou? Use forward pointer Accession 3: Does Doi = Doi?

  40. Balanced trees Implement binary search logic in list form. Goo root internal nodes Baker Rowling Au Chou Ing Tanaka leaves

  41. Direct-access structures Do not go through index or follow list - use algorithm to yield address where file is stored. Example: Divide sort key value by prime number 11, use remainder as address ISBN 1234567892/11 Remainder 8 4834567891/11 Remainder 10 5489234831/11 Remainder 5 0 1 2 3 4 5 6 7 8 9 10

  42. Direct access pros & cons Advantage - fast Do not go through indexes or follow sequence of a list Computing algorithm faster than multiple disk accessions Disadvantage - may hash to same address (collision)

More Related