Existing records

izzy

Presentation Transcript


  1. Existing records Problems and Solutions

  2. The current situation • 2.5 million bib records in six libraries@cambridge databases • 1 million short records • At 10 books/hour, recataloguing these would take one cataloguer 62 years (or 10 cataloguers just over 6 years) • Total cost (salary alone) approx. £1.5 million
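
The 62-year figure can be reproduced with a back-of-envelope calculation. The sketch below is illustrative only: the slides do not state the working hours per year, so the 1,600 hours used here is an assumption chosen because it yields roughly the quoted figure, and the function name is hypothetical.

```python
# Back-of-envelope estimate of manual recataloguing effort.
# ASSUMPTION: about 1,600 working hours per cataloguer per year.
# The slides do not state this value; it is used here for illustration.
SHORT_RECORDS = 1_000_000
RECORDS_PER_HOUR = 10
HOURS_PER_YEAR = 1_600  # assumed, not from the presentation


def years_of_work(records: int, per_hour: int, cataloguers: int = 1,
                  hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Return the elapsed years needed for `cataloguers` people to process `records`."""
    total_hours = records / per_hour
    return total_hours / (hours_per_year * cataloguers)


one = years_of_work(SHORT_RECORDS, RECORDS_PER_HOUR)
ten = years_of_work(SHORT_RECORDS, RECORDS_PER_HOUR, cataloguers=10)
print(f"One cataloguer:  {one} years")
print(f"Ten cataloguers: {ten} years")
```

Under that assumption, one cataloguer needs 62.5 years and ten cataloguers need 6.25 years, in line with the figures on the slide.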

  3. Deduplication • Move to single bibliographic model? • Deciding on ‘best’ record, manually relinking holdings/deleting surplus bibs • Colleges – c. 50% duplication • Departments/Others – c. 25% duplication • A lot of relinking (1M records affected?)

  4. For a single cataloguer – another 40 years’ work? Overall 100 years? • Or for 10 cataloguers, 10 years? • Total cost rises to £2.5 million • And that’s a lot of money …

  5. Prohibitively expensive and time-consuming to handle these problems by manual recataloguing alone • Particularly in the light of a likely migration to a new Library Management System (Ex Libris or otherwise) in the medium term • A non-migratable situation? • Solutions?

  6. Automated Cataloguing Tools! • update@cambridge - short record enrichment • Automated MARC correction • Deduplication routines • Order is important – full, well-coded records are needed to deduplicate effectively

  7. How to get from this …

  8. to this!

  9. update@cambridge • Record enrichment program • Web interface for use in libraries • Looks for match in UL database • If found, corrects MARC in UL record (if necessary) and overlays local record • Match rates 60-70% on average

  10. In testing with 8 libraries • So far: 34,000 bibs processed, 21,000 bibs enriched • Match rate of 62% • If all 1M short records were fed through, 620,000 records would be updated, leaving only 380,000 for manual recataloguing.
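
The projection on this slide follows directly from the test figures. A minimal sketch of the arithmetic, with variable names chosen here for illustration:

```python
# Reproduce the match-rate projection from the test figures on the slide.
processed = 34_000        # bibs processed in testing
enriched = 21_000         # bibs enriched in testing
short_records = 1_000_000

match_rate = enriched / processed        # roughly 0.618
rounded_rate = round(match_rate, 2)      # the 62% quoted on the slide

# Project the rounded rate onto the full set of short records.
expected_updated = round(short_records * rounded_rate)
remaining_manual = short_records - expected_updated

print(f"Match rate: {rounded_rate:.0%}")
print(f"Expected updates: {expected_updated:,}")
print(f"Left for manual recataloguing: {remaining_manual:,}")
```

The projection assumes the remaining short records match at the same rate as the test sample, which the slide implies but does not guarantee.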

  11. MARC correction • Like the Bib Checker program, but corrects errors instead of just warning you • Already built into the update@cambridge program • Could be rolled out into other areas for large scale MARC correction

  12. How to get from this • =LDR 00472nam\\2200157\a\4500 • =001 662002 • =005 20071205064734.0 • =008 071129s1985\\\\nyua\\\\\\\\\\001\0\eng\d • =020 \\$a9780961751111 • =100 1\$aBroecker, W.S.,$d1931- • =245 10$aHow to build a habitable planet ;$cBy Wallace S. Broecker. • =260 \\$aNew York ;$bEldigio Press,$cc1985 • =300 \\$a291p $bill $c23cm • =504 \\$aIncludes index. • =650 \0$aAstronomy. • =650 \0$aAstrophysics.

  13. to this! • =LDR 00453nam 2200157 a 4500 • =001 662002 • =005 20071205064734.0 • =008 071129s1985\\\\nyua\\\\\\\\\\001\0\eng\d • =020 \\$a9780961751111 • =100 1\$aBroecker, W. S.,$d1931- • =245 10$aHow to build a habitable planet /$cby Wallace S. Broecker. • =260 \\$aNew York :$bEldigio Press,$cc1985. • =300 \\$a291 p. :$bill. ;$c23 cm. • =504 \\$aIncludes index. • =650 \0$aAstronomy. • =650 \0$aAstrophysics.

  14. Lists corrections • Bib id: 662002 • How to build a habitable planet ; By Wallace S. Broecker. • 100: UPDATE: Spaces inserted between initials in subfield _a • 245: UPDATE: By uncapitalised at start of subfield c • 245: UPDATE: Space forward slash inserted before subfield _c • 260: UPDATE: Full stop inserted at end of field • 260: UPDATE: Space colon inserted before subfield _b • 300: UPDATE: Full stop inserted after the p in pagination • 300: UPDATE: Full stop inserted at end of field • 300: UPDATE: Illustration abbreviation has been corrected • 300: UPDATE: Space colon inserted before subfield _b • 300: UPDATE: Space inserted between digits and cm • 300: UPDATE: Space inserted between digits and p in pagination • 300: UPDATE: Space semi-colon inserted before subfield c
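
Corrections of the kind listed above can be expressed as a handful of pattern substitutions. The following is a simplified sketch, not the actual update@cambridge code: the function names and regular expressions are assumptions made for illustration, and they cover only the fields in this one example record, written in the mnemonic form shown on the slides.

```python
import re

# Simplified, hypothetical correction rules for the example record only.
# Real MARC/ISBD punctuation rules have many more cases than these.


def _end_with_full_stop(text: str) -> str:
    """Append a full stop unless the field already ends with one."""
    return text if text.endswith(".") else text + "."


def fix_100(text: str) -> str:
    # Insert a space between adjacent initials: "W.S." -> "W. S."
    return re.sub(r"(?<=[A-Z]\.)(?=[A-Z]\.)", " ", text)


def fix_245(text: str) -> str:
    # Space forward slash before subfield c, then lowercase a leading "By".
    text = re.sub(r"\s*[;:,/]?\s*\$c", " /$c", text)
    return re.sub(r"\$cBy\b", "$cby", text)


def fix_260(text: str) -> str:
    # Space colon before subfield b, full stop at end of field.
    text = re.sub(r"\s*[;:,]?\s*\$b", " :$b", text)
    return _end_with_full_stop(text)


def fix_300(text: str) -> str:
    # "291p" -> "291 p."
    text = re.sub(r"(\d+)\s*p\.?(?=\s|\$|$)", r"\1 p.", text)
    # Space colon before subfield b.
    text = re.sub(r"\s*[;:,]?\s*\$b", " :$b", text)
    # "ill" -> "ill."
    text = re.sub(r"\$bill\.?(?=\s|\$|$)", "$bill.", text)
    # Space semi-colon before subfield c.
    text = re.sub(r"\s*[;:,]?\s*\$c", " ;$c", text)
    # "23cm" -> "23 cm."
    text = re.sub(r"(\d+)\s*cm\.?$", r"\1 cm.", text)
    return _end_with_full_stop(text)


FIXERS = {"100": fix_100, "245": fix_245, "260": fix_260, "300": fix_300}


def correct_line(line: str) -> str:
    """Apply the fixer for the line's tag (characters 1-3 after '='), if any."""
    fixer = FIXERS.get(line[1:4])
    return fixer(line) if fixer else line


BEFORE = [
    r"=100 1\$aBroecker, W.S.,$d1931-",
    r"=245 10$aHow to build a habitable planet ;$cBy Wallace S. Broecker.",
    r"=260 \\$aNew York ;$bEldigio Press,$cc1985",
    r"=300 \\$a291p $bill $c23cm",
    r"=504 \\$aIncludes index.",
]

corrected = [correct_line(line) for line in BEFORE]
for old, new in zip(BEFORE, corrected):
    print("UPDATE   " if old != new else "UNCHANGED", new)
```

Keeping one small function per tag makes it straightforward to add or adjust rules for a single field without touching the others. Fields with no rule, such as 504 here, pass through unchanged.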

  15. Deduplication • Routines and algorithms: • Find duplicate records • Find ‘best record’ • Relink holdings to this record • Run it through MARC correction routine • Delete duplicate bibs
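
The steps above can be sketched as a single pass over grouped records. This is an illustration under stated assumptions rather than the routine described in the talk: the slides do not say how duplicates are matched or how the ‘best’ record is chosen, so matching on ISBN and preferring the record with the most fields are both assumptions, as are the data structures and names used.

```python
from collections import defaultdict
from typing import Callable

# Hypothetical data model for illustration:
#   bibs:     {bib_id: {"isbn": str, "fields": int}}
#   holdings: {holding_id: bib_id}


def deduplicate(bibs: dict, holdings: dict,
                correct: Callable[[dict], dict] = lambda bib: bib):
    """Return (kept_bibs, relinked_holdings) without modifying the inputs."""
    # 1. Find duplicate records. ASSUMPTION: same ISBN means same title.
    groups = defaultdict(list)
    for bib_id, bib in bibs.items():
        if bib.get("isbn"):
            groups[bib["isbn"]].append(bib_id)

    kept = dict(bibs)
    relinked = dict(holdings)
    for ids in groups.values():
        if len(ids) < 2:
            continue
        # 2. Find the 'best' record. ASSUMPTION: the fullest record wins,
        #    with the lowest bib id breaking ties.
        best = max(sorted(ids), key=lambda i: bibs[i]["fields"])
        surplus = set(ids) - {best}
        # 3. Relink holdings from surplus bibs to the best record.
        for holding_id, bib_id in relinked.items():
            if bib_id in surplus:
                relinked[holding_id] = best
        # 4. Run the surviving record through the correction routine.
        kept[best] = correct(kept[best])
        # 5. Delete the duplicate bibs.
        for bib_id in surplus:
            del kept[bib_id]
    return kept, relinked


example_bibs = {
    1: {"isbn": "9780961751111", "fields": 12},
    2: {"isbn": "9780961751111", "fields": 5},
    3: {"isbn": "", "fields": 4},  # no ISBN, so never grouped
}
example_holdings = {"h1": 1, "h2": 2, "h3": 3}

kept_bibs, new_holdings = deduplicate(example_bibs, example_holdings)
print(sorted(kept_bibs))
print(new_holdings)
```

Relinking holdings before deleting the surplus bibs means no holding is ever left pointing at a record that no longer exists.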

  16. Tools for Cataloguers, not replacements! • They do the things programs do well, allowing you to concentrate on what humans do well • They won’t do all the work, but they make the project feasible

  17. What you can do • Record sharing • Adhere to the Bibliographic Standard • Make sure local information is in the holding record or correctly coded • If you are interested in short record enrichment, MARC correction or deduplication, get in touch

  18. Questions?
