The University of Cambridge Universal Catalogue: a work in progress Patricia Killiard

Presentation Transcript


  1. The University of Cambridge Universal Catalogue: a work in progress Patricia Killiard Head of IT Services Cambridge University Library

  2. Libraries in the University of Cambridge • University Library • Dependent libraries: Medical Library, Scientific Periodicals, Squire Law Library, Betty & Gordon Moore Library • College libraries • Departmental & Faculty libraries • Affiliated Institutions • Other libraries associated with the University [diagram label: UC]

  3. The Union Catalogue: Beginnings and growth • Began in 1982 with the Union List of Serials – non-MARC records based on a printed list • 1985: 5 libraries began contributing short records for books to a Union Catalogue • 1987: UC first made available to the public with 53,000 records • 2002: 90+ contributing libraries • New contributors are still joining • Software was written in-house and continued to be used until 2002

  4. Standards ... • To encourage contributions, early records were subject to no bibliographic standards • Records were brief because of the cost of disk space in the 1980s • No authority control, even today • Independence of colleges, faculties and departments means no overall control of standards ... consequences for the UC • Serials records were non-MARC until 2002

  5. Pre-2002 Union Catalogue Model • Consortial model with duplicate bibliographic records • No authority control • Completely separate from the authority-controlled file for the University Library • Separate Union List of Serials which was de-duplicated • Can still be seen at http://linux01.lib.cam.ac.uk/Catalogues/OPAC/xunion.shtml

  6. Pre-2002 Union Catalogue

  7. Search Results in pre-2002 Union Catalogue

  8. Cambridge Union List of Serials

  9. Advantages and disadvantages of the old UC model Advantages: • Ability to request 3 preferred libraries first • Some patron functionality, e.g. patrons able to view books on loan • Each library’s holdings could be distinguished immediately Disadvantages: • Lack of de-duplication in the main Union Catalogue • Large numbers of search results • Exclusion of the University Library holdings from the UC • Separation of serials catalogue from monographs

  10. Voyager vision for Cambridge • Single de-duplicated Universal Catalogue incorporating all public databases, bringing University Library and other databases together • Based on authority-controlled records • All patron functionality possible through the UC • Libraries able to retain local rights over records and patron functionality • Local subject headings retained

  11. From Consortial Catalogue to Universal Catalogue • Department/Faculty and College databases in Voyager have multiple owning libraries - no record sharing • Could move to a Union Catalogue module by allowing record sharing within databases but ... • Requires political will • Is very slow since records would merge on an individual basis • Interim stage of merging would be confusing for patrons

  12. Cambridge System Hardware [diagram: Universal Catalogue, Feeder databases, Web Server]

  13. Hardware specifications • Sun Fire 4800 • 4 x T3 arrays configured in 2 partner groups • 2 x 4 x 750 MHz CPUs • 16 GB memory (8 GB for each domain) • Disk space: 2 x 18 GB (used for Solaris) and 2 x 9 x 36 GB (in one T3 partner pair) for each domain • Domain A (Hookea) holds all production databases • Domain C (Hookec) holds UC • Web server = Sun 280R: 2 x 750 MHz UltraSPARC III processors, 4 GB memory, 72 GB disk • Test server = Sun 220R

  14. Cambridge Voyager Databases

  15. De-duplication • Indexes used: • 010, 020, 022, 0350, 0359 • Large proportion of records do not have ISBNs or LCCNs • De-duplication is very loose • Resulted in very low levels of de-duplication (3-15%) • De-duplication may actually reduce as the file accumulates due to addition of older records without control numbers
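
The control-number matching described on this slide can be illustrated with a minimal Python sketch. It is not Voyager's duplicate detection logic: the record shape (a dict of tag to list of values), the function name `group_duplicates`, and the exact-string comparison are all assumptions made for illustration. The four tags stand in for the 010, 020, 022 and 035 match points named above.

```python
from collections import defaultdict

# Match points named on the slide; the record layout below is hypothetical.
MATCH_POINTS = ("010", "020", "022", "035")


def group_duplicates(records):
    """Group record ids that share at least one control-number value.

    `records` maps a record id to a dict of {tag: [values]}.
    Records with no match points end up in a group of their own,
    which is why short records are barely de-duplicated.
    """
    parent = {}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (tag, value) -> first record id carrying that value
    for rec_id, fields in records.items():
        parent[rec_id] = rec_id
        for tag in MATCH_POINTS:
            for value in fields.get(tag, []):
                key = (tag, value.strip())
                if key in first_seen:
                    union(rec_id, first_seen[key])
                else:
                    first_seen[key] = rec_id

    groups = defaultdict(list)
    for rec_id in records:
        groups[find(rec_id)].append(rec_id)
    return sorted(sorted(group) for group in groups.values())
```

Under these assumptions, a record with neither an ISBN nor an LCCN can never join a group, which is consistent with the low de-duplication rates reported on the slide.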

  16. Replace vs Merge in de-duplication • Bi-directional merge profile should have been available in 2001.2 but not yet working • Essential in order to preserve British Education Index and local subject headings in 650._4 and 650._7 • Might be used in future to preserve other fields, e.g. 856 fields
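
To make the difference between replace and merge concrete, here is a small Python sketch of a merge that keeps protected subject headings from the record that loses the match. It is an assumption-laden illustration, not the Voyager bi-directional merge profile: fields are modelled as hypothetical `(tag, ind1, ind2, value)` tuples, and the function name is invented for this example. Only the rule itself (keep 650 fields with second indicator 4 or 7) comes from the slide.

```python
def merge_preserving_local_subjects(winner, loser):
    """Return the winning record's fields plus protected 650s from the loser.

    Each field is a hypothetical (tag, ind1, ind2, value) tuple.
    A plain "replace" would simply return `winner` and lose these headings.
    """
    protected = [
        field for field in loser
        if field[0] == "650" and field[2] in ("4", "7")
    ]
    merged = list(winner)
    for field in protected:
        if field not in merged:  # avoid adding an identical heading twice
            merged.append(field)
    return merged
```

The same pattern would extend to other fields worth preserving, such as 856, by widening the `protected` test.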

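  17. Quality Hierarchy

      | Leader/06 | Leader/17 | 040$a | 040$d |
      |-----------|-----------|-------|------------|
      | * | * | DLC | * |
      | as | * | * | depfacaedb |
      | ab | * | * | depfacaedb |
      | as | * | * | depfacfmdb |
      | ab | * | * | depfacfmdb |
      | as | * | * | depfacozdb |
      | ab | * | * | depfacozdb |
      | as | * | * | collandb |
      | ab | * | * | collandb |
      | as | * | * | collpwdb |
      | ab | * | * | collpwdb |
      | as | * | * | otherdb |
      | ab | * | * | otherdb |
      | * | * | * | cambrdgedb |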

  18. Trial UC build no. 1: Aug 2001 • First UC build with 2000.1.3 – built before remainder of system went live • Contributing files were all test loads of data for all libraries - very slow to configure and build • UC Phase 2 – should have had link back to holdings records but bug in 2000.1.3 prevented it from working • Upgrade to 2000.2.1 needed to make it work (Oct 2001) • No UB functionality • Very generic build using only 010, 020, 022 and 035 to de-duplicate

  19. Trial build no. 2: Nov 2002 • 2 databases: cambrdgedb and depfacaedb with 2001.2 Beta • Bugs in Sysadmin affected • Duplicate detection profiles • Quality hierarchy • Bi-directional merge • Saving values in Sysadmin generally • Build failed several times at pre-bulk stage

  20. Trial no. 3: March 2003 • Began March 2003, again with 2 databases • Early problems with matching location codes and Oracle database names • Further pre-bulk problems • Delayed while databases were clustered in March and upgraded to 2001.2.1 in early April • Build completed but • quality hierarchy failed to work • bi-directional merge failed to work • unable to test patron functionality

  21. Production build • 21 July Initial load began with 2 databases: cambrdgedb and depfacaedb • Indexed and reviewed at this stage • 22 August load of remaining databases began • 28 August load and indexing complete • Currently under review • Authorities not loaded • UB not yet enabled • Bi-directional merge not yet functioning

  22. De-duplication in production build

  23. Newton OPAC

  24. UC Search Results

  25. Full Record View

  26. Major issues to tackle • De-duplication of short records with no match points at present • Authority control in a non-authority controlled environment • Presentation of results to users: • Display doesn’t support multiple libraries in database: shows database name as location rather than holding library • Public names in OPAC need to be revised to reflect multiple libraries - 60 characters is not always sufficient

  27. Short record with no de-duplication:

  28. Short record de-duplication Option 1: Additional indexes • Creation of index solely for de-duplication purposes • Manual matching by cataloguers • Addition of local control number in matching records • Accurate but extremely slow • However, additional left-anchored indexes for de-duplication, such as 015 (BNB numbers), would help.

  29. Short record de-duplication Option 2: • Combining indexes is probably the best way to tackle the very large numbers of short records • Algorithm to combine author, title, and publication date would be ideal Option 3: • Upgrading all short records through retrocon projects - expensive and not justified if only purpose is de-duplication
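
Option 2 can be sketched as a composite match key built from author, title and publication date. The Python below is one possible approach under stated assumptions, not an algorithm from the presentation or from Voyager: the function names, the choice to use only the first word of the author heading, and the normalisation rules (case folding, stripping accents and punctuation, taking the first four-digit year) are all illustrative.

```python
import re
import unicodedata


def _normalise(text):
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_key(author, title, pub_date):
    """Build a (surname, title, year) key for comparing short records.

    Assumes the author heading starts with the surname and that the
    first run of four digits in the date string is the year.
    """
    surname = _normalise(author or "").split(" ")[0]
    year = re.search(r"\d{4}", pub_date or "")
    return (surname, _normalise(title or ""), year.group(0) if year else "")
```

Two short records whose keys are equal would be candidate duplicates. A key this loose will produce false matches (different editions in the same year, for example), so in practice the result would probably need review rather than automatic merging.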

  30. Serials: a special problem • Two types of serials records: • Short Union List of Serials records: identical for all libraries but multiple copies in each database • Upgraded serials records in all department/faculty and college databases • Need to ensure that • Higher quality records from departments etc. take precedence • Former Union List of Serials records do not diverge by controlling standards as they are upgraded

  31. Authority control in the UC • Authority records from the University Library database will be loaded into UC • Local authorities discarded from Voyager build • No authorities in 7 out of 8 contributing databases • Options? • Load authorities into all databases? Too much space • Introduce authority control into other 7 databases through Web authorities or copying authority records from cambrdgedb - problem of cleaning up existing records

  32. Presentation of search results • Patrons are interested in library holdings not database holdings • Location Limits appear to be possible only by database not library • May be able to work with access control groups and holdings sort groups • Random order of MFHDs very confusing

  33. Patron issues: UB environment ... but not entirely • Full patron functionality in the UC OPAC was part of the Cambridge contract but recalls, holds and call slip requests not yet working • Patron records from all contributing libraries display in OPAC • Books on loan, requests, blocks, fines and fees from all libraries display in OPAC • Circulation clustered environment • UB installed but no reciprocal borrowing

  34. Top Enhancements • Additional tools for de-duplication, preferably allowing combinations of indexes • Fix for multiple MFHDs being delivered in random order - incomprehensible to the user • Fix for ISBN matching not ignoring text after the first 10 digits (problem nos. 13283, 58877, etc.), e.g. 020 __ |a 0335203884 and 020 __ |a 0335203884(pbk) • Link from the UC record to the record in the contributing database would be very useful for Cambridge
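
The ISBN problem on this slide comes down to comparing the whole 020 |a string rather than only the number. A minimal Python sketch of the requested behaviour follows; the function name is hypothetical and the pattern covers 10-character ISBNs only, as on the slide.

```python
import re

# Leading 10-character ISBN: nine digits plus a digit or X check character.
# Assumption: anything after those ten characters is a qualifier such as "(pbk)".
# 13-digit ISBNs are not handled by this sketch.
_ISBN10 = re.compile(r"^\s*(\d{9}[\dXx])")


def isbn_match_point(subfield_a):
    """Return the ISBN from an 020 |a value, ignoring trailing text, or None."""
    match = _ISBN10.match(subfield_a)
    return match.group(1).upper() if match else None
```

With this normalisation, the two 020 values quoted on the slide yield the same match point and the records would be treated as duplicates.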

  35. University of Cambridge Universal Catalogue Can be seen at: http://hookec.lib.cam.ac.uk
