WorldCat Growth & Quality: Vision and Practice. Asia Pacific Regional Council 2010. April 15, 2010. Ted Fons Director WorldCat Global Metadata Network. OCLC The world’s libraries. Connected. More collaboration More institutions More Web-scale More synchronization More innovation.
WorldCat Growth & Quality: Vision and Practice Asia Pacific Regional Council 2010 April 15, 2010 Ted Fons Director WorldCat Global Metadata Network
OCLC The world’s libraries. Connected. More collaboration More institutions More Web-scale More synchronization More innovation Local Group Global More Better
Union Catalogue – Pivotal Role blogs Repositories, various sites
WorldCat Growth – Growing WorldCat Faster
Create system-wide efficiencies in library management WorldCat Growth since 1998 Millions of records
WorldCat Growth – Is It Working?
WorldCat Today 170 million records 1.5+ billion holdings 1 January 2010
Files loaded or pending for WorldCat
• ABES (France)
• Bavarian State Library
• Bibliotheca Alexandrina (Egypt)
• Bibliothekszentrum Baden Württemberg (Germany)
• British Library
• DANBIB (Denmark)
• GBV (Germany)
• HeBIS (Germany)
• IDS Informationsverbund Deutsch-Schweiz (Switzerland)
• Lebanese American University
• LIBRIS (Sweden)
• Qatar University
• UnityUK
• Zayed University Consortium (UAE)
National files loaded or pending for WorldCat
• Bibliothèque nationale de France
• German National Library
• Libraries Australia
• National Central Library, Taiwan
• National Library Board, Singapore
• National Library of Barbados
• National Library of China
• National Library of Finland
• National Library of Israel
• National Library of Mexico
• National Library of New Zealand
• National Library of Scotland
• National Library of Spain
• National Library of Sweden
• National Library of Wales
• Swiss National Library
Multilingual WorldCat
Percentage of Non-English Records: 36% (1998), 53.8% (2009)

Language         1998       2009
Total Records    37.5 m     117.2 m
English          23.9 m     64.3 m
French           2.3 m      8.5 m
German           2.2 m      17.9 m
Spanish          1.6 m      4.5 m
Japanese         .8 m       2.8 m
Russian          .8 m       2.3 m
Chinese          .7 m       4.3 m
Italian          .7 m       2.1 m
Latin            .3 m       1.9 m
Portuguese       .3 m       1.1 m
Dutch            .2 m       2.9 m
Hebrew           .2 m       1.2 m
The collective collection: 1.9 billion items and growing!
• Physical holdings in WorldCat
• Licensed digital content in library collections
• Local library content being digitized
Figures shown: 170 million bib records; 3.6 million digital items; 1.5 billion holdings; 325 million electronic database records; NEW! JSTOR Metadata: 4.5 million records; 30 million items (Google, HathiTrust, OAIster)
WorldCat Growth – Synchronization with Libraries, Repositories & Metadata Hubs
Growing WorldCat Faster • New Data Ingest Platform under Services Oriented Architecture
Using Publisher Data to Grow WorldCat
• Establish partnerships with publishers
• Ingest publisher and vendor metadata in ONIX
• Enhance publisher metadata
• Enrich WorldCat with publisher metadata
• Output enhanced ONIX data to publishers/other partners
http://www.oclc.org/partnerships/material/nexgen/nextgencataloging.htm
Metadata Services for Publishers Enriched Bib Data Book Seller Bib Data Enriched Bib Data Publisher
Real Time Update & Record Enrichment Union Catalog Union Catalogue
WorldCat Growth – Syndication
OCLC and Google to exchange data, link digitized books to WorldCat
• Synchronizes WorldCat with digital collections of interest to the membership
• Participating organizations provide OCLC with a regular feed of metadata
• WorldCat is automatically updated with new MARC records as materials become available
• Reciprocal linking between WorldCat and the host site
• Automatic
OCLC Syndicates WorldCat Data with Google Books
WorldCat Quality – Improving the Quality of the Database
Reducing Duplicates – An Improved Algorithm
• First production run, May 18, 2009
• Running small files (500 – 3000)
• Statistics for May & June 2009
  • 33,023 records processed
  • 1,777 duplicates removed (5.7%)
  • 846 records deferred for manual review
Reducing Duplicates – An Improved Algorithm
• Full production run, Feb. 2, 2010
• Entire database, beginning with OCN #1
• Statistics so far:
  • 7.5 million records processed
  • Almost 650,000 records removed
• Unique fields from deleted records are merged into the master record
• The exception is non-Latin fields: we try to ensure that all non-Latin fields are in the retained record even if they are not on the list of mergeable fields.
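The merge rule on this slide can be illustrated with a minimal Python sketch. It is not OCLC's algorithm: the record shape (a list of tag/value pairs), the `MERGEABLE_TAGS` list, and the `has_non_latin` script check are all assumptions made for illustration. It only shows the stated behavior: unique fields on the mergeable list move to the master record, and non-Latin fields are kept even when their tag is not on that list.

```python
import unicodedata

# Hypothetical list of field tags whose unique values may be carried over
# from a deleted duplicate; the real list is not given in the slides.
MERGEABLE_TAGS = {"505", "520", "650"}


def has_non_latin(text):
    """Return True if any alphabetic character is outside the Latin script."""
    return any(
        ch.isalpha() and "LATIN" not in unicodedata.name(ch, "")
        for ch in text
    )


def merge_duplicate(master, duplicate):
    """Return a copy of master plus the unique fields kept from duplicate.

    Both records are lists of (tag, value) tuples. A field is carried over
    when it is not already in the master and either its tag is mergeable
    or its value contains non-Latin script.
    """
    merged = list(master)
    seen = set(master)
    for tag, value in duplicate:
        if (tag, value) in seen:
            continue
        if tag in MERGEABLE_TAGS or has_non_latin(value):
            merged.append((tag, value))
            seen.add((tag, value))
    return merged


master = [("245", "Dream of the red chamber"), ("650", "Fiction")]
duplicate = [
    ("245", "Dream of the Red Chamber."),  # not mergeable, Latin: dropped
    ("880", "紅樓夢"),                      # non-Latin: always retained
    ("650", "Fiction"),                    # already in master: skipped
    ("520", "A novel."),                   # mergeable and unique: carried over
]
merged = merge_duplicate(master, duplicate)
```

In this sketch the variant title on the duplicate is discarded, while the non-Latin field survives even though its tag is absent from the assumed mergeable list.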
Expert Community Experiment • Experiment to test “social cataloging” with OCLC’s expert community (modeled on Wikipedia) • Interest and motivation from WorldCat Local libraries that want to use WorldCat Local as their “database of record”
What are they saying? • “I am loving the ability to fix typos, add more subject headings, etc. Some of which were things I would do locally but were too much of a hassle to fix at the oclc level.” • “Thank you so much for the opportunity to participate in the community enhancement experiment! Having the ability to correct typos, flesh out minimal…cataloging…is really wonderful. I hope the experiment works out well … -- I would love to see it made a permanent feature”
WorldCat is much more than a warehouse of records
• Continuous improvement of WorldCat records by members:
  • Enhance
  • Record enrichments
  • Expert Community Experiment
  • Error reporting
• OCLC’s quality management role:
  • WorldCat Quality group
  • Automated record enrichment
  • FRBRization
  • Duplicate detection and resolution
  • Support for Program for Cooperative Cataloging – NACO, CONSER, BIBCO, etc.
  • Ongoing conformance to library standards
A partnership of members and OCLC
WorldCat Quality – Let’s Probe What “Quality” Really Means
Online Catalogs: What Users and Librarians Want
End users expect online catalogs:
• to look like popular Web sites
• to have summaries, abstracts, tables of contents
• to help find needed information
Librarians expect online catalogs:
• to serve end users’ information needs
• to help staff carry out work responsibilities
• to have accurate, structured data
• to exhibit classical principles of organization
http://www.oclc.org/us/en/reports/onlinecatalogs/default.htm
[Charts: Librarian/Staff Results: Highlighted Differences; End-User Results: Recommended Enhancements (recommended enhancements to WorldCat, total end-user responses)]
What did we learn? End-user focus group results
• Key observations:
  • Delivery is as important as discovery, if not more important.
  • Seamless, easy flow from discovery through delivery is critical.
  • Summaries and tables of contents are key elements of a description.
  • Improved search relevance is necessary.
WorldCat Registry – Enabling Services
Metadata about Libraries • WorldCat Registry • A repository of metadata about libraries: • Location • Contacts • Policies • Links
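As a rough illustration of what one such record might hold, here is a minimal Python sketch. The class name, field types, and sample values are all hypothetical; the slides name only the four categories (location, contacts, policies, links), not the Registry's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryProfile:
    """Hypothetical shape of one library's Registry record."""
    name: str
    location: str
    contacts: list[str] = field(default_factory=list)       # e.g. email addresses
    policies: dict[str, str] = field(default_factory=dict)  # policy name -> text
    links: dict[str, str] = field(default_factory=dict)     # service name -> URL

    def link_for(self, service):
        """Return the URL registered for a service, or None if absent."""
        return self.links.get(service)


# Sample values are invented for illustration.
profile = RegistryProfile(
    name="Example University Library",
    location="Singapore",
    links={"catalog": "https://library.example.edu/catalog"},
)
```

Keeping links keyed by service name is one way a discovery product could look up a library's local catalog URL from a single shared profile.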
WorldCat Registry Value Proposition
• The WorldCat Registry allows your library to:
  • Provide direct linking to local library services across a variety of OCLC products, including WorldCat.org and WorldCat Local
  • Create and manage a profile that centralizes and automates information sharing with vendors and OCLC
  • Gain greater internet visibility at no cost, regardless of OCLC membership
Registry Growth 2007-2009
• 2009
  • 130,000 records
  • Over 4,500 library users managing records
  • Processing 200,000-300,000 requests/mo via OpenURL Gateway
  • Multiple OCLC and non-OCLC services that rely on this data
• 2007
  • 70,000 records
  • some library users
  • 20,000 requests/mo via OpenURL Gateway
Bringing It All Together: RedLaser App http://redlaser.com
The Vision
• Achieve web scale for KB services
• Move the KB to the cloud
• Provide KB services through an API model to:
  • Provide a central platform for KB data management
  • Allow read and write access to the KB within OCLC services
  • Allow read and write access to the KB for external services
• The KB can be managed in one place, but exposed anywhere
Web scale value proposition 70% 30% INFRASTRUCTURE INITIATIVE Amazon.com: http://www.slideshare.net/goodfriday/amazon-web-services-building-a-webscale-computing-architecture
Cloud Computing A style of computing in which scalable and elastic IT-enabled capabilities are delivered as a service to external customers using Internet technologies. -Gartner Group Simple: Web-based applications with shared data and services.
Traditional KB Services • The traditional model for KB services is to build a KB to support a service or product
Powering the library Metasearch A-Z KB KB Link Resolver KB ERM KB
More power KB
Efficient storage of data in the cloud: Common use data Library Bib Holdings Common Use Data Users Suppliers UserData Partners
Efficient storage of data in the cloud: Common use data Library Holdings Common Use KB Data Users Suppliers UserData Partners Titles Collections