A Systemwide View of Library Collections. Brian Lavoie, OCLC Research; Roger C. Schonfeld, Ithaka. CNI Spring Task Force Meeting, April 5, 2005.
Systemwide View of Library Collections • Print collections have been changing: resource sharing increasingly blurs the distinction between local and external resources • Digitization combined with network technologies creates opportunities for one “copy” of a resource to be shared across many libraries • These forces will inevitably shift the focus to the resources of the “system,” rather than individual library collections
Mass Digitization • A great deal of public and private investment in digitization programs (e.g., JSTOR, ARTstor), and of course the mass digitization spearheaded by GooglePrint • Digitization opportunities are unlimited; resources are not … • How to determine priorities? What programs of digitization will be necessary to meet the needs of the scholarly community?
Print Preservation • From a systemwide perspective, what preservation framework makes most sense for print resources? • How have preservation frameworks changed over time? • As retrospective materials become increasingly available in digital form, will new frameworks for print preservation be necessary?
What Are We Going to Do Today? • The kinds of collaborations necessary to begin to take advantage of a systemwide perspective are very hard, both from economic and political standpoints • We will not be proposing any answers! • Instead, we thought to take advantage of the WorldCat resource – which affords the broadest view of print collections – to build a bridge from a local perspective to the beginnings of a systemwide perspective • Today’s presentation focuses on print books
Data Sources • WorldCat: world’s largest and most comprehensive bibliographic database • > 20,000 libraries worldwide have contributed to the development of WorldCat • Copy of WorldCat from January 2005: • ~55 million records • Copy of WorldCat holdings file from January 2005: • ~950 million holdings
Data Source Limitations • Not all published materials are cataloged in WorldCat • Not all library holdings are represented in WorldCat • Largely reflects North American library collections • So … WorldCat does not embody the whole universe of library collections and holdings – but it’s a very good approximation!
1. The “Systemwide Collection” • Size • Age
Works and Manifestations • FRBR (Functional Requirements for Bibliographic Records): • Hierarchy of bibliographic entities • Works, Expressions, Manifestations, Items • Work: distinct intellectual or artistic creation • e.g., Macbeth • Manifestation: physical embodiment of an expression of a work • e.g., Macbeth, Folger Shakespeare Library edition, published in paperback by Washington Square Press (2004) • WorldCat records describe FRBR manifestations • Works identified using OCLC “FRBRization” algorithm • Converts MARC21 bibliographic databases into FRBR “work-sets” • http://www.oclc.org/research/software/frbr/
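The work/manifestation distinction above can be illustrated with a minimal sketch. This is not the OCLC FRBRization algorithm: it assumes a hypothetical, much simpler rule in which manifestation records with the same normalized author and title belong to one work-set. The record fields, the `work_key` and `group_into_work_sets` helpers, and the second and third sample records are invented for illustration.

```python
import re
from collections import defaultdict


def work_key(author, title):
    """Hypothetical work key: lowercase, strip punctuation, collapse spaces."""
    def norm(text):
        text = re.sub(r"[^\w\s]", "", text.lower())
        return " ".join(text.split())
    return norm(author), norm(title)


def group_into_work_sets(records):
    """Group manifestation records into work-sets sharing one work key."""
    work_sets = defaultdict(list)
    for rec in records:
        work_sets[work_key(rec["author"], rec["title"])].append(rec)
    return dict(work_sets)


# Sample manifestation records. Only the first mirrors the example on the
# slide; the other two are invented for illustration.
records = [
    {"id": "m1", "author": "Shakespeare, William", "title": "Macbeth",
     "publisher": "Washington Square Press", "year": 2004},
    {"id": "m2", "author": "Shakespeare, William.", "title": "Macbeth.",
     "publisher": "Hypothetical Press", "year": 1990},
    {"id": "m3", "author": "Shakespeare, William", "title": "Hamlet",
     "publisher": "Hypothetical Press", "year": 1992},
]

work_sets = group_into_work_sets(records)
print(len(records), "manifestations ->", len(work_sets), "works")
```

A real work-set algorithm has to cope with translations, uniform titles, and variant name forms, which a plain string key like this one cannot handle.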
Print Book Manifestations and Works – and Digital Manifestations
How Old Are the Components of the Systemwide Collection? Cumulative Book Works/Manifestations Over Time
How Old Are the Components of the Systemwide Collection? Book Works/Manifestations per Year
Age of Works and Manifestations: Relative to 1923 (millions)
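The three age views above (per year, cumulative, and relative to 1923) can be derived from a list of publication years, as in the following sketch. The sample years are invented and the function names are hypothetical. The slides do not say on which side of the boundary 1923 itself falls; the sketch assumes "before 1923" versus "1923 or later".

```python
from collections import Counter
from itertools import accumulate


def per_year_counts(years):
    """Return (year, count) pairs sorted by year."""
    return sorted(Counter(years).items())


def cumulative_counts(years):
    """Return (year, running total) pairs sorted by year."""
    per_year = per_year_counts(years)
    totals = accumulate(count for _, count in per_year)
    return [(year, total) for (year, _), total in zip(per_year, totals)]


def split_by_cutoff(years, cutoff=1923):
    """Count items published before the cutoff and from the cutoff onward.

    Assumption: the cutoff year itself counts as "on or after".
    """
    before = sum(1 for y in years if y < cutoff)
    return before, len(years) - before


# Invented publication years, one per work (or per manifestation).
years = [1850, 1922, 1922, 1923, 1977, 1977, 2004]

print(per_year_counts(years))
print(cumulative_counts(years))
print(split_by_cutoff(years))
```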
2. Individual Collections Cumulate to Form the System • How will digitization bring them together virtually?
Minimal Overlap: Book Works Held by X or More Libraries (in millions)
Works Held Broadly: Book Works Held by X or More Libraries (in millions)
Works Held Broadly: Book Works Held by X or More Libraries, as Percent of Total Book Works
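The "held by X or more libraries" measure in the charts above can be sketched as follows, assuming the input is a mapping from each work to the number of distinct libraries holding it. The work identifiers, counts, thresholds, and the `held_by_at_least` helper are all invented for illustration and are not taken from the WorldCat data.

```python
def held_by_at_least(holdings_per_work, thresholds):
    """For each threshold X, return (count, percent) of works held by >= X libraries."""
    total = len(holdings_per_work)
    result = {}
    for x in thresholds:
        count = sum(1 for n in holdings_per_work.values() if n >= x)
        percent = 100.0 * count / total if total else 0.0
        result[x] = (count, percent)
    return result


# Invented data: work identifier -> number of holding libraries.
holdings_per_work = {"w1": 1, "w2": 1, "w3": 4, "w4": 12, "w5": 250}

table = held_by_at_least(holdings_per_work, [1, 5, 10, 100])
for x, (count, percent) in table.items():
    print(f"held by {x}+ libraries: {count} works ({percent:.0f}% of total)")
```

Counting libraries per work rather than per manifestation means that a library holding any edition is treated as holding the work, which appears to match the charts' unit of "book works".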
The Virtual System in Practice • GooglePrint digitization initiative • Questions: • How many print books does this initiative potentially impact? • What proportion of the “systemwide print book collection” does this represent? • Overlap (how much is held broadly? how much is held uniquely?) • A forthcoming paper from OCLC researchers will offer some perspective on these questions • Hopefully, work like this will help to establish a set of important questions/metrics that need to be addressed when: • Considering digitization initiatives • Considering implications of a changing world of research and learning for collections
Steady, Gradual Nineteenth Century Growth in Works Held Many Times…
Of Works with Multiple Holdings, Steady Increase Through the 1960s in the Proportion Held Many Times
Summary: Findings • Roughly 26 million print book works, represented in 32 million print book manifestations, are held by OCLC member libraries. This should be seen as a minimum in considering the number of printed books over time. Half of the books date from the period since 1977. How can a mass digitization strategy effectively manage the intellectual property ramifications of this finding? • Publications are distributed across a large number of libraries, and any mass digitization strategy that ignores this distributional reality is likely to omit numerous works. How should this finding affect the library system’s planning for a massive format migration?
Summary: Findings • Rareness is very common within the system. This has been recognized by many librarians but is not always taken into account in policy development. How will any future print preservation strategy address this reality? Can data on rareness help to inform digitization strategies? • Redundancy in holdings across the system has changed over time. Has this made our framework for preservation more or less secure? What lessons should be drawn as we consider other print preservation strategies, such as paper repositories, particularly in the era of mass digitization? What lessons might there be for digital preservation?
More information … • More in-depth article forthcoming … • Contact us with comments and questions: • Brian Lavoie: lavoie@oclc.org • Roger C. Schonfeld: rcs@ithaka.org