looking under the hood
Preservation Status of e-Resources: A Potential Crisis in Electronic Journal Preservation
CNI Forum, December 2011
Oya Y. Rieger, AUL for Digital Scholarship & Preservation Services, Cornell University Library
Robert Wolven, AUL for Bibliographic Services and Collection Development, Columbia University Libraries/Information Services
genesis of the study • Cornell and Columbia spend more on e-materials than other forms of content. Cornell University Library Annual Statistics Report 2009/2010
genesis of the study • E-journal archiving responsibility is distributed and elusive • “Yet as the creation and use of digital information accelerate, responsibility for preservation is diffuse, and the responsible parties … have been slow to identify and invest in the necessary infrastructure to ensure that the published scholarly record represented in electronic formats remains intact over the long-term.” Urgent Action Needed to Preserve Scholarly Electronic Journals, Donald J. Waters et al., 2005
Research Questions • How do we participate in the LOCKSS alliance? • Do we understand the difference between LOCKSS and CLOCKSS? • Who is overseeing the coordination of preservation decisions? • How do we keep track of which e-subscriptions are represented in LOCKSS to understand their preservation status? • How do we have back issue access when a journal is canceled? • What kind of a mechanism do we have in place between the ERM/LMS and the local LOCKSS box to support uninterrupted access to digital content? • Can we do an analysis that compares Portico and LOCKSS coverage of the 2CUL e-journals?
2CUL LOCKSS Assessment Study • Initiative Leads • Oya Rieger, AUL, Digital Scholarship & Preservation Services, Cornell • Patricia Renfro, Associate VP for Digital Programs and Technology Services, Columbia • Research Team • Marty Kurth, Coordinator, Digital Scholarship Services, Cornell (now NYU) • Jeff Carroll, Collections, Columbia • Bill Kara, Central Library Operations, Cornell • Bill Kehoe, Information Technology, Cornell • Jim Spear, Technical Services Assistant, Cornell • Breck Witte, Library Information Technology, Columbia • Bob Wolven, Collection Development, Columbia
LOCKSS • International community initiative that provides libraries with open-source digital preservation tools and support • Facilitates easy and inexpensive collection and preservation of institutional copies of authorized e-content • 200+ members & over 8,600 e-journal titles from 500 publishers
Portico • Digital preservation service provided by ITHAKA, a not-for-profit organization with a mission to help the academic community use digital technologies to preserve the scholarly record • 139 participating publishers, 718 partner libraries, 12,381 e-journal titles, and 123,586 e-book titles
Leveraging LOCKSS • Only surface understanding of the preservation strategy and its implications • No formal process in place for identification of e-journals for preservation consideration • LOCKSS is currently being used for dark archiving • Lack of organizational leadership to bring together related parties from collections, IT, and scholarly communication teams
Operational Aspects • Neither Columbia nor Cornell currently uses its ERM to record and manage details related to potential LOCKSS or Portico access • Identification of titles for which access has been triggered is not handled through the ERMs at Cornell; Columbia tracks CLOCKSS and Portico triggered content in Serials Solutions • Neither library has so far taken advantage of LOCKSS by gaining access to a canceled subscription or a closed journal, or by participating in a failure-recovery test
LOCKSS & Portico Coverage Study • The short version: • “Only 13% (or 15%) of Cornell’s and Columbia’s e-journals are currently being preserved.” • A closer look under the hood: • What we found • What should be done about it
Disclaimers • Not an evaluation of LOCKSS or Portico • Not a complete survey of e-journal preservation • Not a rigorous research study • Not up to the minute • Set out to measure overlap; ended up …
LOCKSS and Portico coverage study • Data for e-journal titles extracted from catalog • Limited to titles with ISSN or e-ISSN (50%) • 45,000+ titles for Cornell • 55,000+ titles for Columbia • Data sent to Portico for matching • Cornell data also compared to LOCKSS
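The matching step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the study or Portico actually used: the record layout (`title`, `issn`, `eissn` keys), the function names, and the sample titles and ISSNs are all hypothetical.

```python
import re

def normalize_issn(raw):
    """Return an ISSN as 'NNNN-NNNC', or None if it is missing or malformed."""
    if not raw:
        return None
    compact = re.sub(r"[\s-]", "", raw).upper()
    if not re.fullmatch(r"\d{7}[\dX]", compact):
        return None
    return f"{compact[:4]}-{compact[4:]}"

def match_titles(catalog, preserved_issns):
    """Sort catalog records into matched, unmatched, and no-ISSN groups.

    A record counts as matched if either its ISSN or its e-ISSN appears
    in the preservation service's list (hypothetical matching rule).
    """
    preserved = {normalize_issn(i) for i in preserved_issns} - {None}
    matched, unmatched, no_issn = [], [], []
    for record in catalog:
        ids = {normalize_issn(record.get("issn")),
               normalize_issn(record.get("eissn"))} - {None}
        if not ids:
            no_issn.append(record["title"])    # excluded from the analysis
        elif ids & preserved:
            matched.append(record["title"])
        else:
            unmatched.append(record["title"])
    return matched, unmatched, no_issn

# Hypothetical sample records; not data from the study.
catalog = [
    {"title": "Journal A", "issn": "1234-5679", "eissn": None},
    {"title": "Journal B", "issn": None, "eissn": "2049 3630"},
    {"title": "Journal C", "issn": "0000-0000"},
    {"title": "Newsletter D"},
]
preserved = ["12345679", "2049-3630"]
matched, unmatched, no_issn = match_titles(catalog, preserved)
```

Normalizing both sides before comparing avoids spurious misses from formatting differences. It does not catch records that carry a wrong ISSN altogether.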
LOCKSS and Portico Coverage: Cornell data • LOCKSS only: 3.9% • Portico only: 14.5% • LOCKSS and Portico: 7.6% • Not necessarily same holdings • Total coverage: 26.1% of titles
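The breakdown on this slide is a set computation: titles in one service only, titles in both, and the union. A minimal Python sketch follows. The function name and the toy counts are assumptions chosen to reproduce the slide's rounded shares; they are not the study's data.

```python
def coverage_breakdown(all_titles, lockss, portico):
    """Return title-level coverage percentages, rounded to one decimal."""
    all_titles = set(all_titles)
    if not all_titles:
        raise ValueError("no titles to analyze")
    # Ignore preserved titles that are not in the library's own list.
    lockss = set(lockss) & all_titles
    portico = set(portico) & all_titles

    def pct(subset):
        return round(100 * len(subset) / len(all_titles), 1)

    return {
        "lockss_only": pct(lockss - portico),
        "portico_only": pct(portico - lockss),
        "both": pct(lockss & portico),
        "total": pct(lockss | portico),
    }

# Toy data: 1,000 hypothetical title IDs, with ranges picked to mirror the slide.
titles = range(1000)
lockss_ids = range(0, 115)     # 115 titles held by LOCKSS
portico_ids = range(39, 260)   # 221 titles held by Portico
result = coverage_breakdown(titles, lockss_ids, portico_ids)
```

With these toy counts the three shares come out as 3.9, 14.5, and 7.6, which sum to 26.0. The slide reports 26.1% in total, presumably because its shares were each rounded from the underlying counts.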
26% of What? • Serial publications • In digital form • With ISSN or e-ISSN • Titles • Not content • Not expenditures
Titles vs Holdings: South Asia Research • LOCKSS: vol. 25, 26, 27, 28 • Portico: vol. 23(1), 24, 25, 26, 27(1), 28(3), 29(1)
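The South Asia Research example shows why a title-level match can overstate coverage. The sketch below compares the two holdings statements volume by volume. It assumes that a number in parentheses, such as 27(1), names the only issue held from that volume and that a bare volume number means the whole volume is held; that reading, the data structure, and the function name are assumptions made for illustration.

```python
WHOLE = None  # marker: volume listed without issue numbers, assumed complete

# Holdings as listed on the slide, under the interpretation stated above.
lockss = {25: WHOLE, 26: WHOLE, 27: WHOLE, 28: WHOLE}
portico = {23: {1}, 24: WHOLE, 25: WHOLE, 26: WHOLE,
           27: {1}, 28: {3}, 29: {1}}

def compare_holdings(first, second):
    """Classify each volume by which archive holds it and to what extent."""
    report = {}
    for vol in sorted(set(first) | set(second)):
        if vol in first and vol in second:
            same = first[vol] == second[vol]
            report[vol] = "both, same extent" if same else "both, different extent"
        elif vol in first:
            report[vol] = "first only"
        else:
            report[vol] = "second only"
    return report

report = compare_holdings(lockss, portico)
for vol, status in report.items():
    print(f"vol. {vol}: {status}")
```

On this reading, both services hold the title, but only volumes 25 and 26 are held to the same extent in both.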
Serial publications • Scholarly, peer-reviewed journals • Trade publications, newsletters • Annual reports • Newspapers • Government documents • Conference proceedings • Monographs in series
In digital form • Current, from publisher • Backfiles, from publisher • Current or backfiles, from aggregator • Historical, scanned by libraries, Google • Historical, in commercial digital collections • Published on the web
Breaking down the numbers: what’s not preserved (35,000-40,000 titles) • Available through aggregators: 25-30% • Miscellaneous freely accessible: 22-25% • Newsletters: 10% • East Asian: 10% • Participating publishers: 8-9% • Non-participating publishers: 4-5%
Breaking down the numbers • Digitized collections with e-journals (commercial): 5% • Digitized collections, library based (e.g. HathiTrust): 4% • Government, IGO (e.g. OECD): 3-4% • Book series, conference proceedings: 2-3% • Data errors (e.g., ISSN mismatch): 2%
A few examples • Aggregator: Popular electronics • In multiple databases • Freely accessible: Jornal brasileiro de pneumologia • In SciELO Brasil, 2004- • NGO: Yearbook … Balkan Human Rights Network • In Central European Online Library, 2006 • Trade Newsletter: Malaysia Food & Drink Report • In ABI/Inform, 2009- • East Asian: 대한산업공학회지 • In DBPIA
More examples • Historical: Bulletin d’archeologie chretienne • In Gallica, 1870-1876 • Book series: Developments in volcanology • In ScienceDirect e-book collection • Data error: Music and Medicine • In SAGE Premier, 2009- (ISSN mismatch) • Foundations of Computational Mathematics • In SpringerLink 2001-present (LOCKSS, not Portico) • Proceedings … User Services Conference • In ACM Digital Library 1974-present
Breaking down the numbers: what’s not preserved (35,000-40,000 titles)
• Available through aggregators: 25-30%
    • Difficult; 3rd-party agreements
    • Important; libraries going e-only
• Miscellaneous freely accessible: 22-25%
    • Questionable; many “acquired” en masse
• Newsletters: 10%
    • Secondary? Ephemeral?
• East Asian: 10%
    • Different legal, technical environment
Breaking down the numbers
• Participating publishers: 8-9%
    • Publisher platforms as distributors (aggregators)
    • Content not structured as journals
• Non-participating publishers: 4-5%
    • Cost/benefit issues
• Government, IGO (e.g. OECD): 3-4%
    • Whose responsibility?
• Data errors (e.g., ISSN mismatch): 2%
    • Fewer than expected
Different preservation strategies • Scholarly journals – LOCKSS; Portico • Historical – HathiTrust; Portico digital collections • Free on the web – web archiving; e-Depot • University published – Institutional repository? • Book series, conferences – as books
Next steps? • Repeat, extend analysis • Work with other libraries on priorities, strategies • Work with publishers • Work with LOCKSS, Portico, Keepers Registry • Investigate international context • Develop intersystem data exchange
We wish to thank the staff of LOCKSS and Portico for their assistance in conducting this study. Questions?