… part of shared task to ensure ease and continuity of access. PEPRS: Recording The Extent Preserved. Peter Burnhill EDINA, University of Edinburgh with sincere thanks to Regina Reynolds ALA Holdings Forum, New Orleans, 25th June 2011.
… part of shared task to ensure ease and continuity of access PEPRS: Recording The Extent Preserved Peter Burnhill EDINA, University of Edinburgh with sincere thanks to Regina Reynolds ALA Holdings Forum, New Orleans, 25th June 2011 “Universal and repurposed holdings information - emerging initiatives and projects” 4:00/5:30pm MCC Room 355
This presentation is in 3 parts • Why the interest in the ‘holdings statement’ • ‘experiential knowledge’ from union catalogues • Moving from human-readable to computational • An introduction to PEPRS and peprs.org • What is available now or ‘real soon now’ • What is unresolved but important and needs doing • Focus on record of extent preserved • Extent issued; extent held on shelf or digitally secured (but first a little bit of ‘institutional’ background to start)
Brief introductions • EDINA • UK national academic data centre – http://edina.ac.uk • Designated and funded by JISC – http://www.jisc.ac.uk/ • The agency for innovative use of digital technology for UK research and education • Based at University of Edinburgh • Research-led University, with Library founded in 1580 • ISSN International Centre • Directs and coordinates the ISSN Network of 88 national ISSN Centres • Based in Paris, France
What is PEPRS? • JISC-funded project • led by EDINA & ISSN IC • to provide an online registry on what e-journals are being preserved • who is doing this and how • and the extent of content preserved • a registry of keepers of (e-)journal content
Experience and implications (1) Union catalogues • SALSER (union catalogue of serials in Scotland, est. 1994) • http://edina.ac.uk/salser/ • all life is there • no de-duplication at the title level, nor at the holdings level • Holdings statements once described as “highly variable and mostly poor”
2. SUNCAT, the UK union catalogue of serials • 80 largest research & university libraries • inc British Library, Cambridge, Oxford, Edinburgh, Glasgow • 3.5m ‘library records’: over 4.7m ‘item holding records’ + 2.8m ‘titles’ in CONSER, ISSN & DOAJ databases • FRBR-like matching to provide search at title-level • http://www.suncat.ac.uk/ • No comparison of information at holdings level • change in local holdings statement is biggest cause of updating • Helping UK Research Reserve discover ‘candidate titles’ for print archiving • UKRR plans to keep minimum of 3 copies • OPAC holdings statements not reliable enough for disposal decisions
Importance of knowing what was & was not issued • Always been a problem for librarians who need to claim back for what does not arrive • Now a problem for the ‘preservers’! Exploring data flow on ‘issues’ into SUNCAT: to help librarians know what had not been issued(!) • ONIX for Serials and serials holdings format
Experience and implications (2): access to articles The article has always been the ‘information object of desire’. Now with an established digital world (but not a ‘digital only’ world), the focus is on ‘entitlement’ & ‘access’ - not ‘holdings’
Assisting access to articles online remotely • A&I and machine-to-machine access • linking via OpenURL to articles online • Institutions arrange licence & remote access to publishers’ content via ERM (not the OPAC) • Recent focus on role of ERM, and union catalogues, to record ‘entitlement’ in the event of cancellation • Renewed attention on ‘digital shelf for back copy’ • for assurance of continuity of access
Scholarly Communication (retaining focus on the formal (£) economy for licensed online access to article-length work published in journals, but conscious of the ‘open’) [Diagram; labels: Publisher; article; serial issue; ISSN & other metadata; DOI & other metadata; ‘Holdings’ metadata; Licence = authorisation; union catalogue; Library (serial); A&I; Serials managers; OPAC ‘discover’; authentication (Shibboleth); OpenURL Resolver ‘locate/access’; ‘request’; Reader (article)] P. Burnhill, EDINA/JISC, 2005 (updated 2011)
an emotional scarring condition with neglect, forgotten dates, and sometimes in bad cases forgetting they even exist. Middle children are known for ending up with things that are too big for the baby and too small for the oldest. Is this a case of ‘middle child syndrome’?
Holdings statements as the “middle child” • In OPACs and union catalogues, holding statements are difficult to understand, often regarded as wrong, and some think them unreformable. • The eldest (the journal title information) always takes precedence, but can help a lot if well defined • The youngest (the wild article child) is ‘just there’
PEPRS: Piloting an E-journal Preservation Registry Service Idea of a registry raised in literature, ca. 2003/4, and then again in 2006: “either .. clarity of public statement by each agency or through a registry by which it would be plain what content was being archived, and therefore what was not.” (US) CLIR Report, 2006
PEPRS--Development • Scoping study in 2007 by Rightscom and Loughborough University led on to a JISC-funded Project: • Partners: EDINA & ISSN International Centre, • Phase 1: August 2008 – July 2010 ‘investigate, prototype and build’ • Phase 2: August 2010 – July 2012 ‘preparing for service & governance’ • Initially UK in scope, we now judge PEPRS as necessarily international • Literature is international – so is ISSN • Every nation needs one • Growing international support
On the road … and hosted at http://edina.ac.uk/presentations.html JISC Journals Working Group (London, August 2008) ISSN National Directors Meeting (Tunis, September 2008) NASIG, 24th Annual Conference (Asheville NC, USA, 4 June 2009) Library of Chinese Academy of Science (Beijing, 15 September 2009) ISSN National Directors Meeting (Beijing, 17 September 2009) PARSE.Insight Workshop (Darmstadt, Germany, 21 September 2009) Knowledge Exchange Workshop (Edinburgh, October 2009) E-journals are Forever Workshop, JISC/DPC (London, April 2010) IFLA 2010 (Gothenburg, September 2010) RLUK Conference (Edinburgh, 11 November 2010) Columbia Univ. (NYC, 23 November 2010); UKSG (Spring 2011) … ISSN Governing Body (Paris, April 2011) … ARL (Montreal, May 2011) and welcomed invite to ALA, New Orleans P. Burnhill, F. Pelle, P. Godefroy, F. Guy, M. Macgregor, A. Rusbridge & C. Rees. Piloting an e-journals preservation registry service. Serials 22(1) March 2009. [UK Serials Group] P. Burnhill. Tracking e-journal preservation: archiving registry service anyone? Against the Grain. 21(1) February 2009. pp. 32, 34, 36
Abstract Data Model: Figure 1 in reference paper in Serials, March 2009 [Diagram; labels: SERVICES: user requirements; E-Journal Preservation Registry Service; E-Journal Preservation Registry; METADATA on preservation action; METADATA on extant e-journals; markers (a) and (b); Digital Preservation Agencies, e.g. CLOCKSS, Portico; BL, KB; UK LOCKSS Alliance etc.; Data dependency; ISSN Register]
Information about the archiving organisations • Wanting to work with those who have ‘archival intent’, i.e., the keepers of content for the long term • Five pilot participants: • British Library • CLOCKSS Archive • e-Depot [Koninklijke Bibliotheek (KB), Dutch Royal Library] • Global LOCKSS Network • Portico *preparing to include more in some kind of self-registration
Participants self-state* the following: • Overview & background: A short summary of each archiving initiative. • Ingest & preservation workflow: Steps taken to ingest content & preserve it over time. • Library access to content: In general terms, the conditions under which a library can access the content archived for each initiative. • Auditing of content, policies and procedures (both internal and external activities): Steps taken to ensure the ongoing authenticity and accessibility of content and to monitor the development of the approach over time. • Latest data: With direct link to the archiving agency's holdings information, or to the archiving agency's home page if the holdings information is not available. [*PEPRS is not an audit*]
Public Beta now live! after field-testing with archiving organisations [British Library, CLOCKSS, LOCKSS, KB & Portico] + associates http://peprs.org
A Quick Look Simple search shows that this journal is being preserved get same result searching on (either) ISSN
Passing glance at the variation in ‘holdings information’ reflecting what the archiving organisations hold as metadata * CLOCKSS also archives Springer content; not shown here
What happens when print ISSN is entered? Note key role of ISSN-L; even if the ‘print ISSN’ is entered, the preservation status of the e-journal is found
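The lookup described on this slide can be sketched in a few lines. This is only an illustration of the idea that every ISSN of a title resolves to one linking ISSN (ISSN-L), and that preservation status is recorded against that ISSN-L. The table contents, the agency names and the function `preservation_status` are hypothetical, and nothing here reflects how PEPRS is actually implemented.

```python
from typing import List, Optional

# Hypothetical linking table: each ISSN (print or online) maps to its ISSN-L.
ISSN_L_TABLE = {
    "1234-5679": "1234-5679",  # made-up print ISSN, also acting as the ISSN-L
    "0000-0019": "1234-5679",  # made-up online ISSN of the same title
}

# Hypothetical registry content, keyed by ISSN-L.
PRESERVED_BY = {
    "1234-5679": ["Agency A", "Agency B"],
}


def preservation_status(issn: str) -> Optional[List[str]]:
    """Return the agencies preserving the title, or None if the ISSN is unknown."""
    issn_l = ISSN_L_TABLE.get(issn.strip().upper())
    if issn_l is None:
        return None
    # Both print and online ISSNs land on the same registry entry.
    return PRESERVED_BY.get(issn_l, [])
```

Because both ISSNs resolve to the same ISSN-L, `preservation_status("1234-5679")` and `preservation_status("0000-0019")` return the same list.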
* COMING SOON * Allows a library to upload a list of ISSNs to check preservation status. Being field-tested in the UK and by 2CUL (Columbia & Cornell)
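A list check of this kind might look roughly like the sketch below. It assumes the library supplies plain ISSN strings and that the registry can be reduced to a set of preserved ISSNs; the function names `normalise_issn` and `check_list` and the status wording are illustrative only. The check-digit rule is the standard ISSN one (weights 8 down to 2, modulo 11, with X standing for 10).

```python
import re

ISSN_PATTERN = re.compile(r"^([0-9]{4})-?([0-9]{3})([0-9X])$")


def normalise_issn(raw):
    """Return the ISSN as NNNN-NNNC if well formed with a valid check digit, else None."""
    match = ISSN_PATTERN.match(raw.strip().upper())
    if not match:
        return None
    digits = match.group(1) + match.group(2)
    # Standard ISSN check: weights 8..2 over the first seven digits, modulo 11.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    if match.group(3) != expected:
        return None
    return f"{match.group(1)}-{match.group(2)}{match.group(3)}"


def check_list(issns, preserved):
    """Map each submitted ISSN to a status string (hypothetical wording)."""
    report = {}
    for raw in issns:
        issn = normalise_issn(raw)
        if issn is None:
            report[raw] = "invalid ISSN"
        elif issn in preserved:
            report[raw] = "preserved"
        else:
            report[raw] = "not known to be preserved"
    return report
```

Validating before lookup separates a mistyped ISSN from a title that is simply absent from the registry, which matters when the answer feeds a retention decision.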
* COMING SOON * We are exploring the standards to use for m2m use of the registry service, so PEPRS could be used within union catalogues and other serial services.
Variation in how ‘holdings’ are expressed to PEPRS by the agencies The volume is often the work unit in archiving, plus whatever metadata there is at hand associated with that unit of effort * Dates are in the metadata, not in the workflow *
More variation in a list from OUP Mix of Arabic volume numbers & Roman numerals; dates are derived from metadata
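One way to make such mixed volume labels comparable is to normalise them to integers before any further processing. The sketch below assumes labels are either plain Arabic digits or well-formed Roman numerals; the function names are illustrative, and real agency data would need more cases (ranges, supplements, year-based volumes).

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Convert a well-formed Roman numeral to an integer (no validation of form)."""
    total = 0
    largest_seen = 0
    # Read right to left: a smaller value before a larger one is subtracted.
    for char in reversed(numeral.upper()):
        value = ROMAN_VALUES[char]
        if value < largest_seen:
            total -= value
        else:
            total += value
            largest_seen = value
    return total


def volume_number(label):
    """Return an integer for an Arabic or Roman volume label; raise ValueError otherwise."""
    text = label.strip()
    if text.isascii() and text.isdigit():
        return int(text)
    if text and all(char in ROMAN_VALUES for char in text.upper()):
        return roman_to_int(text)
    raise ValueError(f"unrecognised volume label: {label!r}")
```

With this in place a mixed list can be ordered consistently, for example `sorted(["X", "2", "IX", "11"], key=volume_number)`.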
note (simple) variation in Publisher information, across the archiving agencies, and ISSN Register
Matters unresolved (1): things in initial project scope PEPRS-specific • What users ‘really want to know’ via release of Public Beta • about archiving agencies and their preservation policy & practices • feedback on functionality; opportunity for social media • How to be an international registry of global keepers • Governance: UK (JISC/SCONUL/RLUK); EU (Knowledge Exchange; LIBER); USA (ARL); International (IFLA, ICOLC, ISSN-IC; EU) ?? Relevant for ‘Holdings Forum’ • Assigning ISSNs to preserved e-serials that are reported • ‘E-journals’ that come to notice • ISSN-IC is devising workflow to assign ISSNs as required • ‘D-journals’, digitised content from print journals • some have print ISSN, some not; problematic but essential to make progress • Issues/volumes, not just titles • extent preserved; common/conversion [action in Phase 2]
Matters unresolved (2): challenging the scope of PEPRS • ‘Continuity of access’, not just preservation • archiving agencies may want to detail current access offer • how should PEPRS try to adapt? • What about repositories of digitized journals? • HathiTrust has over 210,000 titles • of which only about 1/3 have an ISSN in the record • What about print archiving? • CRL’s PAPR initiative, for print journals • significant proportion will not have had an ISSN assigned Common challenges relevant for Holdings Forum • All have serials where ISSN not yet assigned by ‘big sister’ • If it is worth preserving it should have a serials identifier! • Good News: ISSN Network has issued over 80,000 already • All tackling ‘middle sister’ problem of Issues/Volumes
‘holdings’ in OPAC conflates: information for humans (patrons/readers) about access to content with possession of that content Maybe OK for print journals but we need a different approach for journal content in digital format, where access and stewardship have differing requirements ‘holdings information’ in OPACs has ‘middle child’ conflict
What’s the way forward? • Let’s accept that the OPAC holding statement is just a ‘human-readable string’ • We need radical reform, with means to ingest and store structured metadata on issues (with their tables of contents) that allows: • transformation to allow helpful display for humans • computation by software/agents to support lots more
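To illustrate the two uses named above, the sketch below keeps held volumes as structured data (a set of integers) and derives the human-readable string from it, instead of storing the string itself. The `v.1-3` display style and the function name `summarise` are assumptions for illustration, not a proposed standard.

```python
def summarise(volumes):
    """Render a collection of volume numbers as a compact human-readable statement."""
    numbers = sorted(set(volumes))
    if not numbers:
        return ""
    ranges = []
    start = previous = numbers[0]
    for number in numbers[1:]:
        if number == previous + 1:
            previous = number
            continue
        # A gap closes the current run and opens a new one.
        ranges.append((start, previous))
        start = previous = number
    ranges.append((start, previous))
    return "; ".join(
        f"v.{first}" if first == last else f"v.{first}-{last}"
        for first, last in ranges
    )


held = {1, 2, 3, 5, 7, 8}
display_string = summarise(held)   # for humans
has_volume_4 = 4 in held           # for software agents
```

The display string can always be regenerated from the structured data, whereas the reverse requires parsing free text.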
Information for machines on what is held by a keeper: We are working on an ‘arithmetic’ representation, the norm/expectation being some matrix expression, with ‘additions’ and ‘subtractions’ about that norm. Ingesting data flows from Publishers & Digitizers … … that can be parsed ‘volume by volume’ … but expect the operational definition of ‘volumes’ to differ, as the workflows for Publishers and Digitizers are not the same, and so their respective ‘units of work’ differ: The issue as is published; The bound volume as was digitized
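The ‘arithmetic’ idea can be sketched with sets. The norm is taken here to be a regular volume-by-issue matrix, additions are units outside that pattern (such as a supplement), and subtractions are units that are missing. This is one reading of the slide, not the representation the project adopted; the function name `extent` and the (volume, issue) tuple encoding are assumptions.

```python
from itertools import product


def extent(first_volume, last_volume, issues_per_volume, additions=(), subtractions=()):
    """Return the set of (volume, issue) units: norm, plus additions, minus subtractions."""
    norm = set(
        product(
            range(first_volume, last_volume + 1),
            range(1, issues_per_volume + 1),
        )
    )
    return (norm | set(additions)) - set(subtractions)


# Extent issued: volumes 1-3, four issues each, plus one supplement to volume 2.
issued = extent(1, 3, 4, additions={(2, "S1")})

# Extent held by a hypothetical keeper: same norm, but volume 2 issue 3 is missing.
held = extent(1, 3, 4, subtractions={(2, 3)})

# What was issued but is not held.
gaps = issued - held
```

Comparing extent issued with extent held then reduces to set difference, which is the kind of computation a free-text holdings string cannot support.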
Concluding thoughts … Our common task is to ensure ease and continuity of access Because the role of libraries, individually and collectively, as trusted keepers of scholarly information has been challenged by the new economics of the digital … Each Keeper needs to be sure about what it holds • on a (digital) shelf held with ‘archival intent’ … doing so in ways that all others can know who is keeping what? • publish that metadata so the machine can understand! That’s true for e-journal content, and probably true for both digitized journal content and also of print … …
hence interest in registries: peprs.org => thekeepers.org
THANK YOU Acknowledgements due to all members of the PEPRS Project Team, and in particular to Morag Macgregor for the software engineering And thanks again to Regina Reynolds for adding Expression to this Work [Manifestation/Item?] Contact details: EDINA@ed.ac.uk and p.burnhill@ed.ac.uk