150 likes | 167 Views
IRs: towards preservation services. Steve Hitchcock Preserv Project Intelligence Agents Multimedia Group, School of Electronics and Computer Science (ECS), Southampton University JISC Repositories & Preservation Programme, New Projects Briefing London, 24-25 October 2006.
E N D
IRs: towards preservation services Steve Hitchcock Preserv Project Intelligence Agents Multimedia Group, School of Electronics and Computer Science (ECS), Southampton University JISC Repositories & Preservation Programme, New Projects Briefing London, 24-25 October 2006
Preserv: a rapid sketch Preserv is investigating preservation strategies for institutional repositories Repositories (or repository software) do not do preservation IR preservation must begin with IR policy Digital preservation is difficult and best managed by specialists – preservation service providers We don’t know what a preservation service provider looks like
IR software • EPrints (Southampton) • DSpace (MIT) • Fedora (Cornell, Virginia) These are among the most widely used applications for building IRs. It can be argued that these provide different degrees of support for preservation (in reverse order!).
IR software as preservation system “(N)o existing software application could serve on its own as a trustworthy preservation system. … Without the appropriate people, infrastructure, policies, and procedures, even the best preservation application cannot ensure preservation.” Fedora and the Preservation of University Records Project: Reports and Findings, Tufts University and Yale University, Final Narrative Report to NHPRC, September 27, 2006 http://dca.tufts.edu/features/nhprc/reports/index.html
Storage media Media refreshing Reformatting Backups and disaster recovery Environment Audit Security Preservation strategy Migration Emulation Technology preservation Records management, etc. Preservation actions From the JISC standards guidelines
Preservation as policy • IR preservation must start with policy • IR policy is concerned with all aspects of repository management: content strategy (mandates!), collection policy, rights, etc., see OpenDOAR Policies Tool http://opendoar.org/tools/en/policies.php • Preservation strategy will emerge from this analysis • BUT IRs can only know the requirements and scale of the preservation task with a fully formed policy • To fulfil institutional requirements, the policy needs institutional backing (a true IR)
Survey of repository policies Selected repository administrators invited to participate, based on availability and size of ROAR profile Original test sites for profiling and survey included Oxford University, e-Prints Soton, ECS EPrints (Soton) Series of questions, based on analysis of preservation metadata for Preserv model
Does the repository have any existing policy on preservation? Yes 1 No 18 Example policy http://www.rub.ruc.dk/rub/selvbetjening/projektbiblioteket_eng.shtml
Does the repository implement any preservation measures, internally or with external agents/services? Byte preservation Y 8 N 9 No reply 2 Transformation Y 3 N 14 No reply 2 Rendering Y 1 N 13 No reply 5 Emulation Y 0 N 14 No reply 5 Other: backup, mirroring, geographic cluster backup Partnerships: Sherpa-DP, MetaArchive NDIIPP, dissertation copies at German National Library
Does the repository have a policy on submission file formats? Y 11 N 4 No reply 4 • prefer PDF / DOC / PPT / HTML • recommend using PDF or HTML • PDF (Sherpa policy) • accept all formats, text documents should be at least be pdf preferred pdf/a • Use DSpace supported, known, and unknown formats (x3) • Rendering software must be free, i.e. Acrobat, text, PostScript, HTML
Format profiling using PRONOM and ROAR ROAR http://archives.eprints.org/
Preservation services: passive to active • Passive preservation (aka bitstream preservation) • Active preservation • Characterisation • Preservation planning • Preservation action
Moving to a new concept: distributed preservation services? • We need to look beyond the idea of a ‘black box’ preservation service • Services might be based on lightweight, interacting distributed Web services • Who will provide these services? • What coordination is required between services? Is that where client-facing service providers will emerge? • What services can the market sustain? See DPC Featured project interview: Preserv, 25 July 2006 http://www.dpconline.org/graphics/join/preserv.html
Summary: policy before preservation • Repositories are looking for guidance on preservation • Repositories embrace different institutional, cultural and social constraints that will shape policy, including preservation, when they get round to defining it! • Proposed a hierarchical (in terms of cost) series of preservation service models so repositories can choose which one suits • This approach may be superseded, or supplemented, by distributed preservation services