BNSC Report Fall 2007

  1. BNSC Report Fall 2007 David Giaretta

  2. CASPAR Consortium Integrated project Total spend 16MEuro http://www.casparpreserves.eu

  3. …CASPAR • Strongly based on OAIS • Passed 1st year EU review

  4. CASPAR Aims • Produce tools and techniques to support digital preservation and make it easier to share the cost • must be relatively easy to use • must have a low “buy-in” in terms of effort required for adoption • must avoid requiring wholesale change of everyone else’s systems • must be decentralised and reproducible so that it can live on after the formal end of the CASPAR project • must be “preservable” • must be open: open source, open standards • Cannot do everything • Working closely with other projects

  5. Validation • How can we judge any proposed solution? • CASPAR validation metrics: • Theoretical underpinning • Testbed scenarios addressing real issues • No “hand-waving” – use what is there now • Accelerated lifetime tests • Hardware and Software • Environment • People • Improved “trustability”/“certifiability” • Live a long time • Evidence - not proof

  6. CASPAR information flow architecture (diagram labels: RepInfo, Virtualisation)

  7. Infrastructure elements (diagram labels: Orchestration, Gap Manager, RegRep, Data Curator, User, RepInfo toolkit, Data Source, Application, Repository, Registry)

  8. Preservation Aware Storage and Preservation DataStores • Preservation Aware Storage - The storage component of a digital preservation system that has built-in support for both bit preservation and logical preservation. • Preservation DataStores (PDS) is a new OAIS-based preservation-aware storage system. It offloads functionality to the storage layer • Decrease the probability of data loss • Simplify the applications • Provide improved performance and robustness • Utilize locality properties • Compute data-intensive functions internally, e.g. fixity • Provide better support for links among objects
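
The fixity computation named on this slide can be illustrated with a minimal sketch. It assumes fixity means a cryptographic checksum over an object's bytes; the function names `compute_fixity` and `verify_fixity` and the choice of SHA-256 are hypothetical and are not taken from the PDS prototype.

```python
import hashlib
import io


def compute_fixity(stream, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a binary stream, read in chunks.

    Hypothetical helper: a storage layer could run this next to the data
    instead of shipping every byte to the application.
    """
    digest = hashlib.new(algorithm)
    # Read fixed-size chunks until the stream returns an empty bytes object.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(stream, expected, algorithm="sha256"):
    """Return True when the stream's digest matches the recorded value."""
    return compute_fixity(stream, algorithm) == expected


if __name__ == "__main__":
    payload = b"archival content"
    recorded = compute_fixity(io.BytesIO(payload))
    print(verify_fixity(io.BytesIO(payload), recorded))
```

Reading in chunks keeps memory use constant for large objects, which is one reason such a check is a natural candidate for offloading to the storage layer.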

  9. Preservation Aware Storage Functionality

  10. Preservation Aware Storage Functionality (Cont.)

  11. PDS Architecture • Layered approach • Prototype based on open standards • OAIS, XAM, OSD • Generic gradual mapping from logical to physical object • Independent of physical storage • Independent of stored data type • Scalable (diagram labels: Applications; Preservation Web Services; Ingest, Access, Administration, …; AIP; Preservation DataStore; Preservation Engine Layer; XAM Layer; Object/File Layer backend)

  12. PDS Architecture (diagram labels: Preservation Web Services; AIP; Preservation Engine Layer; Preservation DataStore; RepInfo Mgr; PDI Mgr; Preservation WSDL; Migration Mgr; Placement Mgr; Ingest, Access, Administration, …; Preservation Engine; Applications; XAM API; XAM Layer; XAM Library; VIM API; XAM to FS; XAM to OSD; WAS CE; posix I/O; sockets; File System; HL OSD + Object Store; web service; Security; Admin; HL OSD; Object Layer backend)

  13. Preservation DataStores • Preservation DataStores are OAIS-based preservation aware storage • API covers different options for ingest and access, supports policy configuration, and enables updates of AIPs and PDS code • Prototype implements mainly ingest and access using web services • References • “Towards OAIS-Based Preservation Aware Storage - A White Paper”. • http://www.haifa.il.ibm.com/projects/storage/datastores/public.html • “The Need for Preservation Aware Storage - A Position Paper”. • ACM SIGOPS Operating Systems Review, Special Issue on File and Storage Systems, Volume 41, Issue 1 (Jan 2007), pp 19-23. • “Preservation DataStores: Architecture for Preservation Aware Storage”, to appear in 24th IEEE Conference on Mass Storage Systems and Technologies (MSST), 2007. • Web site - http://www.haifa.il.ibm.com/projects/storage/datastores/index.html

  14. Virtualisation - building up data types… (diagram labels: Data Value, Vector, Image, 3-D data, Spectrum, Earth Observation image, Astronomical image, Time Series)

  15. Content dependent components • Representation Information tools • Structure • EAST • DRB • DFDL • Virtualisation assistant • Semantics • RDF editors • RDFSuite • Terminology capture • Software • UVC • Hardware emulators • Trust, Authenticity & Provenance tools • Certification assistant • PREMIS • Packaging tools • XFDU toolkit • Use existing tools where applicable • Develop new tools as needed and as resources allow

  16. Strawman Architecture…

  17. …CASPAR Architecture Overview

  18. CASPAR meets OAIS - 2

  19. OAIS Information Model and CASPAR API

  20. OAIS Information Model Capture in UML diagrams • Add “obvious” methods • get/set for sub-components e.g. we know AIP has PDI so need get/setPDI • Add “best guess” methods • Iterators over contents • May need to change
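
The approach on this slide can be sketched roughly as follows. The class and method names are illustrative assumptions only and are not the CASPAR API; the sketch relies solely on the OAIS facts that an AIP holds Content Information plus PDI, and that Content Information pairs a Data Object with Representation Information.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class PDI:
    """Preservation Description Information (fields here are illustrative)."""
    provenance: str = ""
    fixity: str = ""


@dataclass
class ContentInformation:
    """A Data Object together with its Representation Information."""
    data_object: bytes = b""
    representation_info: List[str] = field(default_factory=list)


class AIP:
    """Archival Information Package: Content Information plus PDI."""

    def __init__(self, content: ContentInformation, pdi: Optional[PDI] = None):
        self._content = content
        self._pdi = pdi

    # "Obvious" methods: an AIP has PDI, so it needs get/set for it.
    def get_pdi(self) -> Optional[PDI]:
        return self._pdi

    def set_pdi(self, pdi: PDI) -> None:
        self._pdi = pdi

    def get_content_information(self) -> ContentInformation:
        return self._content

    # "Best guess" method: iterate over whatever sub-components are present.
    def __iter__(self) -> Iterator[object]:
        yield self._content
        if self._pdi is not None:
            yield self._pdi


if __name__ == "__main__":
    aip = AIP(ContentInformation(b"abc", ["ASCII text"]))
    aip.set_pdi(PDI(provenance="ingested 2007"))
    for component in aip:
        print(type(component).__name__)
```

Keeping the accessors thin mirrors the slide's caveat that the "best guess" methods may need to change: only the iterator encodes an assumption about how callers traverse a package.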

  21. Summary • The Conceptual Model is based on OAIS and works out some implications • It suggests areas of research • Intelligibility • Structure • Virtualisation • Authenticity • It leads into the Architecture, which is • Broadly applicable • Useful not just for preservation but also for interoperability • Note - Registry/Repository of Representation Information • http://registry.casparpreserves.eu • http://registry.dcc.ac.uk

  22. Digital Curation Centre • DCC Development closely linked to CASPAR • Other linked JISC funded projects: • SCARP • Significant properties of software • …may be others

  23. Audit and Certification

  24. The need for Trustable Repositories • Task Force on Archiving of Digital Information (1996) declared, • “a critical component of digital archiving infrastructure is the existence of a sufficient number of trusted organizations capable of storing, migrating, and providing access to digital collections.” • “a process of certification for digital archives is needed to create an overall climate of trust about the prospects of preserving digital information.” • A recurring request in many subsequent studies and workshops

  25. Trusted Digital Repositories • Invited group, hosted by Research Libraries Group (RLG) • Concerned with organisational and financial issues • Trusted Digital Repositories: Attributes and Responsibilities (TDR) • http://www.rlg.org/legacy/longterm/repositories.pdf

  26. Critique of TRAC • Closed process • Single review of draft document • Many changes based on unpublished “test audits” • Underplays “understandability” • Important for data • Assumed not to be important for “documents” • Simple list – • Do ALL boxes have to be ticked? • What does a “tick” mean anyway? • Link to other standards • ISO 17799/27001 for security (overlap with TRAC section C) • ISO 9000 – say what you do and do what you say • but impractical to demand multiple independent audits

  27. ISO process status • New group set up with the primary aim of producing an ISO standard • Repository Audit and Certification (RAC) • OPEN process • Wiki open to all • Mailing list open to all • Virtual meetings normally every week • See http://wiki.digitalrepositoryauditandcertification.org • Into ISO via CCSDS – same route as OAIS • Some organisational/procedural changes in CCSDS • Currently a Birds of a Feather (BoF) group • To demonstrate adequate support for the work • Subsequently should become a Working Group • Documents agreed by the WG will then be reviewed by CCSDS and more broadly via international ISO review process

  28. Current status • Reviewing and comparing • TRAC • NESTOR • DCC documents • Do we need another ISO standard? • Could we simply add to existing standards, e.g. ISO 27001? • The view is that ISO 27001 CANNOT be modified adequately • Its view of Information is too limited • Started drafting a straw man document • Taking TRAC and adding concepts from other docs

  29. Key Issues • How to get from a checklist to an international accreditation/certification system? • Evidence – short term • Evidence – long term • The real crunch! • Quantification • The marking system • Levels of audit? • External review • Internal maturity

  30. The Market • Transparency • Trustable? • certified by whom? • to what level? • what evidence? • for what Designated Community • relevant/sensible? • What cost?

  31. Links • RAC group Wiki: • http://wiki.digitalrepositoryauditandcertifiation.org • TRAC document • http://www.crl.edu/PDF/trac.pdf • Digital Curation Centre • http://www.dcc.ac.uk • CASPAR project • EU project on digital preservation – Science, Culture and Arts data • Infrastructure, tools and detailed case studies – what does one need to actually “understand” the data? • http://www.casparpreserves.eu

  32. Alliance for Permanent Access • Members: • Science and Technology Facilities Council • Koninklijke Bibliotheek • Deutsche Nationalbibliothek • Max Planck Gesellschaft • International Association of Scientific, Technical and Medical Publishers • European Space Agency, ESRIN • Fernuniversität in Hagen • European Organization for Nuclear Research • Georg-August-Universität Göttingen Stiftung Öffentlichen Rechts • European Science Foundation • Centre National d’Etudes Spatiales • Centre Informatique National de l’Enseignement Supérieur • UK Joint Information Systems Committee • British Library • National Archives of Sweden

  33. Alliance status • First stage – fairly informal sign-up • Preparing for Conference in Nov • More formal framework next year

  34. PARSE bid • Consortium is a sub-group of the Alliance • EU bid • Aims at E-Infrastructure for Preservation • Roadmap • Survey of what is in place and planned • Gap Analysis • Impact Analysis tool

  35. Other opportunities • NSF solicitation, entitled Sustainable Digital Data Preservation and Access Network Partners (DataNet) • http://www.nsf.gov/pubs/2007/nsf07601/nsf07601.pdf • informational meeting for prospective Principal Investigators will be held 10 am to noon, Tuesday, November 6, 2007, Room 595 NSF Stafford II building, Arlington, Virginia. • www.nsf.gov/dir/index.jsp?org=OCI
