This presentation addresses the growing challenge of preserving vast amounts of digital data using the OAIS Reference Model. It covers functional entities, metadata, and ensuring the trustworthiness and longevity of repositories.
The OAIS Reference Model and Trustworthy Repositories
Josh Lubell
Manufacturing Engineering Laboratory, NIST
lubell@nist.gov
The problem
• Too much digital data!
  • It takes about 15 minutes for the world to churn out new digital information equivalent to the entire collection of the US Library of Congress
• Proprietary file formats
  • Expected lifetime of a typical manufacturing software application is only 3 years
• Short-lived computing hardware and software
  • Expected lifetime of today’s storage/retrieval technologies is only 10 years
• Products often outlive computer software/hardware by an order of magnitude
  • Aircraft can last 50 years or more
  • Healthcare records should be preserved through the patient’s lifetime, and perhaps beyond
It’s not just about preservation
• How will the repository be accessed in the future?
  • Reference, reuse, rationale?
  • Should drive present-day records management policies
• Is the repository trustworthy?
  • Organizational infrastructure
  • Digital object management
  • Technical infrastructure, security
What is OAIS?
• Reference model for an Open Archival Information System
• International standard: ISO 14721:2003
• Online at http://nost.gsfc.nasa.gov/isoas/
• Domain-general
• Implementation-agnostic
• Widely used
  • Digital libraries
  • Scientific data repositories
  • Product data engineering repositories
OAIS functional entities
• SIP = Submission Information Package
• AIP = Archival Information Package
• DIP = Dissemination Information Package
Information = Data + Interpretation
• Data Object (example: binary file)
• Representation Information (metadata) (example: definition of the PDF format)
• Information Object (example: electronic tech manual)
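To make the "Information = Data + Interpretation" idea concrete, here is a minimal sketch in which the same data object yields different information depending on the representation information applied to it. The function `interpret` and the representation labels are hypothetical names chosen for illustration; they are not part of OAIS.

```python
import struct

# Hypothetical data object: two raw bytes stored without any metadata.
data_object = b"\x41\x42"


def interpret(data: bytes, representation: str):
    """Apply illustrative representation information to a data object."""
    if representation == "ascii-text":
        # Treat each byte as an ASCII character.
        return data.decode("ascii")
    if representation == "uint16-big-endian":
        # Treat the two bytes as one unsigned 16-bit big-endian integer.
        return struct.unpack(">H", data)[0]
    raise ValueError(f"unknown representation: {representation}")


as_text = interpret(data_object, "ascii-text")
as_number = interpret(data_object, "uint16-big-endian")
print(as_text, as_number)
```

Without the representation information, a future consumer cannot tell which of the two readings was intended, which is why an archive has to preserve it alongside the bits.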
An information package
• Information Objects
  • Content Information
  • Preservation Description Information
    • Sub-categories: Reference, Provenance, Context, Fixity
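The structure of an information package can be sketched as a pair of data classes. This is only an illustration under stated assumptions: the class names, field types, and sample values (including the `urn:example:` identifier) are invented here and are not prescribed by the OAIS reference model.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PreservationDescriptionInformation:
    """The four PDI sub-categories (field types are assumptions)."""
    reference: str                                         # identifier for the content
    provenance: List[str] = field(default_factory=list)    # history, chain of custody
    context: List[str] = field(default_factory=list)       # relations to other information
    fixity: Dict[str, str] = field(default_factory=dict)   # algorithm name -> digest


@dataclass
class InformationPackage:
    """Content Information bundled with its PDI."""
    content_information: bytes
    pdi: PreservationDescriptionInformation


# Hypothetical example values, for illustration only.
content = b"example content bytes"
package = InformationPackage(
    content_information=content,
    pdi=PreservationDescriptionInformation(
        reference="urn:example:manual-0001",
        provenance=["created by producer", "ingested by archive"],
        context=["part of a hypothetical maintenance manual set"],
        fixity={"sha256": hashlib.sha256(content).hexdigest()},
    ),
)
print(package.pdi.reference, sorted(package.pdi.fixity))
```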
Preservation Description Information (PDI)
• Reference info identifies the content information using a unique ID or a bibliographic attribute
• Provenance info specifies history, including chain of custody, and guides consumers in judging trustworthiness
• Context info relates the content to other information outside the package
• Fixity info helps ensure authenticity using methods such as checksums or digital signatures
PDI is the part of the OAIS reference model most closely pertaining to identity management
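As an illustration of fixity information, the following sketch records a checksum at ingest time and verifies it later. SHA-256 and the function names `compute_fixity` and `verify_fixity` are assumptions made for this example; OAIS does not mandate a particular algorithm or API.

```python
import hashlib


def compute_fixity(content: bytes, algorithm: str = "sha256") -> str:
    """Return a hex digest to be stored as fixity information."""
    return hashlib.new(algorithm, content).hexdigest()


def verify_fixity(content: bytes, recorded_digest: str,
                  algorithm: str = "sha256") -> bool:
    """Check that content still matches the digest recorded earlier."""
    return compute_fixity(content, algorithm) == recorded_digest


# Hypothetical content; the digest is recorded when the package is created.
original = b"archived engineering record"
recorded = compute_fixity(original)

# Unchanged content verifies; altered content does not.
unchanged_ok = verify_fixity(original, recorded)
tampered_ok = verify_fixity(original + b"!", recorded)
print(unchanged_ok, tampered_ok)
```

A plain checksum detects accidental or unnoticed changes to the bits; a digital signature would additionally tie the digest to a signer, which this sketch does not attempt.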
Develop metrics for OAIS
• Goal is to measure how well a repository conforms to the OAIS reference model
• ISO working group addressing this
  • Starting point: Trustworthy Repositories Audit & Certification: Criteria and Checklist, Center for Research Libraries, URL: http://www.crl.edu/PDF/trac.pdf
• NIST MEL developing metrics specific to long-term management of product-related engineering data
  • Sustaining Engineering Informatics: Toward Methods and Metrics for Digital Curation, 3rd International Digital Curation Conference, December 2007, URL: http://tinyurl.com/45934n
Tailor OAIS to specific domains
• Emphasis
• Producer/consumer interfaces
• Metadata
• Functional entities
• Ingest
• Access
• Packaging
• PDI
• Packaging standards
• METS (Metadata Encoding and Transmission Standard)
• PREMIS (Preservation Metadata: Implementation Strategies) schemas