Implementation of PREMIS in METS

Implementation of PREMIS in METS. Rebecca Guenther Sr. Networking & Standards Specialist, Library of Congress rgue@loc.gov PREMIS Implementation Fair San Francisco, CA October 7, 2009.


Presentation Transcript


  1. Implementation of PREMIS in METS Rebecca Guenther Sr. Networking & Standards Specialist, Library of Congress rgue@loc.gov PREMIS Implementation Fair San Francisco, CA October 7, 2009

  2. METS records the (possibly hierarchical) structure of digital objects, the names and locations of the files that comprise those objects, and the associated metadata • A METS document may be a unit of storage (e.g. OAIS AIP) or a transmission format (e.g. OAIS SIP or DIP) • METS is extensible and modular • METS uses the XML Schema facility for combining vocabularies from different Namespaces • The METS Editorial Board has endorsed PREMIS as an extension schema • Many institutions are trying to use PREMIS within the METS context

  3. Structure of a METS file

  4. OAIS, METS and PREMIS [Diagram: the parts of an OAIS Archival Information Package mapped to METS elements and extension schemas. OAIS terms shown: Archival Information Package, Packaging Information, Descriptive Information, Content Information, Data Object, Representation Information, Structure, Semantics, File formats, Preservation Description Information, Reference Information, Context Information, Provenance Information, Fixity Information. METS primary schema elements shown: <METS>, <dmdSec>, <amdSec>, <techMD>, <rightsMD>, <sourceMD>, <digiprovMD>, <fileGrp>, <file>, <mdRef>, <structMap>. Extension schemas shown: MODS, MARCXML, DC, metsRights, premis:rights, premis:event, premis:object, textMD, MIX.]

  5. METS extension schemas • “wrappers” or “sockets” where elements from other schemas can be plugged in • Provides extensibility • Uses the XML Schema facility for combining vocabularies from different Namespaces • Endorsed extension schemas: • Descriptive: MODS, DC, MARCXML • Technical metadata: MIX (image); textMD (text) • Preservation related: PREMIS

  6. Why do we need guidelines for using PREMIS with METS? • Contents of each information package may vary depending on its function within a repository • Need to determine how to include representation metadata and associate it with package components • PREMIS data entities (objects, events, rights, agents) do not map perfectly to METS categories for representation metadata (techMD, digiprovMD, rightsMD, sourceMD) • There are redundant elements between the two standards • Both have extensibility mechanisms • Flexibility of both standards requires implementation choices

  7. Development of Guidelines for Using PREMIS with METS for Exchange • PREMIS in METS Guidelines Working Group • Consists of PREMIS and METS experts • Focuses on the METS document as a mechanism of exchange of digital objects and their metadata (SIP or DIP) • Facilitates communication when internal requirements and technical environments vary • Tension between flexibility and being prescriptive to facilitate interoperability • Consider usage scenarios • If a SIP it may get unwrapped and stored in different structures • If a DIP it is converted from internal structures to PREMIS • A more liberal approach is possible for a SIP than a DIP • Establishing guidelines, a METS profile, and examples http://www.loc.gov/standards/premis/guidelines-premismets.pdf

  8. Implementation issues in using PREMIS with METS • Location of PREMIS metadata within METS documents • Whether to record elements redundantly if they occur in both PREMIS and METS • Relationship of different structural metadata mechanisms in PREMIS and METS • How to record PREMIS Agent entities in METS documents • Use of identifiers to link elements in PREMIS and METS • How to record elements that are also part of a format specific technical metadata schema (e.g. MIX)

  9. Some recommendations from Guidelines • METS sections • Use Object in techMD or digiprovMD • Use Event in digiprovMD • Use Rights in rightsMD • Use Agent in digiprovMD or rightsMD • PREMIS Container -- use only if keeping all PREMIS metadata together. Do not use if separating PREMIS metadata into different amdSec subelements • PREMIS and METS redundancies -- choosing which option to use is an implementation decision; document it in a profile, e.g. attributes of the METS <file> element (such as SIZE) versus subelements of <objectCharacteristics> in PREMIS (such as <size>)
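The placement recommendations on slide 9 amount to a small lookup table. The sketch below is only an illustration of that table; the names `RECOMMENDED_SECTIONS` and `placement_ok` are hypothetical and do not come from the guidelines or from any METS or PREMIS tool.

```python
# Recommended METS amdSec subsections for each PREMIS entity,
# transcribed from the slide 9 bullets. Keys and values are lowercased
# so that "digiProvMD" and "digiprovMD" compare equal.
RECOMMENDED_SECTIONS = {
    "object": {"techmd", "digiprovmd"},
    "event": {"digiprovmd"},
    "rights": {"rightsmd"},
    "agent": {"digiprovmd", "rightsmd"},
}


def placement_ok(premis_entity: str, mets_section: str) -> bool:
    """Return True if the entity sits in a section the slide recommends."""
    allowed = RECOMMENDED_SECTIONS.get(premis_entity.lower())
    if allowed is None:
        raise ValueError(f"unknown PREMIS entity: {premis_entity!r}")
    return mets_section.lower() in allowed


if __name__ == "__main__":
    for entity, section in [("event", "digiprovMD"), ("event", "techMD")]:
        print(entity, section, placement_ok(entity, section))
```

A check like this could run when a package is assembled, with the table itself recorded in the METS profile so that exchange partners see the same choices.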

  10. Recommendations (cont.) • Structural relationship elements -- use the METS structMap to record structural relationships; use PREMIS relationship elements to record preservation and derivation relationships, and structural relationships if desired • ID/IDREF and PREMIS identifier elements -- use the METS ID/IDREF mechanisms; best practices for these mechanisms apply • Use the PREMIS extensibility mechanism for format-specific technical metadata • Document decisions in METS profiles

  11. <fileSec><fileGrp>
        <file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"
              CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1">
          <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
        </file>
      </fileGrp></fileSec>
      <techMD ID="TMD1PREMIS">
        <mdWrap MDTYPE="PREMIS">
          <xmlData>
            <premis:object>
              <objectCharacteristics>
                <fixity>
                  <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
                  <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
                  <messageDigestOriginator>EchoDep</messageDigestOriginator>
                </fixity>
                <size>184302</size>
              </objectCharacteristics>
            </premis:object>
          </xmlData>
        </mdWrap>
      </techMD>
      Elements defined in both METS and PREMIS: • METS: CHECKSUM and CHECKSUMTYPE attributes of <file>; not repeatable • PREMIS: <fixity> also includes messageDigestOriginator; allows multiples
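When an implementation records size and checksum in both places, as on slide 11, the two copies can drift apart. The following is a minimal sketch of a consistency check using Python's standard `xml.etree.ElementTree`. It assumes a simplified document without XML namespaces (real METS and PREMIS documents are namespaced, so the element names would need qualifying), and `redundancy_mismatches` is a hypothetical helper, not part of any PREMIS or METS toolkit.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free version of the slide 11 fragment (an assumption
# made to keep the sketch short).
SAMPLE = """
<mets>
  <amdSec>
    <techMD ID="TMD1PREMIS">
      <mdWrap MDTYPE="PREMIS"><xmlData>
        <object>
          <objectCharacteristics>
            <fixity>
              <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
              <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
            </fixity>
            <size>184302</size>
          </objectCharacteristics>
        </object>
      </xmlData></mdWrap>
    </techMD>
  </amdSec>
  <fileSec><fileGrp>
    <file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS"
          CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf"
          CHECKSUMTYPE="SHA-1"/>
  </fileGrp></fileSec>
</mets>
"""


def redundancy_mismatches(root):
    """List (file ID, attribute) pairs where METS and PREMIS disagree."""
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    problems = []
    for f in root.iter("file"):
        # ADMID is a space-separated list of IDREFs into the amdSec.
        for ref in f.get("ADMID", "").split():
            section = by_id.get(ref)
            if section is None:
                continue
            chars = section.find(".//objectCharacteristics")
            if chars is None:
                continue
            size = chars.findtext("size")
            if f.get("SIZE") and size and f.get("SIZE") != size.strip():
                problems.append((f.get("ID"), "SIZE"))
            # PREMIS allows several fixity blocks; METS carries one checksum.
            digests = {
                (fx.findtext("messageDigestAlgorithm", "").strip(),
                 fx.findtext("messageDigest", "").strip().lower())
                for fx in chars.findall("fixity")
            }
            mets_digest = (f.get("CHECKSUMTYPE"), (f.get("CHECKSUM") or "").lower())
            if f.get("CHECKSUM") and digests and mets_digest not in digests:
                problems.append((f.get("ID"), "CHECKSUM"))
    return problems


mismatches = redundancy_mismatches(ET.fromstring(SAMPLE))
print(mismatches)
```

Because PREMIS permits multiple `<fixity>` blocks while the METS `CHECKSUM` attribute is not repeatable, the check only requires that the METS value match one of the PREMIS digests.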

  12. <fileSec><fileGrp>
        <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT" MIMETYPE="image/jpeg">
          <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
        </file>
      </fileGrp></fileSec>
      <techMD ID="TMD1PREMIS">
        <mdWrap MDTYPE="PREMIS">
          <xmlData>
            <premis:object>
              <objectCharacteristics>
                <format>
                  <formatDesignation>
                    <formatName>image/jpeg</formatName>
                    <formatVersion>1.02</formatVersion>
                  </formatDesignation>
                </format>
              </objectCharacteristics>
            </premis:object>
          </xmlData>
        </mdWrap>
      </techMD>
      Elements defined in both METS and PREMIS: • METS: MIMETYPE attribute of <file>; optional • PREMIS: <format> is more granular, with name and version (although the name may be a MIME type); mandatory

  13. <fileSec><fileGrp>
        <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"/>
      </fileGrp></fileSec>
      <techMD ID="TMD1PREMIS">
        <linkingEventIdentifier>
          <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
          <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
        </linkingEventIdentifier>
      </techMD>
      <digiprovMD ID="DP1EVENT">
        <premis:event>
          <eventIdentifier>
            <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
            <eventIdentifierValue>echo12345</eventIdentifierValue>
          </eventIdentifier>
          <eventType>ingestion</eventType>
          <eventDateTime>2006-05-02T15:12:53</eventDateTime>
        </premis:event>
      </digiprovMD>
      Elements defined in both METS and PREMIS: • METS ID/IDREF: used to associate metadata in different sections and for different files • PREMIS identifiers: explicit linking between entity types
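Slide 13 shows two linking mechanisms side by side: METS ID/IDREF (the `ADMID` attribute pointing at section `ID`s) and PREMIS identifiers (a `linkingEventIdentifier` matching an `eventIdentifier`). The sketch below resolves both on a simplified, namespace-free sample; the sample omits the `TMD1MIX` and `DP1AGENT` references because the slide does not show those sections, and the helper names `dangling_admids` and `events_for_objects` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free sample modelled on slide 13 (an assumption).
SAMPLE = """
<mets>
  <amdSec>
    <techMD ID="TMD1PREMIS"><mdWrap MDTYPE="PREMIS"><xmlData>
      <object>
        <linkingEventIdentifier>
          <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
          <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
        </linkingEventIdentifier>
      </object>
    </xmlData></mdWrap></techMD>
    <digiprovMD ID="DP1EVENT"><mdWrap MDTYPE="PREMIS"><xmlData>
      <event>
        <eventIdentifier>
          <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
          <eventIdentifierValue>echo12345</eventIdentifierValue>
        </eventIdentifier>
        <eventType>ingestion</eventType>
        <eventDateTime>2006-05-02T15:12:53</eventDateTime>
      </event>
    </xmlData></mdWrap></digiprovMD>
  </amdSec>
  <fileSec><fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT"/>
  </fileGrp></fileSec>
</mets>
"""


def dangling_admids(root):
    """METS side: ADMID references that point at no ID in the document."""
    ids = {el.get("ID") for el in root.iter() if el.get("ID")}
    return [
        (f.get("ID"), ref)
        for f in root.iter("file")
        for ref in f.get("ADMID", "").split()
        if ref not in ids
    ]


def events_for_objects(root):
    """PREMIS side: resolve each linkingEventIdentifier to an event type."""
    events = {}
    for ev in root.iter("event"):
        key = (
            ev.findtext("eventIdentifier/eventIdentifierType", "").strip(),
            ev.findtext("eventIdentifier/eventIdentifierValue", "").strip(),
        )
        events[key] = ev.findtext("eventType", "").strip()
    resolved = []
    for link in root.iter("linkingEventIdentifier"):
        key = (
            link.findtext("linkingEventIdentifierType", "").strip(),
            link.findtext("linkingEventIdentifierValue", "").strip(),
        )
        # None means the object links to an event that is not in the package.
        resolved.append((key[1], events.get(key)))
    return resolved


root = ET.fromstring(SAMPLE)
dangling = dangling_admids(root)
linked = events_for_objects(root)
print(dangling, linked)
```

The METS check is purely about document-internal references, while the PREMIS check matches on the identifier type and value pair, which is why the two mechanisms can coexist without conflicting.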

  14. <structMap TYPE="physical">
        <div ORDER="1" TYPE="text">
          <fptr FILEID="FID9"/>
          <div ORDER="1" TYPE="page" LABEL="Page [1]"><fptr FILEID="FID1"/></div>
          <div ORDER="2" TYPE="page" LABEL="Page [2]"><fptr FILEID="FID2"/></div>
        </div>
      </structMap>
      <relationship>
        <relationshipType>structural</relationshipType>
        <relationshipSubType>is sibling of</relationshipSubType>
        <relatedObjectIdentification>
          <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
          <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
          <relatedObjectSequence>1</relatedObjectSequence>
        </relatedObjectIdentification>
      </relationship>
      Elements defined in both METS and PREMIS: • METS: structMap details structural relationships and is the heart of the METS document; it is hierarchical, so may be more expressive than PREMIS semantic units; it links the elements of the structure to content files and metadata • PREMIS: <relationship> details all kinds of relationships, including structural; the data dictionary says that implementations may record these by other means
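To illustrate why the structMap can stand in for PREMIS structural relationships, the sketch below walks the slide 14 hierarchy and derives the "is sibling of" relationship from it instead of reading a `<relationship>` element. It assumes a namespace-free structMap; `walk` and `siblings_of` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Namespace-free copy of the slide 14 structMap (an assumption for brevity).
SAMPLE = """
<structMap TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL="Page [1]"><fptr FILEID="FID1"/></div>
    <div ORDER="2" TYPE="page" LABEL="Page [2]"><fptr FILEID="FID2"/></div>
  </div>
</structMap>
"""


def walk(div, depth=0):
    """Yield (depth, TYPE, LABEL, file IDs) for a div and its descendants."""
    files = [p.get("FILEID") for p in div.findall("fptr")]
    yield depth, div.get("TYPE"), div.get("LABEL"), files
    children = sorted(div.findall("div"), key=lambda d: int(d.get("ORDER", "0")))
    for child in children:
        yield from walk(child, depth + 1)


def siblings_of(root, file_id):
    """Files whose divs share a parent div with the div holding file_id."""
    for parent in root.iter("div"):
        ids = [
            p.get("FILEID")
            for child in parent.findall("div")
            for p in child.findall("fptr")
        ]
        if file_id in ids:
            return [i for i in ids if i != file_id]
    return []


root = ET.fromstring(SAMPLE)
outline = list(walk(root.find("div")))
siblings = siblings_of(root, "FID1")
print(outline)
print(siblings)
```

For `FID1` the derived sibling is `FID2`, the same object named in the PREMIS `<relationship>` on the slide, which is the redundancy the guidelines ask implementers to resolve and document.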

  15. Some METS profiles with PREMIS • UCSD simple and complex object • UC Berkeley • ECHO Dep Generic METS Profile for Preservation and Digital Repository Interoperability • LC Profile for Recorded Events • Australian METS Profile • TIPR • … many others

  16. Additional changes to Guidelines • Make extensibility mechanism consistent with METS • significantPropertiesExtension • objectCharacteristicsExtension • creatingApplicationExtension • environmentExtension • signatureInformationExtension • eventOutcomeDetailExtension • rightsExtension

  17. Additional changes to Guidelines (cont.) • Add the same elements and attributes as in METS to PREMIS extension elements in schema and data dictionary • mdRef, mdWrap • binData, xmlData • Attributes: ID, LABEL, MDTYPE, MIMETYPE, SIZE, CREATED, CHECKSUM, CHECKSUMTYPE • Allow URI or string for MDTYPE • Add use cases/examples to illustrate choices made • Clarify structural relationships

  18. Implementing an Exchange Standard • PREMIS Implementation Tool • Some tools documented on the PREMIS website http://www.loc.gov/standards/premis/tools_for_premis.php • PiM tool developed by Florida Center for Library Automation • Further work to generate metadata from digital files in PREMIS elements
