Keeping the pieces together: The Role of METS in the Preservation of Digital Content. Robin Wendler, Harvard University Library, January 16, 2005. [Men in crate looking up, photograph, ca. 1905. Harvard University Archives HUK 363 p (Fig. 8)]
Standards everywhere, but nothing there for me…
• As of 2000
  • A plethora of descriptive metadata standards
  • Emerging standards for digital conversion, BUT
  • No open standard for representing a digital object
    • Display and navigation
    • Archiving
    • Exchange / Transport
  • Digital repositories managing files, not objects
METS to the rescue
• Metadata Encoding and Transmission Standard
• An XML schema for encoding descriptive, administrative, and structural metadata about a digital object
• http://www.loc.gov/standards/mets/
• An initiative of the Digital Library Federation
• Development: METS Editorial Board
• Principal author: Jerome McDonough, NYU
• Website maintained by: Library of Congress
METS Basics
• METS provides a framework for
  • Content files
  • Metadata
  • Relationships
• Suitable for
  • Open Archival Information Systems
    • Archival information package (AIP)
    • Submission information package (SIP)
    • Dissemination information package (DIP)
  • Display and navigation of digital objects
  • Sharing of digital objects among libraries and archives
Structure of a METS File
METS
  metsHdr       Header describing the METS file itself
  fileSec       Inventory or manifest of component files
  dmdSec        Descriptive metadata
  amdSec        Administrative metadata: technical, source, rights, provenance
  structMap     Structure map: the heart of METS
  structLink    Structure map linking, i.e., hyperlinks
  behaviorSec*  Executable behaviors
* Not commonly used
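As a rough illustration of the layout above, the following Python sketch parses a heavily abbreviated METS document and lists its top-level sections. The sample document and the function name `top_level_sections` are made up for this example, and the sample is not validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical, heavily abbreviated METS document (illustrative only).
SKELETON = f"""<mets xmlns="{METS_NS}">
  <metsHdr CREATEDATE="2005-01-16T00:00:00"/>
  <dmdSec ID="D1"/>
  <amdSec ID="A1"/>
  <fileSec><fileGrp/></fileSec>
  <structMap><div LABEL="Book"/></structMap>
</mets>"""


def top_level_sections(xml_text):
    """Return the local names of the root element's children, in document order."""
    root = ET.fromstring(xml_text)
    # ElementTree tags look like "{namespace}local"; keep only the local part.
    return [child.tag.split("}", 1)[1] for child in root]


sections = top_level_sections(SKELETON)
print(sections)
```

The optional structLink and behaviorSec sections are simply left out of the sample.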
Structure Map
<div LABEL="Title page">
  <div LABEL="title page" ORDER="1" TYPE="…">
    <fptr FILEID="A"/>
  </div>
</div>
<div LABEL="Preface">
  <div LABEL="page i" ORDER="2" ORDERLABEL="i">
    <fptr FILEID="B"/>
  </div>
  <div LABEL="page ii" ORDER="3">
    <fptr FILEID="C"/>
  </div>
</div>
<div LABEL="Chapter 1">
  <div LABEL="page 1" ORDER="4">
    <fptr FILEID="D"/>
  </div>
  <div LABEL="page 2" ORDER="5">
    <fptr FILEID="E"/>
  </div>
  …
</div>
Resulting hierarchy: Title page; Preface (page i, page ii); Chapter 1 (page 1, page 2, …)
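One way to see how a structure map drives display and navigation is to walk it recursively. The sketch below is a minimal Python illustration: the sample structMap is a shortened stand-in for the slide's example, wrapped in an assumed top-level "Book" div, and `outline` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/"}

# Shortened stand-in for the slide's example; the outer "Book" div is an assumption.
STRUCT_MAP = """<structMap xmlns="http://www.loc.gov/METS/">
  <div LABEL="Book">
    <div LABEL="Preface">
      <div LABEL="page i" ORDER="2" ORDERLABEL="i"><fptr FILEID="B"/></div>
      <div LABEL="page ii" ORDER="3"><fptr FILEID="C"/></div>
    </div>
    <div LABEL="Chapter 1">
      <div LABEL="page 1" ORDER="4"><fptr FILEID="D"/></div>
    </div>
  </div>
</structMap>"""


def outline(div, depth=0):
    """Yield (depth, label, file_ids) for a div and its descendants, depth-first."""
    file_ids = [fptr.get("FILEID") for fptr in div.findall("m:fptr", NS)]
    yield depth, div.get("LABEL"), file_ids
    for child in div.findall("m:div", NS):
        yield from outline(child, depth + 1)


root_div = ET.fromstring(STRUCT_MAP).find("m:div", NS)
rows = list(outline(root_div))
for depth, label, file_ids in rows:
    # Indent each label by its nesting depth to mimic a table of contents.
    print("  " * depth + label, file_ids)
```

Each fptr carries only a FILEID; the file itself is described in the fileSec, which is why the structure map can stay small.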
Referring to Metadata
METS does not define descriptive or administrative metadata elements. dmdSec and amdSec are buckets or sockets where externally defined metadata can be supplied or referenced.
• Three techniques:
  • In-line XML
  • Wrapped base-64 encoded data
  • Pointers to external information (e.g., URNs, handles)
• METS Board endorses a range of recommended "extension schemas"
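The three techniques map onto METS's mdWrap/xmlData, mdWrap/binData, and mdRef elements. A small Python sketch that tells them apart might look like the following; the sample sections, the placeholder URN, and the helper name `technique` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/"}

# Hypothetical descriptive metadata sections, one per technique.
DOC = """<mets xmlns="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">
  <dmdSec ID="D1"><mdWrap MDTYPE="MODS"><xmlData/></mdWrap></dmdSec>
  <dmdSec ID="D2"><mdWrap MDTYPE="MARC"><binData>AAAA</binData></mdWrap></dmdSec>
  <dmdSec ID="D3"><mdRef LOCTYPE="URN" MDTYPE="MARC" xlink:href="urn:example:record"/></dmdSec>
</mets>"""


def technique(md_sec):
    """Classify how a metadata section supplies its metadata."""
    if md_sec.find("m:mdWrap/m:xmlData", NS) is not None:
        return "inline XML"
    if md_sec.find("m:mdWrap/m:binData", NS) is not None:
        return "wrapped base64"
    if md_sec.find("m:mdRef", NS) is not None:
        return "external pointer"
    return "empty"


root = ET.fromstring(DOC)
techniques = {sec.get("ID"): technique(sec) for sec in root.findall("m:dmdSec", NS)}
print(techniques)
```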
Use of MODS Extension Schema for Descriptive Metadata
Structure map (Book; Chapter 1; page 1, page 2…):
<div LABEL="Reports of the president and treasurer" DMDID="D1">
  <div LABEL="Chapter 1" DMDID="CH1">
    <div LABEL="page 1" ORDER="3"> <fptr FILEID="D"/> </div>
    <div LABEL="page 2" ORDER="4"> <fptr FILEID="E"/> </div>
    …
Descriptive metadata:
<dmdSec ID="D1">
  <mdWrap MDTYPE="MODS">
    <xmlData>
      <mods:mods xmlns:mods="http://www.loc.gov/mods/v3" xsi:schemaLocation="http://www.loc.gov/mods/v3 …">
        <mods:name>
          <mods:displayForm>Radcliffe College</mods:displayForm>
        </mods:name>
        <mods:titleInfo>
          <mods:title>Reports of the president and treasurer for...</mods:title>
        </mods:titleInfo>
      </mods:mods>
    </xmlData>
  </mdWrap>
Catalog record:
  <mdRef LOCTYPE="URL" MDTYPE="MARC" xlink:href="http://... BNI3165"/>
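The DMDID attribute is what ties a div in the structure map to its descriptive metadata. As a sketch of that lookup, the Python below follows a div's DMDID to the matching dmdSec and pulls out any MODS titles. The sample document is a cut-down, assumed version of the slide's example, and `titles_for` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/", "mods": "http://www.loc.gov/mods/v3"}

# Cut-down, assumed version of the slide's example.
DOC = """<mets xmlns="http://www.loc.gov/METS/" xmlns:mods="http://www.loc.gov/mods/v3">
  <dmdSec ID="D1">
    <mdWrap MDTYPE="MODS"><xmlData>
      <mods:mods><mods:titleInfo>
        <mods:title>Reports of the president and treasurer</mods:title>
      </mods:titleInfo></mods:mods>
    </xmlData></mdWrap>
  </dmdSec>
  <structMap>
    <div LABEL="Book" DMDID="D1">
      <div LABEL="Chapter 1"/>
    </div>
  </structMap>
</mets>"""


def titles_for(div, root):
    """Return MODS titles from every dmdSec referenced by the div's DMDID."""
    by_id = {sec.get("ID"): sec for sec in root.findall("m:dmdSec", NS)}
    found = []
    # DMDID may hold several space-separated IDs; a missing attribute yields none.
    for md_id in (div.get("DMDID") or "").split():
        sec = by_id.get(md_id)
        if sec is not None:
            found += [t.text.strip() for t in sec.findall(".//mods:title", NS) if t.text]
    return found


root = ET.fromstring(DOC)
book = root.find("m:structMap/m:div", NS)
chapter = book.find("m:div", NS)
book_titles = titles_for(book, root)
chapter_titles = titles_for(chapter, root)
print(book_titles, chapter_titles)
```

A div without a DMDID, like the chapter here, simply resolves to no titles; whether it should inherit its parent's metadata is a policy question the sketch does not address.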
Referring to Content Files
Digital content can exist inside or outside a METS file.
• Three techniques:
  • In-line XML
  • Wrapped base-64 encoded data
  • Pointers to external information (e.g., URNs, handles)
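For content files, METS expresses "inside" with an FContent element and "outside" with an FLocat pointer. The following Python sketch handles the base-64 and pointer cases only (in-line XML content is omitted for brevity); the file IDs, the example.org URL, and the helper name `locate` are invented for illustration.

```python
import base64
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Encode a short payload so the embedded example is known to be valid base64.
ENCODED = base64.b64encode(b"Hello, METS").decode("ascii")

# Hypothetical fileSec: file A is embedded, file B lives elsewhere.
FILE_SEC = f"""<fileSec xmlns="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileGrp>
    <file ID="A"><FContent><binData>{ENCODED}</binData></FContent></file>
    <file ID="B"><FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.jpg"/></file>
  </fileGrp>
</fileSec>"""


def locate(file_el):
    """Return ("embedded", bytes) or ("external", href) for a METS file element."""
    bin_data = file_el.find("m:FContent/m:binData", NS)
    if bin_data is not None:
        return "embedded", base64.b64decode(bin_data.text)
    flocat = file_el.find("m:FLocat", NS)
    if flocat is not None:
        return "external", flocat.get(XLINK_HREF)
    return "unknown", None


root = ET.fromstring(FILE_SEC)
located = {f.get("ID"): locate(f) for f in root.iter("{http://www.loc.gov/METS/}file")}
print(located)
```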
Use of MIX Extension Schema for Image Technical Metadata
Technical metadata:
<amdSec>
  <techMD ID="TMD01">
    <mdWrap MDTYPE="NISOIMG" LABEL="Service Copy Technical Metadata">
      <xmlData>
        <mix:mix>
          <mix:BasicImageParameters>
            <mix:Format>
              <mix:MIMEType>image/jpg</mix:MIMEType>
              <mix:ByteOrder>big-endian</mix:ByteOrder>
              …
File inventory:
<fileSec>
  <fileGrp>
    <file ID="D" ADMID="TMD01" MIMETYPE="image/jpg">
      <FLocat LOCTYPE="URL" xlink:href="http://nrs.harvard.edu..." />
    </file>
Structure map (Chapter 1; page 1, page 2…):
<div LABEL="Chapter 1">
  <div LABEL="page 1" ORDER="3"> <fptr FILEID="D"/> </div>
  <div LABEL="page 2" ORDER="4"> <fptr FILEID="E"/> </div>
  …
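The example above chains three references: a page div points to a file (FILEID), and the file points to its technical metadata (ADMID). A minimal Python sketch of following that chain is shown below. The sample document is an assumed, simplified version of the slide's, with an empty xmlData in place of the MIX record and a placeholder URL, and `tech_labels_for_page` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

M = "{http://www.loc.gov/METS/}"
NS = {"m": "http://www.loc.gov/METS/"}

# Assumed, simplified version of the slide's example (MIX content left out).
DOC = """<mets xmlns="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">
  <amdSec>
    <techMD ID="TMD01">
      <mdWrap MDTYPE="NISOIMG" LABEL="Service Copy Technical Metadata"><xmlData/></mdWrap>
    </techMD>
  </amdSec>
  <fileSec><fileGrp>
    <file ID="D" ADMID="TMD01">
      <FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.jpg"/>
    </file>
  </fileGrp></fileSec>
  <structMap>
    <div LABEL="Chapter 1">
      <div LABEL="page 1" ORDER="3"><fptr FILEID="D"/></div>
    </div>
  </structMap>
</mets>"""


def tech_labels_for_page(root, page_label):
    """Follow div -> fptr -> file -> ADMID -> techMD and collect mdWrap labels."""
    files = {f.get("ID"): f for f in root.iter(M + "file")}
    tech = {t.get("ID"): t for t in root.iter(M + "techMD")}
    labels = []
    for div in root.iter(M + "div"):
        if div.get("LABEL") != page_label:
            continue
        for fptr in div.findall("m:fptr", NS):
            file_el = files.get(fptr.get("FILEID"))
            if file_el is None:
                continue
            # ADMID may list several space-separated IDs.
            for adm_id in (file_el.get("ADMID") or "").split():
                wrap = tech[adm_id].find("m:mdWrap", NS) if adm_id in tech else None
                if wrap is not None:
                    labels.append(wrap.get("LABEL"))
    return labels


root = ET.fromstring(DOC)
labels = tech_labels_for_page(root, "page 1")
print(labels)
```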
Profiles
• Challenge:
  • METS is very flexible
  • Flexibility allows variant practices
  • Variant practices undermine interoperability
• Response:
  • Create profiles: documented ways of using METS
  • Profiles constrain practice (this is a good thing)
  • Specify required structures, extension schemas, vocabularies, etc.
  • Communities of interest develop and register shared profiles
  • XML schema for METS profiles
    • Human-readable, not machine-actionable
Adoption
Used by:
• Library of Congress
• California Digital Library
• Harvard University Library
• Oxford University
• MIT/DSpace
• National Library of Wales
• Stanford University Library
• Indiana University Library
• University of California, Berkeley
• University of Chicago Library
• University of Graz, Austria
• Florida Center for Library Automation
• Göttingen State and University Library
• OCLC Digital Archive
• RLG Cultural Materials
• Philadelphia Museum of Art
• University of Alberta
… among others
Benefits
• Modeling the whole object, not just files
  Open standard
+ XML encoding
+ Growing base of tools
----------------------------------
= Manageable, sharable, preservable digital objects
Thank You!
Robin Wendler
r_wendler@harvard.edu
Baker Library, Harvard Business School, Historical Collections TC6512.0001:6512