
Metadata Encoding and Transmission Standard overview – and case study

Metadata Encoding and Transmission Standard overview – and case study. Markus Enders, SUB Göttingen, enders@sub.uni-goettingen.de. METS overview: METS was derived from the "Making of America" format, generalizing it for use with other media types. Funded by the Digital Library Federation (DLF).


Presentation Transcript


  1. Metadata Encoding and Transmission Standard overview – and case study. Markus Enders, SUB Göttingen, enders@sub.uni-goettingen.de

  2. METS overview: METS was derived from the "Making of America" format; the aim was to generalize that format and use it for other media types. Funded by the Digital Library Federation (DLF). An Editorial Board steers the development and holds "METS Opening Days".

  3. METS overview: structMap and <div>. Central object; mandatory; nested <div> elements store the structure; multiple structures are possible (TYPE attribute). Example: <mets:div TYPE="Monograph" LABEL="From Hamburg to San Francisco" ORDER="1" ID="DMD1">
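
The nesting described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original talk: the chapter labels and the helper name `outline` are made up, and only the standard-library `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Reduced sample: the chapter divs and their labels are hypothetical.
SAMPLE = """
<mets:structMap xmlns:mets="http://www.loc.gov/METS/" TYPE="LOGICAL">
  <mets:div TYPE="Monograph" LABEL="From Hamburg to San Francisco" ORDER="1">
    <mets:div TYPE="Chapter" LABEL="Chapter 1" ORDER="1"/>
    <mets:div TYPE="Chapter" LABEL="Chapter 2" ORDER="2"/>
  </mets:div>
</mets:structMap>
"""


def outline(div, depth=0):
    """Return (depth, TYPE, LABEL) tuples for a div and all divs nested in it."""
    rows = [(depth, div.get("TYPE"), div.get("LABEL"))]
    for child in div.findall(f"{{{METS_NS}}}div"):
        rows.extend(outline(child, depth + 1))
    return rows


struct_map = ET.fromstring(SAMPLE)
root_div = struct_map.find(f"{{{METS_NS}}}div")
rows = outline(root_div)
for depth, div_type, label in rows:
    # Indent each entry by its nesting depth.
    print("  " * depth + f"{div_type}: {label}")
```

A second structure (for example a physical one) would simply be another `structMap` with a different TYPE value, walked the same way.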

  4. METS overview: structLink. structMap and <div>: central object; mandatory; nested <div> elements store the structure; multiple structures are possible (TYPE attribute). structLink: a link between two <div> elements from different <structMap> elements. Example: <mets:structLink> <mets:smLink xlink:from="div1" xlink:to="div2"/> </mets:structLink>

  5. METS overview: FileSec. Contains file groups (<fileGrp>, which can be nested); files (<file>) are contained in file groups; basic technical metadata is stored as attributes. <fptr> links from a <div> to one or more files, in parallel or in sequence, and can also link into a file.

  6. METS overview: metadata. Descriptive metadata vs. administrative metadata (techMD, digiProvMD, rightsMD, sourceMD); metadata can be embedded or referenced; XML or binary metadata; extension schemas used: MODS, DC, PREMIS, etc.; m:n relationship between metadata sections and <div> or <file> elements.

  7. METS overview (diagram): StructMap (<div>) and structLink; FileSec (<fileGrp>, <file>); Desc. MD (extension schema); Admin. MD (extension schema): techMD, digiProvMD, rightsMD, sourceMD.

  8. METS overview (diagram): METS Header; StructMap (<div>) and structLink; FileSec (<fileGrp>, <file>); Desc. MD (extension schema); Admin. MD (extension schema): techMD, digiProvMD, rightsMD, sourceMD.

  9. METS overview: how the linking works (in XML). XML IDs are used, so each target must have a unique ID: <mets:dmdSec ID="DMD1">. Metadata: DMDID and ADMID are of type IDREFS: <mets:div DMDID="DMD1 DMD2">. File pointer: <mets:fptr FILEID="FN10081"/>
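
As a rough sketch of this ID/IDREFS mechanism, the following Python snippet builds an index of all `ID` attributes and then resolves the references from a `<div>`. The surrounding document is a reduced, hypothetical fragment (empty `dmdSec` elements, a single file); only the attribute values come from the slide.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}

# Reduced fragment for illustration; a real METS file carries more content.
SAMPLE = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:dmdSec ID="DMD1"/>
  <mets:dmdSec ID="DMD2"/>
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="FN10081"/>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div DMDID="DMD1 DMD2">
      <mets:fptr FILEID="FN10081"/>
    </mets:div>
  </mets:structMap>
</mets:mets>
"""

root = ET.fromstring(SAMPLE)

# Every link target carries a unique ID, so one dictionary indexes them all.
by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}

div = root.find(".//mets:div", NS)

# DMDID is an IDREFS value: a whitespace-separated list of IDs.
dmd_sections = [by_id[ref] for ref in div.get("DMDID", "").split()]

# FILEID is a single IDREF pointing at a <file> element.
file_el = by_id[div.find("mets:fptr", NS).get("FILEID")]

print([s.get("ID") for s in dmd_sections], file_el.get("ID"))
```

Because `DMDID` can hold several IDs, one `<div>` can point at several metadata sections, and several `<div>` elements can point at the same section, which gives the m:n relationship mentioned on slide 6.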

  10. METS example (1), Digitization Centre: simple document model (single structure). Several content files per document (a single TIFF image per page); bibliographic metadata; logical structure for the document (table of contents); direct relationships between logical structure entities and content files.

  11. METS example (1), Digitization Centre: simple logical document model (diagram). Content files (<fileSec>): 00000001.tif to 00000008.tif. Logical structure (<structMap>): a Monograph containing several Chapters, each linked to its content files.

  12. METS example (1), Digitization Centre: simple logical document model (diagram). As on the previous slide, with metadata attached to the Monograph and to the logical structure.

  13. METS example (1), Digitization Centre: simple logical document model. Logical structure: <METS:structMap TYPE="LOGICAL"> <METS:div TYPE="Monograph" DMDID="dmdlog0001"> <METS:div TYPE="TitlePage" ID="log0002"> <METS:fptr FILEID="bitonal0001"/> </METS:div> <METS:div TYPE="Dedication" ID="log0003"> <METS:fptr FILEID="bitonal0002"/> </METS:div> ...... </METS:div> </METS:structMap>

  14. METS example (1) Digitization Centre Simple logical document model Metadata <METS:dmdSec ID="dmdlog0001"> <METS:mdWrap MDTYPE="MODS"> <METS:xmlData> <MODS:mods> ...... </MODS:mods> </METS:xmlData> </METS:mdWrap> </METS:dmdSec>

  15. METS example (1) Digitization Centre Simple logical document model ContentFiles <METS:fileSec> <METS:fileGrp> <METS:file ID="bitonal0001" MIMETYPE="image/tiff"> <METS:FLocat LOCTYPE="URL" xlink:href="file://./00000001.tif"/> </METS:file> </METS:fileGrp> </METS:fileSec>
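
Putting the pieces of this first example together, the sketch below resolves each logical `<div>` to the locations of its content files. It assumes a reduced document assembled from the slide fragments; the second file entry (`bitonal0002` pointing at `00000002.tif`) is a hypothetical addition so that both `fptr` references resolve.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Assembled from the slide fragments; the second <file> is an assumption.
SAMPLE = """
<METS:mets xmlns:METS="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <METS:fileSec>
    <METS:fileGrp>
      <METS:file ID="bitonal0001" MIMETYPE="image/tiff">
        <METS:FLocat LOCTYPE="URL" xlink:href="file://./00000001.tif"/>
      </METS:file>
      <METS:file ID="bitonal0002" MIMETYPE="image/tiff">
        <METS:FLocat LOCTYPE="URL" xlink:href="file://./00000002.tif"/>
      </METS:file>
    </METS:fileGrp>
  </METS:fileSec>
  <METS:structMap TYPE="LOGICAL">
    <METS:div TYPE="Monograph">
      <METS:div TYPE="TitlePage" ID="log0002">
        <METS:fptr FILEID="bitonal0001"/>
      </METS:div>
      <METS:div TYPE="Dedication" ID="log0003">
        <METS:fptr FILEID="bitonal0002"/>
      </METS:div>
    </METS:div>
  </METS:structMap>
</METS:mets>
"""

root = ET.fromstring(SAMPLE)

# Map each file ID to the location given in its FLocat element.
locations = {
    f.get("ID"): f.find("mets:FLocat", NS).get(XLINK_HREF)
    for f in root.iterfind(".//mets:file", NS)
}

# For every identified div, follow its file pointers to the file locations.
files_by_div = {
    div.get("ID"): [locations[p.get("FILEID")] for p in div.findall("mets:fptr", NS)]
    for div in root.iterfind(".//mets:structMap//mets:div[@ID]", NS)
}

for div_id, hrefs in files_by_div.items():
    print(div_id, hrefs)
```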

  16. METS example (2), Digitization Centre: document model with two structures. Logical structure (TOC); physical structure (bound book, page); relationships between the structures; every structure entity has its own metadata section; content files are linked to physical structure entities.

  17. METS example (2), Digitization Centre: document model with two structures (diagram). Logical structure: a Monograph containing Chapters. Physical structure: a Bound Book containing Pages and a page area. Content files: 00000001.tif to 00000008.tif, HiRes01.jpg, Fulltext.xml. Chapters are linked to pages, and pages to content files.

  18. METS example (2) Digitization Centre Document model with two structures Map two structures <METS:structMap TYPE="LOGICAL"> <METS:div TYPE="Monograph" ID="log0001" DMDID="dmdlog0001"/> </METS:structMap> <METS:structMap TYPE="PHYSICAL"> <METS:div TYPE="BoundBook" ID="phys0001" DMDID="dmdphys0001"> <METS:div TYPE="page" ID="phys0002" DMDID="dmdphys0002"/> </METS:div> </METS:structMap>

  19. METS example (2), Digitization Centre: document model with two structures. Map two structures: <METS:structLink TYPE="xxx"> <!-- Monograph --> <METS:smLink xlink:from="log0001" xlink:to="phys0001"/> <!-- title page --> <METS:smLink xlink:from="log0002" xlink:to="phys0002"/> </METS:structLink>
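
A minimal sketch of how such a mapping could be read back, assuming the xlink-qualified `from`/`to` attributes shown on slide 4. The document is a reduced fragment; the `TitlePage` div with ID `log0002` is taken from the first example and is an assumption here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {"mets": "http://www.loc.gov/METS/"}
XLINK = "http://www.w3.org/1999/xlink"

SAMPLE = """
<METS:mets xmlns:METS="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <METS:structMap TYPE="LOGICAL">
    <METS:div TYPE="Monograph" ID="log0001">
      <METS:div TYPE="TitlePage" ID="log0002"/>
    </METS:div>
  </METS:structMap>
  <METS:structMap TYPE="PHYSICAL">
    <METS:div TYPE="BoundBook" ID="phys0001">
      <METS:div TYPE="page" ID="phys0002"/>
    </METS:div>
  </METS:structMap>
  <METS:structLink>
    <METS:smLink xlink:from="log0001" xlink:to="phys0001"/>
    <METS:smLink xlink:from="log0002" xlink:to="phys0002"/>
  </METS:structLink>
</METS:mets>
"""

root = ET.fromstring(SAMPLE)

# Remember the TYPE of every div so links can be printed readably.
types = {d.get("ID"): d.get("TYPE") for d in root.iterfind(".//mets:div", NS)}

# One logical div may be linked to several physical divs, hence a list per source.
links = defaultdict(list)
for sm in root.iterfind("mets:structLink/mets:smLink", NS):
    links[sm.get(f"{{{XLINK}}}from")].append(sm.get(f"{{{XLINK}}}to"))

for source, targets in links.items():
    for target in targets:
        print(f"{types[source]} ({source}) -> {types[target]} ({target})")
```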

  20. METS example (2) Digitization Centre Document model with two structures Link to several files <METS:div TYPE="page" ID="phys0002" DMDID="dmdphys0002"> <METS:fptr FILEID="bitonal0001"/> <METS:fptr FILEID="hires0001"/> </METS:div> Link to page area <METS:div TYPE="column" ID="phys0003" DMDID="dmdphys0002"> <METS:fptr> <METS:area FILEID="bitonal00000001" COORDS="40x40x150x250"/> </METS:fptr> </METS:div>

  21. METS example (2), Digitization Centre: document model with two structures (diagram repeated from slide 17: logical structure, physical structure, content files).

  22. METS example (2), Digitization Centre: document model with two structures. Link to fulltext (TEI): <METS:div TYPE="page"> <METS:fptr> <METS:area FILEID="teixml01" BEGIN="xx02" END="xx02" BETYPE="IDREF"/> </METS:fptr> </METS:div> <METS:div TYPE="page"> <METS:fptr> <METS:area FILEID="teixml01" BEGIN="xx02" END="xx02" BETYPE="IDREF"/> </METS:fptr> </METS:div> Fulltext: <TEI:p> <TEI:q id="xx01">....</TEI:q> <TEI:q id="xx02">....</TEI:q> <TEI:pb n="13"/> <TEI:q id="xx03">...</TEI:q> </TEI:p>
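
One way to read such an `<area>` reference is sketched below: with `BETYPE="IDREF"`, `BEGIN` and `END` are taken to name elements in the fulltext file, and everything from the first to the last named element (in document order) belongs to the page. The helper `text_range` is hypothetical, the TEI namespace is dropped for brevity, and the quote texts are invented placeholders for the elided content on the slide.

```python
import xml.etree.ElementTree as ET

# Simplified fulltext: no namespace, made-up text content.
TEI = """
<p>
  <q id="xx01">first quote</q>
  <q id="xx02">second quote</q>
  <pb n="13"/>
  <q id="xx03">third quote</q>
</p>
"""


def text_range(root, begin, end):
    """Collect the text of elements from id=begin through id=end, in document order."""
    collected, inside = [], False
    for el in root.iter():
        if el.get("id") == begin:
            inside = True
        if inside and el.text and el.text.strip():
            collected.append(el.text.strip())
        if inside and el.get("id") == end:
            break
    return collected


fulltext = ET.fromstring(TEI)

# The slide's reference: BEGIN and END both name the element xx02.
print(text_range(fulltext, "xx02", "xx02"))

# A wider range that crosses the page break element.
print(text_range(fulltext, "xx01", "xx03"))
```

How exactly BEGIN and END are to be interpreted is one of the points a METS profile should pin down, as slide 29 notes.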

  23. METS example (2), Digitization Centre: document model with two structures. The fulltext is referenced, not embedded in the METS file, because of file sizes: the METS file is about 2 – 3 MB, the fulltext about 20 MB. MODS is used for the descriptive metadata of logical structure entities. A custom descriptive metadata schema is used for physical structure entities; it stores page numbers.

  24. METS example (2), Digitization Centre: why did the GDZ choose METS? It is easily extendable: one may start with image digitization and add fulltext later. A complex structure needs to be stored. The fulltext format is not flexible enough: (1) TEI knows only one kind of structure (logical) and does not know any pages (just page breaks); (2) it has no extensive metadata model. Therefore the fulltext files need to be linked to a METS file.

  25. METS creation: by hand in an XML editor (structMap is the only required object); special tools for certain purposes, e.g. conversion tools for web archiving, ... At GDZ: the GOOBI workflow management tool. To do: a general METS API which implements the data model.
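
Besides an XML editor, a small script can also produce a skeleton file. The sketch below builds a document containing only a structMap, which the slide names as the only required object. It uses the Python standard library, not the GDZ tooling; the helper `q` and the chosen div types are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Serialize the METS namespace with the conventional "mets" prefix.
ET.register_namespace("mets", METS_NS)


def q(name):
    """Return the namespace-qualified tag name for a METS element."""
    return f"{{{METS_NS}}}{name}"


mets = ET.Element(q("mets"))
struct_map = ET.SubElement(mets, q("structMap"), {"TYPE": "LOGICAL"})
monograph = ET.SubElement(struct_map, q("div"), {"TYPE": "Monograph"})
ET.SubElement(monograph, q("div"), {"TYPE": "Chapter", "ORDER": "1"})

xml_text = ET.tostring(mets, encoding="unicode")
print(xml_text)
```

Metadata sections and a fileSec can be added to the same tree later, which matches the "start with images, add fulltext later" approach described on slide 24.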

  26. METS presentation: Depends on your METS file: - simple XSLT transformations - repository systems (ContentDM, Fedora etc.) - some page turners available (for digitized content)

  27. METS-Profile Documentation. Documentation is necessary: describe the objects and relationships in your document model. What objects are available? What metadata are attached to those objects? How are objects related to each other (trees)? How is an unambiguous order stored? Are there non-hierarchical relationships between objects? Which content files are available, and at what granularity can they be accessed?

  28. METS-Profile Documentation. Documentation should not describe a format generally, but the precise usage of a packaging format. Example: how are relationships inherited between two structure trees? (Diagram: a Chapter linked to Pages; a Page containing a Column.) Does the column need to be linked to the chapter directly, or is an indirect link sufficient?
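
To make the question concrete, here is a sketch of one possible answer: a column without its own link inherits the link of its nearest linked ancestor. This rule is only an assumed profile decision for illustration, not something METS prescribes; the IDs and the helper `linked_logical` are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}
XLINK = "http://www.w3.org/1999/xlink"

# Only the page is linked to the chapter; the column has no link of its own.
SAMPLE = """
<METS:mets xmlns:METS="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <METS:structMap TYPE="LOGICAL">
    <METS:div TYPE="Chapter" ID="log1"/>
  </METS:structMap>
  <METS:structMap TYPE="PHYSICAL">
    <METS:div TYPE="page" ID="phys1">
      <METS:div TYPE="column" ID="phys2"/>
    </METS:div>
  </METS:structMap>
  <METS:structLink>
    <METS:smLink xlink:from="log1" xlink:to="phys1"/>
  </METS:structLink>
</METS:mets>
"""

root = ET.fromstring(SAMPLE)

# ElementTree has no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}
by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}

# Direct links, keyed by the physical (target) div ID.
direct = {}
for sm in root.iterfind("mets:structLink/mets:smLink", NS):
    direct.setdefault(sm.get(f"{{{XLINK}}}to"), []).append(sm.get(f"{{{XLINK}}}from"))


def linked_logical(phys_id):
    """Return logical div IDs linked to phys_id, falling back to its ancestors."""
    el = by_id[phys_id]
    while el is not None:
        ids = direct.get(el.get("ID"))
        if ids:
            return ids
        el = parent_of.get(el)
    return []


print(linked_logical("phys2"))  # the column, resolved via its parent page
```

If a profile instead requires direct links, the fallback loop disappears and every column needs its own `smLink`; the documentation has to state which of the two applies.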

  29. METS-Profile Documentation Documentation should not describe a format generally, but the precise usage of a packaging format. Examples: How to link into fulltext files? Usage of BEGIN and END attributes How to store the order of <div> elements? What kind of <div> elements are available?

  30. METS-Profile Documentation. A METS Profile describes the usage of METS for a special scenario: what extension schemas are used? What authority files? Usage of attributes and elements. A METS-profile schema is available; a profile is an XML file, which is not machine readable. A "registry" is available on the METS website.

  31. http://www.loc.gov/mets
