
METS Dissemination: Interfaces

METS Opening Day, 28 October 2003. Leslie Myrick.

Presentation Transcript


  1. METS Dissemination: Interfaces METS Opening Day 28 October, 2003 Leslie Myrick

  2. NYU Collections using METS Interfaces • EAD Finding Aids • Tokyo Tribunal Proceedings • Afghanistan Digital Library * • CRL Web Archiving Project • DRAM • Hemispheric Institute • REPO History Sign Project

  3. Ingredients • Tomcat Servlet Engine • XSLServlet or SaxonServlet • XT or Saxon Transformation Engine • MySQL Database for generation • Perl DBI and CGI for interface to DB

  4. Why XSLT? • Relatively simple • Open-source, platform-neutral, standards-based • Official Recommendation of W3C • It is XML

  5. Free XSLT Tools Abound • Editors: emacs, NoteTab + Xalan .bat • Servlet Containers: Tomcat, Resin • Transformation Engines: Xalan, Saxon, XT • Parsers: Xerces, Aelfred, XP/Sax, Crimson • Parsing APIs: DOM, SAX

  6. METS as a Functional Syntax • METS is designed not only for transfer and archival management, but also for giving access to and navigating an object • METS + XSLT can create dynamic interfaces with links to resources and their metadata • METS can be dumped into Oracle, indexed and searched using context-aware queries

  7. How to Navigate a METS Document • ID, IDREF, IDREFS • Each IDREF must match an ID in the document • To reference more than one ID from a single attribute, use IDREFS (e.g. multiple ADMID values in METS:file) • Keys • More flexible; they turn the document into a database

  8. ID, IDREF, IDREFS • Provide navigable relationships between files and their metadata in a complex schema such as METS • Must be defined in the Schema or DTD • Restrictive: an element can have only one ID; ID values must be unique across the document (e.g. an authorID and an artistID can’t have the same value)
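The ID/IDREFS relationship described in slides 7 and 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the METS fragment is a hypothetical, heavily trimmed sample, the helper name `resolve_idrefs` is invented here, and because ElementTree does not read the schema, any attribute literally named `ID` is treated as an ID.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical, heavily trimmed METS fragment (not schema-complete).
SAMPLE = """<METS:mets xmlns:METS="http://www.loc.gov/METS/">
  <METS:amdSec>
    <METS:techMD ID="TMD1"/>
    <METS:rightsMD ID="RMD1"/>
  </METS:amdSec>
  <METS:fileSec>
    <METS:fileGrp>
      <METS:file ID="F1" ADMID="TMD1 RMD1"/>
    </METS:fileGrp>
  </METS:fileSec>
</METS:mets>"""


def resolve_idrefs(root, idrefs):
    """Return the elements whose ID matches each token of an IDREFS string."""
    # Assumption: every attribute named "ID" is an XML ID.
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID") is not None}
    # IDREFS is a whitespace-separated list of ID values.
    return [by_id[token] for token in idrefs.split()]


root = ET.fromstring(SAMPLE)
file_el = root.find(f".//{{{METS_NS}}}file")
targets = resolve_idrefs(root, file_el.get("ADMID"))
local_names = [el.tag.split("}")[1] for el in targets]
print(local_names)
```

A dangling reference raises `KeyError` here, which loosely mirrors the validity rule that every IDREF must point at an existing ID.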

  9. Keys; the key() function • Creates an index • Defined in the stylesheet and not in the DTD/Schema • Flexible – many keys on one element: one for each attribute. • Any number of elements can match a given value
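In XSLT a key is declared with `<xsl:key name="..." match="..." use="..."/>` and queried with `key(name, value)`. The Python sketch below is an analogy rather than XSLT itself: it builds the same kind of index as a dictionary, with two separate "keys" over the same `METS:file` elements. The sample fragment, the attribute choices and the helper name `build_key` are assumptions made for illustration.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Hypothetical METS fragment for illustration only.
SAMPLE = """<METS:mets xmlns:METS="http://www.loc.gov/METS/">
  <METS:fileSec>
    <METS:fileGrp>
      <METS:file ID="F1" MIMETYPE="image/jpeg" GROUPID="G1"/>
      <METS:file ID="F2" MIMETYPE="image/tiff" GROUPID="G1"/>
      <METS:file ID="F3" MIMETYPE="image/jpeg" GROUPID="G2"/>
    </METS:fileGrp>
  </METS:fileSec>
</METS:mets>"""

FILE_TAG = "{http://www.loc.gov/METS/}file"


def build_key(root, tag, attribute):
    """Index elements with the given tag by the value of one attribute."""
    index = defaultdict(list)
    for el in root.iter(tag):
        value = el.get(attribute)
        if value is not None:
            index[value].append(el)  # many elements may share one value
    return index


root = ET.fromstring(SAMPLE)
# Two independent keys on the same element, one per attribute.
by_mimetype = build_key(root, FILE_TAG, "MIMETYPE")
by_group = build_key(root, FILE_TAG, "GROUPID")

jpeg_ids = [el.get("ID") for el in by_mimetype["image/jpeg"]]
group1_ids = [el.get("ID") for el in by_group["G1"]]
print(jpeg_ids, group1_ids)
```

Unlike an ID, a key value need not be unique: both `F1` and `F3` are returned for `image/jpeg`.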

  10. Uses for METS • From the humble Finding Aid … to …

  11. METS and Finding Aids • Beyond the <dao> href pointer • Useful for managing complex image structure – e.g. multiple scans of multiple pages of letters • Holistic way to present descriptive metadata along with inline image (all in one package) • Also useful for presenting technical metadata that EAD does not yet accommodate

  12. METS Pageturners • Creates HTML page or frameset with links to resources • Creates navigable relationships between resources in a METS file • Creates complex time-based media synchronizations
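The talk builds its pageturners with XSLT in a servlet; the Python sketch below only imitates the core step, following each structMap `fptr` to its file and emitting an HTML link. The sample document, the file names and the function name `page_links` are hypothetical, and the XLink namespace URI is assumed to be the W3C 1999 one.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {"m": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical two-page album; file names are made up.
SAMPLE = """<METS:mets xmlns:METS="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <METS:fileSec>
    <METS:fileGrp>
      <METS:file ID="F1"><METS:FLocat LOCTYPE="URL" xlink:href="page1.jpg"/></METS:file>
      <METS:file ID="F2"><METS:FLocat LOCTYPE="URL" xlink:href="page2.jpg"/></METS:file>
    </METS:fileGrp>
  </METS:fileSec>
  <METS:structMap>
    <METS:div LABEL="Album">
      <METS:div LABEL="Page 1" ORDER="1"><METS:fptr FILEID="F1"/></METS:div>
      <METS:div LABEL="Page 2" ORDER="2"><METS:fptr FILEID="F2"/></METS:div>
    </METS:div>
  </METS:structMap>
</METS:mets>"""


def page_links(root):
    """Build an HTML list linking each structMap div to its file."""
    # Map file ID -> location, the lookup an IDREF resolves to.
    hrefs = {}
    for f in root.iterfind(".//m:file", NS):
        flocat = f.find("m:FLocat", NS)
        if flocat is not None:
            hrefs[f.get("ID")] = flocat.get(XLINK_HREF)
    items = []
    for div in root.iterfind(".//m:structMap//m:div", NS):
        fptr = div.find("m:fptr", NS)  # direct children only
        if fptr is None:
            continue  # container div with no file of its own
        href = hrefs[fptr.get("FILEID")]
        items.append(f'<li><a href="{escape(href)}">{escape(div.get("LABEL"))}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


html_toc = page_links(ET.fromstring(SAMPLE))
print(html_toc)
```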

  13. Sfquad.xml redux • Question: could XSLT mimic Java in rendering METS? • The answer at the time: no • Dynamic frame reloading a special problem

  14. N-YHS Edisto Album • Album of 77 images from the Civil War period • Logical structure: album – page – images • Two to four images per page • Presented with or without collapsible TOC

  15. Tokyo Tribunal • Simple nested structure: jpg page views of Decision taken by the Tokyo Tribunal • Collapsible TOC to unpack logical structure of various parts

  16. Afghanistan Digital Library • 40 books from 1871-1930 (400 eventually) • Simple structure – no chapters for the most part • METS Web viewer + PDF / CD version • Page Images (TIFF at 600 dpi); service files at 98-100 dpi

  17. CRL Political Web Archive • Collaboration between Stanford, Cornell, Texas, NYU, IA under aegis of CRL, Mellon • Sub-Saharan Africa, South East Asia, Latin America, Western Europe • Testbed: 400 URLs; websites from radical groups, NGOs • Internet Archive .arc files

  18. Internet Archive .arc files • An .arc file is a 100 MB aggregate of harvested files, along with the HTTP headers and a crawler-generated header for each file • Fine as a simple SIP, but basically unmanageable as an AIP or DIP • At present accessed using byte offsets to grab content from the aggregate file • Only searchable by URL (Wayback Machine)
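Byte-offset access to an aggregate file can be illustrated as follows. This is a minimal sketch, not a parser for the real .arc record layout: the aggregate is an in-memory blob of two made-up records, and the URL-to-(offset, length) index and the helper `read_record` are assumptions for the example.

```python
import io


def read_record(aggregate, offset, length):
    """Read one record from an aggregate file by byte offset and length."""
    aggregate.seek(offset)
    data = aggregate.read(length)
    if len(data) != length:
        raise ValueError("record extends past end of aggregate file")
    return data


# Hypothetical harvested content; real .arc records also carry headers.
records = {
    "http://example.org/a": b"<html>A</html>",
    "http://example.org/b": b"<html>B page</html>",
}

# Concatenate the records and remember where each one starts.
blob = io.BytesIO()
index = {}
for url, body in records.items():
    index[url] = (blob.tell(), len(body))
    blob.write(body)

offset, length = index["http://example.org/b"]
content = read_record(blob, offset, length)
print(offset, length, content)
```

The index is keyed by URL only, which is the limitation the slide points out: nothing in it describes how the harvested files relate to one another.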

  19. Can METS save .arc? • One solution: a METS file for each website contained in .arc • At collection level, ur-METS file to manage the different versions of website on different dates in different .arcs • Alternatively, a METS file for each arc, delineating content by byte offset? Naah.

  20. It’s the Structure, Silly • Ur-METS with <METS:mptr> to versions (cf. serials model) • Failure of web-archiving access models to date due to indexing at page level only • Netarkivet.dk – NWA Document format: an XML document for each page; indexed by FAST • Results: thousands of hits and no context.
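One possible shape for the ur-METS idea in slides 19 and 20 is a collection-level structMap whose divs point at per-version METS files through `METS:mptr`. The Python sketch below generates such a skeleton; the site label, capture dates, file names and the function name `build_ur_mets` are all hypothetical, and the result is not a complete, schema-valid METS document.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("METS", METS)
ET.register_namespace("xlink", XLINK)


def build_ur_mets(site_label, versions):
    """Build a skeletal collection-level METS pointing at per-version METS files."""
    root = ET.Element(f"{{{METS}}}mets")
    struct_map = ET.SubElement(root, f"{{{METS}}}structMap")
    site = ET.SubElement(struct_map, f"{{{METS}}}div", {"LABEL": site_label})
    for date, href in versions:
        # One div per capture date, each delegating to another METS file.
        div = ET.SubElement(site, f"{{{METS}}}div", {"LABEL": date})
        ET.SubElement(
            div,
            f"{{{METS}}}mptr",
            {"LOCTYPE": "URL", f"{{{XLINK}}}href": href},
        )
    return root


# Hypothetical capture dates and file names.
ur = build_ur_mets(
    "Example site",
    [
        ("2003-01-15", "site-2003-01-15.mets.xml"),
        ("2003-06-15", "site-2003-06-15.mets.xml"),
    ],
)
xml_text = ET.tostring(ur, encoding="unicode")
print(xml_text)
```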