
METS Java Toolkit

DLF Spring Forum, May 10-12, 2002, Chicago, IL. Stephen L. Abrams, Harvard University Library, stephen_abrams@harvard.edu.


Presentation Transcript


  1. METS Java Toolkit
     DLF Spring Forum, May 10-12, 2002, Chicago, IL
     Stephen L. Abrams, Harvard University Library
     stephen_abrams@harvard.edu

  2. Why Do We Need a Toolkit?
     • Automation for an archiving project with multiple content providers.
     • METS used in hierarchical SIP
     • Client-side tools to produce syntactically valid SIPs
     • Use of METS to encapsulate complex objects, with multiple content streams.
     • Page turner, currently based on MOA2

  3. Functional Requirements
     • Java API to provide support for generic METS.
     • Procedural support for:
       • Construction of in-memory representation
       • Validation
       • Marshalling/unmarshalling to/from instance documents
     • Usable as basis for application-specific tools.
     • Sub-class for specific functionality or restrictions

  4. JAXB
     • API based on Sun’s JAXB specification, but not the tools.

  5. Toolkit API
     • Each schema element corresponds to a class.
         Mets mets = new Mets();
     • Accessor/mutator methods for each attribute.
         mets.setID(id);
         String id = mets.getID();
     • Accessor/mutator methods for content model.
         List content = mets.getContent();
         content.add(child);
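The pattern on this slide can be sketched with a few stand-in classes. This is only an illustration: `ElementSketch`, `MetsSketch`, and `MetsHdrSketch` are hypothetical names, not the toolkit's classes, and the sketch uses generics that the JDK 1.3.1 toolkit could not have used. Only the `setID`, `getID`, and `getContent` method names come from the slide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: every schema element owns an ordered content model.
class ElementSketch {
    private final List<Object> content = new ArrayList<>();

    // Returns the live list, so callers add children directly to it.
    public List<Object> getContent() {
        return content;
    }
}

// Hypothetical stand-in for the class bound to the <mets> element.
class MetsSketch extends ElementSketch {
    private String id;

    // Mutator and accessor for the ID attribute.
    public void setID(String id) { this.id = id; }
    public String getID() { return id; }
}

// Hypothetical stand-in for the class bound to the <metsHdr> element.
class MetsHdrSketch extends ElementSketch { }

class ToolkitApiDemo {
    public static void main(String[] args) {
        MetsSketch mets = new MetsSketch();
        mets.setID("1234");
        mets.getContent().add(new MetsHdrSketch());
        System.out.println(mets.getID() + " has " + mets.getContent().size() + " child");
    }
}
```

Returning the live list from `getContent()` is what makes `content.add(child)` on the slide sufficient to attach a child; no separate "add child" method is needed.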

  6. Toolkit API UML [diagram]

  7. Why Do We Need a New API?
     • Why not use DOM?
     • Unnatural unit of granularity: elements and attributes are both nodes in the DOM tree
     • Why not JDOM?
     • Explicit support for validation
     • JAXB compiler could (potentially) be used to support METS upgrades.
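The granularity point about DOM can be seen with the JDK's own DOM parser. The sketch below is not from the presentation; `DomGranularityDemo` and its method names are illustrative. It shows that reading an `ID` attribute through DOM means handling another `Node`, where the toolkit API would offer a typed accessor such as `mets.getID()`.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class DomGranularityDemo {
    // Parses a small document with the JDK's DOM parser and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // In DOM an attribute is itself a Node, reached through the element that owns it.
    static Node idAttributeNode(Element element) {
        Attr attr = element.getAttributeNode("ID");
        return attr;
    }

    public static void main(String[] args) {
        Element mets = parseRoot("<mets ID=\"1234\"/>");
        Node attr = idAttributeNode(mets);
        // The element and its attribute share the Node interface; only the type code differs.
        System.out.println(mets.getNodeType() == Node.ELEMENT_NODE);
        System.out.println(attr.getNodeType() == Node.ATTRIBUTE_NODE);
        System.out.println(attr.getNodeValue());
    }
}
```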

  8. Procedural Construction
     • The initial current element is <mets>
     • For each child element in the current element’s content model:
       • Instantiate an appropriate element object
       • Set its attributes
       • Define its content model
       • Add it to the content model of its parent

  9. Procedural Construction (Ex.)
       Mets mets = new Mets();
       mets.setID ("1234");
       MetsHdr metsHdr = new MetsHdr();
       metsHdr.setCREATEDATE(new Date());
       Agent agent = new Agent();
       agent.setROLE(Role.CREATOR);
       Name name = new Name ();
       name.getContent().add(new PCData ("S. Abrams"));
       agent.getContent().add(name);
       metsHdr.getContent().add(agent);
       mets.getContent().add(metsHdr);
       ...

  10. Validation
      • Global
        • ID uniqueness
        • IDREF-to-ID consistency
      • Local
        • Existence of required attributes and content model elements
          Mets mets = new Mets();
          ...
          mets.validate ();
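The two global checks can be sketched over plain lists of ID and IDREF values. This is a minimal illustration under stated assumptions, not the toolkit's `validate()` implementation: `GlobalValidationSketch` is a hypothetical name, and it assumes the ID and IDREF values have already been collected from the in-memory tree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class GlobalValidationSketch {
    // Returns a list of problems; an empty list means both global checks passed.
    static List<String> validate(List<String> ids, List<String> idrefs) {
        List<String> problems = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String id : ids) {
            // Set.add returns false when the value is already present.
            if (!seen.add(id)) {
                problems.add("duplicate ID: " + id);
            }
        }
        for (String ref : idrefs) {
            // Every IDREF must point at some ID declared in the document.
            if (!seen.contains(ref)) {
                problems.add("unresolved IDREF: " + ref);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("t-1234", "s-9012", "a1b2c3", "A125");
        List<String> refs = Arrays.asList("t-1234", "s-9012", "a1b2c3", "missing");
        System.out.println(validate(ids, refs));
    }
}
```

Collecting all IDs before checking any reference matters because an IDREF may legally point forward to an element that appears later in the document.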

  11. Marshalling
      • Serializing in-memory representation to an output stream.
          Mets mets = new Mets();
          ...
          FileOutputStream out = new FileOutputStream("mets.xml");
          mets.validate ();
          mets.marshal(out);
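Marshalling of this kind is typically a depth-first walk that writes each element's start tag, attributes, content, and end tag. The sketch below illustrates that idea only; `NodeSketch` and `MarshalSketchDemo` are hypothetical names, the escaping rules are a simplification, and nothing here is taken from the toolkit's source.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory element: a name, ordered attributes, and mixed content.
class NodeSketch {
    final String name;
    final Map<String, String> attributes = new LinkedHashMap<>();
    final List<Object> content = new ArrayList<>(); // NodeSketch or String (character data)

    NodeSketch(String name) { this.name = name; }

    // Escapes markup characters; "&" goes first so later entities are not re-escaped.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Appends this element and its descendants, depth first.
    private void write(StringBuilder sb) {
        sb.append('<').append(name);
        for (Map.Entry<String, String> a : attributes.entrySet()) {
            sb.append(' ').append(a.getKey()).append("=\"")
              .append(escape(a.getValue())).append('"');
        }
        if (content.isEmpty()) {
            sb.append("/>");
            return;
        }
        sb.append('>');
        for (Object child : content) {
            if (child instanceof NodeSketch) {
                ((NodeSketch) child).write(sb);
            } else {
                sb.append(escape(child.toString()));
            }
        }
        sb.append("</").append(name).append('>');
    }

    // Serializes the tree to the stream as UTF-8.
    void marshal(OutputStream out) {
        StringBuilder sb = new StringBuilder();
        write(sb);
        try {
            out.write(sb.toString().getBytes(StandardCharsets.UTF_8));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class MarshalSketchDemo {
    // Builds a two-element tree and returns its serialized form.
    static String demo() {
        NodeSketch mets = new NodeSketch("mets");
        mets.attributes.put("OBJID", "1<>1.0");
        NodeSketch name = new NodeSketch("name");
        name.content.add("S. L. Abrams");
        mets.content.add(name);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        mets.marshal(out);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```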

  12. Unmarshalling
      • Parsing instance document and creating in-memory representation.
      • Implicit local validation during parsing; global validation must be explicit.
      • Internal parsing with James Clark’s XP.
          FileInputStream in = new FileInputStream("mets.xml");
          Mets mets = Mets.unmarshal(in);
          mets.validate ();
          ...
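Unmarshalling with an event-based parser usually keeps a stack of open elements and attaches each new element to the one on top. The sketch below shows that approach with the JDK's built-in SAX parser, which is swapped in here for XP; `ParsedElement` and `UnmarshalSketch` are hypothetical names, and the toolkit's real `Mets.unmarshal` may work differently.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical in-memory element produced by parsing.
class ParsedElement {
    final String name;
    final List<ParsedElement> children = new ArrayList<>();
    final StringBuilder text = new StringBuilder();
    String id; // value of the ID attribute, or null when absent

    ParsedElement(String name) { this.name = name; }
}

class UnmarshalSketch {
    // Parses the stream and returns the root of the resulting tree.
    static ParsedElement unmarshal(InputStream in) {
        final Deque<ParsedElement> stack = new ArrayDeque<>();
        final ParsedElement[] root = new ParsedElement[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                ParsedElement e = new ParsedElement(qName);
                e.id = atts.getValue("ID");
                if (stack.isEmpty()) {
                    root[0] = e;                  // first element seen is the root
                } else {
                    stack.peek().children.add(e); // attach to the currently open element
                }
                stack.push(e);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                stack.pop();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                stack.peek().text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return root[0];
    }

    public static void main(String[] args) {
        String xml = "<mets ID=\"1234\"><metsHdr><agent><name>S. Abrams</name></agent></metsHdr></mets>";
        ParsedElement mets = unmarshal(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(mets.name + " " + mets.id + " " + mets.children.size());
    }
}
```

The sketch performs no validation; in the toolkit's design, local checks would happen while elements are created and global checks in a later `validate()` call.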

  13. Extension Schemas
      • Toolkit could be extended to include explicit support for additional schemas.
      • Generic namespace-aware Any class:
          Any any = new Any("elem");
          any.setAttribute("attr", value);
          String attr = any.getAttribute("attr");
          any.getContent().add(child);
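A generic element of this kind needs little more than a qualified name, an attribute map, and a content list. The following is a rough sketch under that assumption; `AnySketch` and its `getQName` method are hypothetical, and only the `setAttribute`, `getAttribute`, and `getContent` names are taken from the slide.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical generic element for content from schemas without dedicated classes.
class AnySketch {
    private final String prefix;    // namespace prefix, or null when unqualified
    private final String localName;
    private final Map<String, String> attributes = new LinkedHashMap<>();
    private final List<Object> content = new ArrayList<>();

    AnySketch(String localName) {
        this(null, localName);
    }

    AnySketch(String prefix, String localName) {
        this.prefix = prefix;
        this.localName = localName;
    }

    void setAttribute(String name, String value) { attributes.put(name, value); }

    // Returns null when the attribute has not been set.
    String getAttribute(String name) { return attributes.get(name); }

    List<Object> getContent() { return content; }

    // Name as it would appear in a start tag, e.g. "my:techMD" or "elem".
    String getQName() {
        return prefix == null ? localName : prefix + ":" + localName;
    }
}

class AnySketchDemo {
    public static void main(String[] args) {
        AnySketch any = new AnySketch("my", "techMD");
        any.setAttribute("ID", "AB123");
        any.getContent().add("...technical MD...");
        System.out.println(any.getQName() + " ID=" + any.getAttribute("ID"));
    }
}
```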

  14. Additional Work
      • To be done any day now…
        • Support for <area>, <par>, and <seq>
        • Strict validation of sequence ordering
        • Marshalling in non-UTF-8 encodings
        • Base64 encoding/decoding methods for binData and FContent
        • Support for entity references
        • Diagnostic error messages
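For the Base64 item, a present-day implementation could lean on `java.util.Base64`, which only arrived in Java 8 and so was not available on the JDK 1.3.1 platform the toolkit targets. The helper below is an assumption-laden sketch, not toolkit code; `Base64Sketch` is a hypothetical name. The sample value is the `FContent` string used in the presentation's Marshal.java example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64Sketch {
    // Encodes text as Base64, as it would be stored in binData or FContent.
    static String encode(String plain) {
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes Base64 content back to text, assuming the payload is UTF-8.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("MS0yLTM=")); // prints 1-2-3
        System.out.println(encode("1-2-3"));    // prints MS0yLTM=
    }
}
```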

  15. Distribution
      • HUL’s intent is to make the toolkit freely available under an Open Source license.
      • Minimal support (if any).
      • Community process for maintenance?
      • Does an appropriate organizational home exist?

  16. Implementation
      • METS schema, Version 1.0 (zeta)
      • JAXB specification, Version 0.21 <http://java.sun.com/xml/jaxb>
      • XP, Version 0.5 <http://jclark.com/xml/xp>
      • Java J2SE and JDK 1.3.1
      • Solaris 2.7
      • Home page: <http://hul.harvard.edu/mets>

  17. Marshal.java
        import java.util.*;
        import org.mets.xml.bind.*;
        import org.mets.xml.mets.*;

        public class Marshal {
            public static void main (String [] args) {
                // Root element and its attributes
                Mets mets = new Mets ();
                mets.setOBJID ("1234-5678(2002)9:1<>1.0.CO;9-X");
                mets.setLABEL ("METS Java toolkit");
                mets.setTYPE ("Article");

                // Header: one agent with two notes, plus alternate record IDs
                MetsHdr metsHdr = new MetsHdr ();
                metsHdr.setCREATEDATE (new Date ());
                metsHdr.setRECORDSTATUS ("DRAFT");
                Agent agent = new Agent ();
                agent.setROLE (Role.CREATOR);
                Name name = new Name ();
                name.getContent ().add (new PCData ("S. L. Abrams"));
                agent.getContent ().add (name);
                Note note = new Note ();
                note.getContent ().add (new PCData ("HUL/OIS"));
                agent.getContent ().add (note);
                note = new Note ();
                note.getContent ().add (new PCData ("Special order, 2002/02/25"));
                agent.getContent ().add (note);
                metsHdr.getContent ().add (agent);
                AltRecordID doi = new AltRecordID ();
                doi.setTYPE ("DOI");
                doi.getContent ().add (new PCData ("10.1234/56789"));
                AltRecordID nrs = new AltRecordID ();
                nrs.setTYPE ("NRS");
                nrs.getContent ().add (new PCData ("nrs:hul.ois:10203"));
                metsHdr.getContent ().add (doi);
                metsHdr.getContent ().add (nrs);
                mets.getContent ().add (metsHdr);

                // Descriptive metadata section
                DmdSec dmdSec = new DmdSec ();
                dmdSec.setID ("xyz-123");
                MdRef mdRef = new MdRef ();
                mdRef.setLOCTYPE (Loctype.DOI);
                mdRef.setMDTYPE (Mdtype.DC);
                mdRef.setMIMETYPE ("text/xml");
                ...

  18. Marshal.java (cont.)
                ...
                mdRef.setXlinkHref ("10.9876/54321");
                dmdSec.getContent ().add (mdRef);
                MdWrap mdWrap = new MdWrap ();
                mdWrap.setMDTYPE (Mdtype.MARC);
                BinData binData = new BinData ();
                binData.getContent ().add (new PCData ("AbC…Yz0123456789"));
                mdWrap.getContent ().add (binData);
                dmdSec.getContent ().add (mdWrap);
                mets.getContent ().add (dmdSec);

                // Administrative metadata: technical MD
                AmdSec amdSec = new AmdSec ();
                TechMD techMD = new TechMD ();
                techMD.setID ("t-1234");
                mdWrap = new MdWrap ();
                mdWrap.setMDTYPE (Mdtype.OTHER);
                mdWrap.setOTHERMDTYPE ("MyTechMD");
                XmlData xmlData = new XmlData ();
                Any any = new Any ("my", "techMD");
                any.getAttributes ().add (new Attribute ("ID", "AB123"));
                any.getAttributes ().add (new Attribute ("my", "type", "TIFF"));
                any.getContent ().add (new PCData ("...technical MD..."));
                xmlData.getContent ().add (any);
                mdWrap.getContent ().add (xmlData);
                techMD.getContent ().add (mdWrap);
                amdSec.getContent ().add (techMD);

                // Administrative metadata: rights MD from three namespaces
                RightsMD rightsMD = new RightsMD ();
                rightsMD.setID ("r-5678");
                mdWrap = new MdWrap ();
                mdWrap.setMDTYPE (Mdtype.OTHER);
                mdWrap.setOTHERMDTYPE ("MyRightsMD");
                xmlData = new XmlData ();
                any = new Any ("my", "rightsMD");
                any.getContent ().add (new PCData ("...rights MD..."));
                xmlData.getContent ().add (any);
                any = new Any ("your", "rightsMD");
                any.getContent ().add (new PCData ("...rights MD..."));
                xmlData.getContent ().add (any);
                any = new Any ("their", "rightsMD");
                any.getContent ().add (new PCData ("...rights MD..."));
                ...

  19. Marshal.java (cont.)
                ...
                xmlData.getContent ().add (any);
                mdWrap.getContent ().add (xmlData);
                rightsMD.getContent ().add (mdWrap);
                amdSec.getContent ().add (rightsMD);

                SourceMD sourceMD = new SourceMD ();
                sourceMD.setID ("s-9012");
                mdWrap = new MdWrap ();
                mdWrap.setMDTYPE (Mdtype.OTHER);
                mdWrap.setOTHERMDTYPE ("MySourceMD");
                xmlData = new XmlData ();
                any = new Any ("my", "sourceMD");
                any.getAttributes ().add (new Attribute ("aat", "type", new Integer (178684)));
                any.getContent ().add (new PCData ("...source MD..."));
                xmlData.getContent ().add (any);
                mdWrap.getContent ().add (xmlData);
                sourceMD.getContent ().add (mdWrap);
                amdSec.getContent ().add (sourceMD);

                DigiprovMD digiprovMD = new DigiprovMD ();
                digiprovMD.setID ("d-3456");
                mdWrap = new MdWrap ();
                mdWrap.setMDTYPE (Mdtype.OTHER);
                mdWrap.setOTHERMDTYPE ("MyDigiprovMD");
                xmlData = new XmlData ();
                any = new Any ("my", "digiprovMD");
                any.getContent ().add (new PCData ("...provenance MD..."));
                xmlData.getContent ().add (any);
                mdWrap.getContent ().add (xmlData);
                digiprovMD.getContent ().add (mdWrap);
                amdSec.getContent ().add (digiprovMD);
                mets.getContent ().add (amdSec);

                FileSec fileSec = new FileSec ();
                FileGrp fileGrp = new FileGrp ();
                fileGrp.getADMID ().add ("t-1234");
                fileGrp.getADMID ().add ("s-9012");
                File file = new File ();
                file.setID ("a1b2c3");
                FLocat flocat = new FLocat ();
                flocat.setLOCTYPE (Loctype.URN);
                flocat.setXlinkHref ("urn:nid:nss");
                file.getContent ().add (flocat);
                FContent fcontent = new FContent ();
                ...

  20. Marshal.java (cont.)
                ...
                fcontent.getContent ().add (new PCData ("MS0yLTM="));
                file.getContent ().add (fcontent);
                fileGrp.getContent ().add (file);
                fileSec.getContent ().add (fileGrp);
                mets.getContent ().add (fileSec);

                StructMap structMap = new StructMap ();
                structMap.setID ("A125");
                structMap.setLABEL ("Individual volumes");
                Div div = new Div ();
                div.setORDER (25);
                div.setORDERLABEL ("xxv");
                div.setTYPE ("Chapter");
                Div sec = new Div ();
                sec.setTYPE ("Section");
                Div sub = new Div ();
                sub.setTYPE ("Sub-section");
                Fptr fptr = new Fptr ();
                fptr.setFILEID ("a1b2c3");
                sub.getContent ().add (fptr);
                sec.getContent ().add (sub);
                div.getContent ().add (sec);
                sec = new Div ();
                sec.setTYPE ("Section");
                Mptr mptr = new Mptr ();
                mptr.setID ("123-45-6789");
                mptr.setLOCTYPE (Loctype.OTHER);
                mptr.setOTHERLOCTYPE ("filepath");
                mptr.setXlinkHref ("dir/file.xml");
                sec.getContent ().add (mptr);
                div.getContent ().add (sec);
                structMap.getContent ().add (div);
                mets.getContent ().add (structMap);

                BehaviorSec behavior = new BehaviorSec ();
                behavior.setID ("killerapp");
                behavior.getSTRUCTID ().add ("A125");
                behavior.getSTRUCTID ().add ("s-9012");
                Mechanism mechanism = new Mechanism ();
                mechanism.setLOCTYPE (Loctype.URL);
                mechanism.setXlinkHref ("http://host/path");
                behavior.getContent ().add (mechanism);
                mets.getContent ().add (behavior);

                mets.validate ();
                mets.marshal (System.out);
            }
        }

  21. marshal.xml
        <mets xmlns="http://www.loc.gov/METS/"
              xmlns:xlink="http://www.w3.org/1999/xlink"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd"
              OBJID="1234-5678(2002)9:1&lt;&gt;1.0.CO;9-X"
              LABEL="METS Java toolkit" TYPE="Article">
          <metsHdr CREATEDATE="2002-03-15T161023" RECORDSTATUS="DRAFT">
            <agent ROLE="CREATOR">
              <name>S. L. Abrams</name>
              <note>HUL/OIS</note>
              <note>Special order, 2002/02/25</note>
            </agent>
            <altRecordID TYPE="DOI">10.1234/56789</altRecordID>
            <altRecordID TYPE="NRS">nrs:hul.ois:10203</altRecordID>
          </metsHdr>
          <dmdSec ID="xyz-123">
            <mdRef LOCTYPE="DOI" xlink:type="simple" xlink:href="10.9876/54321"
                   MDTYPE="DC" MIMETYPE="text/xml"/>
            <mdWrap MDTYPE="MARC">
              <binData>AbCdEfGhIjKlMnOpQrStUvWxYz0123456789</binData>
            </mdWrap>
          </dmdSec>
          <amdSec>
            <techMD ID="t-1234">
              <mdWrap MDTYPE="OTHER" OTHERMDTYPE="MyTechMD">
                <xmlData>
                  <my:techMD ID="AB123" my:type="TIFF">...technical MD...</my:techMD>
                </xmlData>
              </mdWrap>
            </techMD>
            <rightsMD ID="r-5678">
              <mdWrap MDTYPE="OTHER" OTHERMDTYPE="MyRightsMD">
                <xmlData>
                  <my:rightsMD>...rights MD...</my:rightsMD>
                  <your:rightsMD>...rights MD...</your:rightsMD>
                  <their:rightsMD>...rights MD...</their:rightsMD>
                </xmlData>
              </mdWrap>
            </rightsMD>
            ...

  22. marshal.xml (cont.)
            ...
            <sourceMD ID="s-9012">
              <mdWrap MDTYPE="OTHER" OTHERMDTYPE="MySourceMD">
                <xmlData>
                  <my:sourceMD aat:type="178684">...source MD...</my:sourceMD>
                </xmlData>
              </mdWrap>
            </sourceMD>
            <digiprovMD ID="d-3456">
              <mdWrap MDTYPE="OTHER" OTHERMDTYPE="MyDigiprovMD">
                <xmlData>
                  <my:digiprovMD>...provenance MD...</my:digiprovMD>
                </xmlData>
              </mdWrap>
            </digiprovMD>
          </amdSec>
          <fileSec>
            <fileGrp ADMID="t-1234 s-9012">
              <file ID="a1b2c3">
                <FLocat LOCTYPE="URN" xlink:type="simple" xlink:href="urn:nid:nss"/>
                <FContent>MS0yLTM=</FContent>
              </file>
            </fileGrp>
          </fileSec>
          <structMap ID="A125" LABEL="Individual volumes">
            <div ORDER="25" ORDERLABEL="xxv" TYPE="Chapter">
              <div TYPE="Section">
                <div TYPE="Sub-section">
                  <fptr FILEID="a1b2c3"/>
                </div>
              </div>
              <div TYPE="Section">
                <mptr ID="123-45-6789" LOCTYPE="OTHER" OTHERLOCTYPE="filepath"
                      xlink:type="simple" xlink:href="dir/file.xml"/>
              </div>
            </div>
          </structMap>
          <behaviorSec ID="killerapp" STRUCTID="A125 s-9012">
            <mechanism LOCTYPE="URL" xlink:type="simple" xlink:href="http://host/path"/>
          </behaviorSec>
        </mets>
