
METS - API

METS - API: application programming interface. METS Implementors Meeting, May 8th, 2007. Markus Enders, SUB Göttingen; Jens Ludwig, SUB Göttingen.

Presentation Transcript


  1. METS - API: application programming interface. METS Implementors Meeting, May 8th, 2007. Markus Enders, SUB Göttingen; Jens Ludwig, SUB Göttingen.

  2. Why? necessity of an API

  3. Why? METS has a complex data model:
     • the most common instantiation of METS is its XML form
     • an API should be based on the data model and is (theoretically) independent of its XML representation

  4. Why?
     • The API should be focused on METS elements and their appropriate attributes and relationships.
     • The API should support creation of METS as well: creation of invalid data should not be possible (e.g. wrong order of elements...). Goal: 100% valid METS data.
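The "invalid data should not be possible" idea can be sketched without any METS library. The class below is hypothetical and heavily simplified (it is not METSbeans): callers may add sections in any order, but serialization always follows the order the METS schema prescribes (`metsHdr`, `dmdSec`, ..., `structMap`) and refuses to write a document without the required `<structMap>`. Attribute values are assumed to need no XML escaping.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of a <mets> root element; not the METSbeans API. */
class OrderedMetsSketch {
    private String metsHdrCreateDate;                        // optional, at most one header
    private final List<String> dmdSecIds = new ArrayList<>();
    private final List<String> structMapTypes = new ArrayList<>();

    void setMetsHdr(String createDate) { this.metsHdrCreateDate = createDate; }
    void addDmdSec(String id) { dmdSecIds.add(id); }
    void addStructMap(String type) { structMapTypes.add(type); }

    /** Serializes children in schema order, regardless of the order of the calls above. */
    String toXml() {
        if (structMapTypes.isEmpty()) {
            throw new IllegalStateException("METS requires at least one <structMap>");
        }
        StringBuilder sb = new StringBuilder("<mets xmlns=\"http://www.loc.gov/METS/\">");
        if (metsHdrCreateDate != null) {
            sb.append("<metsHdr CREATEDATE=\"").append(metsHdrCreateDate).append("\"/>");
        }
        for (String id : dmdSecIds) {
            sb.append("<dmdSec ID=\"").append(id).append("\"/>");
        }
        for (String type : structMapTypes) {
            sb.append("<structMap TYPE=\"").append(type).append("\"><div/></structMap>");
        }
        return sb.append("</mets>").toString();
    }

    public static void main(String[] args) {
        OrderedMetsSketch mets = new OrderedMetsSketch();
        mets.addStructMap("LOGICAL");   // added first ...
        mets.addDmdSec("DMDID01");      // ... but <dmdSec> is still written before <structMap>
        System.out.println(mets.toXml());
    }
}
```

The point of the design is that the element order lives in one place (the serializer) instead of in every application that writes METS.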

  5. Why? Multi-Tier Applications: API connects application with serialization level. API as a framework for METS creation / parsing

  6. Why? (diagram) Application; METS API; storage: XML, Repository, Database

  7. Implementation Issues:
     • Maintenance: changes in the METS schema must be reflected by the API
     • Programming language: more than one language should be supported
     • Multi-level access: granularity of access

  8. Implementation Issues:
     • Maintenance: changes in the METS schema must be reflected by the API
       Derive classes from the XML schema: e.g. Apache xmlbeans or SUN JAXB provide Java classes for an XML schema
     • Programming language: more than one language should be supported
     • Multi-level access: granularity of access

  9. Implementation Issues:
     • Maintenance: changes in the METS schema must be reflected by the API
     • Programming language: more than one language should be supported
       php-java bridge: http://php-java-bridge.sourceforge.net
       Inline-Java perl module: http://search.cpan.org/~patl/Inline-Java/
     • Multi-level access: granularity of access

  10. Implementation Issues:
     • Maintenance: changes in the METS schema must be reflected by the API
     • Programming language: more than one language should be supported
     • Multi-level access: granularity of access
       access to single elements / attributes
       higher level for more widespread functionality

  11. Implementation Issues: Apache xmlbeans based API for Java
     • Creates an interface for each schema object and an implementation to read / write this object to XML
     • Other implementations are possible (repository)
     • Can create a DOM tree at any time, e.g. if non-schema-based XML data needs to be stored

  12. Implementation Issues:
     • level one: METSbeans, an xmlbeans based API for Java; allows access to single METS elements, attributes and their relationships
     • level two: more complex functions which are based on the METSbeans

  13. METSbeans
     • every type from the schema becomes one class
     • classes are generated automatically from the XML schema
     • additional APIs can be generated and integrated for any XML-schema-based data format (e.g. MODS, PREMIS etc.)

  14. METSbeans internal architecture:
     • for every type in the XML schema, an appropriate Java interface exists
     • every interface is implemented during the automatic generation process
     • additional implementations of an interface are possible: high flexibility to access METS data outside a file system

  15. METSbeans internal architecture: <xsd:complexType name="divType"> becomes interface: DivType, class: DivTypeImpl
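The interface/implementation split can be illustrated with plain Java. Everything below is a hypothetical, cut-down stand-in: `Div` plays the role of the generated `DivType` interface, `InMemoryDiv` the role of the generated `DivTypeImpl`, and `StoreBackedDiv` an "additional implementation" that keeps its data somewhere other than an XML file (here simply a `Map` standing in for a repository or database).

```java
import java.util.HashMap;
import java.util.Map;

class InterfaceImplSketch {
    /** Hypothetical stand-in for the generated DivType interface. */
    interface Div {
        String getTYPE();
        void setTYPE(String type);
    }

    /** Stand-in for the generated implementation: state lives in the object itself. */
    static class InMemoryDiv implements Div {
        private String type;
        public String getTYPE() { return type; }
        public void setTYPE(String type) { this.type = type; }
    }

    /** Alternative implementation: state lives in an external key-value store. */
    static class StoreBackedDiv implements Div {
        private final Map<String, String> store;
        private final String key;

        StoreBackedDiv(Map<String, String> store, String divId) {
            this.store = store;
            this.key = divId + "/@TYPE";
        }
        public String getTYPE() { return store.get(key); }
        public void setTYPE(String type) { store.put(key, type); }
    }

    /** Application code depends only on the interface, not on where the data is kept. */
    static String describe(Div div) {
        return "div of type " + div.getTYPE();
    }

    public static void main(String[] args) {
        Div a = new InMemoryDiv();
        a.setTYPE("Monograph");
        Div b = new StoreBackedDiv(new HashMap<>(), "DIV01");
        b.setTYPE("Monograph");
        System.out.println(describe(a));
        System.out.println(describe(b));
    }
}
```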

  16. METSbeans internal architecture: xmlbeans has a set of native data types: XmlObject, XmlString, XmlShort, XmlTime etc.

  17. METSbeans internal architecture: MetsDocument, as the topmost class, instantiates the document. No other object can be created without this object. An instance can be created by:
     • parsing a file
     • using a factory class to create a new document

  18. METSbeans snippet: MetsDocument
     example factory class:
         MetsDocument mets = MetsDocument.Factory.newInstance();
     example parsing a file:
         XmlObject xml;
         try {
             xml = XmlObject.Factory.parse(f);   // f is a java.io.File
         } catch (XmlException | IOException e) { // parse(File) can also fail with an IOException
             e.printStackTrace();
             return false;
         }
         MetsDocument metsDoc = (MetsDocument) xml;

  19. METSbeans DivType: methods for accessing the <mptr> element
         getMptrArray()
         getMptrArray(int i)
         sizeOfMptrArray()
         setMptrArray(Mptr[] mptrArray)
         setMptrArray(int i, Mptr mptr)
         insertNewMptr(int i)
         addNewMptr()
         removeMptr(int i)

  20. METSbeans DivType: methods for accessing the <div> element
         getDivArray()
         getDivArray(int i)
         sizeOfDivArray()
         setDivArray(DivType[] divArray)
         setDivArray(int i, DivType div)
         insertNewDiv(int i)
         addNewDiv()
         removeDiv(int i)

  21. METSbeans DivType: very similar methods for handling file pointers (<fptr> elements)

  22. METSbeans DivType: methods to set attributes (ID attribute)
         getID()
         isSetID()
         setID(String id)
         unsetID()
         xsetID(org.apache.xmlbeans.XmlID id)
         xgetID()
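The naming pattern on slides 19 to 22 is easier to read with a toy class behind it. `SimpleDiv` below is a hypothetical list-backed stand-in, not the generated `DivType`; it only mirrors the method names to show what each one is expected to do (the `setDivArray` and `xgetID`/`xsetID` variants are left out). In this sketch `getDivArray()` returns a snapshot array and an absent attribute is modelled as `null`; both are assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in that mirrors the accessor names of the generated DivType. */
class SimpleDiv {
    private final List<SimpleDiv> children = new ArrayList<>();
    private String id;   // null means: attribute not present

    // --- repeated child element <div> ---
    SimpleDiv[] getDivArray() { return children.toArray(new SimpleDiv[0]); }
    SimpleDiv getDivArray(int i) { return children.get(i); }
    int sizeOfDivArray() { return children.size(); }

    /** Creates a new child, appends it and returns it so the caller can fill it in. */
    SimpleDiv addNewDiv() {
        SimpleDiv div = new SimpleDiv();
        children.add(div);
        return div;
    }

    /** Creates a new child at position i, shifting later children to the right. */
    SimpleDiv insertNewDiv(int i) {
        SimpleDiv div = new SimpleDiv();
        children.add(i, div);
        return div;
    }

    void removeDiv(int i) { children.remove(i); }

    // --- optional attribute ID ---
    String getID() { return id; }
    boolean isSetID() { return id != null; }
    void setID(String id) { this.id = id; }
    void unsetID() { this.id = null; }

    public static void main(String[] args) {
        SimpleDiv root = new SimpleDiv();
        root.addNewDiv().setID("CHAPTER");
        root.insertNewDiv(0).setID("TITLEPAGE");   // now the first child
        System.out.println(root.getDivArray(0).getID() + ", " + root.sizeOfDivArray());
    }
}
```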

  23. METSbeans snippet: create a new <div> element
         MetsDocument mets = MetsDocument.Factory.newInstance();
         MetsType myMets = mets.addNewMets();
         StructMapType sm = myMets.addNewStructMap();
         DivType div = sm.addNewDiv();
         div.setTYPE("Monograph");
         DivType firstchild = div.addNewDiv();
         firstchild.setTYPE("TitlePage");

  24. METSbeans snippet: saving a METS document
         // map namespace URIs to the prefixes that should appear in the saved file
         HashMap<String, String> suggestedPrefixes = new HashMap<String, String>();
         suggestedPrefixes.put("http://www.loc.gov/METS/", "mets");
         suggestedPrefixes.put("http://www.w3.org/1999/xlink", "xlink");
         XmlOptions opts = new XmlOptions();
         opts.setSaveSuggestedPrefixes(suggestedPrefixes);
         File outputFile = new File(filename);
         mets.save(outputFile, opts);

  25. METSbeans MdSecType
     • represents the METS elements <dmdSec>, <techMD>, <digiprovMD>, <rightsMD>, <sourceMD>, but not <amdSec>
     • may contain: an MdRef or MdWrap object

  26. METSbeans snippet: create an MdSecType object
         MetsDocument mets = MetsDocument.Factory.newInstance();
         MetsType myMets = mets.addNewMets();
         MdSecType dmdSec = myMets.addNewDmdSec();
         dmdSec.setID("DMDID01");
         MdSecType.MdWrap mdwrap = dmdSec.addNewMdWrap();
         MdSecType.MdWrap.XmlData xmldata = mdwrap.addNewXmlData();
         xmldata.set(modsObject);   // modsObject can be any XmlObject, e.g. an XmlString

  27. METSbeans snippet: create an MdSecType object
     String:
         XmlString xs = XmlString.Factory.newValue("<mydata/>");
         xmldata.set(xs);
     Document:
         ModsDocument modsObject = ModsDocument.Factory.newInstance();
         ModsType myMods = modsObject.addNewMods();
         IdentifierType identifier = myMods.addNewIdentifier();
         ....
         xmldata.set(modsObject);

  28. METSbeans parse METS data: the API provides several parse methods:
         parse(java.lang.String xmlAsString)
         parse(java.io.File file)
         parse(java.net.URL u)
         parse(java.io.InputStream is)
         parse(org.w3c.dom.Node node)
     If the parsed data is NOT valid METS, an XmlException is thrown.

  29. METSbeans snippet: parse METS data
         File f = new File(filename);
         XmlObject xml = null;   // stays null if parsing fails
         try {
             xml = XmlObject.Factory.parse(f);
         } catch (XmlException e) {
             e.printStackTrace();
         } catch (IOException e) {
             e.printStackTrace();
         }
         MetsDocument metsDoc = (MetsDocument) xml;

  30. METSbeans snippet: get a DivType
         MetsDocument metsDoc = (MetsDocument) xml;
         MetsType mets = metsDoc.getMets();
         StructMapType[] structs = mets.getStructMapArray();
         for (int i = 0; i < structs.length; i++) {
             StructMapType struct = structs[i];
             String structtype = struct.getTYPE();
             // pick the logical structure map and return the TYPE of its top-level <div>
             if ((structtype != null) && (structtype.equals("LOGICAL"))) {
                 DivType div = struct.getDiv();
                 String divtype = div.getTYPE();
                 return divtype;
             }
         }

  31. METSbeans
     • easy to create and parse valid METS data (much easier than parsing DOM trees)
     • easy to combine with other XML data
     • quite fast compared to DOM
     • Drawback: as it is based on xmlbeans, it is only available for Java; the php-java bridge / Inline::Java module is needed for PHP / Perl
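For comparison with the "much easier than parsing DOM trees" claim, here is roughly what the slide 30 lookup looks like when written against the JDK's own DOM API instead of METSbeans. This is a sketch: the class and method names are made up for illustration, and parse failures are simply wrapped in an `IllegalArgumentException`.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomStructMapSketch {
    static final String METS_NS = "http://www.loc.gov/METS/";

    /** Returns the TYPE of the top-level div in the LOGICAL structMap, or null if there is none. */
    static String logicalDivType(String metsXml) {
        Document doc;
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);   // element names must be matched by namespace
            doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(metsXml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
        NodeList maps = doc.getElementsByTagNameNS(METS_NS, "structMap");
        for (int i = 0; i < maps.getLength(); i++) {
            Element map = (Element) maps.item(i);
            if (!"LOGICAL".equals(map.getAttribute("TYPE"))) {
                continue;
            }
            // walk the direct children by hand; DOM has no typed getDiv()
            for (Node n = map.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element
                        && METS_NS.equals(n.getNamespaceURI())
                        && "div".equals(n.getLocalName())) {
                    Element div = (Element) n;
                    return div.hasAttribute("TYPE") ? div.getAttribute("TYPE") : null;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = "<mets xmlns='http://www.loc.gov/METS/'>"
                + "<structMap TYPE='LOGICAL'><div TYPE='Monograph'/></structMap></mets>";
        System.out.println(logicalDivType(xml));
    }
}
```

Element names, namespaces and attribute names are all plain strings here, so typos surface only at run time; with generated classes the compiler catches them.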

  32. Helper-class Functions: need for additional high-level functions
     • Though the METSbeans allow access to every single METS element, it is still a complex task to do simple things, e.g. adding metadata to a <div>
     • A helper class is needed, which sits on top of METSbeans

  33. Helper-class Functions:
     • The following examples come from experience working with METSbeans (and are based on METSbeans)
     • No official implementation, just an excerpt of functions which a level 2 API could provide

  34. Helper-class Functions: create a DMDSec for common METS objects:
         createDMDSec(XmlObject inMetadata, DivType inDiv)
         createDMDSec(XmlObject inMetadata, FileType inFile)
         ...

  35. Helper-class Functions: create administrative metadata for common METS objects, e.g.:
         createMDSectionInAMDSec(XmlObject inMetadata, String type, DivType inDiv, AmdSecType inAmdSec)
         ...

  36. Helper-class Functions: functions to retrieve specific metadata sections by ID or TYPE:
         getMDSecTypeByID(String inID)
         getMDSecTypeByType(String inType)
         ...

  37. Helper-class Functions: functions to get the files related to a <div> element or a file group:
         getAllFilesForDivType(DivType inDiv)
         getAllFilesForFileGroup(FileGrpType inGrp)
         ...
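What two of these level-two helpers have to do can be sketched against a plain DOM tree. This is not the authors' helper class: the slides give only signatures, so the class name, the DOM-based parameter types and the lookup logic below are assumptions. `getDmdSecById` mirrors `getMDSecTypeByID` (restricted to `<dmdSec>`), and `getAllFilesForDiv` mirrors `getAllFilesForDivType` by resolving the `FILEID` of each direct `<fptr>` child to the `<file>` element with that `ID`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical DOM-based analogue of a level-two helper class. */
class MetsHelperSketch {
    static final String METS_NS = "http://www.loc.gov/METS/";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
    }

    /** Finds the dmdSec with the given ID, or returns null. */
    static Element getDmdSecById(Document doc, String id) {
        NodeList secs = doc.getElementsByTagNameNS(METS_NS, "dmdSec");
        for (int i = 0; i < secs.getLength(); i++) {
            Element sec = (Element) secs.item(i);
            if (id.equals(sec.getAttribute("ID"))) {
                return sec;
            }
        }
        return null;
    }

    /** Resolves the fptr children of a div to the file elements they point to. */
    static List<Element> getAllFilesForDiv(Document doc, Element div) {
        // index all <file> elements by their ID attribute
        Map<String, Element> filesById = new HashMap<>();
        NodeList files = doc.getElementsByTagNameNS(METS_NS, "file");
        for (int i = 0; i < files.getLength(); i++) {
            Element file = (Element) files.item(i);
            filesById.put(file.getAttribute("ID"), file);
        }
        // follow each direct <fptr FILEID="..."> child of the div
        List<Element> result = new ArrayList<>();
        for (Node n = div.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element
                    && METS_NS.equals(n.getNamespaceURI())
                    && "fptr".equals(n.getLocalName())) {
                Element file = filesById.get(((Element) n).getAttribute("FILEID"));
                if (file != null) {
                    result.add(file);
                }
            }
        }
        return result;
    }
}
```

A helper built on METSbeans would do the same traversal with `getFptrArray()` and typed file objects instead of string-based element lookups.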

  38. Extension schema: integration of extension schemas
     • Export METSbeans objects as a DOM tree
     • Create beans for the extension schemas as well: PREMIS, MODS, MIX beans

  39. Extension schema example: create MODS data
         MdSecType dmdSec = mets.addNewDmdSec();
         dmdSec.setID(dmdid_string);
         MdSecType.MdWrap mdwrap = dmdSec.addNewMdWrap();
         MdSecType.MdWrap.XmlData xml = mdwrap.addNewXmlData();
         ModsDocument mods = ModsDocument.Factory.newInstance();
         ModsType myMods = mods.addNewMods();
         xml.set(mods);

  40. Extension schema example: create <premis:object> data
         MdSecType.MdWrap mdwrap = dmdSec.addNewMdWrap();
         MdSecType.MdWrap.XmlData xml = mdwrap.addNewXmlData();
         ObjectDocument objdoc = ObjectDocument.Factory.newInstance();
         ObjectDocument.Object premis_object = objdoc.addNewObject();
         xml.set(objdoc);

  41. Extension schema example: parse MODS data
         MdSecType dmdSec;
         ....
         MdSecType.MdWrap mdw = dmdSec.getMdWrap();
         MdSecType.MdWrap.XmlData xml_data = mdw.getXmlData();
         String result = xml_data.xmlText();
         ModsDocument mods = ModsDocument.Factory.parse(result);

  42. Problems?! Quality of the API
     • The API depends on the XML schema; the quality of the API depends on the quality of the schema.
     • MetsType for <mets>, DivType for <div>, MdSecType for <dmdSec>, ...
     • but no type for the METS header <metsHdr>, as it is defined inline

  43. Problems?! Integration of extension schemas
     • Problematic if an extension schema does not have a top-level element (e.g. simple Dublin Core); parsing is especially difficult:
         String result = xml_data.xmlText();
         ModsDocument mods = ModsDocument.Factory.parse(result);
     • result must always contain a valid XML document!
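The top-level-element problem can be reproduced with the JDK parser alone. In the sketch below, two sibling Dublin Core elements (as they might sit side by side inside an `<xmlData>` wrapper) do not form an XML document on their own; enclosing them in a single container element does. The container name `dcContainer` and both method names are made up for illustration, and whether wrapping is acceptable depends on the profile in use.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class FragmentWrapSketch {
    /** True if the text parses as a complete XML document (exactly one root element). */
    static boolean isXmlDocument(String text) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            DocumentBuilder builder = dbf.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler());   // rethrow fatal errors, no stderr noise
            builder.parse(new InputSource(new StringReader(text)));
            return true;
        } catch (SAXException e) {
            return false;   // not well-formed, e.g. more than one top-level element
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encloses a fragment in a single (hypothetical) container element. */
    static String wrap(String fragment, String containerName) {
        return "<" + containerName + ">" + fragment + "</" + containerName + ">";
    }

    public static void main(String[] args) {
        String dc = "<dc:title xmlns:dc='http://purl.org/dc/elements/1.1/'>A title</dc:title>"
                + "<dc:creator xmlns:dc='http://purl.org/dc/elements/1.1/'>A creator</dc:creator>";
        System.out.println(isXmlDocument(dc));
        System.out.println(isXmlDocument(wrap(dc, "dcContainer")));
    }
}
```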

  44. How to continue: work with METSbeans
     • Everybody can create METSbeans by him/herself (see Apache xmlbeans)
     • Downloadable from the GDZ website
     • We will provide a primer as (incomplete) documentation for METSbeans

  45. How to continue Identify necessary functions for helper-class Over time we will identify additional methods which might be useful and should be integrated in the "helper-class".

  46. Application layer can be built on top of METSbeans: profile-specific implementations can be built on top of METSbeans and provide an API to the underlying document / content model.

  47. Application layer can be built on top of METSbeans (layer diagram): Application, Application; API for content model; helper class; METS API; XML serialization
