Developing a Metadata Exchange Format for Mathematical Literature
David Ruddy, Project Euclid, Cornell University Library
DML 2010, Paris, 7 July 2010
History
• Part of the early DML/WDML discussions
• Initial version of MLAP (qualified Dublin Core), 2004-2005
• Effort on a simple DC profile in 2005-2006
  • Thierry Bouche, Thomas Fischer, Claude Goutorbe, David Ruddy
• Dublin Core community refines and documents its concept of an Application Profile, 2007-2009
Dublin Core Application Profile
• Dublin Core Abstract Model
• Essentially, an RDF model
• All properties, vocabularies, and syntax encoding schemes identified by URIs
• Global semantic interoperability
• Semantic web, linked data
DCAP Compliance
• Functional requirements
• Domain model
• Description set profile
• Usage guidelines
• Syntax guidelines
MLAP Functional Requirements
• Typical functions of bibliographic records: find, identify, select, obtain
• Multilingual support
• Potential capabilities:
  • Linking to name authority records
  • Citation analysis
  • Embedded OpenURL Context Objects
  • Rich subject analysis
MLAP: Out of Scope
• Description of publications not available online
• Identification and description of distinct FRBR entities (supporting version control)
• Structured author/contributor descriptions
• Machine-processable descriptions of access embargo periods
MLAP Domain Model
• Entities of the application profile, and their relationships
[Diagram: Publication; Publication Container (0..1); Creator Agent (0..n)]
MLAP Description Set Profile
• Defines how metadata records adhere to the Description Set Model
• DSP uses a DC constraint language
  • Statement templates
  • Value constraints
• XML expression of the MLAP DSP: http://projecteuclid.org/documents/metadata/mlap/mlap_dsp.xml
MLAP Property Namespaces
• DCMI Metadata Terms
• PRISM: Publishing Requirements for Industry Standard Metadata
• DC Collections Metadata Terms
MLAP Usage Guidelines
• Human-readable presentation of DSP
• Additional content value rules and/or recommendations
• Examples
• MLAP usage guidelines (HTML): http://projecteuclid.org/documents/metadata/mlap/
MLAP Syntax Guidelines
• The Description Set Model is neutral regarding syntactic encoding of description sets
• DC provides specifications for how description sets may be serialized in plain text, XML, RDF/XML, and in XHTML meta tags
• MLAP usage guidelines encode examples in plain text, with alternate encodings in XML, and eventually RDF/XML
• Neutral approach allows for multiple ways to exchange metadata
@prefix dcterms: <http://purl.org/dc/terms/>
DescriptionSet (
  Description (
    ResourceURI ( <http://example.org/a/resource/uri> )
    Statement (
      PropertyURI ( dcterms:title )
      LiteralValueString ( "<div xmlns="http://www.w3.org/1998/Math/MathML">On <math alttext="$L$"><mi>L</mi></math>-functions of twisted <math alttext="$4$"><mn>4</mn></math>-dimensional quaternionic Shimura varieties</div>"
        Language ( en )
        SyntaxEncodingSchemeURI ( <http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral> )
      )
    )
  )
)
<?xml version="1.0" encoding="utf-8"?>
<dcds:descriptionSet xmlns:dcds="http://purl.org/dc/xmlns/2008/09/01/dc-ds-xml/">
  <dcds:description dcds:resourceURI="http://example.org/a/resource/uri">
    <dcds:statement dcds:propertyURI="http://purl.org/dc/terms/title">
      <dcds:literalValueString xml:lang="en"
          dcds:sesURI="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">
        <div xmlns="http://www.w3.org/1998/Math/MathML">On <math alttext="$L$"><mi>L</mi></math>-functions of twisted <math alttext="$4$"><mn>4</mn></math>-dimensional quaternionic Shimura varieties</div>
      </dcds:literalValueString>
    </dcds:statement>
  </dcds:description>
</dcds:descriptionSet>
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.org/a/resource/uri">
    <dcterms:title rdf:parseType="Literal">
      <div xmlns="http://www.w3.org/1998/Math/MathML">On <math alttext="$L$"><mi>L</mi></math>-functions of twisted <math alttext="$4$"><mn>4</mn></math>-dimensional quaternionic Shimura varieties</div>
    </dcterms:title>
  </rdf:Description>
</rdf:RDF>
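The encodings above carry the same statement about the same resource. As a minimal sketch (not part of MLAP itself), the Python below reads the DC-DS-XML and RDF/XML versions with the standard library and checks that both yield the same resource URI, property URI, and title text. The helper names `plain_text`, `statements_from_dcds`, and `statements_from_rdf` are hypothetical, and flattening the MathML to plain text is an illustrative choice, not an MLAP rule.

```python
import xml.etree.ElementTree as ET

DCDS = "http://purl.org/dc/xmlns/2008/09/01/dc-ds-xml/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The title markup from the slides, shared by both encodings.
TITLE_MARKUP = (
    '<div xmlns="http://www.w3.org/1998/Math/MathML">On '
    '<math alttext="$L$"><mi>L</mi></math>-functions of twisted '
    '<math alttext="$4$"><mn>4</mn></math>-dimensional quaternionic '
    'Shimura varieties</div>'
)

DCDS_XML = f"""<dcds:descriptionSet xmlns:dcds="{DCDS}">
  <dcds:description dcds:resourceURI="http://example.org/a/resource/uri">
    <dcds:statement dcds:propertyURI="http://purl.org/dc/terms/title">
      <dcds:literalValueString xml:lang="en"
          dcds:sesURI="{RDF}XMLLiteral">{TITLE_MARKUP}</dcds:literalValueString>
    </dcds:statement>
  </dcds:description>
</dcds:descriptionSet>"""

RDF_XML = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.org/a/resource/uri">
    <dcterms:title rdf:parseType="Literal">{TITLE_MARKUP}</dcterms:title>
  </rdf:Description>
</rdf:RDF>"""


def plain_text(element):
    """Flatten mixed content (text plus MathML) into one normalized string."""
    return " ".join("".join(element.itertext()).split())


def statements_from_dcds(xml_text):
    """Yield (resource URI, property URI, flattened value) from DC-DS-XML."""
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{DCDS}}}description"):
        resource = desc.get(f"{{{DCDS}}}resourceURI")
        for stmt in desc.findall(f"{{{DCDS}}}statement"):
            prop = stmt.get(f"{{{DCDS}}}propertyURI")
            value = stmt.find(f"{{{DCDS}}}literalValueString")
            yield resource, prop, plain_text(value)


def statements_from_rdf(xml_text):
    """Yield the same triples from the RDF/XML encoding."""
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF}}}Description"):
        resource = desc.get(f"{{{RDF}}}about")
        for prop_el in desc:
            # ElementTree tags look like "{namespace}local"; rejoin them
            # to recover the full property URI.
            namespace, local = prop_el.tag[1:].split("}")
            yield resource, namespace + local, plain_text(prop_el)


dcds_statements = list(statements_from_dcds(DCDS_XML))
rdf_statements = list(statements_from_rdf(RDF_XML))
```

Comparing `dcds_statements` with `rdf_statements` is one way to confirm that a record survived conversion between syntaxes.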
Minimal Record Requirements
• Four required elements:
  • <dcterms:title>
  • <dcterms:issued>
  • <dcterms:bibliographicCitation>
  • <prism:url>
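A check for the four required elements might look like the sketch below. It assumes a flat, hypothetical `<record>` wrapper element and the PRISM 2.0 basic namespace URI; the MLAP usage guidelines, not this example, define the actual encoding and namespaces. The function name `missing_required` and the sample values are illustrative only.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"
# Assumption: PRISM 2.0 basic namespace; MLAP may specify a different version.
PRISM = "http://prismstandard.org/namespaces/basic/2.0/"

# The four required properties named on the slide, as (namespace, prefix, local name).
REQUIRED = [
    (DCTERMS, "dcterms", "title"),
    (DCTERMS, "dcterms", "issued"),
    (DCTERMS, "dcterms", "bibliographicCitation"),
    (PRISM, "prism", "url"),
]


def missing_required(record_xml):
    """Return the prefixed names of required properties absent from a record."""
    root = ET.fromstring(record_xml)
    missing = []
    for namespace, prefix, local in REQUIRED:
        elements = root.findall(f"{{{namespace}}}{local}")
        # A property counts as present if it has text or child markup (e.g. MathML).
        if not any((el.text or "").strip() or len(el) for el in elements):
            missing.append(f"{prefix}:{local}")
    return missing


# Hypothetical record with placeholder values.
COMPLETE = f"""<record xmlns:dcterms="{DCTERMS}" xmlns:prism="{PRISM}">
  <dcterms:title>Example title</dcterms:title>
  <dcterms:issued>2010</dcterms:issued>
  <dcterms:bibliographicCitation>Example Journal 1 (2010), 1-10</dcterms:bibliographicCitation>
  <prism:url>http://example.org/a/resource/uri</prism:url>
</record>"""

INCOMPLETE = f"""<record xmlns:dcterms="{DCTERMS}" xmlns:prism="{PRISM}">
  <dcterms:title>Example title</dcterms:title>
  <prism:url>http://example.org/a/resource/uri</prism:url>
</record>"""
```

Here `missing_required(COMPLETE)` returns an empty list, while `missing_required(INCOMPLETE)` reports the two absent properties.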
Potential for Rich Records
• Multilingual values for many properties
• MathML in titles and abstracts
• Complete reference lists
• OpenURL Context Objects for described publication and all referenced resources
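To illustrate the last point, an OpenURL ContextObject for a journal article can be serialized as key/encoded-value (KEV) pairs under Z39.88-2004. The sketch below is only an assumption about what such an object could contain; how MLAP embeds ContextObjects in a record is defined by its usage guidelines, and the function name `context_object_kev` and the sample values are hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def context_object_kev(atitle, jtitle, date, doi=None):
    """Serialize a journal-article referent as an OpenURL 1.0 KEV string."""
    pairs = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        # Journal metadata format from the OpenURL registry.
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", atitle),
        ("rft.jtitle", jtitle),
        ("rft.date", date),
    ]
    if doi:
        # Identifiers for the referent travel in rft_id as info: URIs.
        pairs.append(("rft_id", f"info:doi/{doi}"))
    return urlencode(pairs)


# Placeholder values; 10.5555 is a DOI prefix reserved for examples and testing.
kev = context_object_kev(
    atitle="Example article title",
    jtitle="Example Journal",
    date="2010",
    doi="10.5555/12345678",
)
decoded = parse_qs(kev)
```

Decoding with `parse_qs` recovers the original values, which is a quick check that the percent-encoding round-trips.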
Dedicated Identifiers
• For example: <prism:url> for the publication’s HTTP URI, instead of <dcterms:identifier>
• Also: <prism:issn>, <prism:eIssn>, <prism:isbn>, <prism:doi>
• Likewise for the publicationContainer entity
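The idea of one dedicated property per identifier type, rather than a generic <dcterms:identifier>, can be sketched as follows. The `<record>` wrapper, the `add_identifiers` helper, the PRISM namespace URI, and all identifier values are assumptions made for illustration, not taken from the MLAP guidelines.

```python
import xml.etree.ElementTree as ET

# Assumption: PRISM 2.0 basic namespace; MLAP may specify a different version.
PRISM = "http://prismstandard.org/namespaces/basic/2.0/"
ET.register_namespace("prism", PRISM)

# Keyword name -> PRISM local name for each dedicated identifier property.
IDENTIFIER_PROPERTIES = {
    "url": "url",
    "issn": "issn",
    "eissn": "eIssn",
    "isbn": "isbn",
    "doi": "doi",
}


def add_identifiers(record, **identifiers):
    """Append one typed PRISM element per identifier to the record element."""
    for kind, value in identifiers.items():
        local = IDENTIFIER_PROPERTIES[kind]  # KeyError for unsupported kinds
        ET.SubElement(record, f"{{{PRISM}}}{local}").text = value
    return record


# Placeholder identifier values on a hypothetical <record> wrapper.
record = add_identifiers(
    ET.Element("record"),
    url="http://example.org/a/resource/uri",
    doi="10.5555/12345678",
    eissn="0000-0000",
)
xml_text = ET.tostring(record, encoding="unicode")
```

Because each identifier has its own element name, a consumer can select a DOI or ISSN directly instead of guessing the type of a generic identifier string.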
Unresolved Issues
• Optimized for serial literature
• Contributor property
  • Not easy to capture a role attribute
  • Potential solutions add complexity
• MSC codes do not have URIs