Vocabulary Workshop, RAL, February 25, 2009. NERC DataGrid Vocabulary Server Use Cases. Use Cases. Metadata population with verifiable content Dynamic drop-down lists Semantic cross-walk Smart discovery Vocabulary Server usage models. Metadata Population Use Case.
Vocabulary Workshop, RAL, February 25, 2009 • NERC DataGrid Vocabulary Server Use Cases
Use Cases • Metadata population with verifiable content • Dynamic drop-down lists • Semantic cross-walk • Smart discovery • Vocabulary Server usage models
Metadata Population Use Case • SeaDataNet is an EU project building a distributed data system across 30-40 European and Mediterranean data centres • Semantic infrastructure provided by NDG Vocabulary Server • SeaSearch was a precursor project federating metadata across a slightly smaller network • SeaSearch was plagued by local vocabulary maintenance allowing illegal values into documents • SeaDataNet adopted two strategies to address this
Metadata Population Use Case • Strategy 1: constraint through tooling • Provide a metadata editor that • Allows manual entry of XML metadata records • Exports a simple RDBMS schema into XML • Link this up to the Vocabulary Server API to • Populate drop-down lists • Verify fields populated from vocabularies as they are output • SeaDataNet Mikado tool does this
Metadata Population Use Case • Strategy 2: constraint through validation • Problem is not everybody uses the tools • SeaDataNet metadata documents include Schematron code to validate field content • Schematron maintained by software polling Vocabulary Server API • Records validated at source using Schematron-aware tool (e.g. Oxygen 8 or later) or on-line validation service
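Whatever the validation technology, the check behind both strategies reduces to a set-membership test of field content against the current vocabulary. The following is a minimal Python sketch of that test only; it is not Schematron and not the SeaDataNet implementation. The function name, the vocabulary snapshot and the illegal value `SDN:P021:24:SALT` are all hypothetical and chosen purely for illustration.

```python
def find_illegal_values(field_values, legal_terms):
    """Return the field values that are absent from the controlled vocabulary.

    Order of the input is preserved so that errors can be reported
    in the order they appear in the metadata record.
    """
    legal = set(legal_terms)
    return [value for value in field_values if value not in legal]


# Hypothetical snapshot of a vocabulary, as it might be cached after
# polling the Vocabulary Server (no server call is made here).
legal_terms = {"SDN:P021:24:TEMP", "SDN:P021:24:PSAL", "SDN:P021:24:CPWC"}

# Hypothetical record content: the second value is deliberately illegal.
record_fields = ["SDN:P021:24:TEMP", "SDN:P021:24:SALT"]

illegal = find_illegal_values(record_fields, legal_terms)
print(illegal)
```

Refreshing `legal_terms` on a schedule plays the same role as the software that polls the Vocabulary Server API to keep the Schematron up to date.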
Dynamic Drop-Down List Use Case • SeaDataNet marks up data using BODC Parameter Usage Vocabulary (21000 terms) • Navigation of something this size is a potential issue • Addressed by building three layers of increasingly broad terms over the top • Layers linked together using SKOS mappings
Dynamic Drop-Down List Use Case • Search client required to exploit this • An obvious design for this is a series of drop-down lists working down the hierarchy • These need to be dynamically populated to keep up to date with the master vocabulary versions
Dynamic Drop-Down List Use Case • The following URL gives all terms from the top level of the hierarchy: • http://vocab.ndg.nerc.ac.uk/list/P081/current • This may be used to set up a list of hot-linked labels pointing to Vocabulary Server concept URLs such as: • http://vocab.ndg.nerc.ac.uk/term/P081/3/DS02 • Represents the concept ‘chemical oceanography’ • When selected by the user a Vocabulary Server call is issued and…
Dynamic Drop-Down List Use Case • …we get a SKOS document thus:
<?xml version="1.0" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <skos:Concept rdf:about="http://vocab.ndg.nerc.ac.uk/term/P081/3/DS02">
    <skos:externalID>SDN:P081:3:DS02</skos:externalID>
    <skos:prefLabel>Chemical oceanography</skos:prefLabel>
    <skos:altLabel>Chemical oceanography</skos:altLabel>
    <skos:definition>The chemical oceanographic science domain</skos:definition>
    <dc:date>2009-02-10T10:30:20.052+0000</dc:date>
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/B007" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C003" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C005" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C010" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C015" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C017" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C020" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C025" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C030" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C035" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C040" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C045" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C050" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C055" />
  </skos:Concept>
</rdf:RDF>
Dynamic Drop-Down List Use Case • This delivers a set of URIs from the next level down in the hierarchy • Again, these may be displayed as hot-linked labels and again the user selects one to drill down into the next layer of the hierarchy through another VS call • Maris BV in the Netherlands have linked this to Ajax to produce a client
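As an illustration of the drill-down step, a client could extract the preferred label and the next-level URIs from the returned SKOS document roughly as follows. This is a sketch using Python's standard `xml.etree.ElementTree`, not the Maris BV client; the function name `narrower_concepts` is an assumption, and the sample document is an abbreviated copy of the response shown on the slide, embedded so that no server call is needed.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the SKOS document returned by the server.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def narrower_concepts(skos_xml):
    """Return (preferred label, list of narrowMatch URIs) for the first concept."""
    root = ET.fromstring(skos_xml)
    concept = root.find(f"{{{SKOS}}}Concept")
    label = concept.findtext(f"{{{SKOS}}}prefLabel")
    uris = [
        element.get(f"{{{RDF}}}resource")
        for element in concept.findall(f"{{{SKOS}}}narrowMatch")
    ]
    return label, uris


# Abbreviated copy of the slide's response (only two narrowMatch entries kept).
SAMPLE = """<?xml version="1.0" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://vocab.ndg.nerc.ac.uk/term/P081/3/DS02">
    <skos:prefLabel>Chemical oceanography</skos:prefLabel>
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/B007" />
    <skos:narrowMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P031/12/C003" />
  </skos:Concept>
</rdf:RDF>"""

label, uris = narrower_concepts(SAMPLE)
print(label)
for uri in uris:
    print(uri)
```

Each returned URI can then be used both as the hot link behind a label and as the argument of the next Vocabulary Server call.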
Semantic Crosswalk Use Case • BODC wishes to produce a GCMD DIF document from an EDMED V1.2 document • The “parameter” sections of the two documents are populated using different vocabularies (BODC PDV and GCMD Science Keywords) • This situation was usually addressed by having no parameter section in the output document. We can now do better…
Semantic Crosswalk Use Case • A list of BODC PDV terms as parameter URNs is obtained from the EDMED document, for example:
  SDN:P021:24:TEMP
  SDN:P021:24:PSAL
  SDN:P021:24:CPWC
• This may then be translated into a list of URLs:
  http://vocab.ndg.nerc.ac.uk/term/P021/24/TEMP
  http://vocab.ndg.nerc.ac.uk/term/P021/24/PSAL
  http://vocab.ndg.nerc.ac.uk/term/P021/24/CPWC
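The URN-to-URL translation is mechanical. The sketch below assumes that a URN has the form `SDN:<list>:<version>:<term>` and that the concept URL follows the `/term/<list>/<version>/<term>` pattern of the concept URL shown earlier (http://vocab.ndg.nerc.ac.uk/term/P081/3/DS02). The function name `urn_to_url` is hypothetical.

```python
BASE = "http://vocab.ndg.nerc.ac.uk/term"


def urn_to_url(urn, base=BASE):
    """Translate an 'SDN:<list>:<version>:<term>' URN into a concept URL.

    The URL layout is an assumption inferred from the example concept
    URL in the slides, not a documented specification.
    """
    parts = urn.split(":")
    if len(parts) != 4 or parts[0] != "SDN":
        raise ValueError(f"not an SDN parameter URN: {urn!r}")
    _, list_id, version, term = parts
    return f"{base}/{list_id}/{version}/{term}"


urns = ["SDN:P021:24:TEMP", "SDN:P021:24:PSAL", "SDN:P021:24:CPWC"]
urls = [urn_to_url(urn) for urn in urns]
for url in urls:
    print(url)
```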
Semantic Crosswalk Use Case • This list may be rolled into an HTTP GET request thus:
  http://vocab.ndg.nerc.ac.uk/axis2/services/vocab/getRelatedRecordByTerm?subjectTerm=http://vocab.ndg.nerc.ac.uk/term/P021/current/TEMP&subjectTerm=http://vocab.ndg.nerc.ac.uk/term/P021/current/PSAL&subjectTerm=http://vocab.ndg.nerc.ac.uk/term/P021/current/CPWC&objectList=http://vocab.ndg.nerc.ac.uk/list/P041/current&predicate=255&inferences=true
• An XML document is returned containing the GCMD Science Keywords that map to the three BODC terms as both text strings and URLs • The document may be reformatted using XSLT or XQuery to generate the “parameters” section for the DIF
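A request of this shape can be assembled programmatically. The sketch below uses Python's standard `urllib.parse.urlencode` and takes the endpoint and parameter names (`subjectTerm`, `objectList`, `predicate`, `inferences`) from the request on the slide; the function name `build_crosswalk_url` and its defaults are assumptions. It only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

ENDPOINT = "http://vocab.ndg.nerc.ac.uk/axis2/services/vocab/getRelatedRecordByTerm"


def build_crosswalk_url(subject_terms, object_list, predicate=255, inferences=True):
    """Build the GET URL that asks for terms in object_list mapped to subject_terms."""
    # A list of pairs lets the subjectTerm parameter repeat once per term.
    params = [("subjectTerm", term) for term in subject_terms]
    params.append(("objectList", object_list))
    params.append(("predicate", str(predicate)))
    params.append(("inferences", "true" if inferences else "false"))
    # Keep ':' and '/' unescaped so the result matches the form used on the slide.
    return ENDPOINT + "?" + urlencode(params, safe=":/")


subjects = [
    "http://vocab.ndg.nerc.ac.uk/term/P021/current/TEMP",
    "http://vocab.ndg.nerc.ac.uk/term/P021/current/PSAL",
    "http://vocab.ndg.nerc.ac.uk/term/P021/current/CPWC",
]
request_url = build_crosswalk_url(subjects, "http://vocab.ndg.nerc.ac.uk/list/P041/current")
print(request_url)
```

Leaving `:` and `/` unescaped is a readability choice; a stricter client could drop the `safe` argument and let `urlencode` percent-encode the embedded URLs.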
Smart Discovery Use Case • Ability to find datasets tagged ‘rainfall’ using the search term ‘precipitation’ • Also includes so-called ‘faceted searches’ • Find one ‘type of thing’ by searching for another ‘type of thing’ • For example: • Find datasets tagged ‘CTD’ (an instrument type) using the search term ‘salinity’ (a phenomenon) • Requires semantically rich relation ‘Salinity measuredBy CTD’ • System needs to understand ‘measuredBy’ (requires rules)
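The idea can be illustrated with a toy triple set and a single hard-coded rule. This is a sketch only, not a real inference engine and not the Vocabulary Server API: the relation name `relatedTo` linking ‘rainfall’ and ‘precipitation’, the dataset names and all function names are hypothetical, while ‘Salinity measuredBy CTD’ comes from the slide.

```python
# Toy triple store: (subject, predicate, object).
TRIPLES = {
    ("salinity", "measuredBy", "CTD"),
    ("rainfall", "relatedTo", "precipitation"),  # hypothetical relation name
}

# Hypothetical datasets and the tags attached to them.
DATASETS = {
    "cruise_A": ["CTD"],
    "gauge_B": ["rainfall"],
    "buoy_C": ["wave height"],
}


def expand_term(term, triples):
    """Expand a search term into the set of tags that should also match.

    Rules (one step only, no chaining):
      - X measuredBy Y : searching for X also matches tag Y
      - X relatedTo Y  : symmetric, each term matches the other
    """
    tags = {term}
    for subject, predicate, obj in triples:
        if predicate == "measuredBy" and subject == term:
            tags.add(obj)
        elif predicate == "relatedTo" and term in (subject, obj):
            tags.add(obj if subject == term else subject)
    return tags


def find_datasets(term, datasets, triples):
    """Return the sorted names of datasets carrying any tag implied by the term."""
    tags = expand_term(term, triples)
    return sorted(name for name, ds_tags in datasets.items() if tags & set(ds_tags))


print(find_datasets("salinity", DATASETS, TRIPLES))
print(find_datasets("precipitation", DATASETS, TRIPLES))
```

The rules are written directly into `expand_term`, which is exactly the part that a rule-aware system would have to generalise.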
Smart Discovery Use Case • Operational Smart Discovery requires: • An extensively populated full-blown ontology • A state-of-the-art inference engine • VS API has Smart Discovery support methods • Based on SQL search on relational triple store • Inference functionality would need a locally-developed inference engine • Produces impressive demonstrations but is not scalable to operational use
VS Usage Models • The dynamic drop-down list use case may be implemented in at least three ways • Client issues a VS call on each user interaction returning a relatively small XML document • Client uses one VS call to download the entire thesaurus into an RDF-aware tool and then interacts through a local API • Entire thesaurus loaded into RDF-aware tool on the server that is interrogated by the client through something like SPARQL
VS Usage Models • Method 1 • Experience shows it to work well for the first three use cases • Smart Discovery could potentially require hundreds of server calls per query • Method 2 • Requires a thick client • Could be part of an installed package • Provides access to inference engines • Well-suited to Smart Discovery • Untested as far as we know
VS Usage Models • Method 3 • Being developed by Marine Metadata Interoperability (MMI) project based on OWL rather than SKOS. • Provides access to inference engines • Well-suited to Smart Discovery support