SeaDataNet and EMODNET Vocabularies Roy Lowry Adam Leadbetter British Oceanographic Data Centre
Overview • Automated parameter aggregation (P35) vocabulary status • EMODNET chemical filter • P01 semantic model exposure status • Management of concept deprecations
P35 Status • P35 is a vocabulary of parameters for EMODNET chemistry lot products • EMODNET data parameters are marked up using P01 vocabulary • P01 is much finer grained than P35 • Therefore aggregation of P01 parameters into P35 parameters is required • To date this has been done by a lot of painful manual work in ODV
P35 Status • However within NVS it is possible to maintain and serve a mapping between P01 and P35 • Each P35 concept has a URL that resolves to an XML document • This document can be used to drive automated parameter aggregation by identifying all P01 codes that may be incorporated into a P35 code
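The automated aggregation described on this slide can be sketched as follows. This is a minimal illustration, not the NVS implementation: the sample document, the `example.org` URIs, the codes `EXAMPLE1`, `CODEAAAA` and `CODEBBBB`, and the use of `skos:narrower` to carry the P35-to-P01 mapping are all assumptions made for the example. The layout of a real P35 concept document may differ.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical, hand-written stand-in for the XML document behind a P35 URL.
SAMPLE_P35_DOC = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/collection/P35/current/EXAMPLE1/">
    <skos:prefLabel>Example aggregated parameter</skos:prefLabel>
    <skos:narrower rdf:resource="http://example.org/collection/P01/current/CODEBBBB/"/>
    <skos:narrower rdf:resource="http://example.org/collection/P01/current/CODEAAAA/"/>
    <skos:related rdf:resource="http://example.org/collection/P06/current/UNIT/"/>
  </skos:Concept>
</rdf:RDF>
"""


def p01_codes_for(p35_xml: str) -> list[str]:
    """Return the sorted P01 codes that the P35 document maps to (assumed via skos:narrower)."""
    root = ET.fromstring(p35_xml)
    codes = []
    for element in root.iter(f"{{{SKOS}}}narrower"):
        uri = element.get(f"{{{RDF}}}resource", "")
        parts = uri.rstrip("/").split("/")
        # Keep only mappings that point into the P01 vocabulary.
        if "P01" in parts:
            codes.append(parts[-1])
    return sorted(codes)


if __name__ == "__main__":
    print(p01_codes_for(SAMPLE_P35_DOC))
```

A data tool could then pool every variable marked up with one of the returned P01 codes into the single P35 product parameter.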
P35 Status • P35 presents design issues • P35 granularity (e.g. should there be separate products for unfiltered and filtered samples) • Which P01 terms should map to a given P35 term? • Design issues need governance - domain experts who can make these decisions • Governance now established based on experts from EMODNET partners communicating by list-server e-mail and Webex.
P35 Status • Example P35 concept (ITS90 temperature) set up in October • Just over 100 additional entries considered by governance and loaded this month • These cover • Salinity • Dissolved oxygen • Nutrients • Metals in the water column • Next target is metals in sediments and biota (900-1000 entries) • P35 could easily reach several thousand entries
EMODNET Chemical Filter • Need to consider what is required here • One approach is to specify a list of P02 codes that cover the themes included in the EU legislation • This comes with risks • Some data outside the intended scope will be captured (e.g. methylated arsenic in a trawl designed for organotins) • Easy to overlook consequences of any P02 rationalisation • P02 list can be tested against P35 (both map to P01)
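Because both P02 and P35 map to P01, the test mentioned in the last bullet can be reduced to a set comparison. The sketch below assumes the two mappings are already available as plain dictionaries; the function name `compare_filters` and every code in the sample data are hypothetical placeholders, not real vocabulary entries.

```python
def compare_filters(p02_list, p02_to_p01, p35_to_p01):
    """Compare the P01 codes captured by a P02 list with those covered by P35."""
    # P01 codes reached through the candidate P02 list.
    via_p02 = set().union(*(p02_to_p01.get(code, set()) for code in p02_list))
    # P01 codes reached through any P35 concept.
    via_p35 = set().union(*p35_to_p01.values())
    return {
        # Captured by the P02 list but not part of any P35 product.
        "outside_p35": sorted(via_p02 - via_p35),
        # Needed by a P35 product but not captured by the P02 list.
        "missed_by_p02": sorted(via_p35 - via_p02),
    }


# Hypothetical mappings for illustration only.
P02_TO_P01 = {
    "P02_METALS": {"P01_CADMIUM", "P01_METHYL_AS"},
    "P02_NUTRIENTS": {"P01_NITRATE"},
}
P35_TO_P01 = {
    "P35_CADMIUM": {"P01_CADMIUM"},
    "P35_NITRATE": {"P01_NITRATE"},
    "P35_OXYGEN": {"P01_OXYGEN"},
}

if __name__ == "__main__":
    print(compare_filters(["P02_METALS", "P02_NUTRIENTS"], P02_TO_P01, P35_TO_P01))
```

A non-empty `outside_p35` list corresponds to the over-capture risk noted on the slide, and a non-empty `missed_by_p02` list points to gaps in the P02 list.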
EMODNET Chemical Filter • Alternative approach • Capture P01 codes through data mining • Translate P35 into a list of P01 codes • Do the chemical filter on the basis of P01 rather than P02
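A P01-based filter along these lines might look like the sketch below. It assumes the P35-to-P01 mapping has already been harvested into a dictionary and that each data record carries its P01 code under a `p01` key. The record layout, the function names and the codes are illustrative assumptions.

```python
def build_p01_index(p35_to_p01):
    """Invert a P35 -> P01 mapping so each P01 code lists the P35 concepts it feeds."""
    index = {}
    for p35_code, p01_codes in p35_to_p01.items():
        for p01_code in p01_codes:
            index.setdefault(p01_code, set()).add(p35_code)
    return index


def chemical_filter(records, index):
    """Keep only records whose P01 code is in the index, tagging each with its P35 codes."""
    return [
        dict(record, p35=sorted(index[record["p01"]]))
        for record in records
        if record["p01"] in index
    ]


# Hypothetical mapping and records for illustration only.
P35_TO_P01 = {"P35_CADMIUM": {"P01_CD_DISS", "P01_CD_TOTAL"}}
RECORDS = [
    {"station": "A", "p01": "P01_CD_DISS"},
    {"station": "B", "p01": "P01_METHYL_AS"},
]

if __name__ == "__main__":
    print(chemical_filter(RECORDS, build_p01_index(P35_TO_P01)))
```

Filtering on P01 in this way sidesteps the coarser P02 groupings, so a record is only admitted when its exact parameter code belongs to a P35 product.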
P01 Model Exposure Status • Both ODIP and EMODNET require access to the factored semantic model that underpins P01 • Strong pressure from ODIP (primarily Simon Cox) for this to be delivered through RDF-XML • For this every element of the factorisation requires a URI • This requires every element to be covered by a controlled vocabulary
P01 Model Exposure Status • The biological entity in the factorisation is already covered (S25 vocabulary) • Parameter - matrix relationship already covered (S02 vocabulary) • Currently working on the matrix entity • Concepts like 'water body particulate >0.2um phase' • Taking longer than expected (part-time working, EMODNET, IMOS vocabulary demands, past misdemeanours) • But getting very close • Then we just need the parameter entity
Concept Deprecation • Many SeaDataNet vocabularies have evolved, with concepts added to satisfy specific demands • Governance explicitly prohibits deletion • This leads to issues • Unintended duplicates • Cause confusion • Unnecessarily complicate aggregation • Variable granularity • Discovery made more difficult (too many terms) • Patchy domain coverage • Unnecessarily complicates metadata markup
Concept Deprecation • NVS 1.0 handled deprecation poorly (URI changed) • Issues addressed in NVS 2.0 • All payload documents include • skos:note element set to 'accepted' or 'deprecated' • owl:deprecated boolean element • Deprecated concept documents also have a dc:isReplacedBy element • Full controlled vocabulary requests may be designated 'accepted', 'deprecated' or 'all' (default)
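A client could read the deprecation elements listed on this slide roughly as sketched below. The sample payload is hand-written for illustration: the `example.org` URIs, the codes `OLDCODE` and `NEWCODE`, the namespace bound to the `dc` prefix and the exact nesting of elements are assumptions, and a real NVS 2.0 document may be laid out differently.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
OWL = "http://www.w3.org/2002/07/owl#"
DC = "http://purl.org/dc/terms/"  # assumed binding for the dc prefix

# Hypothetical payload for a deprecated concept.
SAMPLE_PAYLOAD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:dc="http://purl.org/dc/terms/">
  <skos:Concept rdf:about="http://example.org/collection/P02/current/OLDCODE/">
    <skos:note>deprecated</skos:note>
    <owl:deprecated>true</owl:deprecated>
    <dc:isReplacedBy rdf:resource="http://example.org/collection/P02/current/NEWCODE/"/>
  </skos:Concept>
</rdf:RDF>
"""


def deprecation_info(payload_xml: str) -> dict:
    """Extract the concept URI, its deprecation flag and any replacement URI."""
    root = ET.fromstring(payload_xml)
    concept = root.find(f"{{{SKOS}}}Concept")
    flag = concept.findtext(f"{{{OWL}}}deprecated", default="false")
    replacement = concept.find(f"{{{DC}}}isReplacedBy")
    return {
        "uri": concept.get(f"{{{RDF}}}about"),
        "deprecated": flag.strip().lower() == "true",
        "replaced_by": None if replacement is None else replacement.get(f"{{{RDF}}}resource"),
    }


if __name__ == "__main__":
    print(deprecation_info(SAMPLE_PAYLOAD))
```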
Concept Deprecation • Concept deprecation causes issues for the SeaDataNet architecture • Deprecated concepts contained in SeaDataNet metadatabases • Deprecated concepts in SeaDataNet filestock • Consequently, much needed vocabulary improvements (P03, P02) held back due to concern about the consequences
Concept Deprecation • The following deprecation support is needed: • Deprecation handling within the SeaDataNet vocabulary client, which could either • Only display accepted concepts (easy to implement) • Flag the deprecated concepts (more work but a better result) • Automatic parameter substitution in metadatabase file and data file import tools • Metadatabase sweepers (run regularly to clean up any concepts that have been deprecated since ingestion)
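The substitution and sweeper ideas can be sketched together as below. The sketch assumes the replacement relationships have already been collected into a dictionary mapping each deprecated code to its successor; the function names, the record layout and the codes are hypothetical. Replacement chains are followed to their end, with a guard against circular replacements.

```python
def resolve(code, replaced_by):
    """Follow the replacement chain from a code to its current accepted code."""
    seen = set()
    while code in replaced_by:
        if code in seen:
            raise ValueError(f"circular replacement chain involving {code!r}")
        seen.add(code)
        code = replaced_by[code]
    return code


def sweep(records, replaced_by, field="parameter"):
    """Return updated copies of the records and the number of substitutions made."""
    swept = []
    changed = 0
    for record in records:
        current = resolve(record[field], replaced_by)
        if current != record[field]:
            changed += 1
        swept.append(dict(record, **{field: current}))
    return swept, changed


# Hypothetical replacement table: OLD1 was replaced by OLD2, which was replaced by NEW1.
REPLACED_BY = {"OLD1": "OLD2", "OLD2": "NEW1"}
RECORDS = [{"id": 1, "parameter": "OLD1"}, {"id": 2, "parameter": "KEEP"}]

if __name__ == "__main__":
    print(sweep(RECORDS, REPLACED_BY))
```

An import tool could call `resolve` on each incoming code, while a sweeper could run `sweep` periodically over records that are already stored.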