Moving ISO & OGC standards into the Semantic Web
Presented at “Metadata DownUnder”, 11th Open Forum on Metadata Registries, Sydney, NSW, Australia
Water For a Healthy Country
Laurent Lefort, 22 May 2008
Outline
• Ontologies and water data standards
• Transforming ISO TC 211 and OGC standards into ontologies
• Work on OWL versions for multiple standards
• Findings on the transformation methods
• Findings on the resulting ontologies and on how to build better ontologies
• What is the added value of Semantic Web technologies?
CSIRO Moving ISO TC 211 & OGC standards into the Semantic Web “Metadata DownUnder”: 11th Open Forum on Metadata Registries Sydney, NSW, Australia
Context: water resources management for Australia
• Water Resources Observation Network (WRON) program
• One of CSIRO’s Water for a Healthy Country Flagship themes
• Support to major research alliance between Bureau of Meteorology & CSIRO (WIRADA) to deliver mission-critical R&D
• Specific activity on water data standards
  • Usage and entitlement data
  • Hydrometric data
  • Geospatial data
  • Models
Source: Vertessy 2006: Australia’s water resources information imperative and the role of the Water Resources Observation Network (WRON)
Interest in water data standards
• Water budget combining 4 sub-domains
  • Atmospheric water (& climate)
  • Surface water
  • Groundwater (& geology)
  • Human use of water
• Need to manage features and observations
• Complex cross-domain interactions, e.g. transfer between surface water and groundwater
• Need for a consistent standard basis (& method)
• Data and metadata
Generations of “standards” & integration complexity
[Figure: surface water & groundwater “standards” by generation and integration support, from standard users to standard developers. Labels in the figure: OWL ontologies; semantic integration; registries; Master Data Management; WOML; GWML; model-driven generation of XML schemas; UML & XML schemas; eWater (EU); SANDRE XML; WaterML (CUAHSI); XML schemas; reusable XML schema stack; WFD schemas; EPA WQX; XML; custom XSL transformations & web services; ODM; EPA STORET; DB-based; distributed systems with same DB schema; SANDRE; ASCII-based]
Semantic Web technologies for standards development
• RDF (Resource Description Framework) for the “web of data”: annotations and links
  • Value: flattened, web-compatible method to manage and link data as sets of triples
• OWL (Web Ontology Language) for the web of (data) models
  • Several variants based on description logic, with different expressivity/scalability trade-offs
  • Value: reasoning support to build the class hierarchy and verify logical consistency
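To make the “set of triples” idea concrete, here is a minimal sketch in plain Python (no RDF library). The `http://example.org/water#` namespace and the class and instance names are hypothetical; only the `rdf:type` and `rdfs:subClassOf` URIs are standard. The small closure function stands in, very roughly, for the class-hierarchy reasoning an OWL reasoner would provide.

```python
# Hypothetical example data: the example.org URIs are illustrative only.
EX = "http://example.org/water#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# A graph is just a set of (subject, predicate, object) triples.
triples = {
    (EX + "Aquifer", RDFS_SUBCLASS, EX + "WaterFeature"),
    (EX + "River", RDFS_SUBCLASS, EX + "WaterFeature"),
    (EX + "aquifer42", RDF_TYPE, EX + "Aquifer"),
}


def superclasses(cls, graph):
    """Return all direct and inherited superclasses of cls."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == RDFS_SUBCLASS and o not in found:
                found.add(o)
                todo.append(o)
    return found


def instance_types(individual, graph):
    """Return the asserted types of an individual plus the inherited ones."""
    direct = {o for s, p, o in graph if s == individual and p == RDF_TYPE}
    inferred = set()
    for cls in direct:
        inferred |= superclasses(cls, graph)
    return direct | inferred
```

Calling `instance_types(EX + "aquifer42", triples)` returns both the asserted class and its superclass, which is the kind of inference the slide attributes to OWL tooling.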
Building expectation that OWL can be useful
• Past and present efforts to create and use OWL versions of standards
  • Drexel University (HydroSeek)
  • University of Muenster (ACE-GIS, SWING, EDINA)
• Discussions at the Water Resources Information Model Workshop (Canberra, Sep 2007)
• Recognition of the ontological value of some standards, e.g. OGC Observations and Measurements
  • Finney: Australian Marine Ontology, WALIS Forum 2008
  • Brodaric & Probst: DOLCE Rocks, AAAI Spring Symp. 2008
Reasons to share our experience in building ontologies
• High demand for OWL versions of standards
• Transition and ramping-up period from a manual process to a semi-automated one
  • Recently developed methods (ODM) and tools (TopBraid) to create ontologies from UML models or from XML schemas
• Re-evaluation of current standard development practice
  • Push for harmonisation of spatial standards (INSPIRE)
  • Development of OGC model-driven approach
  • ISO 19150 Ontology group, led by Jean Brodeur
• Can SW help ISO TC 211? Can ISO TC 211 help SW?
What we present today
• Work on OWL versions for multiple standards
  • ISO and OGC standards
  • Standards based on ISO and OGC standards defined for the water domain
• Findings on the transformation method
  • Comparison of ontology generation tools from UML models and from XML schemas
• Findings on the resulting ontologies
  • Tactics to build better ontologies
Standards to transform into OWL
• Focus on water standards describing Features & Observations because of our interest in:
  • Reference datasets (continental scale)
  • Identification of water features and of their topological and hydrological relationships
  • Data exchange language for individual and aggregated observations
Two key building blocks to build features and observations standards
• ISO 19109: Geographic information -- Rules for application schema
  • Defines a method to specify features, known as the General Feature Model (GFM)
• OGC Observations and Measurements (O&M)
  • Refines the GFM method to manage observations
• Supported by common schema generation technologies (UML to XML schemas)
  • To implement UML patterns out of “stereotypes”
  • To create definitions on top of existing schemas
  • Example of tools: ShapeChange, FullMoon (CSIRO)
Common principles used for standards based on GFM and O&M
• A core model defines the main classes forming the standard
  • Through their relation to other specified classes or to generic spatial definitions
• Extra design flexibility is given in three areas
  • Attachment of properties to features
  • Introduction of externally managed code lists
  • Provision for alternative usage (union)
• Specific restrictions on the applicability of the definitions can be added with a constraint language, such as Schematron
Added value of the Observations & Measurements standard
• Two user-managed class hierarchies in GFM-based specs:
  • Feature and FeaturesCollection: a Feature-type is characterized by a specific set of properties
• Up to five user-managed class hierarchies in O&M-based specs
  • Observation, SamplingFeature, PropertyType, Procedure and Result
  • An Observation is an Event whose result is an estimate of the value of some Property of the Feature-of-interest, obtained using a specified Procedure
• Stronger ontological value for O&M
  • More branches and separation of concerns
  • Example: difference between Feature and SamplingFeature
    • Feature for the real-world objects, e.g. an aquifer
    • SamplingFeature to characterise how a measurement is made, e.g. along a borehole
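The O&M definition of an Observation quoted above can be sketched as a handful of Python dataclasses. This is an illustration of the pattern only, not the O&M schema itself: the field names are simplified, and the aquifer, borehole, procedure and result values are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Feature:
    """A real-world object, e.g. an aquifer."""
    name: str


@dataclass(frozen=True)
class SamplingFeature:
    """Describes how the measurement is made, e.g. along a borehole."""
    name: str
    sampled_feature: Feature


@dataclass(frozen=True)
class Observation:
    """An event whose result estimates a property of the feature of interest."""
    feature_of_interest: Union[Feature, SamplingFeature]
    observed_property: str
    procedure: str
    result: Any


# Hypothetical example values, for illustration only.
aquifer = Feature("Example aquifer")
borehole = SamplingFeature("Example borehole", sampled_feature=aquifer)
obs = Observation(
    feature_of_interest=borehole,
    observed_property="water level",
    procedure="manual dip reading",
    result=12.5,
)
```

Keeping `Feature` and `SamplingFeature` as separate types mirrors the separation of concerns noted on the slide: the observation is attached to the borehole, and the borehole points back to the aquifer it samples.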
Standards transformed into OWL
• Application standards based on
  • ISO TC 211 General Feature Model
  • OGC Observations & Measurements
• Corresponding ISO/OGC standards from two origins:
  • UML model grouping all the ISO TC 211 standards
    • From the Harmonized Model Maintenance Group
  • XML schemas from OGC (schemas.opengis.net)
    • Including GML, SensorML, …
Selected ontology generation methods
• XSL-based approaches
  • XO (CSIRO-developed) from UML 2.0 or XML schemas to OWL
  • Rhizomik.net xsd2owl.xsl (open source but restricted to non-commercial usage)
• TopBraid Composer (commercial tool)
  • Transformation from UML 2.0 and XML schemas to OWL
  • Enterprise Architect files can be pre-processed with an EA-specific openArchitectureWare plugin
Generated ontologies
Example 1: om:Observation from XML schemas (TopBraid)
Example 2: om:Observation from UML model (TopBraid)
• Long URIs based on package names
  • hasFeatureOfInterest: <http://ogc.uml/Model/Model/Externally-governed-packages/HollowWorld/CommonUsagePackages/ISO-19110/ISO-19115-Metadata/Metadata-entry-set-information/MD_Metadata>[0..1]
Example 3: om:Observation from UML model (XO)
Example 4: om:Observation from XSD (Rhizomik)
Important findings
• XML schemas easier to transform than UML models
  • As long as the transformation tool is able to process tricky xsd:include and xsd:import cases
• Modularity schemes in place for UML or XML schemas are not necessarily directly applicable in OWL
  • Suggested alternative is to simply use the XML namespace scheme to group together schemas sharing the same namespace into one or a limited number of modules
• The method to define URIs works better with XML schemas than with UML models
• XSL-based approaches better handle low-quality (or incomplete) UML input
  • Known problems with UML/XMI files
XML schemas easier to transform than UML models
• UML models
  • High variability in the usage of stereotypes
  • Risk of problems if the UML model is messy or not fully validated
• XML schemas
  • Availability of validation tools, even for multi-part schemas
  • Less work to interpret the modelling intent
  • Always available, directly or after generation from UML
  • Tighter management of successive versions
• Being able to generate the same output from both types of input for the same standard is critical to strengthen the transformation process
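Handling xsd:include and xsd:import starts with knowing which other schema files a given schema depends on. The sketch below uses only Python's standard `xml.etree.ElementTree`; the sample schema, its `http://example.org/water` namespace and its file names are hypothetical, and a real transformation tool would also need to resolve the locations and recurse into them.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical multi-part schema, for illustration only.
SAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/water">
  <xs:include schemaLocation="features.xsd"/>
  <xs:import namespace="http://www.opengis.net/gml"
             schemaLocation="gml.xsd"/>
</xs:schema>
"""


def schema_dependencies(xsd_text):
    """Return (targetNamespace, included locations, imported (namespace, location) pairs)."""
    root = ET.fromstring(xsd_text.strip())
    # xsd:include pulls in more definitions for the SAME target namespace.
    includes = [
        e.get("schemaLocation") for e in root.findall(f"{{{XSD_NS}}}include")
    ]
    # xsd:import references definitions from a DIFFERENT namespace.
    imports = [
        (e.get("namespace"), e.get("schemaLocation"))
        for e in root.findall(f"{{{XSD_NS}}}import")
    ]
    return root.get("targetNamespace"), includes, imports


target, includes, imports = schema_dependencies(SAMPLE_XSD)
```

The distinction matters for the module question discussed later in the deck: included files share the target namespace and are natural candidates for the same ontology module, while imported files belong to another namespace.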
Modules (files), namespaces (prefix) and URIs (IDs) in OWL
Difference with XML: the same namespace cannot be used in different modules
Method to define ontology modules
• OMG ODM recommendation to replicate UML packages is inapplicable in our view
  • Too many modules, and the wrong ones
  • TopBraid’s UML import operation creates 184 OWL files for the O&M model (which includes the ISO TC 211 standards)
• Recommendation
  • For XML schemas, group together schemas sharing the same namespace into one or a limited number of modules
  • Define a method producing the same results for UML models
  • Record the source module or schema as an annotation property for traceability and/or round-trip purposes
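The recommendation for XML schemas amounts to grouping schema files by target namespace. A minimal sketch, assuming the (file, namespace) pairs have already been extracted; the file names are hypothetical and the namespaces are shown for illustration only. The list of source files kept per module plays the role of the traceability annotation.

```python
from collections import defaultdict

# Hypothetical inventory of schema files and their target namespaces.
SCHEMAS = [
    ("om/observation.xsd", "http://www.opengis.net/om/1.0"),
    ("om/om.xsd", "http://www.opengis.net/om/1.0"),
    ("gml/feature.xsd", "http://www.opengis.net/gml"),
    ("gml/geometryBasic0d1d.xsd", "http://www.opengis.net/gml"),
    ("gml/units.xsd", "http://www.opengis.net/gml"),
]


def group_into_modules(schemas):
    """Group schema files into one module per target namespace.

    Returns {namespace: [source files]}; the file list records where each
    module's definitions came from.
    """
    modules = defaultdict(list)
    for file_name, namespace in schemas:
        modules[namespace].append(file_name)
    return {ns: sorted(files) for ns, files in modules.items()}


modules = group_into_modules(SCHEMAS)
```

Here five schema files collapse into two modules, in contrast to the one-module-per-package scheme that produced 184 OWL files.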
Method to define URIs
• Using UML package names to create URIs is not recommended
  • See example 2
• Keeping the original XML schema namespace works well in practice
• Maybe two generation options are needed
  • To create separate definitions for different versions of the same source
  • To merge definitions from different versions of the same source
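The two generation options can be sketched as a single URI-minting function with a switch. This is an assumption-laden illustration: the `#` separator, the rule that a trailing `/1.0`-style path segment is the version, and the function name `mint_uri` are all hypothetical choices, not something prescribed by the standards or the tools discussed.

```python
import re

# Assumption: a version appears as a trailing numeric path segment, e.g. ".../om/1.0".
VERSION_SUFFIX = re.compile(r"/\d+(\.\d+)*/?$")


def mint_uri(namespace, local_name, merge_versions=False):
    """Build a class or property URI from an XML schema namespace.

    merge_versions=False keeps the namespace as-is, so each version of the
    source gets separate definitions. merge_versions=True strips the trailing
    version segment, so definitions from different versions share one URI.
    """
    if merge_versions:
        namespace = VERSION_SUFFIX.sub("", namespace)
    return namespace.rstrip("/#") + "#" + local_name


separate = mint_uri("http://www.opengis.net/om/1.0", "Observation")
merged = mint_uri("http://www.opengis.net/om/1.0", "Observation", merge_versions=True)
```

Either way the resulting URI stays short and recognisable, unlike the package-path URI shown in example 2.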
Three central issues with the current OMG ODM specification
• Module derivation from packages
  • Impossible to apply in practice (too many modules)
• Naming conventions to disambiguate property names
  • Can lead to an explosion in the number of properties, which is often not required
• Does not discuss the union & substitution group patterns, which are widely used in ISO/OGC standards
Building better ontologies
• Assumption
  • Ontologies can help non-expert users of standards understand them without reading the documentation or learning how they have been created
• This discussion
  • Tactics to capture the semantic essence of ISO/OGC & derived standards
Tactic 1: Stick to the original definitions
• Rendition of ISO standards which mirrors the original UML model
• Drexel University team
  • ISO and OGC ontologies in OWL-Protégé 2.1 (2004-05)
  • ISO 19103, 19107-12, 19115, OGC Spatial referencing by Coordinates and GML
• Success factor: traceability to the origin of definitions (often overlooked)
Tactic 1: benefits of traceability
• Handle multiple definitions of Observations
  • OM1_Observation: published OGC O&M spec. (part 1) version 1.0
  • OM: draft version of O&M
  • GML: gml:Observation
Tactic 2: Modularise, Winnow, Align with Upper Ontology
• University of Muenster (and EU project partners)
  • ACE-GIS: OWL-Protégé 1.2 (2004), SERES: OWL-Protégé 2.2 (2005), SWING: WSML (2008)
  • Spatial representation (19107), Location (19111-19112), O&M (alignment with DOLCE and SWEET)
• Generally based on a costly manual process
• Matches what the end user wants
• Weaker traceability to the sources of definitions
Tactic 3: Try to do both
• Replace the manual process by a smarter transformation designed to normalise the ontology skeleton
  • Define the right branches at the top
  • Isolate unambiguous primitives (e.g. units)
  • Use modules/namespaces/URIs to position source-specific definitions against common ones
• Specific effort needed to
  • Reduce the number of root classes
  • Create deeper class & property hierarchies
  • Handle ambiguous property definitions
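Two of the efforts listed above, fewer root classes and deeper hierarchies, are easy to measure on a generated ontology. The sketch below assumes the class hierarchy has been exported as a set of (child, parent) pairs and contains no cycles; the class names are hypothetical.

```python
# Hypothetical generated ontology: class names are illustrative only.
CLASSES = {"Observation", "Measurement", "Procedure", "Sensor", "UnitOfMeasure"}
SUBCLASS_OF = {
    ("Measurement", "Observation"),
    ("Sensor", "Procedure"),
}


def root_classes(classes, subclass_of):
    """Return the classes that have no declared superclass."""
    children = {child for child, _ in subclass_of}
    return sorted(c for c in classes if c not in children)


def depth(cls, subclass_of):
    """Length of the longest superclass chain above cls (0 for a root).

    Assumes the hierarchy is acyclic.
    """
    parents = [parent for child, parent in subclass_of if child == cls]
    if not parents:
        return 0
    return 1 + max(depth(p, subclass_of) for p in parents)


roots = root_classes(CLASSES, SUBCLASS_OF)
max_depth = max(depth(c, SUBCLASS_OF) for c in CLASSES)
```

Tracking the root count and maximum depth before and after a normalisation step gives a simple check that the transformation is actually moving the skeleton in the intended direction.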
Example of normalised ontology skeleton
• Define the right branches at the top
• Isolate unambiguous primitives (e.g. units)
• Use modules/namespaces/URIs to position source-specific definitions against common ones
Conclusions
• Better ontology generation tactics can help to satisfy the demand for OWL versions of (groups of) standards
• Three priority areas have been identified
  • Systematically develop parallel transformation chains from UML and XML schemas to enable cross-checking of outputs
  • Develop more convenient and more robust modularity, namespace and URI schemes
  • Give feedback to ISO/OGC Policy group on the compatibility of their approach with OWL
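Cross-checking the outputs of the two transformation chains can be as simple as a set comparison, provided both chains emit triples over the same URIs. A minimal sketch with hypothetical, abbreviated triples; real outputs would first need to be parsed and normalised.

```python
# Hypothetical outputs of the two chains, as (subject, predicate, object) triples.
FROM_UML = {
    ("om:Observation", "rdf:type", "owl:Class"),
    ("om:procedure", "rdfs:domain", "om:Observation"),
    ("om:resultTime", "rdfs:domain", "om:Observation"),
}
FROM_XSD = {
    ("om:Observation", "rdf:type", "owl:Class"),
    ("om:procedure", "rdfs:domain", "om:Observation"),
    ("om:resultQuality", "rdfs:domain", "om:Observation"),
}


def compare_outputs(from_uml, from_xsd):
    """Split two triple sets into shared triples and those unique to each chain."""
    return {
        "shared": sorted(from_uml & from_xsd),
        "only_uml": sorted(from_uml - from_xsd),
        "only_xsd": sorted(from_xsd - from_uml),
    }


report = compare_outputs(FROM_UML, FROM_XSD)
```

Anything that lands in `only_uml` or `only_xsd` points either at a gap in one transformation or at a genuine difference between the UML model and the schemas, which is the kind of discrepancy the parallel chains are meant to surface.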
Inputs for ISO 19150 Ontology group
• Can SW help ISO TC 211 (and OGC)?
  • Modelling and reasoning power of OWL
    • Sub-properties in v. 1.0 and role composition in v. 2.0
  • Top-level class hierarchy skeleton: normalised form of ontologies, alignment to upper ontologies
• Can ISO TC 211 (and OGC) help SW?
  • Method to define a standard as a derived product of another one
  • Transposable experience on how to extend or restrict a specification
  • Use cases to inform SW work on ontologies and rules
Acknowledgements
Thanks to:
• Ross Ackland, WRON Theme Leader, CSIRO
• Simon Cox, Research scientist, CSIRO and OGC
• Amit Parashar, CSIRO and Australian W3C office
And also to:
• TopQuadrant for TopBraid Composer
• Rhizomik.net for xsd2owl.xsl
• Rick Jelliffe et al: XSL pre-processing of XML schemas
CSIRO ICT Centre
Laurent Lefort, Senior Research Engineer (Ontologies)
Phone: +61 2 6216 7046
Email: laurent.lefort@csiro.au
Web: wron.net.au
Thank you
Contact Us
Phone: 1300 363 400 or +61 3 9545 2176
Email: Enquiries@csiro.au
Web: www.csiro.au
Standards
• GeoSciML (Geoscience Markup Language)
  • GFM-based, first standard to partially leverage O&M
• GWML (Groundwater Markup Language), WOML (Water Observation Markup Language)
  • Two preliminary efforts based on O&M to create groundwater and surface water standards
• CSML (Climate Science Modelling Language)
  • Adapting & completing O&M for Met/Ocean data
• DHS-GDM (Department of Homeland Security Geospatial Data Model)
  • Huge compilation of standards for homeland security applications
Summary of the 4 methods