CSML (Climate Science Modelling Language) Dr Andrew Woolf (A.Woolf@rl.ac.uk) Environmental e-Science & Spatial Informatics, e-Science Centre, STFC Rutherford Appleton Laboratory
CSML: Context • The data integration problem • multiple organisations, formats, storage mechanisms (file, relational) • only commonality is data semantics
CSML – benefits of explicit semantics • Aim is to be as explicit as possible about semantics of information classes • Offer significant potential for advanced processing workflows • Reduce representation errors (e.g. omitting key attributes) • ‘Conventions’ approach is fragile
CSML – benefits of geospatial standards • Interoperability • Expanded user base • Enhanced return on investment (ROI) • Compliance with emerging infrastructures
The potential • [Diagram labels: ECOOP; CSML v2; MyOcean; IOOS; CDM; ???; O&M; INSPIRE; GEOSS; WIS]
Emerging ISO standards • TC211: around 40 standards for geographic information • Cover activity spectrum: discovery, access, use • Background: ‘feature types’ (ISO 19101 Domain Reference Model): a geospatial dataset… …consists of features and related objects… …in a defined logical structure… …delivered through services… …and described by metadata.
Background: ‘feature types’ • Geographic ‘features’ • “abstraction of real world phenomena” [ISO 19101] • Type or instance • Encapsulate important semantics in universe of discourse • Application schema • Defines semantic content and logical structure of datasets • ISO standards provide toolkit: • spatial/temporal referencing • geometry (1-, 2-, 3-D) • topology • dictionaries (phenomena, units, etc.) • GML – canonical encoding [from ISO 19109 “Geographic information – Rules for Application Schema”]
Standards-based data modelling • e.g. information modelling... • “Conceptual modelling is the process of creating an abstract description of some portion of the real world and/or a set of related concepts.” (ISO 19101) • [Diagram labels: ‘Universe of discourse’; ‘Feature types’; ‘GML Application schema’; ISO 19109; ISO 19110; ISO 19118; ISO 19136]
GML instance excerpt (truncated on the slide):
      <gml:definitionMember>
        <om:Phenomenon gml:id="taxon">
          <gml:description>The taxon name</gml:description>
          <gml:name codeSpace="http://www.vliz.be">taxon</gml:name>
        </om:Phenomenon>
      </gml:definitionMember>
    </NDGPhenomenonDefinitions>
    <!--===================================================================-->
    <gml:FeatureCollection>
      <!-- ============================================================== -->
      <gml:featureMember>
        <NDGPointFeature gml:id="ICES_100">
          <NDGPointDomain>
            <domainReference>
              <NDGPosition srsName="urn:EPSG:geographicCRS:4979" axisLabels="Lat Long" uomLabels="degree degree">
                <location>55.25 6.5</location>
              </NDGPosition>
            </domainReference>
          </NDGPointDomain>
          <gml:rangeSet>
            <gml:DataBlock>
              <gml:rangeParameters>
                <gml:CompositeValue>
                  <gml:valueComponents>
                    <gml:measure uom="#tn"/>
                    <gml:measure uom="#amount"/>
                    <gml:measure uom="#gsm"/>
                  </gml:valueComponents>
                </gml:CompositeValue>
              </gml:rangeParameters>
              <gml:tupleList>
                'ANTHOZOA',63.1,missing
                'Scoloplos armiger',66.1,missing
                'Spio filicornis',10,missing
                'Spiophanes bombyx',60.3,missing
                'Capitellidae',131.8,missing
                'Pholoe',10,missing
                'Owenia fusiformis',23.4,missing
                'Hypereteone lactea',6.8,missing
                'Anaitides groenlandica',13.2,missing
                'Anaitides mucosa',6.8,missing
ISO/OGC data modelling • Components for data modelling: • ISO 19107: Spatial schema • ISO 19108: Temporal schema • ISO 19111: Spatial referencing by coordinates • ISO 19136: Geography Markup Language • OGC 03-022r3: Observations and Measurements • Feature subtypes: • geometry • coverage
Governance in standards-based modelling • The importance of governance • Information community defined by shared semantics • Need community process to manage those semantics (definitions, models, vocabularies, taxonomies, etc.) • e.g. CF conventions for netCDF files • Role of Feature Type Catalogues [ISO 19110] and registers [ISO 19135] • Governance as driver for granularity • Remit / interest determines appropriate granularity • e.g. IOC, IHO, WMO • [Figure: feature types spectrum, abstract/generic to highly specialised; example encodings shown: <measurement type="Radiosonde" measurand="temperature"/>, <temperatureProfile/>, <Sonde parameter="temperature"/>]
CSML ambition • The CSML ‘niche’ • set of base feature types for specialising or using as-is
What is CSML? • Climate Science Modelling Language • [Diagram labels: British Oceanographic Data Centre; British Atmospheric Data Centre; ‘Governance Principle’; ISO standards; Tooling; Conceptual model; Schemas]
CSML application schema excerpt (truncated on the slide):
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://ndg.nerc.ac.uk/csml" xmlns="http://www.w3.org/2001/XMLSchema" xmlns:csml="http://ndg.nerc.ac.uk/csml" xmlns:om="http://www.opengis.net/om" xmlns:gml="http://www.opengis.net/gml" elementFormDefault="qualified" attributeFormDefault="unqualified" version="0.1">
  <annotation>
    <documentation>CSML application schema</documentation>
  </annotation>
  <!--====================================================================== -->
  <import namespace="http://www.opengis.net/gml" schemaLocation="GML-3.1.0/base/gml.xsd"/>
  <import namespace="http://www.opengis.net/om" schemaLocation="phenomenon.xsd"/>
  <!--====================================================================== -->
  <!--===== Root element for CSML dataset =====-->
  <!--====================================================================== -->
  <complexType name="DatasetType">
    <complexContent>
      <extension base="gml:AbstractGMLType">
        <sequence>
          <element ref="csml:UnitDefinitions" minOccurs="0" maxOccurs="unbounded"/>
          <element ref="csml:ReferenceSystemDefinitions" minOccurs="0" maxOccurs="unbounded"/>
          <element ref="csml:PhenomenonDefinitions" minOccurs="0"/>
          <element ref="csml:_ArrayDescriptor" minOccurs="0" maxOccurs="unbounded"/>
          <element ref="gml:FeatureCollection" minOccurs="0" maxOccurs="unbounded"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
  <element name="Dataset" type="csml:DatasetType"/>
  <!--====================================================================== -->
  <!--===== Dictionary/definition elements =====-->
  <!--====================================================================== -->
  <complexType name="ReferenceSystemDefinitionsType">
    <complexContent>
      <extension base="gml:DictionaryType"/>
    </complexContent>
  </complexType>
  <element name="ReferenceSystemDefinitions" type="csml:ReferenceSystemDefinitionsType"/>
  <complexType name="ReferenceSystemDefinitionsPropertyType">
    <sequence>
      <element ref="csml:ReferenceSystemDefinitions" minOccurs="0"/>
    </sequence>
    <attributeGroup ref="gml:AssociationAttributeGroup"/>
  </complexType>
CSML Version Two • CSML v2 (compared to v1) • More explicit/expanded feature types: • Swath • ProfileSeries{Radar, Section, ProfileSeries} • Lost ‘composite domain pattern’ • GML 3.2 (ISO 19136) • Removed ‘storage descriptors’ from core CSML schema • ‘Affordance’ (i.e. feature-type behaviour) • O&M
CSML and O&M • OGC ‘Observations and Measurements’ • An Observation is an Event whose result is an estimate of the value of some Property of the Feature-of-interest, obtained using a specified Procedure • [Diagram label: CSML]
CSML AbstractFeature • Provides common model for all CSML feature types • Supports OGC Observations and Measurements model • Each CSML feature: • has a type (with operations and required attributes), the ‘affordance’ concept • represents some physical ‘parameter’ (Phenomenon) • has a ‘value’ property which is a coverage (with domain and range) – the domain is often a subclass of ReferenceableGrid • may have additional attributes providing ‘reference’ spatio-temporal location parameters
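The feature model on this slide can be pictured as a small data structure. The following is a minimal Python sketch, not the CSML schema or any CSML tooling: the class names `Coverage` and `Feature`, the field names, the type label "ProfileFeature" and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Coverage:
    """A 'value' made of a domain (where) and a range (what was measured)."""
    domain: List[Any]        # sample locations; in CSML often a ReferenceableGrid
    range_set: List[float]   # one value per domain location

    def __post_init__(self) -> None:
        # A coverage maps each domain location to exactly one range value.
        if len(self.domain) != len(self.range_set):
            raise ValueError("domain and range must have the same length")


@dataclass
class Feature:
    """Hypothetical stand-in for the common model shared by CSML feature types."""
    feature_type: str                  # the type, e.g. a profile or a swath
    parameter: str                     # the physical phenomenon represented
    value: Coverage                    # the coverage-valued 'value' property
    reference: Dict[str, Any] = field(default_factory=dict)  # optional location info


# Illustrative instance: a temperature profile sampled at three depths (made-up data).
profile = Feature(
    feature_type="ProfileFeature",
    parameter="temperature",
    value=Coverage(domain=[0.0, 10.0, 20.0], range_set=[11.9, 11.4, 10.8]),
    reference={"location": (55.25, 6.5)},
)
print(profile.parameter, len(profile.value.domain))
```

Keeping the coverage separate from the feature mirrors the slide's split between what a feature is (type, phenomenon) and the values it carries (domain and range).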
CSML ReferenceableGrid • Implementation of ISO 19123 CV_ReferenceableGrid, missing from GML • Subject of OGC GML Change Request (doc 06-160) • Analogous to CF: grid locations specified for each axis of CRS • Efficiency gain possible when a CRS axis is aligned with a grid axis • Supports spatial, temporal and compound CRS
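The idea of specifying grid locations per CRS axis can be sketched in a few lines of Python. This is an illustration under assumptions, not the CSML or GML encoding: `GridOrdinate` and `position` are hypothetical names, and the coordinate values are made up. An ordinate aligned with a single grid axis needs only a one-dimensional list, which is where the efficiency gain comes from.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class GridOrdinate:
    """Coordinate values for one CRS axis (hypothetical helper)."""
    crs_axis: str                # e.g. "longitude"
    grid_axes: Tuple[str, ...]   # grid axes the values span, e.g. ("x",) or ("x", "y")
    values: Any                  # nested lists, indexed in grid_axes order

    def lookup(self, index: Dict[str, int]) -> float:
        # Walk the nested lists using only the grid axes this ordinate spans.
        v = self.values
        for axis in self.grid_axes:
            v = v[index[axis]]
        return v


def position(ordinates: List[GridOrdinate], index: Dict[str, int]) -> Dict[str, float]:
    """Map a grid index to a CRS position, one CRS axis at a time."""
    return {o.crs_axis: o.lookup(index) for o in ordinates}


# Longitude is aligned with grid axis x: 3 values instead of 3 x 2.
lon = GridOrdinate("longitude", ("x",), [13.5, 24.9, 32.4])
# Latitude varies with both grid axes, so it needs the full table.
lat = GridOrdinate("latitude", ("x", "y"), [[53.1, 46.2], [48.7, 43.2], [46.2, 41.5]])

print(position([lon, lat], {"x": 1, "y": 0}))
# → {'longitude': 24.9, 'latitude': 48.7}
```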
Integration with files • Want to expose information, not format...
Integration with files • Information structures may be composed across files
Integration with files • Common pattern with file-data: • need to integrate information structures within and across multiple files • (relational tables provide this implicitly) • Semantics provide an integration key • e.g. an oceanographer and meteorologist can share a conversation about data despite format differences
A model for file-based interoperability • Retain file-based persistence format • Supplement with feature-based conceptual model • ‘Cast’ legacy data onto conceptual model • interoperableData = (featureModel) legacyData • Legacy file data + GML-encoded conceptual ‘metadata’ = ‘interoperable view’ • may be exposed through W*S
A model for file-based interoperability • GML provides conceptual feature ‘skeleton’ • File provides ‘flesh’ • GML ‘by-reference’ pattern for property values • uses simple xlink • “The value of a GML property that carries an xlink:href attribute is the resource returned by traversing the link”
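The by-reference pattern can be read as lazy evaluation: the skeleton holds only a link, and the property value is whatever traversing the link returns. The Python sketch below illustrates that reading. `ByReferenceProperty` and the `fetch` callback are hypothetical names, and the href and values are invented; a real implementation would read from the referenced file.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ByReferenceProperty:
    """A property whose value lives elsewhere and is fetched on first access."""
    href: str                              # link into the file holding the 'flesh'
    fetch: Callable[[str], List[float]]    # caller-supplied link traversal
    _cache: Optional[List[float]] = field(default=None, repr=False)

    @property
    def value(self) -> List[float]:
        # Traverse the link once; later accesses reuse the fetched resource.
        if self._cache is None:
            self._cache = self.fetch(self.href)
        return self._cache


# Stand-in for a file reader, recording how often the link is traversed.
traversals: List[str] = []


def fake_fetch(href: str) -> List[float]:
    traversals.append(href)
    return [11.2, 11.4, 11.9]   # made-up data


temperature = ByReferenceProperty("myfile.nc#temp", fake_fetch)
print(traversals)          # nothing has been read yet
print(temperature.value)   # first access traverses the link
print(temperature.value)   # second access is served from the cache
print(traversals)
```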
CSML ‘interoperability model’ • [Figure: ‘Conceptual skeleton’ linked through a ‘Storage mapping’ to binary file data]
xlink review • [Diagram labels: simple xlink; [role] [title]; local resource [role] [title] [label]; remote resource [href] [role] [title] [label]; arc [arcrole] [title] [show] [actuate]]
xlink review • ‘role’ (URI): • indicates a property of the remote resource • must be a URI reference that “identifies some resource that describes the intended property” • ‘arcrole’ (URI): • describes the “meaning of the arc’s ending resource relative to its starting resource” • corresponds to RDF notion of a property • starting-resource HAS arc-role ending-resource
xlink patterns for files • extended xlink • GML feature instance • Aggregation semantics determined by xlink arc traversal rules
xlink patterns for files • simple xlink • GML feature instance • Aggregation semantics determined by storage descriptor
xlink proposal
<someGMLElement
    xlink:arcrole="hasRemoteContentEmbeddedAt#localXpath"
    xlink:href="storageDescriptor#portion"
    xlink:role="storageSchemaIdentifier"
    xlink:show="embed"
    xlink:actuate="onRequest | onLoad"/>
• href examples: • netCDF#variable • RDBMS#SQLQuery • GRIBFile#recordNumber • CSMLStorageDescriptor#arrayID
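Each of the proposed attributes packs two pieces of information around a `#`. The Python sketch below shows one way a client might split them apart, using only the standard library. `StorageLink` and `parse_storage_link` are hypothetical names, and treating the text after `#` as a plain URI fragment is an assumption of this sketch rather than something the proposal states.

```python
from typing import Dict, NamedTuple
from urllib.parse import urldefrag


class StorageLink(NamedTuple):
    storage: str      # part of xlink:href before '#': the file or storage descriptor
    portion: str      # part after '#': variable, record number, array ID, ...
    usage: str        # part of xlink:arcrole before '#': how the content is used
    local_xpath: str  # part after '#': where in the GML element to embed it
    schema: str       # xlink:role: identifies the storage schema / file format


def parse_storage_link(attrs: Dict[str, str]) -> StorageLink:
    """Split the href and arcrole attributes at their fragment identifiers."""
    storage, portion = urldefrag(attrs["href"])
    usage, local_xpath = urldefrag(attrs["arcrole"])
    return StorageLink(storage, portion, usage, local_xpath, attrs["role"])


link = parse_storage_link({
    "href": "file://myfile.nc#lon",
    "arcrole": "http://ndg.nerc.ac.uk/xlinkUsage/insert"
               "#SpatialOrTemporalPositionList/coordinateList",
    "role": "http://ndg.nerc.ac.uk/fileFormat/netcdf",
})
print(link.storage, link.portion)
print(link.local_xpath)
```

A href whose portion contains characters that are not valid in a URI fragment, such as an unescaped SQL query, would need escaping before this approach applies.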
Example • GML CR 06-160 • ISO 19123 CV_ReferenceableGrid
<gml:ReferenceableGrid gml:id="ID001" srsName="urn:ogc:def:crs:EPSG:6.6:4326" dimension="2">
  <gml:limits>
    <gml:GridEnvelope>
      <gml:low>0 0</gml:low>
      <gml:high>7 4</gml:high>
    </gml:GridEnvelope>
  </gml:limits>
  <gml:axisLabels>x y</gml:axisLabels>
  <gml:coordTransformTable>
    <gml:GridCoordinatesTable>
      <gml:gridOrdinate>
        <gml:GridOrdinateDescription>
          <gml:coordAxisLabel>Geodetic longitude</gml:coordAxisLabel>
          <gml:coordAxisValues>
            <gml:SpatialOrTemporalPositionList>
              <gml:coordinateList>13.5 24.9 32.4 37.7 41.5 46.8 54.4 65.7</gml:coordinateList>
            </gml:SpatialOrTemporalPositionList>
          </gml:coordAxisValues>
          <gml:gridAxesSpanned>x</gml:gridAxesSpanned>
          <gml:sequenceRule axisOrder="+1">Linear</gml:sequenceRule>
        </gml:GridOrdinateDescription>
      </gml:gridOrdinate>
      <gml:gridOrdinate>
        <gml:GridOrdinateDescription>
          <gml:coordAxisLabel>Geodetic latitude</gml:coordAxisLabel>
          <gml:coordAxisValues>
            <gml:SpatialOrTemporalPositionList>
              <gml:coordinateList>
                53.1 48.7 46.2 44.7 43.9 43.3 43.1 44.0
                46.2 43.2 41.5 40.6 40.2 40.0 40.3 41.7
                37.1 36.1 35.6 35.5 35.7 36.0 37.1 39.5
                30.4 30.2 30.4 30.7 31.1 32.0 33.8 37.2
                24.3 24.8 25.3 26.0 26.6 27.7 29.7 33.4
              </gml:coordinateList>
            </gml:SpatialOrTemporalPositionList>
          </gml:coordAxisValues>
          <gml:gridAxesSpanned>x y</gml:gridAxesSpanned>
          <gml:sequenceRule axisOrder="+1 -2">Linear</gml:sequenceRule>
        </gml:GridOrdinateDescription>
      </gml:gridOrdinate>
    </gml:GridCoordinatesTable>
  </gml:coordTransformTable>
</gml:ReferenceableGrid>
Example • netCDF ASCII dump:
netcdf myfile {
dimensions:
    x = 8 ;
    y = 5 ;
variables:
    float lon(x) ;
        lon:long_name = "longitude" ;
        lon:units = "degrees_east" ;
    float lat(x,y) ;
        lat:long_name = "latitude" ;
        lat:units = "degrees_north" ;
    float temp(x,y) ;
        temp:coordinates = "lon lat" ;
        temp:long_name = "temperature" ;
        temp:units = "degC" ;
data:
    lon = 13.5, 24.9, 32.4, 37.7, 41.5, 46.8, 54.4, 65.7 ;
    lat = 53.1, 48.7, 46.2, 44.7, 43.9, 43.3, 43.1, 44.0, 46.2, 43.2, 41.5, ...
CSML instances – xlink example
<csml:gridOrdinate>
  <csml:GridOrdinateDescription>
    <csml:coordAxisLabel>Geodetic longitude</csml:coordAxisLabel>
    <csml:coordAxisValues>
      <csml:SpatialOrTemporalPositionList>
        <csml:coordinateList srsName="WGS84">13.5 24.9 32.4 37.7 41.5 46.8 54.4 65.7</csml:coordinateList>
      </csml:SpatialOrTemporalPositionList>
    </csml:coordAxisValues>
    <csml:gridAxesSpanned>x</csml:gridAxesSpanned>
    <csml:sequenceRule axisOrder="+1">Linear</csml:sequenceRule>
  </csml:GridOrdinateDescription>
</csml:gridOrdinate>

<csml:coordAxisValues
    xlink:arcrole="http://ndg.nerc.ac.uk/xlinkUsage/insert#SpatialOrTemporalPositionList/coordinateList"
    xlink:href="file://myfile.nc#lon"
    xlink:role="http://ndg.nerc.ac.uk/fileFormat/netcdf"
    xlink:show="embed">
  <csml:SpatialOrTemporalPositionList>
    <csml:coordinateList srsName="WGS84"/>
  </csml:SpatialOrTemporalPositionList>
</csml:coordAxisValues>

<csml:coordAxisValues
    xlink:arcrole="http://ndg.nerc.ac.uk/xlinkUsage/insert#SpatialOrTemporalPositionList/coordinateList"
    xlink:href="CSMLStorageDescriptorExample.xml#coapec_u_2"
    xlink:role="http://ndg.nerc.ac.uk/fileFormat/csmlStorageDescriptor"
    xlink:show="embed">
  <csml:SpatialOrTemporalPositionList>
    <csml:coordinateList srsName="WGS84"/>
  </csml:SpatialOrTemporalPositionList>
</csml:coordAxisValues>
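To make the `embed` behaviour concrete, the Python sketch below takes the netCDF-linked `coordAxisValues` element from this slide and fills its empty `coordinateList` with values obtained through the link. It is a sketch under assumptions: `embed` and `fake_reader` are hypothetical names, the namespace declarations are added so the fragment parses on its own, and the reader returns the longitude values from the example instead of opening a real netCDF file.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

CSML = "http://ndg.nerc.ac.uk/csml"
XLINK = "http://www.w3.org/1999/xlink"


def embed(elem, read_storage):
    """Fill the element named by the arcrole fragment with the linked values."""
    storage, portion = urldefrag(elem.get(f"{{{XLINK}}}href"))
    _, local_path = urldefrag(elem.get(f"{{{XLINK}}}arcrole"))
    # The local path is unprefixed; qualify each step with the CSML namespace.
    path = "/".join(f"{{{CSML}}}{step}" for step in local_path.split("/"))
    target = elem.find(path)
    values = read_storage(storage, portion)
    target.text = " ".join(str(v) for v in values)
    return elem


xml_text = f"""<csml:coordAxisValues xmlns:csml="{CSML}" xmlns:xlink="{XLINK}"
    xlink:arcrole="http://ndg.nerc.ac.uk/xlinkUsage/insert#SpatialOrTemporalPositionList/coordinateList"
    xlink:href="file://myfile.nc#lon"
    xlink:role="http://ndg.nerc.ac.uk/fileFormat/netcdf"
    xlink:show="embed">
  <csml:SpatialOrTemporalPositionList>
    <csml:coordinateList srsName="WGS84"/>
  </csml:SpatialOrTemporalPositionList>
</csml:coordAxisValues>"""


def fake_reader(storage, portion):
    # Stand-in for reading variable `portion` from netCDF file `storage`.
    data = {("file://myfile.nc", "lon"): [13.5, 24.9, 32.4, 37.7, 41.5, 46.8, 54.4, 65.7]}
    return data[(storage, portion)]


elem = embed(ET.fromstring(xml_text), fake_reader)
coord_list = elem.find(f"{{{CSML}}}SpatialOrTemporalPositionList/{{{CSML}}}coordinateList")
print(coord_list.text)
# → 13.5 24.9 32.4 37.7 41.5 46.8 54.4 65.7
```

After embedding, the element carries the same content as the inline `coordAxisValues` shown first on the slide, which is the sense in which the file supplies the values while the GML supplies the structure.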