SWEET: Upper-Level Ontologies for Earth System Science • OPeNDAP Meeting, Feb 2007 • Rob Raskin, PO.DAAC, Jet Propulsion Laboratory
Data to Knowledge (each row runs from the data end to the knowledge end) • Stages: Data, Information, Knowledge • Basic Elements: Bytes, Numbers, Models, Facts • Services: Ingest, Archive, Visualize, Infer, Understand, Predict • Storage: File, Database, HDF-EOS, GIS/MIS, Ontology, Mind • Interoperability: Syntactic (OPeNDAP, WMS/WCS), Semantic • Volume/Density: High/Low, Low/High • Statistics: Checksum, Moments, Descriptive, Inferential • Analysis: Fourier, Wavelet, EOF, SSA • Methodology: Exploratory-analysis, Model-based-mining • Syntax, Semantics
Semantics: Shared Understanding of Concepts • Provides a namespace for scientific terms…plus • Provides descriptions of how terms relate to one another • Example tags in markup language: • subclass, subproperty, part of, same as, transitive property, cardinality, etc. • Enables object in “data space” to be associated formally with object in “science concept space” • “Shared understanding” enables software tools to find “meaning” in resources
Ontology Representation • W3C has adopted four XML-based standard ontology languages: • RDF, OWL-Lite, OWL-DL, OWL Full • Basic building blocks: • Class, subclass, property, subproperty, sameAs • Standard language enables anyone to extend an ontology • Knowledge built up incrementally
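The building blocks above can be pictured as subject-predicate-object triples. The following is a minimal sketch in plain Python, not OWL tooling: the `TRIPLES` list and the `superclasses` helper are hypothetical, and the specific statements are illustrative assumptions rather than actual SWEET content. "sameAs" is used loosely, in the sense the slides use it.

```python
from collections import defaultdict

# Hypothetical toy statements; class names echo examples from the slides,
# but these particular triples are assumptions made for illustration.
TRIPLES = [
    ("Troposphere", "subClassOf", "AtmosphereLayer"),
    ("AtmosphereLayer", "subClassOf", "PlanetaryLayer"),
    ("LowerAtmosphere", "sameAs", "Troposphere"),
]


def superclasses(term, triples):
    """Return every class reachable from `term` via subClassOf,
    treating sameAs links as aliases for the same concept."""
    parents = defaultdict(set)
    aliases = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "subClassOf":
            parents[subject].add(obj)
        elif predicate == "sameAs":
            aliases[subject].add(obj)
            aliases[obj].add(subject)

    visited = set()
    result = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        stack.extend(aliases[current])      # follow aliases of the same concept
        for parent in parents[current]:     # climb one level up the hierarchy
            result.add(parent)
            stack.append(parent)
    return result


print(superclasses("LowerAtmosphere", TRIPLES))
```

Appending a new triple to the list and calling `superclasses` again extends the result without touching existing statements, which is the sense in which knowledge is built up incrementally.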
Why an Upper-Level Ontology for Earth System Science? • Many common concepts used across Earth Science disciplines (such as properties of the Earth) • Provides common definitions for terms used in multiple disciplines or communities • Provides common language in support of community and multidisciplinary activities • Reduced burden (and barrier to entry) on creators of specialized domain ontologies • Only need to create ontologies for incremental knowledge
Semantic Web for Earth & Environmental Terminology (SWEET) • Ontology of Earth system science and data concepts • Provides a common semantic framework (or namespace) for describing Earth science information and knowledge • Emphasis on improving search for NASA Earth science data resources • Represented in OWL-DL
SWEET Ontologies (diagram) • Groupings: Faceted Ontologies, Integrative Ontologies • Ontologies shown: Living Substances, Non-Living Substances, Natural Phenomena, Physical Processes, Human Activities, Earth Realm, Data, Physical Properties, Space, Time, Units, Numerics
SWEET Supports Knowledge Reuse • SWEET is a concept space • Enables scalable classification of Earth science and data-related concepts • Enables object in data space to be mapped to science concept space • Concept space is translatable into other languages/cultures using “sameAs” notions
SWEET Science Ontologies • Earth Realms • Atmosphere, SolidEarth, Ocean, LandSurface, … • Physical Properties • temperature, composition, area, albedo, … • Substances • CO2, water, lava, salt, hydrogen, pollutants, … • Living Substances • Humans, fish, …
SWEET Conceptual Ontologies • Phenomena • ElNino, Volcano, Thunderstorm, Deforestation, Terrorism, physical processes (e.g., convection) • Each has associated EarthRealms, PhysicalProperties, spatial/temporal extent, etc. • Specific instances included • e.g., 1997-98 ElNino • Human Activities • Fisheries, IndustrialProcessing, Economics,…
SWEET Numerical Ontologies • SpatialEntities • Extents: country, Antarctica, equator, inlet, … • Relations: above, northOf, … • TemporalEntities • Extents: duration, century, season, … • Relations: after, before, … • Numerics • Extents: interval, point, 0, positiveIntegers, … • Relations: lessThan, greaterThan, … • Units • Extracted from Unidata’s UDUnits • Added SI prefixes • Multiplication of two quantities carries units
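The point that multiplication of two quantities carries units can be sketched as follows. This is an illustrative Python model, not SWEET's OWL representation: the `Quantity` class, the `from_prefixed` helper, and the small `SI_PREFIXES` table are all assumptions introduced for the example.

```python
from dataclasses import dataclass

# Illustrative subset of SI prefixes (an assumption; not SWEET's unit table).
SI_PREFIXES = {"kilo": 1e3, "centi": 1e-2, "milli": 1e-3}


@dataclass(frozen=True)
class Quantity:
    value: float
    units: tuple  # sorted (base_unit, exponent) pairs, e.g. (("m", 1), ("s", -1))

    @staticmethod
    def of(value, **exponents):
        """Build a quantity from keyword exponents, dropping zero exponents."""
        pairs = sorted((u, e) for u, e in exponents.items() if e != 0)
        return Quantity(value, tuple(pairs))

    def __mul__(self, other):
        # Multiplying quantities multiplies values and adds unit exponents.
        combined = dict(self.units)
        for unit, exponent in other.units:
            combined[unit] = combined.get(unit, 0) + exponent
        return Quantity.of(self.value * other.value, **combined)


def from_prefixed(value, prefix, base_unit):
    """Convert a prefixed value (for example 5 kilo-m) to the base unit."""
    return Quantity.of(value * SI_PREFIXES[prefix], **{base_unit: 1})


speed = Quantity.of(2.0, m=1, s=-1)   # 2 m/s
duration = Quantity.of(3.0, s=1)      # 3 s
distance = speed * duration           # seconds cancel, leaving metres
print(distance)
```

Storing units as exponent pairs keeps the product well defined for any combination of base units; a fuller treatment would also need division and unit conversion.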
Numerical Ontologies • Numeric concepts defined in OWL only through standard XML XSD spec • Intervals defined as restrictions on real line • Added in SWEET • Numerical relations (lessThan, max, …) • Cartesian product (multidimensional spaces) • Numeric ontologies used to define spatial and temporal concepts
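As a rough illustration of intervals as restrictions on the real line and of Cartesian products forming multidimensional spaces, consider the sketch below. The `Interval`, `Box`, and `cartesian` names are hypothetical and do not correspond to SWEET class names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed interval [low, high] on the real line."""
    low: float
    high: float

    def contains(self, x):
        return self.low <= x <= self.high

    def less_than(self, other):
        # True when every point of this interval lies below every point of `other`.
        return self.high < other.low


@dataclass(frozen=True)
class Box:
    """Cartesian product of intervals, one per dimension."""
    axes: tuple

    def contains(self, point):
        return len(point) == len(self.axes) and all(
            axis.contains(x) for axis, x in zip(self.axes, point)
        )


def cartesian(*intervals):
    return Box(tuple(intervals))


# Latitude/longitude space built from two one-dimensional intervals.
latitude = Interval(-90.0, 90.0)
longitude = Interval(-180.0, 180.0)
globe = cartesian(latitude, longitude)
print(globe.contains((34.2, -118.2)))
```

A spatial or temporal extent can then be expressed as a `Box` over the appropriate axes, which mirrors how the numeric ontologies underpin the spatial and temporal ones.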
XSD: Datatypes • Numeric • boolean, decimal, float, double, integer, nonNegativeInteger, positiveInteger, nonPositiveInteger, negativeInteger, long, int, short, unsignedLong, unsignedInt, unsignedShort, unsignedByte, hexBinary, base64Binary • String • string, normalizedString, anyURI, token, language, NMTOKEN, Name, NCName • Date • dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth
Data and Services Ontology • Formats • Data models • Data Structures • Special values • Missing, land, sea, ice, etc. • Parameters • Scale factors, offsets, algorithms • Data Services • Subset, reproject
Example: AIRS Level 2 Dataset • Subset of Dataset where • DataModel= Level 2 • Instrument= AIRS • HorizontalDimension= 2 • VerticalDimension= 1 • Format= HDF-EOS • Property= Temperature • Substance= Air
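Describing a dataset as a set of facet values makes selection a simple matching problem. The sketch below assumes a flat dictionary per dataset; the catalog entries and their names are invented for illustration, and only the facet names and values in the first entry come from the slide.

```python
# Hypothetical catalog; dataset names are made up for this example.
DATASETS = [
    {
        "name": "example-airs-l2-air-temperature",
        "DataModel": "Level 2",
        "Instrument": "AIRS",
        "HorizontalDimension": 2,
        "VerticalDimension": 1,
        "Format": "HDF-EOS",
        "Property": "Temperature",
        "Substance": "Air",
    },
    {
        "name": "example-other-l3-sea-temperature",
        "DataModel": "Level 3",
        "Instrument": "OtherSensor",
        "HorizontalDimension": 2,
        "VerticalDimension": 0,
        "Format": "HDF-EOS",
        "Property": "Temperature",
        "Substance": "Water",
    },
]


def subset(datasets, **facets):
    """Return the datasets whose facet values match every requested facet."""
    return [
        d for d in datasets
        if all(d.get(key) == value for key, value in facets.items())
    ]


matches = subset(DATASETS, Instrument="AIRS", Property="Temperature", Substance="Air")
print([d["name"] for d in matches])
```

Exact string matching is the simplest possible rule; an ontology-backed search could instead accept any subclass or "sameAs" equivalent of the requested facet value.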
Fragment of SWEET (diagram) • Classes shown: 3DLayer, PlanetaryLayer, Atmosphere, AtmosphereLayer, Troposphere, Stratosphere, Tropopause • Relations shown: subClassOf, partOf, sameAs, isUpperBoundaryOf, isLowerBoundaryOf • Values shown: primarySubstance = “air”, upperBoundary = 50 km, lowerBoundary = 15 km, sameAs = “Lower Atmosphere”
How SWEET was Initially Populated • Initial sources • GCMD • Over 10,000 datasets • Over 1000 keywords • Data providers submit additional terms for “free-text” search • CF • Over 700 keywords • Very long term names • surface_downwelling_photon_spherical_irradiance_in_sea_water • Decomposed into facets • Property= spherical_irradiance • Substance= sea_water • Space= surface • Direction= down
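The decomposition of a long CF name into facets can be sketched with a few small lookup tables. This is not the procedure used to populate SWEET; the vocabularies and the `decompose` function are hypothetical, chosen only to reproduce the example on the slide. Tokens the tables do not recognize are returned separately rather than guessed at.

```python
# Hypothetical hand-made vocabularies, just large enough for the slide's example.
SPACE_TERMS = {"surface"}
DIRECTION_TERMS = {"downwelling": "down", "upwelling": "up"}
PROPERTY_TERMS = {"spherical_irradiance", "temperature"}


def decompose(standard_name):
    """Split a CF-style name into facets; return (facets, unmatched_tokens)."""
    facets = {}
    head, separator, substance = standard_name.partition("_in_")
    if separator:
        facets["Substance"] = substance

    # The longest known property that ends the head becomes the Property facet.
    for prop in sorted(PROPERTY_TERMS, key=len, reverse=True):
        if head == prop or head.endswith("_" + prop):
            facets["Property"] = prop
            head = head[: -len(prop)].rstrip("_")
            break

    unmatched = []
    for token in (head.split("_") if head else []):
        if token in SPACE_TERMS:
            facets["Space"] = token
        elif token in DIRECTION_TERMS:
            facets["Direction"] = DIRECTION_TERMS[token]
        else:
            unmatched.append(token)
    return facets, unmatched


facets, unmatched = decompose(
    "surface_downwelling_photon_spherical_irradiance_in_sea_water"
)
print(facets, unmatched)
```

In this sketch the token `photon` is left unmatched, since the slide's facet list does not assign it to any facet.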
Collaboration Web Site • Discussion tools • Blog, wiki, moderated discussion board • Version Control/ Configuration Management • Trace dependencies on external ontologies • Tools to search for existing concepts in registered ontologies • Ontology Validation Procedure • W3C note is formal submission method • Registry/discovery of ontologies • Support workflows/services for ontology development
Community Issues • Content • Maintain alignment given expansion of classes and properties • Standards and Conventions • Agreement on standards for use of OWL • Fuzzy representation conventions • Submit as standard to NASA Standards & Processes Working Group • Review Board • Who will oversee and maintain in perpetuity (or at least through the next funding cycle)? • ESIP Federation? A new consortium? • Global Support • Provide tools to visualize and appreciate the big picture
Update/Matching Issues • No removal of terms except for spelling or factual errors • Subscription service to notify affected ontologies when changes made • Must avoid contradictions • Additions can create redundancy if sameAs not used • Humans must oversee “matching” • CF has established moderator to carry out analogous additions • OWL “import” imports entire file • Associate community with ontology terms • Community tagging
Best Practices • Keep ontologies small, modular • Be careful that “owl:imports” imports everything • Use higher level ontologies where possible • Identify hierarchy of concept spaces • Model schemas • Try to keep dependencies unidirectional
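Two of these practices lend themselves to a quick check: how much an import actually pulls in, and whether dependencies stay unidirectional. The sketch below models imports as a plain dictionary; the module names and edges are hypothetical and do not describe SWEET's real import structure.

```python
# Hypothetical import graph: module -> modules it imports directly.
IMPORTS = {
    "phenomena": ["realm", "property"],
    "realm": ["space"],
    "property": ["units"],
    "units": ["numerics"],
    "space": ["numerics"],
    "numerics": [],
}


def import_closure(module, imports):
    """Everything pulled in, directly or indirectly, by importing `module`."""
    closure = set()
    stack = list(imports.get(module, []))
    while stack:
        current = stack.pop()
        if current not in closure:
            closure.add(current)
            stack.extend(imports.get(current, []))
    return closure


def find_cycle(imports):
    """Return one import cycle as a list of modules, or None if there is none."""
    in_progress, done = set(), set()
    path = []

    def visit(node):
        in_progress.add(node)
        path.append(node)
        for nxt in imports.get(node, []):
            if nxt in in_progress:
                # Back edge: the cycle runs from nxt's position to the current node.
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        in_progress.discard(node)
        done.add(node)
        return None

    for node in imports:
        if node not in done:
            found = visit(node)
            if found:
                return found
    return None


print(sorted(import_closure("phenomena", IMPORTS)))
print(find_cycle(IMPORTS))
```

Because an import brings in whole files transitively, the closure can be much larger than the direct import list, which is one argument for keeping modules small. A non-`None` result from `find_cycle` flags a dependency that runs in both directions.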
Web Sites • http://sweet.jpl.nasa.gov • http://PlanetOnt.org