Data to Knowledge

SWEET: Upper-Level Ontologies for Earth System Science. OPeNDAP Meeting, Feb 2007. Rob Raskin, PO.DAAC, Jet Propulsion Laboratory.

Presentation Transcript


  1. SWEET: Upper-Level Ontologies for Earth System Science • OPeNDAP Meeting, Feb 2007 • Rob Raskin • PO.DAAC • Jet Propulsion Laboratory

  2. Data to Knowledge • Columns run from Data through Information to Knowledge (Syntax to Semantics) • Basic Elements: Bytes, Numbers, Models, Facts • Services: Ingest, Archive, Visualize, Infer, Understand, Predict • Storage: File, Database, HDF-EOS, GIS/MIS, Ontology, Mind • Interoperability: Syntactic (OPeNDAP, WMS/WCS), Semantic • Volume/Density: High/Low, Low/High • Statistics: Checksum, Moments, Descriptive, Inferential • Analysis: Fourier, Wavelet, EOF, SSA • Methodology: Exploratory-analysis, Model-based-mining

  3. Semantics: Shared Understanding of Concepts • Provides a namespace for scientific terms…plus • Provides descriptions of how terms relate to one another • Example tags in markup language: • subclass, subproperty, part of, same as, transitive property, cardinality, etc. • Enables object in “data space” to be associated formally with object in “science concept space” • “Shared understanding” enables software tools to find “meaning” in resources

  4. Ontology Representation • W3C has adopted four XML-based standard ontology languages: • RDF, OWL-Lite, OWL-DL, OWL Full • Basic building blocks: • Class, subclass, property, subproperty, sameAs • Standard language enables anyone to extend an ontology • Knowledge built up incrementally
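
The building blocks listed above (class, subclass, sameAs) can be sketched with plain subject/predicate/object triples. This is only a toy illustration in Python, not OWL or RDF syntax; the triples, the `superclasses` helper, and the particular class hierarchy are assumptions made for the example rather than content taken from the SWEET files.

```python
# Hypothetical toy triples (subject, predicate, object); illustrative only.
TRIPLES = {
    ("Troposphere", "subClassOf", "AtmosphereLayer"),
    ("AtmosphereLayer", "subClassOf", "PlanetaryLayer"),
    ("LowerAtmosphere", "sameAs", "Troposphere"),
}


def superclasses(term, triples):
    """Collect classes reachable via subClassOf, treating sameAs as an alias."""
    visited = {term}   # terms already queued for expansion (aliases included)
    result = set()     # superclasses found so far
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == "subClassOf" and s == current:
                result.add(o)
                nxt = o
            elif p == "sameAs" and current in (s, o):
                nxt = o if s == current else s
            else:
                continue
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(nxt)
    return result


found = superclasses("LowerAtmosphere", TRIPLES)
```

Because subclass relations chain, a new term only has to state its immediate parent (or a sameAs link) to inherit everything above it, which is one way to read "knowledge built up incrementally".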

  5. Why an Upper-Level Ontology for Earth System Science? • Many common concepts used across Earth Science disciplines (such as properties of the Earth) • Provides common definitions for terms used in multiple disciplines or communities • Provides common language in support of community and multidisciplinary activities • Reduced burden (and barrier to entry) on creators of specialized domain ontologies • Only need to create ontologies for incremental knowledge

  6. Semantic Web for Earth & Environmental Terminology (SWEET) • Ontology of Earth system science and data concepts • Provides a common semantic framework (or namespace) for describing Earth science information and knowledge • Emphasis on improving search for NASA Earth science data resources • Represented in OWL-DL

  7. SWEET Ontologies (diagram) • Two groups: Faceted Ontologies and Integrative Ontologies • Ontologies shown: Living Substances, Non-Living Substances, Natural Phenomena, Physical Processes, Human Activities, Earth Realm, Data, Physical Properties, Space, Time, Units, Numerics

  8. SWEET Supports Knowledge Reuse • SWEET is a concept space • Enables scalable classification of Earth science and data-related concepts • Enables object in data space to be mapped to science concept space • Concept space is translatable into other languages/cultures using “sameAs” notions

  9. SWEET Science Ontologies • Earth Realms • Atmosphere, SolidEarth, Ocean, LandSurface, … • Physical Properties • temperature, composition, area, albedo, … • Substances • CO2, water, lava, salt, hydrogen, pollutants, … • Living Substances • Humans, fish, …

  10. SWEET Conceptual Ontologies • Phenomena • ElNino, Volcano, Thunderstorm, Deforestation, Terrorism, physical processes (e.g., convection) • Each has associated EarthRealms, PhysicalProperties, spatial/temporal extent, etc. • Specific instances included • e.g., 1997-98 ElNino • Human Activities • Fisheries, IndustrialProcessing, Economics,…

  11. SWEET Numerical Ontologies • SpatialEntities • Extents: country, Antarctica, equator, inlet, … • Relations: above, northOf, … • TemporalEntities • Extents: duration, century, season, … • Relations: after, before, … • Numerics • Extents: interval, point, 0, positiveIntegers, … • Relations: lessThan, greaterThan, … • Units • Extracted from Unidata’s UDUnits • Added SI prefixes • Multiplication of two quantities carries units
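
The last bullet, that multiplying two quantities carries units, can be illustrated with a small Python sketch. It tracks units as exponents of base units and adds exponents on multiplication. The `Quantity` class and the `q` helper are hypothetical names for this example; they are not part of SWEET or UDUnits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A value plus its units, stored as sorted (base_unit, exponent) pairs."""
    value: float
    units: tuple

    def __mul__(self, other):
        # Multiplying quantities multiplies values and adds unit exponents.
        exponents = dict(self.units)
        for unit, exp in other.units:
            exponents[unit] = exponents.get(unit, 0) + exp
        # Drop units whose exponents cancel to zero.
        units = tuple(sorted((u, e) for u, e in exponents.items() if e != 0))
        return Quantity(self.value * other.value, units)


def q(value, **units):
    """Hypothetical convenience constructor, e.g. q(3.0, m=1, s=-1)."""
    return Quantity(value, tuple(sorted(units.items())))


speed = q(3.0, m=1, s=-1)      # 3 m/s
duration = q(2.0, s=1)         # 2 s
distance = speed * duration    # seconds cancel, metres remain
```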

  12. Numerical Ontologies • Numeric concepts defined in OWL only through standard XML XSD spec • Intervals defined as restrictions on real line • Added in SWEET • Numerical relations (lessThan, max, …) • Cartesian product (multidimensional spaces) • Numeric ontologies used to define spatial and temporal concepts
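
A minimal Python sketch of the ideas on this slide: an interval as a restriction on the real line, a lessThan relation between intervals, and a Cartesian product of intervals used as a multidimensional (here spatial) region. The class names `Interval` and `Box` and the numeric bounds are assumptions for illustration, not SWEET definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed interval [lower, upper] on the real line."""
    lower: float
    upper: float

    def contains(self, x):
        return self.lower <= x <= self.upper

    def less_than(self, other):
        # Every point of self lies strictly below every point of other.
        return self.upper < other.lower


@dataclass(frozen=True)
class Box:
    """Cartesian product of intervals, one per dimension."""
    axes: tuple

    def contains(self, point):
        return len(point) == len(self.axes) and all(
            axis.contains(x) for axis, x in zip(self.axes, point)
        )


# Illustrative bounds only: a latitude band crossed with all longitudes.
tropics_lat = Interval(-23.5, 23.5)
all_lon = Interval(-180.0, 180.0)
tropics = Box((tropics_lat, all_lon))
polar_lat = Interval(66.5, 90.0)
```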

  13. XSD: Datatypes • Numeric • boolean, decimal, float, double, integer, nonNegativeInteger, positiveInteger, nonPositiveInteger, negativeInteger, long, int, short, unsignedLong, unsignedInt, unsignedShort, unsignedByte, hexBinary, base64Binary • String • string, normalizedString, anyURI, token, language, NMTOKEN, Name, NCName • Date • dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth

  14. Data and Services Ontology • Formats • Data models • Data Structures • Special values • Missing, land, sea, ice, etc. • Parameters • Scale factors, offsets, algorithms • Data Services • Subset, reproject

  15. Example: AIRS Level 2 Dataset • Subset of Dataset where • DataModel= Level 2 • Instrument= AIRS • HorizontalDimension= 2 • VerticalDimension= 1 • Format= HDF-EOS • Property= Temperature • Substance= Air
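
The facet description above can be read as a set of key/value constraints. The Python sketch below records the slide's facets as a dictionary and selects datasets that satisfy a query. The `matches` helper and the second catalog entry are hypothetical and exist only to show a non-matching case.

```python
# Facets copied from the slide.
AIRS_L2 = {
    "DataModel": "Level 2",
    "Instrument": "AIRS",
    "HorizontalDimension": 2,
    "VerticalDimension": 1,
    "Format": "HDF-EOS",
    "Property": "Temperature",
    "Substance": "Air",
}

# Hypothetical second entry, invented for contrast.
OTHER = {
    "DataModel": "Level 3",
    "Instrument": "ExampleSensor",
    "Format": "netCDF",
    "Property": "Temperature",
    "Substance": "Water",
}


def matches(dataset, **criteria):
    """True when every requested facet is present with the requested value."""
    return all(dataset.get(key) == value for key, value in criteria.items())


catalog = [AIRS_L2, OTHER]
air_temperature = [
    d for d in catalog if matches(d, Property="Temperature", Substance="Air")
]
```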

  16. Fragment of SWEET (diagram) • Classes: 3DLayer, PlanetaryLayer, Atmosphere, AtmosphereLayer, Troposphere, Stratosphere, Tropopause • Relations shown: subClassOf, partOf, sameAs, isUpperBoundaryOf, isLowerBoundaryOf • Values shown: primarySubstance = “air”, upperBoundary = 50 km, lowerBoundary = 15 km, sameAs = “Lower Atmosphere”

  17. How SWEET was Initially Populated • Initial sources • GCMD • Over 10,000 datasets • Over 1000 keywords • Data providers submit additional terms for “free-text” search • CF • Over 700 keywords • Very long term names • surface_downwelling_photon_spherical_irradiance_in_sea_water • Decomposed into facets • Property= spherical_irradiance • Substance= sea_water • Space= surface • Direction= down
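
The decomposition of the long CF name into facets can be sketched in Python as a lookup against small controlled vocabularies. The vocabularies, the `DIRECTIONS` mapping from "downwelling" to "down", and the `decompose` function are assumptions for this example; the slide does not say how the decomposition was actually carried out.

```python
import re

# Hypothetical mini-vocabularies; real facet vocabularies would be far larger.
FACETS = {
    "Property": ["spherical_irradiance", "temperature"],
    "Substance": ["sea_water", "air"],
    "Space": ["surface"],
}
DIRECTIONS = {"downwelling": "down", "upwelling": "up"}


def decompose(name):
    """Map an underscore-separated CF-style name onto facet values."""
    facets = {}
    for facet, terms in FACETS.items():
        # Try longer terms first so multi-word terms win over shorter ones.
        for term in sorted(terms, key=len, reverse=True):
            if re.search(rf"(^|_){re.escape(term)}(_|$)", name):
                facets[facet] = term
                break
    tokens = name.split("_")
    for token, direction in DIRECTIONS.items():
        if token in tokens:
            facets["Direction"] = direction
    return facets


cf_name = "surface_downwelling_photon_spherical_irradiance_in_sea_water"
facets = decompose(cf_name)
```

Tokens that match no vocabulary (such as "photon" or "in") are simply ignored in this sketch.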

  18. Collaboration Web Site • Discussion tools • Blog, wiki, moderated discussion board • Version Control/ Configuration Management • Trace dependencies on external ontologies • Tools to search for existing concepts in registered ontologies • Ontology Validation Procedure • W3C note is formal submission method • Registry/discovery of ontologies • Support workflows/services for ontology development

  19. Community Issues • Content • Maintain alignment given expansion of classes and properties • Standards and Conventions • Agreement on standards for use of OWL • Fuzzy representation conventions • Submit as standard to NASA Standards & Processes Working Group • Review Board • Who will oversee and maintain in perpetuity (or at least through the next funding cycle)? • ESIP Federation? A new consortium? • Global Support • Provide tools to visualize and appreciate the big picture

  20. Update/Matching Issues • No removal of terms except for spelling or factual errors • Subscription service to notify affected ontologies when changes made • Must avoid contradictions • Additions can create redundancy if sameAs not used • Humans must oversee “matching” • CF has established moderator to carry out analogous additions • OWL “import” imports entire file • Associate community with ontology terms • Community tagging

  21. Best Practices • Keep ontologies small, modular • Be careful that “owl:imports” imports everything • Use higher level ontologies where possible • Identify hierarchy of concept spaces • Model schemas • Try to keep dependencies unidirectional
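
Two of these practices lend themselves to a quick check: an import pulls in everything the imported ontology itself imports, and dependencies should not loop back. The Python sketch below computes the transitive import closure of a hypothetical set of ontology modules and flags cycles. The module names and the import graph are invented for illustration and do not describe the real SWEET files.

```python
# Hypothetical import graph: ontology name -> ontologies it imports directly.
IMPORTS = {
    "phenomena": ["realm", "property"],
    "realm": ["space"],
    "property": ["units"],
    "units": ["numerics"],
    "space": ["numerics"],
    "numerics": [],
}


def import_closure(name, imports):
    """Everything loaded when `name` is imported, directly or indirectly."""
    closure = set()
    stack = list(imports.get(name, []))
    while stack:
        module = stack.pop()
        if module not in closure:
            closure.add(module)
            stack.extend(imports.get(module, []))
    return closure


def has_cycle(imports):
    """Dependencies are unidirectional only if no module reaches itself."""
    return any(name in import_closure(name, imports) for name in imports)


loaded = import_closure("phenomena", IMPORTS)
```

In this toy graph, importing "phenomena" loads five other modules, which is the reason for keeping each ontology small.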

  22. Web Sites • http://sweet.jpl.nasa.gov • http://PlanetOnt.org