
ESOR: Earth Science Ontology Repository for Existing Designs and Prototypes at RPI

ESOR is a backend knowledge base that provides an entity matching service for multiple applications in the Earth Science domain. According to the presentation, when used purely as a backend knowledge base it achieves better recall and precision than BioPortal.

macw

Presentation Transcript


  1. Show and Tell: Existing Designs and Prototypes at RPI

  2. Earth Science Ontology Repository (ESOR)

  3. What is ESOR? [1] • The Earth Science Ontology Repository (ESOR) provides an entity matching service as a backend knowledge base for multiple applications. It serves a function similar to that of BioPortal [2], but with more of a focus on the Earth Science domain.

  4. How to use ESOR
     Links:
     • keyword search: http://orion.tw.rpi.edu/~zhengj3/wod/earthsearch.php
     • matched entity with score by keyword: dataonetwc.tw.rpi.edu/linkipedia/search?query=
     • ontology by entity: dataonetwc.tw.rpi.edu/linkipedia/read?url=
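
The two Linkipedia links on the slide end in an open query parameter (`query=` and `url=`), so a request is formed by appending an encoded value. The following is a minimal sketch of that URL construction, not the project's own client code. The `http://` scheme, the function names, and the choice of form-style encoding are assumptions; the slide does not say how the service expects spaces to be encoded or what the response looks like, so no request is actually sent here.

```python
from urllib.parse import urlencode

# Hosts and paths are taken from the slide; the "http://" scheme is an
# assumption, because the slide lists these two links without one.
SEARCH_ENDPOINT = "http://dataonetwc.tw.rpi.edu/linkipedia/search"
READ_ENDPOINT = "http://dataonetwc.tw.rpi.edu/linkipedia/read"


def entity_search_url(keywords: str) -> str:
    """Build the 'matched entity with score by keyword' URL (hypothetical helper)."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'query': keywords})}"


def ontology_read_url(entity_uri: str) -> str:
    """Build the 'ontology by entity' URL (hypothetical helper)."""
    return f"{READ_ENDPOINT}?{urlencode({'url': entity_uri})}"


if __name__ == "__main__":
    print(entity_search_url("snow, snowdepth"))
    print(ontology_read_url(
        "http://sweet.jpl.nasa.gov/2.3/propSpaceThickness.owl#SnowCover"
    ))
```

Encoding the entity URI matters for the second link: without it, the `#SnowCover` fragment would be treated as a fragment of the request URL and never reach the server.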

  5. Example #1
     • input: “snow, snowdepth” (MsTMIP #28)
     • output of keyword search:

  6. Example #2 • output of matched entity with score by keyword:

  7. Example #3 • output of ontology by entity:

  8. Comparison with BioPortal
     • Given “snow depth” or “snow, snow depth” as input, BioPortal returns entities that match either of the keywords “snow” or “depth”. As a result, the better match “http://sweet.jpl.nasa.gov/2.3/propSpaceThickness.owl#SnowCover” is missed.
     • A similar problem occurs with other MsTMIP variables (#10 Heterotrophic Respiration, #11 Leaf Area Index, #22 Near surface specific humidity, #23 Sensible heat, #24 Latent Heat, ...).
     • BioPortal has much more complex functionality than ESOR. However, when used only as a backend knowledge base, ESOR performs better in terms of recall and precision.

  9. List of ontologies
     • version 1 (the current version)
       ChEBI (Chemical Entities of Biological Interest)
       OBI (Ontology for Biomedical Investigations)
       OBOE (Extensible Observation Ontology)
       PROV-O (The Provenance Ontology)
       SWEET Chemical Properties
       SWEET Human Research
       SWEET Units
       SemantEco Water Ontology
       SemantEco Pollution Ontology
       Time Ontology
       UO (Units of Measurement Ontology)
       dcterms (Dublin Core)
       foaf (Friend of a Friend)
       geonames
       wgs (Basic Geo Vocabulary)
       ...
     • version 2 (coming soon)
       • KB_Bio_101 (AURA)
       • Santa Barbara Coastal Observation Ontology (OBOE-SBC)
       • ...
     * For the complete list, see document section 3: https://docs.google.com/document/d/1Hs3k0RrfUoQkxKEJBJtU9trFdqC-NdhtwFKSz5wcHXM/edit#

  10. Underlying techniques

  11. How can we benefit from ESOR in D1?
      • help the user choose the right entity with which to annotate their dataset
      • serve as a backend knowledge base for automatic semantic annotation
      • match entities based on content instead of single keywords

  12. SemantEco Annotator

  13. SemantEco: Semantic Environmental Monitoring Approach
      Goal
      • Enable/empower communities (citizens & scientists) to explore pollution sites, facilities, regulations, and health impacts along with provenance
      • Connections to USGS, Lake George, IBM, expanding to discussions of predictions and intervention suggestions
      • Explanation of pollution limits
      • Graphing thresholds and trends
      • Possible health effect of contaminant (EPA)
      • Filtering by facet to select type of data
      • Link for reporting problems
      • Extended with input from USGS, with population counts for birds & fish
      Questions we try to answer
      • Where are pollution events happening?
      • What are the health impacts?
      • How does pollution correlate with population changes (wildlife, invasives, etc.)?

  14. Tools for Semantic Annotation of Measurements
      Demo: https://www.youtube.com/watch?v=pKO5NwgWnyc
      • Annotator
      • annotation of CSV files
      • OBOE design pattern
      • D1 Phase I product, and other ongoing work at RPI...
      • GitHub, Web interface available

  15. Focus of D1 Semantics in Phase II
      Semantics of Measurements:
      ...binding raw data values to concepts drawn from ontologies
      ...often through metadata
      ...using W3C standards for annotation (PROV, OA)
      ...to facilitate resource discovery and interpretation through enhanced precision and recall of searches
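
To make "binding raw data values to concepts drawn from ontologies" concrete, here is a minimal sketch of one annotation record that ties a CSV column to an ontology concept using Open Annotation (OA) and PROV terms. It is an illustration under stated assumptions, not the format the SemantEco or DataONE annotators actually emit. The dataset URI, the annotator URI, the `#col=` fragment used to point at a column, and the `make_annotation` helper are all hypothetical; `oa:Annotation`, `oa:hasTarget`, `oa:hasBody`, and `prov:wasAttributedTo` are terms from the W3C vocabularies.

```python
import json


def make_annotation(dataset_uri: str, column: str,
                    concept_uri: str, annotator_uri: str) -> dict:
    """Return a JSON-LD style record linking one CSV column to one concept.

    Hypothetical helper: the target fragment syntax is an assumption made
    for illustration only.
    """
    return {
        "@context": {
            "oa": "http://www.w3.org/ns/oa#",
            "prov": "http://www.w3.org/ns/prov#",
        },
        "@type": "oa:Annotation",
        # What is being annotated: a column of the raw data file.
        "oa:hasTarget": {"@id": f"{dataset_uri}#col={column}"},
        # What it means: a concept drawn from an ontology.
        "oa:hasBody": {"@id": concept_uri},
        # Who asserted the binding (provenance).
        "prov:wasAttributedTo": {"@id": annotator_uri},
    }


if __name__ == "__main__":
    annotation = make_annotation(
        "http://example.org/data/mstmip.csv",   # hypothetical dataset
        "snowdepth",                            # hypothetical column name
        "http://sweet.jpl.nasa.gov/2.3/propSpaceThickness.owl#SnowCover",
        "http://example.org/people/annotator",  # hypothetical annotator
    )
    print(json.dumps(annotation, indent=2))
```

Because the body is a concept URI rather than a free-text keyword, a search for the concept can retrieve the dataset even when the column header uses a different spelling, which is the precision and recall benefit the slide refers to.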

  16. What’s new: ontology search
      Features: weighted ranking, entity linking, similarity check, user preference
      The Semantic Annotator OWL search takes advantage of the Earth Science ontology knowledge base and the Linkipedia tool from the Tetherless World Constellation at RPI.

  17. What’s new: User Management
      • Misc interface updates:
        • Integrated ontology search
        • Other bug fixes
      • User interface can be implemented for a variety of storage / representation methods
      • UserStore interface handles read-write for Users (to/from database, file, etc.)
      • Permission interface is a simple representation of source and level that can be translated to a URI
      Interfaces: User, UserStore, Permission
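
The three interfaces named on this slide can be sketched as follows. This is an illustrative reading of the slide, not the annotator's actual code: the slide names `User`, `UserStore`, and `Permission` and describes their roles, but the method names (`save`, `load`, `to_uri`), the in-memory backend, and the URI layout under `example.org` are assumptions introduced here.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Permission:
    """A source and a level, as on the slide; translatable to a URI."""
    source: str
    level: str

    def to_uri(self, base: str = "http://example.org/permission") -> str:
        # Hypothetical URI layout; the slide does not specify one.
        return f"{base}/{self.source}/{self.level}"


@dataclass
class User:
    name: str
    permissions: List[Permission] = field(default_factory=list)


class UserStore(ABC):
    """Read-write access to users, independent of the storage backend."""

    @abstractmethod
    def save(self, user: User) -> None:
        ...

    @abstractmethod
    def load(self, name: str) -> Optional[User]:
        ...


class InMemoryUserStore(UserStore):
    """One possible backend; a database or file store would implement the same two methods."""

    def __init__(self) -> None:
        self._users: Dict[str, User] = {}

    def save(self, user: User) -> None:
        self._users[user.name] = user

    def load(self, name: str) -> Optional[User]:
        return self._users.get(name)


if __name__ == "__main__":
    store = InMemoryUserStore()
    store.save(User("alice", [Permission("esor", "read")]))
    loaded = store.load("alice")
    print(loaded.permissions[0].to_uri())
```

Keeping storage behind an abstract `UserStore` is what lets the same user-facing code work against a database, a file, or another representation, as the slide describes.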

  18. How can we benefit from SemantEco Annotator in D1?
      • Annotator can be integrated into the DataONE annotator, especially for use with CSV-formatted datasets
      • Reuse the design pattern
      • Facilitate resource discovery and interpretation through enhanced precision and recall of searches

  19. Related projects: Jefferson Project

  20. The Jefferson Project
      Fundamental Goal: Understand, Predict and Enable a Healthy Lake George Ecosystem
      Using Cutting-Edge Science to Enable Smarter Solutions
      • Multi-beam SONAR, Bathymetric LiDAR, Terrestrial LiDAR, and a sensor network composed of 30+ instruments streaming data 24/7/365
      • High resolution, high accuracy, high data density seamless data set
      • ~70 TB/year total raw data expected, 7-8 TB/year of “product data”
      • The semantic approach contributes in two knowledge representation and reasoning areas:
        • human interventions on the deployment and maintenance of local sensor networks, including the scientific knowledge to decide how and where sensors are deployed
        • knowledge about simulation results, including parameters, interpretation of results, and comparison of results against external data
      • Data integration through the use of the Human-Aware Sensor Network Ontology (HASNetO), which is based on OBOE, W3C PROV, and VSTO

  21. References
      [1] ESOR link: http://orion.tw.rpi.edu/~zhengj3/wod/earthsearch.php
      [2] BioPortal: http://bioportal.bioontology.org/
      [3] The Earth Science Ontology Repository: https://docs.google.com/document/d/1Hs3k0RrfUoQkxKEJBJtU9trFdqC-NdhtwFKSz5wcHXM/edit#
      [4] Wikipedia/dbpedia mapping for MsTMIP variables: https://docs.google.com/document/d/18tKNwyonw2sFzbFE0BzPt8WtPA1gVS6EPaGYxeh9dd4/edit#heading=h.fj17u9rk42u
      [5] MsTMIP use case: https://docs.google.com/document/d/1hS7j-TCLtbA2x0ztZZhXI-1NNqE5DEi585MnOq8fhpo/edit#heading=h.u4s2g4klu51s
      [6] MsTMIP 45 variables mapping: https://docs.google.com/document/d/1y2ieWXIhmE6-vz3SesCf50tfO46DWcUBVbWpCWdlBWk/edit
      [7] Annotator: http://tw.rpi.edu/web/project/SemantEcoAnnotator

      Thanks!
