
TITLE: Semantic infrastructure for the Geosciences

A. Krishna Sinha, Geosciences, Virginia Tech, Blacksburg (pitlab@vt.edu); *Robert Raskin, Jet Propulsion Laboratory, Pasadena (robert.g.raskin@jpl.nasa.gov); Natasha Noy, Stanford (noy@stanford.edu)

Presentation Transcript


  1. There is broad consensus that all scientific disciplines must deal with heterogeneity of data, as well as (a) data complexity, (b) data volatility, (c) large data volumes, (d) broad distribution of data resources, and (e) access to tools and services to appropriately render and represent data and data products. The semantic capabilities needed to integrate complex and heterogeneous data within a knowledge-based infrastructure are the focus of this proposal. Semantic technologies provide an abstraction layer above existing syntax-based technologies, thereby bridging data, tools, and various processes across scientific domains and IT silos. Because the proposed framework is relevant to all science disciplines, our workshops will include ontologists from other scientific disciplines, e.g. medical and bioinformatics, where ontology development is more mature. A.K. Sinha, EOI presentation, Arlington, April 30-May 1, 2012

  2. Our research program emphasizes two major goals:
     • Knowledge Capture: We will create deep data-level ontologies (Foundation ontologies) that capture the conceptual meaning of data used in the geosciences, with the goal of promoting data interoperability and automated inference, leading to the transition from data to knowledge. We will use existing middle- and upper-level ontologies, such as the IEEE Ontology Working Group’s Suggested Upper Ontologies (SUO; http://suo.ieee.org/) and the mid-level ontologies contained in the Semantic Web for Earth and Environmental Terminology (SWEET), to anchor Foundation ontologies so that they comply with widely accepted ontology engineering practices.
     • Knowledge-based Services: Data-to-knowledge transformation enabled by semantic integration requires knowledge-based services including: (1) semantically based registration and discovery of metadata, data and tools, (2) seamless access to and fusion of multiple data products, (3) the ability to find structure within large, heterogeneous data, and (4) automated scientific reasoning.

  3. Developing Foundation Ontologies through workshops. We will convene multiple workshops with domain specialists to expand ontologies based on geoscience community needs and goals. We will implement first-order logic-based relationships between domain-specific concepts, features, phenomena and processes. Past work on developing geoscience ontologies has been limited mostly to taxonomic organization of terms, e.g. GeoSciML for geologic maps (One Geology, http://www.onegeology.org/) and MMI for the marine world (http://marinemetadata.org), and is not based on the first-order logic required for inference and computational reasoning. We will also develop ontology-aware tools (e.g. semantically registered data) that exploit this shared understanding to enable a rapid and easy transition from data discovery to knowledge discovery.
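The contrast drawn in slide 3 between a purely taxonomic vocabulary and logic-based relationships can be illustrated with a minimal sketch. It uses two Horn-style rules (a small fragment of first-order logic) and simple forward chaining. The concept names, the `produced_by` relation, and both rules are assumptions made for illustration and are not taken from the proposed Foundation ontologies.

```python
from itertools import product

# Hypothetical facts as (predicate, subject, object) triples; illustrative only.
FACTS = {
    ("is_a", "Basalt", "VolcanicRock"),
    ("is_a", "VolcanicRock", "IgneousRock"),
    ("produced_by", "VolcanicRock", "Volcanism"),
}


def infer(facts):
    """Forward-chain two rules until no new facts can be derived."""
    known = set(facts)
    while True:
        new = set()
        for (p1, a, b), (p2, c, d) in product(known, repeat=2):
            if b != c:
                continue
            # Rule 1: is_a(x, y) and is_a(y, z) -> is_a(x, z)
            if p1 == "is_a" and p2 == "is_a":
                new.add(("is_a", a, d))
            # Rule 2: is_a(x, y) and produced_by(y, p) -> produced_by(x, p)
            if p1 == "is_a" and p2 == "produced_by":
                new.add(("produced_by", a, d))
        if new <= known:
            return known
        known |= new


derived = infer(FACTS)
```

A taxonomy alone stores only the `is_a` links. With the rules, a reasoner can also conclude facts that were never asserted, such as `("produced_by", "Basalt", "Volcanism")`. A production system would more likely express such rules in an ontology language and run an existing reasoner.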

  4. Conceptual organization of ontologies at various levels of granularity. Foundation-level representations of concepts (Material, Time, Structure, Location, Services) provide the data-level semantics required to integrate across disciplines, as well as to discover metadata and tools.

  5. Our experience with ontology development shows that domain experts feel more confident in sharing their knowledge if existing ontology development efforts, e.g. methodology, are communicated prior to workshops. The workshops will capture and formalize community knowledge that is often rooted in the experience of domain experts. We will use community participation (through invitations and open calls for participation), coupled with the broad geoscience theme of volcanism and climate change and its heterogeneous data types, to accelerate the process of creating the relevant ontologies. The workshop participants will then populate, modify and create ontologies, leading to semantic-framework-based solutions for the use case. Collaborative ontology workflow methods will be used so that participants can engage in ontology development in the areas shown in the figure. [Figure: Primary ontologies relevant to the geosciences and focus of workshops]

  6. After the initial workshops, we will continue to foster a distributed community of contributors and ontology users. We will run an online community-based ontology repository by installing a Virtual Machine image of BioPortal (http://bioportal.bioontology.org), which is an actively supported, widely used, domain-independent ontology-repository infrastructure. Scientists will use the repository to view and review the ontologies, propose changes, request new terms, and create mappings between similar terms in different ontologies and vocabularies. Users with edit rights will use a distributed ontology-editing environment (e.g., WebProtege) to make the required changes and run a reasoner to ensure consistency. BioPortal also provides a Web service interface that lets applications access the ontologies and the mappings, so that data can be annotated with ontology terms for integration or aggregation.

  7. Data types related to studying volcanoes. Data associated with just volcanoes! [Figure labels: Nyamuragira, Nyiragongo; credit: Sarah Colclough, Cambridge University]

  8. Physical quantity versus measured-as quantity. Value and units? Reference frame? Reference units?
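The questions on slide 8 suggest that a measurement should carry its physical quantity, value, units and reference frame together. A minimal sketch of such a record, with conversion to a reference unit, follows. The class name, field names, conversion table and sample values are all assumptions made for illustration and are not part of the proposed ontologies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MeasuredQuantity:
    physical_quantity: str                 # the concept being measured
    value: float
    unit: str
    reference_frame: Optional[str] = None  # e.g. a datum; None if not applicable


# Hypothetical reference units and conversions, for illustration only.
REFERENCE_UNIT = {"temperature": "K", "elevation": "m"}
CONVERSIONS = {
    ("degC", "K"): lambda v: v + 273.15,
    ("km", "m"): lambda v: v * 1000.0,
}


def to_reference_unit(q: MeasuredQuantity) -> MeasuredQuantity:
    """Return the same measurement expressed in its quantity's reference unit."""
    target = REFERENCE_UNIT[q.physical_quantity]
    if q.unit == target:
        return q
    convert = CONVERSIONS[(q.unit, target)]
    return MeasuredQuantity(q.physical_quantity, convert(q.value), target,
                            q.reference_frame)


# Illustrative values, not data from the presentation.
lava = MeasuredQuantity("temperature", 1100.0, "degC")
summit = MeasuredQuantity("elevation", 3.47, "km", reference_frame="mean sea level")
```

Keeping the reference frame on the record means that two elevations are compared only when both refer to the same datum, a check that a bare number cannot support.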

  9. Ontology Packages

  10. Some concepts within the Material Package. Integration holds the promise of fundamentally transforming how biologic research is done, allowing researchers to synthesize information and make connections among many types of experiments in ways that have never before been possible; but it also poses the most difficult challenge to those who develop and use the databases. “The problem is that interaction with a collection of databases should be as seamless as interaction with any single member of the collection. But integrating databases in this way has proved exceptionally difficult because the databases are so different.” “We have many disciplines, many subfields,” said Wiederhold, of Stanford University's Computer Science Department, “and they are autonomous, and must remain autonomous, to set their own standards of quality and make progress in their own areas. We can't do without that heterogeneity.” At the same time, however, “the heterogeneity that we find in all the sources inhibits integration.” The result is what computer scientists call “the interoperability problem,” ... (NAS report)

  11. Registration of data: key to discovery and integration
     Level 1. Discovery of data resources (e.g., gravity, geologic maps, etc.) requires registration through the use of high-level index terms such as those of AGI, Geoscience World, AGU, GCMD.
     Level 2. Discovery of item-level databases requires registration to data-level ontologies (e.g. rock geochemistry, gravity database).
     Level 3. Item-detail-level registration (e.g., the column in a geochemical database that represents the SiO2 measurement). This level of registration is a requirement for semantic integration.
     (A.K. Sinha and Kai Lin, Geoinformatics 2006)
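The three registration levels on slide 11 can be sketched as a small registry in which each resource records how deeply it has been registered. This is only a sketch under stated assumptions: the `Registration` class, the `discover` function, the resource names and the concept names are hypothetical and do not describe the authors' registration tool.

```python
from dataclasses import dataclass, field


@dataclass
class Registration:
    resource: str
    index_terms: set = field(default_factory=set)        # Level 1
    item_concepts: set = field(default_factory=set)      # Level 2
    column_concepts: dict = field(default_factory=dict)  # Level 3: column -> concept

    def level(self) -> int:
        """Deepest registration level recorded for this resource."""
        if self.column_concepts:
            return 3
        if self.item_concepts:
            return 2
        return 1 if self.index_terms else 0


def discover(registry, term, min_level=1):
    """Return resources registered to `term` at or above `min_level`."""
    hits = []
    for reg in registry:
        terms = reg.index_terms | reg.item_concepts | set(reg.column_concepts.values())
        if term in terms and reg.level() >= min_level:
            hits.append(reg.resource)
    return hits


# Hypothetical resources, one per registration level.
REGISTRY = [
    Registration("state_geologic_map", index_terms={"geologic maps"}),
    Registration("gravity_db", index_terms={"gravity"},
                 item_concepts={"GravityDatabase"}),
    Registration("rock_chem_db", index_terms={"geochemistry"},
                 item_concepts={"RockGeochemistry"},
                 column_concepts={"col_7": "SiO2"}),
]
```

For example, `discover(REGISTRY, "gravity")` finds the gravity database, while `discover(REGISTRY, "gravity", min_level=3)` finds nothing, because that resource has no column-level registration and so cannot yet take part in semantic integration.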

  12. Data registration at various levels of granularity: registration technology
     Level 1: Data Registration at the Index Term Level
     Level 2: Data Registration at the Item Level
     Level 3: Data Registration at the Item Detail Level
     [Figure labels: Data Discovery; Data Integration; Data Integration Technology; Earth Sciences Virtual Database, a data warehouse where the schema heterogeneity problem is solved]

  13. Level 1: Registration at the Index Term Level (http://www.geoscienceworld.org/)

  14. Level 2: Registration at the Item Level. [Figure labels: Structure, Isotope, Location, Mineral, Rock, Element] (A.K. Sinha and Kai Lin, Geoinformatics 2006)

  15. Level 3: Registration at the Item Detail Level (Example 1). The approach of registering data to concepts removes structural (format) and semantic heterogeneity. [Figure: a section from the Material Ontology; diagram labels: 1, 0..n] (A.K. Sinha and Kai Lin, Geoinformatics 2006)
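To illustrate the claim on slide 15, the sketch below merges two sources that differ in layout (structural heterogeneity) and in column naming (semantic heterogeneity). Both are merged by registering each column to a shared concept. The sources, column names, concept names and values are hypothetical, and the sketch assumes both sources report SiO2 in the same unit (weight percent).

```python
# Two hypothetical sources with different layouts and column names.
SOURCE_A = [{"sample": "A1", "SiO2_wt_pct": 49.2}]        # list of dicts
SOURCE_B = {"columns": ["id", "silica"],                  # header plus row tuples
            "rows": [("B7", 51.0)]}

# Item-detail registration: (source, column) -> ontology concept (assumed names).
REGISTRATION = {
    ("A", "sample"): "SampleID",
    ("A", "SiO2_wt_pct"): "SiO2",
    ("B", "id"): "SampleID",
    ("B", "silica"): "SiO2",
}


def records(source):
    """Yield rows as {column: value} dicts regardless of storage layout."""
    if isinstance(source, list):
        yield from source
    else:
        for row in source["rows"]:
            yield dict(zip(source["columns"], row))


def integrate(sources):
    """Rewrite every record in terms of registered concepts and merge them."""
    merged = []
    for name, source in sources.items():
        for rec in records(source):
            merged.append({REGISTRATION[(name, col)]: val
                           for col, val in rec.items()})
    return merged


merged = integrate({"A": SOURCE_A, "B": SOURCE_B})
```

After integration, every record uses the same concept keys, so a query such as "all samples with SiO2 above 50" can run across both sources without knowing their original schemas.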

  17. Links to Roadmap elements
     • 1. Purpose: will enable data discovery, utilization and integration
     • 2. Solutions: will provide technology for current and future research needs
     • 3. Process: will enable interoperability
     • 4. Communication: will enable interactions with other sciences
