
Research Initiatives in Deep Carbon Observatory – Data Science

Han Wang (wangh17@rpi.edu), Yanning Chen (cheny18@rpi.edu), Xiaogang Ma (max7@rpi.edu), John Erickson (erickj4@rpi.edu), Patrick West (westp@rpi.edu), and Peter Fox (pfox@cs.rpi.edu).


Presentation Transcript


Han Wang (wangh17@rpi.edu), Yanning Chen (cheny18@rpi.edu), Xiaogang Ma (max7@rpi.edu), John Erickson (erickj4@rpi.edu), Patrick West (westp@rpi.edu), and Peter Fox (pfox@cs.rpi.edu)

Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, NY 12180, United States

Ontology Development

We are encoding the key concepts of DCO research activities, such as instruments, parameters, techniques, materials, people, and organizations, into ontological models.

Fig. 1. A preliminary ontology for DCO objects.

Provenance Capture

Numerous instruments are involved in the scientific activities of the four DCO directorates. These instruments produce a large amount of data, which humans or software agents then use to generate various data artifacts. Because these generation processes tend to be fairly complex, capturing their provenance becomes crucial. We propose to use the PROV Ontology, the W3C recommendation for modeling provenance, to represent the entities, people, and activities involved in producing the data artifacts.

Knowledge Discovery

By combining semantic web approaches with other techniques, we can discover new knowledge from the data. One technique we are looking into is Formal Concept Analysis (FCA), a method for “deriving a concept hierarchy from a collection of objects and their properties” [Wikipedia]. Applying FCA to a set of objects reveals the implicit relationships between those objects and gives us tools to visualize them. In other words, FCA lets us automatically learn a conceptual hierarchy or taxonomy from the given data. In DCO, we propose to perform FCA on the semantically annotated entities, such as instruments and measured parameters. Our aim is to automatically build an instrument hierarchy, a parameter hierarchy, and the relationships that link parameters to the instruments that measure them.
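The Formal Concept Analysis step described under Knowledge Discovery can be illustrated with a small sketch. It is not the DCO implementation: the instrument and parameter names are hypothetical, and the brute-force enumeration over object subsets is only practical for very small contexts. Each formal concept pairs a set of instruments (the extent) with exactly the parameters they all share (the intent), and ordering concepts by extent inclusion gives the hierarchy.

```python
from itertools import combinations

# Hypothetical formal context (not from the poster): each instrument (object)
# is mapped to the set of parameters (attributes) it measures.
CONTEXT = {
    "mass_spectrometer": {"isotope_ratio", "composition"},
    "raman_spectrometer": {"composition", "crystal_structure"},
    "x_ray_diffractometer": {"crystal_structure"},
}


def common_attributes(objects, context):
    """Attributes shared by every given object (all attributes if none given)."""
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return frozenset(shared)


def objects_having(attributes, context):
    """Objects that possess every one of the given attributes."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects."""
    concepts = set()
    objects = list(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = objects_having(intent, context)
            concepts.add((extent, intent))
    return concepts


def is_subconcept(lower, upper):
    """A concept lies below another when its extent is contained in the other's."""
    return lower[0] <= upper[0]


concepts = formal_concepts(CONTEXT)
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), "<->", sorted(intent))
```

In this toy context, the concept grouping the Raman spectrometer and the X-ray diffractometer under the shared parameter `crystal_structure` is the kind of link between instruments and parameters that the text describes. A real context would call for a dedicated FCA algorithm rather than subset enumeration.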
Information Retrieval

• One use of the ontologies is as templates for annotating entities in the textual titles and/or descriptions of datasets. The annotated entities carry semantic meaning from the ontologies, so we can build various applications on them.
• When a dataset title and/or description contains domain-specific terms, we can show their explanations, which are encoded in the ontologies.
• When the text contains bibliographic information such as a person, an organization, or an article, we can present additional information about it that is linked to the original entity in the ontologies.
• When a user registers a dataset and needs to input textual metadata, we can enable auto-completion as long as the ontologies contain the relevant entries.

Fig. 2. Using the PROV Ontology to model the process of an instrument taking a measurement.

Fig. 3. Using the PROV Ontology to model the process of a chart being generated from a dataset.
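The annotation and auto-completion uses listed under Information Retrieval can be sketched as follows. This is a minimal illustration under stated assumptions, not the DCO system: the ontology is reduced to a hypothetical dictionary of lowercase labels with a class and a definition, and `annotate` and `autocomplete` are illustrative names.

```python
import re
from bisect import bisect_left

# Hypothetical ontology entries (not from the poster):
# lowercase label -> (ontology class, short definition).
ONTOLOGY = {
    "raman spectrometer": ("Instrument", "Measures inelastic scattering of light."),
    "carbon isotope ratio": ("Parameter", "Ratio of 13C to 12C in a sample."),
    "diamond": ("Material", "A crystalline form of carbon."),
}


def annotate(text, ontology):
    """Find ontology labels in free text; return (start, end, label, class)."""
    # Try longer labels first so multi-word terms win over shorter ones.
    labels = sorted(ontology, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(label) for label in labels) + r")\b",
        re.IGNORECASE,
    )
    annotations = []
    for match in pattern.finditer(text):
        label = match.group(0).lower()
        annotations.append((match.start(), match.end(), label, ontology[label][0]))
    return annotations


def autocomplete(prefix, ontology, limit=5):
    """Suggest ontology labels that start with the typed prefix."""
    labels = sorted(ontology)
    prefix = prefix.lower()
    suggestions = []
    # Binary search to the first label that could match, then scan forward.
    for label in labels[bisect_left(labels, prefix):]:
        if not label.startswith(prefix) or len(suggestions) == limit:
            break
        suggestions.append(label)
    return suggestions


title = "Carbon isotope ratio of diamond measured with a Raman spectrometer"
for start, end, label, cls in annotate(title, ONTOLOGY):
    print(f"{title[start:end]!r} -> {cls}: {ONTOLOGY[label][1]}")
print(autocomplete("ra", ONTOLOGY))
```

Matching on exact labels keeps the sketch short. A production annotator would also need synonyms, plural forms, and disambiguation, none of which the poster specifies.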
