This paper discusses the problem of integrating and interpreting data in large projects and presents foundational technologies and recent developments in semantic tools for complex ecosystem research environments.
Semantic Support for Complex Ecosystem Research Environments
Deborah McGuinness (1), Paulo Pinheiro (1), Henrique Santos (1, 2), Matthew Klawonn (1), Katherine Chastain (1)
(1) Rensselaer Polytechnic Institute, USA
(2) Universidade de Fortaleza, Brazil
AGU, December 2015
Outline
• Problem Statement
• Foundational Technologies
• Long standing semantic tools
• Custom solutions
• Recent Developments
• Conclusions
• Future Directions
Problem Statement
• In large projects, how should data be:
  • Integrated with other relevant data and metadata?
  • Interpreted?
• And also:
  • Accessed, shared, and visualized?
• Examples of data types in projects we work on:
  • Environmental monitoring
  • Architecture science and ecology
Foundational Technologies
• Ontologies: For capturing context
  • PROV-O
  • OBOE
  • VSTO
  • HASNetO
• Apache SOLR: For storage and retrieval
• Contextualized CSVs: For data annotation
• D3 JavaScript: For metadata visualization
The Human-Aware Sensor Network Ontology
[Figure: class diagram of the ontology. Classes shown: prov:Agent, prov:Activity, vstoi:Platform, vstoi:Deployment, vstoi:Instrument, vstoi:Detector, vstoi:AttachedDetector, vstoi:DetachableDetector, hasneto:DataCollection, hasneto:SensingPerspective, oboe:Measurement, oboe:Characteristic, oboe:Entity. Properties shown: wasAssociatedWith, startedAtTime (xsd:dateTime), endedAtTime (xsd:dateTime), hasDataCollection, hasneto:hasMeasurement, perspectiveOf, hasPerspectiveCharacteristic, of-characteristic.]
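To make the shape of the ontology more concrete, the following is a minimal Python sketch of a few of its classes as plain data structures. It is not part of HASNetO or HADatAc: the attribute layout (for example, grouping platform, instrument, and detectors under a deployment) is an assumption drawn from the class names in the diagram, and the names "tower", "weather-station", "anemometer", and "Air" are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Set


@dataclass
class Measurement:
    """Stands in for oboe:Measurement."""
    entity: str          # oboe:Entity being observed (hypothetical value below)
    characteristic: str  # oboe:Characteristic of that entity
    value: float


@dataclass
class DataCollection:
    """Stands in for hasneto:DataCollection, with PROV-style timestamps."""
    started_at: datetime                 # prov:startedAtTime
    ended_at: Optional[datetime] = None  # prov:endedAtTime
    measurements: List[Measurement] = field(default_factory=list)


@dataclass
class Deployment:
    """Stands in for vstoi:Deployment; this attribute layout is an assumption."""
    platform: str
    instrument: str
    detectors: List[str]
    data_collections: List[DataCollection] = field(default_factory=list)

    def characteristics(self) -> Set[str]:
        """Collect every characteristic measured during this deployment."""
        return {
            m.characteristic
            for dc in self.data_collections
            for m in dc.measurements
        }


# Hypothetical deployment with one data collection and two measurements.
collection = DataCollection(
    started_at=datetime(2015, 2, 12, 9, 30, tzinfo=timezone.utc)
)
collection.measurements.append(Measurement("Air", "WindSpeed", 0.99))
collection.measurements.append(Measurement("Air", "WindDirection", 217.9))
deployment = Deployment("tower", "weather-station", ["anemometer"], [collection])
```

Keeping the data collection separate from the deployment mirrors the diagram, where one deployment can have many data collections, each with its own start and end time.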
HADatAc
• Human Aware Data Acquisition Framework
• A web application based on Apache SOLR and the Play Framework
• Goal: To provide a one-stop shop for combined data and metadata management, markup, integration, retrieval, and visualization
• Uses ontologies combined with limited human markup to achieve this goal
• Can be deployed on a laptop or a server, depending on the user's needs
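Since retrieval is built on Apache SOLR, a faceted request can be pictured as an ordinary Solr select query. The sketch below only builds the request URL using standard Solr parameters (`q`, `fq`, `facet`, `facet.field`, `rows`, `wt`). The core name `measurements` and the field names `characteristic` and `unit` are hypothetical; the slides do not describe HADatAc's actual Solr schema.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def facet_query_url(base_url, core, facet_fields, filters=None, rows=10):
    """Build a Solr select URL that asks for facet counts on the given fields."""
    params = [("q", "*:*"), ("rows", rows), ("wt", "json"), ("facet", "true")]
    # One facet.field parameter per field to be counted.
    params += [("facet.field", name) for name in facet_fields]
    # One fq (filter query) parameter per facet value the user has selected.
    params += [
        ("fq", f'{name}:"{value}"') for name, value in (filters or {}).items()
    ]
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical core and field names; 8983 is Solr's default port.
url = facet_query_url(
    "http://localhost:8983",
    "measurements",
    ["characteristic", "unit"],
    filters={"unit": "Degree"},
)

# Decode the query string again to inspect what would be sent.
params = parse_qs(urlsplit(url).query)
```

Each selected facet value becomes a separate `fq` parameter, so Solr can narrow the result set without changing the main query.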
Combining Data and Metadata
[Figure: interface screenshot. Callouts: "Measurement metadata", "Metadata based faceted search", "Metadata about the metadata", and two "Mouse over" labels.]
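The idea behind metadata-based faceted search can be sketched in a few lines of Python: count how many measurements carry each metadata value, then narrow the set by the values a user selects. This is an in-memory illustration only, not HADatAc's implementation, and the field names `characteristic` and `unit` are assumptions; the sample values are taken from the wind measurements used elsewhere in this deck.

```python
from collections import Counter

# Each measurement carries its metadata alongside the value.
MEASUREMENTS = [
    {"characteristic": "WindSpeed", "unit": "MeterPerSecond", "value": 0.99},
    {"characteristic": "WindSpeed", "unit": "MeterPerSecond", "value": 1.112},
    {"characteristic": "WindDirection", "unit": "Degree", "value": 217.9},
]


def facet_counts(records, fields):
    """Count how many records carry each value of each metadata field."""
    return {name: Counter(r[name] for r in records) for name in fields}


def apply_filters(records, selected):
    """Keep only the records that match every selected facet value."""
    return [
        r for r in records
        if all(r.get(name) == value for name, value in selected.items())
    ]


counts = facet_counts(MEASUREMENTS, ["characteristic", "unit"])
wind_direction = apply_filters(MEASUREMENTS, {"unit": "Degree"})
```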
Data Privacy
• In addition to its visualization, integration, and retrieval features, HADatAc has sophisticated privacy mechanisms
• Data can be assigned different levels of access, which determine what is open to anonymous users and what is open to pre-registered users
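A level-based access check of this kind could look roughly like the sketch below. The slides do not describe how HADatAc implements its privacy controls, so the two-level `AccessLevel` enum, the `visible_datasets` helper, and the dataset names are all hypothetical.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Hypothetical access levels, ordered from least to most privileged."""
    ANONYMOUS = 0   # no login required
    REGISTERED = 1  # pre-registered users only


def visible_datasets(datasets, user_level):
    """Keep datasets whose required level does not exceed the user's level."""
    return [name for name, required in datasets.items() if required <= user_level]


# Hypothetical datasets mapped to the level required to see them.
DATASETS = {
    "weather-summary": AccessLevel.ANONYMOUS,
    "raw-sensor-feed": AccessLevel.REGISTERED,
}

public_view = visible_datasets(DATASETS, AccessLevel.ANONYMOUS)
member_view = visible_datasets(DATASETS, AccessLevel.REGISTERED)
```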
Ease of Use

    == START-PREAMBLE ==
    @base <http://localhost#> . .
    @prefix hasneto: <http://hadatac.org/ont/hasneto#> .
    @prefix hadatac: <http://hadatac.org/ont/hadatac#> .

    <example-kb> a hadatac:KnowledgeBase;
        hadatac:hasHost "http://localhost"^^xsd:anyURI .

    <dataCollection-example01> a hasneto:DataCollection;
        prov:startedAtTime "2015-02-12T09:30:00Z"^^xsd:dateTime .

    <deployment-example01> hasneto:hasDataCollection <dataCollection-example01> .

    <example01-dataset01> a vstoi:Dataset;
        prov:wasGeneratedBy <dataCollection-example01>;
        hadatac:hasMeasurementType <mt0>,<mt1> .

    <mt0> a hadatac:MeasurementType;
        time:inDateTime <ts0>;
        hadatac:atColumn 3;
        oboe:ofCharacteristic hadatac-entities:EC-WindDirection;
        oboe:usesStandard oboe-standards:Degree .

    <mt1> a hadatac:MeasurementType;
        time:inDateTime <ts0>;
        hadatac:atColumn 2;
        oboe:ofCharacteristic hadatac-entities:EC-WindSpeed;
        oboe:usesStandard oboe-standards:MeterPerSecond .

    <ts0> hadatac:atColumn 0 .
    == END-PREAMBLE ==
    TimeStamp,Record,WindSpdAve_ms,WindDir,WindSpd_ms_Min,WindSpdGust_ms_Max,AirTemp_C_Avg,RH_Pct_Avg,BaroPress_hPa_Avg,Rain_mm_Tot,Hail_Hits_Tot
    2015-02-12T09:30:00Z,0,0.99,217.9,0.3,1.7,-4.5,66.58,995,0,0
    2015-02-12T09:45:00Z,1,1.112,227.8,0.1,2.1,-4.372,66.45,995,0,0
    2015-02-12T10:00:00Z,2,1.169,222.2,0.3,2.6,-4.146,65.98,995,0,0

• Work with CSV files
• Automate data transfer across the web, including large amounts of data
• Retrieval (e.g., faceted search) and visualization tools are automatically usable with uploaded data
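The contextualized CSV above can be read with very little code, which is part of what makes the format easy to use. The Python sketch below splits the preamble from the data rows and uses the `hadatac:atColumn` statements to pick out the annotated columns. It is an illustration under assumptions, not HADatAc's parser: it uses a regular expression rather than a real Turtle parser, and the function names and the abridged `SAMPLE` are introduced here for the example.

```python
import csv
import io
import re

START = "== START-PREAMBLE =="
END = "== END-PREAMBLE =="

# Abridged copy of the slide's file: two measurement types and one timestamp.
SAMPLE = """== START-PREAMBLE ==
<mt0> a hadatac:MeasurementType; time:inDateTime <ts0>; hadatac:atColumn 3;
  oboe:ofCharacteristic hadatac-entities:EC-WindDirection .
<mt1> a hadatac:MeasurementType; time:inDateTime <ts0>; hadatac:atColumn 2;
  oboe:ofCharacteristic hadatac-entities:EC-WindSpeed .
<ts0> hadatac:atColumn 0 .
== END-PREAMBLE ==
TimeStamp,Record,WindSpdAve_ms,WindDir
2015-02-12T09:30:00Z,0,0.99,217.9
2015-02-12T09:45:00Z,1,1.112,227.8
"""


def split_contextualized_csv(text):
    """Return (preamble, csv_text) from a contextualized CSV string."""
    start = text.index(START) + len(START)
    end = text.index(END)
    return text[start:end].strip(), text[end + len(END):].strip()


def column_map(preamble):
    """Map each subject that declares hadatac:atColumn to its column index."""
    mapping = {}
    # Statements end with a whitespace-separated "." terminator.
    for statement in re.split(r"\s\.(?:\s|$)", preamble):
        subject = re.match(r"\s*<([^>]+)>", statement)
        column = re.search(r"hadatac:atColumn\s+(\d+)", statement)
        if subject and column:
            mapping[subject.group(1)] = int(column.group(1))
    return mapping


def read_measurements(text):
    """Return one dict per data row, keyed by the annotated subject names."""
    preamble, csv_text = split_contextualized_csv(text)
    columns = column_map(preamble)
    rows = list(csv.reader(io.StringIO(csv_text)))
    # rows[0] is the CSV header; the remaining rows hold the data.
    return [{name: row[idx] for name, idx in columns.items()} for row in rows[1:]]


records = read_measurements(SAMPLE)
```

Because the column annotations travel in the same file as the data, a reader needs no external configuration to know that column 2 holds wind speed and column 3 holds wind direction.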
Conclusions
• Various ontologies were presented with the intent to show how they capture context in big data projects
• HADatAc was introduced, along with some of its key functionalities. HADatAc is a cross-platform web service which integrates annotated data sets with other relevant data and metadata, and surrounds them with retrieval (faceted search) and visualization tools as well as privacy controls.
Future Steps
• Refine the HASNetO vocabulary and test it over a constantly growing HASNetO-based knowledge base
• Continue to add functionality to HADatAc
  • More visualization tools
  • Enhanced search capabilities
• Looking to integrate with lab information management systems (potentially for use in sciences other than medicine)
More Information
• Contact Information
  • Deborah McGuinness: dlm@cs.rpi.edu
  • Paulo Pinheiro: pinhep@rpi.edu
  • Matt Klawonn: klawom@rpi.edu