Semantic Air Quality Portal. Semantic e-Science Fall 2011. Group. Apurva Tiwari : Ontology creation Charisma Ladiwala : Data acquisition Linyun Fu: Systems Administration Rohan Dhruva : Usecase & Documentation William Gill: Front End
Semantic Air Quality Portal Semantic e-Science Fall 2011
Group • Apurva Tiwari: Ontology creation • Charisma Ladiwala: Data acquisition • Linyun Fu: Systems Administration • Rohan Dhruva: Use Case & Documentation • William Gill: Front End http://tw.rpi.edu/web/Courses/SemanticeScience/2011/airquality
Data Sources • EPA CASTNET data • Collecting measurements for pollutants • At 50 sites, since 1991 • Air Resources Board • California Environmental Protection Agency • Standard thresholds for pollutants
Use Case • “Comparing and analysing Air Quality data” • Allow users to visualise the air quality data in the US • Data collated from various sources • Know the air quality in the user’s region of interest • Moving • Travelling • Determine the risk posed by air-borne pollutants
Question 1 • How does the Air Quality of place X compare with the average air quality of New York State and what possible risks are posed by the air at place X? • Very useful for people who are planning to move to another state
Question 2 • In place X, which season poses the lowest airborne risks for a person with asthma wishing to conduct outdoor activities? • Question particularly relevant for tourists and outdoor enthusiasts • Select the best route, area, and time, keeping in mind the health restrictions
Provenance • Thresholds for different pollutants: • http://www.epa.gov/airnow/aqi_brochure_08-09.pdf • A Guide to Air Quality and Health. • U.S. Environmental Protection Agency • Office of Air Quality Planning and Standards • Outreach and Information Division • Research Triangle Park, NC, August 2009 • Collected from the website on 10/30/2011 at 8:32 pm. Why??? • Concentration-to-AQI calculator: • http://airnow.gov/index.cfm?action=resources.conc_aqi_calc • AQI calculator: AQI to concentration • U.S. Environmental Protection Agency • Collected from the website on 10/30/2011 at 8:35 pm. Why???
Asthma Hospital Discharges - Rate per 10,000 Population, Total – Ten Year trend. Department of Health, NY State. 2007-2009 SPARCS Data as of October, 2010 Revised September 2011. http://www.health.ny.gov/statistics/ny_asthma/hosp/asthma6.htm Collected on 11/3/2011 at 9:36 am. Why??? • About Asthma Emergency Department Visit Data, Hospital Discharge Data and Deaths Department of Health, NY State. SPARCS Data as of October, 2010. Revised June 2009. http://www.health.ny.gov/statistics/ny_asthma/read.htm#2About Collected on 11/3/2011 at 10:20 am. Why???
Air quality statistics and data (Status and Trends) 2005-2009 • http://www.epa.gov/airtrends/factbook.html • Air Quality Monitoring Information • Updated December 17, 2010. • Collected on 10/31/2011 at 6:20 pm. Why???
Additional Provenance Information • The csv2rdf4lod automation tool: https://github.com/timrdf/csv2rdf4lod-automation. It was downloaded on October 15, 2011 at 7:24 pm. GitHub social coding. Powered by: dedicated servers and cloud computing of Rackspace Hosting. • Data on how the asthma data was collected. >>> This data will be encoded in the visualization!!!
NY Asthma Hospital Discharge Data Provenance • Raw data: http://www.health.ny.gov/statistics/ny_asthma/hosp/asthma6.htm • Companion provenance data: asthma6.htm.pml.ttl • Level 2 data (keep only the columns Region/County and Adjusted Average Rate, add a FIPS code column, and delete the regional/national total rows and the metadata block): asthma-hospitality-discharges-2007-2009.csv • Companion provenance data: asthma-hospitality-discharges-2007-2009.csv.pml.ttl • Level 3 data: asthma-hospitality-discharges-2007-2009.csv.e1.ttl, converted with csv2rdf4lod using the enhancement parameters in asthma-hospitality-discharges-2007-2009.csv.e1.params.ttl • All files downloadable from http://tw.rpi.edu/web/Courses/SemanticeScience/2011/AirQuality
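The Level 2 step described above (keep two columns, add a FIPS code, drop total rows) could be sketched roughly as follows. This is a minimal illustration, not the group's actual script: the function name `to_level2` and the `COUNTY_FIPS` lookup are hypothetical, the lookup lists only two counties, and the sketch assumes the metadata block has already been stripped so that the first line is the header row.

```python
import csv
import io

# Hypothetical lookup table; a real one would cover every NY county.
COUNTY_FIPS = {"Albany": "36001", "Bronx": "36005"}


def to_level2(raw_csv_text, fips_lookup=COUNTY_FIPS):
    """Reduce raw rows to Region/County, FIPS, and Adjusted Average Rate."""
    reader = csv.DictReader(io.StringIO(raw_csv_text))
    rows = []
    for record in reader:
        county = record["Region/County"].strip()
        fips = fips_lookup.get(county)
        if fips is None:
            # Regional and statewide total rows have no county FIPS code,
            # so they are dropped here.
            continue
        rows.append({
            "Region/County": county,
            "FIPS": fips,
            "Adjusted Average Rate": record["Adjusted Average Rate"].strip(),
        })
    return rows
```

Dropping every row without a FIPS match is a design shortcut: it removes total rows without listing them, but it would also silently drop a misspelled county, so a real pipeline might log the skipped names.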
EPA AQI Category Ontology • Source: http://airnow.gov/index.cfm?action=pubs.aqiguideozone • Expressed in OWL: aqiCategory-owl.rdf • <owl:Restriction> <owl:onProperty rdf:resource="&e1prop;daily_aqi_value"/> <owl:someValuesFrom> <rdfs:Datatype> <owl:onDatatype rdf:resource="&xsd;integer"/> <owl:withRestrictions rdf:parseType="Collection"> <rdf:Description rdf:about="#Good-AQI-Category-Min"> <xsd:minInclusive rdf:datatype="&xsd;integer">0</xsd:minInclusive> </rdf:Description> </owl:withRestrictions> </rdfs:Datatype> </owl:someValuesFrom> </owl:Restriction> • Tested with Jena
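The OWL restriction above bounds `daily_aqi_value` for the Good category. Outside a reasoner, the same classification can be approximated with a plain range lookup. The sketch below uses the standard EPA AQI category ranges; the table name `AQI_CATEGORIES` and the function `aqi_category` are illustrative and are not taken from aqiCategory-owl.rdf.

```python
# Standard EPA AQI category ranges (inclusive integer bounds).
AQI_CATEGORIES = [
    (0, 50, "Good"),
    (51, 100, "Moderate"),
    (101, 150, "Unhealthy for Sensitive Groups"),
    (151, 200, "Unhealthy"),
    (201, 300, "Very Unhealthy"),
    (301, 500, "Hazardous"),
]


def aqi_category(daily_aqi_value):
    """Return the category name for an integer daily AQI value."""
    for low, high, name in AQI_CATEGORIES:
        if low <= daily_aqi_value <= high:
            return name
    raise ValueError(f"AQI value out of range: {daily_aqi_value}")
```

Each tuple plays the role of one minInclusive/maxInclusive pair in a datatype restriction, so the table can be checked side by side with the ontology.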
Pollutant Ontology • Pollutant.owl • <owl:Class rdf:about="#Aldehyde"> <rdfs:subClassOf rdf:resource="#Organic"/> </owl:Class> <owl:Class rdf:about="#Alkane"> <rdfs:subClassOf rdf:resource="#PureHydrocarbon"/> </owl:Class> • Not so useful now
Ontology Work • Two main ontologies in our project • Threshold ontology • Defines the threshold levels for each pollutant • Data obtained from CA Air Resources Board • Pollutant ontology • Provides data for the map/timeline • Data obtained from EPA CASTNET project • Restricted to NY State for now, to avoid data overload
Properties • Data Properties: • hasEPAValue (Type: Double) • hasCAValue (Type: Double) • Object Properties: • hasPollutant
Plans to Improve the Ontology • More provenance data: • Provenance such as the date and place at which each measurement was collected should be incorporated. • Different degrees of thresholds: • Currently, for lack of concrete numbers, only one threshold level is included. Additional levels such as average, hazardous, and lethal could be added. • New ontology for asthma: • The ontology does not yet contain any knowledge about asthma; adding it would improve the ontology. • Extension of the pollutant ontology: • The pollutant ontology can be extended to be more comprehensive and to capture more detailed knowledge about pollutants.
Middleware • PHP scripts that compose pre-built queries and cache results • Take JSON results and arrange data for fast lookup in client • 2 Dimensional arrays • Data[time][fips]
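The Data[time][fips] layout above can be illustrated with a short sketch. The project's middleware is written in PHP; Python is used here only for illustration. The sketch assumes the endpoint returns the standard SPARQL JSON results format, and the variable names `time`, `fips`, and `value` are hypothetical stand-ins for whatever the pre-built queries actually select.

```python
import json
from collections import defaultdict


def index_by_time_and_fips(sparql_json_text,
                           time_var="time", fips_var="fips", value_var="value"):
    """Arrange SPARQL JSON bindings as data[time][fips] -> float value."""
    results = json.loads(sparql_json_text)
    data = defaultdict(dict)
    for binding in results["results"]["bindings"]:
        time_key = binding[time_var]["value"]
        fips_key = binding[fips_var]["value"]
        data[time_key][fips_key] = float(binding[value_var]["value"])
    # Return a plain dict so missing keys raise instead of being created.
    return dict(data)
```

With this layout the client can fetch all counties for one slider position with a single lookup on the time key, then color each map region by its FIPS code.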
Future Middleware • Smart look-aheads based on the time and place settings made by the client • The SPARQL endpoint was found to return a maximum of 10k results • Data is currently limited to NYS (~8k rows)
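One possible way to work around a 10k result cap, should the data grow beyond NYS, is to page through results with LIMIT and OFFSET. This is not something the slides say the project does; it is a sketch of a common technique. The `run_query` callable is hypothetical: it stands for any function that sends a query string to the endpoint and returns a list of result rows.

```python
def fetch_all(run_query, base_query, page_size=10000):
    """Collect every row by requesting successive LIMIT/OFFSET pages.

    base_query should include an ORDER BY clause so that page boundaries
    stay stable between requests.
    """
    rows = []
    offset = 0
    while True:
        page = run_query(f"{base_query} LIMIT {page_size} OFFSET {offset}")
        rows.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means the result set is exhausted.
            return rows
        offset += page_size
```

The same page boundaries could also serve the look-ahead idea above, since the next page can be requested and cached before the client asks for it.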
Front End • jQuery UI API (slider) • http://jqueryui.com/ • Polymaps (map) • http://polymaps.org/
Future Work • Add more sources of data • Parse and display data for all the states
Thank You! Questions?