Semantic Air Quality Portal. Semantic e-Science Fall 2011. Group. Apurva Tiwari : Ontology creation Charisma Ladiwala : Data acquisition Linyun Fu: Systems Administration Rohan Dhruva : Usecase & Documentation William Gill: Front End
Semantic Air Quality Portal Semantic e-Science Fall 2011
Group • Apurva Tiwari: Ontology creation • Charisma Ladiwala: Data acquisition • Linyun Fu: Systems Administration • Rohan Dhruva: Use case & Documentation • William Gill: Front End http://tw.rpi.edu/web/Courses/SemanticeScience/2011/airquality
Data Sources • EPA CASTNET data • Collecting measurements for pollutants • At 50 sites, since 1991 • Air Resources Board • California Environmental Protection Agency • Standard thresholds for pollutants
Use Case • “Comparing and analysing Air Quality data” • Allow users to visualise the air quality data in the US • Data collated from various sources • Know the air quality in the user’s region of interest • Moving • Travelling • Determine the risk posed by air-borne pollutants
Question 1 • How does the Air Quality of place X compare with the average air quality of New York State and what possible risks are posed by the air at place X? • Very useful for people who are planning to move to another state
Question 2 • In place X, which season poses the lowest airborne risks for a person with asthma wishing to conduct outdoor activities? • Particularly relevant for tourists and outdoor enthusiasts • Select the best route, area, and time, keeping health restrictions in mind
Provenance • Thresholds for different pollutants: • http://www.epa.gov/airnow/aqi_brochure_08-09.pdf • A Guide to Air Quality and Health. U.S. Environmental Protection Agency, Office of Air Quality Planning and Standards, Outreach and Information Division, Research Triangle Park, NC, August 2009. Collected from the website on 10/30/2011 at 8:32 pm • Converting concentration to AQI, calculator: http://airnow.gov/index.cfm?action=resources.conc_aqi_calc AQI calculator: AQI to concentration. U.S. Environmental Protection Agency. Collected from the website on 10/30/2011 at 8:35 pm
Air quality statistics and data ( Status and Trends) 2005-2009 http://www.epa.gov/airtrends/factbook.html Air Quality Monitoring Information Updated December 17, 2010. Collected on 10/31/2011 at 6:20 pm.
Asthma Hospital Discharges - Rate per 10,000 Population, Total – Ten Year trend. Department of Health, NY State. 2007-2009 SPARCS Data as of October, 2010 Revised September 2011. http://www.health.ny.gov/statistics/ny_asthma/hosp/asthma6.htm Collected on 11/3/2011 at 9:36 am. • About Asthma Emergency Department Visit Data, Hospital Discharge Data and Deaths Department of Health, NY State. SPARCS Data as of October, 2010. Revised June 2009. http://www.health.ny.gov/statistics/ny_asthma/read.htm#2About Collected on 11/3/2011 at 10:20 am.
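The concentration-to-AQI conversion cited in the provenance list is a piecewise linear interpolation between breakpoints: I = (I_hi - I_lo) / (C_hi - C_lo) * (C - C_lo) + I_lo. A minimal Python sketch of that formula follows. The function name and the breakpoint values are hypothetical placeholders, not the EPA tables; real breakpoints would come from the EPA sources listed above.

```python
from typing import List, Tuple

# One breakpoint row: (conc_low, conc_high, aqi_low, aqi_high)
Breakpoint = Tuple[float, float, int, int]

# ILLUSTRATIVE values only (assumption): these are NOT real EPA breakpoints.
EXAMPLE_BREAKPOINTS: List[Breakpoint] = [
    (0.0, 10.0, 0, 50),
    (10.1, 20.0, 51, 100),
]


def concentration_to_aqi(conc: float, breakpoints: List[Breakpoint]) -> int:
    """Linearly interpolate a concentration onto the AQI scale."""
    for c_lo, c_hi, i_lo, i_hi in breakpoints:
        if c_lo <= conc <= c_hi:
            aqi = (i_hi - i_lo) / (c_hi - c_lo) * (conc - c_lo) + i_lo
            return round(aqi)
    # Values that fall between rows (or outside the table) are rejected here;
    # a real implementation would first truncate the concentration to the
    # precision used by the breakpoint table.
    raise ValueError(f"concentration {conc} is not covered by the breakpoints")
```

Passing the table as a parameter keeps the interpolation independent of any one pollutant, since each pollutant has its own breakpoints.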
Exploring LOGD air quality data • <http://logd.tw.rpi.edu/source/epa-gov/dataset/air-quality-system/version/Oct-27-2010> is NOT found among the list of graphs at http://logd.tw.rpi.edu/sparql – Surprise! • Query listing graphs with their estimated triple counts: PREFIX void: <http://rdfs.org/ns/void#> PREFIX conversion: <http://purl.org/twc/vocab/conversion/> SELECT ?g (SUM(?triples) AS ?estimated_triples) WHERE { GRAPH ?g { ?g void:subset ?subdataset . ?subdataset conversion:num_triples ?triples . } } GROUP BY ?g
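A query like the one on this slide can be sent to a SPARQL endpoint as an HTTP GET request. The sketch below only builds the request URL, using the Python standard library. The helper name is hypothetical, the query is written in standard SPARQL 1.1 aggregate syntax, and the `format` parameter is an assumption about the endpoint (Virtuoso accepts it; other endpoints may expect an `Accept` header instead).

```python
from urllib.parse import urlencode

# Standards-conforming form of the graph-listing query from the slide.
GRAPH_SIZE_QUERY = """\
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
SELECT ?g (SUM(?triples) AS ?estimated_triples)
WHERE {
  GRAPH ?g {
    ?g void:subset ?subdataset .
    ?subdataset conversion:num_triples ?triples .
  }
}
GROUP BY ?g
"""


def build_sparql_get_url(endpoint: str, query: str,
                         fmt: str = "application/sparql-results+json") -> str:
    """Return a GET URL carrying the query (hypothetical helper)."""
    # 'query' is the parameter name defined by the SPARQL protocol;
    # 'format' is an endpoint-specific convenience (assumption).
    return endpoint + "?" + urlencode({"query": query, "format": fmt})


url = build_sparql_get_url("http://logd.tw.rpi.edu/sparql", GRAPH_SIZE_QUERY)
```

The resulting `url` can be opened with any HTTP client; no request is made in the sketch itself.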
Converting NY Asthma Hospital Discharge Data • Raw data: http://www.health.ny.gov/statistics/ny_asthma/hosp/asthma6.htm • Level 1 data (directly copied from raw data): asthma-hospitality-discharges-2007-2009.xlsx • Level 2 data (keep only the columns Region/County and Adjusted Average Rate, add a FIPS code column, and delete regional/national total rows and the metadata block): asthma-hospitality-discharges-2007-2009.csv • Level 3 data: asthma-hospitality-discharges-2007-2009.csv.e1.ttl, converted with csv2rdf4lod using the enhancement parameters in asthma-hospitality-discharges-2007-2009.csv.e1.params.ttl – downloadable from http://tw.rpi.edu/web/Courses/SemanticeScience/2011/AirQuality
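The Level 2 cleanup described above (keep two columns, add a FIPS column, drop total rows) could be scripted roughly as follows. This is a sketch under stated assumptions, not the group's actual procedure: the sample rows, the rate values, the extra `Discharges` column, and the function name are all made up for illustration. Only the two kept column names come from the slide.

```python
import csv
import io

# Made-up sample standing in for the Level 1 export (assumption).
RAW_CSV = """Region/County,Discharges,Adjusted Average Rate
Albany,100,10.5
Bronx,200,50.1
New York State Total,300,20.0
"""

# County name -> 5-digit FIPS code; a full table would cover every NY county.
FIPS_BY_COUNTY = {"Albany": "36001", "Bronx": "36005"}


def to_level2(raw_csv: str, fips_by_county: dict) -> str:
    """Keep Region/County and Adjusted Average Rate, add FIPS, drop totals."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Region/County", "FIPS", "Adjusted Average Rate"])
    for row in reader:
        county = row["Region/County"].strip()
        # Rows without a county FIPS code (regional or state totals) are dropped.
        if county not in fips_by_county:
            continue
        writer.writerow([county, fips_by_county[county],
                         row["Adjusted Average Rate"]])
    return out.getvalue()


level2 = to_level2(RAW_CSV, FIPS_BY_COUNTY)
```

Using the FIPS lookup as the filter means total rows are removed without a separate list of row labels to exclude.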
Loading triples into Virtuoso store • ssh sparql.tw.rpi.edu • vload ttl <data_file> <graph_uri> • Thanks Ping Wang for loading the graph <http://sparql.tw.rpi.edu/air/asthma-hospitality-discharges-2007-2009> • Thanks Tim Lebo for helping Linyun load the graph <http://test/asthma> with the same data • Accessible at http://sparql.tw.rpi.edu/virtuoso/sparql
TODOs • Improve loaded data and reload them • Convert and load threshold data • Load upper-level ontology
Ontology Work • Two main ontologies in our project • Threshold ontology • Defines the threshold levels for each pollutant • Data obtained from CA Air Resources Board • Pollutant ontology • Provides data for the map/timeline • Data obtained from EPA CASTNET project • Restricted to NY State for now, to avoid data overload
Properties • Data Properties: • hasEPAValue - Type: Double • hasCAValue - Type: Double • Object Properties: • hasPollutant
Front End • Air quality represented on a choropleth map (by county) • Slider allows user to navigate through time • Asthma data plotted against AQ when user hovers over a county • Plot data within data extremes // against thresholds
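One way to choose a fill colour for each county on a choropleth is to bin its value against fixed thresholds. The sketch below uses the standard EPA AQI categories and colours as the bins. Whether the portal bins by AQI category is an assumption, and the function name is hypothetical.

```python
# (upper AQI bound, category name, colour) in ascending order.
AQI_CATEGORIES = [
    (50, "Good", "green"),
    (100, "Moderate", "yellow"),
    (150, "Unhealthy for Sensitive Groups", "orange"),
    (200, "Unhealthy", "red"),
    (300, "Very Unhealthy", "purple"),
    (500, "Hazardous", "maroon"),
]


def aqi_category(aqi: int):
    """Return (category, colour) for an AQI value (hypothetical helper)."""
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    for upper, name, colour in AQI_CATEGORIES:
        if aqi <= upper:
            return name, colour
    # Values above 500 are treated as Hazardous in this sketch.
    return "Hazardous", "maroon"
```

For example, a county with an AQI of 101 would be drawn in orange.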
Front End • Good: • PHP proxy to SPARQL endpoint (manage XSS & make HTTP POST) • Map technology: http://polymaps.org/ • To improve: • Build graph to plot AQ vs. asthma hospitalizations • Reduce the number of AJAX requests to the endpoint. Preload all data? • Clean up invalid LOGD JSON output (post?) • Deal with multipart FIPS codes // integers (should be numeric names)
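The FIPS issue in the last bullet typically arises because a county FIPS code is a 2-digit state code followed by a 3-digit county code, and storing it as an integer drops leading zeros. A minimal sketch of normalising such values to 5-character strings follows; the function names are hypothetical.

```python
def county_fips(state_code: int, county_code: int) -> str:
    """Join a state code and a county code into a 5-digit FIPS string."""
    return f"{int(state_code):02d}{int(county_code):03d}"


def normalize_fips(value) -> str:
    """Left-pad an integer or string FIPS code to 5 digits."""
    text = str(value).strip()
    if not text.isdigit() or len(text) > 5:
        raise ValueError(f"not a county FIPS code: {value!r}")
    return text.zfill(5)
```

Keeping the code as a string throughout avoids mismatches when joining data to map features keyed by FIPS.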
Future Work • Add more sources of data • Parse and display data for all the states
Thank You! Questions?