
Stefan Falke Center for Air Pollution Impact and Trend Analysis

Networked Environmental Information System for Global Emissions Inventories (NEISGEI). Applying 21st Century Advances in IT Technology to the Conversion of Air Quality Data into Scientific-, Management- and Policy-Relevant Knowledge. Stefan Falke


Presentation Transcript


  1. Networked Environmental Information System for Global Emissions Inventories (NEISGEI): Applying 21st Century Advances in IT Technology to the Conversion of Air Quality Data into Scientific-, Management- and Policy-Relevant Knowledge
     Stefan Falke, Center for Air Pollution Impact and Trend Analysis, Washington University, St. Louis, Missouri
     Brooke Hemming, US EPA/National Center for Environmental Assessment, Research Triangle Park, North Carolina

  2. NEISGEI: Networked Environmental Information System for Global Emissions Inventories
     • A conceptual framework for the development of a fully integrated, distributed emissions inventory
     • Tie together data at all spatial (and temporal) scales
     • Allow merging and manipulation of any and all web-based, along with your own, air quality-relevant data
     • Serve the entire air quality community: scientists, regulators, policy analysts and the public

  3. A tool for strategic planning of air quality environmental management capacity-building projects
     • A fully populated network will be a resource for identifying important missing datasets in regional, hemispheric and global scale studies

  4. Find/Discover Data
     1) Manager asks for data for a specific time range and location range: "I need summer 2002 air quality data for California."
     2) The mediator interprets the query and identifies the relevant datasets: "These data sets meet your criteria."
     [Diagram: User → Mediator (middleware; translates the request, consults metadata, and sends back the list of datasets) → AQ data from many distributed and heterogeneous sources]
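The find/discover step above can be sketched as a metadata filter. This is only an illustration, not the NEISGEI mediator: the `DatasetMetadata` record, the catalog entries, and the bounding box used for California are all hypothetical, and a real mediator would also translate the request into each source's own query language.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical catalog record; field names are illustrative only."""
    name: str
    start: date
    end: date
    # Bounding box as (west, south, east, north) in decimal degrees.
    bbox: tuple


def overlaps_time(meta, start, end):
    """True if the dataset's coverage period intersects [start, end]."""
    return meta.start <= end and start <= meta.end


def overlaps_bbox(meta, bbox):
    """True if the dataset's bounding box intersects the requested box."""
    west, south, east, north = bbox
    m_west, m_south, m_east, m_north = meta.bbox
    return m_west <= east and west <= m_east and m_south <= north and south <= m_north


def find_datasets(catalog, start, end, bbox):
    """Return the names of datasets whose metadata match the request."""
    return [
        meta.name
        for meta in catalog
        if overlaps_time(meta, start, end) and overlaps_bbox(meta, bbox)
    ]


# Made-up catalog entries for illustration.
CATALOG = [
    DatasetMetadata("dataset_1", date(2000, 1, 1), date(2003, 12, 31), (-125.0, 32.0, -114.0, 42.0)),
    DatasetMetadata("dataset_2", date(1995, 1, 1), date(1999, 12, 31), (-125.0, 32.0, -114.0, 42.0)),
    DatasetMetadata("dataset_3", date(2002, 1, 1), date(2002, 12, 31), (-180.0, -90.0, 180.0, 90.0)),
    DatasetMetadata("dataset_4", date(2002, 1, 1), date(2002, 12, 31), (-10.0, 35.0, 30.0, 60.0)),
]

# Rough, assumed bounding box for California.
CALIFORNIA_BBOX = (-124.5, 32.5, -114.1, 42.0)

# "I need summer 2002 air quality data for California."
matches = find_datasets(CATALOG, date(2002, 6, 1), date(2002, 8, 31), CALIFORNIA_BBOX)
print(matches)  # ['dataset_1', 'dataset_3']
```

Filtering on metadata alone means the mediator can answer the discovery question without touching the underlying data, which is what lets the sources stay distributed.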

  5. Access Data
     3) Manager selects the subset of interest: "Great! Data sets 1 & 3 are what I need. I would like to view that data."
     4) Wrappers translate the data into a standard format: "Here is the data in a format compatible with your software."
     [Diagram: User → Mediator (middleware; translates the request) → Wrapper (translates the request and the data) → dataset]
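A minimal sketch of the wrapper idea in the access step above, assuming two made-up source formats (a CSV file and a JSON feed) and a made-up standard record with the keys `site`, `time`, `parameter` and `value`. None of these names come from the presentation.

```python
import csv
import io
import json


def wrap_csv(text):
    """Wrapper for a hypothetical CSV source with columns site,date,pm25."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {"site": r["site"], "time": r["date"], "parameter": "PM2.5", "value": float(r["pm25"])}
        for r in rows
    ]


def wrap_json(text):
    """Wrapper for a hypothetical JSON source with its own field names."""
    records = json.loads(text)
    return [
        {"site": r["station"], "time": r["timestamp"], "parameter": r["species"], "value": float(r["conc"])}
        for r in records
    ]


# The mediator only needs to know which wrapper belongs to which source format.
WRAPPERS = {"csv": wrap_csv, "json": wrap_json}


def access(source_format, payload):
    """Dispatch to the matching wrapper and return standard-format records."""
    return WRAPPERS[source_format](payload)


csv_payload = "site,date,pm25\nA1,2002-07-01,12.5\n"
json_payload = '[{"station": "B7", "timestamp": "2002-07-01", "species": "PM2.5", "conc": 9.0}]'

standard = access("csv", csv_payload) + access("json", json_payload)
for record in standard:
    print(record)
```

Once every source is wrapped, downstream viewers and operators only have to understand the one standard record shape.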

  6. View Data
     5) Manager would like to see the data in maps and charts.
     [Diagram: User → Middleware → Data; MapView displays the data as a map viewed in the browser; TimeView displays the data as a time series]

  7. Map Fields in Data Sets
     6) Manager uses the data mapper to automatically create relationships among heterogeneous dataset field names.
     7) Using the generated field map, manager calculates the ratio between parameter x from data set 2 and parameter y from data set 3.
     [Diagram: mapper settings → Auto Data Mapper (maps fields from multiple datasets) → mapped data fields; ratio operator settings → Ratio Operator (calculates the ratio) → ratio 2:3 data set]
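The two steps above can be sketched as follows. The slide does not say how the data mapper works, so this example swaps in plain string similarity on normalized field names (Python's `difflib.SequenceMatcher`) as a stand-in. The field names, records, and the 0.6 threshold are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case a field name and drop punctuation and spaces."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def map_fields(fields_a, fields_b, threshold=0.6):
    """Pair each field in A with its most similar field in B, if similar enough."""
    mapping = {}
    for a in fields_a:
        best, best_score = None, 0.0
        for b in fields_b:
            score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
            if score > best_score:
                best, best_score = b, score
        if best is not None and best_score >= threshold:
            mapping[a] = best
    return mapping


def ratio(records_a, records_b, key_map, num_field, den_field):
    """Join A and B on the mapped key fields and divide num_field by den_field."""
    index = {tuple(r[k] for k in key_map.values()): r for r in records_b}
    results = []
    for r in records_a:
        other = index.get(tuple(r[k] for k in key_map))
        if other is None or other[den_field] == 0:
            continue  # no matching record, or division by zero
        row = {k: r[k] for k in key_map}
        row["ratio"] = r[num_field] / other[den_field]
        results.append(row)
    return results


# Hypothetical records for "data set 2" and "data set 3".
data_set_2 = [{"Site_ID": "A1", "Obs_Date": "2002-07-01", "x": 10.0}]
data_set_3 = [{"siteid": "A1", "obsdate": "2002-07-01", "y": 4.0}]

field_map = map_fields(["Site_ID", "Obs_Date", "x"], ["siteid", "obsdate", "y"])
print(field_map)  # {'Site_ID': 'siteid', 'Obs_Date': 'obsdate'}

ratios = ratio(data_set_2, data_set_3, field_map, "x", "y")
print(ratios)  # [{'Site_ID': 'A1', 'Obs_Date': '2002-07-01', 'ratio': 2.5}]
```

The measured parameters `x` and `y` are deliberately left unmapped: the field map supplies the join keys, and the manager names the two parameters whose ratio is wanted.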

  8. Create Summary Report
     8) Manager generates summary reports based on calculated values.
     [Diagram: 2:3 ratio data set and other data, together with report generator settings → Report Generator → report]

  9. NEISGEI: Networked Environmental Information System for Global Emissions Inventories
     • Network concept introduced at the NSF Digital Government Research Conference 2002
     • NSF/EPA Workshop
     • Issues identified:
       • Finding data
       • Integrating data
       • Quantifying data uncertainty
     • Resulting projects:
       • CAREN: A CARB-California AQMDs Network (NSF/CARB/EPA): data wrapping and integration of same-type data
       • Integrated North America Emissions Inventory (CEC)
       • Fire, Smoke and Air Quality Network (NSF/EPA/USDA): data wrapping and integration of heterogeneous data, plus tools to support environmental management

  10. CAREN: The California Air Resources Network
      Eduard Hovy, Jose-Luis Ambite, Andrew Philpot, USC Information Sciences Institute
      • Automating the integration of heterogeneous databases:
      • Government information should be timely, thorough, and accurate.
      • But government agencies often do not share data effectively with each other or the public.
      • Barrier: Technological incompatibilities
      • Barrier: Regulatory, organizational and financial barriers
      • Barrier: Fear of litigation due to inappropriate disclosure
      [Diagram labels: US EPA, States, AQMDs, Municipalities, Tribes, RPOs ???]

  11. Previous Work: Energy Data Collection (NSF Digital Government Program 1999-present)
      • Employ as reference the Omega ontology: 120,000-term general purpose concept hierarchy
      • Augment Omega with domain-specific metadata describing energy data series and source characteristics
      • Use artificial intelligence query planner to provide uniform access to relational and web-based information sources
      • Successful in incorporating 50,000 data series from six heterogeneous data sources from three different agencies, using semi-automated mappings
      • Significant manual effort required.
      • Conclusion: More general methods needed!
      [Diagram labels: BLS, CEC, EIA, EIA]

  12. Machine Translation-inspired Induction for Data Mapping
      Recent advances in Machine Translation (MT) have allowed the automatic induction of cross-natural-language correspondences from large multilingual corpora.
      Parallel French/English sentences (e.g., from Canadian Parliamentary Records):
        FR: Il y a un crayon jaune sur une grande chaise.
        EN: There is a yellow pencil on a big chair.
      Our system will use these techniques to learn cross-database correspondences, based on features such as:
      • declared or detected metadata: e.g., field names, database schema, table headers, footnotes
      • learned data patterns: e.g., domain, range, formats, orthography
      • topological relationships: e.g., foreign key/subset discovery
      • terminological reference via ontology or thesaurus
      Two databases denoting correlating records:
        DB1: Smith, John, 2000 High St, Columbus, OH
        DB2: Ohio, Franklin, Smith, 43201, 1108
      Emissions inventory databases from the municipal up to hemispheric scales will be integrated into the network automatically using this new technology.
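As a rough illustration of the "learned data patterns" feature listed above, the sketch below scores column pairs from two tables by the overlap of their values (Jaccard similarity). This is a much simpler stand-in for the MT-inspired induction the slide describes, not that method. The tables, column names, and the 0.5 threshold are invented for the example.

```python
def jaccard(values_a, values_b):
    """Share of distinct values that two columns have in common."""
    a, b = set(values_a), set(values_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def column_correspondences(table_a, table_b, threshold=0.5):
    """Return (column_a, column_b, score) for column pairs with similar contents."""
    pairs = []
    for name_a, column_a in table_a.items():
        for name_b, column_b in table_b.items():
            score = jaccard(column_a, column_b)
            if score >= threshold:
                pairs.append((name_a, name_b, score))
    # Highest-scoring candidates first.
    return sorted(pairs, key=lambda pair: -pair[2])


# Two hypothetical databases stored column-wise; the field names differ,
# but some columns hold the same kind of values.
db1 = {
    "last": ["Smith", "Jones", "Lee"],
    "first": ["John", "Ann", "Bo"],
    "state": ["OH", "CA", "OH"],
}
db2 = {
    "surname": ["Smith", "Lee", "Jones"],
    "zip": ["43201", "90001", "43202"],
    "st": ["OH", "CA", "CA"],
}

correspondences = column_correspondences(db1, db2)
for name_a, name_b, score in correspondences:
    print(f"{name_a} <-> {name_b} (score {score:.2f})")
```

Value overlap alone would miss the slide's own example, where one database stores "OH" and the other "Ohio"; bridging that gap is the role of the terminological reference (ontology or thesaurus) in the feature list.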

  13. Integrated N. American Emission Inventory
      Co-investigator: Greg Stella, Alpine Geophysics
      Air pollutant emission inventories for the US, Canada, and Mexico are compiled and stored using different methods.
      The Commission for Environmental Cooperation (CEC) and the US EPA are supporting a project to develop a prototype web tool for enabling uniform access to distributed emissions data from North American electricity generating power plants.
      • The prototype tool will help:
        • Assess data gaps
        • Identify future IT tools that can aid collaborative emissions inventory projects

  14. Fire, Smoke and Air Quality Network
      Co-investigator: Rudolf Husar, Washington U.
      The management of fire, smoke, and air quality is tasked to multiple agencies at federal, state, and local levels. The diversity in data collection methods, data reporting requirements, data formatting schemes, data analysis methods, and data presentation creates a daunting challenge for the integration of these data. However, integration of these heterogeneous datasets is precisely what is called for by federal and regional organizations in order to derive a more comprehensive understanding of fire, smoke, and air quality.
      The US Environmental Protection Agency and USDA-Forest Service are partnering agencies.

  15. Fire, Smoke and Air Quality Network
      The network will provide:
      • uniform access to and cataloging of distributed fire-related data and tools
      • easy-to-use interfaces for exploring fire-related resources
      • powerful tools that contribute to fire-related data analysis and modeling
      • a framework that encourages community-wide contributions
      The fire, smoke, and air quality network will consist of web-based data access and analysis facilities that are flexible and adaptive in meeting the diverse end use requirements of wildland and prescribed fire managers and air quality planners.

  16. Fire, Smoke, and Air Quality Network
      Integration of multiple sources of fire-related data aids in planning, management, and post-fire analysis.
      The map and time views are linked so that changing the focus in one automatically updates the other. For example, clicking on a PM2.5 monitor in the map displays the time series at that monitor.
      [Screenshot labels: Map View, Time View, Control Panel, Generic Map Server. Data sources shown: CIRA ColoState-VIEWS, European Space Agency, NASA SeaWiFS Project]

  17. Spatio-Temporal Data Browser
      Queries yield slices along the spatial, temporal and parameter dimensions of multidimensional data cubes.
      [Architecture diagram labels: Catalog, Wrapper, Mediator, Data Sources, Homogenizer, XDim Data, SQL Table, OLAP, GIS Data, Vector, Satellite Images, Data Cube, Spatial Slice, Time Slice, OGC-Compliant GIS Services, Spatial Portrayal, Spatial Overlay, Time-Series Services, Time Portrayal, Time Overlay, Cursor/Controller, Client Browser. Stages: Maintain Data, Find/Bind Data, Portray, Overlay, Render]
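The idea of slicing a data cube along its spatial, temporal and parameter dimensions can be sketched as below. The cube here is just a dictionary keyed by hypothetical `(site, time, parameter)` tuples with made-up values. It illustrates the concept and says nothing about how the browser actually stores its data.

```python
def slice_cube(cube, site=None, time=None, parameter=None):
    """Keep the cells whose coordinates match every dimension that is fixed.

    A dimension left as None is unconstrained, so fixing two of the three
    dimensions yields a one-dimensional slice along the remaining one.
    """
    wanted = (site, time, parameter)
    return {
        key: value
        for key, value in cube.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    }


# Hypothetical cube: (site, time, parameter) -> measured value.
cube = {
    ("A1", "2002-07-01", "PM2.5"): 12.5,
    ("A1", "2002-07-02", "PM2.5"): 14.0,
    ("B7", "2002-07-01", "PM2.5"): 9.0,
    ("B7", "2002-07-01", "O3"): 41.0,
}

# Time slice: one site and one parameter, all times (what a time view plots).
time_series = slice_cube(cube, site="A1", parameter="PM2.5")

# Spatial slice: one time and one parameter, all sites (what a map view plots).
spatial_slice = slice_cube(cube, time="2002-07-01", parameter="PM2.5")

print(time_series)
print(spatial_slice)
```

Treating the map and the time series as two slices of the same cube is one way to keep linked views consistent: moving a shared cursor changes the fixed coordinates, and both slices are recomputed from the same data.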

  18. What’s possible with NEISGEI?
      [Diagram label: Networked Assessment Env.]
      • In the news: “The Bush administration is to hold an Earth Observation Summit in Washington this summer to which it hopes the G8 group of industrialised countries will send cabinet-level representatives. The US is to urge the world's governments to set up an "integrated Earth observation system" to "take the pulse of the planet". It would combine satellite and ground-based observations of weather, climate, vegetation and other environmental indicators.” (Financial Times (London), Friday, Jun 27, 2003)
      • NEISGEI will make possible:
        • Simplified, multi-party, cross-border collaboration in air quality management
        • Simplified development of environmental indicators, with the inclusion of data available on other environmental media
