
Model Performance Metrics, Ambient Data Sets and Evaluation Tools


Presentation Transcript


  1. Model Performance Metrics, Ambient Data Sets and Evaluation Tools Gail Tonnesen, Chao-Jung Chien, Bo Wang Youjun Qin, Zion Wang, Tiegang Cao USEPA PM Model Evaluation Workshop, RTP, NC February 9-10, 2004

  2. Acknowledgments • Funding from the Western Regional Air Partnership (WRAP) Modeling Forum and VISTAS. • Assistance from EPA and others in gaining access to ambient data. • 12 km plots and analysis from Jim Boylan of the State of Georgia.

  3. Outline • UCR Model Evaluation Software • Problems we had to solve • Choice of metrics for clean conditions • Judging performance for high-resolution nested domains

  4. Motivation • Needed to evaluate model performance for WRAP annual regional haze modeling: • Required a very large number of sites and days • For several different ambient monitoring networks • Evaluation would be repeated many times: • Many iterations on the “base case” • Several model sensitivity/diagnostic cases to evaluate • Limited time and resources were available to complete the evaluation.

  5. Solution • Develop model evaluation software to: • Compute 17 statistical metrics for model evaluation. • Generate graphical plots in a variety of formats: • Scatter plots: all sites for one month, all sites for the full year, one site for all days, one day for all sites • Time series for each site
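
The evaluation software itself is Java-based (see slides 14 and 15); purely to illustrate the plot types listed above, here is a minimal Python sketch that draws an observed-vs.-modeled scatter plot and a single-site time series from paired records. The column names ("site", "date", "obs", "mod") are assumptions for this sketch, not the schema of the UCR tool.

```python
# Illustrative only: a minimal sketch of the scatter and time-series plots
# described above; column names (site, date, obs, mod) are assumed.
import pandas as pd
import matplotlib.pyplot as plt

def scatter_plot(pairs: pd.DataFrame, species: str, outfile: str) -> None:
    """Observed vs. modeled scatter for all sites and days."""
    fig, ax = plt.subplots()
    ax.scatter(pairs["obs"], pairs["mod"], s=10, alpha=0.5)
    lim = max(pairs["obs"].max(), pairs["mod"].max())
    ax.plot([0, lim], [0, lim], "k--", linewidth=1)  # 1:1 reference line
    ax.set_xlabel(f"Observed {species}")
    ax.set_ylabel(f"Modeled {species}")
    fig.savefig(outfile)

def time_series_plot(pairs: pd.DataFrame, site: str, outfile: str) -> None:
    """Observed and modeled time series at one monitoring site."""
    one = pairs[pairs["site"] == site].sort_values("date")
    fig, ax = plt.subplots()
    ax.plot(one["date"], one["obs"], label="Observed")
    ax.plot(one["date"], one["mod"], label="Modeled")
    ax.set_title(site)
    ax.legend()
    fig.savefig(outfile)
```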

  6. Ambient Monitoring Networks • IMPROVE (Interagency Monitoring of Protected Visual Environments) • CASTNET (Clean Air Status and Trends Network) • EPA’s AQS (Air Quality System) database • EPA’s STN (Speciation Trends Network) • NADP (National Atmospheric Deposition Program) • SEARCH daily & hourly data • PAMS (Photochemical Assessment Monitoring Stations) • PM Supersites

  7. Number of Sites Evaluated by Network

  8. Overlap Among Monitoring Networks • Diagram of the species measured by AQS (AIRS), PAMS, EPA PM sites, IMPROVE, CASTNet, and other monitoring stations from state and local agencies, including O3, NOx, VOCs, CO, Pb, SO2, HNO3, NO3, SO4, PM25, PM10, speciated PM25, and visibility.

  9. Species Mapping • Specify how to compare the model with data for each network. • Unique species mapping for each air quality model.

  10. Model vs. Obs. Species Mapping Table
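
The mapping table itself is not reproduced in this transcript. As a sketch of the idea on slides 9 and 10, the snippet below pairs each observed species from a network with the model species that are summed before comparison; the CMAQ and CAMx aerosol species names shown (ASO4I/ASO4J, ANO3I/ANO3J, PSO4, PNO3) are typical examples chosen for illustration, not entries copied from the actual table.

```python
# Illustrative species mapping: each (network, observed species) pair maps to
# the list of model species summed before comparison. Entries are examples
# only; the actual mapping table from the slide is not reproduced here.
SPECIES_MAP = {
    ("IMPROVE", "SO4"): {"CMAQ": ["ASO4I", "ASO4J"], "CAMx": ["PSO4"]},
    ("IMPROVE", "NO3"): {"CMAQ": ["ANO3I", "ANO3J"], "CAMx": ["PNO3"]},
    ("CASTNET", "HNO3"): {"CMAQ": ["HNO3"], "CAMx": ["HNO3"]},
}

def model_value(model_name, network, obs_species, concentrations):
    """Sum the model species mapped to one observed species.

    `concentrations` maps model species name -> value for one grid cell
    and time step.
    """
    model_species = SPECIES_MAP[(network, obs_species)][model_name]
    return sum(concentrations[s] for s in model_species)
```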

  11. Recommended Performance Metrics? • No EPA guidance available for PM. • Everyone has their personal favorite metric. • Several metrics are non-symmetric about zero, causing over-predictions to be exaggerated compared to under-predictions. • Is the coefficient of determination (R2) a useful metric?

  12. Statistical measures used in model performance evaluation

  13. Most Used Metrics • Mean Normalized Bias (MNB): from -100% to +infinity. • Normalized Mean Bias (NMB): from -100% to +infinity. • Fractional Bias (FB): from -200% to +200%. • Fractional Error (FE): from 0% to +200%. • Bias Factor (Knipping ratio) is MNB + 1, reported as a ratio, for example 4:1 for over-prediction or 1:4 for under-prediction.
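
For reference, these metrics are commonly defined as follows for paired modeled and observed values; the Python sketch below implements these standard definitions and is illustrative only (the UCR tools themselves are written in Java).

```python
# Standard definitions of the bias/error metrics listed above, for paired
# modeled (m) and observed (o) values. Illustrative sketch, not the UCR code.
import numpy as np

def mnb(m, o):
    """Mean Normalized Bias (%): mean((m - o) / o); range -100% to +inf."""
    m, o = np.asarray(m, float), np.asarray(o, float)
    return 100.0 * np.mean((m - o) / o)

def nmb(m, o):
    """Normalized Mean Bias (%): sum(m - o) / sum(o); range -100% to +inf."""
    m, o = np.asarray(m, float), np.asarray(o, float)
    return 100.0 * (m - o).sum() / o.sum()

def fb(m, o):
    """Fractional Bias (%): mean(2 (m - o) / (m + o)); range -200% to +200%."""
    m, o = np.asarray(m, float), np.asarray(o, float)
    return 100.0 * np.mean(2.0 * (m - o) / (m + o))

def fe(m, o):
    """Fractional Error (%): mean(2 |m - o| / (m + o)); range 0% to +200%."""
    m, o = np.asarray(m, float), np.asarray(o, float)
    return 100.0 * np.mean(2.0 * np.abs(m - o) / (m + o))

def bias_factor(m, o):
    """Bias factor (Knipping ratio): MNB + 1, expressed as a ratio."""
    return mnb(m, o) / 100.0 + 1.0
```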

  14. UCR Java-based AQM Evaluation Tools

  15. UCR Java-based AQM Evaluation Tools

  16. SAPRC99 vs. CB4 NO3; IMPROVE cross comparisons

  17. SAPRC99 vs. CB4 SO4; IMPROVE cross comparisons

  18. Time series plot for CMAQ vs. CAMx at SEARCH site – JST (Jefferson St.)

  19. [Summary table of performance statistics; table values not captured in this transcript.] Footnotes: 1 With 60 ppb ambient cutoff. 2 Using 3 × elemental sulfur. 3 No data available in the WRAP domain. 4 Measurements available at 3 sites.

  20. Viewing Spatial Patterns • Problem: model performance metrics and time-series plots do not identify cases where the model is “off by one grid cell”. • Process ambient data into the I/O API format so that the data can be compared to the model using PAVE.

  21. IMPROVE SO4, Jan 5

  22. IMPROVE SO4, June 10

  23. IMPROVE NO3, Jan 5

  24. IMPROVE NO3, July 1

  25. IMPROVE SOA, Jan 5

  26. IMPROVE SOA, June 25

  27. Spatially Weighted Metrics • PAVE plots qualitatively indicate error relative to spatial patterns, but do we also need to quantify this? • A wind error of 30 degrees can cause the model to miss a peak by one or more grid cells. • Interpolate the model using surrounding grid cells? • Use the average of adjacent grid cells? Within what distance?
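
As one possible way to quantify this (the slide poses it as an open question, so this is only a sketch of the "average of adjacent grid cells" idea, not the method actually adopted): compare each observation against the mean, or the closest value, within a small window of grid cells around the monitor.

```python
# Sketch of a neighborhood-tolerant comparison. `model` is a 2-D array of one
# species field; (row, col) is the grid cell containing the monitor; radius=1
# gives a 3x3 window of adjacent cells.
import numpy as np

def neighborhood_values(model, row, col, radius=1):
    """Model values in a (2*radius+1)^2 window around a grid cell."""
    r0, r1 = max(row - radius, 0), min(row + radius + 1, model.shape[0])
    c0, c1 = max(col - radius, 0), min(col + radius + 1, model.shape[1])
    return model[r0:r1, c0:c1].ravel()

def neighborhood_mean(model, row, col, radius=1):
    """Average of the cell and its adjacent cells (one option from the slide)."""
    return float(neighborhood_values(model, row, col, radius).mean())

def best_match(model, row, col, obs, radius=1):
    """Model value in the window closest to the observation (an alternative)."""
    vals = neighborhood_values(model, row, col, radius)
    return float(vals[np.argmin(np.abs(vals - obs))])
```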

  28. Judging Model Performance • Many plots and metrics, but what is the bottom line? • Need to stratify the data for model evaluation: • Evaluate seasonal performance. • Group by related types of sites. • Judge the model for each site or for similar groups of sites. • How best to group or stratify sites? • Want to avoid wasting time analyzing plots and metrics that are not useful.
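
A minimal sketch of the stratification idea, assuming paired records carry a date and a site-group label (column names "date", "site_group", "obs", and "mod" are assumptions, and the meteorological-season grouping shown is just one possible choice):

```python
# Sketch of stratifying paired model/obs records by season and site group
# before computing a metric (here NMB); column names are assumed.
import pandas as pd

SEASONS = {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
           6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"}

def stratified_nmb(pairs: pd.DataFrame) -> pd.DataFrame:
    """Normalized Mean Bias (%) by season and site group."""
    pairs = pairs.copy()
    pairs["season"] = pd.to_datetime(pairs["date"]).dt.month.map(SEASONS)
    grouped = pairs.groupby(["season", "site_group"])
    nmb = grouped.apply(lambda g: 100.0 * (g["mod"] - g["obs"]).sum() / g["obs"].sum())
    return nmb.rename("NMB_percent").reset_index()
```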

  29. 12km vs. 36km, Winter SO4

  30. 12km vs. 36km, Winter NO3

  31. Recommended Evaluation for Nests • Comparing performance metrics is not enough: • Performance metrics show a mixed response. • It is possible for the better model to have poorer metrics. • Diagnostic analysis is needed to compare the nested-grid model to the coarse-grid model.

  32. Example Diagnostic Analysis • Some sites had worse metrics for the 12 km grid. • Analysis by Jim Boylan comparing differences in the 12 km and 36 km results showed major effects from: • Regional precipitation • Regional transport (wind speed & direction) • Plume definition

  33. Sulfate Change (36 km – 12 km)

  34. Wet Sulfate on July 9 at 01:00 36 km Grid 12 km Grid

  35. Regional Transport (Wind Speed)

  36. Sulfate on July 9 at 05:00 36 km Grid 12 km Grid

  37. Sulfate on July 9 at 06:00 36 km Grid 12 km Grid

  38. Sulfate on July 9 at 07:00 36 km Grid 12 km Grid

  39. Sulfate on July 9 at 08:00 36 km Grid 12 km Grid

  40. Plume Definition and Artificial Diffusion

  41. Sulfate on July 10 at 00:00 36 km Grid 12 km Grid

  42. Sulfate on July 10 at 06:00 36 km Grid 12 km Grid

  43. Sulfate on July 10 at 09:00 36 km Grid 12 km Grid

  44. Sulfate on July 10 at 12:00 36 km Grid 12 km Grid

  45. Sulfate on July 10 at 16:00 36 km Grid 12 km Grid

  46. Sulfate on July 10 at 21:00 36 km Grid 12 km Grid

  47. Sulfate on July 11 at 00:00 36 km Grid 12 km Grid

  48. Sulfate Change (36 km – 12 km)
