Statistical Performance of Dust ID Model, ’02-’04. Owens Lake Dust Mitigation Project. Bishop, September 28-29, 2005. Los Angeles Department of Water & Power, CH2M HILL, and Air Sciences.
Goals of Analysis: To evaluate the statistical performance of the Dust ID model, and to compare the Dust ID model performance with EPA air quality modeling guidelines.
Data: Evaluation based on the period July ’02 through June ’04. 75th percentile District k-factors. Averaging periods: 1-hour (basis of k-factor calculation); 24-hour (basis of compliance determination).
Approaches: Paired and unpaired comparisons; EPA air quality modeling guidelines; 1-hr vs. 24-hr performance.
Dirty Socks. Unpaired: Q-Q plot. Paired: scatter plot.
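The difference between the two comparison types can be sketched in a few lines. This is a minimal illustration, not the analysis behind the slides: the function names and the hourly values are made up. A paired comparison keeps each modeled value with the observation from the same hour, while an unpaired (Q-Q) comparison sorts each series independently and so discards timing.

```python
import numpy as np

def paired_points(observed, modeled):
    """Scatter-plot points: modeled value stays with the same-hour observation."""
    return np.column_stack([observed, modeled])

def unpaired_points(observed, modeled):
    """Q-Q points: each series is sorted on its own, so timing is discarded."""
    return np.column_stack([np.sort(observed), np.sort(modeled)])

# Made-up hourly PM10 values (hypothetical, for illustration only).
teom = np.array([20.0, 400.0, 35.0, 900.0])
model = np.array([850.0, 30.0, 410.0, 25.0])

paired = paired_points(teom, model)      # high model values sit next to low observations
unpaired = unpaired_points(teom, model)  # sorted values line up closely
```

In this made-up case the model has roughly the right distribution of values but places them in the wrong hours, so the Q-Q points fall near the 1:1 line while the scatter points do not.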
Dirty Socks. Results: High range†, R² ~ 0.13
Monitor        R² (1-hr)
Lone Pine      0.00
Keeler         0.02
Flat Rock      0.05
Shell Cut      0.18
Dirty Socks    0.13
Olancha        0.00
† Sum(TEOM, Model) > 150
Dirty Socks. Results: Low range†, R² ~ 0.03
Monitor        R² (1-hr)
Lone Pine      0.01
Keeler         0.02
Flat Rock      0.03
Shell Cut      0.02
Dirty Socks    0.02
Olancha        0.03
† TEOM ≤ 150
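A paired R² for the two ranges could be computed along the following lines. This is a sketch under stated assumptions: R² is taken to be the squared Pearson correlation of same-hour pairs (the slides do not say how it was computed), the range masks follow the two footnotes above, and the function names are hypothetical.

```python
import numpy as np

def r_squared(observed, modeled):
    """Squared Pearson correlation of paired values (assumed definition of R²)."""
    r = np.corrcoef(observed, modeled)[0, 1]
    return float(r ** 2)

def range_masks(observed, modeled, threshold=150.0):
    """Boolean masks following the slide footnotes.

    high: Sum(TEOM, Model) > threshold
    low:  TEOM <= threshold
    """
    high = (observed + modeled) > threshold
    low = observed <= threshold
    return high, low

# Made-up paired hourly values (hypothetical, for illustration only).
teom = np.array([10.0, 50.0, 200.0, 400.0, 30.0])
model = np.array([20.0, 300.0, 180.0, 500.0, 10.0])

high, low = range_masks(teom, model)
r2_high = r_squared(teom[high], model[high])
r2_low = r_squared(teom[low], model[low])
```

As written, the two masks are not mutually exclusive: an hour with a low observation and a high prediction satisfies both footnote conditions.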
Approaches: Paired and unpaired comparisons; EPA air quality modeling guidelines (three reports: 1984, 1991, 1992); 1-hr vs. 24-hr performance.
Concept: Accuracy vs. Precision. Accuracy evaluates how closely the model reproduces observations. Precision evaluates how exact and reproducible the model estimates are.
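One way to make the distinction concrete is to summarize the paired residuals (model minus observation) by their mean and their spread. This is an illustrative stand-in only: the slides do not state which statistics were used, and the function names and values below are hypothetical.

```python
import numpy as np

def mean_bias(observed, modeled):
    """Accuracy proxy: average of (model - observation)."""
    return float(np.mean(modeled - observed))

def residual_spread(observed, modeled):
    """Precision proxy: sample standard deviation of (model - observation)."""
    return float(np.std(modeled - observed, ddof=1))

# Made-up observations and two hypothetical models.
obs = np.array([100.0, 200.0, 300.0, 400.0])
model_a = obs + 50.0                                    # always 50 too high
model_b = obs + np.array([-100.0, 100.0, -100.0, 100.0])  # right on average, scattered

bias_a, spread_a = mean_bias(obs, model_a), residual_spread(obs, model_a)
bias_b, spread_b = mean_bias(obs, model_b), residual_spread(obs, model_b)
```

Here the first model is precise but inaccurate (constant offset, no scatter), and the second is accurate on average but imprecise.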
EPA report, 1984: “…definition of performance evaluation is to be accomplished in terms of their relevance to the regulatory application.” Interpretation: argues for the use of paired evaluation methods, since paired data are used to calculate k-factors. Source: “Interim Procedures for Evaluating Air Quality Models (Revised),” 1984, p. 29.
EPA report, 1991. Unpaired comparison (maximum): model fails criteria, overpredicts. Paired comparisons (TEOM > 100 µg m⁻³): model fails criteria, both in terms of accuracy and precision. Source: “Guideline for Regulatory Application of Urban Airshed Model,” 1991.
[Figure: axes ~ACCURACY and ~PRECISION; regions labeled Under prediction and Over prediction]
EPA report, 1992. High range (Top-25): performance marginal. Low range (TEOM < 500 µg m⁻³): model fails criteria. Source: “Protocol for Determining the Best Performing Model,” 1992.
Maximum PM10: 1-hr Model ~108,000; 1-hr TEOM ~49,000; 24-hr Model ~21,000; 24-hr TEOM ~11,000. [Figure: axes ~ACCURACY and ~PRECISION; regions labeled Under prediction and Over prediction]
Maximum PM10: 1-hr Model ~77,000; 1-hr TEOM ~500; 24-hr Model ~5,700; 24-hr TEOM ~500. [Figure: axes ~ACCURACY and ~PRECISION; regions labeled Under prediction and Over prediction]
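A factor-of-two criterion is often expressed through fractional bias. The sketch below assumes the common definition FB = 2(O − P)/(O + P), under which a prediction of exactly twice the observation gives FB = −2/3; the slides do not spell out the test statistic, so this is an illustration rather than the procedure actually applied. It uses only the approximate maxima quoted on the first Maximum PM10 slide, not the full set of values an evaluation would use.

```python
def fractional_bias(observed, predicted):
    """FB = 2(O - P)/(O + P); negative values mean overprediction (assumed convention)."""
    return 2.0 * (observed - predicted) / (observed + predicted)

# |FB| <= 2/3 corresponds to prediction and observation within a factor of two.
FACTOR_OF_TWO_LIMIT = 2.0 / 3.0

# Approximate maxima from the slide: (TEOM, Model).
maxima = {
    "1-hr": (49_000.0, 108_000.0),
    "24-hr": (11_000.0, 21_000.0),
}

for label, (obs, pred) in maxima.items():
    fb = fractional_bias(obs, pred)
    within = abs(fb) <= FACTOR_OF_TWO_LIMIT
    print(f"{label}: FB = {fb:.3f}, within factor of two: {within}")
```

On these maxima alone, the 1-hour pair falls just outside a factor of two (ratio about 2.2) and the 24-hour pair just inside (ratio about 1.9).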
Approaches: Paired and unpaired comparisons; Exceedances of 24-hour standard; 1-hr vs. 24-hr performance.
Does 1-Hour Performance Matter? Yes, because erroneously high 1-hour values can trigger an exceedance of the 24-hour standard. This is not apparent from the EPA evaluation guidelines. The factor-of-two test shows that Dirty Socks (DS) is marginal, yet that result is composed of many “problem” 24-hour values.
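The arithmetic behind this point is simple: a 24-hour value is the mean of 24 hourly values, so a single hour above 24 × 150 = 3,600 pushes the daily mean over 150 even if every other hour is zero. The sketch below uses made-up hourly values and a hypothetical helper name.

```python
STANDARD_24H = 150.0  # 24-hour PM10 threshold used on the slides

def daily_mean(hourly):
    """Mean of exactly 24 hourly values."""
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly values")
    return sum(hourly) / 24.0

# Made-up day with low concentrations in every hour.
clean_day = [20.0] * 24

# Same day with one erroneously high modeled hour (hypothetical value).
spiked_day = clean_day.copy()
spiked_day[14] = 5000.0

clean_mean = daily_mean(clean_day)    # 20.0
spiked_mean = daily_mean(spiked_day)  # (23 * 20 + 5000) / 24 = 227.5
```

One bad hour is enough to turn a 20 µg m⁻³ day into a predicted exceedance.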
[Figure annotations: Mischaracterized Source Areas; ?; Missed Source Areas]
Dirty Socks. Results: 24-hr standard. Predicted days > 150 µg m⁻³ when observed ≤ 150: Dirty Socks, 34% in error (17 out of 50); all TEOMs, 36% in error (50 out of 137). [Figure labels: N=17, N=33]
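The percentages above follow from counting predicted exceedance days and the subset on which the observation stayed at or below the standard. A minimal sketch of that count, with a hypothetical function name and made-up daily values, is:

```python
def false_exceedance_counts(observed_24h, modeled_24h, standard=150.0):
    """Return (false exceedances, predicted exceedances) for paired daily values.

    A predicted exceedance is a day with model > standard; it is counted as
    false when the observation on that day is <= standard.
    """
    predicted = [(o, m) for o, m in zip(observed_24h, modeled_24h) if m > standard]
    false = [pair for pair in predicted if pair[0] <= standard]
    return len(false), len(predicted)

# Made-up paired 24-hour values (hypothetical, for illustration only).
teom_daily = [40.0, 300.0, 120.0, 90.0, 500.0]
model_daily = [200.0, 450.0, 80.0, 160.0, 700.0]

n_false, n_predicted = false_exceedance_counts(teom_daily, model_daily)

# Checking the slide's arithmetic from its stated counts.
dirty_socks_pct = 100.0 * 17 / 50   # 34.0
all_teoms_pct = 100.0 * 50 / 137    # about 36.5, shown as 36% on the slide
```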
Can Model Refinement Improve Model Performance? Yes, by eliminating some of the outliers, both high predicted/low observed and low predicted/high observed.
Final Quotes. EPA, 1992: “…a factor-of-two is a reasonable target for model performance that should be achieved before a model is used in refined regulatory analysis.” EPA, 1991: “…poor model performance may necessitate delaying model applications until further diagnostic testing and quality assurance checks are performed.”