Before the Questionnaire… • Feb 2011: DELTA Tool available to all SG4 members, along with the document “Content and User’s Guide” • May 2011: DELTA Tool downloaded by 23 experts from 13 Member States • May 2011: A questionnaire sent to all SG4 members
Aim of the questionnaire • To collect users’ opinions, mainly on: • The Template for Reporting Model Performance • The appropriateness of the statistical indicators and diagrams • The installation and usage of the DELTA Tool • In order to: • Highlight important points needing discussion and agreement • Identify weaknesses in DELTA and ways to improve it
Feedback received • June 2011: Feedback by 10 experts from 9 countries (AT, BE, IT, NL, PT, UK, SE, IE, DK) • Models: CAMx, CHIMERE, FRAM, MATCH, AURORA, BELEUROS, ADMS, OVL, SMOGSTOP, OSPM • Short presentations (max 10 min) by: • Helena Martins (PT) • David Carruthers (UK) • Stefan Andersson (SE) • Helge Olesen (DK) • Mihaela Mircea, Guido Pirovano (IT)
Template for Reporting • In general: “The current format is fine, clear, summarizing the essential information about the model’s performance” • “Keep the template/format as short as possible (1 page)”
Template for Reporting • Points of discussion: • What should performance criteria and goals depend upon ? • How to select the monitoring stations (representativeness) ? • Are the statistical indicators complete ? Are some of them redundant ? • Normalisation of the Target indicator
Template for Reporting • Points of discussion: • Is the 90% concept for the statistical indicators acceptable? • Are we excluding some types of models with the Target template? Are additional Templates (e.g. annual averages) required, and what should they include? • How to make the Template more readable (colors, titles, legend)?
The Tool in General • Benchmarking – not only the model but the entire system (including input data) ? • Extend the exploration mode options • Keep it restricted to the FAIRMODE community?
Points for discussion - 1a. 1. What should performance criteria & goals depend upon ? • Pollutant specific - YES • Scale specific -? • Should criteria for local scale be less stringent ?
Points for discussion - 1b. • Time averaging specific? Foreseen, and linked to the averaging of the limit values. • Geographically dependent? We propose not for criteria, but for goals? • Seasonally dependent? We propose not. • Agreement on setting/updating performance criteria & goals through joint exercises
Points for discussion - 2. 2. How to select the monitoring stations? • How to define station representativeness? • How to select in case of a limited number of stations? • How to select stations in case of data assimilation? • To be discussed by SG1 & SG4 • SG1 uploaded a document, based on replies to the “request of information”
Points for discussion - 2. 2.SG1 suggestions (…now discussing) • SG1 suggests three quantifiable descriptions of spatial representativeness, all of which will depend on the temporal scale required, e.g. hourly or annual means: • The area (distance) surrounding a monitoring site in which the concentration does not vary by more than a predefined value. • A correlation distance, similar to the ‘range’ used in variograms (suitable for data assimilation methods) • The variability of the concentration in a predefined area surrounding a monitoring site. The required area is likely to correspond to the model resolution. Bruce Denby Discussion Document SG1 http://fairmode.ew.eea.europa.eu/monitoring-modelling-sg1
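The first and third SG1 descriptions can be illustrated with a small sketch. This is illustrative only: the function names, the use of an absolute tolerance, and the sample values are assumptions, not taken from the SG1 document or from DELTA.

```python
import statistics

def representativeness_radius(site_value, neighbours, tolerance):
    """Largest distance within which every sampled concentration stays
    within `tolerance` of the value at the monitoring site.

    neighbours: list of (distance_km, concentration) pairs (assumed input).
    """
    radius = 0.0
    for distance, concentration in sorted(neighbours):
        if abs(concentration - site_value) > tolerance:
            break  # the first point that varies too much ends the area
        radius = distance
    return radius

def area_variability(neighbours, max_distance):
    """Standard deviation of the concentrations sampled within
    max_distance of the site (the 'predefined area')."""
    values = [c for d, c in neighbours if d <= max_distance]
    return statistics.pstdev(values)

# Hypothetical samples around a site that measures 20.0
samples = [(1.0, 21.0), (2.0, 22.0), (3.0, 27.0), (4.0, 21.0)]
radius = representativeness_radius(20.0, samples, tolerance=5.0)
spread = area_variability(samples, max_distance=2.0)
```

Both quantities would have to be computed separately for each temporal scale (e.g. hourly or annual means), as the SG1 text notes.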
Points for discussion - 2. 2.How to select the stations in case of data assimilation ? (SG1 – suggestions) • Cross-validation using ‘leave one out’ or other sampling methods (effective for kriging type applications, time consuming for 4Dvar or Kalman filters) • Splitting the dataset into an assimilation set and a validation set • Complication for urban and local scale applications: sufficient numbers of monitoring stations available to create a validation subset may not exist Bruce Denby Discussion Document SG1 http://fairmode.ew.eea.europa.eu/monitoring-modelling-sg1
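The two validation strategies can be sketched as follows. Inverse-distance weighting is used here only as a simple stand-in for the assimilation step (it is not kriging, 4Dvar or a Kalman filter), and all names, coordinates and values are hypothetical.

```python
import math
import random

def idw_estimate(target_xy, stations, power=2.0):
    """Inverse-distance-weighted estimate from (x, y, value) tuples."""
    num = 0.0
    den = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            return value  # exactly on a station: use its value
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den

def leave_one_out_errors(stations):
    """Withhold each station in turn and estimate it from the others."""
    errors = []
    for i, (x, y, observed) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        errors.append(idw_estimate((x, y), others) - observed)
    return errors

def split_stations(stations, validation_fraction=0.25, seed=0):
    """Random split into an assimilation set and a validation set."""
    shuffled = stations[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]

stations = [(0.0, 0.0, 20.0), (1.0, 0.0, 24.0), (0.0, 1.0, 22.0), (1.0, 1.0, 30.0)]
loo_errors = leave_one_out_errors(stations)
assimilation_set, validation_set = split_stations(stations)
```

With only four stations the validation subset holds a single site, which illustrates the complication noted above for urban and local scale applications.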
3. Are the statistical indicators complete ? • “Keeping the same statistical indicators and diagrams for all scales and pollutants is good” (most answers). • BUT • 1. Some redundancy ? • ( e.g. MFE, IOA) • 2. Additional indicator – SigM/SigO • 3. Use of median, percentiles (5 & 95) • 4. Adding all stations/statistics in Exploration mode ?
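To make the indicators named in this discussion concrete, here is a minimal sketch of MFE, IOA and SigM/SigO. It assumes the commonly used definitions (mean fractional error, Willmott's index of agreement, and the ratio of modelled to observed standard deviation); the exact formulas implemented in DELTA may differ, and the data are made up.

```python
import statistics

def mean_fractional_error(mod, obs):
    """MFE = mean of 2|M - O| / (M + O); assumes M + O > 0 for every pair."""
    return sum(2.0 * abs(m - o) / (m + o) for m, o in zip(mod, obs)) / len(obs)

def index_of_agreement(mod, obs):
    """Willmott's IOA = 1 - sum((M-O)^2) / sum((|M-Obar| + |O-Obar|)^2)."""
    obs_mean = statistics.fmean(obs)
    num = sum((m - o) ** 2 for m, o in zip(mod, obs))
    den = sum((abs(m - obs_mean) + abs(o - obs_mean)) ** 2
              for m, o in zip(mod, obs))
    return 1.0 - num / den

def sigma_ratio(mod, obs):
    """SigM/SigO: modelled over observed (population) standard deviation."""
    return statistics.pstdev(mod) / statistics.pstdev(obs)

# Hypothetical paired values for one station
obs = [10.0, 20.0, 30.0, 40.0]
mod = [12.0, 18.0, 33.0, 41.0]
mfe = mean_fractional_error(mod, obs)
ioa = index_of_agreement(mod, obs)
sig = sigma_ratio(mod, obs)
```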
3. Are the statistical indicators complete? 5. Should we have indicators showing how far simulated limit/threshold values are from the AQD requirements?
3. Are the statistical indicators complete? Additional statistical indicators suggested in the replies • MAFE = Mean Absolute Factor Error • Observed and computed mean • Observed and computed quartiles • MB = Mean Bias • RMSE = Root Mean Square Error • Sigma ratio • PPEA = Pair Peak Estimation Accuracy • ASPEA = Average Station Peak Estimation Accuracy • AOT40 • SOMO35 • Hit rate
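A few of the simpler suggestions (means, quartiles, MB, RMSE) can be computed as in the sketch below. It assumes the usual textbook definitions, with bias taken as model minus observation; the function name and the data are hypothetical, and this is not DELTA's implementation.

```python
import math
import statistics

def basic_statistics(mod, obs):
    """Return means, quartiles, mean bias and RMSE for paired series."""
    diffs = [m - o for m, o in zip(mod, obs)]
    return {
        "obs_mean": statistics.fmean(obs),
        "mod_mean": statistics.fmean(mod),
        # Three cut points (Q1, median, Q3), default 'exclusive' method
        "obs_quartiles": statistics.quantiles(obs, n=4),
        "mod_quartiles": statistics.quantiles(mod, n=4),
        "MB": statistics.fmean(diffs),
        "RMSE": math.sqrt(statistics.fmean(d * d for d in diffs)),
    }

# Hypothetical paired values for one station
obs = [10.0, 20.0, 30.0, 40.0]
mod = [12.0, 18.0, 33.0, 41.0]
stats = basic_statistics(mod, obs)
```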
3. Are the statistical indicators complete? Table with more statistics (without benchmarks) foreseen in Exploration mode
4. The 90% concept • In DELTA, criteria for the statistical indicators apply to 90% of the valid stations (consistent with RDE/RPE) • From the replies: “90% is a good choice but DELTA should include the directions regarding the minimum number of stations for model evaluation” • To be discussed in the SG1 & SG4 session
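The 90% rule can be sketched as a simple check. The sketch assumes an indicator for which lower values are better, a hypothetical criterion threshold, and a hypothetical `min_stations` parameter standing in for the minimum station count requested in the replies (no value for it has been agreed).

```python
def criterion_fulfilled(station_values, threshold,
                        required_percent=90, min_stations=1):
    """True if at least required_percent of the valid stations meet the
    criterion (value <= threshold). Returns None when there are too few
    valid stations to decide. None entries mark invalid stations."""
    valid = [v for v in station_values if v is not None]
    if len(valid) < min_stations:
        return None
    passed = sum(1 for v in valid if v <= threshold)
    # Integer arithmetic avoids floating-point rounding at the boundary
    return passed * 100 >= required_percent * len(valid)

# Hypothetical per-station indicator values; None = station not valid
values = [0.4, 0.6, 0.7, 0.5, 0.9, 0.8, 0.3, 0.6, 0.7, 1.4, None]
ok = criterion_fulfilled(values, threshold=1.0)            # 9 of 10 pass
too_few = criterion_fulfilled(values, 1.0, min_stations=20)
```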
5. Additional Templates? • 1. How to adapt the Template for annual averaged limit values (PM10) if: • hourly data are available (now in DELTA) • only one value for the yearly average is available (e.g. OVL model) • 2. What to do with Exceedances, AOT40, SOMO35?
6. Make the Template more readable? • Details on the TARGET diagram, to be agreed: • Colored symbols currently denote stations; would it be more useful for them to show another statistical indicator, e.g. R? • Replace the systematic/unsystematic division along the x-axis by another indicator (e.g. R < 0.65 and R > 0.65)? • Make the Title application specific, highlighting the pollutant and the goal of the model evaluation • Modify the Legend to include the Target criteria and goal
Installing and Using the Tool • Installation: generally no problems (WinVista, Win7, Linux (Ubuntu)) • Instructions should be updated and should use names consistently • Draw attention to the utility programs • Other open source programs should be considered for the utility programs
Using the Tool – Data formats • Introduce one and the same format for monitored and observed data? • Allow a wider range of formats to be accommodated • Leap year treatment (currently not possible)
Running DELTA Suggested improvements - 1 • Make error messages more useful for debugging • Display values of statistical indicators in connection with the diagram (e.g. R and the regression line formula in the scatter plot) • Improve titles and legends of the graphs; always display the name of the species • Lat min/lat max and lon min/lon max bounds could be added as a station selection criterion; the same could be done for station altitude • Allow identification of the stations on the geo-map • Add box-whisker plots to compare mod vs. obs distributions (similar to quantile-quantile plots)
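The suggested latitude/longitude/altitude selection amounts to a bounding-box filter. The sketch below is a hypothetical illustration: the station records, field names and function are assumptions and do not reflect DELTA's actual data structures.

```python
def select_stations(stations, lat_bounds, lon_bounds, alt_bounds=None):
    """Keep stations inside the lat/lon box and, optionally, inside an
    altitude range. Bounds are inclusive (min, max) tuples."""
    lat_min, lat_max = lat_bounds
    lon_min, lon_max = lon_bounds
    selected = []
    for station in stations:
        if not lat_min <= station["lat"] <= lat_max:
            continue
        if not lon_min <= station["lon"] <= lon_max:
            continue
        if alt_bounds is not None:
            alt_min, alt_max = alt_bounds
            if not alt_min <= station["alt"] <= alt_max:
                continue
        selected.append(station)
    return selected

# Hypothetical station metadata (degrees, metres above sea level)
stations = [
    {"name": "A", "lat": 45.1, "lon": 9.2, "alt": 120.0},
    {"name": "B", "lat": 46.5, "lon": 11.3, "alt": 1450.0},
    {"name": "C", "lat": 52.0, "lon": 4.9, "alt": 2.0},
]
in_box = select_stations(stations, (44.0, 47.0), (8.0, 12.0))
lowland = select_stations(stations, (44.0, 47.0), (8.0, 12.0), (0.0, 500.0))
```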
Running DELTA Suggested improvements - 2 • Box-whisker plots could also be used to plot the distribution of different indicators (e.g. FB, FE, R, etc.) for the same scenario, or of the same indicator for different scenarios (e.g. PM10 FB for different simulations) • A cut-off threshold on observed data (species dependent) could be introduced to skip pairs with unrealistically low observed values that can distort normalized indicators • The multi-option explanation requires more clarification • The Target Diagram needs an explanatory example in the User’s Guide
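The effect of the proposed cut-off can be shown with a short sketch. It assumes the common definition of fractional bias as the mean of 2(M - O)/(M + O); the cut-off value and the data are made up, and a real threshold would be species dependent.

```python
def mean_fractional_bias(mod, obs, obs_cutoff=None):
    """Mean of 2(M - O)/(M + O) over the pairs. If obs_cutoff is given,
    pairs whose observed value is below the cut-off are skipped."""
    pairs = [(m, o) for m, o in zip(mod, obs)
             if obs_cutoff is None or o >= obs_cutoff]
    if not pairs:
        raise ValueError("no pairs left after applying the cut-off")
    return sum(2.0 * (m - o) / (m + o) for m, o in pairs) / len(pairs)

# Hypothetical pairs; the first observation is unrealistically low
obs = [0.1, 20.0, 30.0]
mod = [5.0, 22.0, 27.0]
fb_all = mean_fractional_bias(mod, obs)
fb_cut = mean_fractional_bias(mod, obs, obs_cutoff=1.0)
```

In this made-up case a single near-zero observation dominates the normalized indicator, and skipping that pair brings the fractional bias close to zero.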