Verification and evaluation of standard and value-added services
Barbara Brown [1], Mike Kay [2], and Jennifer Mahoney [2]
[1] National Center for Atmospheric Research
[2] NOAA Forecast Systems Laboratory
Boulder, Colorado USA
September 2005
Outline
• Quality Assessment in the Aviation Weather Research Program
• Technology transfer process
• Real-Time Verification System
• New developments in user-oriented verification
The Quality Assessment Product Development Team (QA PDT)
• One of several PDTs in the US FAA Aviation Weather Research Program
• Provides assessments of new aviation weather products as part of the process of transitioning them to operations
   • Performance is compared to that of currently operational products
• Develops, tests, and implements new verification approaches for aviation weather forecasts
   • Compensate for observational limitations
   • Employ new types of observations
   • Provide statistically valid verification information, and information from a "user" (rather than forecaster) perspective
AWRP products evaluated by the QA PDT
• Ceiling and visibility
• In-flight icing
• In-flight turbulence
• Cloud-top height
• Convection
Transfer to operations
• Aviation Weather Technology Transfer (AWTT)
   • A joint FAA/NWS process
   • A graduated progression from IDEA to OPERATIONS
• AWTT stages
   • D2: Test
   • D3: Experimental
   • D4: Operational guidance
• In-depth quality assessment is required at D3 and D4
QA PDT verification systems
• Off-line system (NCAR)
   • In-depth post-analysis
   • Development of new approaches
   • Evaluation using "special" datasets (e.g., research aircraft data) for AWTT and other studies
• Real-Time Verification System – RTVS (FSL)
   • On-going monitoring
   • Development and implementation of new approaches
   • AWTT evaluations
• Both systems can evaluate current operational and advanced aviation weather products
Purpose of RTVS
Assemble an interactive, flexible, and easy-to-use verification system whose statistical results can support:
• the transition of aviation products into National Weather Service (NWS) operations,
• on-going assessment of NWS forecast quality, and
• intuitive feedback to forecasters through innovative displays that convey forecast quality.
Users
• Operational agencies supported
   • National Weather Service: Aviation Services
   • Aviation Weather Center
   • Air Traffic Control Systems Command Center
   • U.S. airline carriers
• Primary users: Aviation Weather Research Program PDTs
Background
• The initial RTVS system was developed by 1997 in the Aviation Division of the Forecast Systems Laboratory
• Primary funding has been provided by the FAA Aviation Weather Research Program
• The initial system was transferred to the operational environment at the Aviation Weather Center (AWC)
• The system has matured to support a diverse set of forecasts, with an emphasis on aviation
• Fully automated; runs 24/7 without human intervention
Challenges
• Understanding what information is relevant: displays (e.g., maps) are highly relevant in real time, while other tools may be more useful in longer-term settings
• Clearly defining user requirements from a broad range of user groups
• Defining adequate hardware and software resources for implementing user requirements: is the user's request feasible?
• Properly managing the data for ingest, storage, and access
• Providing sufficient training and documentation
• RTVS has an aviation focus but also includes other areas
• Supports numerous forecast types, including human-generated forecasts, numerical models, and algorithms from both operational and experimental settings
Example session of a user generating a time series of Critical Success Index (CSI) for two different products for an arbitrary date range
Verification: the actual comparison of forecasts and observations
Components of RTVS:
• Data Ingest
• Data Pre-processing
• Data Storage and Archive
• Analysis and Visualization
RTVS architecture
• Components: Data Ingest, Relational Database, Scheduler, Web Interface, Computational Cluster
• 10-node / 20-CPU cluster
• Redundant ingest, scheduling, database, and web servers
• Currently processes more than 10 GB per day
• Online storage capacity of nearly 7 TB
Future direction
• Continue to support the AWRP mission and goals
   • QA PDT tasks
   • Transition of new aviation products to NWS operations
• Incorporate new verification methodologies
• Transition RTVS-Classic to the NWS
• Develop RTVS-NextGen

www-ad.fsl.noaa.gov/fvb/rtvs/
Advanced approaches for evaluation of aviation weather forecasts
Why?
• Compensate for imperfect verification circumstances (e.g., no independent data for evaluation)
• Provide more meaningful information about uncertainty in verification measures
• Provide information that is operationally meaningful
Non-independent verification information
• Example: verification of an analysis (or diagnosis) product, such as a ceiling and visibility analysis
   • All relevant observations are used to produce the analysis, so no independent observations remain for verification
• Solution: a "cross-validation" approach
   • Randomly select a small sample (10–20%) of observations to "reserve" for verification
   • Repeat many times
   • Compute verification statistics by combining all sets of reserved forecast/observation pairs
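The cross-validation steps above can be sketched as follows. This is a minimal illustration, not the QA PDT's actual code: the "analysis" here is a stand-in (a majority vote over the retained yes/no observations), whereas the real product would be a gridded ceiling-and-visibility analysis.

```python
# Hedged sketch of hold-out cross-validation for a non-independent
# analysis product. Observations are yes/no (1/0) values.
import random

def cross_validate(obs, n_rounds=100, holdout_frac=0.1, seed=1):
    """Repeatedly reserve a random ~10% of obs, rebuild the (stand-in)
    analysis from the rest, and pool reserved analysis/obs pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_rounds):
        k = max(1, int(holdout_frac * len(obs)))
        reserved = set(rng.sample(range(len(obs)), k))
        kept = [obs[i] for i in range(len(obs)) if i not in reserved]
        analysis = round(sum(kept) / len(kept))   # stand-in analysis value
        pairs.extend((analysis, obs[i]) for i in reserved)
    # Verification statistics over the pooled reserved pairs
    hits = sum(1 for a, o in pairs if a == 1 and o == 1)
    misses = sum(1 for a, o in pairs if a == 0 and o == 1)
    fas = sum(1 for a, o in pairs if a == 1 and o == 0)
    denom = hits + misses + fas
    return hits / denom if denom else None   # CSI of the pooled pairs
```

Because each round's reserved observations were withheld from the analysis, the pooled statistics approximate verification against independent data.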
Uncertainty in verification measures
• Verification measures have uncertainty due to
   • observation errors
   • sampling
• Confidence intervals allow you to say that one system is "significantly better" than another
• "Bootstrap re-sampling" is a useful (and easy) way to estimate these intervals
Operationally meaningful verification
• Many standard verification approaches do not measure things that are meaningful for users
   • Forecast usefulness generally is not captured by traditional verification measures
   • Traditional scores are often difficult to interpret
Goal: develop approaches that measure forecast quality in the context of aviation users
• Information to help users
• Information to help forecasters and forecast developers
[Figure: non-overlapping forecast (F) and observed (O) regions]
Good forecast or bad forecast? POD = 0; FAR = 1; CSI = 0
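The scores in this example come from the standard 2x2 contingency table for yes/no forecasts. A small sketch (not the QA PDT's operational code) of how they are computed, and why a displaced forecast that never overlaps the observed region receives the worst possible values of all three:

```python
# Hedged sketch: standard contingency-table scores for yes/no forecasts.
# hits        = forecast yes, observed yes
# misses      = forecast no,  observed yes
# false_alarms = forecast yes, observed no

def contingency_scores(hits, misses, false_alarms):
    """Return (POD, FAR, CSI); None where a score is undefined."""
    pod = hits / (hits + misses) if (hits + misses) else None
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) else None
    n = hits + misses + false_alarms
    csi = hits / n if n else None
    return pod, far, csi

# A displaced forecast region with zero overlap produces zero hits,
# so every standard score is as bad as possible:
print(contingency_scores(hits=0, misses=12, false_alarms=12))
# -> (0.0, 1.0, 0.0)
```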
[Figure: flight route from A to B with forecast (F) and observed (O) regions]
Good forecast!
[Figure: flight route from A to B with forecast (F) and observed (O) regions]
Bad forecast! Skill for this forecast is POSITIVE by standard approaches; the previous forecast would have NO skill
Approaches to operationally relevant convective verification
• Combine forecasts with air traffic information
• Spatial diagnostic verification approaches
• Examine the "optimal" forecast
• Object-based verification approach
Use of air traffic data
• Develop approaches to combine weather forecasts and observations with air traffic information
• Utilize a database of flight plans and tracks
• Evaluate deviations related to forecast and observed regions
Examine how far forecasts are from the "best"
• POD: Old 0.23, New 0.45
• FAR: Old 0.85, New 0.68
• CSI: Old 0.10, New 0.23
"Object-based" verification approach
• Identify and match meaningful "objects" in the forecast and observed fields
• Measure and compare attributes of the forecast and observed objects that are meaningful operationally (e.g., E-W displacement)
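The object-based idea can be sketched on a small yes/no grid. This is an illustrative toy, not the actual object-based method used by the QA PDT: it finds connected "objects" by flood fill and compares one operationally meaningful attribute, the east-west (column) centroid displacement between the largest forecast and observed objects.

```python
# Hedged sketch of object-based verification on a yes/no (1/0) grid.
from collections import deque

def objects(grid):
    """Return connected objects (4-connectivity); each is a list of (row, col)."""
    seen, out = set(), []
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and (r, c) not in seen:
                comp, q = [], deque([(r, c)])
                seen.add((r, c))
                while q:                          # breadth-first flood fill
                    i, j = q.popleft()
                    comp.append((i, j))
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if (0 <= ni < rows and 0 <= nj < cols
                                and grid[ni][nj] and (ni, nj) not in seen):
                            seen.add((ni, nj))
                            q.append((ni, nj))
                out.append(comp)
    return out

def ew_displacement(fcst, obs):
    """E-W centroid offset (in columns) between the largest objects."""
    f = max(objects(fcst), key=len)
    o = max(objects(obs), key=len)
    f_col = sum(c for _, c in f) / len(f)
    o_col = sum(c for _, c in o) / len(o)
    return f_col - o_col
```

In a real system, objects would be matched by attributes such as size, shape, and intensity rather than simply taking the largest, and more than one displacement component would be reported.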
Verification of new and enhanced aviation weather products
• Verification is integral to both the forecast process and the forecast development process
• It requires care due to the characteristics of the observations and forecasts (not like verifying standard meteorological variables)
• Real-time systems provide useful information to forecasters and users
• New advanced approaches promise improved information in the context of how the forecasts are used
Resources
• RTVS: www-ad.fsl.noaa.gov/fvb/rtvs/
• WMO WWRP/WGNE Joint Working Group on Verification
   • International working group on forecast verification
   • Web page: http://www.bom.gov.au/bmrc/wefor/staff/eee/verif/verif_web_page.html
   • Discussion group
JWGV website
http://www.bom.gov.au/bmrc/wefor/staff/eee/verif/verif_web_page.html
• A Google search for "forecast verification" returns this page as the first result
• Includes an FAQ, short papers, workshop presentations, and numerous links