National Hurricane Center 2008 Forecast Verification James L. Franklin Branch Chief, Hurricane Specialists Unit National Hurricane Center 2009 Interdepartmental Hurricane Conference
Verification Rules • Verification rules unchanged for 2008. Results presented here for both basins are final. • System must be a tropical or subtropical cyclone at both the forecast initial time and the verification time. All verifications include the depression stage except for the GPRA goal verification. • Special advisories are ignored (the original advisory is verified). • Skill baselines are recomputed after the season from operational compute data. Decay-SHIFOR5 is the intensity skill benchmark.
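To make the verification concrete, here is a minimal sketch of how a single forecast could be scored against the best track, with the track error taken as the great-circle distance between the forecast and best-track positions and the intensity error as the absolute wind difference. The data layout and function names are illustrative, not NHC's operational verification code.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_NMI = 3440.065  # mean Earth radius in nautical miles

def great_circle_nmi(lat1, lon1, lat2, lon2):
    """Haversine distance in nautical miles between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NMI * asin(sqrt(a))

def verify_forecast(forecast, best_track):
    """Return (track_error_nmi, intensity_error_kt) for one verifying time.

    `forecast` and `best_track` are assumed to be dicts with 'lat', 'lon'
    (degrees) and 'vmax' (kt); the caller is responsible for checking that
    the system is a tropical or subtropical cyclone at both the initial
    and verifying times, per the rules above.
    """
    track_err = great_circle_nmi(forecast["lat"], forecast["lon"],
                                 best_track["lat"], best_track["lon"])
    intensity_err = abs(forecast["vmax"] - best_track["vmax"])
    return track_err, intensity_err
```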
2008 Atlantic Verification

VT (h)   NT    TRACK (n mi)   INT (kt)
======================================
 000     373        5.7          1.8
 012     346       27.7          7.1
 024     318       48.3         10.4
 036     288       68.6         12.1
 048     261       88.2         13.6
 072     221      126.9         14.6
 096     177      159.8         13.8
 120     149      191.8         17.2

Values in green exceed all-time records. * The 48-h track error for TS and H only (GPRA goal) was 87.5 n mi, just off last year's record of 86.2 n mi.
Atlantic Track Errors vs. 5-Year Mean Official forecast was better than the 5-year mean, even though the season’s storms were “harder” than normal.
Atlantic Track Error Trends Errors have been cut in half over the past 15 years. 2008 was the best year ever.
Atlantic Track Skill Trends 2008 was the most skillful year on record at all time periods.
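For reference, skill in these plots is the percentage improvement of the official forecast over a no-skill baseline (Decay-SHIFOR5 for intensity, as noted in the verification rules; CLIPER5 is the usual track baseline). A minimal sketch, assuming the per-lead-time mean errors are already in hand; the CLIPER5 value in the example is hypothetical.

```python
def skill_percent(baseline_error, forecast_error):
    """Percent improvement of a forecast over a no-skill baseline.

    Positive values mean the forecast beat the baseline (e.g. CLIPER5
    for track, Decay-SHIFOR5 for intensity); 0 means no skill.
    """
    return 100.0 * (baseline_error - forecast_error) / baseline_error

# Example: the 48-h OFCL track error of 88.2 n mi against a hypothetical
# CLIPER5 error of 170 n mi corresponds to roughly 48% skill.
print(round(skill_percent(170.0, 88.2), 1))
```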
Atlantic 5-Year Mean Track Errors Track errors increase by about 50-55 n mi per day. 48 hr mean error below 100 n mi for the first time. Intensity errors level off because intensity is a much more bounded problem. New 5-yr means slightly larger than last year’s.
OFCL Error Distributions and Cone Radii Only modest reductions in the cone radii.
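For context, the cone radius at each lead time is set so that the circle encloses roughly two-thirds of the official track errors from the preceding five seasons. The sketch below shows that percentile calculation on a synthetic error sample; the inputs are made up, not the operational distributions.

```python
import numpy as np

def cone_radius_nmi(track_errors_nmi, coverage=2.0 / 3.0):
    """Cone-of-uncertainty radius at one lead time.

    `track_errors_nmi` is assumed to be the pooled official track errors
    (n mi) at that lead time over the previous five seasons; the radius
    is the error magnitude that encloses `coverage` of those cases.
    """
    return float(np.percentile(track_errors_nmi, 100.0 * coverage))

# Illustrative only: synthetic 48-h errors, not the operational sample.
rng = np.random.default_rng(0)
sample_48h = rng.gamma(shape=2.0, scale=45.0, size=500)
print(round(cone_radius_nmi(sample_48h), 1), "n mi")
```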
2008 Track Guidance Official forecast performance was very close to that of the consensus models. The best model was the ECMWF, which was as good as or better than the consensus. BAMD was similar to the poorest of the 3-D models (UKMET). AEMI was excluded due to insufficient availability (less than 67% of the time at 48 or 120 h).
2008 Track Guidance Examine major dynamical models to increase sample size. ECMWF best at all time periods (as opposed to last year, when it was mediocre). GFDL also better than last year (and better than HWRF). As we’ve seen before, GFDL skill declines relatively sharply at days 4-5. NOGAPS and GFNI again performed relatively poorly. GFNI upgrades were delayed.
GFDL-HWRF Comparison A much larger sample than last year shows that the HWRF is competitive with, but has not quite caught up to, the GFDL. A consensus of the two is (mostly) better than either alone.
Guidance Trends Return to more “traditional” relationships among the models after the very limited sample of 2007.
Guidance Trends Relative performance at 120 h is more variable, although GFSI has been strong every year except 2005. GFDL is not a good performer at the longer ranges.
Consensus Models Best consensus model was TVCN, the variable member consensus that includes EMXI. It does not appear that the “correction” process was beneficial.
Consensus Models For the third year in a row, AEMI trailed its control run. Multi-model ensembles remain far more effective for TC forecasting. The ECMWF ensemble mean is also not as good as its control run (EEMN vs. EMX).
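For readers unfamiliar with the consensus aids, a multi-model track consensus such as TVCN is essentially an equally weighted average of the available interpolated member forecasts at each lead time. A minimal sketch; the member list and positions are hypothetical.

```python
import numpy as np

def consensus_track(member_forecasts):
    """Equally weighted multi-model track consensus at one lead time.

    `member_forecasts` is assumed to be a list of (lat, lon) forecast
    positions from whichever interpolated members are available (the
    variable-membership idea behind an aid like TVCN); the consensus
    here is simply their mean position.
    """
    positions = np.asarray(member_forecasts, dtype=float)
    return tuple(positions.mean(axis=0))

# Hypothetical 48-h positions from four members (lat, lon in degrees).
members = [(25.1, -75.2), (25.4, -74.8), (24.9, -75.6), (25.2, -75.0)]
print(consensus_track(members))  # -> (25.15, -75.15)
```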
Atlantic Intensity Errors vs. 5-Year Mean OFCL errors in 2008 were at or below the 5-yr means, but the 2008 Decay-SHIFOR errors were also at or below their 5-yr means, so not much change in skill.
Atlantic Intensity Error Trends No progress with intensity.
Atlantic Intensity Skill Trends Little net change in skill over the past several years.
2008 Intensity Guidance OFCL adds the most value over the guidance at the shorter ranges. There was a modest high bias in 2008 (2007 had a low bias). Split decision between the dynamical and statistical models. The new ICON consensus, introduced this year, was very successful, beating OFCL at every period except 12 h.
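As noted in the Atlantic intensity summary later on, the ICON-style consensus is a simple equally weighted average of the DSHP, LGEM, GHMI, and HWRF intensity forecasts. A minimal sketch with hypothetical member values:

```python
def icon_intensity(dshp_kt, lgem_kt, ghmi_kt, hwrf_kt):
    """Simple four-member intensity consensus (kt).

    Mirrors the equally weighted average of DSHP, LGEM, GHMI, and HWRF
    described in the summary; member values are the intensity forecasts
    at a common lead time.
    """
    members = [dshp_kt, lgem_kt, ghmi_kt, hwrf_kt]
    return sum(members) / len(members)

# Hypothetical 48-h members: the consensus here works out to 76.25 kt.
print(icon_intensity(80, 75, 70, 80))
```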
2008 Intensity Guidance HWRF competitive through 3 days, with issues at the longer times. Although the sample was smaller, there was a hint of this last year as well. Cannot shut GFDL off yet!
2008 Intensity Guidance When the complication of landfall timing / track dependence is removed, OFCL performs better relative to the guidance. The dynamical models are relatively poor performers.
2008 East Pacific Verification

VT (h)   NT    TRACK (n mi)   INT (kt)
======================================
 000     311       10.7          1.4
 012     276       30.9          6.0
 024     240       47.5          9.8
 036     206       63.7         11.9
 048     176       78.0         12.9
 072     124      107.6         15.7
 096      84      138.8         17.6
 120      52      161.4         18.0

Values in green tied or exceeded all-time lows.
EPAC Track Error Trends Since 1990, track errors have decreased by 30%-50%.
EPAC Track Skill Trends Skill continues to improve.
2008 Track Guidance The official forecast beat the TVCN consensus at the later periods and beat each individual model. OFCL was far superior to the model guidance at the longer time periods (it also beat the consensus at 4-5 days last year). EMXI, EGRI, AEMI, FSSE, GUNA, and TCON were excluded due to insufficient availability.
2008 Track Guidance Relax selection criteria to see all major dynamical models. ECMWF best overall. OFCL clearly doing something right.
EPAC Intensity Error Trends Perhaps just a hint of improvement?
EPAC Intensity Skill Trends Skill does seem to be inching upward…
2008 Intensity Guidance OFCL mostly beat the individual models and even the consensus at some time periods. OFCL wind biases turn sharply negative at 96-120 h, which was also true in 2007. Statistical models outperformed dynamical models. This year, DSHP beat LGEM (flip from 2007).
2007-08 Genesis Forecast Verification: Lead-Time Analysis for Disturbances that Became Tropical Cyclones (Atlantic and Eastern North Pacific)
Genesis Bins for 2009 NHC will issue operational public quantitative/categorical genesis forecasts in 2009 and include categorical forecasts in the text Tropical Weather Outlook.
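A hedged sketch of how such categorical bins might be applied to a quantitative genesis probability; the low/medium/high thresholds used here are assumptions for illustration, not values taken from the slides.

```python
def genesis_category(probability_pct):
    """Map a genesis probability (%) to a categorical bin.

    The low/medium/high thresholds used here (<30, 30-50, >50) are an
    assumption for illustration; the slide does not specify the bins.
    """
    if probability_pct < 30:
        return "low"
    if probability_pct <= 50:
        return "medium"
    return "high"

print(genesis_category(40))  # -> "medium"
```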
Summary: Atlantic Track • OFCL track errors set records for accuracy at all time periods. Errors continue their downward trend, and skill was also up. • OFCL track forecast skill was very close to that of the consensus models, and was beaten by EMXI. • EMXI and GFDL provided the best dynamical track guidance. UKMET, which performed well in 2007, did not do so in 2008. NOGAPS lagged again. • HWRF has not quite attained the skill of the GFDL, but is competitive. A combination of the two is better than either alone. • Best consensus model was TVCN (the variable consensus with EMXI). Multi-model consensus – good. Single-model consensus – not so good. Not a good year for the “corrected consensus” models.
Summary: Atlantic Intensity • OFCL errors in 2008 were below the 5-yr means, but the 2008 Decay-SHIFOR errors were also below their 5-yr means, so no real change in skill. • Still no progress with intensity errors; OFCL errors have remained unchanged over the last 20 years. Skill has been relatively flat over the past 5-6 years. • Split decision between the statistical and dynamical guidance. The simple four-model consensus (DSHP/LGEM/HWRF/GHMI) beat everything else, including the corrected consensus model FSSE.
Summary: East Pacific Track • OFCL track errors set records at 24-72 h. • OFCL beat individual dynamical models, and also beat the consensus at 4 and 5 days. • GFDL, HWRF, and ECMWF were strong performers, although ECMWF had trouble holding on to systems through 5 days. • There continues to be a much larger difference between the dynamical models and the consensus in the eastern North Pacific than there is in the Atlantic, which is suggestive of different error mechanisms in the two basins.
Summary: East Pacific Intensity • OFCL mostly beat the individual models and even the consensus at 12 and 36 h. OFCL wind biases turned sharply negative at 96-120 h, which was also true in 2007. • Best model at most time periods was a statistical model. DSHP provided most skillful guidance overall. HWRF continued to have trouble in this basin. Four-model intensity consensus performed very well.