Data mining in the joint D-PHASE and COPS archive. Mathias W. Rotach, Marco Arpagaus, Manfred Dorninger, Christoph Hegg, Andrea Montani, Roberto Ranzi, Volker Wulfmeyer. COPS Workshop Hohenheim, 27-29 January 2008.
Data mining • Data mining is the application of statistical-mathematical methods to a data set with the goal of pattern recognition. In particular, methods with excellent asymptotic run times are applied, which is why data mining is often used in connection with large data sets. (Source: Wikipedia)
What is available? • Lots of model runs • Lots of data • 7 probabilistic and 23 high-resolution deterministic atmospheric models • 7 coupled hydrological models (deterministic and probabilistic)
What is available? • 6 months, 180 days, 4380 hours • > 10'000 model runs (only atmospheric) • > 20'000'000 warnings • > 50'000'000 graphic files (only atmospheric) • > 50'000'000 model fields (COPS domain, JJA) from high-resolution models • All stored at DA in Hamburg ... and made possible through collaboration with COPS
What can be done? • Evaluate numerical models using COPS data --> find model deficiencies --> investigate key processes in orographic precipitation --> case studies • Expect many examples
What can be done? • Evaluate numerical models in the D-PHASE domain • Example: COSMO-LEPS • SYNOP reports over the MAP D-PHASE domain (about 470 stations each day) • 12 h accumulated precipitation • Courtesy Chiara Marsigli and Andrea Montani, ARPA-SIM
COSMO-LEPS Verification (2007) • Brier Skill Score, JJA, D-PHASE domain, 12 h, 10 mm • Figure: BSS (y-axis) vs. forecast range (x-axis)
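For readers unfamiliar with the score on this slide, here is a minimal sketch of how a Brier Skill Score for a precipitation threshold could be computed from ensemble forecasts. It is not the ARPA-SIM verification code. The sample climatology is used as the reference forecast (an assumption; the slide does not state the reference), and all numbers and function names are hypothetical.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members above the threshold."""
    return sum(m > threshold for m in members) / len(members)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """BSS with the sample base rate as reference forecast (assumption).

    Undefined (division by zero) if the event never or always occurs.
    """
    base_rate = sum(outcomes) / len(outcomes)
    bs_ref = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(probs, outcomes) / bs_ref


# Hypothetical 12 h precipitation forecasts (mm): 4 cases, 4 members each
ensembles = [
    [12.0, 15.0, 8.0, 11.0],
    [0.0, 2.0, 1.0, 0.5],
    [9.0, 14.0, 3.0, 2.0],
    [0.0, 0.0, 11.0, 1.0],
]
observed = [13.0, 0.0, 4.0, 0.2]  # hypothetical observations (mm)
threshold = 10.0                  # the 10 mm threshold named on the slide

probs = [exceedance_probability(e, threshold) for e in ensembles]
outcomes = [1 if o > threshold else 0 for o in observed]
bss = brier_skill_score(probs, outcomes)
print(bss)
```

A BSS of 1 is a perfect forecast, 0 means no improvement over the reference, and negative values are worse than the reference.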
COSMO-LEPS Verification • ROC area, JJA, D-PHASE domain, 12 h, 10 mm • Figure: ROC area (y-axis) vs. forecast range (x-axis)
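The ROC area on this slide can be illustrated with a small sketch. It uses the rank-based (Mann-Whitney) formulation, which equals the trapezoidal area under the ROC curve; the verification behind the slide may have used a different procedure. The probabilities and outcomes below are hypothetical.

```python
def roc_area(probs, outcomes):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen event case received a
    higher forecast probability than a randomly chosen non-event case,
    with ties counted as one half.
    """
    events = [p for p, o in zip(probs, outcomes) if o == 1]
    non_events = [p for p, o in zip(probs, outcomes) if o == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                wins += 1.0
            elif pe == pn:
                wins += 0.5
    return wins / (len(events) * len(non_events))


# Hypothetical forecast probabilities of exceeding 10 mm / 12 h
probs = [0.75, 0.0, 0.25, 0.25, 0.5, 0.5]
outcomes = [1, 0, 0, 0, 1, 0]  # 1 = threshold exceeded in the observations
area = roc_area(probs, outcomes)
print(area)
```

A value of 0.5 indicates no discrimination between events and non-events, and 1.0 indicates perfect discrimination.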
Ensemble System - biases • COSMO-LEPS reforecasts • 30 yrs for each day --> Correct for systematic model errors (spatial & temporal) --> Calibrate over the whole model domain --> Increase the skill (reliability) of the forecast --> Infer the extremity of the forecast • Courtesy Felix Fundel, MeteoSwiss & NCCR Climate
COSMO-LEPS reforecast • New warning index: Probability to exceed a Return Period (PRP) • Approach: - Use the model climatology to find a return level for a given return period (for each grid point) - Find the number of forecasts exceeding the return level - Give a probability to exceed the return level/period (PRP) - (Use Extreme Value Analysis for very rare events, e.g. fit of a GPD) • Syntax: PRPx = probability to exceed an event whose return period corresponds to the x-quantile • Example: PRP0.8 = probability to exceed an event that occurs every 5th day
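The PRP approach can be sketched for a single grid point as follows. This is only an illustration, not the MeteoSwiss implementation. The return level is taken as an empirical quantile of a reforecast climatology, the GPD fit for very rare events is left out, and all values and function names are hypothetical.

```python
def empirical_quantile(values, x):
    """x-quantile (0 <= x <= 1) by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = x * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def return_period(x):
    """Return period, in units of the climatology sampling interval, of the x-quantile."""
    return 1.0 / (1.0 - x)


def prp(members, climatology, x):
    """Fraction of ensemble members exceeding the x-quantile of the model climatology."""
    return_level = empirical_quantile(climatology, x)
    return sum(m > return_level for m in members) / len(members)


# Hypothetical model climatology of daily precipitation (mm) at one grid point
climatology = [0.0, 0.0, 0.0, 0.5, 1.0, 2.0, 3.0, 5.0, 8.0, 12.0, 20.0]
# Hypothetical ensemble forecast (mm) for the same grid point
members = [4.0, 9.0, 15.0, 7.0, 10.0, 2.0, 11.0, 6.0]

prp_08 = prp(members, climatology, 0.8)
print(return_period(0.8))  # about 5: with daily values, an event every 5th day
print(prp_08)
```

Because the return level comes from the model's own climatology, a systematic model bias affects forecast and return level alike, which is the calibration effect the reforecasts are meant to provide.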
Verification • Verification of the PRPx • Domain: Switzerland • Verification on the CLEPS 10 km x 10 km grid • PRPx from observations necessary • Compare uncalibrated to calibrated PRPx • Observation climatology: - 24 h total precipitation, 1971-2000 - >440 stations in Switzerland - interpolated onto the CLEPS grid - at least 3 stations per grid point - search radius 2-4 grid points - distance weighting
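The station-to-grid interpolation listed above can be sketched as follows. The slide gives only the ingredients (at least 3 stations per grid point, search radius of 2-4 grid points, distance weighting). The inverse-distance weights, the stepwise widening of the radius, and every name below are assumptions made for illustration.

```python
import math


def interpolate_to_grid_point(gx, gy, stations, min_stations=3, radii=(2.0, 3.0, 4.0)):
    """Distance-weighted mean of station values around one grid point.

    stations: list of (x, y, value), with coordinates in grid-length units.
    The search radius is widened step by step until at least min_stations
    are found; returns None if even the largest radius has too few stations.
    """
    for radius in radii:
        nearby = []
        for sx, sy, value in stations:
            d = math.hypot(sx - gx, sy - gy)
            if d <= radius:
                nearby.append((d, value))
        if len(nearby) >= min_stations:
            break
    else:
        return None  # no radius yielded enough stations

    # A station located exactly on the grid point determines the value
    for d, value in nearby:
        if d == 0.0:
            return value

    # Inverse-distance weights (assumed; the slide only says "distance weighting")
    weights = [1.0 / d for d, _ in nearby]
    weighted_sum = sum(w * v for w, (_, v) in zip(weights, nearby))
    return weighted_sum / sum(weights)


# Hypothetical stations: (x, y, 24 h precipitation in mm)
stations = [(1.0, 0.0, 10.0), (0.0, 2.0, 20.0), (3.0, 0.0, 30.0), (10.0, 10.0, 99.0)]
value = interpolate_to_grid_point(0.0, 0.0, stations)
print(value)
```

In this example, radius 2 finds only two stations, so the search widens to radius 3 before the weighted mean is formed.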
Verification Relative improvement
What can be done? • Compare different high-resolution models • Example: Radar data composite over Switzerland • Summer (JJA) 2007 • Courtesy Felix Ament, MeteoSwiss
Verification of precipitation amount • Switzerland, summer (JJA) • Individual warning region, summer (JJA) • Individual warning region, 3 h resolution, summer (JJA)
Warnings – yellow level (yellow = frequency 10x per year) • COSMO-2
What can be done? • Investigate properties of ensemble systems at small scales • Single model vs. multi model --> 'micro PEPS' (Michael Denhard, DWD) • Initial perturbations --> lagged ensembles --> physical perturbation --> spread-skill relations • Predictability (convection)
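One of the listed topics, the spread-skill relation, can be sketched as below. The sketch correlates the ensemble spread with the absolute error of the ensemble mean over a set of cases. This is one common way to look at the relation, not necessarily the one the authors have in mind, and the data are hypothetical.

```python
import math
import statistics


def spread_and_error(members, observation):
    """Ensemble spread (population standard deviation) and absolute error of the mean."""
    mean = statistics.fmean(members)
    spread = statistics.pstdev(members)
    return spread, abs(mean - observation)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical cases: (ensemble members, verifying observation)
cases = [
    ([1.0, 1.0, 1.0, 1.0], 1.0),
    ([0.0, 2.0, 0.0, 2.0], 2.0),
    ([0.0, 4.0, 0.0, 4.0], 4.0),
]
pairs = [spread_and_error(members, obs) for members, obs in cases]
spreads = [s for s, _ in pairs]
errors = [e for _, e in pairs]
r = pearson(spreads, errors)
print(r)
```

A correlation near 1 means that the spread is a useful predictor of forecast error. Real ensembles typically show a much weaker relation than this constructed example.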
What can be done? • All the hydrological components ... • Courtesy S. Jaun, ETHZ
Summary • Lots of possibilities • Let's just do it!