
Benchmark database based on surrogate climate records


Presentation Transcript


  1. Benchmark database based on surrogate climate records • Victor Venema

  2. Goals of COST-HOME working group 1 • Literature survey • Benchmark dataset • Known inhomogeneities • Test the homogenisation algorithms (HA)

  3. Benchmark dataset • Real (inhomogeneous) climate records • Most realistic case • Investigate if various HA find the same breaks • Good meta-data • Synthetic data • For example, Gaussian white noise • Insert known inhomogeneities • Test performance • Surrogate data • Empirical distribution and correlations • Insert known inhomogeneities • Compare to synthetic data: test of assumptions
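The slide distinguishes real, synthetic and surrogate data. A minimal sketch of the simplest synthetic option, assuming numpy: Gaussian white noise for a small network with one common station cross-correlation. The network size, length and correlation value are illustrative, not benchmark settings.

```python
import numpy as np

def white_noise_network(n_stations=5, n_months=360, cross_corr=0.8, seed=0):
    """Gaussian white noise for a network of stations with a common
    cross-correlation between all station pairs (illustrative values)."""
    rng = np.random.default_rng(seed)
    # Covariance matrix: 1 on the diagonal, cross_corr everywhere else.
    cov = np.full((n_stations, n_stations), cross_corr)
    np.fill_diagonal(cov, 1.0)
    # One multivariate normal draw per time step.
    return rng.multivariate_normal(np.zeros(n_stations), cov, size=n_months)

network = white_noise_network()
print(network.shape)           # (360, 5): months x stations
print(np.corrcoef(network.T))  # pairwise correlations close to 0.8
```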

  4. Creation of the benchmark – Outline of the talk • Start with homogeneous data • Multiple surrogate and synthetic realisations • Mask surrogate records • Add global trend • Insert inhomogeneities in station time series • Published on the web • Homogenize by COST participants and third parties • Analyse the results and publish

  5. 1) Start with homogeneous data • Monthly mean temperature and precipitation (France) • Later also daily data • Later maybe other variables • Homogeneous • No missing data • Detrended • 20 to 30 years is enough for good statistics • Longer surrogates are based on multiple copies • Larger scale correlations are small • Distribution well defined with 30 a of data • Generated networks are 50, 100 and 200 a long
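One reading of "longer surrogates are based on multiple copies" above is to tile the 20–30 year homogeneous anomaly block until the target network length is reached before surrogate generation. A short sketch under that assumption, with placeholder data:

```python
import numpy as np

def extend_by_copies(anomalies, target_years):
    """Tile a detrended anomaly block (months x stations) until it
    covers target_years, then truncate to the exact length."""
    n_copies = int(np.ceil(target_years * 12 / anomalies.shape[0]))
    return np.tile(anomalies, (n_copies, 1))[:target_years * 12]

base = np.random.randn(30 * 12, 5)       # 30 years, 5 stations (placeholder data)
long_block = extend_by_copies(base, 100)
print(long_block.shape)                  # (1200, 5)
```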

  6. 2) Multiple surrogate realisations • Multiple surrogate realisations • Temporal correlations • Station cross-correlations • Empirical distribution function • Annual cycle removed before, added at the end • Number of stations between 5 and 20 • Cross-correlation varies as much as possible • Plot: temporal structure of the surrogates • Plot: cross-correlations
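The surrogates are generated with an IAAFT-type algorithm (see slide 11). A minimal single-station sketch, assuming numpy: it iteratively imposes the Fourier amplitudes and the empirical distribution of the input series. The multi-station version needed for the benchmark must additionally preserve the station cross-correlations, which this sketch does not do.

```python
import numpy as np

def iaaft(x, n_iter=100, seed=0):
    """Iterative Amplitude Adjusted Fourier Transform surrogate of a 1-D
    series: keeps the empirical distribution exactly and the power
    spectrum approximately (single-station sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    sorted_x = np.sort(x)                  # target distribution
    target_amp = np.abs(np.fft.rfft(x))    # target Fourier amplitudes
    s = rng.permutation(x)                 # random start with the right values
    for _ in range(n_iter):
        # 1) impose the target amplitudes, keep the current phases
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(target_amp * np.exp(1j * phases), n=len(x))
        # 2) impose the target distribution by rank ordering
        s = sorted_x[np.argsort(np.argsort(s))]
    return s

series = np.cumsum(np.random.randn(360))   # placeholder monthly anomalies
surrogate = iaaft(series)
print(np.allclose(np.sort(surrogate), np.sort(series)))  # True: same distribution
```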

  7. One station – with annual cycle

  8. One station – anomalies

  9. Multiple stations – 10 year zoom

  10. Multiple stations – 10 year zoom

  11. IAAFT algorithm smooths jumps

  12. 3) Mask surrogate records • Beginning of records is jagged (rough) • Linear increase in the number of stations • Last station starts after 25% of the full time • At the end of the record all stations are measuring • Influence of the jagged edge on detection and correction • But the trend is also increasing in time (i.e. different)! • Is this a problem?
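A minimal sketch of the masking step described above, assuming numpy: station start dates increase linearly, the last station starts after 25% of the full period, and all stations measure until the end. The missing-value code is an assumption.

```python
import numpy as np

def mask_network(data, missing=-999.9):
    """Mask the beginning of each station record so that start dates
    increase linearly and the last station starts at 25% of the period."""
    n_time, n_stations = data.shape
    masked = data.copy()
    # Start indices spread linearly between 0 and 25% of the record.
    starts = np.linspace(0, 0.25 * n_time, n_stations).astype(int)
    for station, start in enumerate(starts):
        masked[:start, station] = missing
    return masked

data = np.random.randn(1200, 10)       # 100 years of monthly data, 10 stations
masked = mask_network(data)
print((masked == -999.9).sum(axis=0))  # increasing number of missing months
```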

  13. 3) Mask surrogate records

  14. 4) Add global trend • NASA GISS Surface Temperature Analysis (GISTEMP) by J. Hansen • Global mean surface temperature • Last year of any surrogate network is 1999
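A sketch of adding the global-mean trend, assuming numpy and that the annual GISTEMP global-mean anomalies are already loaded into an array (downloading and parsing the NASA GISS file is omitted; a random placeholder is used below). The alignment so that the last surrogate year is 1999 follows the slide.

```python
import numpy as np

def add_global_trend(network, gistemp_annual, gistemp_first_year, last_year=1999):
    """Add the annual global-mean temperature anomaly to every station.
    network: months x stations; gistemp_annual: one anomaly per year."""
    n_years = network.shape[0] // 12
    first_year = last_year - n_years + 1
    offset = first_year - gistemp_first_year
    trend = gistemp_annual[offset:offset + n_years]   # one value per year
    trend_monthly = np.repeat(trend, 12)              # same value in all 12 months
    return network + trend_monthly[:, np.newaxis]

# Placeholder for the real GISTEMP series (anomalies in °C, starting 1880).
gistemp = 0.005 * np.arange(120) + 0.1 * np.random.randn(120)
network = np.random.randn(100 * 12, 10)               # 100 years, 10 stations
with_trend = add_global_trend(network, gistemp, gistemp_first_year=1880)
```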

  15. 5) Insert inhomogeneities in stations • Random breaks (implemented) • Frequency of breaks 1/20a, 1/40a • Size constants for temperature: 0.25, 0.5, 1.0 °C • Size factors for rain: 0.8, 0.9, 1.1, 1.2 • Simultaneous breaks • Frequency of breaks 1/50a • In 10 to 50 % of network
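A minimal sketch of the random breaks described above, assuming numpy, with the listed break frequency and temperature break sizes. The convention that a break shifts the earlier part of the record, and the random sign, are assumptions.

```python
import numpy as np

def insert_random_breaks(series, years, break_freq=1/20, sizes=(0.25, 0.5, 1.0), seed=0):
    """Insert random breakpoints into one monthly station series.
    break_freq: expected breaks per year; sizes: break magnitudes in °C."""
    rng = np.random.default_rng(seed)
    broken = series.copy()
    n_breaks = rng.poisson(break_freq * years)
    break_months = rng.choice(len(series), size=n_breaks, replace=False)
    for month in break_months:
        size = rng.choice(sizes) * rng.choice([-1, 1])  # random sign (assumption)
        broken[:month] += size                          # shift the record before the break
    return broken, np.sort(break_months)

series = np.random.randn(100 * 12)                      # 100 years, monthly
broken, true_breaks = insert_random_breaks(series, years=100)
```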

  16. 5) Insert inhomogeneities in stations • Outliers • Frequency: 1 – 3 % • Size: 99 and 99.9 percentiles • Local trends (only temperature) • Linear increase or decrease in one station • Duration: 30, 60a • Maximum size: 0.2 to 1.5 °C • Frequency: once in 10 % of the stations
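A sketch of the outlier and local-trend inhomogeneities, assuming numpy: outliers are drawn from the extreme percentiles of the station's own distribution, and the local trend is a linear drift added to part of one station. Details beyond the numbers on the slide (use of both tails, start month, drift shape) are assumptions.

```python
import numpy as np

def insert_outliers(series, frequency=0.02, seed=0):
    """Replace a small fraction of values with extreme percentiles
    (99th/99.9th and the corresponding low tail) of the series itself."""
    rng = np.random.default_rng(seed)
    noisy = series.copy()
    n_out = int(frequency * len(series))
    idx = rng.choice(len(series), size=n_out, replace=False)
    levels = rng.choice([0.1, 1.0, 99.0, 99.9], size=n_out)   # both tails (assumption)
    noisy[idx] = np.percentile(series, levels)
    return noisy, idx

def insert_local_trend(series, start, duration_months, max_size=1.0):
    """Add a linear drift of up to max_size °C over duration_months."""
    drifted = series.copy()
    drift = np.linspace(0.0, max_size, duration_months)
    drifted[start:start + duration_months] += drift
    return drifted

series = np.random.randn(100 * 12)
series, outlier_idx = insert_outliers(series)
series = insert_local_trend(series, start=240, duration_months=30 * 12, max_size=0.5)
```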

  17. 6) Published on the web • Inhomogeneous data will be published on the COST-HOME homepage • Everyone is welcome to download and homogenize the data

  18. 7) Homogenize by participants • Return homogenised data • Should be in COST-HOME file format (next slide) • Return break detections • BREAK • OUTLI • BEGTR • ENDTR • Multiple breaks at one date possible

  19. 7) Homogenize by participants • COST-HOME file format: http://www.meteo.uni-bonn.de/venema/themes/homogenisation/costhome_fileformat.pdf • For benchmark & COST homogenisation software • One data and one quality-flag file per station • Filename: variable, resolution, quality, station • ASCII network-file with station names • ASCII break-file with dates and station names

  20. COST-HOME file format – monthly data
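The authoritative layout is in the format PDF linked on slide 19; slide 25 describes the monthly files as a regular ASCII matrix with a year column followed by 12 monthly columns. A hypothetical reader/writer for that layout (column widths, missing-value code and the example filename are assumptions):

```python
import numpy as np

MISSING = -999.9  # assumed missing-value code; the real code is defined in the format PDF

def write_monthly(filename, first_year, values):
    """Write monthly data as an ASCII matrix: year column + 12 monthly columns."""
    values = np.asarray(values).reshape(-1, 12)
    years = np.arange(first_year, first_year + values.shape[0])
    with open(filename, "w") as f:
        for year, row in zip(years, values):
            f.write(f"{year:5d}" + "".join(f"{v:9.2f}" for v in row) + "\n")

def read_monthly(filename):
    """Read the same layout back as (years, flat monthly series)."""
    table = np.loadtxt(filename)
    years = table[:, 0].astype(int)
    values = table[:, 1:].ravel()
    return years, values

write_monthly("tm_example.txt", 1900, np.random.randn(100 * 12))
years, values = read_monthly("tm_example.txt")
```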

  21. COST-HOME file format – network file

  22. 8) Analyse the results • Detailed analysis will be performed in the working groups • Detection • Correction • Daily data homogenisation • Synthetic and surrogate data • RMS error • Number of breaks detected (as a function of break size) • Application: reduction of the scatter in the trends • Performance difference between synthetic (Gaussian, white noise) and surrogate data
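A sketch of the two simplest scores named above, assuming numpy: the RMS error of a returned series against the true homogeneous series, and the fraction of inserted breaks with a detection inside a tolerance window (the 12-month window is an assumption).

```python
import numpy as np

def rms_error(homogenised, truth):
    """Root-mean-square difference between returned and true series."""
    return float(np.sqrt(np.mean((homogenised - truth) ** 2)))

def hit_rate(true_breaks, detected_breaks, tolerance_months=12):
    """Fraction of inserted breaks with a detection within the tolerance."""
    true_breaks = np.asarray(true_breaks)
    detected_breaks = np.asarray(detected_breaks)
    if len(true_breaks) == 0:
        return np.nan
    hits = [np.any(np.abs(detected_breaks - b) <= tolerance_months)
            for b in true_breaks]
    return float(np.mean(hits))

truth = np.random.randn(1200)
homogenised = truth + 0.1 * np.random.randn(1200)   # placeholder result
print(rms_error(homogenised, truth))
print(hit_rate([120, 480, 900], [118, 905]))        # 2 of 3 breaks found
```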

  23. Work in progress • Monthly precipitation • Implement some inhomogeneity types • Daily data: other inhomogeneities • Synthetic data (Gaussian white noise) • More input data! • Agree on the details of the benchmark • Next meeting? • Set deadline for the availability of the benchmark • Deadline for the return of the homogenised data

  24. Questions • Ideas for a better benchmark • For example, for other inhomogeneities or constants • Types of inhomogeneities for daily data • Automatic processing • On the order of 100 networks

  25. 7) Homogenize by participants • COST-HOME file format: http://www.meteo.uni-bonn.de/venema/themes/homogenisation/costhome_fileformat.pdf • For benchmark & COST homogenisation software • Regular ASCII matrix (columns) • One data and one quality-flag file per station • Yearly, daily, subdaily data: columns for time, one column for data • Monthly data: year column, 12 columns for data • Filename: variable, resolution, quality, station • ASCII network-file with station names • ASCII break-file with dates and station names
