
Quality control and homogenization of the COST benchmark dataset

Petr Štěpánek, Pavel Zahradníček. Czech Hydrometeorological Institute, regional office Brno. E-mail: petr.stepanek@chmi.cz, zahradnicek@chmi.cz. Processing before any data analysis. Software: AnClim, ProClimDB.


Presentation Transcript


  1. Quality control and homogenization of the COST benchmark dataset. Petr Štěpánek, Pavel Zahradníček, Czech Hydrometeorological Institute, regional office Brno. E-mail: petr.stepanek@chmi.cz, zahradnicek@chmi.cz

  2. Processing before any data analysis. Software: AnClim, ProClimDB

  3. Data quality control: finding outliers. Two main approaches:
     • using limits derived from interquartile ranges (time series analysis)
     • comparing values with those of neighbouring stations (spatial analysis)
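The first approach on this slide can be sketched in a few lines of Python. This is only an illustration, not the AnClim/ProClimDB implementation: the function name `iqr_outliers`, the multiplier `k`, and the sample values are assumptions, since the slides do not state which multiple of the interquartile range is used.

```python
from statistics import quantiles

def iqr_outliers(values, k=3.0):
    """Return the indices of values outside [Q1 - k*IQR, Q3 + k*IQR].

    k is a hypothetical multiplier; the slides do not give the limit used.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Made-up January temperatures (°C) for one station; index 6 is a gross error.
january_temps = [-2.1, -1.5, -3.0, -0.8, -2.4, -1.9, 9.5, -2.7, -1.2, -2.0]
suspects = iqr_outliers(january_temps)
```

In practice the limits would presumably be computed separately for each calendar month, so that the seasonal cycle does not inflate the interquartile range.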

  4. Creating reference series
     • for monthly data
     • weighted or unweighted mean of neighbouring stations
     • inverse distance weighting (IDW): the power of the weight is 1 for temperature (1/d) and 3 for precipitation (1/d^3)
     • criteria used for station selection (alone or in combination):
       • best correlated or nearest neighbours (correlations computed from the first-differenced series)
       • limit on correlation, limit on distance
       • limit on the difference in altitude
     • neighbouring station series should be standardized to the test series AVG and/or STD, or to its altitude
     • comparison with the "expected" value (calculated as the weighted mean of the standardized neighbours' values)
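A minimal sketch of the weighting and standardization described on this slide, assuming complete series of equal length and distances in kilometres. The function names are hypothetical, and standardizing to both the mean and the standard deviation is only one of the options the slide lists.

```python
from statistics import mean, pstdev

def idw_weights(distances_km, power=1.0):
    """Inverse distance weights 1/d**power (power 1 for temperature, 3 for precipitation)."""
    return [1.0 / d ** power for d in distances_km]

def standardize_to_test(neighbour, test):
    """Rescale a neighbour series to the mean (AVG) and standard deviation (STD) of the test series."""
    n_avg, n_std = mean(neighbour), pstdev(neighbour)
    t_avg, t_std = mean(test), pstdev(test)
    if n_std == 0:
        return [t_avg for _ in neighbour]
    return [t_avg + (v - n_avg) * t_std / n_std for v in neighbour]

def reference_series(test, neighbours, distances_km, power=1.0):
    """Weighted mean of the standardized neighbour series, value by value."""
    weights = idw_weights(distances_km, power)
    scaled = [standardize_to_test(n, test) for n in neighbours]
    total = sum(weights)
    return [
        sum(w * s[t] for w, s in zip(weights, scaled)) / total
        for t in range(len(test))
    ]

# Made-up example: two neighbours at 10 km and 20 km from the test station.
test = [1.0, 2.0, 3.0, 4.0]
neighbours = [[v + 5.0 for v in test], [2.0 * v for v in test]]
expected = reference_series(test, neighbours, [10.0, 20.0], power=1.0)
```

The "expected" value for a given month is then simply the corresponding element of `expected`, which can be compared with the measured value when looking for outliers.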

  5. Example: Proposed list of stations used for creating reference series

  6. "Outliers": temperature, sur1, network 1
     • 12 "outliers" detected
     • 10 errors for station 150 (5 in the year 1909)
     • the mean difference between the measured outliers and the expected value is about 6 °C

  7. "Outliers": precipitation, sur1, network 1
     • 8 "outliers" detected
     • the mean difference between the measured outliers and the expected value is about 180 mm
     • the maximum difference is 313 mm (station 4307012, 8/1971)

  8. Months, seasons, year

  9. Creating reference series
     • for monthly data
     • weighted or unweighted mean of neighbouring stations
     • criteria used for station selection (alone or in combination):
       • best correlated or nearest neighbours (correlations computed from the first-differenced series)
       • limit on correlation, limit on distance
       • limit on the difference in altitude
     • neighbouring station series should be standardized to the test series AVG and/or STD (temperature: elevation, precipitation: variance); missing data are then less of a problem
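Selecting the best correlated neighbours from first-differenced series could look roughly like the sketch below. The correlation limit, the maximum number of neighbours, and all names are hypothetical; the slides do not give the actual thresholds. Differencing is commonly used here because a step change in one series then affects only a single difference rather than the whole correlation.

```python
from math import sqrt

def first_differences(series):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(series, series[1:])]

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def select_neighbours(test, candidates, min_corr=0.8, max_count=5):
    """Rank candidate stations by the correlation of their first differences with the test series.

    candidates maps a station id to a series aligned with the test series.
    min_corr and max_count are assumed limits, not values from the slides.
    """
    d_test = first_differences(test)
    scored = [
        (station, pearson(d_test, first_differences(series)))
        for station, series in candidates.items()
    ]
    kept = [(s, r) for s, r in scored if r >= min_corr]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:max_count]

# Made-up example: station "A" tracks the test series, station "B" does not.
test = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
candidates = {
    "A": [v + 10.0 for v in test],
    "B": [7.0 - v for v in test],
}
chosen = select_neighbours(test, candidates)
```

Limits on distance and on the difference in altitude would be applied as additional filters on the candidate list before or after this ranking.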

  10. Relative homogeneity testing
      • test series: 40 years
      • longer series are divided into several sections with a 10-year overlap
      • tests: SNHT, bivariate test, t-test
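As an illustration of the first test named on this slide, the sketch below computes the single-shift SNHT statistic in its usual textbook form, T(k) = k·z̄₁² + (n − k)·z̄₂², on a series of differences between the test and reference series, and splits a long period into 40-year sections with a 10-year overlap. The function names, the use of the sample standard deviation, and the handling of the last window are assumptions; critical values are deliberately left out.

```python
from statistics import mean, stdev

def snht(series):
    """Return (T0, k): the maximum SNHT statistic and the number of values before the shift.

    The series is standardized with its mean and sample standard deviation (an assumption).
    """
    n = len(series)
    avg, sd = mean(series), stdev(series)
    if sd == 0:
        return 0.0, None
    z = [(v - avg) / sd for v in series]
    best_t, best_k = 0.0, None
    for k in range(1, n):
        z1 = mean(z[:k])   # mean of the standardized values before the candidate shift
        z2 = mean(z[k:])   # mean of the standardized values after it
        t = k * z1 ** 2 + (n - k) * z2 ** 2
        if t > best_t:
            best_t, best_k = t, k
    return best_t, best_k

def overlapping_windows(first_year, last_year, length=40, overlap=10):
    """Split [first_year, last_year] into sections of `length` years sharing `overlap` years.

    The last section is aligned to last_year, so it may overlap its predecessor by more.
    """
    if last_year - first_year + 1 <= length:
        return [(first_year, last_year)]
    step = length - overlap
    windows = []
    start = first_year
    while start + length - 1 < last_year:
        windows.append((start, start + length - 1))
        start += step
    windows.append((last_year - length + 1, last_year))
    return windows

# Made-up difference series with a 1.0 shift after the tenth value.
differences = [0.0] * 10 + [1.0] * 10
t0, position = snht(differences)
sections = overlapping_windows(1900, 1999)
```

A break would be reported only when `t0` exceeds the critical value for the chosen significance level and series length, which has to be taken from published tables.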

  11. Example of detected breaks: temperature, sur1, network 1 (63 breaks detected)
      [Figure: station no. 50, breaks in 1928 and 1975. Panels: test and reference series; difference between test and reference series; test statistics]

  12. Station no. 100, break 1983

  13. Example of detected breaks: precipitation, sur1, network 1 (10 breaks detected)
      [Figure: station no. 4309900, break 1909; station no. 4311803, break 1991]

  14. Adjusting monthly data
      • using reference series based on distance
      • the power of the weight is 0.5 for temperature and 1 for precipitation
      • adjustment: from differences (or ratios) 20 years before and after a change, computed monthly
      • smoothing of the monthly adjustments (low-pass filter over adjacent values)
      [Figure: station no. 100, break 1983; station no. 50, break 1928]
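The adjustment step for temperature (differences) might be sketched as follows; for precipitation the same logic would use ratios instead. Several details are assumptions rather than anything stated on the slide: the break year is treated as the first year of the new regime, the adjustment is the value to add to the segment before the break, and the low-pass filter uses circular weights of 0.25, 0.5, 0.25 so that December and January are neighbours.

```python
from statistics import mean

def monthly_adjustments(diff_by_year, break_year, window=20):
    """Monthly adjustments to add to values before the break.

    diff_by_year maps a year to 12 monthly (test - reference) differences.
    break_year is assumed to be the first year after the change.
    """
    before = [diff_by_year[y] for y in range(break_year - window, break_year)
              if y in diff_by_year]
    after = [diff_by_year[y] for y in range(break_year, break_year + window)
             if y in diff_by_year]
    if not before or not after:
        raise ValueError("no data on one side of the break")
    return [
        mean(row[m] for row in after) - mean(row[m] for row in before)
        for m in range(12)
    ]

def smooth_circular(adjustments, weights=(0.25, 0.5, 0.25)):
    """Low-pass filter over adjacent months, wrapping around the year (assumed weights)."""
    n = len(adjustments)
    return [
        weights[0] * adjustments[(m - 1) % n]
        + weights[1] * adjustments[m]
        + weights[2] * adjustments[(m + 1) % n]
        for m in range(n)
    ]

# Made-up example: the test series reads 0.5 °C too low before 1983.
diffs = {y: [-0.5] * 12 for y in range(1963, 1983)}
diffs.update({y: [0.0] * 12 for y in range(1983, 2003)})
raw = monthly_adjustments(diffs, break_year=1983)
smoothed = smooth_circular(raw)
```

Smoothing keeps the adjustments from jumping between adjacent months because of sampling noise in the 20-year means.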

  15. Adjusting values: evaluation
      • after adjustment the correlation must increase; if it does not, the series is not adjusted
      [Figure panels: temperature; precipitation]
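The acceptance rule on this slide can be expressed as a simple comparison. The sketch assumes that the correlation in question is the one between the test series and its reference series, which the slide does not state explicitly; the function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def accept_adjustment(original, adjusted, reference):
    """Keep the adjusted series only if it correlates better with the reference (assumed criterion)."""
    return pearson(adjusted, reference) > pearson(original, reference)

# Made-up example: the original series has a +1.0 step from the fourth value onwards.
reference = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
original = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
adjusted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
keep = accept_adjustment(original, adjusted, reference)
```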

  16. Absolute values of adjustments for temperature, sur1, network 1

  17. Iterative homogeneity testing
      • several iterations of testing and evaluation of the results
      • several iterations of homogeneity testing and series adjustment (3 iterations should be sufficient)
      • the question of the homogeneity of the reference series is thus solved:
        • possible inhomogeneities should be eliminated by using averages of several neighbouring stations
        • if this is not the case, the neighbours should already be homogenized in the next iteration

  18. Example – homogenized temperature series Station no. 50 Station no. 100

  19. Example – homogenized precipitation series Station no. 4309900, break 1909 Station no. 4311803, break 1991

  20. http://www.climahom.eu
