Classifications of circulation patterns from the COST733 database: An assessment of synoptic-climatological applicability by two-sample Kolmogorov-Smirnov test. Radan HUTH, Monika CAHYNOVÁ, Institute of Atmospheric Physics, Prague, Czech Republic, huth@ufa.cas.cz
COST733 database (collection) • COST733 Action – “Harmonization and Applications of Weather Types Classifications for European Regions” • (very) large number of classifications produced • on unified data • SLP at 12 UTC • ERA40 (Sep 1957 – Aug 2002) • ~9, ~18, ~27 types wherever possible • 12 European domains
COST733 database (collection) • version 2.0 of the database • released this spring • 18 methods for each domain • threshold-based: GWT (Beck), Litynski, Lamb (Jenkinson-Collison), P27 (Kruizinga), WLK • leader algorithm: Lund, Kirchhofer, Erpicum • PCA-based: T-mode PCA • optimization algorithms: CKMEANS, PCACA (k-means), Petisco, PCAXTRKMS, SANDRA, SANDRA-S, NNW (SOMs), PCAXTR • pseudo-random: random centroids • plus 7 subjective and objectivized classifications not attributable to any domain • ignored today
COST733 database (collection) • different attributes of classifications • number of types (9 x 18 x 27) • sequencing (no vs. 4-day sequences) • seasonal vs. year-round definition • variable: all based on SLP, several additional variables used
GOAL • assess the synoptic-climatological applicability of classifications • i.e., how well they stratify surface weather (climate) conditions • demonstrate effect of • sequencing • seasonal vs. annual definition • adding more variables • 500 hPa height • 500 hPa vorticity • 850/500 hPa thickness • number of types
Classifications examined • 11 methods • 30 classifications available for each of them • differing in • sequencing (no x 4 days) • additional variables (Z500, THICK850/500, VOR500, all together) • number of types (9, 18, 27) • 5 methods • additional 6 classifications available • differing in • seasonality of definition (year-round x seasonal)
TOOL • 2-sample Kolmogorov-Smirnov test • tests the equality of the distribution of the climate element under one type against its distribution under all the other types
TOOL • at each station • types for which the K-S test rejects the equality of distributions are counted • the larger the count, the better the stratification, the better the synoptic-climatological applicability
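The counting procedure above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the significance level `alpha=0.05`, the function names, and the toy data are hypothetical, and the p-value uses the standard asymptotic Kolmogorov series with a small-sample correction rather than an exact distribution.

```python
import math

def ks_2samp(a, b):
    """Two-sample K-S statistic D and an asymptotic two-sided p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between the two ECDFs.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d   # small-sample correction (approximation)
    if lam < 0.3:                        # series converges poorly here; p is ~1
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

def count_separated_types(values, types, alpha=0.05):
    """Count types whose distribution differs from that of all other days.

    values: climate element (e.g. daily Tmax) at one station
    types:  circulation type assigned to each day (same length as values)
    alpha:  assumed significance level (not stated in the slides)
    """
    count = 0
    for t in sorted(set(types)):
        inside = [v for v, c in zip(values, types) if c == t]
        outside = [v for v, c in zip(values, types) if c != t]
        if not inside or not outside:
            continue
        _, p = ks_2samp(inside, outside)
        if p < alpha:
            count += 1
    return count

# Hypothetical toy data: two types with clearly different temperatures.
tmax = list(range(30)) + list(range(100, 130))
day_type = ["A"] * 30 + ["B"] * 30
print(count_separated_types(tmax, day_type))
```

A larger count for a classification at a station means that more of its types are associated with a distinct distribution of the climate element.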
ANALYSIS • preliminary results • maximum temperature (minimum temperature – very similar results) (precipitation – different) • domain 07 (central Europe) • 39 stations from ECA&D database • winter (DJF) • Jan 1961 – Dec 2000
RANKING OF CLASSIFICATIONS • at all stations individually: • for each classification: number of rejected K-S tests counted • classifications ranked by the %age of rejected K-S tests (= well separated classes) • higher %age is better, i.e., gets a lower rank • for each classification: ranks averaged over stations • the area mean rank gives the ranking of the classification
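The ranking step can be illustrated as follows. This is a minimal sketch with made-up numbers: the station and classification names are hypothetical, and giving tied classifications their average rank is an assumption, since the slides do not say how ties are handled.

```python
def rank_descending(scores):
    """Rank so that the highest score gets rank 1; ties share the average rank."""
    order = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of classifications tied with order[i].
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def area_mean_ranks(pct_rejected):
    """pct_rejected[station][classification] = % of rejected K-S tests.

    Returns the rank of each classification averaged over all stations.
    """
    collected = {}
    for scores in pct_rejected.values():
        for name, r in rank_descending(scores).items():
            collected.setdefault(name, []).append(r)
    return {name: sum(r) / len(r) for name, r in collected.items()}

# Hypothetical example: two stations, three classifications.
pct = {
    "S1": {"A": 80.0, "B": 60.0, "C": 60.0},
    "S2": {"A": 50.0, "B": 70.0, "C": 30.0},
}
for name, r in sorted(area_mean_ranks(pct).items(), key=lambda kv: kv[1]):
    print(name, r)
```

Sorting the classifications by their area mean rank then gives the final ordering, with the lowest mean rank being the best.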
Result 1: comparison of methods • area mean ranks averaged over 30 realizations of each method • result: order of the method, independent of any attribute (no. of types, sequencing, variable)
Result 1: comparison of methods so the winner is…
Result 1: comparison of methods NOTE: not all methods participated in the race!
Result 2: sensitivity to the number of types • all pairs of classifications • differing in no. of types • 9 vs. 18 • 18 vs. 27 • with all other attributes equal • difference in rank is calculated • histogram of differences • t-test: equality of the difference to zero • difference: -106 ± 17
Result 2: sensitivity to the number of types • all pairs of classifications • differing in no. of types • 9 vs. 18 • 18 vs. 27 • with all other attributes equal • difference in rank is calculated • histogram of differences • t-test: equality of the difference to zero • difference: -55 ± 12
Result 3: effect of sequencing • all pairs of classifications • differing in sequencing (no vs. 4-days) • with all other attributes equal • difference in rank is calculated • histogram of differences • t-test: equality of the difference to zero • difference: -30 ± 11
Result 4: effect of seasonality • all pairs of classifications • differing in the seasonality in their definition • with all other attributes equal • difference in rank is calculated • histogram of differences • t-test: equality of the difference to zero • difference: -44 ± 24
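The paired comparison used in Results 2 to 4 (pair classifications that differ in one attribute only, take the rank difference, test its mean against zero) might look like the sketch below. The record layout, the attribute names, and the sign convention (rank of `first` minus rank of `second`) are assumptions made for illustration; the ranks are invented. The t statistic would still have to be compared with a critical value of Student's t distribution with n − 1 degrees of freedom.

```python
import math
from statistics import mean, stdev

def paired_differences(records, attribute, first, second):
    """Rank differences between classifications differing only in `attribute`.

    records: list of dicts with hashable attribute values plus a "rank" entry
    Returns rank(first) - rank(second) for every matched pair.
    """
    groups = {}
    for rec in records:
        # Key on all remaining attributes so that only `attribute` varies in a group.
        key = tuple(sorted((k, v) for k, v in rec.items()
                           if k not in (attribute, "rank")))
        groups.setdefault(key, {})[rec[attribute]] = rec["rank"]
    return [g[first] - g[second] for g in groups.values()
            if first in g and second in g]

def one_sample_t(diffs):
    """Mean difference, its standard error, and the t statistic against zero."""
    n = len(diffs)
    m = mean(diffs)
    se = stdev(diffs) / math.sqrt(n)
    return m, se, m / se

# Hypothetical area mean ranks for three methods with 9 and 18 types.
records = [
    {"method": "A", "seq": "no", "types": 9, "rank": 10.0},
    {"method": "A", "seq": "no", "types": 18, "rank": 30.0},
    {"method": "B", "seq": "no", "types": 9, "rank": 20.0},
    {"method": "B", "seq": "no", "types": 18, "rank": 50.0},
    {"method": "C", "seq": "no", "types": 9, "rank": 15.0},
    {"method": "C", "seq": "no", "types": 18, "rank": 25.0},
]
diffs = paired_differences(records, "types", 9, 18)
print(one_sample_t(diffs))
```

With this convention, a negative mean difference says that the `first` variant tends to receive the lower, i.e. better, rank.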
Result 5: effect of additional variables +68 ± 18 +42 ± 24
Result 5: effect of additional variables +41 ± 18 +61 ± 19
CONCLUSIONS • various kinds of cluster analysis perform well • fewer types give better performance • sequencing adds value: surface temperature is better described by types of 4-day sequences than by types of instantaneous fields • seasonal definition is better than annual, but: • systematic difference in the number of types (7 vs. 9) • additional variables bring no benefit; in fact they worsen the synoptic-climatological applicability
OUTLOOK • analysis to be extended to • all domains • more variables (Tmin, Precip) • more comparisons will be possible, so results may be more general • several other criteria as well • other datasets (gridded: ENSEMBLES, reanalyses)