Archived at http://orgprints.org/8021
Analysis of multivariate genotype - environment data using Non-linear Canonical Correlation Analysis
Hans Pinnschmidt
Danish Institute for Agricultural Sciences, Division of Crop Protection, Cereal Plant Pathology Group, Denmark
Background
BAROF WP1 data: multivariate measurements on 86 spring barley genotypes in 10 environments (2 years: 2002 & 2003; 3 sites: Flakkebjerg, Foulum, Jyndevad; 2 production systems: ecological & conventional).
Objectives
Multivariate characterisation of genotypes, with emphasis on yield-related properties.
Factors: genotype (G1 ... Gi) x environment (E1 ... Ej)
Variables: X1(i,j) ... Xm(i,j)
Parameters: Xm(i)1 ... Xm(i)p and Xm(j)1 ... Xm(j)p
Variables:
• yield
• 1000 grain weight
• grain protein content
• culm length
• date of emergence
• growth duration
• mildew severity
• rust severity
• scald severity
• net blotch severity
• disease diversity
• weed cover
• broken panicles & culms
• lodging
Parameters:
• raw data
• mean/median/max./min.
• rank/relative values
• main effects
• interaction slopes
• raw data adjusted for E/G main effects/slopes (residuals)
• IPCA scores
• SD/variance
From these parameters, derive information on general properties, specificity, and stability/variability.
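Two of the listed parameters, main effects and interaction slopes, can be sketched for a single variable as follows. This is a minimal illustration, not the analysis used for the BAROF data: it assumes a complete genotype-by-environment table, and it assumes that "interaction slopes" means the regression of each genotype's values on the environment main effects (a Finlay-Wilkinson-style joint regression). The function name `main_effects_and_slopes` and the numbers are hypothetical.

```python
import numpy as np

def main_effects_and_slopes(x):
    """x: 2-D array, rows = genotypes, columns = environments (no missing cells)."""
    x = np.asarray(x, dtype=float)
    grand = x.mean()
    g_main = x.mean(axis=1) - grand   # genotype main effects
    e_main = x.mean(axis=0) - grand   # environment main effects
    # Least-squares slope of each genotype's row on the environment main effects
    # (assumed meaning of "interaction slopes"; e_main already has mean zero).
    centred_rows = x - x.mean(axis=1, keepdims=True)
    slopes = centred_rows @ e_main / (e_main @ e_main)
    return g_main, e_main, slopes

# Made-up yields: 3 genotypes (rows) in 3 environments (columns)
toy = [[5.0, 6.0, 7.0],
       [4.0, 6.0, 8.0],
       [6.0, 6.0, 6.0]]
g_main, e_main, slopes = main_effects_and_slopes(toy)
```

Under this definition the slopes average to 1 across genotypes; a slope above 1 indicates a genotype that responds more strongly than average to better environments, and a slope near 0 a genotype that hardly responds at all.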
Non-linear Canonical Correlation Analysis (NCCA): an optimal scaling procedure suited for handling multivariate data of any kind of scaling (numerical/quantitative, ordinal, nominal).
Non-linear Canonical Correlation Analysis (NCCA) data treatment: quantitative variables (vm) were converted into ordinal variables with n categories (v11 ... v1n, ..., vm1 ... vmn).
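The conversion of a quantitative variable into ordinal categories might look like the sketch below. The slides do not say how the category boundaries were chosen, so equal-frequency (quantile) bins are an assumption here, as are the function name `to_ordinal` and the example values.

```python
import numpy as np

def to_ordinal(values, n_categories=5):
    """Map a quantitative variable to ordinal codes 1..n_categories.

    Assumption: equal-frequency (quantile) bins; the source does not
    specify the binning rule.
    """
    values = np.asarray(values, dtype=float)
    # Interior cut points at the 1/n, 2/n, ..., (n-1)/n quantiles
    probs = np.linspace(0, 1, n_categories + 1)[1:-1]
    cuts = np.quantile(values, probs)
    # Number of cut points <= value, shifted so that codes start at 1
    return np.searchsorted(cuts, values, side="right") + 1

yield_values = [3.1, 4.8, 5.2, 2.9, 4.1, 5.9, 3.7, 4.4]  # made-up numbers
codes = to_ordinal(yield_values, n_categories=4)
```

The codes preserve the rank order of the original values, which is the only information an ordinal treatment uses.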
Non-linear Canonical Correlation Analysis (NCCA) is based on multivariate contingency tables containing frequency counts.
[Table layout: rows E1 ... Ej and G1 ... Gi; columns Vm1 ... Vmn]
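A contingency table of frequency counts for one categorised variable can be assembled as in the following sketch. It assumes category codes run from 1 to n; the function name `contingency_table` and the labels are hypothetical.

```python
import numpy as np

def contingency_table(row_labels, category_codes, n_categories):
    """Count how often each category (1..n_categories) occurs per row label."""
    labels, row_idx = np.unique(row_labels, return_inverse=True)
    col_idx = np.asarray(category_codes) - 1       # codes 1..n -> columns 0..n-1
    table = np.zeros((len(labels), n_categories), dtype=int)
    np.add.at(table, (row_idx, col_idx), 1)        # accumulate frequency counts
    return labels, table

# Made-up example: six observations from two environments, three categories
env = ["E1", "E1", "E1", "E2", "E2", "E2"]
cat = [1, 1, 3, 2, 3, 3]
labels, table = contingency_table(env, cat, n_categories=3)
```

Calling the same function with genotype labels instead of environment labels gives the genotype rows of the table.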
Non-linear Canonical Correlation Analysis (NCCA):
• main “dimensions” (≈ principal components) are determined
• “loadings” of variables (≈ overall correlation) are computed
• “category centroids” are quantified
• “object scores” (≈ principal component scores) are computed
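To make "object scores" and "category centroids" concrete, the sketch below uses multiple correspondence analysis (homogeneity analysis) of an indicator matrix, computed with a singular value decomposition. This is a simplified relative of NCCA swapped in for illustration; it is not the NCCA algorithm itself and does not compute loadings. All function names and the toy data are hypothetical.

```python
import numpy as np

def indicator_matrix(codes):
    """codes: (objects x variables) integer categories -> 0/1 indicator matrix."""
    codes = np.asarray(codes)
    blocks, keys = [], []
    for v in range(codes.shape[1]):
        cats = np.unique(codes[:, v])              # only categories that occur
        blocks.append((codes[:, [v]] == cats[None, :]).astype(float))
        keys += [(v, int(c)) for c in cats]        # (variable index, category)
    return np.hstack(blocks), keys

def object_scores_and_centroids(codes, n_dims=2):
    z, keys = indicator_matrix(codes)
    p = z / z.sum()                                # correspondence matrix
    r, c = p.sum(axis=1), p.sum(axis=0)            # row and column masses
    # Standardised residuals from the independence model
    s = (p - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    u, sing, _ = np.linalg.svd(s, full_matrices=False)
    # Object scores: one row per object, standardised per dimension
    scores = u[:, :n_dims] / np.sqrt(r)[:, None]
    # Category centroid: mean object score of the objects in that category
    centroids = (z.T @ scores) / z.sum(axis=0)[:, None]
    return scores, dict(zip(keys, centroids)), sing[:n_dims]

# Made-up ordinal codes: 6 objects x 3 variables
toy_codes = [[1, 1, 2], [1, 2, 2], [2, 2, 1],
             [3, 3, 1], [3, 3, 3], [2, 1, 3]]
scores, centroids, sing = object_scores_and_centroids(toy_codes)
```

Plotting the category centroids in the same space as the object scores gives the kind of display in which categories such as "high yield" or "low mildew" sit close to the objects that exhibit them.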
Characterisation of environments based on data adjusted for G main effects (= residuals)
• Flakkebjerg 2002: high rust & 1000 grain weight; late sowing
• Foulum 2002 conventional & Jyndevad 2003 ecological: high mildew & lodging
• Jyndevad 2002 ecological: low yield, 1000 grain weight, weed infestation, protein content
• Flakkebjerg 2003: high yield, net blotch & panicle breakage; low mildew & lodging
Characterisation of genotypes based on data adjusted for E main effects (= residuals)
[Figure. Axes: dimension 1 (sq. root), dimension 5 (sq. root). Plot annotations:]
• high yield & 1000 grain weight
• low protein content & lodging
• high mildew
• low net blotch & disease diversity
• low yield & 1000 grain weight
• low mildew
Characterisation of genotypes & environments based on:
• raw data
• data adjusted for E main effects
• data adjusted for G & E main effects (≈ G x E interaction)
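The three data bases listed above differ only in which main effects are subtracted. A minimal sketch for one variable, assuming a complete genotype-by-environment table and a simple additive model, is given below; the function name `adjust_for_main_effects` and the numbers are hypothetical.

```python
import numpy as np

def adjust_for_main_effects(x, remove_g=False, remove_e=False):
    """x: rows = genotypes, columns = environments (complete table assumed)."""
    x = np.asarray(x, dtype=float)
    grand = x.mean()
    out = x.copy()
    if remove_e:
        out -= x.mean(axis=0, keepdims=True) - grand   # environment main effects
    if remove_g:
        out -= x.mean(axis=1, keepdims=True) - grand   # genotype main effects
    return out

raw = np.array([[5.0, 6.5, 7.0],
                [4.0, 6.0, 8.5],
                [6.0, 5.5, 6.5]])                      # made-up values
e_adjusted = adjust_for_main_effects(raw, remove_e=True)
ge_adjusted = adjust_for_main_effects(raw, remove_g=True, remove_e=True)
```

After both main effects are removed, every row mean and column mean equals the grand mean, so the remaining variation is the interaction (plus error) term of the additive model.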
[Figure. Plot annotations, two contrasting groups:]
• low yield, 1000 grain weight, weed infestation & net blotch; high mildew; high rust; late emergence
• high yield, 1000 grain weight & net blotch; low mildew; low rust; short culms; early emergence
[Figure. Plot annotations:]
• high yield & 1000 grain weight
• low protein content & lodging
• low net blotch & disease diversity
• high mildew
• low yield & 1000 grain weight
• high protein content
[Figure. Plot annotations:]
• little lodging
• high panicle breakage
• high yield & 1000 grain weight
• low protein content
• low yield & 1000 grain weight
• high protein content
• much lodging
Conclusions & outlook
• NCCA is an “intuitive” method, good for “visualising” the main features in multivariate data of various scales.
• NCCA is useful for obtaining an overall orientation of G properties and E characteristics.
Future work:
• Refinements to obtain a better synopsis of E-specific performance of G’s as related to their property profiles.
• Include AMMI and clustering (biclassification) results in NCCA; organise data as environment-specific sets of variables.
Characterisation of genotype performance in individual environments based on:
• raw yield and disease data
• disease main effects of G’s
• environmental disease variability of G’s (= standard deviation of E-adjusted data)
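The environmental disease variability defined above, the standard deviation of E-adjusted data, can be sketched as follows. The sample standard deviation (`ddof=1`) is an assumption, as the slides do not state which form was used; the function name `environmental_variability` and the severity values are hypothetical.

```python
import numpy as np

def environmental_variability(x):
    """Per-genotype SD of data adjusted for environment main effects.

    x: rows = genotypes, columns = environments (complete table assumed).
    """
    x = np.asarray(x, dtype=float)
    e_adjusted = x - x.mean(axis=0, keepdims=True)   # remove environment means
    return e_adjusted.std(axis=1, ddof=1)            # assumption: sample SD

# Made-up disease severities: 3 genotypes (rows) in 3 environments (columns)
severity = [[10.0, 20.0, 30.0],
            [12.0, 18.0, 33.0],
            [ 8.0, 22.0, 27.0]]
variability = environmental_variability(severity)
```

A genotype that simply tracks the environment means, like the first row here, gets a variability of zero, while genotypes whose disease levels deviate from the environment means in changing directions get larger values.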