
Analysis of multivariate genotype - environment data

Archived at http://orgprints.org/8021. Analysis of multivariate genotype - environment data using Non-linear Canonical Correlation Analysis. Hans Pinnschmidt, Danish Institute for Agricultural Sciences, Division of Crop Protection, Cereal Plant Pathology Group, Denmark.


Presentation Transcript


  1. Archived at http://orgprints.org/8021. Analysis of multivariate genotype - environment data using Non-linear Canonical Correlation Analysis. Hans Pinnschmidt, Danish Institute for Agricultural Sciences, Division of Crop Protection, Cereal Plant Pathology Group, Denmark.

  2. Background: BAROF WP1 data: multivariate measurements on 86 spring barley genotypes in 10 environments (2 years: 2002 & 2003; 3 sites: Flakkebjerg, Foulum, Jyndevad; 2 production systems: ecological & conventional). Objectives: multivariate characterisation of genotypes with emphasis on yield-related properties.

  3. Factors: genotype (G1 ... Gi) and environment (E1 ... Ej).
     Variables: X1(i,j) ... Xm(i,j).
     Parameters: Xm(i)1 ... Xm(i)p and Xm(j)1 ... Xm(j)p.
     Variables:
     • yield
     • 1000 grain weight
     • grain protein contents
     • culm length
     • date of emergence
     • growth duration
     • mildew severity
     • rust severity
     • scald severity
     • net blotch severity
     • disease diversity
     • weed cover
     • broken panicles & culms
     • lodging
     Parameters:
     • raw data
     • mean/median/max./min.
     • rank/relative values
     • main effects
     • interaction slopes
     • raw data adjusted for E/G main effects/slopes (residuals)
     • IPCA scores
     • SD/variance
     The parameters are used to derive information on general properties, specificity, and stability/variability.
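The "main effects" parameter listed above can be illustrated with a minimal sketch. It assumes a complete, balanced genotype-by-environment table and defines a main effect as a row or column mean minus the grand mean; the slides do not state the exact estimator used, and the table values and the function name `main_effects` are invented for illustration.

```python
from statistics import mean

# Hypothetical yield table: outer keys = genotypes, inner keys = environments.
# The numbers are invented for illustration only.
yields = {
    "G1": {"E1": 4.0, "E2": 6.0},
    "G2": {"E1": 5.0, "E2": 9.0},
}

def main_effects(table):
    """Return (grand mean, genotype effects, environment effects).

    Assumes every genotype was observed in every environment.
    """
    genotypes = list(table)
    environments = list(next(iter(table.values())))
    grand = mean(table[g][e] for g in genotypes for e in environments)
    # Genotype main effect: genotype mean minus grand mean.
    g_eff = {g: mean(table[g].values()) - grand for g in genotypes}
    # Environment main effect: environment mean minus grand mean.
    e_eff = {e: mean(table[g][e] for g in genotypes) - grand
             for e in environments}
    return grand, g_eff, e_eff

grand, g_eff, e_eff = main_effects(yields)
```

With unbalanced data (missing genotype-environment cells), simple means like these would be biased, and a fitted linear model would be the more usual choice.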

  4. Non-linear Canonical Correlation Analysis (NCCA): an optimal scaling procedure suited to multivariate data measured on any scale (numerical/quantitative, ordinal, nominal).

  5. Non-linear Canonical Correlation Analysis (NCCA) data treatment: quantitative variables (vm) were converted into ordinal variables with n categories (v11 ... v1n, ..., vm1 ... vmn).
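The conversion of a quantitative variable into an ordinal one with n categories could look roughly like the sketch below. It assumes rank-based, equal-frequency binning; the slides do not say which binning rule was actually applied, and the function name `to_ordinal` and the example values are hypothetical.

```python
def to_ordinal(values, n_categories):
    """Map numeric values to ordinal categories 1..n_categories by rank.

    Assumption: equal-frequency binning. Tied values are split by their
    original order, which a real analysis may want to handle differently.
    """
    order = sorted(range(len(values)), key=lambda k: values[k])
    categories = [0] * len(values)
    for rank, k in enumerate(order):
        # Integer arithmetic spreads the ranks evenly over the categories.
        categories[k] = rank * n_categories // len(values) + 1
    return categories

culm_length = [62.0, 75.5, 58.1, 80.2, 69.9, 71.3]  # invented example values
ordinal = to_ordinal(culm_length, 3)
```

Equal-width bins would be an equally plausible reading of the slide; the choice mainly affects how evenly the categories are filled.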

  6. Non-linear Canonical Correlation Analysis (NCCA) is based on multivariate contingency tables containing frequency counts. [Diagram: table of variable categories Vm1 ... Vmn against environments E1 ... Ej and genotypes G1 ... Gi.]
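As a rough illustration of such frequency counts, the sketch below tabulates how often each genotype, or each environment, falls into each category of one ordinal variable. The record layout, the function name `frequency_table`, and all values are assumptions made for this example, not the layout used in the original analysis.

```python
from collections import Counter

# Hypothetical ordinal scores (1 = low ... 3 = high) of one variable,
# stored as (genotype, environment, category); values are invented.
records = [
    ("G1", "E1", 1), ("G1", "E2", 1), ("G1", "E3", 2),
    ("G2", "E1", 3), ("G2", "E2", 2), ("G2", "E3", 3),
]

def frequency_table(records, by):
    """Count category frequencies per genotype (by=0) or environment (by=1)."""
    counts = Counter((rec[by], rec[2]) for rec in records)
    levels = sorted({rec[by] for rec in records})
    categories = sorted({rec[2] for rec in records})
    # One row per factor level, one column per category.
    return {lvl: [counts[(lvl, c)] for c in categories] for lvl in levels}

by_genotype = frequency_table(records, by=0)
by_environment = frequency_table(records, by=1)
```

Repeating this for every variable and placing the tables side by side gives a multivariate table of the kind the slide describes.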

  7. Non-linear Canonical Correlation Analysis (NCCA):
     • main “dimensions” (principal components) are determined
     • “loadings” of variables (overall correlation) are computed
     • “category centroids” are quantified
     • “object scores” (principal component scores) are computed

  8. Characterisation of environments, based on data adjusted for G main effects (= residuals)

  9. [Plot annotations]
     • Flakkebjerg 2002: high rust & 1000 grain weight; late sowing
     • Foulum 2002 conventional & Jyndevad 2003 ecological: high mildew & lodging
     • Jyndevad 2002 ecological: low yield, 1000 grain weight, weed infestation, protein content
     • Flakkebjerg 2003: high yield, net blotch & panicle breakage; low mildew & lodging

  10. Characterisation of genotypes, based on data adjusted for E main effects (= residuals)

  11. [Plot; axes: dimension 1 (sq. root), dimension 5 (sq. root). Annotations: high yield & 1000 grain weight; low protein content & lodging; high mildew; low net blotch & disease diversity; low yield & 1000 grain weight; low mildew]

  12. Characterisation of genotypes & environments based on:
     • raw data
     • data adjusted for E main effects
     • data adjusted for G & E main effects (G x E interaction)
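The three data versions named on this slide can be sketched as follows, assuming a complete table and simple mean-based adjustment: subtracting environment means gives the E-adjusted data, and double-centring gives the residuals left after removing both G and E main effects. The estimator, the function name `adjusted_tables`, and the numbers are assumptions; the slides do not spell out the computation.

```python
from statistics import mean

# Invented raw data: outer keys = genotypes, inner keys = environments.
raw = {
    "G1": {"E1": 4.0, "E2": 6.0},
    "G2": {"E1": 5.0, "E2": 9.0},
}

def adjusted_tables(table):
    """Return (E-adjusted data, G- and E-adjusted data) for a complete table."""
    genotypes = list(table)
    environments = list(next(iter(table.values())))
    grand = mean(table[g][e] for g in genotypes for e in environments)
    g_mean = {g: mean(table[g].values()) for g in genotypes}
    e_mean = {e: mean(table[g][e] for g in genotypes) for e in environments}
    # Adjusted for E main effects: deviation from the environment mean.
    e_adj = {g: {e: table[g][e] - e_mean[e] for e in environments}
             for g in genotypes}
    # Adjusted for G and E main effects (double-centring): what remains is
    # the G x E interaction plus any measurement error.
    ge_adj = {g: {e: table[g][e] - e_mean[e] - g_mean[g] + grand
                  for e in environments}
              for g in genotypes}
    return e_adj, ge_adj

e_adjusted, ge_adjusted = adjusted_tables(raw)
```

Without replication, the double-centred residuals cannot separate true interaction from error, which is worth keeping in mind when interpreting them.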

  13. [Plot annotations: low yield, 1000 grain weight, weed infestation & net blotch; high mildew; high rust; late emergence; high yield, 1000 grain weight & net blotch; low mildew; low rust; short culms; early emergence]

  14. [Plot annotations: high yield & 1000 grain weight; low protein content & lodging; low net blotch & disease diversity; high mildew; low yield & 1000 grain weight; high protein content]

  15. [Plot annotations: little lodging; high panicle breakage; high yield & 1000 grain weight; low protein content; low yield & 1000 grain weight; high protein content; much lodging]

  16. Conclusions & outlook
     • NCCA is an “intuitive” method, good for “visualising” the main features in multivariate data of various scales.
     • NCCA is useful for obtaining an overall orientation of G properties and E characteristics.
     Future work:
     • Refinements to obtain a better synopsis of E-specific performance of G’s as related to their property profiles.
     • Include AMMI and clustering (biclassification) results in NCCA; organise data as environment-specific sets of variables.

  17. Characterisation of genotype performance in individual environments based on:
     • raw yield and disease data
     • disease main effects of G’s
     • environmental disease variability of G’s (= standard deviation of E adjusted data)
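The "environmental disease variability" measure, read here as the standard deviation of each genotype's E-adjusted values across environments, might be computed as in the sketch below. It assumes a complete table, adjustment by subtracting environment means, and the sample standard deviation; the function name `environmental_variability` and the mildew scores are invented.

```python
from statistics import mean, stdev

# Invented mildew severity scores: outer keys = genotypes, inner = environments.
mildew = {
    "G1": {"E1": 2.0, "E2": 4.0, "E3": 6.0},
    "G2": {"E1": 1.0, "E2": 5.0, "E3": 9.0},
    "G3": {"E1": 3.0, "E2": 3.0, "E3": 6.0},
}

def environmental_variability(table):
    """Per-genotype sample SD of data adjusted for environment means."""
    genotypes = list(table)
    environments = list(next(iter(table.values())))
    e_mean = {e: mean(table[g][e] for g in genotypes) for e in environments}
    return {
        g: stdev([table[g][e] - e_mean[e] for e in environments])
        for g in genotypes
    }

variability = environmental_variability(mildew)
```

A genotype that tracks the environment means closely (G1 in this invented data) gets a small value, while one whose disease level swings more than average (G2) gets a larger one.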
