
The use of complex populations in breeding with markers

The use of complex populations in breeding with markers. SBC “Breeding with molecular markers”. David Francis. Contact: francis.77@osu.edu. Breeding programs tend to have complex population structures consisting of many independent crosses.


Presentation Transcript


  1. The use of complex populations in breeding with markers. SBC “Breeding with molecular markers”. David Francis. Contact: francis.77@osu.edu

  2. Breeding programs tend to have complex population structures consisting of many independent crosses. Genetic studies tend to focus on bi-parental crosses with defined structure.

  3. Jargon: QTL, LD, SNP, structure, mixed model, analysis of variance, identity by descent. Please stop me and ask when a definition will help clarify.

  4. Objectives
     • Understand the diversity of populations that are being used to test marker-trait associations (linkage).
     • Understand the difference between the discovery of linkage and the use of markers for selection.
     • Use this information to facilitate interaction with colleagues from other disciplines (field, marker support, analysis, etc.).
     • Use this information to design and implement discovery and selection projects.

  5. Background
     • Introduction to populations
     • Case study
     • Discovery populations
     • Selection populations
     • Association mapping
     • Single-marker analysis of variance
     • Changes to the model used for analysis: account for population structure; haplotypes to gain information

  6. Standard populations for inbred species (line crosses):
     • F2
     • RIL (recombinant inbred lines)
     • BC (backcross)
     • AB (advanced backcross) *
     • IBC (inbred backcross) *
     Emerging populations for association mapping:
     • Natural populations
     • Unstructured populations
     • Family-based *
     • Nested Association Mapping (NAM; a variation of RIL)

  7. Standard populations for inbred species (line crosses):
     • F2: few meioses, population not fixed
     • RIL: few meioses, population fixed (can replicate)
     • BC: few meioses, population not fixed
     • AB: few meioses, population not fixed
     • IBC: few meioses, population fixed
     Emerging populations for association mapping:
     • Natural populations: sample all meioses in the history of the species; population often fixed
     • Unstructured populations: sample all meioses since the population was established
     • Family-based: samples all meioses in the pedigree
     • NAM: see RIL; meioses increased due to the size of the population and multiple crosses

  8. Populations
     • Early generation (F2, BC1)
       - Strong theoretical basis
       - Balanced designs
       - Tools for interval mapping (point of analysis)
       - Most breeding programs do not collect extensive data on early generation populations
       - Retain too much “donor” parent
     • AB and IBC populations
       - Reduce donor parent, isolate genetic factors, allow detection
       - Unbalanced design may limit power
     • Unstructured (natural populations)
       - More like populations that breeders use

  9. Review: effect of inbreeding. Frequency of heterozygotes (Cc) and homozygotes (CC + cc) in each generation of selfing a hybrid (F1). Freq(CC) = $p^2 + pqF$; Freq(Cc) = $2pq(1 - F)$; Freq(cc) = $q^2 + pqF$.
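The three frequencies on this slide can be tabulated across selfing generations with a short sketch. It assumes p = q = 0.5 (a cross between two inbred parents) and takes the F2 as the non-inbred reference (F = 0), so that F = 1 - (1/2)^(n-2) for generation Fn. The function names are illustrative and not part of the presentation.

```python
def inbreeding_coefficient(generation: int) -> float:
    """F for generation F_n derived by selfing an F1 hybrid (n >= 2).

    Assumption: the F2 is the reference (F = 0) and heterozygosity
    halves with every further generation of selfing.
    """
    if generation < 2:
        raise ValueError("generation must be 2 (F2) or later")
    return 1.0 - 0.5 ** (generation - 2)


def genotype_frequencies(p: float, f: float) -> dict:
    """Frequencies of CC, Cc and cc for allele frequency p and inbreeding F."""
    q = 1.0 - p
    return {
        "CC": p * p + p * q * f,
        "Cc": 2.0 * p * q * (1.0 - f),
        "cc": q * q + p * q * f,
    }


if __name__ == "__main__":
    # Tabulate F2 through F7 for a bi-parental cross (p = q = 0.5).
    for n in range(2, 8):
        f = inbreeding_coefficient(n)
        fr = genotype_frequencies(0.5, f)
        print(f"F{n}: F={f:.4f} CC={fr['CC']:.4f} Cc={fr['Cc']:.4f} cc={fr['cc']:.4f}")
```

Under these assumptions the heterozygote frequency halves each generation while the two homozygous classes grow equally.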

  10. Advanced Backcross and Inbred Backcross Populations
      Parent 1 x Parent 2 (Donor)
      F1 x Parent 1
      BC1 (n lines)
      BC1-1 x Parent 1 → BC2-1 S0 ⊗ . . . BC2-1 S5
      BC1-2 x Parent 1 → BC2-2 S0 ⊗ . . . BC2-2 S5
      . .
      BC1-n x Parent 1 → BC2-n S0 ⊗ . . . BC2-n S5
      (Diagram labels: AB, IBC)

  11. Statistical considerations with AB, IBC, and association populations
      • Unequal sample size/unbalanced data
      • Donor class is underrepresented
      • Need to adjust Df for the F-test; the proper F-test is {Mj/Gk(Mj)}
      These considerations affect power and whether the significance level is accurately estimated.
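As a rough illustration of the Mj/Gk(Mj) test, the sketch below computes the F ratio of the marker mean square over the genotype-within-marker mean square. It assumes every genotype has the same number of replicates, while marker classes may contain different numbers of genotypes (as when the donor class is underrepresented). The function name and the data layout are hypothetical.

```python
from statistics import mean


def marker_f_ratio(data):
    """F = MS(Marker) / MS(Gen(Marker)) for replicated genotypes.

    data maps marker class -> {genotype: [replicate values]}.
    Assumption: all genotypes have the same number of replicates r.
    Returns (F, df_marker, df_gen_within_marker).
    """
    # Mean of each genotype over its replicates.
    geno_means = {
        m: {g: mean(vals) for g, vals in genos.items()}
        for m, genos in data.items()
    }
    r = len(next(iter(next(iter(data.values())).values())))
    grand = mean(gm for genos in geno_means.values() for gm in genos.values())

    ss_marker = 0.0
    ss_gen = 0.0
    for genos in geno_means.values():
        class_mean = mean(genos.values())
        # Between-marker-class sum of squares.
        ss_marker += r * len(genos) * (class_mean - grand) ** 2
        # Genotypes nested within marker class.
        ss_gen += r * sum((gm - class_mean) ** 2 for gm in genos.values())

    df_marker = len(data) - 1
    df_gen = sum(len(genos) - 1 for genos in data.values())
    f_value = (ss_marker / df_marker) / (ss_gen / df_gen)
    return f_value, df_marker, df_gen
```

The denominator degrees of freedom come from the number of genotypes, not the number of plots, which is why testing the marker against residual plot error would overstate significance.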

  12. Take home messages:
      • A) Genotyping throughput and reagent packaging favor working with very large populations (~480).
      • B) Measuring traits (phenotyping) is the limiting factor.
      • C) For elite populations, marker number and the ability to distinguish descent (IBD) from state (IBS) are limitations (this is a function of linkage phase and LD).
      • D) Incorporating pedigree data or population structure data into the analysis improves detection of trait associations (QTL) and the efficiency of MAS (defined as relative efficiency of selection).
      • E) We can detect some known QTL, but not all known QTL, in complex populations. Power goes up with population size and marker number.
      • F) Phenotypic selection is effective.

  13. Case study: mapping and selection of bacterial spot resistance in tomato populations. David Francis, Sung-Chur Sim, Hui Wang, Matt Robbins, Wencai Yang.

  14. Bacterial spot is a disease complex caused by ~4 species of Xanthomonas bacteria. There are physiological races. Sources of resistance are mostly close relatives of cultivated tomato Solanum lycopersicum or Solanum pimpinellifolium. Sources (races): Hawaii 7998 (T1); Hawaii 7981 (T3); PI128216 (T3); PI 114490 (T1, T2, T3, T4).

  15. Field rating based on the Horsfall-Barratt quantitative scale (1-12); see en.wikipedia.org/wiki/Horsfall-Barratt_scale. Distribution approaches normal (ANOVA, regression, mixed models). GH rating based on HR, scored 0 or 1 (non-parametric).

  16. Bacterial spot QTL discovery in IBC populations: Ohio, T2 and T1 (2000-2004); FL, T3 and T4 (2002-2004); Brazil, T3 (2002-2004).

  17. Results of discovery studies
      Three IBC populations:
      [[OH88119 x Ha7998] x (OH88119)] x (OH88119)
      [[OH88119 x PI128216] x (OH88119)] x (OH88119)
      [[FL7600 x PI114490] x (OH9242)] x (OH9242)
      Multiple F2 populations:
      IBC x elite parent
      OH7870 x Ha7981
      Results:
      Hawaii 7998 (T1): Rx-1, Rx-2, Rx-3, Chr11 QTL
      Hawaii 7981 (T3): R-Xv3
      PI128216 (T3): Rx-4, Chr11
      PI 114490 (T1, T2, T3, T4): QTL Chr 11, Chr3, Chr4

  18. We have IBC lines and IBC x elite derived lines that “look good” and we want to integrate them with the elite breeding program. Strategy: 1) Develop populations to combine loci for resistance to multiple races. 2) Validate marker-QTL associations in order to assess the feasibility of MAS. 3) Conduct phenotypic selection and MAS simultaneously.

  19. Genes: Rx-3 (5), Rx-4 (11), QTL11, QTL11, ?, ?. Parents: OH75, FL82, K64, OH86, OH74, MR13. “Population” consisting of 11 independent crosses; progeny segregate.

  20. First segregating generation: grow ~100 plants per cross in the field (total population size 1,100) and select plants from each extreme (n = 110).

  21. Following year: evaluate plots in an RCB design with two replicates; rating is based on a plot (not a single plant), scale 1-12.

  22. Phenotypic evaluation (focus on T1). Selection conducted in 2007 was predictive of plot performance in 2008 based on both nonparametric analysis and analysis of variance (p < 0.0001). Heritability estimated from the parent-offspring regression suggests a narrow-sense heritability of 0.32. Plants rated as resistant in 2007 produced plots with an average disease rating of 4.02 in 2008; plants rated as susceptible produced plots with an average disease rating of 5.16 in 2008 (LSD 0.39). Realized gain under selection: ~13% decrease in disease. OH75 rated 3.5; OH88119 rated 9.0.

  23. Marker analysis using the unified mixed model (Buckler Lab, TASSEL). Y = μ REPy + Qw + Markerα + Zv + Error. Sequence variation linked to traits.

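  24. SAS macro for single-marker analysis (the marker term is Markerα in the model):
      %macro Mol(mark);
        /* marker as fixed effect; genotype and replicate as random effects */
        proc mixed data = three;
          class &mark gen rep;
          model T1 = &mark / solution;
          random gen rep;
        run;
      %mend;
      %Mol(TOM144); %Mol(CT10737I); %Mol(CT20244I);
      %Mol(pto); %Mol(SL10526); %Mol(rx3);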

  25. Rx-3 single-point analysis

  26. Adding a matrix of population structure can correct for background effects and can add insight into which crosses, pedigrees, or subpopulations have the highest breeding value. Y = μ REPy + Qw + Markerα + Zv + Error

  27. Qw: pedigree information. Options for the Q matrix: proportion of genome from a parent (pedigree); designation of cross (0/1); Q matrix from Structure.
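One of the options listed for Q, the proportion of genome expected from each parent, could be built from pedigree records roughly as follows. This is a minimal sketch that assumes each progeny line lists its parents and that every listed parent contributes equally (no backcrossing). The example crosses in the usage note are made up for illustration, since the slides do not state which parents were crossed.

```python
def parent_proportion_matrix(pedigrees, parents):
    """Build a Q matrix of expected genome proportions per parent.

    pedigrees: one tuple of parent names per progeny line.
    parents:   column order for the matrix.
    Assumption: every parent in a pedigree contributes equally.
    """
    rows = []
    for pedigree in pedigrees:
        share = 1.0 / len(pedigree)
        rows.append(
            [sum((share for p in pedigree if p == parent), 0.0) for parent in parents]
        )
    return rows
```

For example, `parent_proportion_matrix([("OH75", "FL82")], ["OH75", "FL82", "OH86"])` gives one row with 0.5 for each parent in the cross and 0.0 for the third. Each column can then enter the model as a covariate.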

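  28. SAS macro for single-marker analysis corrected for population structure (the covariates OH75 ... OH74 are the Qw term; the marker is Markerα):
      %macro Mol(mark);
        /* parent covariates account for structure; marker tested as fixed effect */
        proc mixed data = three;
          class &mark gen rep;
          model T1 = OH75 FL82 K64 OH86 OH74 &mark / solution;
          random gen rep;
        run;
      %mend;
      %Mol(TOM144); %Mol(CT10737I); %Mol(CT20244I);
      %Mol(pto); %Mol(SL10526); %Mol(rx3);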

  29. Rx-3: single-point analysis; single-point analysis corrected for population structure

  30. Marker order: M1, Rx-3 locus, M2. Parental scores (M1, phenotype, M2): OH75: 1, R, 1 (Rx-3); OH86: 0, S, 1 (rx-3); FL82: 1, S, 0 (rx-3). In OH75 x OH86, M1 can be used for selection, M2 cannot. In OH75 x FL82, M2 can be used for selection, M1 cannot. What happens when the breeding material is a combination of progeny from both crosses?
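The reasoning on this slide reduces to checking which markers are polymorphic between the two parents of a cross. The sketch below encodes the parental scores given on the slide; the function name is illustrative only.

```python
# Parental marker scores and phenotypes as given on the slide.
PARENTS = {
    "OH75": {"M1": 1, "M2": 1, "phenotype": "R"},
    "OH86": {"M1": 0, "M2": 1, "phenotype": "S"},
    "FL82": {"M1": 1, "M2": 0, "phenotype": "S"},
}


def informative_markers(parent_a, parent_b, markers=("M1", "M2")):
    """Return the markers whose scores differ between the two parents."""
    return [m for m in markers if PARENTS[parent_a][m] != PARENTS[parent_b][m]]


if __name__ == "__main__":
    print(informative_markers("OH75", "OH86"))
    print(informative_markers("OH75", "FL82"))
```

Neither marker is informative in both crosses, which is why pooling progeny from the two crosses weakens a single-marker test.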

  31. Marker order: M1, Rx-3 locus, M2. Parental scores (M1, phenotype, M2): OH75: 1, R, 1 (Rx-3); OH86: 0, S, 1 (rx-3); FL82: 1, S, 0 (rx-3). Reality check: markers are identical by state but not by descent (presumably because of LD decay). A potential solution is to use haplotypes.

  32. Haplotype analysis: the interaction term defines haplotypes.
      proc mixed data = three;
        class mark1 mark2 gen rep;
        model T1 = mark1*mark2 OH75 FL82 K64 OH86 OH74 / solution;
        random gen rep;
      run;
      Markers: M1 M2 M3 M4 M5 M6. Interactions: M1*M2, M2*M3, M3*M4, M5*M6.
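The effect of the mark1*mark2 term can be pictured as replacing two single-marker classes with one combined class per line. The sketch below assumes inbred (homozygous) lines, so that a pair of marker scores identifies a haplotype directly. The function names and the label format are illustrative.

```python
from collections import defaultdict


def haplotype_classes(mark1, mark2):
    """Combine two marker scores per line into one haplotype label."""
    if len(mark1) != len(mark2):
        raise ValueError("marker vectors must have the same length")
    return [f"{a}-{b}" for a, b in zip(mark1, mark2)]


def class_means(labels, values):
    """Mean trait value for each class label."""
    groups = defaultdict(list)
    for label, value in zip(labels, values):
        groups[label].append(value)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}
```

With the parental scores from the earlier slide, OH75 (1, 1), OH86 (0, 1) and FL82 (1, 0) fall into three distinct haplotype classes, so the resistant parent is tagged uniquely even though each single marker is shared with one susceptible parent.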

  33. Rx-3: single-point analysis; single-point analysis corrected for population structure; haplotype analysis; haplotype analysis corrected for population structure.

  34. Genome-Wide Scan

  35. • We can detect resistance conferred by the Rx-3 locus on chromosome 5.
      • We can detect resistance conferred by Rx-4 on chromosome 11.
      • We cannot detect QTL on chromosome 11.
      • We can detect a strong interaction between loci on 11 and 5 (data not shown).
      What needs to happen to improve prospects for “whole genome” discovery and/or selection? More markers; larger populations.
      F = Gen/Error (non-replicated); F = Gen/Gen(Marker) (replicated).
      Figure labels: Best; Worst (breeding pop); Worst (genetic pop).

  36. Population sizes
      • F-test: Marker/Gen(Marker)
      • Larger F from a greater marker effect (strength of the locus or close linkage to the causal gene)
      • Larger F by decreasing error
      • For marker studies it will nearly always be more powerful to increase the number of genotypes rather than the number of replicates per genotype
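The last point can be checked with a small calculation. If a marker class contains n genotypes, each grown in r replicate plots, the variance of the class mean is (σ²G(M) + σ²e / r) / n. The sketch below compares ways of spending a fixed number of plots. The function name is hypothetical and the variance components in the usage note are placeholders, not estimates from the study.

```python
def marker_class_mean_variance(total_plots, reps, var_gen, var_error):
    """Variance of a marker-class mean for a fixed plot budget.

    total_plots: plots available for the class.
    reps:        replicates per genotype (must divide total_plots).
    var_gen:     variance among genotypes within the marker class.
    var_error:   plot-to-plot error variance.
    """
    if total_plots % reps != 0:
        raise ValueError("reps must divide total_plots")
    n_genotypes = total_plots // reps
    return (var_gen + var_error / reps) / n_genotypes
```

With 200 plots and both variance components set to 1.0, one replicate of 200 genotypes gives 0.01 while two replicates of 100 genotypes gives 0.015. Replication only shrinks the error term, whereas more genotypes shrink both terms.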

  37. Sample size power estimates. Table columns: False +; False -; Proportion σ²P.

  38. Discovery populations: the magnitude of difference between R and S is large; Gen(Marker) variation is moderate. Breeding populations: the difference between R and S is moderate; Gen(Marker) variation is moderate. Detecting significant marker-trait associations is more difficult when the magnitude of difference between genotypic classes is reduced.

  39. Population sizes can be increased by decreasing plot replication. “Augmented designs” use a few highly replicated checks. Checks provide the “error” to assess the significance of differences between unreplicated genotypes. Checks can also be used to normalize data (nearest check, flanking checks, etc.).
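One way to picture the nearest-check normalization is the sketch below. It assumes plots lie along a single field row, a single check variety (given the made-up name "CHK" in the test data), and an additive adjustment: each unreplicated entry is corrected by the deviation of its nearest check plot from the overall check mean. This is only one of several possible adjustments and is not necessarily the one used in the study.

```python
def adjust_by_nearest_check(plots, check_name):
    """Adjust unreplicated entries using the nearest check plot.

    plots: list of (position, entry, rating) tuples along a field row.
    Returns {entry: adjusted rating} for the non-check entries.
    """
    checks = [(pos, rating) for pos, entry, rating in plots if entry == check_name]
    if not checks:
        raise ValueError("no check plots found")
    check_mean = sum(rating for _, rating in checks) / len(checks)

    adjusted = {}
    for pos, entry, rating in plots:
        if entry == check_name:
            continue
        # Nearest check by field position (the first one wins a tie).
        _, nearest_rating = min(checks, key=lambda c: abs(c[0] - pos))
        # Remove the local deviation indicated by that check.
        adjusted[entry] = rating - (nearest_rating - check_mean)
    return adjusted
```

The same check plots also supply the replicated observations from which an error variance can be estimated.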

  40. Take home messages:
      • A) Genotyping throughput and reagent packaging favor working with very large populations (~480). (Effective MAS implementation will require larger populations.)
      • B) Measuring traits (phenotyping) is the limiting factor. (Scoring larger populations will minimize Gen(Marker) error.)
      • C) For elite populations, marker number and the ability to distinguish descent (IBD) from state (IBS) are limitations (this is a function of linkage phase and LD). (Haplotypes.)
      • D) Incorporating pedigree data or population structure data into the analysis improves detection of trait associations (QTL) and the efficiency of MAS (defined as relative efficiency of selection). (Corrects for structure; avoids false positives.)
      • E) We can detect some known QTL, but not all known QTL, in complex populations. Power goes up with population size and marker number. (Marker analysis is still more descriptive than predictive.)
      • F) Phenotypic selection is effective.

  41. Acknowledgments
      Francis Group: Matt Robbins, Sung-Chur Sim, Troy Aldrich
      Collaborators, OSU: Esther van der Knaap, Bert Bishop, Tea Meulia, Sally Miller, Melanie Lewis Ivey
      Collaborators, CAU: Hui Wang, Wencai Yang
      Collaborators, UFL: Jay Scott, Sam Hutton
      Collaborators, UCD: Allen Van Deynze, Kevin Stoffel, Alex Kozic
      Funding: USDA/AFRI; OARDC RECGP matching funds grant; MAFPA
