Understanding Linkage Disequilibrium in Marker-Assisted Selection: Key Concepts and Relevance in Orphan Crops

This workshop discusses the fundamentals of linkage disequilibrium (LD), its impact on marker-based selection, and its advantages and disadvantages. It covers basic LD concepts, the differences between D, D′, and r², and the causes of LD, and it highlights LD's significance in marker-assisted selection (MAS) and its persistence in inbred species. The workshop also explains how LD is measured and how allele frequencies influence those measures. It emphasizes the importance of LD in MAS and association genetics, with examples of LD visualization and of LD decay across different species.

Presentation Transcript


  1. IGSS Workshop: Marker-Assisted Selection for Orphan Crops. Part 3: Linkage Disequilibrium

  2. Objectives
     • Present basic LD concepts and equations
     • Outline the advantages and disadvantages of LD
     • List key differences between D, D′ and r²
     • Show how LD affects many aspects of marker-based selection, marker-assisted selection, and genomic selection

  3. Linkage and Linkage Disequilibrium

  4. Physical Linkage
     [Figure: linked loci after the 1st, 2nd, and 5th generations of recombination]

  5. Physical Linkage (5th generation of recombination)
     • Recombination reduces LD between physically linked loci
     • Random mating reduces D and r²: after t generations of random mating, D_t = (1 - c)^t D_0, where D_0 is D before random mating began and c is the recombination frequency between the two loci
     • After an infinite number of generations of random mating, D_∞ = 0: all loci with c > 0 would appear to be in equilibrium (e.g. r² = 0), even physically linked loci
     • LD persists longer in inbred species because reduced heterozygosity leaves fewer effective recombination events
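The decay formula on this slide can be sketched numerically. The snippet below is a minimal illustration, not part of the workshop material: the function name and the starting value D_0 = 0.25 are assumptions chosen for the example.

```python
def ld_after_generations(d0: float, c: float, t: int) -> float:
    """Expected D after t generations of random mating: D_t = (1 - c)**t * D_0."""
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination frequency c must lie in [0, 0.5]")
    if t < 0:
        raise ValueError("number of generations t must be non-negative")
    return (1.0 - c) ** t * d0


# Hypothetical starting point: D_0 = 0.25 (illustrative value only).
for c in (0.5, 0.1, 0.01):
    decay = [round(ld_after_generations(0.25, c, t), 4) for t in (0, 1, 5, 10)]
    print(f"c = {c}: D at t = 0, 1, 5, 10 -> {decay}")
```

Unlinked loci (c = 0.5) lose half of D every generation, while tightly linked loci (c = 0.01) keep most of it after ten generations, which is why physically linked markers stay informative for longer.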

  6. Linkage Disequilibrium (LD)
     • LD occurs between two loci when there is a non-random association between the alleles at those loci
     • When two loci are in LD, the alleles at one locus predict the alleles at the other locus
     • LD is a measure of that association
     • Significant LD: two loci are said to be in LD when their alleles are associated
     • Non-significant LD: two loci are in equilibrium (LD ≈ 0) when their alleles are NOT associated

  7. Linkage Disequilibrium (LD)
     • Two forms of LD
       • Physical linkage of two loci on the same chromosome
       • Non-random association of alleles between loci on different chromosomes (or very distant on the same chromosome)
     • Causes of LD
       • Actual linkage
       • Mutation
       • Migration
       • Epistatic selection
       • Drift (small population size)
       • Non-random mating

  8. Why does LD matter?
     • MAS and association genetics both require LD because most markers are NOT in a gene
     • The background level of LD determines what marker density to use
     • Marker effects (a, r², etc.) depend on the LD between the QTL and the marker locus
     • Marker and QTL must be in linkage disequilibrium (LD) for the marker to have an effect
     [Figure: marker M and QTL Q, either at the same position (M = Q, c = cM = 0) or separated (cM > 0)]
     Note: cM = c * 100, where c is the recombination frequency between the M and Q loci

  9. Measuring LD
     [Diagram: three measures of LD]
     • D′: influenced by small sample size and rare alleles
     • r²
     • D: dependent on allele frequencies

  10. LD Example

  11. High or low LD???

  12. Measuring LD: D
      LD was originally measured as:
      D = f(MQ) - f(M)f(Q) = observed - predicted
      D = f(MQ)*f(mq) - f(Mq)*f(mQ)
      where there are two loci, each with two alleles (M, m and Q, q), and f(MQ) is the frequency of the MQ haplotype.
      D is strongly influenced by allele frequencies, which makes comparisons between pairs of loci or between populations difficult.
      D can be positive or negative depending on whether MQ/mq or Mq/mQ were the parental-type gametes.
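The two expressions for D on this slide are algebraically equivalent. A minimal sketch follows, assuming haplotype frequencies are already known; the function names and example frequencies are hypothetical.

```python
def d_observed_minus_predicted(f_MQ: float, f_Mq: float, f_mQ: float, f_mq: float) -> float:
    """D = f(MQ) - f(M) * f(Q): observed haplotype frequency minus its expectation."""
    f_M = f_MQ + f_Mq  # frequency of allele M
    f_Q = f_MQ + f_mQ  # frequency of allele Q
    return f_MQ - f_M * f_Q


def d_cross_product(f_MQ: float, f_Mq: float, f_mQ: float, f_mq: float) -> float:
    """D = f(MQ) * f(mq) - f(Mq) * f(mQ)."""
    return f_MQ * f_mq - f_Mq * f_mQ


# Hypothetical haplotype frequencies (MQ, Mq, mQ, mq); they must sum to 1.
haplotypes = (0.4, 0.1, 0.1, 0.4)
print(d_observed_minus_predicted(*haplotypes))  # approximately 0.15
print(d_cross_product(*haplotypes))             # approximately 0.15
```

Swapping the excess to the Mq/mQ haplotypes, for example (0.1, 0.4, 0.4, 0.1), gives approximately -0.15, which is the sign change the slide attributes to the parental gamete types.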

  13. Measuring LD: D′
      A common standardization expresses D relative to its maximum possible value: D′ = D / Dmax
      When D is positive, Dmax = min(p1q2, p2q1)
      When D is negative, Dmax = min(p1q1, p2q2)
      where p1 and p2 are the allele frequencies at one locus and q1 and q2 are the allele frequencies at the other.
      With this standardization, the absolute value of D′ ranges between 0 and 1.
      D′ is considered a reliable indicator of physical distance and therefore of recombination history.
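The standardization can be sketched as follows. This is an illustrative helper, not code from the workshop: it assumes p1, p2 are the frequencies of M and m, and q1, q2 the frequencies of Q and q, and it returns the absolute value so the result lies between 0 and 1.

```python
def d_prime(f_MQ: float, f_Mq: float, f_mQ: float, f_mq: float) -> float:
    """Return |D'| = |D| / Dmax for two biallelic loci (M/m and Q/q)."""
    p1, p2 = f_MQ + f_Mq, f_mQ + f_mq  # assumed: frequencies of alleles M and m
    q1, q2 = f_MQ + f_mQ, f_Mq + f_mq  # assumed: frequencies of alleles Q and q
    d = f_MQ * f_mq - f_Mq * f_mQ
    if d == 0:
        return 0.0  # loci in equilibrium (also covers a monomorphic locus)
    # Dmax depends on the sign of D, as on the slide.
    d_max = min(p1 * q2, p2 * q1) if d > 0 else min(p1 * q1, p2 * q2)
    return abs(d) / d_max


# Hypothetical haplotype frequencies (MQ, Mq, mQ, mq).
print(d_prime(0.4, 0.1, 0.1, 0.4))  # approximately 0.6
print(d_prime(0.1, 0.0, 0.4, 0.5))  # approximately 1.0: the Mq haplotype is absent
```

The second call shows why rare alleles can inflate D′: whenever one of the four haplotypes is missing, |D′| reaches 1 regardless of how uneven the allele frequencies are.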

  14. Measuring LD: r²
      The statistic r² corrects many of the issues associated with D by standardizing D.
      r² ranges from 0 (two loci in equilibrium) to 1 (two loci in complete LD).
      r² is a squared correlation.
      r² reflects recombination history as well as mutation frequency.
      r² can be calculated for loci with more than two alleles and for multiple loci.
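The slide does not spell out the formula. The sketch below assumes the usual two-allele definition, r² = D² / (p1 p2 q1 q2), where p1, p2 and q1, q2 are the allele frequencies at the two loci; the function name and example frequencies are hypothetical.

```python
def r_squared(f_MQ: float, f_Mq: float, f_mQ: float, f_mq: float) -> float:
    """r^2 = D^2 / (p1 * p2 * q1 * q2) for two biallelic loci (assumed standard form)."""
    p1 = f_MQ + f_Mq  # frequency of allele M
    q1 = f_MQ + f_mQ  # frequency of allele Q
    denominator = p1 * (1.0 - p1) * q1 * (1.0 - q1)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic to compute r^2")
    d = f_MQ * f_mq - f_Mq * f_mQ
    return d * d / denominator


# Hypothetical haplotype frequencies (MQ, Mq, mQ, mq).
print(r_squared(0.4, 0.1, 0.1, 0.4))  # approximately 0.36
print(r_squared(0.1, 0.0, 0.4, 0.5))  # approximately 0.11
print(r_squared(0.5, 0.0, 0.0, 0.5))  # approximately 1.0: complete LD
```

In the second example one haplotype is absent, so D′ would be 1, yet r² is only about 0.11 because the allele frequencies at the two loci differ. That contrast is one reason r² is often preferred when the question is how well a marker predicts a QTL.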

  15. LD Visualization

  16. Example 3: Populations
      D = f(MQ)*f(mq) - f(Mq)*f(mQ)
      [Figure: populations 1 to 5]

  17. r² between pairs of loci with different recombination frequencies (c) after successive generations of random mating

  18. LD between unlinked loci
      • Two loci on different chromosomes, or very distant on the same chromosome, can appear to be associated (e.g. have significant LD, r² > 0). This is due to:
        • Population structure
        • Selection

  19. Example: One population with two subgroups
      [Figure: subgroup 1 and subgroup 2]
      Population structure occurs when not all individuals are derived from the same random-mating population in Hardy-Weinberg equilibrium. It causes spurious associations if not accounted for.
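A small numerical sketch can show how structure alone creates apparent LD. The allele frequencies and function names below are hypothetical: each subgroup is constructed to be in linkage equilibrium (D = 0), yet pooling the two produces a clearly non-zero D.

```python
def equilibrium_haplotypes(p_M: float, p_Q: float) -> dict:
    """Haplotype frequencies for two loci in equilibrium (products of allele frequencies)."""
    return {
        "MQ": p_M * p_Q,
        "Mq": p_M * (1 - p_Q),
        "mQ": (1 - p_M) * p_Q,
        "mq": (1 - p_M) * (1 - p_Q),
    }


def d_statistic(h: dict) -> float:
    """D = f(MQ) * f(mq) - f(Mq) * f(mQ)."""
    return h["MQ"] * h["mq"] - h["Mq"] * h["mQ"]


def pool(h1: dict, h2: dict, weight1: float = 0.5) -> dict:
    """Mix two subgroups; weight1 is the share of the pooled sample from subgroup 1."""
    return {k: weight1 * h1[k] + (1 - weight1) * h2[k] for k in h1}


# Hypothetical subgroups with very different allele frequencies.
sub1 = equilibrium_haplotypes(p_M=0.9, p_Q=0.9)
sub2 = equilibrium_haplotypes(p_M=0.1, p_Q=0.1)
pooled = pool(sub1, sub2)

print(d_statistic(sub1), d_statistic(sub2))  # both approximately 0
print(d_statistic(pooled))                   # approximately 0.16
```

Nothing about physical linkage entered this calculation, so the pooled D reflects only the difference in allele frequencies between subgroups. This is the spurious association that structure corrections are meant to remove.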

  20. Population structure … Transgene (Eathington et al., 2007)
      • Population structure ignored: 49 markers on 15 of 20 chromosomes had highly significant effects
      • Population structure accounted for: the location of the transgene was correctly identified with one marker

  21. PC of Elite Panel Marker Data
      [Figure annotation: 42.1% of parentage from “Truman” or Truman full-sibs]

  22. Accounting for population structure

  23. Linkage Disequilibrium Decay
      • Varies according to:
        • Species: LD persists differently in, say, cattle compared with maize, depending on the recombination history of the species
        • Type of germplasm: LD decays more slowly among elite inbred lines than among OPVs or landraces, which have undergone larger numbers of meiotic events
        • Mode of pollination: LD persists over longer map distances (larger numbers of base pairs) in self-pollinating crops than in cross-pollinated crops
        • Gene: in maize, LD decayed rapidly within 500 base pairs in d3 genes, while it did not follow the same pattern in su genes

  24. Fst Statistic…

  25. LD Decay in Biparental Linkage Mapping Populations
      LD decays much more slowly and in a predictable manner

  26. LD Decay in an Association Mapping Population
      LD decays more rapidly in a diverse population because of forces such as population structure and more opportunities for recombination. Dense or sparse genotyping?

  27. Extensive LD in barley of the upper Midwest
      Small effective population size with limited diversity, so LD decays much more slowly, extending out to about 8 cM at r² = 0.2

  28. LD decay in 2 wheat populations: r² = 0.2 at ~4-8 cM

  29. LD decay in maize (Edward S. Buckler, Buckler lab)

  30. Example: LD decay (r² = 0.2) for wheat chromosomes based on SNP data

  31. LD r² and QTL mapping r²
      • LD: r² is a measure of association between alleles at two loci, a squared correlation of genotypic data
      • QTL: r² is the proportion of phenotypic variation modeled by a marker, that is, the association between genotypic data from one locus and phenotypic data influenced by many loci. It can be viewed as the squared correlation of the genotypic and phenotypic data
      • The LD r², though, sets the limit on how much of the genetic variation controlled by a QTL a marker can explain:
        • If r² = 1 between the M and Q loci, the marker can explain 100% of the genetic variation controlled by Q
        • If r² = 0.5 between the M and Q loci, the marker can explain 50% of the genetic variation controlled by Q
        • If r² = 0 between the M and Q loci, the marker can explain 0% of the genetic variation controlled by Q
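The ceiling described on this slide amounts to a single multiplication. The sketch below uses a hypothetical QTL variance of 4.0 (arbitrary units) and a hypothetical function name, purely to make the three cases concrete.

```python
def max_variance_explained(ld_r2: float, qtl_genetic_variance: float) -> float:
    """Upper limit on the QTL's genetic variance that a marker can capture: r^2 * Var(Q)."""
    if not 0.0 <= ld_r2 <= 1.0:
        raise ValueError("LD r^2 must lie between 0 and 1")
    if qtl_genetic_variance < 0:
        raise ValueError("variance cannot be negative")
    return ld_r2 * qtl_genetic_variance


# Hypothetical QTL controlling 4.0 units of genetic variance.
for ld_r2 in (1.0, 0.5, 0.0):
    captured = max_variance_explained(ld_r2, 4.0)
    print(f"LD r2 = {ld_r2}: marker can explain at most {captured} of 4.0")
```

Read this way, a marker in weak LD with a large QTL can look like a small QTL in a mapping study, which is one reason marker density has to match the background level of LD.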

  32. Main Points
      • LD between marker and QTL is needed for MAS, genomic selection, etc. to work
      • When two loci are in LD, the alleles at one locus predict the alleles at the other locus
      • LD arises from physical linkage of loci, but can also occur between unlinked loci
      • The LD r² between a marker and a QTL determines how much of the genetic variation controlled by that QTL the marker can model

  33. KDCompute Plugin: Linkage Disequilibrium
