1 / 31

Lecture 6: QTL Mapping

Lecture 6: QTL Mapping. P 1 x P 2. B 2. B 1. F 1. F 1. F 1 x F 1. F 1. F 1. Backcross design. Backcross design. F 2 design. F 2. F 2. Advanced intercross Design (AIC, AIC k ). F k. Experimental Design: Crosses. Experimental Designs: Marker Analysis. Single marker analysis.

avery
Download Presentation

Lecture 6: QTL Mapping

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 6:QTL Mapping

  2. P1 x P2 B2 B1 F1 F1 F1 x F1 F1 F1 Backcross design Backcross design F2 design F2 F2 Advanced intercross Design (AIC, AICk) Fk Experimental Design: Crosses

  3. Experimental Designs: Marker Analysis Single marker analysis Flanking marker analysis (interval mapping) Composite interval mapping Interval mapping plus additional markers Multipoint mapping Uses all markers on a chromosome simultaneously

  4. Conditional Probabilities of QTL Genotypes The basic building block for all QTL methods is Pr(Qk | Mj) --- the probability of QTL genotype Qk given the marker genotype is Mj. Consider a QTL linked to a marker (recombination Fraction = c). Cross MMQQ x mmqq. In the F1, all gametes are MQ and mq In the F2, freq(MQ) = freq(mq) = (1-c)/2, freq(mQ) = freq(Mq) = c/2

  5. Hence, Pr(MMQQ) = Pr(MQ)Pr(MQ) = (1-c)2/4 Pr(MMQq) = 2Pr(MQ)Pr(Mq) = 2c(1-c)/4 Pr(MMqq) = Pr(Mq)Pr(Mq) = c2 /4 Why the 2? MQ from father, Mq from mother, OR MQ from mother, Mq from father Since Pr(MM) = 1/4, the conditional probabilities become Pr(QQ | MM) = Pr(MMQQ)/Pr(MM) = (1-c)2 Pr(Qq | MM) = Pr(MMQq)/Pr(MM) = 2c(1-c) Pr(qq | MM) = Pr(MMqq)/Pr(MM) = c2

  6. Q M2 M1 Genetic map c1 c2 c12 No interference: c12 = c1 + c2 - 2c1c2 Complete interference: c12 = c1 + c2 2 Marker loci Suppose the cross is M1M1QQM2M2 x m1m1qqm2m2 In F2, Pr(M1QM2) = (1-c1)(1-c2) Pr(M1Qm2) = (1-c1) c2 Pr(m1QM2) = (1-c1) c2 Likewise, Pr(M1M2) = 1-c12 = 1- c1 + c2 A little bookkeeping gives

  7. - - - - - - -

  8. - - Expected Marker Means The expected trait mean for marker genotype Mj is just For example, if QQ = 2a, Qa = a(1+k), qq = 0, then in the F2 of an MMQQ/mmqq cross, • If the trait mean is significantly different for the genotypes at a marker locus, it is linked to a QTL • A small MM-mm difference could be (i) a tightly-linked QTL of small effect or (ii) loose linkage to a large QTL

  9. - - - - - - - ( ) This is essentially a for even modest linkage - - ( ) - - - Hence, the use of single markers provides for detection of a QTL. However, single marker means does not allow separate estimation of a and c. Now consider using interval mapping (flanking markers) Hence, a and c can be estimated from the mean values of flanking marker genotypes

  10. Value of trait in kth individual of marker genotype type i Effect of marker genotype i on trait value Linear Models for QTL Detection The use of differences in the mean trait value for different marker genotypes to detect a QTL and estimate its effects is a use of linear models. One-way ANOVA. Detection: a QTL is linked to the marker if at least one of the bi is significantly different from zero Estimation (QTL effect and position): This requires relating the bi to the QTL effects and map position

  11. Effect from marker genotype at first marker set (can be > 1 loci) Effect from marker genotype at second marker set Interaction between marker genotypes i in 1st marker set and k in 2nd marker set Detecting epistasis One major advantage of linear models is their flexibility. To test for epistasis between two QTLs, used an ANOVA with an interaction term • At least one of the ai significantly different from 0 ---- QTL linked to first marker set • At least one of the bk significantly different from 0 ---- QTL linked to second marker set • At least one of the dik significantly different from 0 ---- interactions between QTL in sets 1 and two

  12. Trait value given marker genotype is type j Distribution of trait value given QTL genotype is k is normal with mean mQk. (QTL effects enter here) Probability of QTL genotype k given marker genotype j --- genetic map and linkage phase entire here Sum over the N possible linked QTL genotypes Maximum Likelihood Methods ML methods use the entire distribution of the data, not just the marker genotype means. More powerful that linear models, but not as flexible in extending solutions (new analysis required for each model) Basic likelihood function: This is a mixture model

  13. Maximum of the likelihood under a no-linked QTL model - Maximum of the full likelihood } { - ML methods combine both detection and estimation Of QTL effects/position. Test for a linked QTL given from the LR test The LR score is often plotted by trying different locations for the QTL (I.e., values of c) and computing a LOD score for each

  14. A typical QTL map from a likelihood analysis

  15. i-1 i i+1 i+2 CIM works by adding an additional term to the linear model , Interval Mapping with Marker Cofactors Consider interval mapping using the markers i and i+1. QTLs linked to these markers, but outside this interval, can contribute (falsely) to estimation of QTL position and effect Now suppose we also add the two markers flanking the interval (i-1 and i+2) Inclusion of markers i-1 and i+2 fully account for any linked QTLs to the left of i-1 and the right of i+2 Interval being mapped However, still do not account for QTLs in the areas Interval mapping + marker cofactors is called Composite Interval Mapping (CIM) CIM also (potentially) includes unlinked markers to account for QTL on other chromosomes.

  16. Power and Repeatability: The Beavis Effect QTLs with low power of detection tend to have their effects overestimated, often very dramatically As power of detection increases, the overestimation of detected QTLs becomes far less serious This is often called the Beavis Effect, after Bill Beavis who first noticed this in simulation studies

  17. Mapping in Outbred Populations QTL mapping in outbred populations has far lower power compared to line crosses. Not every individual is informative for linkage. an individual must be a double heterozygote to provide linkage information Parents can differ in linkage phase, e.g., MQ/mq vs. Mq/mQ. Hence, cannot pool families, rather must Analyze each parent separately.

  18. Marker vs. QTL Informative Can easily check to see if a parent/family is Marker informative (at least one parent is A marker heterozygote). No easy way to check if they are also QTL informative (at least one family is a QTL heterozygote) A fully-informative parent is both Marker and QTL informative, i.e., a double heterozygote.

  19. Types of families (considering marker information) Fully Marker informative Family: MiMj x MkMl Both parents different heterozygotes All offspring are informative in distinguishing alternative alleles from both parents. Backcross family One parent a marker homozygote MiMj x MkMk All offspring informative in distinguishing heterozygous parent's alternative alleles Intercross family MiMj x MiMj Both the same marker heterozygote Only homozygous offspring informative in distinguishing alternative parental alleles

  20. Trait value for the kth offspring of sire i with marker genotype j The effect of sire i The effect of marker j in sire i Sib Families QTL mapping can occur in sib families. Here, one looks separately within each family for differences in trait means for individuals carrying alternative marker alleles. Hence, a separate analysis can be done for each parent in each family. Information across families is combined using a standard nested ANOVA. Half-sibs (common sire)

  21. - - A significant marker effect indicates linkage to a QTL This is tested using the standard F-ratio, What can we say about QTL effect and position? Thus, the marker variance confounds both position and QTL effect, here measured by the additive variance of the QTL Since sA2 = 2a2p(1-p), we can get a small variance For a QTL of large effect (a >>1) if one allele is rare If 2p(1-p) is small, heterozygotes are rare (most sires are QTL homozygotes). However, if a is large, in these rare families, there is a large effect.

  22. Effect of sire i Effect of marker allele j from sire i Effect of the kth son of sire i with sire marker allele j Hence, there is a tradeoff in getting a sufficient number of families to have a few with the QTL segregating, but also to have family sizes large enough to detect differences between parental marker alleles The granddaughter design. Widely used in dairy cattle to improve power. Each sire (i) produces a number of sons that are genotyped for the sire allele. Each son then produces a number of offspring in which the trait is measured Advantage: large sample size for each mij value.

  23. Trait value for individual i Genetic value of other (background) QTLs Genetic effect of chromosomal region of interest q Fraction of chromosomal region shared IBD between individuals i and j. Resemblance between relatives correction General Pedigree Methods Random effects (hence, variance component) method for detecting QTLs in general pedigrees The covariance between individuals i and j is thus

  24. ( ) - - - - The resulting likelihood function is { { q Assume z is MVN, giving the covariance matrix as Here Estimated from marker data Estimated from the pedigree A significant sA2 indicates a linked QTL.

  25. Haseman-Elston Regressions One simple test for linkage of a QTL to a marker locus is the Haseman-Elston regression, used in human genetics The idea is simple: If a marker is linked to a QTL, then relatives that share a IBD marker alleles likely share IBD QTL alleles and hence are more similar to each other than expected by chance. The approach: regress the (squared) difference in trait value in the same sets of relatives on the fraction of IBD marker alleles they share.

  26. Fraction of marker alleles IBD in this pair Of relatives. p = 0, 0.5, or 1 - - - - - - - - For the ith pair of relatives, The expected slope is a function of the additive variance Of the linked QTL, the distance c between marker and QTL and the type of relative This is a one-sided test, as the null hypothesis (no linkage) is b =0 versus the alternative b < 0 Note that parent-offspring are NOT an appropriate pair of relatives for this test. WHY?

  27. Affected Sib Pair Methods As with the HE regression, the idea is that if the marker is linked to a QTL, individuals with more IBD marker alleles with have closer phenotypes. Example of an allele-sharing method. Consider a discrete phenotype (disease presence/ absence). A sib can either be affected or unaffected. A pair of sibs can either be concordant (either doubly affected or both unaffected), or discordant (a singly-affected pair)

  28. - The IDB probabilities for a random pair of full sibs are Pr(0 IBD) = Pr(2 IBD) = 1/4, Pr(1 IBD) = 1/2 The idea of affected sib pair (ASP) methods is to compare this expected distribution across one (or more) classes of phenotypes. A departure from this expectation implies the marker is linked to a QTL There are a huge number of versions of this simple test. pij = frequency of a pair with i affected sibs sharing j marker alleles IBD • Compare the frequency of doubly-affected sib pairs That have both marker alleles IBD with the null value 1/4 One-sided test as p22 > 1/4 under linkage

  29. - Number of doubly-affected pairs Freq of pairs sharing 1 alleles IBS Freq of pairs with 2 alleles IBD • Compare the mean number of IBD alleles in doubly-affected pairs with the null value of 1 One-sided test as p21+ 2p22 > 1 under linkage Finally, maximum likelihood approaches have been suggested. In particular, the goodness-of-fit of the full distribution of IDB values in doubly- affected sibs Comparing p2i = n2i/n2 with 1/4 (i=0, 2) or 1/2 (i=1)

  30. ( ) ) ( The resulting test statistic is called MLS for Maximum LOD score, and is given by Where p20 = p22 = 1/4, p21 = 1/2. Linkage indicated by MLS > 3 An alternative formulation of MLS is to consider just the contribution from one parent, where (under no linkage), Pr(sibs share same parental allele IBD) = Pr(sibs don’t share same parental allele) = 1/2

  31. Fraction of sib pairs sharing parental allele IBD ( ( ( ( ) ) ) ) - ) ( ( ) - Example: Genomic scan for type I diabetes Marker D6S415. For sibs with diabetes, 74 pairs Shared the same parental allele IBD, 60 did not Marker D6S273. For sibs with diabetes, 92 pairs Shared the same parental allele IBD, 31 did not

More Related