
Lecture 17: Model-Free Linkage Analysis

Lecture 17: Model-Free Linkage Analysis. Date: 10/17/02 IBD and IBS IBD and linkage Fully Informative Sib Pair Analysis Sib Pair Analysis with Missing Info. Limitations of the Model-Based Methods. Loss of power when model misspecified in multilocus scenario.

wynona


  1. Lecture 17: Model-Free Linkage Analysis Date: 10/17/02 IBD and IBS IBD and linkage Fully Informative Sib Pair Analysis Sib Pair Analysis with Missing Info

  2. Limitations of the Model-Based Methods • Loss of power when model misspecified in multilocus scenario. • Potential increase in false positive rate when non-independent ascertainment combined with misspecified model. • Complex traits are controlled by multiple genes and the traditional model doesn’t apply.

  3. Non-Parametric (Model-Free) • If the model is such a potential problem, then how does one go about throwing out the model? • Look for associations between traits of interest and markers in relatives. The markers that relatives with the same disease/trait share are likely to be close to the disease/trait gene, especially if you see it again and again in many relative comparisons. • Read Sham Sections 3.14.1, 3.14.3.

  4. IBD vs. IBS • Let’s count the ways in which two individuals can share the same allele at a locus. • Two alleles that are indistinguishable are said to be identical by state (IBS). • If in addition the two alleles actually are inherited duplicates of the same original gene, then they are also identical by descent (IBD).

  5. IBD vs. IBS: example • [Pedigree figure: individuals 1 to 10, each labelled with a genotype at a single locus with alleles A and a.] • Can you identify another set of three IBD alleles?

  6. IBD/IBS and Linkage • IBD alleles are most relevant to linkage analysis. • We will show this in what follows. • Assume p is the allele frequency of M1, one of the alleles at marker 1. • Indicate IBD alleles in the same color and IBS alleles in different colors.

  7. An Excess of IBD Alleles vs. Other Alleles ⇒ Linkage • If a marker locus is linked to a locus of interest (disease/trait) with recombination fraction θ = 0.2, then the parent’s marker allele will be passed to (affected) offspring along with the locus of interest 80% of the time, so the offspring will get an IBD copy. p × 20% of the time the offspring will get a marker allele that is IBS to the parent’s allele. Otherwise, the offspring will get a different marker allele.

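  8. An Excess of IBD Alleles vs. Other Alleles ⇒ Linkage • [Figure: the parent’s haplotype carries D and M1. The offspring receives D together with M1 IBD with probability 0.8, with an IBS copy of M1 with probability p*0.2, or with M2 (neither IBD nor IBS) with probability (1-p)*0.2.]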

  9. Equal Numbers of IBD vs. Other Alleles ⇒ No Linkage • On the other hand, a marker allele at a locus that is not linked to a locus of interest will be passed IBD 50% of the time and IBS p × 50% of the time. Otherwise it is neither.

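  10. Equal Numbers of IBD vs. Other Alleles ⇒ No Linkage • [Figure: the parent’s haplotype carries D and M1. The offspring receives D together with M1 IBD with probability 0.5, with an IBS copy of M1 with probability p*0.5, or with M2 (neither IBD nor IBS) with probability (1-p)*0.5.]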
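The linked and unlinked cases on the two preceding pairs of slides differ only in the recombination fraction. A minimal sketch of that bookkeeping follows; the function name `transmission_probs` and the example values of θ and p are illustrative assumptions, not part of the lecture.

```python
def transmission_probs(theta, p):
    """Probabilities that the marker allele passed with the trait allele is
    IBD, IBS only, or neither, following the slides' three-way split.

    theta: recombination fraction between marker and trait locus
    p:     population frequency of marker allele M1 (assumed known)
    """
    return {
        "IBD": 1 - theta,            # no recombination: parent's own M1 copy
        "IBS": p * theta,            # recombinant, but the new allele is also M1
        "neither": (1 - p) * theta,  # recombinant and a different allele
    }

linked = transmission_probs(theta=0.2, p=0.3)    # linked marker
unlinked = transmission_probs(theta=0.5, p=0.3)  # unlinked marker
print(linked)
print(unlinked)
```

With θ = 0.5 the IBD share drops to one half, which is the "no linkage" baseline the later tests compare against.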

  11. IBD Patterns Between Pairs of Relatives • Label the alleles at a single locus in two individuals: A1 and A2 in individual 1, A3 and A4 in individual 2. There are a total of four alleles. • [Figure: the IBD states S1 to S9 among the four alleles.] • The probabilities of these states are called the coefficients of identity.

  12. All IBD Patterns • [Figure: the nine IBD states S1 to S9.]

  13. Full Sibs and IBD Value • Let D be the number of alleles that two full sibs share IBD. It takes the value 0, 1, or 2. Call this the IBD value of a sib pair. • Let Dm be 1 if the mother passes on the same allele to both sibs, 0 otherwise. • Let Df be 1 if the father passes on the same allele to both sibs, 0 otherwise. • Then, Dm and Df are both Bin(1,0.5). • What is D in terms of Dm and Df?

  14. Distribution of IBD Value • Since D = Dm + Df and the two parental transmissions are independent, D is Bin(2,0.5). • E(D) = 1 • Var(D) = 0.5
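The binomial result can be checked by brute-force enumeration. The sketch below assumes D = Dm + Df with Dm and Df independent Bin(1,0.5) indicators; the function name `ibd_distribution` is only illustrative.

```python
from fractions import Fraction
from itertools import product

def ibd_distribution():
    """Exact distribution of D = Dm + Df for a full-sib pair,
    assuming Dm and Df are independent Bernoulli(1/2) indicators."""
    half = Fraction(1, 2)
    dist = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
    for dm, df in product((0, 1), repeat=2):
        dist[dm + df] += half * half  # each (dm, df) outcome has probability 1/4
    return dist

dist = ibd_distribution()
mean = sum(d * prob for d, prob in dist.items())
var = sum((d - mean) ** 2 * prob for d, prob in dist.items())
print(dist, mean, var)
```

The enumeration gives probabilities 1/4, 1/2, 1/4 for D = 0, 1, 2, with mean 1 and variance 1/2, matching the slide.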

  15. Utility of IBD Values • We already know that if the marker allele is tightly linked to the locus of interest, then affected offspring will tend to have an IBD copy of the marker allele present on the parent’s chromosome. • Therefore, if we compare affected sibs, they will both tend to be IBD at the marker alleles that are closely linked to the locus of interest (the one that makes them “affected”).

  16. Correlation of IBD Values Near Target Loci • Consider locus A of interest, and locus B that is linked to A with recombination fraction θ. • Suppose the genetic model of locus A is dominant and the dominant allele is rare. • Therefore, DA = 1. • We seek P(DB | DA=1), the conditional distribution of DB.

  17. Conditional Distribution at the Second Locus • $\Psi = \theta^2 + (1-\theta)^2$
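The table for this slide did not survive in the transcript. A sketch of the standard result follows, assuming that DA = 1 means one parent transmitted the same allele to both sibs at locus A and the other did not. Each parent's sharing status is then preserved at locus B with probability Ψ = θ² + (1−θ)² (both sibs recombinant or both non-recombinant). The function names are illustrative.

```python
def psi(theta):
    """Probability that a parent's IBD status is the same at loci A and B."""
    return theta ** 2 + (1 - theta) ** 2

def cond_ibd_given_one(theta):
    """Return (P(DB=0), P(DB=1), P(DB=2)) given DA = 1.

    Assumes the two parental transmissions are independent: the parent
    shared at A stays shared at B with probability psi, and the parent
    not shared at A becomes shared at B with probability 1 - psi.
    """
    y = psi(theta)
    p0 = y * (1 - y)             # shared parent flips, unshared parent stays
    p1 = y * y + (1 - y) ** 2    # both stay, or both flip
    p2 = y * (1 - y)             # shared parent stays, unshared parent flips
    return (p0, p1, p2)

print(cond_ibd_given_one(0.5))  # unlinked: reduces to the Bin(2,0.5) distribution
print(cond_ibd_given_one(0.0))  # complete linkage: DB must equal DA
```

At θ = 0.5 the conditional distribution is the unconditional one, so the marker carries no information about locus A; as θ shrinks, DB concentrates on the value of DA.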

  18. A Statistical Test for Linkage • How would one use the above results to develop a statistical test for linkage of locus A and B?

  19. Fully Informative Sib Pairs • Define a sib pair as fully informative if the IBD status of the sib pair can be determined without ambiguity. • What are the requirements for fully informative families? • 1. • 2.

  20. Fully Informative Sib Pairs – Data • Let n be the number of fully-informative sib pairs. • Let n0 be the number of fully-informative sib pairs with D = 0. • Let n1 be the number of fully-informative sib pairs with D = 1. • Let n2 be the number of fully-informative sib pairs with D = 2.

  21. Fully Informative Sib Pairs – Chi-Squared Test • Define the chi-squared statistic: • What are the expected values?
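The slide's formula is missing from the transcript. One natural candidate, offered here only as an assumption, is the usual goodness-of-fit statistic comparing (n0, n1, n2) with the null expectations (n/4, n/2, n/4) implied by D being Bin(2,0.5). The counts in the example are made up for illustration.

```python
import math

def sib_pair_chi2(n0, n1, n2):
    """Goodness-of-fit chi-squared for IBD counts against 1/4 : 1/2 : 1/4.

    Returns (statistic, p_value). With three categories and no estimated
    parameters the reference distribution has 2 degrees of freedom, whose
    upper tail probability is exactly exp(-x / 2).
    """
    n = n0 + n1 + n2
    observed = (n0, n1, n2)
    expected = (n / 4, n / 2, n / 4)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.exp(-stat / 2)
    return stat, p_value

# Hypothetical counts: 100 fully informative affected sib pairs.
stat, p_value = sib_pair_chi2(10, 50, 40)
print(stat, p_value)
```

This two-sided form reacts to any departure from 1/4 : 1/2 : 1/4, including deficits of sharing that linkage would not produce.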

  22. Fully Informative Sib Pairs – Mean Test • Consider the statistic • What is E(S2’)? • Var(S2’) = n/8. (VERIFY)
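The statistic itself is not shown in the transcript. The sketch below assumes S2’ = n2 + n1/2, the total proportion of alleles shared IBD; this choice reproduces the stated variance, since Var(D/2) = 0.5/4 = 1/8 per pair under the null, giving n/8 over n independent pairs. Under the same assumption the null mean is n/2.

```python
import math

def mean_test(n0, n1, n2):
    """One-sided mean test for excess IBD sharing.

    Assumes the statistic is S = n2 + n1/2 (an assumption; the slide's
    formula is missing). Under no linkage, E(S) = n/2 and Var(S) = n/8.
    Returns (S, z), where z is the normal-approximation z-score.
    """
    n = n0 + n1 + n2
    s = n2 + 0.5 * n1
    z = (s - n / 2) / math.sqrt(n / 8)
    return s, z

# Hypothetical counts: 100 fully informative affected sib pairs.
s, z = mean_test(10, 50, 40)
print(s, z)
```

A large positive z indicates more sharing than expected by chance; unlike the chi-squared form, this test looks only in the direction linkage predicts.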

  23. Why the Mean Test? • Most powerful for loose linkage. • Most powerful for all levels of linkage when the penetrances meet a certain condition: • When is this requirement true?

  24. Limitations of Fully Informative Sib Pairs • When the parents are not heterozygous at the locus, the sib pair cannot be used. • When a genotype is missing (a parent or either sib), the sib pair cannot be used. • Is there a way to use information from uninformative sib pairs to estimate linkage?

  25. Sib Pairs with Missing Information • Missing genotypes can be filled in probabilistically using genotypes of relatives. • Count the number of alleles with unknown IBD status to deal with missing information about IBD alleles. • Assume the ith possible configuration of missing genotype(s) occurs with probability pi. • How do we estimate pi?

  26. Sib Pairs with Missing Information • Assuming genotype configuration i, count the number of times each of the following occurs:

  27. Sib Pairs with Missing Information • Estimate the actual counts of IBD alleles and non-IBD alleles as:

  28. Sib Pairs with Missing Information • If there are multiple affected sib-pairs in the sample, sum over all of them: • Note that a sibling can be used multiple times in different pairings.

  29. Sib Pairs with Missing Information • The statistic is then • Why?

  30. Example • See 3.13 in Sham.

  31. Limitations • Is the statistic really asymptotically distributed as a chi-square? Why might it not be? • The statistic may also be biased in favor of the null because there is more information about missing parental genotypes in sibs with non-IBD alleles. Why?

  32. A Likelihood Approach • Define • Let x be the observed genotypic data.

  33. A Likelihood Approach • The likelihood for a family with one sib-pair is given by: • How does one test the null hypothesis of no linkage?

  34. Using Entire Pedigrees • You might be inclined to use entire pedigrees if available rather than just the affected sibs. • Luckily this is possible. You must come up with some kind of statistic that measures the degree of IBD sharing at each locus among affected members in that pedigree. • To accomplish this, label each allele in the founding members (1,2,...,2f).

  35. A Statistic on Full Pedigrees • Suppose there are a affected members in the pedigree. You want to find which marker alleles these members share with high probability. • Consider the $2^a$ possible vectors formed by selecting one allele from each affected member. • How many permutations would leave this vector unchanged?

  36. A Statistic on Full Pedigrees • Let the number of times that founder allele i occurs in vector h be bi(h). • Then the number of permutations is given by • And a statistic on the degree of IBD sharing in a pedigree for each locus is given by
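The two formulas are missing from the transcript. The sketch below assumes the standard forms: the number of permutations leaving vector h unchanged is the product over founder alleles of bi(h)!, and the sharing statistic is the average of that product over all 2^a vectors (the "all-affected" score usually attributed to Whittemore and Halpern). The input format and function name are illustrative.

```python
from collections import Counter
from itertools import product
from math import factorial, prod

def sharing_statistic(affected_alleles):
    """Average, over all 2^a allele-choice vectors h, of prod_i b_i(h)!.

    affected_alleles: one (allele1, allele2) pair of founder-allele labels
    per affected member, assuming IBD configuration is fully known.
    Founder alleles absent from h contribute 0! = 1, so only the labels
    that actually occur in h need to be counted.
    """
    a = len(affected_alleles)
    total = 0
    for h in product(*affected_alleles):   # pick one allele per affected member
        counts = Counter(h)                # b_i(h) for each founder allele in h
        total += prod(factorial(b) for b in counts.values())
    return total / 2 ** a

print(sharing_statistic([(1, 2), (3, 4)]))  # no alleles shared
print(sharing_statistic([(1, 2), (1, 3)]))  # one allele shared
print(sharing_statistic([(1, 2), (1, 2)]))  # both alleles shared
```

The score grows with the amount of sharing and weights many relatives carrying the same founder allele especially heavily, because the factorial increases faster than linearly.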

  37. A Statistic on Full Pedigrees • How would you determine the significance of this statistic?
