Thoughts about the TDT

Contribution of TDT: Finding Genes for 3 Complex Diseases
• PPAR-gamma in Type 2 diabetes: Altshuler et al., Nat Genet 26: 76-80, 2000
• NOD2 in Crohn’s Disease: Hugot et al., Nature 411: 599-603, 2001
• ADAM33 in asthma: Van Eerdewegh et al., Nature 418: 426-430, 2002
The common PPAR-gamma Pro12Ala polymorphism is associated with decreased risk of type 2 diabetes. Altshuler et al., Nat Genet 26: 76-80, 2000
NOD2 Variants and Susceptibility to Crohn’s Disease. Chromosome 16q. SNP13: $p = 6 \times 10^{-6}$. Hugot et al., Nature 411: 599-603, 2001
ADAM33 Gene: Asthma and Bronchial Hyperresponsiveness. Chromosome 20p. $p = 3 \times 10^{-6}$ to $0.04$. Van Eerdewegh et al., Nature 418: 426-430, 2002
Supplementary Information Table 2. Transmission disequilibrium test (TDT) for 5 SNPs in ADAM33. Phenotype: asthma. (T = transmitted, NT = not transmitted.)

SNP / SNP combination   Over-transmitted allele/haplotype   T    NT   TDT p-value
S1                      G                                   37   20   0.033
T1                      T                                   43   27   0.072
V-1                     C                                   43   27   0.072
V1                      A                                   7    7    1.00
V4                      C                                   73   55   0.13
SNP haplotypes
S1/T1                   GT                                  72   38   0.0029
T1/V-1                  TC                                  80   46   0.0043
T1/V4                   TC                                  97   60   0.0070
S1/T1/V-1               GTC                                 77   41   0.0029
S1/T1/V1                GTA                                 75   41   0.0047
S1/T1/V4                GTC                                 96   60   0.0084
T1/V-1/V1               TCA                                 76   45   0.015
T1/V-1/V4               TCC                                 97   59   0.0046
T1/V1/V4                TAC                                 98   58   0.0031
S1/T1/V-1/V1            GTCA                                74   41   0.0068
S1/T1/V-1/V4            GTCC                                96   58   0.0034
S1/T1/V1/V4             GTAC                                97   58   0.0078
T1/V-1/V1/V4            TCAC                                96   59   0.0063
S1/T1/V-1/V1/V4         GTCAC                               95   58   0.0048
Population distributions of (a) disease given genotype, and (b) genotype given disease.
Clayton: odds ratio. Ott: he calls this the relative risk. Confusing!
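[Diagram: disease locus D (alleles D1, D2) and marker locus M (alleles M1, M2), separated by recombination fraction θ]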
Null hypothesis: θ = ½ (disease and marker loci unlinked)
Alternative hypothesis: θ < ½ (disease and marker loci linked)
Linkage disequilibrium: freq(D1M1) ≠ freq(D1) × freq(M1), with coefficient
$$\delta = \mathrm{freq}(D_1 M_1) - \mathrm{freq}(D_1) \times \mathrm{freq}(M_1)$$
We assume that we observe the marker locus genotypes, either M1M1, M1M2, or M2M2, of both parents and the affected sibs in all families in the data.
Probabilities for transmitted and non-transmitted marker alleles M1 and M2 from any parent of an affected child.

                      Non-transmitted allele
Transmitted allele    M1       M2       Total
M1                    P(11)    P(12)    P(1.)
M2                    P(21)    P(22)    P(2.)
Total                 P(.1)    P(.2)    1
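With p = frequency of D1 and q = frequency of M1:
$$
\begin{aligned}
P(11) &= q^2 + \frac{q\,\delta}{p} \\
P(12) &= q(1 - q) + \frac{(1 - \theta - q)\,\delta}{p} \\
P(21) &= q(1 - q) + \frac{(\theta - q)\,\delta}{p} \\
P(22) &= (1 - q)^2 - \frac{(1 - q)\,\delta}{p}
\end{aligned}
$$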
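The four cell probabilities can be checked numerically. The sketch below is illustrative only: the function name `tdt_cell_probs` and the parameter values are assumptions, not taken from the slides. It confirms that the cells sum to 1 and that P(12) − P(21) = (1 − 2θ)δ/p, which vanishes when θ = ½.

```python
def tdt_cell_probs(p, q, delta, theta):
    """Return (P11, P12, P21, P22) for transmitted / non-transmitted alleles.

    p: frequency of D1, q: frequency of M1,
    delta: linkage disequilibrium, theta: recombination fraction.
    """
    p11 = q * q + q * delta / p
    p12 = q * (1 - q) + (1 - theta - q) * delta / p
    p21 = q * (1 - q) + (theta - q) * delta / p
    p22 = (1 - q) ** 2 - (1 - q) * delta / p
    return p11, p12, p21, p22


# Hypothetical parameter values, chosen only for illustration.
linked = tdt_cell_probs(p=0.1, q=0.3, delta=0.02, theta=0.05)
unlinked = tdt_cell_probs(p=0.1, q=0.3, delta=0.02, theta=0.5)

print(sum(linked))                 # the four cells sum to 1
print(linked[1] - linked[2])       # equals (1 - 2*theta) * delta / p
print(unlinked[1] - unlinked[2])   # zero (up to rounding) when theta = 1/2
```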
Numbers of transmitted and non-transmitted marker alleles M1 and M2 among the parents of the affected sibs.

                      Non-transmitted allele
Transmitted allele    M1      M2
M1                    n11     n12
M2                    n21     n22

Put n12 + n21 = n.
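Only P(12) and P(21) depend on θ. Also, when θ = ½, P(12) = P(21). So the “natural” (TDT) test statistic is
$$\chi^2_{TDT} = \frac{(n_{12} - n_{21})^2}{n_{12} + n_{21}} = \frac{(n_{12} - n_{21})^2}{n}.$$
This (McNemar) statistic has an asymptotic χ² distribution with 1 df when the null hypothesis is true.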
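A minimal sketch of the computation, assuming the statistic is (n12 − n21)²/(n12 + n21). The function names are hypothetical. Two p-values are shown: the asymptotic 1 df χ² tail and an exact two-sided binomial tail. The slides do not say how the published ADAM33 p-values were computed, so neither is claimed to reproduce them.

```python
import math


def tdt_statistic(n12, n21):
    """McNemar-type TDT statistic (n12 - n21)^2 / (n12 + n21)."""
    n = n12 + n21
    if n == 0:
        raise ValueError("no transmissions from informative parents")
    return (n12 - n21) ** 2 / n


def asymptotic_p_value(chi2):
    """Upper tail of a 1 df chi-square: P(Z^2 > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi2 / 2))


def exact_p_value(n12, n21):
    """Two-sided exact p-value using n12 ~ Binomial(n, 1/2) under the null."""
    n = n12 + n21
    k = max(n12, n21)
    upper_tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Counts for SNP S1 in the ADAM33 table: T = 37, NT = 20.
chi2_s1 = tdt_statistic(37, 20)
print(chi2_s1, asymptotic_p_value(chi2_s1), exact_p_value(37, 20))
```

Only the two off-diagonal counts enter the calculation, so n11 and n22 are never passed in.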
Note that this statistic depends only on n12 and n21, and ignores n11 and n22. This makes sense: the statistic uses data only from M1M2 parents, and only these are informative for linkage. We call these “informative” parents. So at the end of the day we consider only transmissions from informative parents.
We will focus entirely on the denominator, n, of the TDT statistic. It is remarkable how many questions one can ask about this. But before we ask these, we first ask, where does this denominator come from?
Assuming the null hypothesis is true, n12 has a binomial (n, ½) distribution, with variance n/4. Note: this is true even if the data contain several affected children from the same family. Thus the variance of n12 – n21 (= 2n12 – n) is 4 × (n/4) = n.
We will examine three situations, all focusing on the question: “Is n the correct (variance) denominator for the situation at hand?”
Situation 1. Testing for association.
• Here the null hypothesis is “no association”, or δ = 0.
The problem here is that transmissions to different affected sibs in the same family are not independent under this null hypothesis. Thus when there are several families in the data with more than one affected sib, n12 does not have a binomial distribution.
If H0, δ = 0, is true, the cell probabilities for the simple random-mating case are
$$P(11) = q^2, \quad P(12) = q(1 - q), \quad P(21) = q(1 - q), \quad P(22) = (1 - q)^2.$$
(Thus should we not be testing this H0 by using both $n_{11} n_{22} - n_{12} n_{21}$ and $n_{12} - n_{21}$, and a 2 degrees of freedom test?) Let’s ignore this point for now.
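$$
\begin{aligned}
P(11) &= \frac{\sum_i \alpha_i \left( p_i^2 q_i^2 + \delta_i p_i q_i \right)}{\sum_i \alpha_i p_i^2} \\
P(12) &= \frac{\sum_i \alpha_i \left( p_i^2 q_i (1 - q_i) + \delta_i p_i (1 - \theta - q_i) \right)}{\sum_i \alpha_i p_i^2} \\
P(21) &= \frac{\sum_i \alpha_i \left( p_i^2 q_i (1 - q_i) + \delta_i p_i (\theta - q_i) \right)}{\sum_i \alpha_i p_i^2} \\
P(22) &= \frac{\sum_i \alpha_i \left( p_i^2 (1 - q_i)^2 - \delta_i p_i (1 - q_i) \right)}{\sum_i \alpha_i p_i^2}
\end{aligned}
$$
• αi = relative size of subpopulation i
• δi = linkage disequilibrium in subpopulation i
• pi = frequency of D1 in subpopulation i
• qi = frequency of M1 in subpopulation i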
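The stratified formulas can be explored numerically. In the sketch below the function name and all parameter values are hypothetical. It illustrates that when δi = 0 in every subpopulation, P(12) = P(21) even though allele frequencies differ between subpopulations, while P(11)P(22) − P(12)P(21) is not zero.

```python
def stratified_cell_probs(alpha, p, q, delta, theta):
    """Cell probabilities (P11, P12, P21, P22) for a stratified population.

    alpha, p, q, delta are equal-length sequences, one entry per subpopulation.
    """
    rows = list(zip(alpha, p, q, delta))
    denom = sum(a * pi ** 2 for a, pi, _, _ in rows)
    p11 = sum(a * (pi ** 2 * qi ** 2 + di * pi * qi) for a, pi, qi, di in rows)
    p12 = sum(a * (pi ** 2 * qi * (1 - qi) + di * pi * (1 - theta - qi))
              for a, pi, qi, di in rows)
    p21 = sum(a * (pi ** 2 * qi * (1 - qi) + di * pi * (theta - qi))
              for a, pi, qi, di in rows)
    p22 = sum(a * (pi ** 2 * (1 - qi) ** 2 - di * pi * (1 - qi))
              for a, pi, qi, di in rows)
    return p11 / denom, p12 / denom, p21 / denom, p22 / denom


# Two hypothetical subpopulations with different allele frequencies
# and no disequilibrium within either one.
strat = stratified_cell_probs(alpha=[0.5, 0.5], p=[0.1, 0.3], q=[0.2, 0.6],
                              delta=[0.0, 0.0], theta=0.05)
print(strat[1] - strat[2])                        # P(12) - P(21)
print(strat[0] * strat[3] - strat[1] * strat[2])  # P(11)P(22) - P(12)P(21)

# With a single subpopulation the unstratified formulas are recovered.
single = stratified_cell_probs([1.0], [0.1], [0.3], [0.02], theta=0.05)
print(single)
```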
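Suppose that in family j, M1 is transmitted n12j times and M2 is transmitted n21j times, from M1M2 parents. Define $D_j = n_{12j} - n_{21j}$, so that $\sum_j D_j = n_{12} - n_{21}$. The test statistic is
$$T = \frac{\sum_j D_j}{\sqrt{\sum_j D_j^2}}.$$
Suppose that there is only one affected child in each family. Then $D_j = \pm 1$ (for all j), so $\sum_j D_j^2 = n$ and $T^2 = \chi^2_{TDT}$.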
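A small sketch of the family-based statistic, assuming it has the form T = ΣDj / √(ΣDj²), which is the form consistent with T² equalling the TDT χ² when every Dj = ±1. The function name and the family counts are hypothetical.

```python
import math


def family_tdt_statistic(family_counts):
    """T = sum(D_j) / sqrt(sum(D_j^2)), with D_j = n12_j - n21_j per family.

    family_counts: list of (n12_j, n21_j) pairs, one per family.
    """
    d = [n12_j - n21_j for n12_j, n21_j in family_counts]
    sum_sq = sum(x * x for x in d)
    if sum_sq == 0:
        raise ValueError("every family has D_j = 0")
    return sum(d) / math.sqrt(sum_sq)


# Hypothetical data: ten families, each contributing one transmission
# from an informative parent, so every D_j is +1 or -1.
one_child = [(1, 0)] * 7 + [(0, 1)] * 3
t_one = family_tdt_statistic(one_child)
tdt_chi2 = (7 - 3) ** 2 / (7 + 3)
print(t_one ** 2, tdt_chi2)

# Hypothetical data with several affected sibs per family: same totals
# (n12 = 7, n21 = 3), but the denominator is no longer n.
multi_sib = [(3, 0), (2, 1), (2, 0), (0, 2)]
print(family_tdt_statistic(multi_sib) ** 2)
```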
Situation 2. Suppose we have families in the data where both parents are dead, (so we do not know their marker locus genotypes), but where there are two affected sibs, one being M1M1, the other M2M2. We therefore can infer that both parents were informative. Should we use the data from these families in the analysis, using the standard TDT statistic?
The answer is “no”. Why is this so? Because the very fact that we can infer the parental genotypes unambiguously means that one sib MUST be M1M1 and the other MUST be M2M2. In such families there is zero variance, rather than some binomial variance, for the number of M1 genes in the two sibs.
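The zero-variance claim can be checked by enumeration. The sketch below assumes the null hypothesis, so that each of the 16 ways two M1M2 parents can transmit alleles to two sibs is equally likely. Variable names are illustrative.

```python
from itertools import product

# Each M1M2 parent transmits allele 1 (M1) or 2 (M2) to each of two sibs:
# (father to sib 1, mother to sib 1, father to sib 2, mother to sib 2).
outcomes = []
for f1, m1, f2, m2 in product((1, 2), repeat=4):
    sib1 = tuple(sorted((f1, m1)))
    sib2 = tuple(sorted((f2, m2)))
    n_m1 = [f1, m1, f2, m2].count(1)  # number of M1 transmissions
    outcomes.append((sib1, sib2, n_m1))

# Unconditionally, the number of M1 transmissions ranges from 0 to 4.
all_counts = sorted({n for _, _, n in outcomes})

# Condition on the sib pair that lets us infer the dead parents' genotypes:
# one sib M1M1 and the other M2M2.
inferable = [n for s1, s2, n in outcomes if {s1, s2} == {(1, 1), (2, 2)}]

print(all_counts)
print(inferable)
```

Every retained outcome has exactly two M1 and two M2 transmissions, so n12 − n21 = 0 in such families regardless of linkage.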
Philosophical question: is there any difference between the actions you take in directly observing an event and having unambiguous evidence that the event occurred? In this case, “yes there is”.
Situation 3. Suppose that we have two affected sibs, one informative (i.e. M1M2) parent, in each family in the data.
Suppose that a sharing χ² has been carried out, correctly, as a one-sided test. Given i + k = s, what is the distribution of $\chi^2_{TDT}$?
Subpopulation                           1     2     ……   i     ……   k
Relative size                           α1    α2    ……   αi    ……   αk
Coefficient of gametic disequilibrium   δ1    δ2    ……   δi    ……   δk

• Generation 0 to Generation 1: parents of generation 1 mate only within their subpopulation. Gametic disequilibrium Δ1.
• Generation 1 to Generation 2: parents of generation 2 mate at random throughout population. Gametic disequilibrium Δ2 = Δ1(1 – θ).
• Generation 2 to Generation 3: parents of generation 3 mate at random throughout population. Gametic disequilibrium Δ3 = Δ2(1 – θ).
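The recursion Δ2 = Δ1(1 – θ), Δ3 = Δ2(1 – θ) can be iterated directly. A minimal sketch, with a hypothetical function name and hypothetical starting values (the slides do not give Δ1 or θ):

```python
def disequilibrium_by_generation(delta1, theta, generations):
    """Return [Delta_1, ..., Delta_generations] under random mating.

    Each generation of random mating multiplies the gametic
    disequilibrium by (1 - theta).
    """
    values = [delta1]
    for _ in range(generations - 1):
        values.append(values[-1] * (1 - theta))
    return values


# Hypothetical starting disequilibrium and recombination fraction.
decay = disequilibrium_by_generation(delta1=0.05, theta=0.1, generations=3)
print(decay)
```

For tightly linked loci (small θ) the decay is slow, while for unlinked loci (θ = ½) the disequilibrium halves each generation.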
[Diagram: Generation 0 → Generation 1 (gametic disequilibrium Δ1) → Generation 2 (gametic disequilibrium Δ2) → Generation 3 (gametic disequilibrium Δ3) → Generation 4, etc.]
The value of the TDT statistic in two models

Generation   1. Immediate admixture   2. Gradual admixture
1            1.48                     1.48
2            2.07                     2.07
3            15.34                    8.53
4            12.43                    6.99