Heping Zhang, Xueqin Wang and Yuanqing Ye Department of Epidemiology and Public Health

A Transmission/disequilibrium Test for Ordinal Traits in Nuclear Families and a Unified Approach for Association Studies Heping Zhang, Xueqin Wang and Yuanqing Ye Department of Epidemiology and Public Health Yale University Presented at Workshop on Genomics, NUS November 14, 2005

Outline • Data structure • TDT, Q-TDT, S-TDT, etc. • O-TDT for ordinal traits • Simulations • Data analysis • Discussion and conclusion

Data Structure … … n families

Linkage Analysis – Null Hypothesis To test for linkage, the null hypothesis is that the marker locus is not linked to any trait locus. Figure: trait locus and marker, linked versus unlinked

Linkage Analysis - Recombination Fraction Figure: trait locus and marker

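Coefficient of Linkage Disequilibrium Freq(trait allele, marker allele) - Freq(trait allele) Freq(marker allele) = coefficient of linkage disequilibrium. The first term is a haplotype frequency; the other two are allele frequencies.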
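The definition above is simple arithmetic, sketched here in Python. The function name and the frequencies are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def ld_coefficient(haplotype_freq, trait_allele_freq, marker_allele_freq):
    """Haplotype frequency minus the product of the two allele frequencies."""
    return haplotype_freq - trait_allele_freq * marker_allele_freq


# Hypothetical frequencies, for illustration only.
d = ld_coefficient(0.15, 0.2, 0.5)

# When the haplotype frequency equals the product of the allele
# frequencies, the two loci are in linkage equilibrium and the
# coefficient is zero.
d_equilibrium = ld_coefficient(0.10, 0.2, 0.5)
```

A nonzero coefficient means the trait allele and the marker allele occur together on haplotypes more (or less) often than independence would predict.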

Null Hypothesis – Linkage Disequilibrium The null hypothesis is that the haplotype relative risk (Falk and Rubinstein, 1987) equals 1. The TDT tests for linkage in the presence of association, or for association in the presence of linkage (Spielman et al. 1993; Ewens and Spielman 1995).

Transmission/Disequilibrium Test (TDT) • Eliminate the confounding effects caused by population stratification/admixture, and other factors • A McNemar’s test

TDT-McNemar Test Suppose two heterozygous parents (Aa x Aa) and an affected child are genotyped. Each parent's transmitted (Trans) and nontransmitted (Nontrans) marker alleles are tabulated:

Father
            Nontrans A   Nontrans a
Trans A         0            1
Trans a         0            0

Mother
            Nontrans A   Nontrans a
Trans A         0            0
Trans a         1            0

Father and mother combined
            Nontrans A   Nontrans a
Trans A         0            1
Trans a         1            0

TDT-McNemar Test Combinations of Transmitted and Nontransmitted Marker Alleles A and a among 2n Parents of n Affected Children
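As a minimal sketch of the McNemar form of the TDT, the Python below computes the statistic from the two off-diagonal counts of the 2 x 2 table. The function name and the counts are hypothetical; the p-value uses the standard chi-square approximation with 1 degree of freedom.

```python
from math import erfc, sqrt


def tdt_statistic(b, c):
    """McNemar-type TDT computed from heterozygous parents.

    b: number of parents who transmitted A and did not transmit a
    c: number of parents who transmitted a and did not transmit A
    """
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p_value = tdt_statistic(60, 40)
```

Only the discordant cells enter the statistic; homozygous parents and the diagonal cells carry no information about preferential transmission.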

Further Developments • Q-TDT proposed by Allison (1997) • Q-TDT further investigated by Rabinowitz (1997) • S-TDT (Spielman and Ewens 1998) • FBAT (Lunetta et al. 2000; Rabinowitz and Laird 2000) • Many other extensions

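General Test Statistic Assume that there are $n$ nuclear families. In the $i$-th family, there are $n_i$ siblings, $i = 1, \ldots, n$. For the $j$-th child in the $i$-th family, the trait value is $y_{ij}$ and the genotype is $g_{ij}$. $X_{ij}$ is the number of copies of allele A in the genotype $g_{ij}$. The linkage/association test statistic can be constructed as follows: $$S = \sum_{i=1}^{n} \sum_{j=1}^{n_i} w(y_{ij}) \left( X_{ij} - E[X_{ij}] \right),$$ where $w(y_{ij})$ is a weight of the phenotype $y_{ij}$ and the expectation is taken under the null hypothesis.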

Example • For a sample of affected child-parent triads, let the weight be 1 for every child; then the statistic is the TDT introduced by Spielman et al. (1993). • For a sample of nuclear families with quantitative trait values, let the weight be the trait value minus the average of the trait values; then the statistic is the Q-TDT introduced by Rabinowitz (1997). • For ordinal traits?

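TDT for Ordinal Traits For each trait value y, count the children whose trait values are greater than y and those whose trait values are less than y; the test statistic for ordinal traits uses weights built from these counts. Under the null hypothesis, the statistic has a known asymptotic distribution.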

Model and Method • Di-allelic marker with possible alleles A and a. • Assume that there is a trait-increasing allele at the trait locus; all other alleles are treated as wild type. • Consider a trait taking values in ordinal responses 1,…, K.

Two Common Assumptions • The trait and marker loci are closely linked such that, given the family’s genotypes at the trait locus, the family’s phenotypes and marker genotypes are independent; • Given the disease genotypes, the traits of the family members are conditionally independent.

Conditional Likelihood The score function

Score Statistic The score function is evaluated under the null hypothesis after plugging in the estimates for the nuisance parameters.

Expectation and Variance • Following the idea of Rabinowitz and Laird (2000), we can compute or estimate the conditional expectation and the conditional variance given the observed trait values under the null hypothesis in the following three cases: • marker information is available for both parents; • marker information is available for only one parent; and • no parental marker information is available.

Expectation and Variance

Both Parents Genotyped When both parents’ genotypes are observed, the children’s genotypes are conditionally independent.
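For the case where both parents are genotyped, the conditional mean and variance of a child's allele count follow from Mendelian transmission. The Python sketch below illustrates this and then forms a generic weighted, standardized statistic. All function names are hypothetical, the weights are treated as fixed given the observed traits, and this is an illustration of the general idea rather than the authors' exact score statistic.

```python
from itertools import product


def child_allele_count_dist(father, mother):
    """Distribution of the number of A alleles in a child, given parental
    genotypes written as two-character strings such as "Aa"."""
    dist = {0: 0.0, 1: 0.0, 2: 0.0}
    # Each parent transmits either of its two alleles with probability 1/2.
    for paternal, maternal in product(father, mother):
        dist[(paternal == "A") + (maternal == "A")] += 0.25
    return dist


def mean_and_variance(father, mother):
    """Null mean and variance of the child's A-allele count."""
    dist = child_allele_count_dist(father, mother)
    mean = sum(k * p for k, p in dist.items())
    var = sum((k - mean) ** 2 * p for k, p in dist.items())
    return mean, var


def standardized_statistic(families):
    """families: list of (father, mother, [(weight, child_A_count), ...]).

    Children are conditionally independent given both parental genotypes,
    so the variances of the weighted terms simply add up.
    """
    score, variance = 0.0, 0.0
    for father, mother, children in families:
        mean, var = mean_and_variance(father, mother)
        for weight, count in children:
            score += weight * (count - mean)
            variance += weight ** 2 * var
    if variance == 0.0:
        raise ValueError("no informative families")
    return score / variance ** 0.5


# Hypothetical data: two families, one child each, unit weights.
z = standardized_statistic([("Aa", "Aa", [(1.0, 2)]), ("Aa", "Aa", [(1.0, 2)])])
```

Families in which both parents are homozygous contribute zero variance and are therefore uninformative, which mirrors the role of heterozygous parents in the TDT.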

One Parent Genotyped

One Parent Genotyped (continued)

No Parental Genotype

Simulation Studies • Assess the type I error of our score test with respect to specific nominal levels (0.05, 0.01, and 0.0001) to validate the asymptotic behavior of the test statistic. • Compare the power of our test with other test statistics. • Choose the ordinal level K=3, 4, or 5.

Simulation Design • Generate the parents’ genotypes given the haplotype frequencies

Simulation Design • Given the parental genotypes, generate the offspring genotypes assuming unlinked (null) or linked (1cM, alternative) trait and marker loci • Conditional on the trait genotype, use the proportional odds model to generate the ordinal trait. • 200 or 400 families are generated
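The trait-generation step can be sketched as follows in Python. The parameterization logit P(Y <= k | g) = cutpoint_k - beta * (number of trait-increasing alleles), the cutpoints, the effect size, and all function names are assumptions made for illustration; the slides do not give the values used in the actual simulations.

```python
import math
import random


def ordinal_probs(cutpoints, beta, n_risk_alleles):
    """P(Y = k), k = 1..K, under an assumed proportional odds model:
    logit P(Y <= k | g) = cutpoints[k-1] - beta * n_risk_alleles.
    cutpoints must be increasing; K = len(cutpoints) + 1."""
    cumulative = [
        1.0 / (1.0 + math.exp(-(a - beta * n_risk_alleles))) for a in cutpoints
    ] + [1.0]
    return [cumulative[0]] + [
        cumulative[k] - cumulative[k - 1] for k in range(1, len(cumulative))
    ]


def draw_trait(cutpoints, beta, n_risk_alleles, rng):
    """Draw one ordinal trait value in 1..K given the trait genotype."""
    probs = ordinal_probs(cutpoints, beta, n_risk_alleles)
    return rng.choices(range(1, len(probs) + 1), weights=probs, k=1)[0]


rng = random.Random(2005)
cutpoints = [-1.0, 0.0, 1.0]  # hypothetical cutpoints, giving K = 4 levels
traits = [draw_trait(cutpoints, 0.5, g, rng) for g in (0, 1, 2)]
```

With a positive effect size, each additional trait-increasing allele shifts probability toward the higher categories while keeping the same odds ratio at every cutpoint, which is what the proportionality assumption means.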

Three models to generate trait values • A proportional odds model is used to generate an ordinal trait; • A non-proportional odds model is also used to generate an ordinal trait to assess the robustness of our score test with respect to the proportionality assumption; • A Gaussian model is used to generate a quantitative trait to evaluate the performance of O-TDT for quantitative traits.

Ordinal Traits Generated from a Proportional Odds Model (a)

Type I Errors Based on 10,000 Replications (a)

Figure: Power comparison (a)

Non-Proportional Odds Model (b) Conditional and marginal distribution for ordinal trait

Type I Errors Based on 10,000 Replications (b)

Figure: Power comparison (b)

Performance for Quantitative Traits (c) Our test can serve as a unified test for any trait. For a quantitative trait, the weights in our test are functions of the quantiles. Simulations show that our test is competitive with, but slightly less powerful than, Q-TDT.

Type I Errors for Quantitative Traits Based on 100,000 Replications (c)

Power: Quantitative Trait Data are simulated as in the experiments for assessing type I error, with one exception. Given the genotype at the trait locus, the quantitative trait follows a normal distribution with unit variance and mean proportional to the number of trait-increasing alleles. Namely, $Y \mid g \sim N(\beta\, c(g), 1)$, where $c(g)$ is the number of trait-increasing alleles in the trait genotype $g$ and $\beta$ is the proportionality constant.

Figure: Power comparison (c)

Data (Dr. Ming Li) • Identify candidate SNPs through association analysis • Nicotine dependence was measured in 313 families with 1,396 subjects. 12 SNPs were genotyped for the GPR51 gene (suggested from Framingham Heart Study samples). • One ordinal trait with 8 levels was assessed by the Fagerström Test for Nicotine Dependence (FTND) • FBAT was also used for comparison

FTND • 1. How many cigarettes a day do you usually smoke? (0-3 points) • 2. How soon after you wake up do you smoke your first cigarette? (0-3 points) • 3. Do you smoke more during the first two hours of the day than during the rest of the day? (0,1) • 4. Which cigarette would you most hate to give up? (0,1) • 5. Do you find it difficult to refrain from smoking in places where it is forbidden, such as public buildings, on airplanes or at work? (0,1) • 6. Do you still smoke even when you are so ill that you are in bed most of the day? (0,1) • TOTAL POINTS =

GPR51 Gene • G protein-coupled receptor 51 (on 5q24 in the rat genome and 9q22.33 in the human genome) • Combines with GABA-B1 to form functional GABA-B receptors • Inhibits high-voltage-activated calcium ion channels

Results

Discussion and Conclusion • We propose a score test statistic for linkage analysis. • Although it is derived from a proportional odds model for ordinal traits, power comparisons reveal that it can serve as a unified approach for dichotomous, quantitative, and ordinal traits. • The score-based Q-TDT yields lower power than O-TDT for ordinal traits; the difference ranges from a few to tens of percentage points, depending on the distribution of the ordinal traits.
