Statistical Challenges in Genetic Studies of Mental Disorders. Heping Zhang Collaborative Center for Statistics in Science Yale University School of Medicine June 9, 2009 Institute of Mathematical Statistics National University of Singapore.
Statistical Challenges in Genetic Studies of Mental Disorders Heping Zhang Collaborative Center for Statistics in Science Yale University School of Medicine June 9, 2009 Institute of Mathematical Statistics National University of Singapore
Outline • Heredity of Psychiatric Disorders – A Century Ago • One Example • Genetic Studies of Mental Disorders – As We Are Speaking • Three Examples • Statistical Challenges – Our Progress • Ordinal Traits • Multivariate Traits • Closing Comments and Acknowledgements
PRELIMINARY REPORT OF A STUDY OF HEREDITY IN INSANITY IN THE LIGHT OF THE MENDELIAN LAWS BY GERTRUDE L. CANNON, A.M., AND A. J. ROSANOFF, M.D. KINGS PARK STATE HOSPITAL, NEW YORK Journal of Nervous and Mental Disease, May 1911
The Genetics of Tourette Syndrome Tourette syndrome is a complex disorder characterized by repetitive, sudden, and involuntary movements or noises called tics. • Concordance in MZ twins: ~50% • Concordance in DZ twins: <10% • In 1986, Pauls and Leckman concluded that Tourette's syndrome is inherited as a highly penetrant, sex-influenced, autosomal dominant trait. (Photo: Pete Bennett, winner of the 7th series of Big Brother)
The Genetics of Tourette Syndrome In 2005, State’s lab identified mutations involving the SLITRK1 gene (13q31.1) in a small number of people with Tourette syndrome. Most people with Tourette syndrome do not have a mutation in the SLITRK1 gene. Because mutations have been reported in so few people with this condition, the association of the SLITRK1 gene with this disorder has not been confirmed. TSICG (2008): Lack of association between SLITRK1var321 and Tourette syndrome in a large family-based sample
Schizophrenia Schizophrenia is a chronic, severe, and disabling brain disorder that affects about 1.1 percent of the U.S. population age 18 and older in a given year. People with schizophrenia sometimes hear voices others don’t hear, believe that others are broadcasting their thoughts to the world, or become convinced that others are plotting to harm them. These experiences can make them fearful and withdrawn and cause difficulties when they try to have relationships with others. http://www.nimh.nih.gov
Genetic Studies of Schizophrenia Kraepelin (Textbook of Psychiatry, 1896) described ‘Dementia Praecox’ as an inherited disorder. Kety, Rosenthal, and Wender conducted a series of adoption studies beginning in 1968, establishing a genetic basis for schizophrenia. In 1987-88, it was reported “Bipolar affective disorders linked to DNA markers on chromosome 11” and “Localization of a susceptibility locus for schizophrenia on chromosome 5.” These reports attracted a lot of publicity but could not be replicated. Some regions (e.g., dysbindin on chromosome 6p, neuregulin on 8p and G72 on 13q) have been more consistently identified as candidate regions. There may not be a true sequence variation in a gene that causes illness. Rather, variable expression through epigenetic modification of gene activation may be the key (DeLisi et al. 2007).
Genetic Studies of Schizophrenia Nature Online July 30, 2008: • International Schizophrenia Consortium: 3,391 schizophrenia cases and 3,181 controls in a European sample • Stefansson et al.: 1,433 schizophrenia cases and 33,250 controls, followed up in 3,285 cases and 7,951 controls • Both groups report genetic deletions associated with schizophrenia in the same three locations on chromosomes 1 and 15, and a third deletion on chromosome 22 that has previously been connected with increased susceptibility to schizophrenia.
Genetic Studies of Schizophrenia Nature July 31, 2008: • The surveys have identified sections of the human genome that, when deleted, can elevate the risk of developing schizophrenia by up to 15 times compared with the general population. • In the ISC study, a total of 890 CNVs were each observed only once, in either a case or a control. This set of CNVs showed a 1.45-fold increase in cases (empirical P = 5 × 10^-6). On average, 13.1% of cases of schizophrenia possessed a deletion or duplication observed only once in the sample, in contrast to 10.4% of controls.
Smoking In the 1990s, a series of large-sample twin studies in the US and other countries showed repeatedly that smoking is a heritable behavior. The heritability of nicotine dependence is estimated at around 50%. In the last decade, about 20 genome-wide linkage scans for smoking behavior have been reported, but only a limited number of putative genomic linkages have been replicated in independent studies (Li 2007). Challenges include genetic heterogeneity, the size of the genetic effect, the density of markers, the definition and assessment of the phenotypes, and the statistical approaches (Li 2007).
Diagnosis of Psychiatric Disorders • Yale Global Tic Severity Scale and the symptom checklist, and Yale-Brown Obsessive Compulsive Scale • Ordinal scales • Review with the family • Perform comorbid psychiatric diagnoses using the Schedule for Affective Disorders and Schizophrenia for School-Age Children, the Children’s Depression Rating Scale-Revised, and the Revised Children’s Manifest Anxiety Scale.
Schizophrenia – DSM-IV (http://www.psychiatryonline.com) • Example 1: 295.30 Schizophrenia, Paranoid Type, Continuous. Current: With severe psychotic dimension; With absent disorganized dimension; With moderate negative dimension. Lifetime: With mild psychotic dimension; With absent disorganized dimension; With mild negative dimension. • Example 2: 295.60 Schizophrenia, Residual Type, Episodic With Residual Symptoms. Current: With mild psychotic dimension; With mild disorganized dimension; With mild negative dimension. Lifetime: With moderate psychotic dimension; With mild disorganized dimension; With mild negative dimension.
Substance Abuse and Dependence An individual continues use of the substance despite significant substance-related problems. Dependence is defined as a cluster of three or more of the symptoms (Tolerance, Withdrawal, etc.) occurring at any time in the same 12-month period.
In Summary • Psychiatric disorders are generally assessed with instruments based on ordinal severity scores • Comorbid psychiatric disorders are common: TS, OCD, ADHD, etc.; Smoking, Alcohol, Depression, etc. • Fagerstrom Test for Nicotine Dependence (FTND)
Experimental Cross Ordinal Traits April 24, 2009 September 17, 2008
LOT: Linkage Analysis of Ordinal Traits LOT is a software program that performs linkage analysis of ordinal traits for pedigree data. It implements a latent-variable proportional-odds logistic model that relates inheritance patterns to the distribution of the ordinal trait.
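As a rough illustration of what a proportional-odds logistic model with a latent effect looks like, the sketch below computes category probabilities for an ordinal trait. It is not LOT's implementation: the parameterization logit P(Y <= k) = cutpoint_k - eta, the cutpoint values, and the names `category_probs` and `eta` are assumptions made for this example.

```python
import math

def category_probs(cutpoints, eta):
    """Return P(Y = k), k = 1..K, for a proportional-odds logistic model.

    Assumed form: logit P(Y <= k) = cutpoints[k-1] - eta, where eta is the
    linear predictor (covariate effects plus a latent family effect).
    """
    # Cumulative probabilities P(Y <= k) for k = 1..K-1, then P(Y <= K) = 1.
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]
    cumulative.append(1.0)
    probs, previous = [], 0.0
    for value in cumulative:
        probs.append(value - previous)  # P(Y = k) = P(Y <= k) - P(Y <= k-1)
        previous = value
    return probs

# Hypothetical 4-level trait; a larger eta shifts mass toward higher levels.
cutpoints = [-1.0, 0.0, 1.5]
low_risk = category_probs(cutpoints, eta=0.0)
high_risk = category_probs(cutpoints, eta=2.0)
print(low_risk)
print(high_risk)
```

Because the same eta is subtracted from every cutpoint, a change in the latent effect moves all cumulative odds by the same factor, which is the defining property of the proportional-odds form.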
LOT: Methodology
• Inference of Inheritance Vectors v(t)
  • Nuclear family: 2 founders and n nonfounders
  • Alleles of the two founders: (1,2) (3,4)
  • $v(t) = (v_1, v_2, \ldots, v_{2n-1}, v_{2n})'$, where for the $j$th sibling

$$
v_{2j-1} =
\begin{cases}
1, & \text{if the grandpaternal allele is transmitted in the paternal meiosis to the } j\text{th sibling} \\
2, & \text{if the grandmaternal allele is transmitted in the paternal meiosis to the } j\text{th sibling}
\end{cases}
$$

$$
v_{2j} =
\begin{cases}
3, & \text{if the grandpaternal allele is transmitted in the maternal meiosis to the } j\text{th sibling} \\
4, & \text{if the grandmaternal allele is transmitted in the maternal meiosis to the } j\text{th sibling}
\end{cases}
$$

  • More complex pedigree: f founders and n nonfounders
  • Alleles of the f founders: (1,2) (3,4) (5,6) … (2f-1,2f)
• Genetic Model and Hypothesis Testing
  • Latent variable
    • U1: common genetic or environmental factors in a family not observed through the covariates
    • U2: genetic susceptibility of the family founders and nonfounders
  • Proportional-odds logistic model
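The inheritance-vector coding for a nuclear family can be sketched in a few lines. The example below only samples a vector under Mendelian segregation (each meiosis picks either allele with probability 1/2) and counts alleles shared identical by descent between two siblings; LOT itself infers the distribution of v(t) from marker data, which is not attempted here. The function names are hypothetical.

```python
import random

def sample_inheritance_vector(n_sibs, rng):
    """Sample v = (v1, ..., v2n) for a nuclear family with n_sibs siblings.

    Entry v[2j-2] (paternal meiosis of sibling j) is 1 or 2; entry v[2j-1]
    (maternal meiosis of sibling j) is 3 or 4, following the coding above.
    """
    v = []
    for _ in range(n_sibs):
        v.append(rng.choice((1, 2)))  # paternal meiosis
        v.append(rng.choice((3, 4)))  # maternal meiosis
    return v

def ibd_sharing(v, j, k):
    """Number of alleles (0, 1, or 2) that siblings j and k (1-based) share IBD."""
    paternal_same = v[2 * (j - 1)] == v[2 * (k - 1)]
    maternal_same = v[2 * (j - 1) + 1] == v[2 * (k - 1) + 1]
    return int(paternal_same) + int(maternal_same)

rng = random.Random(2009)
v = sample_inheritance_vector(3, rng)
print(v, ibd_sharing(v, 1, 2))
```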
LOT: Data Files • Two input files are required: a locus data file and a pedigree file. • Locus file: This file contains information on genetic distances between markers, the number of alleles at each locus, and their frequencies. The format of this file is very similar to the standard GENEHUNTER (or LINKAGE) format. • Pedigree file: This file consists of columns with the following information, in this order: • Pedigree_ID Person_ID Father_ID Mother_ID Sex Phenotype Marker_genotypes Covariates
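A minimal sketch of reading one pedigree-file row in the column order listed above. The slide does not specify how marker genotypes are encoded, so this example assumes whitespace-separated fields with two allele columns per marker (as in LINKAGE-style files) and numeric covariates; the function name and the sample row are hypothetical.

```python
def parse_pedigree_line(line, n_markers):
    """Split one pedigree row into named fields.

    Assumption: each marker genotype occupies two allele columns and every
    remaining column is a numeric covariate.
    """
    fields = line.split()
    n_fixed = 6  # Pedigree_ID Person_ID Father_ID Mother_ID Sex Phenotype
    if len(fields) < n_fixed + 2 * n_markers:
        raise ValueError("too few columns for the stated number of markers")
    ped_id, person_id, father_id, mother_id, sex, phenotype = fields[:n_fixed]
    alleles = fields[n_fixed:n_fixed + 2 * n_markers]
    genotypes = [(alleles[2 * m], alleles[2 * m + 1]) for m in range(n_markers)]
    covariates = [float(x) for x in fields[n_fixed + 2 * n_markers:]]
    return {
        "pedigree_id": ped_id,
        "person_id": person_id,
        "father_id": father_id,
        "mother_id": mother_id,
        "sex": sex,
        "phenotype": phenotype,
        "genotypes": genotypes,
        "covariates": covariates,
    }

# Made-up row: family 1, person 3, parents 1 and 2, two markers, one covariate.
record = parse_pedigree_line("1 3 1 2 1 2 1 2 2 2 0.5", n_markers=2)
print(record)
```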
Association Analysis … … n families
O-TDT General Test Statistic Assume that there are n nuclear families. In the family, there are siblings, i=1,…, n. For the child in the family, the trait value is , the covariates is and the genotype is . is the number of allele A in the genotype . The association test statistic can be constructed as follows: where is a weight function of and .
O-TDT Model and Method • Di-allelic marker with possible alleles A and a. • Assume that there is a trait-increasing allele , and we use to denote the wild type allele(s) • Consider a trait taking ordinal values 1,…, K.
The score function under the null hypothesis is , where O-TDT Score Statistic
Simulation Powers Based on 10,000 Replications – Test for Association in the Presence of Linkage
Collaborative Study on the Genetics of Alcoholism (COGA) • In the United States, 12.5% of adults have had alcohol dependence at some point in their lifetime (Hasin et al., 2007) • A large-scale, multi-center study to map alcohol dependence susceptibility genes. • 143 families with 1,614 individuals. 4,720 SNPs from the Illumina genotype data set. • One ordinal trait with 4 levels was recorded (pure unaffected, never drank, unaffected with some symptoms, and affected). • FBAT was also used for comparison
Multivariate Traits [Diagram nodes: Smoking, Nicotine, Drinking, Extraneous Variable] Comorbid psychiatric disorders are common and their determinants are multi-factorial.
Multivariate Traits In theory, comorbid disorders should be considered. Technically, testing multiple traits simultaneously can avoid adjusting for multiple testing. But • How beneficial is it to consider multiple traits? • In what situations is it most beneficial to consider multiple traits?
Graphical Structures for Simulation Models Although we do not observe the causal relationship between the genotypes and traits or among the traits, we generate the data from 40 directed acyclic graphs (DAGs). An arrow from one element to another indicates a causal relationship.
SEMs for each DAG (quantitative traits) For in a DAG, if there exist some arrows pointing to , say, an arrow from gene to and an arrow from to , we reflect these relationships through a linear regression model as follows, If there are no arrows pointing to , is independent of the disease gene and other traits, and distributed as
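The idea of turning a DAG into linear structural equations can be sketched as follows. Everything specific here is hypothetical: the example graph (G → T1 → T2, with T3 having no incoming arrows), the coefficient values, and the unit-variance Gaussian noise are assumptions for illustration, not the settings used in the talk.

```python
import random

def simulate_traits(genotype, dag, order, rng, noise_sd=1.0):
    """Generate trait values from a linear SEM defined by a DAG.

    dag maps each trait to {parent: coefficient}; parents may be "G" (the
    disease-gene genotype score) or other traits. `order` must list traits
    so that every parent precedes its children (a topological order).
    """
    values = {"G": float(genotype)}
    for trait in order:
        parents = dag.get(trait, {})
        mean = sum(coef * values[parent] for parent, coef in parents.items())
        # A trait with no incoming arrows has mean 0, so it is pure noise.
        values[trait] = mean + rng.gauss(0.0, noise_sd)
    del values["G"]
    return values

# Hypothetical graph: G -> T1 -> T2, and T3 independent of everything else.
dag = {"T1": {"G": 0.5}, "T2": {"T1": 0.4}, "T3": {}}
rng = random.Random(2009)
print(simulate_traits(genotype=2, dag=dag, order=["T1", "T2", "T3"], rng=rng))
```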
Heritability and Interability Without loss of generality, we use the following models for illustration
Extraneous Variables (EV) There may exist one or more extraneous variables that are not included in the traits under consideration and that result in correlations among the traits under consideration.
Simulation Design and Settings • Generate the parents’ genotypes from the haplotype frequencies (AD=0.2, Ad=0.1, aD=0.1, ad=0.6, where D is the minor allele at the trait locus G and A is the minor allele at the marker locus) • Given the parental genotypes, generate the offspring genotype assuming a distance of 1 cM between the trait locus and the marker locus • Conditional on the trait genotype, use the SEMs of each DAG discussed above to generate the trait values for each scenario.
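The first two steps of the design can be sketched as below, using the haplotype frequencies from the slide. Two assumptions are made for the example: parental haplotypes are drawn independently, and 1 cM is treated as a recombination fraction of 0.01 (a common small-distance approximation). Function names are hypothetical.

```python
import random

# Haplotypes written as marker allele (A/a) followed by trait allele (D/d).
HAPLOTYPES = ["AD", "Ad", "aD", "ad"]
FREQS = [0.2, 0.1, 0.1, 0.6]
THETA = 0.01  # assumed recombination fraction for 1 cM

def draw_parent(rng):
    """Draw a parent's two haplotypes independently from the frequencies."""
    return tuple(rng.choices(HAPLOTYPES, weights=FREQS, k=2))

def transmit(parent, rng, theta=THETA):
    """Return the haplotype one parent passes on, allowing recombination."""
    first, second = parent if rng.random() < 0.5 else parent[::-1]
    if rng.random() < theta:
        # Recombinant: marker allele from one haplotype, trait allele from the other.
        return first[0] + second[1]
    return first

def make_offspring(father, mother, rng):
    """Offspring genotype as (paternal haplotype, maternal haplotype)."""
    return (transmit(father, rng), transmit(mother, rng))

rng = random.Random(2009)
father, mother = draw_parent(rng), draw_parent(rng)
print(father, mother, make_offspring(father, mother, rng))
```

The trait genotype of the offspring (the number of D alleles across the two haplotypes) would then feed into whatever trait model is used in the third step.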
Testing Strategies • Univariate FBAT • Rabinowitz, 1997; Whittaker and Lewis 1998 • FBAT-GEE for multiple traits • Lange et al. 2003
Power: Quantitative Traits (Alpha=0.01) FBAT: dots and FBAT-GEE: triangles.
Multivariate Trait Kendall’s Tau: a non-parametric statistic measuring the strength of the relationship between two variables
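For reference, Kendall's tau itself can be computed directly from concordant and discordant pairs. The sketch below implements the simple tau-a variant, which leaves tied pairs out of the numerator but not the denominator; how the association test builds on tau is not reproduced here, and the function name is an assumption.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # both variables order the pair the same way
        elif product < 0:
            discordant += 1  # the two variables order the pair oppositely
        # product == 0 means a tie in x or y; the pair counts for neither
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

With ordinal traits, ties are frequent, so a tie-adjusted variant such as tau-b may be preferable in practice; tau-a is used here only because it is the shortest to state.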
Association Test Observations: • Notations: Test Statistic
Simulation Study: Model Setting • Nominal type I error comparison • Power evaluation • Given the genotype at the trait locus, a non-proportional-odds model is used to generate the ordinal phenotype data and a Gaussian model is used for the quantitative phenotype.