Rare variant analysis in large-scale association and sequencing studies • Eleftheria Zeggini • eleftheria@sanger.ac.uk
Missing heritability in complex traits • Interactions • Structural variation • Epigenetics and environment • Thousands of very small effects • Large phenotype-genotype heterogeneity • Locus heterogeneity and rare variants
Low frequency and rare variants • Low frequency (0.01<MAF<0.05) and rare variation (MAF<0.01) can contribute to complex common phenotypes • Rare variants can have higher penetrance, contribute to more extreme phenotypes and may be more useful as predictive markers • Accessing low frequency and rare variants through: • GWAS • imputation • re-sequencing
Rare variant analysis • Single-point analysis of rare variants is under-powered • The approximate sample size (cases+controls, equally sized) required to attain 80% power to detect an allelic OR=2.0 at α=5×10⁻⁸ increases dramatically as MAF decreases [table of sample sizes by MAF not reproduced] • An alternative is to use multivariate methods to combine information across multiple variant sites • Several locus-specific approaches have been proposed • collapsing methods • allele-matching methods
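The sample-size table from this slide is not available, but the trend it describes can be illustrated with a rough calculation. The sketch below is not the calculation used for the slide; it assumes a standard normal-approximation formula for comparing two allele frequencies, takes the MAF as the control allele frequency, and derives the case frequency from the allelic odds ratio. The function name `total_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def total_sample_size(maf, odds_ratio=2.0, alpha=5e-8, power=0.80):
    """Approximate total individuals (equal cases and controls) for an allelic test.

    Assumption: two-proportion normal approximation on allele counts,
    with the MAF taken as the allele frequency in controls.
    """
    p0 = maf
    # Case allele frequency implied by the allelic odds ratio.
    p1 = odds_ratio * p0 / (1.0 - p0 + odds_ratio * p0)
    p_bar = (p0 + p1) / 2.0
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided threshold
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * math.sqrt(p0 * (1.0 - p0) + p1 * (1.0 - p1))
    ) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    # Each individual contributes two alleles.
    individuals_per_group = math.ceil(alleles_per_group / 2.0)
    return 2 * individuals_per_group


if __name__ == "__main__":
    for maf in (0.05, 0.01, 0.005, 0.001):
        print(f"MAF={maf}: ~{total_sample_size(maf)} individuals in total")
```

Under these assumptions the required sample size grows steeply as MAF falls, which is the motivation for combining information across variant sites.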
Rare variant analysis methods: challenges • Imputation • Genotype-associated probabilities • Resequencing • Genotype call uncertainty • False positive rate • Probability that a variant is functional • Family-based designs • Extreme distribution ends designs • Incorporating multiple covariates • Correlation structure • Direction of effect • Meta-analysis
Collapsing methods • [Figure: example per-site values pi = 0.2, 0.1, 0.0, 0.2] • ARIEL: Accumulation of Rare variants Integrated and Extended Locus-specific test
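As a point of reference for the collapsing idea, here is a minimal sketch of a generic carrier-based collapsing test. It is not ARIEL, which additionally integrates quality information. It assumes genotypes coded as minor-allele counts (0/1/2), collapses all sites below an assumed MAF threshold into one carrier indicator per individual, and compares carrier counts between cases and controls with a 1-df Pearson chi-square. All function names are hypothetical.

```python
import math


def carrier_indicator(genotypes, mafs, maf_threshold=0.01):
    """Return 1 if the individual carries a minor allele at any site below the MAF threshold."""
    return int(any(g > 0 for g, maf in zip(genotypes, mafs) if maf < maf_threshold))


def collapsed_chi_square(cases, controls, mafs, maf_threshold=0.01):
    """Pearson chi-square (1 df) on carrier vs non-carrier counts in cases and controls."""
    a = sum(carrier_indicator(row, mafs, maf_threshold) for row in cases)     # case carriers
    c = sum(carrier_indicator(row, mafs, maf_threshold) for row in controls)  # control carriers
    b = len(cases) - a      # case non-carriers
    d = len(controls) - c   # control non-carriers
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence either way
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    mafs = [0.005, 0.008, 0.20]  # the third site is common and is ignored
    cases = [[1, 0, 1], [0, 1, 0], [0, 0, 2], [1, 0, 0]]
    controls = [[0, 0, 1], [0, 0, 0], [0, 0, 2], [0, 0, 0]]
    print(collapsed_chi_square(cases, controls, mafs))
```

With counts this small an exact test would be preferable; the chi-square is used here only to keep the sketch short. Because carriers are pooled regardless of direction, a test of this kind loses power when risk and protective variants sit in the same locus.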
Allele-matching methods • [Figure: example values; cases 2 4 4 4 4 2; controls 2 4 0 4 4 4] • Compare similarity scores between cases and controls at each SNP, then sum over SNPs: KBAT (Mukhopadhyay et al, Gen Epi 2009) • Extended to account for uncertainty: AMELIA (Allele-Matching Empirical Locus Integrated Association test)
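The values 2, 4 and 0 in the figure are consistent with an allele-match similarity score, which counts how many of the four possible allele pairings between two genotypes agree. The sketch below illustrates that score and a simple within-group versus between-group contrast summed over SNPs. It is an illustration of the general idea only, not the KBAT or AMELIA statistic; the contrast and all function names are assumptions made for this example.

```python
from itertools import combinations, product


def allele_match(g1, g2):
    """Matching allele pairs between two genotypes coded as minor-allele counts (0/1/2)."""
    return g1 * g2 + (2 - g1) * (2 - g2)


def mean_within(genos):
    """Average allele-match score over all pairs of individuals in one group."""
    pairs = list(combinations(genos, 2))
    return sum(allele_match(a, b) for a, b in pairs) / len(pairs)


def mean_between(genos_a, genos_b):
    """Average allele-match score over all case-control pairs."""
    pairs = list(product(genos_a, genos_b))
    return sum(allele_match(a, b) for a, b in pairs) / len(pairs)


def similarity_contrast(cases, controls):
    """Sum over SNPs of mean within-group similarity minus between-group similarity."""
    total = 0.0
    for j in range(len(cases[0])):
        case_j = [row[j] for row in cases]
        ctrl_j = [row[j] for row in controls]
        within = (mean_within(case_j) + mean_within(ctrl_j)) / 2.0
        total += within - mean_between(case_j, ctrl_j)
    return total


if __name__ == "__main__":
    cases = [[0, 1], [0, 2], [1, 2]]
    controls = [[2, 0], [2, 0], [1, 1]]
    print(similarity_contrast(cases, controls))
```

In practice significance would be assessed empirically, for example by permuting case/control labels. Because the score measures similarity rather than counting risk alleles, a contrast of this kind does not require all variants to act in the same direction.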
Power comparison • 1000 replications, d=0.02, Q=0.05, non-consensus SNP quality scores, 1000 cases/1000 controls; causal variants are of high quality (Phred score 10; probability of correct base-call 0.90) • In the presence of different directions of effect, allele-matching methods are much more powerful than collapsing methods • Accounting for uncertainty increases power
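The correspondence between a Phred score of 10 and a 0.90 probability of a correct base-call follows from the standard Phred definition, Q = −10·log10(P_error). A minimal sketch of the conversion (function names are illustrative):

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by a Phred quality score."""
    return 10.0 ** (-q / 10.0)


def error_prob_to_phred(p_error):
    """Phred quality score implied by an error probability."""
    return -10.0 * math.log10(p_error)


if __name__ == "__main__":
    for q in (10, 20, 30):
        print(q, 1.0 - phred_to_error_prob(q))
```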
Power comparisons using 500 cases/500 controls and 1000 cases/1000 controls, when causal variants are of high quality (Phred score 10; probability of correct base-call 0.90) • The power advantage of the allele-matching methods over the collapsing methods grows with increasing sample size • Accounting for uncertainty increases power
Population isolates • The study of rare variants can be empowered by focusing on isolated populations, in which rare variants may have increased in frequency and linkage disequilibrium tends to be extended • Need deeply-phenotyped isolated population samples • Whole-genome sequencing in a subset of samples, followed by imputation into the full set of samples with GWAS data • Association with traits of interest
Osteoarthritis • Osteoarthritis (OA) is characterised by cartilage degeneration in synovial joints, leading to pain and loss of function, particularly in the hip and the knee • OA is a common complex disease with environmental and genetic components; it affects 40% of people over the age of 70 years • Current treatments: analgesics, total joint replacement (TJR) • To date only two loci have been robustly associated with OA • Common variants (MAF>0.20) with small effect sizes (OR~1.15)
3,177 cases • 4,854 controls • Directly typed SNPs (Illumina 610k) • Imputed SNPs: HapMap • Imputed SNPs: 1000 Genomes
[Figure panels: Directly-typed • HapMap-based imputation • 1KGP-based imputation]
Intron 4 of the guanine nucleotide exchange factor-encoding gene MCF2L • Mcf2l studies in rat models of OA have shown expression in articular chondrocytes • In human cells, MCF2L regulates neurotrophin-3-induced cell migration in Schwann cells • Neurotrophin-3 is a member of the nerve growth factor (NGF) family, and inhibition of NGF has an effect on the pain experienced by OA patients
Analysis of rare variants in sequence data • [Workflow diagram labels: Long-range PCR • Targeted resequencing • Whole-genome and whole-exome resequencing • PE library preparation • Pulldown • PE sequencing • Data processing and statistical analysis]
500 Exomes Project • Collaborative exome resequencing experiment between the Sanger Institute, GSK and Lausanne University • Study design: • 500 individuals from the CoLaus cohort with BMI>25 • 250 with type 2 diabetes and 250 normoglycaemic matched controls • Affymetrix 500k GWAS data • Exome sequencing • Mean depth ~65x
500 Exomes Project – preliminary data • [Figure panels: Single-point • ARIEL • AMELIA]
UK10K project: Rare genetic variants in health and disease • 4,000 whole genomes: population-based cohorts with rich phenotype data • 6,000 whole exomes: obesity, neurodevelopmental disorders and further rare diseases • Aims • Elucidate singleton variants by maximising variation detected • Directly associate genetic variations to phenotypic traits • Uncover rare variants contributing to disease • Assign uncovered variations into genotyped cohort and case/control collections • Provide a sequence variation resource for future studies • www.uk10k.org
Acknowledgements • Jenn Asimit • Andrew Morris • Reedik Magi
Acknowledgements • A.G. Day-Williams, L. Southam, K. Panoutsopoulou, N.W. Rayner, T. Esko, K. Estrada, H.T. Helgadottir, A. Hofman, T. Ingvarsson, H. Jonsson, A. Keis, H.J.M. Kerkhof, G. Thorleifsson, N.K. Arden, A. Carr, K. Chapman, P. Deloukas, J. Loughlin, A. McCaskie, W.E.R. Ollier, S.H. Ralston, T.D. Spector, G.A. Wallis, J.M. Wilkinson, N. Aslam, F. Birell, I. Carluke, J. Joseph, A. Rai, M. Reed, K. Walker, S.A. Doherty, I. Jonsdottir, R.A. Maciewicz, K.R. Muir, A. Metspalu, F. Rivadeneira, K. Stefansson, U. Styrkarsodottir, A.G. Uitterlinden, J.B.J. van Meurs, W. Zhang, A.M. Valdes, M. Doherty, arcOGEN Consortium
500 Exomes Project • A partnership between the Wellcome Trust Sanger Institute, the CoLaus principal investigators and the Quantitative Sciences dept. of GlaxoSmithKline • GSK: Vincent Mooser, John Whittaker, Linda McCarthy, Matt Nelson, Claudio Verzilli, Judong Shen, Stephanie Chissoe, Charles Cox, Meg Ehm, Keith Nangle, Dana Fraser, Kijoung Song, Peter Woollard, Dawn Waterworth • Lausanne: Peter Vollenweider, Gerard Waeber, Jacques Beckmann, Sven Bergmann, Pedro Marques Vidal, Murielle Bochud, Zoltan Kutalik • Wellcome Trust Sanger Institute: Jennifer Asimit, Ines Barroso, Caren Brockington, Yuan Chen, Aaron Day-Williams, Richard Durbin, Martin Hunt, Sarah Hunt, Matt Hurles, Jimmy Liu, Margarida Lopes, Daniel MacArthur, Aarno Palotie, Theo Papamarkou, Fliss Payne, Manj Sandhu, Carol Scott, Lorraine Southam, Ioanna Tachmazidou, Chris Tyler-Smith, Ellie Wheeler, Bendik Winsvold, Yali Xue, Eleftheria Zeggini
Named collaborators: Phil Beales, University College London; Jamie Bentham, University of Oxford; Shoumo Bhattacharya, University of Oxford; Patrick Bolton, King's College London; Gerome Breen, King's College London; Krishnan Chatterjee, University of Cambridge; Laura K Curran, King's College London; Anne Farmer, King's College London; David Fitzpatrick, Edinburgh University; Daniel Geschwind, UCLA, USA; Steve Humphries, University College London; Jouko Lonnqvist, National Public Health Institute, Finland; Peter McGuffin, King's College London; Lucy Raymond, University of Cambridge; David Savage, University of Cambridge; Peter Scambler, University College London; Robert Semple, University of Cambridge; David St Clair, University of Aberdeen; Lennart von Wendt, University of Helsinki, Finland • Principal Applicants: Leena Peltonen, Wellcome Trust Sanger Institute; Richard Durbin, Wellcome Trust Sanger Institute • Co-applicants: Jeffrey Barrett, Wellcome Trust Sanger Institute; Ines Barroso, Wellcome Trust Sanger Institute; George Davey-Smith, University of Bristol; Ismaa Sadaf Farooqi, University of Cambridge; Matthew Hurles, Wellcome Trust Sanger Institute; Stephen O'Rahilly, University of Cambridge; Aarno Palotie, Wellcome Trust Sanger Institute; Nicole Soranzo, Wellcome Trust Sanger Institute; Tim Spector, King's College London; Eleftheria Zeggini, Wellcome Trust Sanger Institute
Supported by the Wellcome Trust, Arthritis Research UK, Pfizer