Some current issues in QTL identification
Lon Cardon, Wellcome Trust Centre for Human Genetics, University of Oxford
Acknowledgements: Goncalo Abecasis, Stacey Cherny, Twin course faculty
Positional Cloning
[Diagram labels: Genetics; Genomics; Sib pairs; LOD; Chromosome Region; Association Study; Physical Mapping/Sequencing; Candidate Gene Selection/Polymorphism Detection; Mutation Characterization/Functional Annotation]
Inflammatory Bowel Disease Genome Screen Hampe et al., Am J Hum Genet, 64:808-816, 1999
Genome Screens for Linkage in Sib-pairs
1997/98:
• Diabetes (IDDM + NIDDM)
• Asthma/atopy
• Osteoporosis
• Obesity
• Multiple Sclerosis
• Rheumatoid arthritis
• Systemic lupus erythematosus
• Ankylosing spondylitis
• Epilepsy
• Inflammatory Bowel Disease
• Celiac Disease
• Psychiatric Disorders (incl. Scz, bipolar)
• Behavioral traits (incl. Personality, panic)
• others missed...
1999:
• NIDDM
• Asthma/atopy
• Psoriasis
• Inflammatory Bowel Disease
• Osteoporosis/Bone Mineral Density
• Obesity
• Epilepsy
• Thyroid disease
• Pre-eclampsia
• Blood pressure
• Psychiatric disorders (incl. Scz, bipolar)
• Behavioral traits (incl. smoking, alcoholism, autism)
• Familial combined hyperlipidemia
• Tourette syndrome
• Systemic lupus erythematosus
• others missed…
Human QTL Linkage Gene Identification Successes: 0 (well, at least < 5)
Why so few successes in human QTL mapping?
• Many valid reasons proposed:
  - Phenotypic complexity (not measured well)
  - Genetic complexity (many genes of small effect, GxE, epistasis)
  - Genotype error
  - Sampling design
  - Statistical methods
  - ….
Most linkage studies have been under-powered (and over-hyped)
QTL mapping has very low power!
Example setting: 1000 sibs, no parents; markers every 10 cM, each marker with heterozygosity H = 0.8; QTL h2 = 0.33.
Kruglyak L, Lander ES. (1995). Am J Hum Genet 57: 439-454
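To give a feel for how weak the sib-pair linkage signal is, here is a minimal simulation sketch of Haseman-Elston regression (squared trait difference regressed on the proportion of alleles shared IBD). This is not the analysis of Kruglyak & Lander. It assumes, for illustration only, a biallelic additive QTL, IBD known exactly at the QTL, no shared environment, and total trait variance 1. The function names and defaults are hypothetical.

```python
import random


def simulate_sib_pairs(n_pairs, qtl_h2=0.33, allele_freq=0.5, seed=1):
    """Simulate (IBD proportion, squared trait difference) for sib pairs.

    Assumptions (illustrative only): biallelic additive QTL, IBD observed
    without error at the QTL itself, independent normal residuals.
    """
    rng = random.Random(seed)
    # Additive effect a chosen so that the QTL variance 2*p*q*a^2 equals qtl_h2.
    a = (qtl_h2 / (2 * allele_freq * (1 - allele_freq))) ** 0.5
    resid_sd = (1 - qtl_h2) ** 0.5
    pairs = []
    for _ in range(n_pairs):
        # Each parent carries two alleles; True marks the trait-increasing allele.
        father = [rng.random() < allele_freq for _ in range(2)]
        mother = [rng.random() < allele_freq for _ in range(2)]
        traits, picks = [], []
        for _sib in range(2):
            i, j = rng.randrange(2), rng.randrange(2)  # transmitted chromosomes
            picks.append((i, j))
            dosage = father[i] + mother[j]
            traits.append(a * dosage + rng.gauss(0, resid_sd))
        # IBD count: parents from whom both sibs received the same chromosome.
        ibd = (picks[0][0] == picks[1][0]) + (picks[0][1] == picks[1][1])
        pairs.append((ibd / 2, (traits[0] - traits[1]) ** 2))
    return pairs


def ols_slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


if __name__ == "__main__":
    # The expected slope is -2 * QTL variance; small samples estimate it noisily.
    for n in (250, 1000, 20000):
        print(n, round(ols_slope(simulate_sib_pairs(n)), 3))
```

Under these assumptions the expected slope is -2 × 0.33 = -0.66, but the squared differences are so variable that the estimate is noisy at realistic sample sizes, even with perfect IBD information. Real marker data add further loss.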
Increasing power to detect linkage in sib-pairs
• Phenotypic selection
  - Carey & Williamson, 1991, AJHG
  - Eaves & Meyer, 1994, Behav Genet
  - Cardon & Fulker, 1994, AJHG
  - Risch & Zhang, 1996, AJHG
Information Score for Additive Gene Action (p=0.5)
[Figure: information score plotted against decile ranking of Sib 1 and decile ranking of Sib 2]
Linkage Analysis of QTLs - Summary -
• Spotted history. Few, if any, bona fide successes
• Power has been a large problem
• Of the few replicated loci, most have used some form of selection
• EDAC, other selection schemes from large cohorts now underway
• Genome-scans coming soon
• Promising beginning for QTL linkage mapping
Association Analysis
• Simple genetic basis
  - Short unit of resemblance
  - Population-specific
• One of the easiest genetic study designs
  - Correlate allele frequencies with traits/diseases
• At core of monogenic & oligo/polygenic trait models
• Widely used in past 20 years
  - HLA, candidate genes, pharmacogenetics, positional cloning
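As a concrete illustration of "correlate allele frequencies with traits/diseases", the sketch below runs a simple allelic chi-square test on a 2x2 table of allele counts in cases and controls. It is a generic textbook test, not a method taken from these slides. The function names and the counts in the example are hypothetical.

```python
import math


def allelic_chi_square(case_counts, control_counts):
    """Pearson chi-square (1 df) for a 2x2 table of allele counts.

    case_counts and control_counts are (count of allele A, count of allele a).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    row_case, row_control = a + b, c + d
    col_A, col_a = a + c, b + d
    if 0 in (row_case, row_control, col_A, col_a):
        raise ValueError("every row and column of the table must be non-empty")
    return n * (a * d - b * c) ** 2 / (row_case * row_control * col_A * col_a)


def chi_square_1df_pvalue(chi2):
    """Upper-tail p-value of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    # Hypothetical counts: allele A on 60 of 100 case chromosomes
    # and on 40 of 100 control chromosomes.
    chi2 = allelic_chi_square((60, 40), (40, 60))
    print(chi2, chi_square_1df_pvalue(chi2))
```

The test treats chromosomes as independent draws, so it assumes unrelated individuals and ignores population stratification. Family-based designs need other models.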
Angiotensin-1 Converting Enzyme
Keavney et al. (1998) Hum Mol Genet, 7:1745-1751
Evidence for Linkage
[Figure: ACE markers in positional order: T-5991C, A-5466C, T-3892C, A-240T, T-93C, T1237C, G2215A, I/D, G2350A, 4656(CT)3/2]
Results of ACE analysis using variance-components (VC) association model
[Figure: ACE markers in positional order: T-5991C, A-5466C, T-3892C, A-240T, T-93C, T1237C, G2215A, I/D, G2350A, 4656(CT)3/2]
Alzheimer's disease and ApoE4
Roses, Nature 2000
Association Resolution by Position Roses, Nature 2000
Toward a linkage disequilibrium map of the human genome
LD/haplotype map objective: find regions of high and low ancestral conservation to clarify signal/noise in allelic association studies
History of LD studies in humans:
• > 10 years ago, emphasis mainly on theory
  - LD measures, decay, population comparisons, …
• 1989: 1st use of LD for disease mapping: Cystic Fibrosis
• Recent years, gene-based haplotypes used widely for monogenic mapping
• Last 2 years: larger scale assessment of common alleles in reference populations
Haplotype Map: Data/Interpretations
• Distribution of pairwise LD
• ‘average extent of LD’
• LD differences in genes
Cited studies: Reich et al, Nature 2001; Eaves et al, Nat Genet 2000; Taillon-Miller et al, Nat Genet 2000; Stephens et al, Science 2001; Johnson et al, Nat Genet 2001; Abecasis et al, AJHG 2001
Haplotype Map: Data/Interpretations
Local patterns of LD … Conserved haplotype segments ... ‘Blocks’
• 5q31. Daly et al, Nat Genet 2001
• MHC class II. Jeffreys et al, Nat Genet 2001
• Chr21. Patil et al, Science 2001
Current Status: Data/Interpretations
• How to define ‘useful’ LD is still unclear
  - Easier to focus on pairwise LD rather than haplotypes. Is this efficient?
  - For common alleles, D’ measure, LD extends ~ 50-60 kb on average
  - For rare alleles, ?
• There is great variability in regional patterns of LD
  - Explanations, predictors yet unknown
• Haplotype blocks are detectable and present broadly
  - Size of blocks? How best to define them? Utility of htSNPs?
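Because the extent of LD depends on the measure used, it may help to see the standard pairwise measures side by side. The sketch below computes D, |D’| and r² for two biallelic loci from the four haplotype frequencies. The definitions are the standard ones. The function name and the example frequencies are hypothetical.

```python
def pairwise_ld(f_AB, f_Ab, f_aB, f_ab):
    """Return (D, |D'|, r^2) for two biallelic loci.

    Inputs are the frequencies of haplotypes AB, Ab, aB and ab (must sum to 1).
    """
    total = f_AB + f_Ab + f_aB + f_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = f_AB + f_Ab  # frequency of allele A at locus 1
    p_B = f_AB + f_aB  # frequency of allele B at locus 2
    if not (0 < p_A < 1 and 0 < p_B < 1):
        raise ValueError("both loci must be polymorphic")
    d = f_AB - p_A * p_B
    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, d_prime, r2


if __name__ == "__main__":
    # Only three of the four haplotypes are present: |D'| = 1, but r^2 is low.
    print(pairwise_ld(0.1, 0.0, 0.4, 0.5))
```

In the example, a rare allele sitting entirely on one background gives |D’| = 1 while r² is only about 0.11. This is one reason the answer to "how far does LD extend" changes with the measure.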
Human Genome Haplotype Map
• NIH/TSC/Wellcome Trust funded international collaboration (likely)
  - follow-on from human sequencing project & SNP consortium
• Hierarchical strategy
  - ‘sparse-map’ then more fine
• Initially use available SNPs
• Multiple populations
  - some family-based, most likely to be unrelateds
• Aim is to catalog regions of high LD down to very fine-scale (i.e., find big and small blocks)
Human Chromosome 22
• First human chromosome to be “fully” sequenced
• Extensive knowledge of genomic landscape
• Abundance of SNPs and other variants/bp
~34.5 Mb on q-arm; p-arm mostly structural RNA; 679 genes on q
Dunham et al, Nature, 1999
Samples
• 7 x 3 generation CEPH families
  - 77 Individuals
  - 59 founder chromosomes
  - 1505 SNPs successfully genotyped
• 90 Unrelated Caucasian Individuals
  - 1286 SNPs genotyped (1261 overlapping with CEPHs)
• 51 Unrelated Estonian Individuals
  - 908 SNPs genotyped (594 overlapping with CEPHs)
N = 1505 markers. Median spacing = 15.07 kb. 4 gaps > 200 kb. Smallest spacing = 12 bp; largest = 293 kb.
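Summary figures of this kind can be derived directly from a list of marker positions. The sketch below is one way to do so, assuming positions are given in base pairs on a single chromosome. The function name, the 200 kb default threshold and the toy positions are illustrative, not the study's data.

```python
from statistics import median


def spacing_summary(positions_bp, gap_threshold_bp=200_000):
    """Summarise inter-marker spacing for marker positions given in base pairs."""
    ordered = sorted(positions_bp)
    if len(ordered) < 2:
        raise ValueError("need at least two markers")
    spacings = [b - a for a, b in zip(ordered, ordered[1:])]
    return {
        "n_markers": len(ordered),
        "median_bp": median(spacings),
        "smallest_bp": min(spacings),
        "largest_bp": max(spacings),
        "n_large_gaps": sum(1 for s in spacings if s > gap_threshold_bp),
    }


if __name__ == "__main__":
    # Toy positions (hypothetical), not the chromosome 22 marker map.
    print(spacing_summary([0, 12, 15_012, 315_012]))
```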
Decay of LD on chromosome 22
Means in CEPHs, Unrelateds, Combined & Estonian Samples
Representing LD along a chromosome
• Following several trends in genetics, genotyping technology outpaced ability to analyze LD information…
• How to characterize regions of ‘interesting’ linkage disequilibrium?
  - Simply examine average levels across region/chromosome?
  - Fit models to data, look at expectations & specific predictions
  - Consider ‘interesting’ LD tracts as long runs of LD – borrow from extant statistical approaches
  - Look for ‘blocks’ of LD in the genome
LD Along Chromosome 22
[Figure panels: Average D’; D’ Half-Life; Disequilibrium Fingerprint]
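The slides do not say how the D’ half-life was computed. As one possible reading, the sketch below takes mean D’ values binned by inter-marker distance and returns the distance at which mean D’ first falls to half of its value in the shortest-distance bin, using linear interpolation between bins. The definition, the function name and the example numbers are all assumptions made for illustration.

```python
def half_length(distances_kb, mean_dprime):
    """Distance at which mean D' first drops to half its shortest-distance value.

    distances_kb: bin distances in increasing order.
    mean_dprime: mean D' in each bin.
    Returns None if the curve never falls that far.
    """
    if len(distances_kb) != len(mean_dprime) or len(distances_kb) < 2:
        raise ValueError("need two equal-length sequences with at least two bins")
    target = mean_dprime[0] / 2
    for i in range(1, len(distances_kb)):
        x0, x1 = distances_kb[i - 1], distances_kb[i]
        y0, y1 = mean_dprime[i - 1], mean_dprime[i]
        if y0 > target >= y1:
            # Linear interpolation between the two bins that bracket the target.
            return x0 + (y0 - target) * (x1 - x0) / (y0 - y1)
    return None


if __name__ == "__main__":
    # Hypothetical binned values, not the chromosome 22 results.
    print(half_length([10, 30, 70], [0.8, 0.6, 0.2]))
```

A model-based alternative would fit a decay curve to the pairwise values. The interpolation version is shown only because it needs no distributional assumptions.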
Chromosome 22 Haplotype Blocks
Plus 3 individual blocks:
Position (Mb)   SNPs   Haplotypes   Length (kb)
4.6-4.8         11     6            231
8.2-8.4         8      4            264
34.3            11     3            82
Recombination Pattern on Chromosome 22
[Figure: genetic distance (cM, 0-60) against sequence position (Mb, 0-35). Legend: microsatellite distance; 1 Mb/cM reference]
Recombination and Gene Density on Chromosome 22
[Figure legend: microsatellite distance; 1 Mb/cM reference; gene density]
Linkage Disequilibrium Map of Chromosome 22 - Summary -
• LD ‘half-length’ ~ 50 kb, but depends on the measure and on what counts as “useful” LD
• Family & unrelated samples yield consistent patterns
• Different analytical tools provide complementary views of long blocks
• 15% of chromosome 22 is in long LD blocks in these samples (40% in shorter blocks)
  - Why? Selection, selective sweeps? Chromosome structure? Population age?
• LD correlated with gene density, GC content and related repeats
  - Gene/GC correlations almost entirely collinear with genetic distance
• LD patterns can immediately assist positional association studies:
  - Prioritise candidate regions
  - Use extant genetic maps and simple repeat structures in design & power
Mapping QTLs in families: Summary
• Linkage and association studies follow directly from fundamental biometrical principles.
• Linkage studies of complex traits can work: all principles of this course apply
  - power, study design, careful phenotype selection/modelling, comparison of statistical models
• New information about LD patterns should facilitate association studies
  - help form a priori hypotheses and guide replication.
16th Annual Course on Methodology for Twins and Families. Advanced workshop: Boulder, Colorado, March 2003