This presentation reviews lessons learned from human genome-wide association studies, including the successful identification of alleles associated with various diseases and the challenges of finding causal variants. It also discusses the importance of data sharing and the use of the animal model in quantitative trait association studies. It concludes with low density SNP chips: imputing high density genotypes from low density genotypes, and the accuracy of this process.
Perspectives from Human Studies and Low Density Chip Jeffrey R. O’Connell University of Maryland School of Medicine October 28, 2008
What Can We Learn from Human Studies? • 3 years of GWAS (genome-wide association studies) using high-density SNP panels have been successful in identifying alleles that contribute risk to diseases such as diabetes, age-related macular degeneration, Crohn's disease and cardiovascular events • Genetic variation in CAPON associated with Type 2 diabetes, QT heart interval and schizophrenia
Allelic Architecture • McCarthy et al., Genome-wide association studies for complex traits: consensus, uncertainty and challenges, Nat Rev Genet 2008
Lessons Learned • Allelic architecture • Alleles found to date do not account for majority of familial risk estimated from epidemiological studies • Finding causal variants a challenge • Sequencing cost to identify all variation in 50-100kb regions still prohibitive • Characterizing biologic mechanism through functional studies
Ingredients for Success - Technology • Human Genome Project • Genome sequence • HapMap • Catalog of common variation and haplotype structure in 4 target populations • High density fixed-content chips • 1M chips Illumina and Affymetrix (combined 1.6M SNPs) • 50K targeted panels • 1000 Genomes Project • Identify low frequency polymorphisms
Ingredients for Success - Data Sharing • Increased (forced?) cooperation across groups • Essential for replication • Meta-analyses to increase sample size and power • Public access to data • dbGaP (repository of GWA data) • Best minds have access to the data for analysis and methods development • Reports of new findings on public data from different methods
Results • Ten newly identified and two previously reported loci were strongly associated with variation in height • P values from 4 × 10^-7 to 8 × 10^-22 • Together the 12 loci account for < 2% of the population variation in height • Individuals with <= 8 height-increasing alleles and > 16 height-increasing alleles differ in height by < 3.5 cm • Sample sizes > 100,000 have identified over 60 height alleles
Lessons Learned • Sample sizes required to detect common variants with low effect sizes are large • Replication is essential to confirm findings • Initial results often not reproduced • Meta-analysis methods important to combine data across studies • SNP effects and ranking often change as sample sizes increase
Animal Model Quantitative Trait Association • $Y_i = m + \sum_j b_j c_{ij} + k g_i + a_i + e_i$ • $Y_i$ is the phenotype of the $i$th individual • $c_{ij}$ are covariates, $b_j$ is the covariate effect • $g_i$ is the genotype, $k$ is the genotype effect • $a_i$ is the additive polygenic effect • $e_i$ is the residual error
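As a rough illustration of the association model on this slide, the sketch below estimates the genotype effect k by ordinary least squares in the simplest possible case. It is a deliberate simplification: the covariates and the additive polygenic effect are left out, so it is not the full animal model, which would need a mixed-model solver. The function name `estimate_snp_effect` and the toy data are hypothetical.

```python
def estimate_snp_effect(phenotypes, genotypes):
    """Least-squares fit of y_i = m + k * g_i + e_i.

    Simplification (assumption): the covariates c_ij and the additive
    polygenic effect a_i of the slide's model are omitted.
    Genotypes are coded as allele counts 0, 1, 2.
    """
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n
    mean_g = sum(genotypes) / n
    sxy = sum((g - mean_g) * (y - mean_y)
              for g, y in zip(genotypes, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    k = sxy / sxx              # estimated genotype effect
    m = mean_y - k * mean_g    # estimated intercept
    return m, k


# Toy data, invented for illustration only.
genotypes = [0, 1, 2, 0, 1, 2]
phenotypes = [10.0, 10.5, 11.0, 10.0, 10.5, 11.0]
m, k = estimate_snp_effect(phenotypes, genotypes)
```

In a genome scan this fit would be repeated for every SNP, with the polygenic term restored to account for relatedness among animals.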
[Figure: two panels, Chr 29 LD Plot for 1000 OLD Animals and Chr 29 LD Plot for 1000 YNG Animals]
Low Density SNP Selection • Forward regression model building • Add SNP to model • Compare to model without SNP • If the model fit is better, keep the SNP • Final set depends on the order in which SNPs are added to the model • Genomic matrix • Relationship between animals based on genetic data rather than pedigree
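The forward regression steps on this slide can be sketched as follows. This is a minimal illustration, not the procedure used in the talk: it fits plain least-squares models (no polygenic term), and "better fit" is taken to mean an increase in R² above a threshold, which is an assumption. The names `solve`, `rss`, `forward_select` and `min_r2_gain` are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear SNPs?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def rss(y, columns):
    """Residual sum of squares of y regressed on an intercept plus columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bv * xv for bv, xv in zip(beta, row)) for row in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def forward_select(y, snps, min_r2_gain=0.01):
    """Visit SNPs in the given order; keep one if it raises R^2 enough."""
    total = rss(y, [])  # intercept-only model
    if total == 0:
        return []
    kept, current = [], total
    for name, column in snps:
        try:
            candidate = rss(y, [c for _, c in kept] + [column])
        except ValueError:
            continue  # collinear with SNPs already in the model
        if (current - candidate) / total > min_r2_gain:
            kept.append((name, column))
            current = candidate
    return [name for name, _ in kept]


# Toy data, invented for illustration: A and A_copy are in complete LD.
y = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
snp_a = [0, 0, 1, 1, 2, 2]
snp_a_copy = list(snp_a)
snp_d = [1, 0, 0, 1, 1, 0]
first = forward_select(y, [("A", snp_a), ("A_copy", snp_a_copy), ("D", snp_d)])
second = forward_select(y, [("A_copy", snp_a_copy), ("A", snp_a), ("D", snp_d)])
```

With this toy data the two visiting orders keep different SNPs, which illustrates the order dependence noted on the slide: whichever of two SNPs in complete LD is seen first enters the model and blocks the other.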
Animal Model and Genetic Prediction • $Y_{\text{predictee}} = m + W V^{-1}(Y - Xb)$ • $m$ is the contribution of SNP effects • $V^{-1}(Y - Xb)$ are the fitted residuals using the predictor set • $W = \mathrm{Cov}(\text{Predictee}, \text{Predictor})$ is the covariance matrix between predictee and predictor animals (A or G matrix) • Predictive Ability • Predictor set: 3570 proven bulls from 2003 • Predictee set: 1791 bulls from 2003 that have proofs in 2008 • Measure correlation of predicted with observed
[Figure: Net Merit, Predicted vs. Observed PTA, Genomic Matrix, R² = 0.32]
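The prediction step and the predictive-ability check on this slide can be sketched as below. It assumes the fitted residuals V⁻¹(Y − Xb) have already been computed on the predictor set and are passed in as a vector `z`; building V and solving the system is outside this sketch. The function names `predict` and `pearson` and all numbers are hypothetical toy values, not results from the talk.

```python
from math import sqrt


def predict(m, w, z):
    """Return m + W z for each predictee animal.

    m : SNP contribution, one value per predictee
    w : w[i][j] = Cov(predictee i, predictor j), from the A or G matrix
    z : fitted residuals V^-1 (Y - Xb), assumed precomputed
    """
    return [m_i + sum(w_ij * z_j for w_ij, z_j in zip(row, z))
            for m_i, row in zip(m, w)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy example: 3 predictee animals, 2 predictor animals (invented values).
m = [1.0, 2.0, 0.0]
w = [[0.5, 0.25],
     [0.25, 0.5],
     [0.0, 0.5]]
z = [2.0, 4.0]
predicted = predict(m, w, z)

observed = [2.5, 5.0, 2.0]   # stand-in for later proofs
r = pearson(predicted, observed)
r_squared = r ** 2
```

Squaring the correlation gives the kind of R² statistic reported for predicted versus observed values.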
Low Density to High Density • Use high density genotypes of ancestors to infer genotypes of offspring • Inferred genotypes used in genomic prediction for other phenotypes • 384 low density: 38,400 high density • 100 high density SNPs between two adjacent low density SNPs • Low density SNP every 10 Mb • Crossovers every 100 Mb
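The figures on this slide imply that recombination between neighbouring low density markers is rare, which is what makes imputation from ancestors plausible. A short worked calculation, assuming (this is not stated on the slide) that crossovers occur as a Poisson process along the chromosome; the constant names are hypothetical:

```python
from math import exp

LOW_DENSITY_SNPS = 384
HIGH_DENSITY_SNPS = 38_400
LOW_DENSITY_SPACING_MB = 10.0   # one low density SNP every 10 Mb
MB_PER_CROSSOVER = 100.0        # one crossover every 100 Mb

# High density SNPs per low density SNP
hd_per_ld = HIGH_DENSITY_SNPS // LOW_DENSITY_SNPS

# Expected crossovers per meiosis between adjacent low density SNPs
expected_crossovers = LOW_DENSITY_SPACING_MB / MB_PER_CROSSOVER

# Assumption: Poisson-distributed crossovers, so the chance that an
# interval is transmitted intact is exp(-expected_crossovers).
p_intact = exp(-expected_crossovers)
```

Under that assumption roughly nine in ten intervals pass from parent to offspring without a crossover, so the markers inside them can usually be copied from a single parental haplotype.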
Imputing Low Density High High 12 12 21 1 2 1 2 1 2 Low 11 ? ? 22 100 missing markers
Imputing Low Density High High 12 12 21 1 2 1 2 1 2 Low 11 12 22
Low Density to High Density • Accuracy of low density to high density depends on number and proximity of high density genotyped relatives • Current work will quantify the accuracy using the 15,000 Holstein samples with high density genotyping • Censor high density calls • Predict low density • Compare with observed data
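The censor, predict, compare loop described on this slide can be sketched with a toy example. This is an illustration under strong assumptions, not the method used in the talk: haplotypes are treated as already phased, a single parent is considered, and no crossover is modelled inside the segment. The function names `censor`, `impute` and `concordance` and the allele strings are hypothetical.

```python
def censor(haplotype, keep_positions):
    """Hide every marker except the low density positions."""
    keep = set(keep_positions)
    return [a if i in keep else None for i, a in enumerate(haplotype)]


def impute(parent_haplotypes, censored):
    """Copy the parental haplotype that best matches the observed markers."""
    def matches(hap):
        return sum(1 for h, c in zip(hap, censored)
                   if c is not None and h == c)
    best = max(parent_haplotypes, key=matches)
    return [c if c is not None else h for h, c in zip(best, censored)]


def concordance(imputed, truth, censored):
    """Fraction of hidden markers that were imputed correctly."""
    hidden = [i for i, c in enumerate(censored) if c is None]
    if not hidden:
        raise ValueError("no censored markers to score")
    correct = sum(1 for i in hidden if imputed[i] == truth[i])
    return correct / len(hidden)


# Toy data (invented): alleles coded 1/2, 8 high density markers.
parent_hap_a = [1, 1, 2, 1, 2, 2, 1, 2]
parent_hap_b = [2, 1, 1, 2, 2, 1, 1, 1]
offspring_truth = [1, 1, 2, 1, 2, 1, 1, 2]   # differs from hap_a at index 5

low_density_positions = [0, 7]               # the only markers "genotyped"
censored = censor(offspring_truth, low_density_positions)
imputed = impute([parent_hap_a, parent_hap_b], censored)
accuracy = concordance(imputed, offspring_truth, censored)
```

Here five of the six hidden markers are recovered; the miss at index 5 stands in for a crossover or genotyping error that the flanking low density markers cannot detect.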
Acknowledgements • University of Maryland: Brackie Mitchell, Toni Pollin, Alan Shuldiner • USDA AIPL / BFGL: Paul VanRaden, Tad Sonstegard, Curt Van Tassell, George Wiggans • Funding: NIH U01 HL084756, NRI 2007-32205-17883