Computational Challenges in Whole-Genome Association Studies

Ion Mandoiu, Computer Science and Engineering Department, University of Connecticut


Presentation Transcript


  1. Computational Challenges in Whole-Genome Association Studies • Ion Mandoiu, Computer Science and Engineering Department, University of Connecticut

  2. Approaches to Disease Gene Mapping • Association analysis (cases vs. controls) • χ²-test • Genome-wide scans made possible by recent progress in SNP genotyping technologies • Linkage analysis • LOD := log10(L(θ)/L(1/2)) • Very successful for Mendelian diseases (cystic fibrosis, Huntington’s, …) • Low power to detect genes with small relative risk in complex diseases [RischMerikangas’96]
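
The two statistics named on this slide can be sketched in a few lines. The block below is a minimal illustration, not code from the talk: it assumes an allelic 2x2 table (cases and controls by allele counts) for the χ²-test, and phase-known meioses with a simple count of recombinants for the LOD score. The function names and the example counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 d.f.) for the table [[a, b], [c, d]].

    Assumed layout: rows are cases and controls, columns are the counts
    of the two alleles at one SNP.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column must have a nonzero total")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def chi_square_p_value_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # A chi-square variable with 1 d.f. is the square of a standard normal.
    return math.erfc(math.sqrt(stat / 2.0))


def lod_score(recombinants, meioses, theta):
    """LOD(theta) = log10(L(theta) / L(1/2)).

    Assumes phase-known meioses, so that
    L(theta) = theta**recombinants * (1 - theta)**(meioses - recombinants).
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    r, n = recombinants, meioses
    log_l_theta = r * math.log10(theta) + (n - r) * math.log10(1.0 - theta)
    log_l_half = n * math.log10(0.5)
    return log_l_theta - log_l_half


if __name__ == "__main__":
    # Hypothetical allele counts: cases 30/10, controls 10/30.
    stat = chi_square_2x2(30, 10, 10, 30)
    print(stat, chi_square_p_value_1df(stat))
    # Hypothetical pedigree data: 1 recombinant in 10 meioses.
    print(lod_score(1, 10, 0.1))
```

In a genome-wide scan the χ² statistic would be computed once per SNP, so the resulting p-values need a multiple-testing correction before they are interpreted.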

  3. Computational Challenges • Detecting genotyping errors • Imputation of missing genotypes • Imputation of untyped genotypes based on a reference population (e.g., HapMap) • Haplotype inference and haplotype-based association tests • Modeling gene-gene interactions • Handling structural variation data provided by new sequencing technologies • Optimal multi-stage study design

  4. Genotype Error Detection • A real problem despite advances in technology • In [KMP07] we proposed efficient methods for error detection in trio data based on an LLR approach combined with an HMM of haplotype diversity • In ongoing work we seek to improve error detection accuracy by using low-level data such as typing confidence scores
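
For orientation, the simplest form of error detection in trio data is a Mendelian consistency check, sketched below. This is a baseline, not the LLR/HMM method of [KMP07]: it only flags genotypes that cannot be explained by any transmission from the parents, and it misses errors that remain Mendelian-consistent. The genotype coding (0, 1, or 2 copies of one allele, `None` for missing) and all function names are assumptions made for the example.

```python
def transmittable(genotype):
    """Alleles (0 or 1) that a parent with this genotype can pass on.

    Genotypes are assumed to be coded as the number of copies of one
    designated allele (0, 1, or 2); None marks a missing genotype, which
    is treated as compatible with either allele.
    """
    if genotype is None:
        return {0, 1}
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def is_mendelian_consistent(mother, father, child):
    """True if the child's genotype can arise from the parents' genotypes."""
    if child is None:
        return True
    return any(
        m + f == child
        for m in transmittable(mother)
        for f in transmittable(father)
    )


def flag_trio_errors(mothers, fathers, children):
    """Indices of SNPs at which one trio shows a Mendelian inconsistency."""
    return [
        i
        for i, (m, f, c) in enumerate(zip(mothers, fathers, children))
        if not is_mendelian_consistent(m, f, c)
    ]


if __name__ == "__main__":
    # Hypothetical genotypes for one trio at three SNPs.
    print(flag_trio_errors([0, 1, 2], [0, 1, 2], [1, 2, 1]))
```

A flagged SNP says only that at least one of the three genotypes is wrong; deciding which one requires additional information, such as the likelihood-based approach described on the slide.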

  5. Genotype Imputation • Current genotyping platforms cover fewer than 1 million of the ~10 million SNPs, so the causal variant is unlikely to be assayed directly • Untyped SNPs can be imputed based on linkage disequilibrium information inferred from high-density datasets such as HapMap • Maximum likelihood approach, with probabilities computed using an HMM • Figure panels: allele frequency, typed genotypes; allele frequency, imputed genotypes
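
The idea of imputing an untyped SNP from linkage disequilibrium in a reference panel can be illustrated with a toy example. The sketch below is not the HMM-based method from the talk: it substitutes empirical haplotype frequencies from a small phased reference panel for the HMM, assumes random mating, and enumerates all haplotype pairs, which is only feasible for a handful of SNPs. The function name, data layout, and example panel are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement


def impute_genotype(reference_haplotypes, typed_genotype, target_index):
    """Posterior over the genotype (0, 1, 2) at an untyped SNP.

    reference_haplotypes: phased haplotypes as tuples of 0/1 alleles
        covering both the typed SNPs and the target SNP.
    typed_genotype: dict mapping SNP index -> observed genotype (0, 1, 2).
    target_index: index of the untyped SNP to impute.
    """
    freq = Counter(reference_haplotypes)
    total = sum(freq.values())
    posterior = {0: 0.0, 1: 0.0, 2: 0.0}
    for h1, h2 in combinations_with_replacement(list(freq), 2):
        # Keep only haplotype pairs that explain every typed genotype.
        if any(h1[i] + h2[i] != g for i, g in typed_genotype.items()):
            continue
        p = (freq[h1] / total) * (freq[h2] / total)
        if h1 != h2:
            p *= 2  # an unordered pair of distinct haplotypes has two orderings
        posterior[h1[target_index] + h2[target_index]] += p
    norm = sum(posterior.values())
    if norm == 0:
        raise ValueError("typed genotype matches no pair of reference haplotypes")
    return {g: p / norm for g, p in posterior.items()}


if __name__ == "__main__":
    # Hypothetical panel: SNPs 0 and 2 are typed, SNP 1 is untyped.
    panel = [(0, 0, 0)] * 6 + [(1, 1, 1)] * 4
    post = impute_genotype(panel, {0: 1, 2: 1}, target_index=1)
    print(max(post, key=post.get), post)
```

Taking the genotype with the highest posterior gives the maximum likelihood call under these assumptions; an HMM replaces the explicit enumeration so that the same computation scales to long haplotypes.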

  6. Acknowledgements & Advertisement • Justin Kennedy, Bogdan Pasaniuc • NSF funding (Awards 0546457 and 0543365) • DIMACS Workshop on Computational Issues in Genetic Epidemiology, August 21-22, 2008, DIMACS Center, CoRE Building, Rutgers University • Presented under the auspices of the DIMACS/BioMaPS/MB Center Special Focus on Information Processing in Biology • Organizers: Andrew Scott Allen (Duke University), Ion Mandoiu (University of Connecticut), Dan Nicolae (University of Chicago), Yi Pan (Georgia State University), Alex Zelikovsky (Georgia State University)