1 / 17

Extracting Phenotypes from EMRs for genetic association studies: the eMERGE consortium

Topics. What is eMERGE?Lessons learned in extracting phenotypes from EMR dataTools available to other institutions engaged in genome-phenome correlationThe future of eMERGE. RFA HG-07-005: Genome-Wide Studies in Biorepositories with Electronic Medical Record Data . 2007 NIH Request for Applicati

susan
Download Presentation

Extracting Phenotypes from EMRs for genetic association studies: the eMERGE consortium

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


    1. Extracting Phenotypes from EMRs for genetic association studies: the eMERGE consortium Dan Masys, MD Principal Investigator electronic Medical Records and Genomics (eMERGE) Coordination Center Professor and Chair Department of Biomedical Informatics Professor of Medicine Vanderbilt University School of Medicine

    2. Topics What is eMERGE? Lessons learned in extracting phenotypes from EMR data Tools available to other institutions engaged in genome-phenome correlation The future of eMERGE

    3. RFA HG-07-005: Genome-Wide Studies in Biorepositories with Electronic Medical Record Data 2007 NIH Request for Applications from the National Human Genome Research Institute “The purpose of this funding opportunity is to provide support for investigative groups affiliated with existing biorepositories to develop necessary methods and procedures for, and then to perform, if feasible, genome-wide studies in participants with phenotypes and environmental exposures derived from electronic medical records, with the aim of widespread sharing of the resulting individual genotype-phenotype data to accelerate the discovery of genes related to complex diseases.”

    4. eMERGE: an NIH-funded consortium (of CTSA awardee institutions) with DNA Biobanks linked to EMR data Consortium members Group Health of Puget Sound Marshfield Clinic Mayo Clinic Northwestern University Vanderbilt University Principal sponsor: NHGRI with additional funding from NIGMS Condition of NIH funding: contribute genomic and EMR-derived phenotype data to dbGAP database at NCBI

    6. Goal: to assess utility of electronic medical records as resources for genome science Each site includes DNA linked to electronic medical records Each project includes community engagement, genome science, natural language processing Initial proposals included identifying a phenotype of interest in 3,000 subjects and conduct of a genome-wide association study at each center: S=18,000 Supplemental funding subsequently provided for cross-network phenotypes (now up to 14 done or in progress)

    7. eMERGE Sample Numbers: Primary phenotypes

    9. Supplement phenotypes using genotyped samples from primary phenotypes*

    10. Lessons Learned from the eMERGE consortium to date

    11. eMERGE Lessons Learned Clinically derived phenotypes are a valid source for replicating genome-phenome associations found in research cohorts, and a discovery resource for new associations (>25 manuscripts published or in process) Strength in numbers. Many EMR derived phenotypes have allelic associations that only achieve genome-wide significance when all data sets across the network are pooled.

    12. eMERGE Lessons Learned, cont’d High quality EMR-derived phenotypes require four elements: codes, labs, meds, and Natural Language Processing capability Cohort identification logic, once validated to have a high positive predictive value (PPV), at one site, can be transported to other very heterogeneous EMR systems with little degradation of PPV

    13. Informatics Issues in multi-institution data pooling Determination of comparability of patient populations across institutions Data exchange standards for phenotype data Data harmonization of variables across heterogeneous EMRs

    15. Informatics Issues in multi-institution data pooling Determination of comparability of patient populations across institutions Data exchange standards for phenotype data Data harmonization of variables across heterogeneous EMRs Re-identification potential of clinical data and associated samples: maximizing scientific value while complying with federal privacy protection policies

    17. eMERGE Lessons Learned, cont’d More = merrier eMERGE Phase II RFA is available, and expansion of the consortium is expected Due date November 17, 2010

More Related