170 likes | 422 Views
Topics. What is eMERGE?Lessons learned in extracting phenotypes from EMR dataTools available to other institutions engaged in genome-phenome correlationThe future of eMERGE. RFA HG-07-005: Genome-Wide Studies in Biorepositories with Electronic Medical Record Data . 2007 NIH Request for Applicati
E N D
1. Extracting Phenotypes from EMRs for genetic association studies: the eMERGE consortium
Dan Masys, MD
Principal Investigator
electronic Medical Records and Genomics (eMERGE) Coordination Center
Professor and Chair
Department of Biomedical Informatics
Professor of Medicine
Vanderbilt University School of Medicine
2. Topics What is eMERGE?
Lessons learned in extracting phenotypes from EMR data
Tools available to other institutions engaged in genome-phenome correlation
The future of eMERGE
3. RFA HG-07-005:Genome-Wide Studies in Biorepositories with Electronic Medical Record Data 2007 NIH Request for Applications from the National Human Genome Research Institute
“The purpose of this funding opportunity is to provide support for investigative groups affiliated with existing biorepositories to develop necessary methods and procedures for, and then to perform, if feasible, genome-wide studies in participants with phenotypes and environmental exposures derived from electronic medical records, with the aim of widespread sharing of the resulting individual genotype-phenotype data to accelerate the discovery of genes related to complex diseases.”
4. eMERGE: an NIH-funded consortium (of CTSA awardee institutions) with DNA Biobanks linked to EMR data Consortium members
Group Health of Puget Sound
Marshfield Clinic
Mayo Clinic
Northwestern University
Vanderbilt University
Principal sponsor: NHGRI with additional funding from NIGMS
Condition of NIH funding: contribute genomic and EMR-derived phenotype data to dbGAP database at NCBI
6. Goal: to assess utility of electronic medical records as resources for genome science
Each site includes DNA linked to electronic medical records
Each project includes community engagement, genome science, natural language processing
Initial proposals included identifying a phenotype of interest in 3,000 subjects and conduct of a genome-wide association study at each center: S=18,000
Supplemental funding subsequently provided for cross-network phenotypes (now up to 14 done or in progress)
7. eMERGE Sample Numbers:Primary phenotypes
9. Supplement phenotypes using genotyped samples from primary phenotypes*
10. Lessons Learned from the eMERGE consortium to date
11. eMERGE Lessons Learned Clinically derived phenotypes are a valid source for replicating genome-phenome associations found in research cohorts, and a discovery resource for new associations (>25 manuscripts published or in process)
Strength in numbers. Many EMR derived phenotypes have allelic associations that only achieve genome-wide significance when all data sets across the network are pooled.
12. eMERGE Lessons Learned, cont’d
High quality EMR-derived phenotypes require four elements: codes, labs, meds, and Natural Language Processing capability
Cohort identification logic, once validated to have a high positive predictive value (PPV), at one site, can be transported to other very heterogeneous EMR systems with little degradation of PPV
13. Informatics Issues in multi-institution data pooling Determination of comparability of patient populations across institutions
Data exchange standards for phenotype data
Data harmonization of variables across heterogeneous EMRs
15. Informatics Issues in multi-institution data pooling Determination of comparability of patient populations across institutions
Data exchange standards for phenotype data
Data harmonization of variables across heterogeneous EMRs
Re-identification potential of clinical data and associated samples: maximizing scientific value while complying with federal privacy protection policies
17. eMERGE Lessons Learned, cont’d
More = merriereMERGE Phase II RFA is available, and expansion of the consortium is expectedDue date November 17, 2010