The CardioVascular Research Grid (CVRG): A National Infrastructure for Representing, Sharing, Analyzing, and Modeling Cardiovascular Data. Stephen J. Granite, MS Director of Database/Software Development The Johns Hopkins University Center for Cardiovascular Bioinformatics and Modeling.
Why Is There A Need For The CVRG?
• The challenge of how best to represent CV data
    • Emerging data representation standards are seldom used
    • No standards for representing and no culture of sharing electrophysiological data
• The challenge of sharing data
    • National initiatives in CV genetics, genomics and proteomics are underway but there is no direct, easy way to discover data
    • Facilitate data discovery
• The challenge of how best to develop and deploy “hardened” data analysis workflows
• The challenge of discovering new knowledge from the CV data itself
Grids And E-Science
• Grids [1]
    • Interconnected networks of computers and storage systems
    • Running common software
    • Enabling resource sharing and problem solving in multi-institutional environments
• E-Science
    • Computationally intensive science carried out on grids
    • Science with immense data sets that require grid computing
• Two major “bio-grids” are active today
    • The Biomedical Informatics Research Network (BIRN; http://www.nbirn.net/)
    • The Cancer Biomedical Informatics Grid (caBIG; http://cabig.nci.nih.gov/)
[1] I. Foster and C. Kesselman (2004). The Grid: Blueprint for a New Computing Infrastructure. Elsevier.
The Biomedical Informatics Research Network (BIRN)
Principal Investigator: Mark Ellisman, UCSD
• Grid infrastructure for sharing, analyzing and visualizing brain imaging data sets
• 32 participating research sites, > 400 investigators
• 4 “driving biological projects”
• BIRN is a “bottom up” effort (scientific applications drive technology)
[Figure labels: BWH, Image Segmentation; End User, Shape Visualization; UCLA, Image Acquisition; JHU, Shape Analysis]
The Cancer Biomedical Informatics Grid (caBIG)
caGrid Lead Development Team: Joel Saltz, OSU
• NCI intramural research effort, with selected external collaborators
• Develop open-source software
• Enables cancer researchers to become a caBIG node
• Share data with the cancer research community
• Develop controlled vocabularies for describing cancer phenotypes and multi-scale data
• Develop grid analytic services for analyzing cancer data sets
The CVRG Driving Biological Project (DBP)
The D. W. Reynolds Cardiovascular Clinical Research Center (PI: E. Marban)
• Center studies the cause and treatment of Sudden Cardiac Death (SCD) in the setting of heart failure (HF)
• HF is the primary U.S. hospital discharge diagnosis
    • Incidence of ~ 400,000 per year, prevalence of ~ 4.5 million
    • Prevalence increasing as population ages
    • Leading cause of SCD (30-50% of deaths are sudden)
    • Medical expenditures ~ $20 billion per year
• Manifestation of HF occurs at multiple biological levels
    • Genetic Predisposition via Single/Multi-Gene Mutations
    • Modified Gene/Protein Expression
    • Electrophysiological Remodeling and Altered Cellular Function
    • Heart Shape and Motion Changes
    • Reduced cardiac output, mechanical pump failure
The CVRG Driving Biological Project (DBP)
The D. W. Reynolds Cardiovascular Clinical Research Center (PI: E. Marban)
• Large patient cohort (~ 1,200) at high risk for SCD
• All have received ICD placement to prevent SCD
• Collecting multi-scale data for all these patients
• Patients with ICD firings are defined as high risk for SCD; patients without firings as low risk
• Within the first year, only 5% of the implanted ICDs have actually fired
• Challenge – discover multi-scale biomarkers that are predictive of which patients should receive ICDs
[Figure labels: Multi-Scale Data: Gene Expression Profiling; Genetic Variability (SNPs); Electrophysiological Data; Protein Expression Profiling; Multi-Modal Imaging]
The CVRG Project
• R24 NHLBI Resource, start date 3/1/07
• 3 development teams
    • Winslow, Geman, Miller, Naiman, Ratnanather, Younes (JHU)
    • Saltz, Kurc (OSU)
    • Ellisman, Grethe (UCSD)
• Aims
    • Develop tools for representing, managing and sharing multi-scale data
        • SNP, genomic and proteomic data (Project 1)
        • Electrophysiological data (Projects 1 & 2)
        • Heart Shape and Motion (Cardiac Computational Anatomy) data (Projects 1, 3 & 4)
    • Use multi-scale data to discover biomarkers that predict need for ICD placement (Project 5)
Project 1: The CVRG Core Infrastructure
• Develop and deploy CVRG-Core middleware
• Reuse components and assure interoperability with BIRN and caBIG
• Open-source software stack that instantiates a CVRG node
[Diagram labels: The CVRG-Core (Projects 1-5); BioMANAGE (Project 1); BioPORTAL (Project 1); BioINTEGRATE (Project 1); Data Services; Multiple Analytical Methods (Projects 2, 3, 4 & 5); SNP (Project 1); Gene Expression (Project 1); Protein Expression (Project 1); EP Data (Project 2); Imaging (Project 1); Patient/Study (Project 1); CVRG Data Services; CVRG Analytic Services]
Project 2: Electrophysiological (EP) Data Management And Dissemination
• Goal
    • Adopt/develop data models to represent cardiovascular EP data
    • Create databases for managing and sharing these data
[Diagram labels: ECG / EP data / ONTOLOGIES / ECG & EP Data Analysis Portal / XML / DATABASES]
Project 3: Mathematical Characterization Of Cardiac Ventricular Anatomic Shape And Motion
• Goal
    • Develop methods for statistically characterizing variability of heart shape and motion in health and disease
    • Use these methods to discover shape and motion biomarkers for CV disease
• Methods
    • Measure heart shape and motion over time in the Reynolds population using multi-modal imaging (MR, multi-detector CT and Gd+ contrast-enhanced MR)
    • Model variation of heart shape and motion in both the low-risk and high-risk Reynolds patients
    • Discover shape and motion parameters that predict who should receive ICD placement
Cardiac Computational Anatomy And Shape Analysis
Large Deformation Diffeomorphic Metric Mapping [2]
[Figure labels: Template (smoothed); Targets (Normal Training Set); Targets (Diseased Training Set); ?]
[2] Beg et al (2004). Mag. Res. Med. 52: 1167
Cardiac Computational Anatomy And Shape Analysis
Large Deformation Diffeomorphic Metric Mapping [2]
[Figure labels: Template (smoothed); Targets (Normal Training Set); Targets (Diseased Training Set); Diseased Heart]
[2] Beg et al (2004). Mag. Res. Med. 52: 1167
Project 4: Grid-Tools for Cardiac Computational Anatomy
[Workflow diagram with steps numbered 1-6. Labels: Landmarking, Affine & LDDMM; Shape Analysis; Statistical Analysis; Visualization; Segmentation; De-identification and upload; BioMANAGE; Supercomputing; TeraGrid]
Project 5: Statistical Learning With Multi-Scale Cardiovascular Data
• Goal – predict risk of SCD and identify patients to receive ICDs
• Develop learning methods that work in the “small sample regime”
• Deploy these multi-scale biomarker discovery tools on the CVRG Portal
[Figure labels: Patient A; Patient B; Algorithms [3-6]; Patient A: SCD HIGH RISK; Patient B: SCD LOW RISK]
[3] Geman et al (2004). Stat. Appl. Genet. Mol. Biol. 3(1): Article 19.
[4] Xu et al (2005). Bioinformatics 21(20): 3905-3911
[5] Anderson et al (2007). Proteomics 7(8): 1197
[6] Price et al (2007). PNAS 104(9): 3414
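The Project 5 slide does not say which learning methods are meant. As one illustration of a rule that can be fit from very few samples, the sketch below implements a simple pairwise-comparison ("top-scoring pair" style) classifier: it picks the two features whose relative ordering differs most between the two classes and predicts from that ordering alone. Treating this as representative of the cited algorithms is an assumption. The function names, the toy feature values, and the "high"/"low" labels are all hypothetical and are not CVRG data or CVRG code.

```python
from itertools import combinations


def fit_top_scoring_pair(samples, labels):
    """Find the feature pair (i, j) whose ordering best separates two classes.

    samples: list of equal-length lists of numbers (one list per patient).
    labels:  list of class labels, exactly two distinct values.
    Returns ((i, j, label_if_less, label_otherwise), score).
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("exactly two classes are required")
    a, b = classes
    best_rule, best_score = None, -1.0
    for i, j in combinations(range(len(samples[0])), 2):
        # Fraction of samples in each class where feature i < feature j.
        freq = {}
        for c in classes:
            rows = [s for s, y in zip(samples, labels) if y == c]
            freq[c] = sum(1 for s in rows if s[i] < s[j]) / len(rows)
        score = abs(freq[a] - freq[b])
        if score > best_score:
            best_score = score
            # Predict the class in which "i < j" is more frequent.
            if freq[a] > freq[b]:
                best_rule = (i, j, a, b)
            else:
                best_rule = (i, j, b, a)
    return best_rule, best_score


def predict(rule, sample):
    """Apply a fitted pair rule to one sample."""
    i, j, label_if_less, label_otherwise = rule
    return label_if_less if sample[i] < sample[j] else label_otherwise


# Hypothetical toy data: three features per patient, three patients per class.
toy_samples = [
    [1.0, 5.0, 3.0], [2.0, 6.0, 2.5], [0.5, 4.0, 9.0],   # labeled "high"
    [5.0, 1.0, 3.0], [7.0, 2.0, 8.0], [4.0, 3.0, 1.0],   # labeled "low"
]
toy_labels = ["high", "high", "high", "low", "low", "low"]

rule, score = fit_top_scoring_pair(toy_samples, toy_labels)
```

Because the rule depends only on which of two measurements is larger, it has very few parameters to estimate, which is one reason comparison-based rules are often considered when samples are scarce. A real study would also need cross-validation and a tie-breaking policy, both omitted here.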
Project 6: Resource Management
• Establish CVRG Working Groups to create a mechanism for community input on design and function of CVRG-Core and the CVRG CV ontologies/data models
    • Testbed Projects (HLB-STAT)
    • New Technologies
    • Data Sharing/IRB
• Undertake outreach efforts to inform, train, and support researchers in use of CVRG tools and resources
Acknowledgements
The CVRG Development Team
• Ohio State University: Shannon Hastings, Tahsin Kurc, Stephen Langella, Scott Oster, Tony Pan, Justin Permar, Joel Saltz
• UCSD: Mark Ellisman, Jeff Grethe, Ramil Manasala
• Johns Hopkins University: Siamak Ardekani, Donald Geman, Stephen Granite, Joe Henessy, David Hopkins, Anthony Kolasny, Aaron Lucas, Michael Miller, Daniel Naiman, Tilak Ratnanather, Kyle Reynolds, Aik Tan, Rai Winslow, Gem Yang
NHLBI (R24 HL085343)