The Evolution of HHS Ignite
HHS Ignite is an “incubator for new ideas” run out of the HHS IDEA Lab.
Timeline milestones: NIAID SEB Innovation Challenge; HHS Ignite Innovation Program; Expansion
Timeline dates: Jul ‘14, Aug ‘14, Sep ‘14, Jan ‘14, Nov ‘14
Responding to an informal request for innovation ideas from the NIH’s National Institute of Allergy and Infectious Diseases (NIAID), a small Deloitte team submitted a written proposal. At the client’s suggestion, the proposal was submitted and the team was selected to compete in the 2014 HHS IDEA Lab’s Ignite innovation tournament. During the 3-month pilot, the Deloitte team engaged 10 NIAID customers and created a functional proof-of-concept solution for 2 intramural scientists. The Deloitte/NIAID team successfully presented its pitch to a panel of relevant federal executives and the HHS CTO during the concluding Shark Tank on September 30, 2014.
• Phase 1: HHS Ignite boot camp
• Phase 2: NIH Interviews & Pilot Development
• Phase 3: HHS Ignite Shark Tank
Vincent Munster, PhD
Peter Jahrling, PhD
NIH SEMOSS team with HHS Deputy Secretary Bill Corr & HHS CTO Bryan Sivak

SEMOSS Evolution
Solution History
SEMOSS is a result of several years of federal investment in federated, semantic web technology. In 2010, the Deputy Chief Management Officer (DCMO) of the Department of Defense began experimenting with Semantic Web technology. The Military Health System (MHS), with help from Deloitte, created a graph-based toolset for multi-dimensional analysis of disparate data sources to determine investment sequencing for the MHS IT portfolio. After MHS presented its solution to DCMO, a joint investment was made to fund a similar tool utilizing the Semantic Web. The guiding principles of the tool were that it must be standards based; allow integrating data from multiple sources; and adopt visualizations and analytics on an as-needed basis. This investment spawned SEMOSS.
Solution Evolution
2010-2011: Excel / Tableau
• Data Sources: 1-2
• Stakeholders: 1
• Analysis: excessive time spent on data preparation; analysis and visualization constraints
• Visualizations: minimal; repetitive
2011-2012: Neo4J
• Data Sources: 3-4
• Stakeholders: 2-3
• Analysis: limited to single database; answers modeled as graph traversals; demonstrations
• Visualizations: single dimensional; difficult to customize
2012-2013: SEMOSS
• Data Sources: no limit
• Stakeholders: no limit
• Analysis: integrated knowledge analytics environment; transitive across databases; collaboration; answers modeled as reports
• Visualizations: multi-dimensional; self-service knowledge exploration
Issues across stages: focus on visualization; not malleable; proprietary; long cycle times; none, as product created to meet client needs

What Does Federated Analytics Mean?
Perform Analysis, Discover Insights, Federate Data, Visualize Decisions, Share Knowledge
Data
• Elastic data integration with more than 6 connectors, including Excel/CSV, NLP, RDBMS, and cloud-aware data sources
• Context-aware data that can link across databases
• W3C Standards: RDF, SPARQL
Viz.
• Rich library of visualizations: Parallel Coordinates, Excel-style charting, Network Viz., Heat-maps
• Extensibility to adopt any visualization
• Overlay visualizations to see overlaps
Analytics
• Graph algorithms
• Optimization: linear and non-linear algorithms
• Statistical algorithms
• Equation solving

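To make "context-aware data that can link across databases" concrete, here is a minimal sketch in plain Python. It is not SEMOSS code and does not use RDF or SPARQL libraries; it only imitates the subject-predicate-object shape of RDF triples. The two stores, the predicate names, and every identifier (`gene:G1`, `disease:D1`, `pub:P1`, and so on) are placeholders invented for illustration.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), RDF-like shape

# Hypothetical source 1: gene-to-disease associations (placeholder identifiers).
GENE_DISEASE_DB: Set[Triple] = {
    ("gene:G1", "associatedWith", "disease:D1"),
    ("gene:G2", "associatedWith", "disease:D2"),
}

# Hypothetical source 2: publications and the genes they mention.
PUBLICATION_DB: Set[Triple] = {
    ("pub:P1", "mentions", "gene:G1"),
    ("pub:P2", "mentions", "gene:G1"),
    ("pub:P3", "mentions", "gene:G2"),
}


def match(store: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for ts, tp, to in store:
        if (s is None or ts == s) and (p is None or tp == p) and (o is None or to == o):
            yield (ts, tp, to)


def publications_for_disease(disease: str) -> list:
    """Answer one question across both stores without merging them.

    The shared gene identifier is the link between the two sources.
    """
    genes = {s for s, _, _ in match(GENE_DISEASE_DB, p="associatedWith", o=disease)}
    pubs = set()
    for gene in genes:
        pubs.update(s for s, _, _ in match(PUBLICATION_DB, p="mentions", o=gene))
    return sorted(pubs)


if __name__ == "__main__":
    print(publications_for_disease("disease:D1"))
```

The point of the sketch is that neither store is copied into the other: the query walks from one source to the next through an identifier both sources agree on.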
Diverse Researchers across HHS
• Strategic Planning
• Public Health
• Bioinformatics
• Science Research
• Dawei Lin, PhD, NIH: Computer Modeling
• Vincent Munster, PhD, NIH: Infectious Diseases
• Susanna Visser, DrPH, CDC: ADHD
• Marie Parker, NIH: Research Initiatives

Common Research Goals
• Data Access
• Robust Analysis
• Collaboration

Technology Barriers
• Big Data
• Multiple Sources
• Isolated Analysis
• Inaccessibility
• Integration Challenges
• Collaboration Barriers

Dr. Munster’s Research
Middle East Respiratory Syndrome (MERS)
Vincent Munster, PhD, NIH: Infectious Diseases
Barriers: Big Data; Multiple Sources; Isolated Analysis; Inaccessibility; Integration Challenges; Collaboration Barriers
“The platform allows me to analyze and grasp large seemingly incomprehensible datasets.” - Vincent Munster, PhD

Dr. Munster’s Research Challenges
Data sources: Private Data, PubMed, FAERS, DisGeNet, PharmGKB, HGNC
• 10,000 Titles
• 200 Hours
• 250,000 Pages
• Tools
• Knowledge
• Diseases
• Articles
• Collaborators

Our Tested Solution
Data sources: Private Data, PubMed, FAERS, DisGeNet, PharmGKB, HGNC
• Diseases
• Articles
• Collaborators

Use Case Metamodel
Concepts: Drug Component, Drug, Chemical, Disease, Pathway, Gene, Author, Molecular Function, Researcher Datasets, Biological Process, Publication, Chromosome, Cell Component
Data sources: DrugBank, CTD, PharmGKB, PubMed, DisGeNet, HGNC, PubChem, Private Datasets
• No single database has exhaustive information. Multiple connections ensure complete data.
• The data sources above reflect the information requested by our customer. This solution can be easily customized for any researcher.

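A metamodel like the one on this slide can be treated as a graph of concepts, and a question becomes a path between two concepts. The sketch below is a rough illustration only: the slide text lists the concepts but not how they connect, so the edges in `METAMODEL_EDGES` are assumptions made up for the example, and the function names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

# Assumed concept-to-concept links; the slide does not state the real edges.
METAMODEL_EDGES: List[Tuple[str, str]] = [
    ("Drug", "Gene"),
    ("Gene", "Disease"),
    ("Gene", "Pathway"),
    ("Disease", "Publication"),
    ("Publication", "Author"),
]


def build_adjacency(edges: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from concept pairs."""
    adjacency: Dict[str, Set[str]] = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path(adjacency: Dict[str, Set[str]],
                  start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search for the shortest chain of concepts, or None."""
    if start not in adjacency or goal not in adjacency:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(adjacency[node]):  # sorted for deterministic order
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    adjacency = build_adjacency(METAMODEL_EDGES)
    print(shortest_path(adjacency, "Drug", "Author"))
```

Under these assumed edges, a question such as "who has published on diseases linked to this drug's target genes" corresponds to the chain of concepts the search returns.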
SEMOSS Supplementing Insights
Relevant Data
• Gene Expression
• Chemical
• Cellular Pathway
• Molecular Function
• Biological Process
• Cytolocation
• Cell Component
• Gene Nomenclature
• Disease
• Publication
• Author
• Private Research Data
Data Sources
• Online Mendelian Inheritance in Man (OMIM)
• PubMed
• HUGO Gene Nomenclature Committee (HGNC)
• DrugBank
• Comparative Toxicogenomics Database (CTD)
• Disease Gene Network (DisGeNet)
• PubChem
• PharmGKB

Solution Benefits & Capabilities
Researcher Benefits
• Data Accuracy: ensure you are using validated, authoritative sources
• Time Efficiency: eliminate days spent reading publications and searching for data
• Single Platform: use a centralized platform rather than multiple data locations
• Rapid Visualization & Analysis: gain insight and accelerate research
• Scientific Collaboration: secure public/private cloud instance for collaboration
Solution Capabilities
• Big Data: navigate and distill relevant data seamlessly
• Extensible, Scalable Data Model: shared model of understanding
• Undirected Research: what questions do we ask public data that we do not have answers to?
• Broad Applicability: across many subject areas and data types
• Open Data Initiatives: federal public data initiatives with no data consumption tool

Solution Overview
SEMOSS maximizes HHS Open Data ROI by leveraging the vast networks of public and private life science data to promote insight and discovery.
Diagram labels:
• SEMOSS Solution
• Scientific Use Case
• Federal Health Data Environment
• Data sources: PubMed, PharmGKB, DisGeNet, MeSH, CTD, HGNC, FAERS
• Cloud Infrastructure
• SEMOSS Platform
• End Users: Cancer Researcher
• Example question: “Which diseases are associated with my genes of interest?”

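The example question on this slide, "Which diseases are associated with my genes of interest?", can be sketched as a lookup that merges association records from several sources and ranks each disease by how many sources support it. This is an illustrative sketch, not the platform's method: the source names, gene symbols, and disease names below are placeholders, and `diseases_for_genes` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Placeholder (gene, disease) records from two hypothetical sources.
SOURCE_RECORDS: Dict[str, List[Tuple[str, str]]] = {
    "source_a": [("GENE1", "Disease X"), ("GENE2", "Disease Y")],
    "source_b": [("GENE1", "Disease X"), ("GENE1", "Disease Z")],
}


def diseases_for_genes(genes: Iterable[str],
                       source_records: Dict[str, List[Tuple[str, str]]]
                       ) -> List[Tuple[str, List[str]]]:
    """Return (disease, supporting sources), best-supported diseases first."""
    wanted = set(genes)
    support = defaultdict(set)  # disease -> names of sources reporting it
    for source, records in source_records.items():
        for gene, disease in records:
            if gene in wanted:
                support[disease].add(source)
    ranked = ((disease, sorted(sources)) for disease, sources in support.items())
    # Sort by descending number of sources, then alphabetically by disease.
    return sorted(ranked, key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    for disease, sources in diseases_for_genes(["GENE1"], SOURCE_RECORDS):
        print(disease, sources)
```

Keeping the supporting sources alongside each answer reflects the deck's point that no single database is exhaustive: an association confirmed by several sources can be told apart from one reported only once.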
Solution Demonstration
Data Sources
1. Private Data
2. HGNC
3. OMIM
4. DisGeNet
5. CTD
6. PharmGKB
7. PubMed
8. FAERS
9. VAERS
• Diseases
• Articles
• Collaborators

SEMOSS Supplementing Insights
Identify Question
• SEMOSS pre-packages more than eighty questions across domains that can be readily utilized. New questions can be modeled as reports.
Synthesize Meta Model
• SEMOSS has more than ten different domain metamodels. New models can be created or extended to emulate mental models.
Find and Import Data
• SEMOSS has industry data across healthcare, infrastructure, data and BPR that can be readily explored. Link Excel data or an RDBMS to existing data for analysis.
Visual Analysis
• SEMOSS allows automatic linking of data across databases and allows cross-database visualization. Users no longer need to import everything into a single database.

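As a rough illustration of linking spreadsheet-style data to an existing relational store, the sketch below reads CSV rows with Python's standard `csv` module and looks each one up in an in-memory SQLite table, leaving the two sources separate. It is a sketch under stated assumptions and says nothing about how SEMOSS connectors are built: the table name `gene_disease`, the column names, and all sample values are invented placeholders.

```python
import csv
import io
import sqlite3
from typing import List, Tuple

# Hypothetical researcher spreadsheet exported as CSV (placeholder values).
CSV_TEXT = "sample_id,gene\nS1,GENE1\nS2,GENE2\nS3,GENE9\n"


def make_reference_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for an existing relational source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gene_disease (gene TEXT, disease TEXT)")
    conn.executemany(
        "INSERT INTO gene_disease VALUES (?, ?)",
        [("GENE1", "Disease X"), ("GENE2", "Disease Y")],
    )
    return conn


def link_samples(csv_text: str,
                 conn: sqlite3.Connection) -> List[Tuple[str, str, List[str]]]:
    """Attach diseases from the database to each CSV row via the gene column."""
    linked = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        cursor = conn.execute(
            "SELECT disease FROM gene_disease WHERE gene = ? ORDER BY disease",
            (row["gene"],),
        )
        diseases = [disease for (disease,) in cursor.fetchall()]
        linked.append((row["sample_id"], row["gene"], diseases))
    return linked


if __name__ == "__main__":
    for record in link_samples(CSV_TEXT, make_reference_db()):
        print(record)
```

Rows whose gene has no match in the reference table keep an empty list rather than being dropped, so gaps in the existing data stay visible to the researcher.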
The Team
• Brock Smith, Project Lead, NIH
• Karthik Balakrishnan, Technical Lead, NIH
• Joe Croghan, Project Supervisor, NIH
• Alexander Sherman, Technical SME, NIH
• Alexandra Kwit, Science Lead, NIH
• Prabhu Kapaleeswaran, Author, SEMOSS, MHS
• Regina Cox, Data SME, CDC
• LeeAnn Bailey, PhD, Science SME, FDA
Special Thanks to…
• Alex Rosenthal, NIH
• Peter Jahrling, PhD, NIH
• Mike Tartakovsky, NIH
• Dawei Lin, PhD, NIH
• David Parrish, NIH
• Vincent Munster, PhD, NIH