Semantics for eScience Susie Stephens, Principal Research Scientist, Eli Lilly

Presentation Transcript


  1. Semantics for eScience • Susie Stephens, Principal Research Scientist, Eli Lilly

  2. Outline • Introduction to the Semantic Web • W3C’s Semantic Web for Health Care and Life Sciences Interest Group • Semantic Web Solutions at Lilly

  3. Introduction to the Semantic Web

  4. Drivers for the Semantic Web • Business models develop rapidly these days, so infrastructure that supports change is needed • Organizations are increasingly forming and disbanding collaborations, so they need to be able to share data more easily • Increasing need in pharma to be able to query across data silos • Data is growing so quickly that individuals can no longer identify patterns in their heads • Increasing recognition of the benefits of collective intelligence

  5. Characterizing the Semantic Web • Semantic Web is an interoperability technology • An architecture for interconnected communities and vocabularies • A set of interoperable standards for knowledge exchange

  6. Creating a Web of Data • [Figure labels: Applications; Graph representation; Data in various formats] Source: Ivan Herman
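
The idea on slide 6, data in various formats mapped into one graph representation that applications can query, can be sketched roughly as follows. This is a minimal illustration, not the approach shown in the original figure: the `http://example.org/` namespace, the records, and the helper names are all hypothetical.

```python
import csv
import io

# Hypothetical namespace and records, invented for illustration only.
EX = "http://example.org/"


def triples_from_csv(text, subject_column):
    """Turn each CSV row into (subject, predicate, object) triples."""
    triples = set()
    for row in csv.DictReader(io.StringIO(text)):
        subject = EX + row[subject_column]
        for column, value in row.items():
            if column != subject_column and value:
                triples.add((subject, EX + column, value))
    return triples


def triples_from_dict(record, subject_key):
    """Turn a JSON-like dict into triples using the same convention."""
    subject = EX + record[subject_key]
    return {(subject, EX + key, value)
            for key, value in record.items() if key != subject_key}


def describe(graph, subject):
    """Collect every predicate/object pair known for one subject."""
    return {p: o for s, p, o in graph if s == subject}


# Two sources in different formats that talk about the same (made-up) gene.
csv_source = "gene,symbol\ngene1,ABC1\n"
json_source = {"gene": "gene1", "pathway": "pathway7"}

# Merging graphs is a set union: statements about the same URI line up.
graph = triples_from_csv(csv_source, "gene") | triples_from_dict(json_source, "gene")
```

Calling `describe(graph, EX + "gene1")` then returns the symbol from the CSV source and the pathway from the dict source in one view, which is the point of the shared graph layer.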

  7. Mashing Data Source: W3C

  8. W3C’s Semantic Web for Health Care and Life Sciences Interest Group

  9. Task Forces • Terminology – Semantic Web representation of existing resources • Task lead - John Madden • Scientific Discourse – building communities through networking • Task leads - Tim Clark, John Breslin • Clinical Observations Interoperability – patient recruitment in trials • Task lead - Vipul Kashyap • BioRDF – integrated neuroscience knowledge base • Task lead - Kei Cheung • Linking Open Drug Data – aggregation of Web-based drug data • Task lead - Chris Bizer • Other Projects: Clinical Decision Support, URI Workshop, Collaborations with CDISC & HL7

  10. BioRDF: Integrating Heterogeneous Data • Integration and analysis of heterogeneous data sets • Hypothesis, Genome, Pathways, Molecular Properties, Disease, etc. • Data sources shown: PDSPki, NeuronDB, Reactome, Gene Ontology, BAMS, Allen Brain Atlas, BrainPharm, Antibodies, Entrez Gene, MESH, NC Annotations, PubChem, Mammalian Phenotype, SWAN, AlzGene, Homologene, Publications

  11. BioRDF: Looking for Targets for Alzheimer’s • Signal transduction pathways are considered to be rich in “druggable” targets • CA1 Pyramidal Neurons are known to be particularly damaged in Alzheimer’s disease • Casting a wide net, can we find candidate genes known to be involved in signal transduction and active in Pyramidal Neurons? Source: Alan Ruttenberg

  12. BioRDF: SPARQL Query Source: Alan Ruttenberg
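
The query itself did not survive in this transcript. As a rough stand-in, the question from slide 11 can be read as a three-part graph pattern: a gene participates in a process, the process is a kind of signal transduction, and the gene is active in CA1 pyramidal neurons. The sketch below evaluates such a pattern over a toy triple set in plain Python. It is not the original SPARQL; the predicate names and `GENE_X` are hypothetical, and only the two gene/process pairs are taken from the results on slide 13.

```python
# Toy triples. Predicate names are hypothetical; the GRIN1 and ADAM10
# gene/process pairs echo slide 13, while GENE_X is an invented non-match.
TRIPLES = {
    ("GRIN1", "participates_in", "glutamate signaling pathway"),
    ("ADAM10", "participates_in", "Notch receptor processing"),
    ("GENE_X", "participates_in", "toy metabolic process"),
    ("glutamate signaling pathway", "subclass_of", "signal transduction"),
    ("Notch receptor processing", "subclass_of", "signal transduction"),
    ("GRIN1", "expressed_in", "CA1 pyramidal neuron"),
    ("ADAM10", "expressed_in", "CA1 pyramidal neuron"),
    ("GENE_X", "expressed_in", "CA1 pyramidal neuron"),
}


def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return new bindings or None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was first bound to.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(patterns, triples):
    """Join the patterns left to right, like a basic graph pattern."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [b for s in solutions for t in triples
                     if (b := match(pattern, t, s)) is not None]
    return solutions


PATTERNS = [
    ("?gene", "participates_in", "?process"),
    ("?process", "subclass_of", "signal transduction"),
    ("?gene", "expressed_in", "CA1 pyramidal neuron"),
]

results = sorted((s["?gene"], s["?process"]) for s in query(PATTERNS, TRIPLES))
```

`GENE_X` is dropped because its process is not classified under signal transduction, so `results` holds only the ADAM10 and GRIN1 rows.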

  13. BioRDF: Results: Genes, Processes • DRD1, 1812 adenylate cyclase activation • ADRB2, 154 adenylate cyclase activation • ADRB2, 154 arrestin mediated desensitization of G-protein coupled receptor protein signaling pathway • DRD1IP, 50632 dopamine receptor signaling pathway • DRD1, 1812 dopamine receptor, adenylate cyclase activating pathway • DRD2, 1813 dopamine receptor, adenylate cyclase inhibiting pathway • GRM7, 2917 G-protein coupled receptor protein signaling pathway • GNG3, 2785 G-protein coupled receptor protein signaling pathway • GNG12, 55970 G-protein coupled receptor protein signaling pathway • DRD2, 1813 G-protein coupled receptor protein signaling pathway • ADRB2, 154 G-protein coupled receptor protein signaling pathway • CALM3, 808 G-protein coupled receptor protein signaling pathway • HTR2A, 3356 G-protein coupled receptor protein signaling pathway • DRD1, 1812 G-protein signaling, coupled to cyclic nucleotide second messenger • SSTR5, 6755 G-protein signaling, coupled to cyclic nucleotide second messenger • MTNR1A, 4543 G-protein signaling, coupled to cyclic nucleotide second messenger • CNR2, 1269 G-protein signaling, coupled to cyclic nucleotide second messenger • HTR6, 3362 G-protein signaling, coupled to cyclic nucleotide second messenger • GRIK2, 2898 glutamate signaling pathway • GRIN1, 2902 glutamate signaling pathway • GRIN2A, 2903 glutamate signaling pathway • GRIN2B, 2904 glutamate signaling pathway • ADAM10, 102 integrin-mediated signaling pathway • GRM7, 2917 negative regulation of adenylate cyclase activity • LRP1, 4035 negative regulation of Wnt receptor signaling pathway • ADAM10, 102 Notch receptor processing • ASCL1, 429 Notch signaling pathway • HTR2A, 3356 serotonin receptor signaling pathway • ADRB2, 154 transmembrane receptor protein tyrosine kinase activation (dimerization) • PTPRG, 5793 transmembrane receptor protein tyrosine kinase signaling pathway • EPHA4, 2043 transmembrane receptor protein tyrosine kinase signaling pathway • NRTN, 4902 transmembrane receptor protein tyrosine kinase signaling pathway • CTNND1, 1500 Wnt receptor signaling pathway • Many of the genes are related to AD through gamma secretase (presenilin) activity Source: Alan Ruttenberg

  14. LODD: Introduction • Use Semantic Web technologies to • 1. publish structured data on the Web • 2. set links between data from one data source to data within other data sources • [Figure labels: Linked Data Browsers; Linked Data Mashups; Search Engines; Things connected by typed links; A, B, C, D, E] Source: Chris Bizer
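
The two steps on slide 14 (publish structured data, then set links from one data source into another) can be sketched in a few lines. This is an illustrative toy under stated assumptions, not LODD's actual data or vocabulary: every URI, label, and the `intervention` link type below is hypothetical.

```python
# Two hypothetical data sources, keyed by made-up URIs.
DRUGS = {"http://example.org/drug/1": {"label": "Drug One"}}
TRIALS = {"http://example.org/trial/9": {"label": "Trial Nine"}}

# Hypothetical link type used to connect a trial to a drug.
INTERVENTION = "http://example.org/vocab/intervention"


def publish(source):
    """Step 1: expose each record as (subject, predicate, object) statements."""
    return {(uri, key, value)
            for uri, props in source.items()
            for key, value in props.items()}


# Step 2: set typed links from one data source into the other.
LINKS = {("http://example.org/trial/9", INTERVENTION, "http://example.org/drug/1")}

# The "web" is simply the union of both sources and the links between them.
web = publish(DRUGS) | publish(TRIALS) | LINKS


def follow(web, subject, link_type):
    """Follow a typed link and return the descriptions of its targets."""
    targets = [o for s, p, o in web if s == subject and p == link_type]
    return [{p: o for s, p, o in web if s == t} for t in targets]
```

`follow(web, "http://example.org/trial/9", INTERVENTION)` walks from the trial record into the drug source and returns the drug's description, which is what a linked data browser or mashup would do at larger scale.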

  15. LODD: Potential Links between Data Sets Source: Chris Bizer

  16. LODD: Potential questions to answer • Physicians and Pharmacists: What are alternative drugs for a given indication (disease)? What are equivalent drugs (generic version of a brand name, or the chemical name of an active ingredient)? Are there ongoing clinical trials for a drug? • Patients: What background information is available about a drug? What are the contraindications of a drug? Which alternative drugs are available? What are the results of clinical trials for a drug? • Pharmaceutical Companies: What are other companies with drugs in similar areas? Which companies have a similar therapeutic focus? Source: Chris Bizer

  17. LODD: Linked Version of ClinicalTrials.gov • Total number of triples: 6,998,851 • Number of trials: 61,920 • RDF links to other data sources: 177,975 • Links to: DBpedia and YAGO (from intervention and conditions); GeoNames (from locations); Bio2RDF.org's PubMed (from references) Source: Chris Bizer
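
To put the figures on slide 17 in proportion, a quick calculation gives the average number of triples and outgoing links per trial. Only the three counts come from the slide; the averages are derived here, and they assume every triple and link belongs to one of the counted trials, which the slide does not state.

```python
# Counts as reported on slide 17.
triples = 6_998_851
trials = 61_920
links = 177_975

# Derived averages (assumption: all triples and links describe counted trials).
triples_per_trial = triples / trials
links_per_trial = links / trials

print(f"{triples_per_trial:.1f} triples per trial")
print(f"{links_per_trial:.2f} outgoing RDF links per trial")
```

That works out to roughly 113 triples and just under 3 outgoing links per trial.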

  18. Semantic Web Solutions at Lilly

  19. Implementations at Lilly • Integration of Clinical and Pathways Data • Competitive Intelligence • Experimental Metadata • Discovery Metadata

  20. Discovery Metadata: Goals • Integrate master data throughout the discovery process to enable information sharing/integration for the scientific community • Model key relationships between master data classes • Provide ability to integrate disparate data sets quicker than the normal warehouse paradigm typically allows • Create a re-usable and sustainable semantic implementation • Allow for user-driven, manual curation of key data relationships Source: Phil Brooks

  21. Discovery Metadata: Ontology • [Figure labels: SAP; Legacy; REFDB; GSM; NCBI; Manual Curation] Source: Phil Brooks

  22. Discovery Metadata: Architecture • [Figure labels, by layer] • APPS: Application 1, Application 2, Application 3, … • SOA: SOA Layer/Enterprise Service Bus (WebServices, Visualizers, DataAccess Components), Authentication • DATA: SQL, SPARQL, ETL, Provenance, Source Model 1, Source Model 2, Source Model 3, Source Model 4, Local Assertions, Top Level Ontology, Other Sources, Source …, Other Tools, Spreadsheets, Rdbms Source: Phil Brooks
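
One way to read the data layer on slide 22 is that statements from several source models and from manually curated local assertions are loaded side by side, with provenance kept for each statement. The sketch below illustrates that reading only; it is not Lilly's implementation, and the identifiers, predicates, and source names are hypothetical.

```python
from collections import defaultdict

# Hypothetical statements: (subject, predicate, object, source).
QUADS = [
    ("compound:1", "hasTarget", "target:A", "source_model_1"),
    ("target:A", "encodedBy", "gene:G", "source_model_2"),
    ("compound:1", "hasTarget", "target:B", "local_assertions"),
]


def load(quads):
    """ETL step: index statements while remembering where each came from."""
    index = defaultdict(set)
    for s, p, o, source in quads:
        index[(s, p, o)].add(source)
    return index


def objects(index, subject, predicate, exclude_sources=()):
    """Return matching objects with their provenance, optionally ignoring sources."""
    found = {}
    for (s, p, o), sources in index.items():
        kept = sources - set(exclude_sources)
        if s == subject and p == predicate and kept:
            found[o] = sorted(kept)
    return found


store = load(QUADS)
```

`objects(store, "compound:1", "hasTarget")` returns both targets along with the source of each, while passing `exclude_sources=["local_assertions"]` hides the curated statement. Keeping the source next to each statement is what lets curated relationships sit alongside loaded data without overwriting it.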

  23. External Collaborations • RDF Access to Relational Databases - Chris Bizer, Eric Prud'hommeaux • Scalability testing of relational to RDF mapping approaches • End User Semantic Web Authoring - David Karger • Enhancing the scalability and robustness of the Exhibit and Potluck tools • Scientist-Driven Semantic Integration of Knowledge in Alzheimer's Disease - Tim Clark, June Kinoshita • Project to develop an integrated knowledge infrastructure for the neuromedical research community, pairing rich digital semantic context with the ever-growing digital scientific content on the web • Provenance Collection and Management - Carole Goble, Beth Plale • Project to develop a metadata taxonomy for global data at Lilly which enables the rapid integration of data and mining/analysis algorithms into dataflows which support clinical and discovery decisions • W3C’s Health Care and Life Sciences Interest Group

  24. Conclusion • Many Semantic Web solutions are being explored within the health care and life sciences community • Lilly is seeing tangible benefits in multiple projects from Semantic Web • Semantic Web provides a flexible framework for data integration • Incremental adoption of technology • Flexibility to integrate unanticipated data sets • Link existing silos together • Lilly is setting up open collaborations in this space • Try out LSG
