
Introduction to the W3C for Semantic Web and Life Sciences Interest Group Eric Prud’hommeaux


Presentation Transcript


  1. Introduction to the W3C for Semantic Web and Life Sciences Interest Group • Eric Prud’hommeaux

  2. What is the Mission of HCLS IG? • The mission of HCLS is to develop, advocate for, and support the use of Semantic Web technologies for biological science, translational medicine and health care. These domains stand to gain tremendous benefit by adoption of Semantic Web technologies, as they depend on the interoperability of information from many domains and processes for efficient decision support.

  3. Task Forces
     • Terminology – Semantic Web representation of existing resources (task lead: John Madden)
     • BioRDF – integrated neuroscience knowledge base (task lead: Kei Cheung)
     • Linking Open Drug Data – aggregation of Web-based drug data (task lead: Chris Bizer)
     • Scientific Discourse – building communities through networking (task leads: Tim Clark, John Breslin)
     • Clinical Observations Interoperability – patient recruitment in trials (task lead: Vipul Kashyap)
     • Other Projects: Clinical Decision Support, URI Workshop, Collaborations with CDISC & HL7

  4. Terminology: Overview • Goal is to identify use cases and methods for extracting Semantic Web representations from existing, standard medical record terminologies, e.g. UMLS • Methods should be reproducible and, to the extent possible, not lossy • Identify and document issues along the way related to identification schemes, expressiveness of the relevant languages • Initial effort will start with SNOMED-CT and UMLS Semantic Networks and focus on a particular sub-domain (e.g. pharmacological classification)

  5. BioRDF: Answering Questions • Goals: Get answers to questions posed to a body of collective knowledge in an effective way • Knowledge used: Publicly available databases, and text mining • Strategy: Integrate knowledge using careful modeling, exploiting Semantic Web standards and technologies

  6. BioRDF: Looking for Targets for Alzheimer’s • Signal transduction pathways are considered to be rich in “druggable” targets • CA1 Pyramidal Neurons are known to be particularly damaged in Alzheimer’s disease • Casting a wide net, can we find candidate genes known to be involved in signal transduction and active in Pyramidal Neurons? Source: Alan Ruttenberg

  7. BioRDF: Integrating Heterogeneous Data • Data sources: PDSPki, NeuronDB, Reactome, Gene Ontology, BAMS, Allen Brain Atlas, BrainPharm, Antibodies, Entrez Gene, MESH, Literature, PubChem, Mammalian Phenotype, SWAN, AlzGene, Homologene Source: Susie Stephens

  8. BioRDF: SPARQL Query Source: Alan Ruttenberg

  9. BioRDF: Results: Genes, Processes
     • DRD1, 1812 adenylate cyclase activation
     • ADRB2, 154 adenylate cyclase activation
     • ADRB2, 154 arrestin mediated desensitization of G-protein coupled receptor protein signaling pathway
     • DRD1IP, 50632 dopamine receptor signaling pathway
     • DRD1, 1812 dopamine receptor, adenylate cyclase activating pathway
     • DRD2, 1813 dopamine receptor, adenylate cyclase inhibiting pathway
     • GRM7, 2917 G-protein coupled receptor protein signaling pathway
     • GNG3, 2785 G-protein coupled receptor protein signaling pathway
     • GNG12, 55970 G-protein coupled receptor protein signaling pathway
     • DRD2, 1813 G-protein coupled receptor protein signaling pathway
     • ADRB2, 154 G-protein coupled receptor protein signaling pathway
     • CALM3, 808 G-protein coupled receptor protein signaling pathway
     • HTR2A, 3356 G-protein coupled receptor protein signaling pathway
     • DRD1, 1812 G-protein signaling, coupled to cyclic nucleotide second messenger
     • SSTR5, 6755 G-protein signaling, coupled to cyclic nucleotide second messenger
     • MTNR1A, 4543 G-protein signaling, coupled to cyclic nucleotide second messenger
     • CNR2, 1269 G-protein signaling, coupled to cyclic nucleotide second messenger
     • HTR6, 3362 G-protein signaling, coupled to cyclic nucleotide second messenger
     • GRIK2, 2898 glutamate signaling pathway
     • GRIN1, 2902 glutamate signaling pathway
     • GRIN2A, 2903 glutamate signaling pathway
     • GRIN2B, 2904 glutamate signaling pathway
     • ADAM10, 102 integrin-mediated signaling pathway
     • GRM7, 2917 negative regulation of adenylate cyclase activity
     • LRP1, 4035 negative regulation of Wnt receptor signaling pathway
     • ADAM10, 102 Notch receptor processing
     • ASCL1, 429 Notch signaling pathway
     • HTR2A, 3356 serotonin receptor signaling pathway
     • ADRB2, 154 transmembrane receptor protein tyrosine kinase activation (dimerization)
     • PTPRG, 5793 transmembrane receptor protein tyrosine kinase signaling pathway
     • EPHA4, 2043 transmembrane receptor protein tyrosine kinase signaling pathway
     • NRTN, 4902 transmembrane receptor protein tyrosine kinase signaling pathway
     • CTNND1, 1500 Wnt receptor signaling pathway
     Many of the genes are related to AD through gamma secretase (presenilin) activity. Source: Alan Ruttenberg

  10. LODD: Introduction • [Figure: Linked Data browsers, Linked Data mashups and search engines operating over "Things" in data sources A, B, C, D and E that are connected by typed links] • Use Semantic Web technologies to • 1. publish structured data on the Web • 2. set links between data from one data source to data within other data sources Source: Chris Bizer
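The two steps on this slide (publish structured data, then set links between data sources) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ex-trials:` and `ex-encyclopedia:` identifiers, the sample labels, the helper name `link_by_label`, the choice of `owl:sameAs` as the link type, and the matching of entities by case-insensitive label are all hypothetical and are not taken from the LODD task force's actual pipeline.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def link_by_label(source: Dict[str, str],
                  target: Dict[str, str],
                  predicate: str = "owl:sameAs") -> List[Triple]:
    """Return link triples from source entities to target entities
    whose labels match, ignoring case and surrounding whitespace."""
    # Index the target data source by normalized label.
    index: Dict[str, List[str]] = {}
    for uri, label in target.items():
        index.setdefault(label.strip().lower(), []).append(uri)

    # Emit one typed link per matching (source, target) pair.
    links: List[Triple] = []
    for uri, label in source.items():
        for match in index.get(label.strip().lower(), []):
            links.append((uri, predicate, match))
    return links


# Hypothetical data source A: interventions from a clinical trials data set.
TRIAL_INTERVENTIONS = {
    "ex-trials:intervention/1": "Metformin",
    "ex-trials:intervention/2": "Placebo",
}

# Hypothetical data source B: drug entries from an encyclopedic data set.
ENCYCLOPEDIA_DRUGS = {
    "ex-encyclopedia:Metformin": "metformin",
    "ex-encyclopedia:Warfarin": "Warfarin",
}

LINKS = link_by_label(TRIAL_INTERVENTIONS, ENCYCLOPEDIA_DRUGS)
```

Label matching is the simplest possible linking heuristic; real link generation would likely need identifiers or more careful string comparison to avoid false matches.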

  11. LODD: Potential Links between Data Sets Source: Chris Bizer

  12. LODD: Data Set Evaluation Source: Chris Bizer

  13. LODD: Potential questions to answer
     • Physicians and Pharmacists: What are alternative drugs for a given indication (disease)? What are equivalent drugs (generic version of a brand name, or the chemical name of an active ingredient)? Are there ongoing clinical trials for a drug?
     • Patients: What background information is available about a drug? What are the contraindications of a drug? Which alternative drugs are available? What are the results of clinical trials for a drug?
     • Pharmaceutical Companies: What are other companies with drugs in similar areas? Which companies have a similar therapeutic focus?
     Source: Chris Bizer

  14. LODD: Linked Version of ClinicalTrials.gov • Total number of triples: 6,998,851 • Number of Trials: 61,920 • RDF links to other data sources: 177,975 • Links to: DBpedia and YAGO (from intervention and conditions), GeoNames (from locations), Bio2RDF.org's PubMed (from references) Source: Chris Bizer

  15. LODD: Mashing Clinical Trials and Geo Classification of Places Geo Coordinates Source: Chris Bizer

  16. Scientific Discourse: Overview Source: Tim Clark

  17. Scientific Discourse: Goals • Provide a Semantic Web platform for scientific discourse in biomedicine • Linked to key concepts, entities and knowledge • Specified by ontologies • Integrated with existing software tools • Useful to Web communities of working scientists Source: Tim Clark

  18. Scientific Discourse: Some Parameters • Discourse categories: research questions, scientific assertions or claims, hypotheses, comments and discussion, and evidence • Biomedical categories: genes, proteins, antibodies, animal models, laboratory protocols, biological processes, reagents, disease classifications, user-generated tags, and bibliographic references • Driving biological project: cross-application of discoveries, methods and reagents in stem cell, Alzheimer and Parkinson disease research • Informatics use cases: interoperability of web-based research communities with (a) each other (b) key biomedical ontologies (c) algorithms for bibliographic annotation and text mining (d) key resources Source: Tim Clark

  19. Scientific Discourse: SWAN+SIOC • SIOC • Represent activities and contributions of online communities • Integration with blogging, wiki and CMS software • Use of existing ontologies, e.g. FOAF, SKOS, DC • SWAN • Represents scientific discourse (hypotheses, claims, evidence, concepts, entities, citations) • Used to create the SWAN Alzheimer knowledge base • Active beta participation of 144 Alzheimer researchers • Ongoing integration into SCF Drupal toolkit Source: Tim Clark

  20. Scientific Discourse: SIOC Ontology Source: John Breslin

  21. Scientific Discourse: SWAN KB Source: Tim Clark

  22. COI: Bridging Bench to Bedside • How can existing Electronic Health Record (EHR) formats be reused for patient recruitment? • Quasi-standard formats for clinical data: • HL7/RIM/DCM – healthcare delivery systems • CDISC/SDTM – clinical trial systems • How can we map across these formats? • Can we ask questions in one format when the data is represented in another format? Source: Holger Stenzhorn

  23. COI: Use Case • Pharmaceutical companies pay a lot to test drugs • Pharmaceutical companies express protocol in CDISC • -- precipitous gap – • Hospitals exchange information in HL7/RIM • Hospitals have relational databases Source: Eric Prud’hommeaux

  24. Inclusion Criteria • Type 2 diabetes on diet and exercise therapy or monotherapy with metformin, insulin secretagogue, or alpha-glucosidase inhibitors, or a low-dose combination of these at 50% maximal dose. Dosing is stable for 8 weeks prior to randomization. • … • ?patient takes metformin . Source: Holger Stenzhorn

  25. Exclusion Criteria • Use of warfarin (Coumadin), clopidogrel (Plavix) or other anticoagulants. • … • ?patient doesNotTake anticoagulant . Source: Holger Stenzhorn

  26. Criteria in SPARQL
      ?medication1 sdtm:subject ?patient ;
                   spl:activeIngredient ?ingredient1 .
      ?ingredient1 spl:classCode 6809 . #metformin
      OPTIONAL {
        ?medication2 sdtm:subject ?patient ;
                     spl:activeIngredient ?ingredient2 .
        ?ingredient2 spl:classCode 11289 . #anticoagulant
      }
      FILTER (!BOUND(?medication2))
      Source: Holger Stenzhorn
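The pattern on this slide, an OPTIONAL block followed by FILTER (!BOUND(...)), is the usual way to express negation in SPARQL: keep a patient who takes metformin only if no anticoagulant medication can be matched for that same patient. A minimal Python sketch of the same logic over in-memory triples follows. It is not the task force's implementation; the sample patients, the `med:` and `ing:` identifiers, and the function names are hypothetical, while the predicate names and the class codes 6809 and 11289 are copied from the slide.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, object]  # (subject, predicate, object)

METFORMIN = 6809        # class code used on the slide for metformin
ANTICOAGULANT = 11289   # class code used on the slide for anticoagulants


def patients_taking(triples: Iterable[Triple], class_code: int) -> Set[str]:
    """Patients with a medication whose active ingredient has class_code."""
    data: List[Triple] = list(triples)
    # ?ingredient spl:classCode <class_code>
    ingredients = {s for s, p, o in data
                   if p == "spl:classCode" and o == class_code}
    # ?medication spl:activeIngredient ?ingredient
    medications = {s for s, p, o in data
                   if p == "spl:activeIngredient" and o in ingredients}
    # ?medication sdtm:subject ?patient
    return {o for s, p, o in data
            if p == "sdtm:subject" and s in medications}


def eligible_patients(triples: Iterable[Triple]) -> Set[str]:
    """Inclusion (takes metformin) minus exclusion (takes an anticoagulant).

    The set difference plays the role of OPTIONAL + FILTER(!BOUND(...)).
    """
    data = list(triples)
    return patients_taking(data, METFORMIN) - patients_taking(data, ANTICOAGULANT)


# Hypothetical sample data: Alice takes metformin only,
# Bob takes metformin and an anticoagulant.
SAMPLE: List[Triple] = [
    ("med:1", "sdtm:subject", "patient:alice"),
    ("med:1", "spl:activeIngredient", "ing:1"),
    ("ing:1", "spl:classCode", 6809),
    ("med:2", "sdtm:subject", "patient:bob"),
    ("med:2", "spl:activeIngredient", "ing:2"),
    ("ing:2", "spl:classCode", 6809),
    ("med:3", "sdtm:subject", "patient:bob"),
    ("med:3", "spl:activeIngredient", "ing:3"),
    ("ing:3", "spl:classCode", 11289),
]

ELIGIBLE = eligible_patients(SAMPLE)
```

With this sample data only Alice remains eligible, since Bob's anticoagulant medication binds the optional pattern and the filter removes him.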

  27. Getting Involved • Benefits of getting involved include: • early access to use cases and best practices • influence on standard recommendations • cost-effective exploration of new technology through collaboration • Get involved by contacting the chairs: • team-hcls-chairs@w3.org
