Improving Discovery in Biology through Linked Data

Improving Discovery in Biology through Linked Data. Helena F. Deus. We live in a world of data. Data, data everywhere: sequences, microarrays, electrophoresis, crystallography, in vitro experiments.

Presentation Transcript


  1. Improving Discovery in Biology through Linked Data Helena F. Deus

  2. We live in a world of data

  3. Data, data everywhere: sequences, microarrays, electrophoresis, crystallography, in vitro experiments. Sources: http://www.lbl.gov/publicinfo/newscenter/pr/2008/PBD-microarray.html; http://www.biologyreference.com/Dn-Ep/Electrophoresis.html; http://biology.kenyon.edu/courses/biol114/Chap08/Chapter_08a.html

  4. Ingredients for Linked Data
     • Name things and concepts using URIs (Uniform Resource Identifiers)
     • Use the Resource Description Framework (RDF) to create relationships between named things
     • Discover new links by reusing ontologies and vocabularies
     Diagram labels: http://uniprot.org/EGFR; label "EGFR"; genomicLocation 7p12.1; sameAs http://geneontology.org/EGFR; westernBlot; image; rdfs:subClassOf; rdf:type
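
A minimal sketch of the ingredients on this slide, using plain Python tuples rather than an RDF library. Each statement is a (subject, predicate, object) triple and things are named with URIs. Which node each edge connects is an assumption read off the flattened diagram, and the `ex:` prefix is a hypothetical placeholder namespace.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# The edges below are an assumed reading of the slide diagram;
# "ex:" is a hypothetical placeholder prefix, not a real vocabulary.
EGFR = "http://uniprot.org/EGFR"

triples = {
    (EGFR, "rdfs:label", "EGFR"),
    (EGFR, "ex:genomicLocation", "7p12.1"),
    (EGFR, "owl:sameAs", "http://geneontology.org/EGFR"),
    ("ex:westernBlot", "rdfs:subClassOf", "ex:image"),
}


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


location = objects(EGFR, "ex:genomicLocation")
aliases = objects(EGFR, "owl:sameAs")
```

Because the subject is a URI, any other dataset that uses the same URI (or declares a `sameAs` link to it) adds statements to the same node without any schema change.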

  5. Ingredients for Linked Data
     • SPARQL, the query language of the Web of Data
     Example statement: http://uniprot.org/EGFR :overExpressedIn Alzheimer’s
     Example SPARQL patterns: ?Gene :overExpressedIn ?Disease . ?Gene :hasFunction ?GOterm . ?Pathway :hasParticipant ?Gene
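
To illustrate how the three patterns on this slide combine, here is a rough sketch of basic graph pattern matching in plain Python. It is not a SPARQL engine; it only shows that a variable such as `?Gene` must take the same value in every pattern. The `:overExpressedIn` statement comes from the slide; the GO term, pathway, and second gene are hypothetical placeholders, not biological facts.

```python
def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every triple pattern.

    Terms starting with "?" are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


EGFR = "http://uniprot.org/EGFR"
triples = [
    (EGFR, ":overExpressedIn", "Alzheimer's"),        # statement from the slide
    (EGFR, ":hasFunction", "ex:GOterm1"),             # hypothetical placeholder
    ("ex:pathway1", ":hasParticipant", EGFR),         # hypothetical placeholder
    ("ex:geneX", ":hasFunction", "ex:GOterm2"),       # no disease link: filtered out
]

query = [
    ("?Gene", ":overExpressedIn", "?Disease"),
    ("?Gene", ":hasFunction", "?GOterm"),
    ("?Pathway", ":hasParticipant", "?Gene"),
]

results = list(match(query, triples))
```

Only the gene that satisfies all three patterns survives; `ex:geneX` is dropped because nothing states that it is over-expressed in a disease.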

  6. Integrate Biological Data - the easy way
     Diagram: an NCBI graph and a Reactome graph joined by a sameAs link between nih:EGFR and rea:EGFR
     NCBI labels: nih:EGFR; nci:has_description "epidermal growth factor receptor"; nih:sequence CCCCGGCGCAGCGCGGCCGCAGCAGCCTCCGCCCCCCGCACGGTGTGAGCGCCCGACGCGGCCGAGGCGG …; nih:organism Homo sapiens; nih:interacts nih:EGF
     Reactome labels: rea:EGFR; rea:keyword rea:Membrane, rea:Receptor, rea:Transferase
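
The integration step on this slide can be sketched as follows: two independent triple sets are concatenated, and a `sameAs` link collapses the two EGFR identifiers into one node. The edge layout is an assumed reading of the slide diagram, the truncated sequence is left out, and `canonical` and `describe` are hypothetical helper names.

```python
# Triples as they might come from each source (assumed reading of the diagram).
ncbi = [
    ("nih:EGFR", "nih:organism", "Homo sapiens"),
    ("nih:EGFR", "nih:interacts", "nih:EGF"),
]
reactome = [
    ("rea:EGFR", "rea:keyword", "rea:Membrane"),
    ("rea:EGFR", "rea:keyword", "rea:Receptor"),
    ("rea:EGFR", "rea:keyword", "rea:Transferase"),
]

# The single sameAs link from the slide, pointing at a chosen canonical name.
same_as = {"rea:EGFR": "nih:EGFR"}


def canonical(node):
    """Map a node to its canonical identifier (itself if no sameAs link)."""
    return same_as.get(node, node)


# Merging is just a union once both sides use the canonical identifier.
merged = {(canonical(s), p, canonical(o)) for s, p, o in ncbi + reactome}


def describe(node):
    """Return all (predicate, object) pairs known about a node, sorted."""
    return sorted((p, o) for s, p, o in merged if s == canonical(node))


egfr_facts = describe("rea:EGFR")
```

No table schema had to be aligned: asking about either identifier returns the organism and interaction from NCBI together with the keywords from Reactome.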

  7. The Linked Data Cloud “Life sciences will drive adoption of the Semantic Web, just as high-energy physics drove the early Web.” - Sir Tim Berners-Lee, 2005

  8. Building a Knowledge Continuum
     Knowledge end: top-down approaches; formal logical models to be validated by reality; knowledge re-engineering bottleneck
     In between: Linked Data Cloud
     Data end: bottom-up approaches; knowledge generation, data-driven

  9. Biological Knowledge Continuum: Metabolomics, Protein 3D structure, Microarrays, Proteomics, Transcriptomics, Genomics, Electrophoresis, Sequencing

  10. Mapping genes to their functional roles. Src: Science Jan 2010: Vol. 327 no. 5964 pp. 425-431

  11. Querying the UCSC Genome Browser
     • Look up annotation for all genes with functions similar to protein P04637
     SQL:
       select uniProt.gene.val, go.association.term_id, go.term.name
       from uniProt.gene, go.gene_product, go.association, go.term
       where uniProt.gene.acc = 'P04637'
         and go.gene_product.symbol = uniProt.gene.val
         and go.gene_product.id = go.association.gene_product_id
         and go.association.term_id = go.term.id
     SPARQL (graph pattern labels from the slide): uniprot:P04637, :product, ?gene, go:term, ?goterm
     Ack: Nigam Shah & Eric Prud’hommeaux
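
To make the four-table join in the SQL above concrete, here is a small sketch that runs the same join against an in-memory SQLite database. The table names are flattened (`uniprot_gene` instead of `uniProt.gene`, and so on) because the original uses database-qualified names; the column names follow the slide. The only real data point is that P04637 is the TP53 protein; every other row and all term names are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy stand-ins for the four tables named on the slide (placeholder rows).
conn.executescript("""
CREATE TABLE uniprot_gene (acc TEXT, val TEXT);
CREATE TABLE go_gene_product (id INTEGER, symbol TEXT);
CREATE TABLE go_association (gene_product_id INTEGER, term_id INTEGER);
CREATE TABLE go_term (id INTEGER, name TEXT);

INSERT INTO uniprot_gene VALUES ('P04637', 'TP53');
INSERT INTO go_gene_product VALUES (1, 'TP53');
INSERT INTO go_gene_product VALUES (2, 'OTHER');
INSERT INTO go_association VALUES (1, 10);
INSERT INTO go_association VALUES (1, 11);
INSERT INTO go_association VALUES (2, 12);
INSERT INTO go_term VALUES (10, 'placeholder term A');
INSERT INTO go_term VALUES (11, 'placeholder term B');
INSERT INTO go_term VALUES (12, 'placeholder term C');
""")

# Same join conditions as the slide, written with explicit JOIN clauses.
rows = conn.execute(
    """
    SELECT g.val, a.term_id, t.name
    FROM uniprot_gene AS g
    JOIN go_gene_product AS gp ON gp.symbol = g.val
    JOIN go_association AS a ON a.gene_product_id = gp.id
    JOIN go_term AS t ON t.id = a.term_id
    WHERE g.acc = ?
    ORDER BY a.term_id
    """,
    ("P04637",),
).fetchall()

conn.close()
```

The query author has to know all four tables and their join keys in advance, which is the contrast the slide draws with the much shorter graph pattern.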

  12. How about Experimental Results?
     Hypothesis generation funnel: ~20 000 genes, then ~100 interesting genes/proteins, then ~10 interesting pathways, then ~5 proteins testable in the lab
     Diagram labels: High-throughput technologies; Computational statistics; Browse databases; Literature; Linked Data
     “I like to call it low-input, high-throughput, no-output biology.”

  13. The Cancer Genome Atlas

  14. From genes to discovery. Data sources: Drugbank, ClinicalTrial, OMIM. Genes: MDM2, EGFR, PTEN, KIT, PDGFRA, NME4, ARL6IP6, NOTCH1, MTHFD2, unknown

  15. Linking genes to diseases to drugs. Sources: Marc Vidal; Albert-Laszlo Barabasi; Michael Cusick; Proceedings of the National Academy of Sciences

  16. Linked Open Drug Data

  17. Linked data to follow MRSA spread. Datasets shown: UK MRSA, Portugal MRSA

  18. Can we model Systems Biology? Pathway components shown: Ras, CPLA2, RAF, MEK, ERK. Src: Nature Reviews 2010:11; 414-426

  19. Start using Linked Data NOW!! Resources: http://sindice.com; http://www.w3.org/wiki/HCLSIG/LODD/Data. Contact: HELENA.DEUS@DERI.ORG

  20. One stop shop for all your data needs

  21. Adoption by the Bioinformatics Community

  22. Who is using Linked Data?

  23. Who are we talking to? • At NUIG: • Professor Cathal Seoighe • Professor Frank Barry

  24. Plug yourself into Linked Data NOW!