
BioPAX Biological Pathways Data Exchange biopaxwiki

BioPAX Biological Pathways Data Exchange www.biopaxwiki.org. Joanne Luciano, PhD University of Manchester, Harvard Medical School BioPathways Consortium, BioPAX Group, Predictive Medicine, Inc. 25 Jan 2006 Cambridge, MA USA. Pathway Data Why does HCLS care? (where we fit).


Presentation Transcript


  1. BioPAX: Biological Pathways Data Exchange (www.biopaxwiki.org). Joanne Luciano, PhD. University of Manchester, Harvard Medical School, BioPathways Consortium, BioPAX Group, Predictive Medicine, Inc. 25 Jan 2006, Cambridge, MA, USA

  2. Pathway Data: Why does HCLS care? (where we fit)
     Pathway Research has Broad Impact
     • Drug Discovery (pathway of target, safety)
     • Basic Science (identify pathways)
     • Disease Research (cancer pathways, diabetes, malaria)
     • Environmental Research (microbial research)
     Combine knowledge from multiple sources
     • Whole is greater than the sum of its parts
     • Biological knowledge is fragmented and isolated
     • Need database to manage resources

  3. What is a Pathway? Depends on who you ask! Examples shown: Glycolysis; Protein-Protein; Apoptosis; TFs in E. coli. Pathway types shown: Gene Regulatory Networks; Molecular Interaction Networks; Metabolic Pathways; Signaling Pathways

  4. Integration Nightmare! Diagram labels: High Throughput Experimental Methods; Genetics; Microarray; Mass Spectrometry; Two-Hybrid; Protein modifications; Interaction; Data; Expression; Function; Existing Literature; Multiple Pathway Databases. Slide from Gary Bader

  5. Pathway Databases: There are many pathway databases, each with its own data model, formats, and data access methods, and with internal inconsistencies. More than 200 and growing. Source: Pathway Resource List (http://cbio.mskcc.org/prl/). Slide from Mike Cary

  6. Closes Gaps in Pathway Data Space (Exchange Language Domain). Figure labels: Molecular Interactions; Pro:Pro; All:All; Metabolic Pathways; Low Detail; High Detail; Interaction Networks; Molecular; Non-molecular; Pro:Pro; TF:Gene; Genetic; Regulatory Pathways; Low Detail; High Detail; Small Molecules; Low Detail; High Detail; Genetic Interactions; Rate Formulas; Biochemical Reactions. Formats shown: Database Exchange Formats; Simulation Model Exchange Formats; BioPAX; SBML, CellML; PSI-MI 2. Slide from Gary Bader

  7. Research Community Need. Pathway Databases: WIT, BioCyc, Reactome, aMAZE, KEGG, BIND, DIP, HPRD, MINT, IntAct, PSI format, CSNDB, TRANSPATH, TRANSFAC, INOH, PubGene, GeneWays. Categories: Metabolic; Molecular Interaction; Cell Signaling; Gene Regulatory Networks. Integrated Pathway Database; Distributed Pathway Databases

  8. One Interface: one converter per data source or tool. >200 DBs and tools. Diagram labels: Application; Database; User; Without BioPAX; With BioPAX. Common “computable semantic” enables scientific discovery. Slide from Gary Bader (adapted)
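The "one converter per data source or tool" argument on this slide can be made concrete with a little arithmetic. The sketch below is only an illustration: the function name and the tool count of 10 are assumptions, and only the figure of roughly 200 databases comes from the slides.

```python
def converters_needed(n_sources, n_tools):
    """Return (point_to_point, shared_format) converter counts."""
    # Without a shared format, every tool needs its own reader for every source.
    point_to_point = n_sources * n_tools
    # With a shared format, each source exports once and each tool imports once.
    shared_format = n_sources + n_tools
    return point_to_point, shared_format


# 200 sources is the figure quoted on the slides; 10 tools is a made-up number.
point_to_point, shared_format = converters_needed(200, 10)
print(point_to_point, shared_format)
```

The gap between the two counts widens as either number grows, which is the case the slide makes for a common exchange format.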

  9. Design Goals
     Encapsulation
     • An entire pathway in one record
     Compatible
     • Use existing standards wherever possible
     Computable
     • From file reading to logical inference
     Successful
     • Buy-in from the research community

  10. Why OWL DL?
      Expressivity (biology = “complex relationships”)
      • W3C Standard (use existing and upcoming standards)
      “Semantic Web enabled”
      • OWL has representations in RDF and XML (XML is the exchange language)
      Machine Computable: enable full reasoning capability, from file reading to logical inference
      • Facilitate integration of knowledge, data, tool development
      • Uncover inconsistencies and new knowledge

  11. Different representations of the same pathways
      KEGG Reference Pathway GLYCOLYSIS (starts at α-D-Glucose 1P):
          <!ELEMENT reaction (substrate*,product*)>
          <!ATTLIST reaction name %keggid.type; #REQUIRED>
          <!ATTLIST reaction type %reaction-type.type; #REQUIRED>
          <!ELEMENT substrate EMPTY>
          <!ATTLIST substrate name %keggid.type; #REQUIRED>
          <!ELEMENT product EMPTY>
          <!ATTLIST product name %keggid.type; #REQUIRED>
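To see what a document shaped by the DTD fragment on this slide looks like to a program, here is a minimal sketch that reads one `reaction` element with Python's standard library. The sample instance is an assumption: its identifiers are placeholders, and only the element and attribute names come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance that follows the element/attribute names in the DTD
# fragment; the identifier values are placeholders, not real KEGG entries.
SAMPLE_REACTION = """\
<reaction name="rn:EXAMPLE_REACTION" type="reversible">
  <substrate name="cpd:EXAMPLE_SUBSTRATE"/>
  <product name="cpd:EXAMPLE_PRODUCT"/>
</reaction>
"""

reaction = ET.fromstring(SAMPLE_REACTION)
# The DTD allows zero or more substrate and product children per reaction.
substrates = [s.get("name") for s in reaction.findall("substrate")]
products = [p.get("name") for p in reaction.findall("product")]
print(reaction.get("name"), reaction.get("type"), substrates, products)
```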

  12. Different representations of the same pathways
      BioCyc Reference Pathway GLYCOLYSIS (starts at β-D-glucose 6-phosphate)
      reactions.dat: This file lists all chemical reactions in the PGDB.
      Attributes: UNIQUE-ID, TYPES, COMMON-NAME, ACTIVATORS, BASAL-TRANSCRIPTION-VALUE, DBLINKS, DELTAG0, DEPRESSORS, EC-LIST, EC-NUMBER, ENZYMATIC-REACTION, EQUILIBRIUM-CONSTANT, IN-PATHWAY, INHIBITORS, LEFT, MOVED-IN, MOVED-OUT, OFFICIAL-EC?, REACTANTS, REQUIREMENTS, RIGHT, SIGNAL, SPECIES, SPONTANEOUS?, STIMULATORS, SYNONYMS
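For contrast with the XML representation, an attribute-value file such as reactions.dat needs its own parser. The sketch below is illustrative only. It assumes one `ATTRIBUTE - value` pair per line, `//` as the record separator, and `#` for comment lines; the slide does not spell out that layout, and the sample identifiers are made up.

```python
from collections import defaultdict

# Hypothetical excerpt; attribute names come from the slide, values are made up.
SAMPLE_DAT = """\
# comment line
UNIQUE-ID - RXN-EXAMPLE-1
TYPES - Chemical-Reactions
LEFT - CPD-A
RIGHT - CPD-B
RIGHT - CPD-C
//
UNIQUE-ID - RXN-EXAMPLE-2
LEFT - CPD-B
RIGHT - CPD-D
//
"""


def parse_records(text):
    """Parse attribute-value records into a list of {attribute: [values]} dicts."""
    records = []
    current = defaultdict(list)
    for raw_line in text.splitlines():
        line = raw_line.rstrip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "//":  # assumed end-of-record marker
            if current:
                records.append(dict(current))
            current = defaultdict(list)
            continue
        attribute, separator, value = line.partition(" - ")
        if separator:  # ignore lines that do not match the assumed layout
            current[attribute].append(value)
    if current:  # keep a trailing record that lacks a final separator
        records.append(dict(current))
    return records


records = parse_records(SAMPLE_DAT)
print(records[0]["UNIQUE-ID"], records[0]["RIGHT"])
```

Values are kept as lists because an attribute such as RIGHT can repeat within one record.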

  13. BioPAX uses other ontologies
      • Use pointers to existing ontologies to provide supplemental annotation where appropriate
        • Cellular location → GO Component
        • Cell type → Cell.obo
        • Organism → NCBI taxon DB
      • Incorporate other standards where appropriate
        • Chemical structure → SMILES, CML, InChI

  14. BioPAX Ontology: Overview. Diagram annotations: a set of interactions & parts; parts; how the parts are known to interact. Level 1 v1.0 (July 7th, 2004). Slide from Gary Bader (adapted)

  15. OWL (semantics) Instances (data)

  16. SBML annotated with BioPAX
          <sbml xmlns:bp="http://www.biopax.org/release1/biopax-release1.owl"
                xmlns:owl="http://www.w3.org/2002/07/owl#"
                xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
            <listOfSpecies>
              <species id="PdhA" metaid="PdhA">
                <annotation>
                  <bp:protein rdf:ID="#PdhA"/>
                </annotation>
              </species>
              <species id="NADP+" metaid="NADP+">
                <annotation>
                  <bp:smallMolecule rdf:ID="#NADP+"/>
                </annotation>
              </species>
            </listOfSpecies>
            <listOfReactions>
              <reaction id="pyruvate_dehydrogenase_cplx">
                <annotation>
                  <bp:complexAssembly rdf:ID="#pyruvate_dehydrogenase_cplx"/>
                </annotation>
              </reaction>
            </listOfReactions>
          </sbml>
      Callouts: species is protein; protein is PdhA; species is small molecule; small molecule is NADP+
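One way a tool might consume these annotations is to read, for each SBML species, which BioPAX class its annotation names. The sketch below uses only Python's standard library on a trimmed copy of the slide's fragment. The function name is hypothetical, and the namespace URI is copied from the slide rather than checked against a released BioPAX ontology.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as written on the slide (not verified against a release).
BP = "http://www.biopax.org/release1/biopax-release1.owl"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

SBML_FRAGMENT = f"""<sbml xmlns:bp="{BP}" xmlns:rdf="{RDF}">
  <listOfSpecies>
    <species id="PdhA" metaid="PdhA">
      <annotation><bp:protein rdf:ID="#PdhA"/></annotation>
    </species>
    <species id="NADP+" metaid="NADP+">
      <annotation><bp:smallMolecule rdf:ID="#NADP+"/></annotation>
    </species>
  </listOfSpecies>
</sbml>"""


def biopax_types(sbml_text):
    """Map each species id to the BioPAX class named in its annotation."""
    root = ET.fromstring(sbml_text)
    types = {}
    for species in root.iter("species"):
        annotation = species.find("annotation")
        if annotation is None:
            continue
        for child in annotation:
            # ElementTree spells namespaced tags as "{uri}localname".
            if child.tag.startswith("{" + BP + "}"):
                types[species.get("id")] = child.tag.split("}", 1)[1]
    return types


types = biopax_types(SBML_FRAGMENT)
print(types)
```

This is the "species is protein" and "species is small molecule" reading from the slide's callouts, recovered from the element names alone.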

  17. BioPAX: External References
          <species id="pyruvate" metaid="pyruvate">
            <annotation xmlns:bp="http://biopax.org/release1/biopax-release1.owl">
              <bp:smallMolecule rdf:ID="#pyruvate">
                <bp:Xref>
                  <bp:unificationXref rdf:ID="#unificationXref119">
                    <bp:DB>LIGAND</bp:DB>
                    <bp:ID>c00022</bp:ID>
                  </bp:unificationXref>
                </bp:Xref>
              </bp:smallMolecule>
            </annotation>
          </species>
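External references like the one on this slide are what let two databases agree that they mean the same molecule. As a rough sketch, the (database, identifier) pairs can be pulled out with namespace-aware queries. The function name is hypothetical, the `xmlns:rdf` declaration is added here so the fragment parses on its own, and the `bp` namespace URI is copied from the slide as written.

```python
import xml.etree.ElementTree as ET

BP = "http://biopax.org/release1/biopax-release1.owl"  # as written on the slide
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = {"bp": BP, "rdf": RDF}

# xmlns:rdf is declared here (an addition) so the fragment is well-formed alone.
SPECIES_FRAGMENT = f"""<species id="pyruvate" metaid="pyruvate" xmlns:rdf="{RDF}">
  <annotation xmlns:bp="{BP}">
    <bp:smallMolecule rdf:ID="#pyruvate">
      <bp:Xref>
        <bp:unificationXref rdf:ID="#unificationXref119">
          <bp:DB>LIGAND</bp:DB>
          <bp:ID>c00022</bp:ID>
        </bp:unificationXref>
      </bp:Xref>
    </bp:smallMolecule>
  </annotation>
</species>"""


def unification_xrefs(xml_text):
    """Return (database, identifier) pairs for every unificationXref found."""
    root = ET.fromstring(xml_text)
    return [
        (xref.findtext("bp:DB", namespaces=NS), xref.findtext("bp:ID", namespaces=NS))
        for xref in root.iterfind(".//bp:unificationXref", NS)
    ]


xrefs = unification_xrefs(SPECIES_FRAGMENT)
print(xrefs)
```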

  18. BioPAX: Synonyms
          <species id="pyruvate" metaid="pyruvate">
            <annotation xmlns:bp="http://biopax.org/release1/biopax_release1.owl">
              <bp:smallMolecule rdf:ID="#pyruvate">
                <bp:SYNONYMS>2-oxo-propionic acid</bp:SYNONYMS>
                <bp:SYNONYMS>2-oxopropanoate</bp:SYNONYMS>
                <bp:SYNONYMS>BTS</bp:SYNONYMS>
                <bp:SYNONYMS>pyruvic acid</bp:SYNONYMS>
              </bp:smallMolecule>
            </annotation>
          </species>

  19. Tools: Protégé Ontology Editor; GKB Editor (SRI); SWOOP; Pellet; Racer; Fact++; Pathway Tools; EditPlus (text editor). Want more? See Jeremy & Alan

  20. Overlap?
      Integration
      • Combine sources in a meaningful way
      Identity
      • Recognize the same things in different contexts and under different names
      Composition
      • Re-usable representations of composite pathway components
      • To help us manage, query, and reference
      Exchange
      • Agreement on:
        • What is to be exchanged
        • How to represent it
        • How to interpret it
      Want more? See Alan, Jeremy, me

  21. Gartner hype graph, from Carole Goble (ISWC 2005). Labels: Gene Ontology, Microarray Gene Expression Database; BioDASH; BioPAX, UniProt; Corporate Semantic Web

  22. BioDASH: Bridging Chemistry and Molecular Biology
      • Different Views have different semantics: Lenses
      • When there is a correspondence between objects, a semantic binding is possible
      Example: Uniprot:P49841
      Apply Correspondence Rule: if ?target.xref.lsid == ?bpx:prot.xref.lsid then ?target.correspondsTo.?bpx:prot
      Slide from Eric Neumann and Dennis Quan
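The correspondence rule on this slide amounts to a join on a shared identifier. The sketch below is one possible reading in plain Python, not BioDASH's actual rule engine. The `Entity` class, the function name, and the entity names are all hypothetical, and the string `Uniprot:P49841` is reused from the slide as a stand-in for an LSID value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str       # hypothetical display name
    xref_lsid: str  # stands in for ?x.xref.lsid in the rule


def apply_correspondence_rule(targets, proteins):
    """Pair each target with every protein that carries the same xref LSID."""
    proteins_by_lsid = {}
    for protein in proteins:
        proteins_by_lsid.setdefault(protein.xref_lsid, []).append(protein)
    # if ?target.xref.lsid == ?prot.xref.lsid then ?target correspondsTo ?prot
    return [
        (target.name, protein.name)
        for target in targets
        for protein in proteins_by_lsid.get(target.xref_lsid, [])
    ]


targets = [
    Entity("target-A", "Uniprot:P49841"),
    Entity("target-B", "Uniprot:EXAMPLE"),  # placeholder with no match
]
proteins = [Entity("bpx-protein-1", "Uniprot:P49841")]
links = apply_correspondence_rule(targets, proteins)
print(links)
```

Indexing the proteins by identifier first keeps the match linear in the number of records rather than comparing every pair.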

  23. Probe Seamark Demonstration: Identification of new drug candidates
      1. Differentiate forms of disease
      2. Identify patient subgroups
      3. Identify top biomarkers
      4. Identify function
      5. Identify biological and chemical properties and disease associations of biomarker
      6. Identify documents
      7. Identify role in metabolic pathways
      8. Identify compounds that interact
      9. Identify and compare function in other organisms
      10. Identify any prior art
      Data sources shown: GO2Keyword.rdf, Keywords.rdf, ProbeSet.rdf, GO2OMIM.rdf, GO2UniProt.rdf, OMIM.rdf, IntAct.rdf, GO.rdf, GO2Enzyme.rdf, UniProt.rdf, Taxonomy.rdf, Enzymes.rdf, PubMed.xml, KEGG.rdf
      Entity types shown: Keyword, Protein, Gene, MIM Id, Enzyme, Organism, Citation, Compound, Pathway

  24. BioPAX Supporting Groups
      Databases
      • BioCyc (www.biocyc.org)
      • BIND (www.bind.ca)
      • WIT (wit.mcs.anl.gov/WIT2)
      • Reactome (www.reactome.org)
      • PharmGKB (www.pharmgkb.org)
      • KEGG
      Grants
      • Department of Energy (Workshop)
      Groups
      • Memorial Sloan-Kettering Cancer Center: G. Bader, M. Cary, J. Luciano, C. Sander
      • SRI Bioinformatics Research Group: P. Karp, S. Paley, J. Pick
      • University of Colorado Health Sciences Center: I. Shah
      • BioPathways Consortium: J. Luciano, E. Neumann, A. Regev, V. Schachter
      • Argonne National Laboratory: N. Maltsev, E. Marland
      • Samuel Lunenfeld Research Institute: C. Hogue
      • Harvard Medical School: E. Brauner, D. Marks, J. Luciano, A. Regev
      • NIST: R. Goldberg
      • Stanford: T. Klein
      • Columbia: A. Rzhetsky
      • Dana Farber Cancer Institute: J. Zucker
      • Millennium Pharma: Alan Ruttenberg
      • Science Commons: Jonathan Rees
      Collaborating Organizations
      • Proteomics Standards Initiative (PSI)
      • Systems Biology Markup Language (SBML)
      • Chemical Markup Language (CML)
      The BioPAX Community
