
An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Dept. of Computer Science, University of British Columbia. Dennis Wang, Gavin Ha, Jennifer Chen, Nancy Wang. CPSC 445. April 5th, 2007.


Presentation Transcript


  1. An Ontological Approach for Describing Phospho-proteins in Rhodococcus
     Dept. of Computer Science, University of British Columbia.
     Dennis Wang, Gavin Ha, Jennifer Chen, Nancy Wang. CPSC 445. April 5th, 2007

  2. What is an ontology?
     • Purpose:
       • Knowledge representation & reasoning
       • Facilitates knowledge sharing and reuse
     • Definition:
       • A data model that represents a set of concepts within a domain and the relationships between those concepts
       • It is used to reason about the objects within that domain
       • Describes individuals (instances), classes (concepts), attributes, relations and axioms
     • Uses:
       • AI, information architecture, semantic web, software engineering

  3. Problems in biology
     • Biology is knowledge-based
       • Uses prior knowledge to infer new knowledge
       • Data-rich
     • Biologists need extensive prior knowledge to analyze the data they obtain
     • The pace of data production exceeds one's ability to acquire knowledge
     • Need an automated system to apply domain experts' knowledge to biological data

  4. Solution: ontology & bioinformatics
     • Joint effort of biologists and computer scientists
     • Build ontologies using domain knowledge
     • Rapid classification of large datasets
     • Allows queries to find instances of a class
     • Create controlled vocabularies for shared use across different biological and medical domains
     • In bioinformatics, an ontology can make knowledge available to the community and its applications

  5. Example: Gene Ontology (GO)
     “provides structured, controlled vocabularies and classifications that cover several domains of molecular biology”
     • Uses:
       • Annotation of large data sets
       • The ability to group gene products under some high-level term
     • Computational (putative) assignments of molecular function based on sequence similarity to annotated genes or sequences
     [Diagram: an unknown gene product is compared by sequence similarity to a sequence in SWISS-PROT with a known function; the gene function is then inferred from electronic annotation.]
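The similarity-based assignment described on this slide can be sketched as follows. This is only a toy illustration: the sequences, names, the position-by-position identity score and the 0.8 threshold are all made-up assumptions (real pipelines use alignment tools and curated databases). The only detail taken from GO itself is the evidence code `IEA`, "Inferred from Electronic Annotation".

```python
# Toy sketch of electronic annotation transfer by sequence similarity.
# All sequences and names below are hypothetical, not real SWISS-PROT entries.
ANNOTATED = {
    "toy_kinase": ("MKVLAGDSTRPLKE", "protein kinase activity"),
    "toy_phosphatase": ("MTTQWCHNLPAGRY", "phosphatase activity"),
}


def identity(a, b):
    """Fraction of positions that carry the same residue (no alignment)."""
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def infer_function(query, annotated=ANNOTATED, threshold=0.8):
    """Return a putative annotation copied from the most similar known sequence."""
    best_name, best_score = None, 0.0
    for name, (sequence, _function) in annotated.items():
        score = identity(query, sequence)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None or best_score < threshold:
        return None  # nothing similar enough: the product stays unannotated
    return {
        "function": annotated[best_name][1],
        "source": best_name,
        "evidence": "IEA",  # GO evidence code for electronic annotation
    }


if __name__ == "__main__":
    print(infer_function("MKVLAGDSTRPLKD"))
    print(infer_function("WWWWWWWWWWWWWW"))
```

The returned annotation is marked as electronically inferred so that it can be told apart from experimentally supported annotations later on.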

  6. How are ontologies built?
     • There is no standardized methodology
       • But there are efforts to develop more comprehensive guidelines
     • In general:
       • Informal stage
         • Natural language
       • Formal stage
         • Formal knowledge representation language

  7. Ontology-building life cycle
     Inspired by software engineering.
     User Model (Biologist):
       #1) Identification of the purpose and scope of the ontology
       #2) Acquisition of domain knowledge
     [Diagram stages: Identify purpose and scope; Knowledge Acquisition]

  8. Ontology-building life cycle
     Conceptualization Model (Bioinformatician/Biologist):
       #3) Identifying key concepts in the domain
       #4) Integration by using and incorporating other existing ontologies
     [Diagram stages: Identify purpose and scope; Knowledge Acquisition; Building Conceptualization; Integrating existing ontologies]

  9. Ontology-building life cycle
     Implementation Model (Bioinformatician):
       #5) Representing concepts with a formal language
       #6) Documenting informal and formal definitions
       #7) Evaluation of the appropriateness of the ontology for its intended application
     [Diagram stages: Identify purpose and scope; Available Development Tools; Knowledge Acquisition; Language & Representation; Building Conceptualization; Integrating existing ontologies; Encoding; Evaluation]

  10. Describing Phospho-Proteins using the Phosphabase Ontology
      [Diagram: biologists provide proteomic experimental data; signal protein experts provide phosphatase & kinase background knowledge. The bioinformatician uses both: the data form the instances (individuals) and the ontology (classes) is built using OWL-DL. The Pellet reasoner takes the data and the ontology and produces the results.]
      • Can we use the Phosphabase ontology to describe phospho-proteins discovered by the Rhodococcus Genome Project?

  11. Web Ontology Language (OWL)
      [Diagram: class Professor is a subClassOf the superclass FacultyMember; the individuals Jennifer Chen and Anne Condon are connected by InstanceOf and teaches relations.]
      • XML syntax
      • OWL-DL (Description Logic): certain restrictions guarantee decidability, based on description logic
      • OWL uses the Resource Description Framework (RDF)
        • Subject, Predicate, Object
      • Basic components in OWL:
        • Classes
        • Individuals
        • Properties
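The subject-predicate-object structure on this slide can be sketched with plain Python tuples. This is a minimal sketch, not OWL or RDF tooling: the predicate names are copied from the slide's diagram (in actual RDF/OWL they would be `rdf:type` and `rdfs:subClassOf`), and the reading that Anne Condon is the Professor who teaches Jennifer Chen is an assumption about the diagram. The helper names `objects` and `classes_of` are hypothetical.

```python
# Each statement is a (subject, predicate, object) triple, as in RDF.
# Who is an instance of what is an assumed reading of the slide's diagram.
TRIPLES = {
    ("Professor", "subClassOf", "FacultyMember"),
    ("Anne_Condon", "instanceOf", "Professor"),
    ("Anne_Condon", "teaches", "Jennifer_Chen"),
}


def objects(subject, predicate, triples=TRIPLES):
    """All objects o such that (subject, predicate, o) is asserted."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def classes_of(individual, triples=TRIPLES):
    """Asserted classes of an individual plus every superclass reached via subClassOf."""
    found = set()
    stack = list(objects(individual, "instanceOf", triples))
    while stack:
        cls = stack.pop()
        if cls not in found:
            found.add(cls)
            stack.extend(objects(cls, "subClassOf", triples))
    return found


if __name__ == "__main__":
    print(sorted(classes_of("Anne_Condon")))
    print(objects("Anne_Condon", "teaches"))
```

Following `subClassOf` links is the simplest kind of inference a reasoner performs: the membership in FacultyMember is never asserted directly, yet it follows from the two other statements.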

  12. Phosphabase Ontology
      • Wolstencroft et al, 2006
      • Biological motivation
        • Driven by protein domain architecture to describe signalling protein families
      • Background knowledge required for construction:
        • Signal protein domains
        • Presence of protein domains within signal proteins
      • OWL ontology
        • The ontology uses OWL-DL
        • Description logic can be applied to classify proteins using reasoners
        • There are many different ways to represent this knowledge in OWL
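Classification driven by domain architecture can be sketched as a subset test: a protein belongs to a class when it contains every domain the class definition requires. The class and domain names below are hypothetical placeholders rather than the actual Phosphabase definitions, and a subset check is a deliberate simplification of what a description-logic reasoner does.

```python
# Hypothetical class definitions: class name -> domains a member must contain.
CLASS_DEFINITIONS = {
    "Tyrosine_Phosphatase": frozenset({"tyrosine_phosphatase_catalytic"}),
    "Receptor_Tyrosine_Phosphatase": frozenset(
        {"tyrosine_phosphatase_catalytic", "transmembrane"}
    ),
    "Protein_Kinase": frozenset({"kinase_catalytic"}),
}


def classify(protein_domains, definitions=CLASS_DEFINITIONS):
    """Return every class whose required domains are all present in the protein."""
    present = set(protein_domains)
    return sorted(
        name for name, required in definitions.items() if required <= present
    )


if __name__ == "__main__":
    # A protein with a catalytic and a transmembrane domain matches two classes.
    print(classify({"tyrosine_phosphatase_catalytic", "transmembrane"}))
    # A protein with none of the defining domains matches nothing.
    print(classify({"sh2"}))
```

An empty list corresponds to a query that returns no result: the instance data do not satisfy any class definition in the ontology.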

  13. Phosphabase.owl
      [Class hierarchy view showing: Domain_Entity, Macromolecule, Protein_Phosphatase, Protein_Kinase]

  14. OWL DL Reasoners: Pellet
      • Input
        • Ontology – OWL-DL format
          • Axioms about classes go into the TBox
          • Type and property assertions (individuals) go into the ABox
        • Query – RDQL (SPARQL) format
        • Instance data (individuals)
      • Tableau reasoner
        • Checks satisfiability of an ABox with respect to a TBox
        • Tests for knowledge base consistency
      [Parsia and Sirin, ISWC 2004]
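The split between TBox and ABox can be illustrated with a much simpler check than a tableau algorithm. In this sketch the TBox holds subclass axioms and one disjointness axiom, the ABox holds type assertions, and the knowledge base counts as inconsistent when an individual ends up in two disjoint classes. The class names come from the Phosphabase.owl slide, but the nesting of the hierarchy and the disjointness of phosphatases and kinases are assumptions made for illustration; this is not how Pellet is implemented.

```python
# TBox: axioms about classes (hierarchy assumed from the Phosphabase.owl slide).
SUBCLASS_OF = {
    "Protein_Phosphatase": "Macromolecule",
    "Protein_Kinase": "Macromolecule",
    "Macromolecule": "Domain_Entity",
}
# Assumed for illustration only: the two protein classes share no members.
DISJOINT = [("Protein_Phosphatase", "Protein_Kinase")]


def superclasses(cls):
    """The class itself plus everything above it in the hierarchy."""
    result = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.add(cls)
    return result


def inferred_types(abox):
    """ABox: (individual, class) assertions, expanded with inherited classes."""
    types = {}
    for individual, cls in abox:
        types.setdefault(individual, set()).update(superclasses(cls))
    return types


def is_consistent(abox):
    """False when any individual belongs to two classes declared disjoint."""
    for classes in inferred_types(abox).values():
        for a, b in DISJOINT:
            if a in classes and b in classes:
                return False
    return True


if __name__ == "__main__":
    print(is_consistent([("p1", "Protein_Kinase")]))
    print(is_consistent([("p1", "Protein_Kinase"), ("p1", "Protein_Phosphatase")]))
```

A real tableau reasoner handles far richer axioms (property restrictions, existential and universal quantifiers), but the question it answers is the same: can the assertions about individuals hold together with the class axioms?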

  15. Instance Data

  16. Query Result

  17. Instance Data

  18. Query Result

  19. Instance Data No Result

  20. Conclusions
      • Ontologies can be used as a standard model for the exchange of biological information
      • Building ontologies can get very complicated
        • Biologists with little description logic training
        • Computer scientists with little knowledge of biology
        • Need more bioinformaticians
      • Ontologies can facilitate automated annotation of genes / gene products
      • It is difficult to read and infer from ontologies
        • Ontologies can get very big (Phosphabase is only a small example)
        • Reasoners are sometimes slow and inaccurate

  21. Acknowledgements
      • Rhodococcus sp. RHA1 data
        • Eltis Lab: Dr. Lindsay Eltis, Dept. Microbiology & Biochemistry
      • Phosphabase Ontology
        • Wolstencroft Lab, University of Manchester, UK
        • Bioinformatics paper: Wolstencroft et al, 2006
      • Phosphabase Ontology processing
        • Benjamin Good, iCAPTURE Centre, Vancouver
