
Compatible text, visual and mathematical representations for biological process ontologies

Compatible text, visual and mathematical representations for biological process ontologies. Nigam Shah Penn State University. Ontologies in Molecular Biology. An ontology is a formal way of representing knowledge.



Presentation Transcript


  1. Compatible text, visual and mathematical representations for biological process ontologies Nigam Shah Penn State University

  2. Ontologies in Molecular Biology • An ontology is a formal way of representing knowledge. • In an ontology, concepts are described both by their meaning and their relationship to each other.* • Gene Ontology • 43 open ontologies under OBO • First name ‘things’ … then name ‘relations’. • If we specify the ‘logic’ of combining ‘things’ and ‘relations’ we can write hypotheses about biological processes in a formal manner & evaluate them for consistency with existing information. * Bard and Rhee, Nature Reviews Genetics, Vol 5, March 2004, pg 213

  3. Hypotheses and Events An hypothesis about a biological process is a statement about relationships within a biological system, e.g. "Protein P induces transcription of gene X". We define an ‘event’ as a relationship between two biological entities, which we call ‘agents’.

  4. Testing events "Protein P induces transcription of gene X". Implicit claims (that can be made explicit): • P is a transcription factor. • P is a transcriptional activator. • P is localized to the nucleus. • P can bind to the promoter of gene X. [Diagram labels: P, promoter, gene X, nucleus]

  5. Hypothesis Ontology • Expressive enough to describe the galactose system at a coarse level of detail. • It is compatible with other ontology efforts. • E.g. GO so that GO annotations can be used directly in HyBrow. • We have also developed a grammar to write hypotheses using events from this ontology.

  6. Grammar for a hypothesis A hypothesis consists of at least one event stream. An event stream is a sequence of one or more events or event streams with logical joints (or operators) between them. An event has exactly one agent_a, exactly one agent_b and exactly one operator (i.e. a relationship between the two agents). It also has a physical location that denotes ‘where’ the event happened, the genetic context of the organism and the associated experimental perturbations when the event happened. A logical joint is the conjunction between two event streams.
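
The grammar above can be sketched as a small set of data types. This is only an illustration under stated assumptions, not HyBrow's actual implementation: the class names `Event`, `EventStream` and `Hypothesis`, the field names, and the use of `"+"` as a joint symbol (borrowed from the example on slide 13) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Event:
    """One event: exactly one agent_a, one operator and one agent_b, plus context."""
    agent_a: str
    operator: str                  # the relationship between the two agents
    agent_b: str
    location: str = ""             # where the event happened (hypothetical field name)
    genetic_context: str = ""      # genetic context of the organism (hypothetical field name)
    perturbations: List[str] = field(default_factory=list)


@dataclass
class EventStream:
    """A sequence of events or nested event streams with logical joints between them."""
    parts: List[Union[Event, "EventStream"]]
    joints: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.parts:
            raise ValueError("an event stream needs at least one event or event stream")
        if len(self.joints) != len(self.parts) - 1:
            raise ValueError("need exactly one logical joint between adjacent parts")


@dataclass
class Hypothesis:
    """A hypothesis consists of at least one event stream."""
    streams: List[EventStream]

    def __post_init__(self) -> None:
        if not self.streams:
            raise ValueError("a hypothesis needs at least one event stream")


# Usage: two events joined into one stream, wrapped in a hypothesis.
ev0 = Event("Gal2p", "transports", "galactose", location="mem", genetic_context="wt")
ev1 = Event("galactose", "activate", "Gal3p", location="cyt", genetic_context="wt")
stream = EventStream(parts=[ev0, ev1], joints=["+"])
hypothesis = Hypothesis(streams=[stream])
```

Enforcing the "at least one" and "one joint between adjacent parts" conditions in the constructors means a malformed hypothesis is rejected when it is built rather than later, during evaluation.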

  7. Making Hypotheses with increasing ‘formality’ [Diagram labels: Controlled Vocabulary, Formal Language, Context-Free Grammar] A biological event is any occurrence for which we gather experimental data. Hypotheses make testable statements about combinations of biological events. The mathematical representation: We have developed a formal language & grammar for representing an hypothesis as a sequence of events. We use ‘constraints and rules’ to decide if an hypothesis is a valid production of the language. http://conferences.computer.org/bioinformatics/CSB2003/SectA.html#Poster9

  8. Constraints and Rules • Consistency of an hypothesis with prior knowledge is evaluated by applying constraints and rules. • A constraint is a statement specifying the evidence that contradicts or supports an event. • Example: a protein must be in the nucleus to bind to a promoter. • A rule comprises the ‘steps’ for deciding whether a constraint is satisfied or violated. Example rule, Binds_to_promoter [P, g], annotation constraints: • If the cellular location of P is not the nucleus, give a penalty. • If the biological process is not transcription, give a penalty.
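
The Binds_to_promoter rule can be sketched as a function that sums penalties over the two annotation constraints. This is a minimal sketch and makes several assumptions the slide does not state: the annotation store layout, the key names `cellular_location` and `biological_process`, the unit penalty of 1.0, the choice to check both annotations on the protein P, and the choice to treat a missing annotation as a violation are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical annotation store: entity name -> annotation category -> set of terms.
Annotations = Dict[str, Dict[str, Set[str]]]


def binds_to_promoter_penalty(protein: str, annotations: Annotations,
                              penalty: float = 1.0) -> float:
    """Return the total penalty for the event 'protein Binds_to_promoter g'."""
    entry = annotations.get(protein, {})
    total = 0.0
    # Constraint 1: the cellular location of the protein should be the nucleus.
    if "nucleus" not in entry.get("cellular_location", set()):
        total += penalty
    # Constraint 2: the biological process should be transcription
    # (assumed here to be an annotation on the protein).
    if "transcription" not in entry.get("biological_process", set()):
        total += penalty
    return total


# Usage with made-up annotations for two placeholder proteins.
example_annotations: Annotations = {
    "P": {"cellular_location": {"nucleus"}, "biological_process": {"transcription"}},
    "Q": {"cellular_location": {"cytoplasm"}, "biological_process": {"transcription"}},
}
score_p = binds_to_promoter_penalty("P", example_annotations)  # no constraint violated
score_q = binds_to_promoter_penalty("Q", example_annotations)  # location constraint violated
```

Returning an accumulated penalty rather than a yes/no answer matches the slide's wording ("give a penalty") and lets several weakly contradicted events be compared by score.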

  9. A point-n-click interface

  10. Visual language representation Uses a formal visual language: • Direct composition of hypotheses in a format akin to reaction pathway diagrams • Translatable to other representation forms

  11. Other notations: Cook Notation -- BioD Kohn Notation

  12. Multiple ‘views’ of the ontology • Once we have an ontology for hypotheses, it can be represented as: • Text files that users type. • Formal constructs that can be evaluated for validity in a formal manner. • Files that are ‘browsed’ using special programs. • Having such equivalent formats allows us to perform computer-aided hypothesis evaluation.

  13. Multiple equivalent representations Biological process described in a formal language:
      ev0 = Gal2p transports galactose in mem in wt
      ev1 = galactose activate Gal3p in wt in cyt
      ev2 = Gal3p Binds_to_promoter gal1 in wt in nuc
      ev3 = Gal3p induce gal1 in presence_of galactose in wt in nuc
      hy1 = (ev0+ev1) and (ev2+ev3)
      XML format?
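
To illustrate how the text form could be turned into a structured form, here is a rough parser for the four event lines on this slide. It is inferred from those lines alone and is not the HyBrow grammar: the function name `parse_event`, the output keys, and the vocabularies `LOCATIONS` and `CONTEXTS` (which hold only the abbreviations that appear in the example) are assumptions. It does not attempt to parse the `hy1` expression.

```python
from typing import Dict, List, Tuple

LOCATIONS = {"mem", "cyt", "nuc"}  # abbreviations seen in the example only
CONTEXTS = {"wt"}                  # genetic contexts seen in the example only


def parse_event(line: str) -> Dict[str, object]:
    """Parse 'name = agent_a operator agent_b in <qualifier> ...' into a dict."""
    name, _, body = line.partition("=")
    tokens = body.split()
    if not name.strip() or len(tokens) < 3:
        raise ValueError(f"not an event line: {line!r}")
    location = None
    context = None
    conditions: List[Tuple[str, str]] = []
    rest = tokens[3:]
    i = 0
    while i < len(rest):
        # Every qualifier is introduced by the keyword 'in'.
        if rest[i] != "in" or i + 1 >= len(rest):
            raise ValueError(f"expected 'in <qualifier>' near {rest[i]!r}")
        word = rest[i + 1]
        if word == "presence_of":
            if i + 2 >= len(rest):
                raise ValueError("'presence_of' needs an entity after it")
            conditions.append(("presence_of", rest[i + 2]))
            i += 3
        elif word in LOCATIONS:
            location = word
            i += 2
        elif word in CONTEXTS:
            context = word
            i += 2
        else:
            raise ValueError(f"unknown qualifier: {word!r}")
    return {
        "name": name.strip(),
        "agent_a": tokens[0],
        "operator": tokens[1],
        "agent_b": tokens[2],
        "location": location,
        "context": context,
        "conditions": conditions,
    }


# Usage on one of the lines from the slide.
ev3 = parse_event("ev3 = Gal3p induce gal1 in presence_of galactose in wt in nuc")
```

Because the slide lists location and genetic context in either order (`in mem in wt` versus `in wt in cyt`), the sketch classifies each qualifier by vocabulary rather than by position.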

  14. Evaluating an hypothesis Demo

  15. Screen shot of the output

  16. Credits: Stephen Racunas (sar147@psu.edu), Nina Fedoroff (Mentor, nvf1@psu.edu). More on the project website: www.hybrow.org & Aug 1st @ 11:10 AM.
