Ontological relations and computable definitions for sequences at DNA, RNA and protein levels. Karen Eilbeck, Neocles Leontis, Thomas Bittner, Colin Batchelor
Two sections • A report on the joint RNAO and SO meeting held in SLC in April 2008 (Eilbeck, Leontis and Bittner) • Computable definitions for 1D and 2D structures (Batchelor)
Ontological Relations for Sequences at DNA, RNA, and protein levels. A report on the joint RNAO and SO meeting held in SLC in April 2008. Karen Eilbeck, Neocles Leontis, Thomas Bittner
Aim of meeting • Coordinate the development of relationships between SO and RNAO
Universals and instances • Universal: a repeatable or recurrent entity that can be instantiated or exemplified by many particular things. • Instance: a universal may have instances, known as its particulars. Each one identifies a single object, such as “that chromosome under that microscope”.
What is a sequence? • A sequence is a universal: the same sequence can be located in many places at the same time. • Manifestation of the sequence happens at the molecular level.
Identifying regions and relations between regions • Category theory. • Morphism: a relationship between some posited domain and codomain. • Isomorphism between DNA and RNA (holds in both directions). • Morphism between RNA and protein (not an isomorphism: information is lost, so the RNA cannot be recovered from the protein). • Morphism between DNA and protein.
Next step 1: core terms and relations http://song.cvs.sourceforge.net/*checkout*/song/ontology/working_draft.obo
Next step 2: even more relationships • Homology and similarity relationships • Topological relationships • Supportive evidence relationships
Next step 3: Description logic • Conversion of core types and relations to formal logic. • A sound foundation to build upon for the features and other types in RNAO and SO
People • Karen Eilbeck (SO), University of Utah, keilbeck@genetics.utah.edu • Neocles Leontis (RNAO), BGSU, leontis@bgnet.bgsu.edu • Thomas Bittner (OBO), Buffalo, bittner3@buffalo.edu • Colin Batchelor (relations in SO), RSC, BatchelorC@rsc.org
Computable definitions Colin Batchelor
Computable definitions These consist of necessary and sufficient conditions, generally written in OBO or OWL format. Example from SO: any primary transcript that is adjacent to a cap must be a capped_primary_transcript, and conversely all capped_primary_transcripts are primary transcripts that are adjacent to caps.
id: SO:0000861
name: capped_primary_transcript
def: "A primary transcript that is capped." [SO:xp]
intersection_of: SO:0000185 ! primary_transcript
intersection_of: adjacent_to SO:0000581 ! cap
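To make the "necessary and sufficient" reading concrete, here is a minimal Python sketch of how the two intersection_of lines could be checked against annotated features. The Feature class and the is_capped_primary_transcript function are hypothetical names invented for illustration; they are not part of SO, OBO or any reasoner, and a real reasoner would work on the ontology itself rather than on a hand-written rule.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory feature model (an assumption, not an SO or OBO API).
@dataclass
class Feature:
    types: set                                        # asserted type names
    adjacent_to: list = field(default_factory=list)   # neighbouring features

def is_capped_primary_transcript(feature):
    """True exactly when both intersection_of conditions hold:
    the feature is a primary_transcript AND it is adjacent_to some cap."""
    return ("primary_transcript" in feature.types
            and any("cap" in n.types for n in feature.adjacent_to))

cap = Feature(types={"cap"})
capped = Feature(types={"primary_transcript"}, adjacent_to=[cap])
uncapped = Feature(types={"primary_transcript"})

print(is_capped_primary_transcript(capped))    # True
print(is_capped_primary_transcript(uncapped))  # False
```

Because the conditions are both necessary and sufficient, the check can be used in either direction: to classify an unlabelled feature, or to flag a feature labelled capped_primary_transcript that has no adjacent cap.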
What does this buy us? It makes ontology maintenance easier for the curators. But most importantly: With computable definitions, reasoners can in principle annotate automatically…
Loops (1) Consider an example 1D sequence: ......(((((....((....))..))).))... The definition of a tetraloop could look like this: tetraloop = "...." that (adjacent_to "(") and (adjacent_to ")") Much like the capture group in the regex \((\.{4})\)
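The regex on this slide can be run directly. The sketch below applies it to the example sequence, assuming the ellipsis characters in the slide stand for runs of plain dots; find_tetraloops is a hypothetical helper name.

```python
import re

# Example structure from the slide, with ellipses written out as dots (an assumption).
structure = "......(((((....((....))..))).))..."

# Capture exactly four unpaired positions flanked by "(" and ")".
TETRALOOP = re.compile(r"\((\.{4})\)")

def find_tetraloops(dot_bracket):
    """Return the 0-based (start, end) span of each tetraloop; end is exclusive."""
    return [m.span(1) for m in TETRALOOP.finditer(dot_bracket)]

print(find_tetraloops(structure))  # [(17, 21)]
```

The first run of four dots (positions 11 to 14) is not reported, because it is followed by "(" rather than ")" and so fails the second adjacent_to condition.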
Loops (2): including cardinality • loop = ".+" that adjacent_to "(" and adjacent_to ")" • diloop = loop that has_part "." cardinality exactly 2 • triloop = loop that has_part "." cardinality exactly 3
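As a rough illustration of the cardinality restriction, the following Python sketch finds every loop with the general pattern and then names it by the number of unpaired positions. The function name classify_loops and the NAMES table are assumptions made for this example.

```python
import re

# General loop: one or more unpaired positions flanked by "(" and ")".
LOOP = re.compile(r"\((\.+)\)")

# "cardinality exactly k" becomes a lookup on the captured length.
NAMES = {2: "diloop", 3: "triloop", 4: "tetraloop"}

def classify_loops(dot_bracket):
    """Return (start_index, name) for each loop; unnamed sizes fall back to 'loop'."""
    return [(m.start(1), NAMES.get(len(m.group(1)), "loop"))
            for m in LOOP.finditer(dot_bracket)]

print(classify_loops("((..))"))     # [(2, 'diloop')]
print(classify_loops("(((...)))"))  # [(3, 'triloop')]
```

This works because each named loop fixes its cardinality to a constant, which is exactly what both regular expressions and OWL cardinality restrictions can express.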
Loops (3): stem-loops Assume no kinks or bulges or pseudoknots. Take a simple example: ((((..)))) “424 stemloop” = sequence that has_part (“(“ cardinality exactly 4 that adjacent_to diloop) and has_part (diloop adjacent_to “(“) and has_part (diloop adjacent_to “)”) and has_part (“)” cardinality exactly 4 that adjacent_to diloop) But what about the general case?
Loops (4): stem-loops and formal grammar This: \({n}\.+\){n} is not a valid regular expression, because both occurrences of n must be the same unknown number. It reduces to the language a^n b^n, which is well known to be non-regular. Likewise, in OWL you cannot say "cardinality exactly n" for a variable n. So what do we do?
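The contrast can be sketched in Python: a fixed-size case such as the "424 stemloop" fits in a regular expression, while the general case needs a check that counts. The function stem_length is a hypothetical helper, and it only covers the simple case assumed earlier (no kinks, bulges or pseudoknots).

```python
import re

# Fixed n is fine: four "(", two ".", four ")".
STEMLOOP_424 = re.compile(r"\({4}\.{2}\){4}")
print(bool(STEMLOOP_424.fullmatch("((((..))))")))  # True

def stem_length(dot_bracket):
    """Return n if the string is n "(" then one or more "." then n ")", else None.

    Comparing the two counts is the step a regular expression cannot do
    for unbounded n.
    """
    n_open = len(dot_bracket) - len(dot_bracket.lstrip("("))
    rest = dot_bracket[n_open:]
    n_close = len(rest) - len(rest.rstrip(")"))
    loop = rest[:len(rest) - n_close]
    if n_open == 0 or n_open != n_close or not loop or set(loop) != {"."}:
        return None
    return n_open

print(stem_length("((((..))))"))  # 4
print(stem_length("(((..))))"))   # None
```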
A way out Write the necessary and sufficient conditions in terms of the 2D structure. Hence: stem-loop = structure that has_part (base-pair that bound_to base-pair) and has_part (base-pair that bound_to loop) and has_part loop
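A minimal sketch of the same idea outside OWL: first recover the base pairs (the 2D structure) from the dot-bracket string, then test the stem-loop conditions on pairs rather than on character counts. This is a Python stand-in, not a reasoner. It assumes balanced brackets, and it reads "base-pair that bound_to base-pair" as two directly stacked pairs (i, j) and (i+1, j-1), which is an interpretation made for this example. The names base_pairs and stem_loops are hypothetical.

```python
def base_pairs(dot_bracket):
    """Pair each ")" with the most recent unmatched "(" (assumes balanced input)."""
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return sorted(pairs)

def stem_loops(dot_bracket):
    """Return (stem_pairs, loop_span) for each stem-loop, stem listed outer to inner."""
    pairs = set(base_pairs(dot_bracket))
    found = []
    for i, j in sorted(pairs):
        inner = dot_bracket[i + 1:j]
        # base-pair bound_to loop: everything enclosed by the pair is unpaired
        if inner and set(inner) == {"."}:
            stem = [(i, j)]
            # base-pair bound_to base-pair: extend outward over stacked pairs
            while (stem[-1][0] - 1, stem[-1][1] + 1) in pairs:
                stem.append((stem[-1][0] - 1, stem[-1][1] + 1))
            if len(stem) >= 2:
                found.append((stem[::-1], (i + 1, j)))
    return found

print(stem_loops("((((..))))"))
# [([(0, 9), (1, 8), (2, 7), (3, 6)], (4, 6))]
```

No count n appears anywhere in the conditions, so the definition covers stems of any length without needing "cardinality exactly n".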
What next? Write necessary and sufficient conditions for some example motifs. Take 2D structures in RNAML that contain known example motifs. Convert RNAML to OWL. Run reasoner.