Finding the building blocks of RNA 3-D structure using graph analysis. Romain Rivière, AReNa, 28.03.2007.
Interest in RNA modelling • Characterise RNA families • Improve non-coding RNA identification in genomic data • Determine the RNA players in regulatory networks • Identify potential RNA drug targets
Project • Background: ribonomics is currently stuck with the secondary structure paradigm, whereas we would need high-throughput tertiary data • Hypothesis: finding the fundamental building blocks of RNA structures will reduce the complexity of RNA folding
From 3-D Structure to Graph: MC-Annotate • Yeast tRNA-Phe crystal structure (PDB 4TNA)
Solution in 3 steps • Enumerate all the motifs • Regroup by similarity • Find the building blocks
Enumerating motifs • A motif is a set of connected nucleotides together with their interactions
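The definition above can be sketched in code. This is a minimal illustration only, assuming the structure is given as an adjacency dictionary (nucleotide number mapped to the set of nucleotides it interacts with); the names `is_connected`, `enumerate_motifs` and the toy graph are hypothetical and not taken from the talk. It tests every subset of size k, which is only practical for small graphs and small k.

```python
from itertools import combinations


def is_connected(nodes, adj):
    """Return True if `nodes` induce a connected subgraph of `adj`."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def enumerate_motifs(adj, k):
    """Yield (nucleotides, interactions) for every connected set of k nucleotides."""
    for nodes in combinations(sorted(adj), k):
        if is_connected(nodes, adj):
            # Keep each undirected interaction once, as an ordered pair (u < v).
            edges = {(u, v) for u in nodes for v in adj[u] if v in nodes and u < v}
            yield frozenset(nodes), frozenset(edges)


# Hypothetical toy graph: backbone 1-2-3-4 closed by one extra interaction 1-4.
toy = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
motifs = list(enumerate_motifs(toy, 3))
```

On this toy graph every set of three nucleotides is connected, so four motifs of size 3 are produced.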
Solution in 3 steps • Enumerate all the motifs • Regroup by similarity • Find the building blocks
Graph Isomorphism • Scan through all permutations to decide whether two graphs are isomorphic! • Matrix representation of the graph (figure: labelled 5-node graphs with their adjacency matrices, marked = or ≠)
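The permutation scan described on this slide can be written out as a brute-force check. This is a sketch under the assumption that graphs are given as square 0/1 adjacency matrices (lists of lists); the function name and the three small example graphs are illustrative, not the matrices from the slide. The loop visits up to n! permutations, so it is only usable for very small motifs.

```python
from itertools import permutations


def are_isomorphic(a, b):
    """Brute force: try every relabelling of graph `a` onto graph `b`."""
    n = len(a)
    if n != len(b):
        return False
    for perm in permutations(range(n)):
        # perm[i] is the vertex of b that vertex i of a is mapped to.
        if all(a[i][j] == b[perm[i]][perm[j]] for i in range(n) for j in range(n)):
            return True
    return False


# Illustrative graphs on 3 vertices.
path_a = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]    # path centred on vertex 1
path_b = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]    # same path, centred on vertex 0
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

same = are_isomorphic(path_a, path_b)
different = are_isomorphic(path_a, triangle)
```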
Group motifs with isomorphism
Canonical labelling • Take the minimum through all permutations of the matrix representation • Matrix representation of the graph (figure: labelled 5-node graphs with their adjacency matrices, marked = or ≠)
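Taking "the minimum through all permutations" can be made concrete as follows. This sketch assumes a 0/1 adjacency matrix and compares permuted matrices lexicographically after flattening them row by row; that ordering is one possible choice, and the slide does not say which ordering was actually used. The function name is hypothetical.

```python
from itertools import permutations


def canonical_label(adj):
    """Smallest row-major flattening of `adj` over all vertex permutations."""
    n = len(adj)
    return min(
        tuple(adj[perm[i]][perm[j]] for i in range(n) for j in range(n))
        for perm in permutations(range(n))
    )


# Two labellings of the same 3-vertex path, and a triangle.
path_a = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
path_b = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

label_a = canonical_label(path_a)
label_b = canonical_label(path_b)
label_t = canonical_label(triangle)
```

Both labellings of the path receive the same label, while the triangle receives a different one, so a single comparison of labels replaces a pairwise permutation scan.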
Group motifs with canonical labelling (figure: motifs annotated with the labels 18, 21, 18, 33, 21, 21)
Solution in 3 steps • Enumerate all the motifs • Regroup by similarity • Find the building blocks
The covering graph: a mapping between motifs and edges • Types of motifs on one side, edges on the other • Find a small set of types of motifs that covers the most edges (figure: graph linking types of motifs to edges)
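One simple way to look for a small set of motif types covering many edges is the greedy heuristic for maximum coverage, sketched below. This is an assumption for illustration: the slide does not state which selection method was used, and greedy selection is not guaranteed to be optimal. The dictionary `motif_edges` (motif type mapped to the set of edges it covers), the type names and the edge numbers are all hypothetical.

```python
def greedy_cover(motif_edges, max_types):
    """Repeatedly pick the motif type that covers the most still-uncovered edges."""
    covered = set()
    chosen = []
    for _ in range(max_types):
        best = max(
            motif_edges,
            key=lambda t: len(motif_edges[t] - covered),
            default=None,
        )
        # Stop when there are no types, or no type adds a new edge.
        if best is None or not (motif_edges[best] - covered):
            break
        chosen.append(best)
        covered |= motif_edges[best]
    return chosen, covered


# Hypothetical covering graph: four motif types over six edges.
toy_cover = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {5, 6}, "D": {1}}
chosen, covered = greedy_cover(toy_cover, max_types=2)
```

Here type A is picked first (four edges), then C, because it adds two new edges where B would add only one.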
Building blocks of the 50S? (graphs of size 4) • General graphs are not compact enough • Not usable for modelling …
Future work • Motif discovery • Biological relevance of block-function relationships? • RNA folding • Practical usability?
Acknowledgments • François Major • Sébastien Lemieux • Véronique Lisi • Karine St-Onge • Philippe Thibault • Patrick Gendron • Martin Larose • All other lab members
The Results • We applied the method to the large ribosomal subunit • We restricted the graphs allowed in the base to cycles • We found 334 cycles that cover 90% of the structure.
Canonical labelling • Canonical label of a graph: take the minimum of the matrices over all the permutations • Property: two graphs have the same canonical label if and only if they are isomorphic
Second step: group motifs • Group together motifs which are identical • Done with canonical labelling • Idea: associate a string to each graph such that two graphs are associated with the same string if and only if they are identical (isomorphic) • Difficult problem, well studied • Potentially high computation time
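The string idea above lends itself to grouping with a dictionary keyed by the canonical string. The sketch below assumes 0/1 adjacency matrices and uses a brute-force minimum over permutations as the string, which reflects the "potentially high computation time" noted on the slide; the function names and the motif names `m1` to `m3` are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def canonical_string(adj):
    """String of 0/1 digits that is identical for isomorphic graphs."""
    n = len(adj)
    return min(
        "".join(str(adj[p[i]][p[j]]) for i in range(n) for j in range(n))
        for p in permutations(range(n))
    )


def group_motifs(motifs):
    """Map each canonical string to the names of the motifs that share it."""
    groups = defaultdict(list)
    for name, adj in motifs.items():
        groups[canonical_string(adj)].append(name)
    return dict(groups)


# Hypothetical motifs: two labellings of a path and one triangle.
example = {
    "m1": [[0, 1, 0], [1, 0, 1], [0, 1, 0]],
    "m2": [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
    "m3": [[0, 1, 1], [1, 0, 1], [1, 1, 0]],
}
groups = group_motifs(example)
```

With this keying, grouping N motifs needs N canonical strings rather than a pairwise isomorphism test for every pair of motifs.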
Conformational space too large • 3^n · 10^n, where n is the size of the structure • 10^14 op/s: world's fastest computer
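The two figures on this slide can be combined into a rough time estimate. The sketch below makes a loud assumption that is not on the slide: visiting one conformation costs exactly one operation. Real evaluation would cost far more, so this is a lower bound on the order of magnitude rather than a prediction.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def conformations(n):
    """Size of the conformational space from the slide: 3^n * 10^n (= 30^n)."""
    return 3**n * 10**n


def years_to_enumerate(n, ops_per_second=10**14):
    """Years needed at `ops_per_second`, assuming one operation per conformation."""
    return conformations(n) / ops_per_second / SECONDS_PER_YEAR


estimate_20 = years_to_enumerate(20)
```

Under that assumption, a structure of only n = 20 already requires on the order of 10^8 years at 10^14 op/s.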
Interest in RNA • RNA is a very important medium in the transfer of genetic information • It conveys information through its structure in addition to its sequence • Example: miRNA
Structure of an miRNA • Similar secondary structure • But only one miRNA is functional!
RNA 3-D structure interactions • 3 main types of molecular interactions: • Phosphodiester link • Base pairing • Base stacking
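These three interaction types can serve as edge labels when a structure is turned into a graph. The sketch below is an assumed in-memory representation for illustration only; it is not the output format of MC-Annotate, and it simplifies every interaction to an undirected edge even though a phosphodiester link has a 5' to 3' direction. All names and the toy interaction list are hypothetical.

```python
from collections import defaultdict

INTERACTION_TYPES = {"phosphodiester", "pairing", "stacking"}


def build_graph(interactions):
    """Build an undirected graph: graph[i][j] is the set of interaction types."""
    graph = defaultdict(dict)
    for i, j, kind in interactions:
        if kind not in INTERACTION_TYPES:
            raise ValueError(f"unknown interaction type: {kind}")
        graph[i].setdefault(j, set()).add(kind)
        graph[j].setdefault(i, set()).add(kind)
    return dict(graph)


# Hypothetical toy annotation for three nucleotides.
toy_interactions = [
    (1, 2, "phosphodiester"),
    (2, 3, "phosphodiester"),
    (1, 3, "pairing"),
    (1, 2, "stacking"),
]
graph = build_graph(toy_interactions)
```

Two nucleotides can share more than one interaction, for example adjacent nucleotides that are both linked and stacked, which is why each edge carries a set of labels.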