RNA Secondary Structure Prediction
16S rRNA
RNA Secondary Structure
[Figure labels: Pseudoknot, Dangling end, Single-stranded, Interior loop, Bulge, Junction (multiloop), Stem, Hairpin loop. Image: Wuchty]
RNA secondary structure
[Figure: bases G A A A G G with pairs A-U, U-G, C-G, A-U, G-C; labels: Loop, Stem, wobble pair, canonical pair]
Pseudoknots
[Figure: RNA secondary structure representation; panel label: Legitimate structure]
Non-canonical interactions of RNA secondary-structure elements
These patterns are excluded from the prediction schemes because their computation is too intensive.
[Figure labels: Pseudoknot, Kissing hairpins, Hairpin-bulge contact]
“Rules for 2D RNA prediction”
• Base pairs in stems: GOOD
• Additional possible assumptions: e.g., G:C better than A:U
• Bulges, loops: BAD
• Canonical interactions (base pairs, stems, bulges, loops): OK
• Non-canonical interactions (pseudoknots, kissing hairpins): Forbidden
• The more interactions, the better
Predicting RNA Secondary Structure
• Allowed base pairing rules (Watson-Crick A:U, G:C, and wobble pair G:U)
• Sequences may form different structures
• A free energy value is associated with each possible structure
• Predict the structure with the minimal free energy (MFE)
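The pairing rules above can be written down as a small lookup. This is only an illustrative sketch: the names `ALLOWED_PAIRS`, `can_pair` and `candidate_pairs` are hypothetical, and the minimum hairpin loop size of 3 unpaired bases is an assumption that does not appear on the slide.

```python
# Allowed pairs from the slide: Watson-Crick A:U and G:C, plus the wobble pair G:U.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def can_pair(x, y):
    """Return True if bases x and y may pair under the rules above."""
    return (x.upper(), y.upper()) in ALLOWED_PAIRS


def candidate_pairs(seq, min_loop=3):
    """List index pairs (i, j), i < j, whose bases may pair.

    min_loop is an assumed minimum number of unpaired bases between i and j.
    """
    n = len(seq)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + min_loop + 1, n)
        if can_pair(seq[i], seq[j])
    ]
```

For the five-base sequence `GAAAC`, only the first and last bases are far enough apart to pair, so `candidate_pairs("GAAAC")` yields the single pair `(0, 4)`.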
Simplifying Assumptions for Structure Prediction
• RNA folds into one minimum free-energy structure.
• There are no non-canonical interactions.
• The energy of a particular base pair in a double-stranded region is sequence independent.
• Neighbors have no influence.
Under these assumptions the problem was solved by dynamic programming (Zuker and Stiegler, 1981).
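To give a feel for how dynamic programming applies here, the sketch below maximizes the number of allowed base pairs, in the style of the Nussinov recursion. It is a simplified stand-in and not the energy-based algorithm of Zuker and Stiegler: every pair scores the same, neighbors have no influence, and pseudoknots are excluded by construction. The function name and the `min_loop=3` default are assumptions made for illustration.

```python
# Watson-Crick pairs plus the G:U wobble pair.
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov-style DP).

    best[i][j] holds the optimum for the subsequence seq[i..j].
    min_loop is the assumed minimum number of unpaired bases in a hairpin loop.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    # Fill the table by increasing subsequence length.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: base j stays unpaired.
            score = best[i][j - 1]
            # Case 2: base j pairs with some k, splitting the problem in two.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in ALLOWED:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]
```

The table has N×N entries and each entry scans up to N split points, so the running time grows as N³ rather than exponentially with sequence length.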
Sequence-dependent free energy (the nearest-neighbor model)
[Figure: two hairpin drawings that differ in one base pair (U-A versus G-C)]
Example values (stacked pairs: free energy): GC/AU -2.3, GC/GC -2.9, GC/CG -3.4, GC/UA -2.1
Free energy computation
[Figure: hairpin with a 4 nt loop and a 1 nt bulge, annotated with its energy contributions]
Contributions: +5.9 (4 nt loop), -1.1 (mismatch of hairpin), -2.9 (stacking), +3.3 (1 nt bulge), -2.9 (stacking), -1.8 (stacking), -0.9 (stacking), -1.8 (stacking), -2.1 (stacking), -0.3 (5’ dangling)
ΔG = -4.6 kcal/mol
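The total on this slide is simply the sum of the individual loop, bulge, stacking and dangling-end terms. The short sketch below repeats that arithmetic; the labels attached to each value follow the slide as far as the extracted text allows, and assigning -0.3 to the 5’ dangling end is an assumption about the original figure. The function name is hypothetical.

```python
# Energy contributions in kcal/mol, as listed on the slide.
CONTRIBUTIONS = [
    ("hairpin loop (4 nt)", +5.9),
    ("mismatch of hairpin", -1.1),
    ("stacking", -2.9),
    ("bulge (1 nt)", +3.3),
    ("stacking", -2.9),
    ("stacking", -1.8),
    ("stacking", -0.9),
    ("stacking", -1.8),
    ("stacking", -2.1),
    ("5' dangling end", -0.3),  # assumed label for the remaining -0.3 term
]


def total_free_energy(contributions):
    """Sum the contributions and round to one decimal place."""
    return round(sum(value for _, value in contributions), 1)


if __name__ == "__main__":
    print(f"Total: {total_free_energy(CONTRIBUTIONS)} kcal/mol")
```

The destabilizing terms (loop and bulge, +9.2 in total) are outweighed by the stabilizing stacking, mismatch and dangling-end terms (-13.8 in total), which gives the -4.6 kcal/mol shown.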
Prediction Programs
• Mfold http://www.bioinfo.rpi.edu/applications/mfold/old/rna/form1.cgi
• Vienna RNA Secondary Structure Prediction http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi
Mfold - Suboptimal Folding
• For any sequence of N nucleotides, the expected number of structures is greater than 1.8^N
• A sequence of 100 nucleotides has ~3×10^25 possible folds. If a computer can calculate 1000 folds/second, it would take 10^15 years (age of universe = ~10^10 years)!
• Mfold generates suboptimal folds whose free energies fall within a certain range of values. Many of these structures differ in trivial ways. These suboptimal folds can still be useful for designing experiments.
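The numbers in the second bullet can be checked with a few lines of arithmetic. This sketch takes the 1.8^N bound and the rate of 1000 folds per second from the slide; the function names and the 365.25-day year are assumptions made for the calculation.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed length of a year


def estimated_folds(n, base=1.8):
    """Lower bound on the number of structures for n nucleotides (base**n)."""
    return base ** n


def years_to_enumerate(n, folds_per_second=1000.0, base=1.8):
    """Years needed to evaluate every fold at the given rate."""
    seconds = estimated_folds(n, base) / folds_per_second
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"folds for N=100: {estimated_folds(100):.2e}")
    print(f"years at 1000 folds/s: {years_to_enumerate(100):.2e}")
```

For N = 100 this gives a little over 3×10^25 folds and roughly 10^15 years, in line with the figures on the slide.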