11/2/05 RNA Structure Prediction
D Dobbs ISU - BCB 444/544X: RNA Structure Prediction

Announcements
Seminar 12:10 PM Fri: BCB Faculty Seminar in E164 Lago
How to do sequence alignments on parallel computers
Srinivas Aluru, ECprE & Chair, BCB Program
http://www.bcb.iastate.edu/courses/BCB691F2005.html

Announcements
BCB 544 Projects - Important Dates:
• Nov 2 Wed noon - Project proposals due to David/Drena
• Nov 4 Fri 10A - Approvals/responses to students
• Dec 2 Fri noon - Written project reports due
• Dec 5,7,8,9 class/lab - Oral Presentations (20')
• (Dec 15 Thurs = Final Exam)

RNA Structure & Function Prediction
Mon:
• Review - promoter prediction
• RNA structure & function
Wed:
• RNA structure prediction
• 2° & 3° structure prediction
• miRNA & target prediction - perhaps..
RNA function prediction? Won't have time to cover this…

Reading Assignment (for Mon/Wed)
• Mount Bioinformatics
• Chp 8 Prediction of RNA Secondary Structure
• pp. 327-355
• Check Errata: http://www.bioinformaticsonline.org/help/errata2.html
• Cates (Online) RNA Secondary Structure Prediction Module
• http://cnx.rice.edu/content/m11065/latest/

Review of last lecture: RNA Structure & Function

RNA Structure & Function
• RNA structure
  • Levels of organization
  • Energetics (more about this on Wed)
• RNA types & functions
  • Genomic information storage/transfer
  • Structural
  • Catalytic
  • Regulatory

Covalent & non-covalent bonds in RNA
• Primary:
  • Covalent bonds
• Secondary/Tertiary:
  • Non-covalent bonds
  • H-bonds (base-pairing)
  • Base stacking
Fig 6.2 Baxevanis & Ouellette 2005

Base-pairing in RNA
• G-C, A-U, G-U ("wobble") & variants
  • U can form base-pairs with both A & G
• Nucleotides in RNA are frequently modified
  • this is not very common in DNA
• These features & the flexible "single-stranded" RNA backbone allow for many potential base-pairs
Modified bases are especially important in tRNA: e.g., pseudo-Uridine, rD, 5-CH3-C, 6-isopentenyl-A, 7-CH3-G, many others…
See: IMB Image Library of Biological Molecules

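The pairing rules on this slide can be written as a small lookup. This is only a sketch: the names `ALLOWED_PAIRS`, `can_pair` and `partners` are made up for illustration, and modified bases and other variants are not modeled.

```python
# Canonical Watson-Crick pairs plus the G-U wobble pair named on the slide.
# Modified bases and other variants are deliberately left out of this sketch.
ALLOWED_PAIRS = {
    ("G", "C"), ("C", "G"),
    ("A", "U"), ("U", "A"),
    ("G", "U"), ("U", "G"),
}


def can_pair(x: str, y: str) -> bool:
    """Return True if bases x and y can form a G-C, A-U or G-U pair."""
    return (x.upper(), y.upper()) in ALLOWED_PAIRS


def partners(base: str) -> list[str]:
    """List the bases that can pair with `base`, in alphabetical order."""
    return sorted(b for b in "ACGU" if can_pair(base, b))


if __name__ == "__main__":
    for base in "ACGU":
        print(base, "pairs with", partners(base))
```

Because U (and G) each have two possible partners, even a short sequence has many candidate pairings, which is why the later slides need an efficient search.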
Common structural motifs in RNA
• Helices
• Loops
  • Hairpin
  • Internal
  • Bulge
  • Multibranch
• Pseudoknots
Fig 6.2 Baxevanis & Ouellette 2005

RNA functions
• Storage/transfer of genetic information
• Genomes
  • many viruses have RNA genomes
  • single-stranded (ssRNA), e.g., retroviruses (HIV)
  • double-stranded (dsRNA)
• Transfer of genetic information
  • mRNA = "coding RNA" - encodes proteins

RNA functions
• Structural
  • e.g., rRNA, which is a major structural component of ribosomes
  • BUT - its role is not just structural, also:
• Catalytic
  • RNA in the ribosome has peptidyltransferase activity
  • Enzymatic activity responsible for peptide bond formation between amino acids in the growing peptide chain
  • Also, many small RNAs are enzymes: "ribozymes"
(Gloria Culver, ISU) (W Allen Miller, ISU)

RNA functions
• Regulatory
  • Recently discovered important new roles for RNAs
  • In normal cells:
    • in "defense" - esp. in plants
    • in normal development
    • e.g., siRNAs, miRNA
  • As tools:
    • for gene therapy or to modify gene expression
    • RNAi (used by many at ISU: Diane Bassham, Thomas Baum, Jeff Essner, Kristen Johansen, Jo Anne Powell-Coffman, Roger Wise, etc.)
    • RNA aptamers (Marit Nilsen-Hamilton, ISU)

RNA types & functions
L Samaraweera 2005

Thanks to Chris Burge, MIT, for the following slides
Slightly modified from: Gene Regulation and MicroRNAs
Session introduction presented at ISMB 2005, Detroit, MI
Chris Burge, cburge@MIT.EDU
C Burge 2005

Expression of a Typical Eukaryotic Gene
[Figure: DNA (protein-coding gene with exons & introns) → Transcription, Polyadenylation → primary transcript / pre-mRNA → Splicing → mRNA (AAAAAAAAA tail) → Export → Translation (or mRNA Degradation) → Protein → Folding, Modification, Transport, Complex Assembly → Protein Complex → Degradation]
For each of these processes, there is a ‘code’ (set of default recognition rules)
C Burge 2005

Gene Expression Challenges for Computational Biology
Lots of data: Genomes, structures, transcripts, microarrays, ChIP-Chip, etc.
• Understand the ‘code’ for each step in gene expression (set of default recognition rules), e.g., the ‘splicing code’
• Understand the rules for sequence-specific recognition of nucleic acids by protein and ribonucleoprotein (RNP) factors
• Understand the regulatory events that occur at each step and the biological consequences of regulation
C Burge 2005

Sequence-specific Transcription Factors
• have modular organization
  » Understand DNA-binding specificity
[Figure: ATF-2/c-Jun/IRF-3 DNA complex, Panne et al. EMBO J. 2004]
Yan (ISU): A computational method to identify amino acid residues involved in protein-DNA interactions
C Burge 2005

Early Steps in Pre-mRNA Splicing
• Formation of exon-spanning complex
• Subsequent rearrangement to form intron-spanning spliceosomes, which catalyze intron excision and exon ligation
[Figure label: hnRNP proteins]
Matlin, Clark & Smith Nature Mol Cell Biol 2005
C Burge 2005

Alternative Splicing
> 50% of human genes undergo alternative splicing
Matlin, Clark & Smith Nature Mol Cell Biol 2005
Wang (ISU): Genome-wide Comparative Analysis of Alternative Splicing in Plants
C Burge 2005

Splicing Regulation
ESE/ESS = Exonic Splicing Enhancers/Silencers
ISE/ISS = Intronic Splicing Enhancers/Silencers
Matlin, Clark & Smith Nature Mol Cell Biol 2005
C Burge 2005

C. elegans lin-4 Small Regulatory RNA
[Figure (V. Ambros lab): lin-4 precursor → lin-4 RNA; lin-4 RNA bound to target mRNA → "Translational repression"]
We now know that there are hundreds of microRNA genes (Ambros, Bartel, Carrington, Ruvkun, Tuschl, others)
C Burge 2005

MicroRNA Biogenesis
N. Kim Nature Rev Mol Cell Biol 2005
C Burge 2005

miRNA and RNAi pathways
[Figure, two parallel pathways]
• microRNA pathway: MicroRNA primary transcript → Drosha → precursor → Dicer → miRNA → RISC → target mRNA: "translational repression" and/or mRNA degradation
• RNAi pathway: Exogenous dsRNA, transposon, etc. → Dicer → siRNAs → RISC → mRNA cleavage, degradation
C Burge 2005

miRNA Challenges for Computational Biology
• Find the genes encoding microRNAs
• Predict their regulatory targets
• Integrate miRNAs into gene regulatory pathways & networks
Computational Prediction of MicroRNA Genes & Targets
Need to modify traditional paradigm of "transcriptional control" primarily by protein-DNA interactions to include miRNA regulatory mechanisms!
C Burge 2005

New Today: RNA Structure Prediction

RNA structure prediction strategies
Secondary structure prediction
1) Energy minimization (thermodynamics)
2) Comparative sequence analysis (co-variation)
3) Combined experimental & computational

Secondary structure prediction strategies
1) Energy minimization (thermodynamics)
• Algorithm: Dynamic programming to find high probability pairs (also, some Genetic algorithms)
• Software:
  • Mfold - Zuker
  • Vienna RNA Package - Hofacker
  • RNAstructure - Mathews
  • Sfold - Ding & Lawrence
R Knight 2005

Secondary structure prediction strategies
2) Comparative sequence analysis (co-variation)
• Algorithm:
  • Mutual information
  • Context-free grammars
• Software:
  • ConStruct
  • Alifold
  • Pfold
  • FOLDALIGN
  • Dynalign
R Knight 2005

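To make "mutual information" concrete, here is a minimal sketch of the standard mutual-information score between two columns of a multiple sequence alignment. Columns whose bases co-vary (e.g., G-C in one sequence, A-U in another) score high; fully conserved columns score zero. The function name and the string-per-column input format are assumptions for illustration, not the interface of any package listed above.

```python
import math
from collections import Counter


def mutual_information(col_i: str, col_j: str) -> float:
    """Mutual information (in bits) between two alignment columns.

    col_i and col_j hold one character per aligned sequence, in the
    same sequence order.
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    freq_i = Counter(col_i)                # single-column base counts
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))   # joint counts of (base_i, base_j)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Toy columns from four aligned sequences: the bases change together.
    print(mutual_information("GCAU", "CGUA"))
    # Fully conserved columns carry no co-variation signal.
    print(mutual_information("GGGG", "CCCC"))
```

Note that a perfectly conserved base pair gives no signal here, which is one reason co-variation methods need a reasonably diverse alignment.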
Secondary structure prediction strategies
3) Combined experimental & computational
• Experiment: Map single-stranded vs double-stranded regions in folded RNA
• How?
  • Enzymes: S1 nuclease, T1 RNase
  • Chemicals: kethoxal, DMS
R Knight 2005

Experimental RNA structure determination?
• X-ray crystallography
• NMR spectroscopy
• Enzymatic/chemical mapping

1) Energy minimization method
What are the assumptions?
• Native tertiary structure or "fold" of an RNA molecule is (one of) its "lowest" free energy configuration(s)
• Gibbs free energy = ΔG in kcal/mol at 37 °C = equilibrium stability of structure
• lower (more negative) values are more favorable
Is this assumption valid?
• in vivo? - this may not hold, but we don't really know

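One way to read "equilibrium stability" is through the standard thermodynamic relation K = exp(-ΔG / RT). The sketch below assumes a simple two-state equilibrium (unfolded vs folded), which is an illustrative assumption and not something stated on the slide; the function name is likewise hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 310.15   # 37 degrees C expressed in kelvin


def equilibrium_constant(delta_g: float, temperature: float = T_KELVIN) -> float:
    """Equilibrium constant for a two-state process with free energy delta_g.

    delta_g is in kcal/mol; negative values favor the folded state (K > 1).
    """
    return math.exp(-delta_g / (R_KCAL * temperature))


if __name__ == "__main__":
    for dg in (0.0, -1.2, -1.6):
        print(f"dG = {dg:+.1f} kcal/mol -> K = {equilibrium_constant(dg):.2f}")
```

Because K depends exponentially on ΔG, differences of a few tenths of a kcal/mol between candidate structures already shift the predicted populations noticeably.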
Free energy minimization
What are the rules?
[Figure: two adjacent base pairs, in two arrangements]
• A=U next to A=U: ΔG = -1.2 kcal/mole
• A=U next to U=A: ΔG = -1.6 kcal/mole
Why 1.2 vs 1.6? What gives here?
C Staben 2005

Energy minimization calculations: Base-stacking is critical - Tinoco et al.
C Staben 2005

Nearest-neighbor parameters
Most methods for free energy minimization use nearest-neighbor parameters (derived from experiment) for predicting the stability of an RNA secondary structure (in terms of ΔG at 37 °C), & most available software packages use the same set of parameters: Mathews, Sabina, Zuker & Turner, 1999

Energy minimization - calculations:
Total free energy of a specific conformation for a specific RNA molecule = sum of incremental energy terms for:
• helical stacking (sequence dependent)
• loop initiation
• unpaired stacking
(favorable "increments" are < 0)
Fig 6.3 Baxevanis & Ouellette 2005

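The additive model above amounts to a plain sum over the three kinds of increments. In this sketch the function name is hypothetical, the two stacking values are the ones quoted on the earlier base-pair slide, and the loop-initiation and unpaired-stacking numbers are placeholders chosen only to show the sign convention; they are not published parameters.

```python
def total_free_energy(stacking, loop_initiation, unpaired_stacking):
    """Sum incremental energy terms (kcal/mol) for one conformation.

    Each argument is a list of increments; favorable increments are < 0.
    """
    return sum(stacking) + sum(loop_initiation) + sum(unpaired_stacking)


if __name__ == "__main__":
    dg = total_free_energy(
        stacking=[-1.2, -1.6],       # stacking values quoted on the base-pair slide
        loop_initiation=[4.0],       # HYPOTHETICAL placeholder: loops cost energy
        unpaired_stacking=[-0.3],    # HYPOTHETICAL placeholder
    )
    print(f"total dG = {dg:.1f} kcal/mol")
```

Real programs look each increment up in the nearest-neighbor parameter tables rather than taking them as arguments.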
But how many possible conformations for a single RNA molecule?
Huge number: Zuker estimates (1.8)^N possible secondary structures for a sequence of N nucleotides
• for 100 nts (small RNA…) = 3 × 10^25 structures!
Solution? Not exhaustive enumeration…
• Dynamic programming
  • O(N^3) in time, O(N^2) in space/storage, iff pseudoknots excluded
  • otherwise: O(N^6) time, O(N^4) space

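A compact way to see where O(N^3) time and O(N^2) space come from is the classic Nussinov-style recursion. Note the substitution: this sketch maximizes the number of base pairs rather than minimizing free energy, so it is a simplified stand-in for what Mfold-type programs do, not their algorithm. Pseudoknots are excluded by construction, and `min_loop` (minimum unpaired bases in a hairpin) is an assumed parameter.

```python
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested (pseudoknot-free) base pairs in seq."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = max pairs within seq[i..j]; the table is the O(N^2) storage.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):          # fill by increasing length
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]               # case 1: base j is unpaired
            for k in range(i, j - min_loop):     # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    inside = best[k + 1][j - 1]
                    score = max(score, left + 1 + inside)
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAAUCC"))
    # Size of the search space that exhaustive enumeration would face:
    print(f"{1.8 ** 100:.1e} structures estimated for N = 100")
```

The three nested loops over `span`, `i` and `k` give the cubic running time; replacing the "+ 1" per pair with nearest-neighbor energy increments is, roughly, the step from this toy to an energy-minimization program.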
2) Comparative sequence analysis (co-variation)
Two basic approaches:
• Algorithms constrained by initial alignment
  • Much faster, but not as robust as unconstrained
  • Base-pairing probabilities determined by a partition function
• Algorithms not constrained by initial alignment
  • Genetic algorithms often used for finding an alignment & set of structures

RNA Secondary structure prediction: Performance?
• How to evaluate?
• Not many experimentally determined structures
  • currently, ~ 50% are rRNA structures
• so "Gold Standard" (in absence of tertiary structure):
  • compare predicted RNA secondary structure with that determined by comparative sequence analysis (!!??), using Benchmark Datasets
• NOTE: Base-pairs predicted by comparative sequence analysis for large & small subunit rRNAs are 97% accurate when compared with high resolution crystal structures! - Gutell, Pace

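Comparing a predicted structure with a reference one usually comes down to counting shared base pairs. The sketch below assumes both structures are given in dot-bracket notation (the format used by the Vienna RNA Package) and reports two common ratios; the function names are hypothetical, and the slides do not say which exact accuracy measure underlies the quoted percentages.

```python
def pairs_from_dot_bracket(structure: str) -> set[tuple[int, int]]:
    """Convert a dot-bracket string into a set of (i, j) base-pair indices."""
    stack: list[int] = []
    pairs: set[tuple[int, int]] = set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' in structure")
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def pair_accuracy(predicted: str, reference: str) -> tuple[float, float]:
    """Return (sensitivity, PPV) of predicted base pairs vs the reference.

    sensitivity = fraction of reference pairs that were predicted
    PPV         = fraction of predicted pairs that are in the reference
    """
    pred = pairs_from_dot_bracket(predicted)
    ref = pairs_from_dot_bracket(reference)
    correct = len(pred & ref)
    sensitivity = correct / len(ref) if ref else 0.0
    ppv = correct / len(pred) if pred else 0.0
    return sensitivity, ppv


if __name__ == "__main__":
    sens, ppv = pair_accuracy(predicted="(.....)", reference="((...))")
    print(f"sensitivity = {sens:.2f}, PPV = {ppv:.2f}")
```

Reporting both numbers matters: a prediction with very few pairs can have a high PPV while still missing most of the reference structure.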
RNA Secondary structure prediction: Performance?
1) Energy minimization (via dynamic programming)
• 73% avg. prediction accuracy - single sequence
2) Comparative sequence analysis
• 97% avg. prediction accuracy - multiple sequences (e.g., highly conserved rRNAs)
• much lower if sequence conservation is lower &/or fewer sequences are available for alignment
3) Combined - recent developments:
• combine thermodynamics & co-variation & experimental constraints? IMPROVED RESULTS

RNA structure prediction strategies
Tertiary structure prediction
Requires "craft" & significant user input & insight
• Extensive comparative sequence analysis to predict tertiary contacts (co-variation), e.g., MANIP - Westhof
• Use experimental data to constrain model building, e.g., MC-SYM - Major
• Homology modeling using sequence alignment & reference tertiary structure (not many of these!)
• Low resolution molecular mechanics, e.g., yammp - Harvey