mRNA secondary structure optimization using a correlated stem-loop prediction
Nucleic Acids Res. 2013 Apr 1;41(6):e73. doi: 10.1093/nar/gks1473. Epub 2013 Jan 15.
Presented by WONG Pak Kan
Agenda • Background • Motivation • Methodology • Results • Discussion
Central dogma of molecular biology Image: http://www.nature.com/scitable/topicpage/Translation-DNA-to-mRNA-to-Protein-393 http://www.britannica.com/EBchecked/topic/377106/messenger-RNA-mRNA
Secondary Structure of mRNA
• Affects the translation of mRNA
• Riboswitches directly bind small molecules, changing their fold to modify levels of transcription or translation.
Base pairs: G-C: 3 H-bonds; A-U: 2 H-bonds; G-U: 2 H-bonds
Image: http://www.nature.com/nrm/journal/v5/n12/fig_tab/nrm1528_F1.html
Motivation: Secondary Structure and Translation
• Negative correlation between the strength of the structures and the ease with which the mRNA associates with the ribosome.
  • Studer SM, Joseph S. Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell 2006;22:105-115.
• Sequences without secondary structures associated faster with the 30S ribosomal subunit; stable initiation complexes determine the translation efficiency.
  • Studer SM, Joseph S. Unfolding of mRNA secondary structure by the bacterial translation initiation complex. Mol. Cell 2006;22:105-115.
• Silent mutations that expose the start codon from secondary structures improved translation and heterologous expression of two proteins (human interleukin-10 and human interferon-α) by 10-fold.
  • Zhang W, Xiao W, Wei H, Zhang J, Tian Z. mRNA secondary structure at start AUG codon is a key limiting factor for human protein expression in Escherichia coli. Biochem. Biophys. Res. Commun. 2006;349:69-78.
• …
What is the best mRNA structure to accelerate the translation process?
What is the best mRNA structure to accelerate the translation process?
• No secondary structure implies no internal base pairs, for example:
  • AAAAAAAAAA…
  • ACACAC…
  • GAAGGGAAA…
  • UUUUCCCCCCUUU…
• How about an mRNA sequence that maintains the polypeptide primary structure?
Base pairs: G-C: 3 H-bonds; A-U: 2 H-bonds; G-U: 2 H-bonds
mRNA sequence that maintains the polypeptide primary structure
• Synonymous substitutions using the codon table
• Example: CU{U,C,A,G} all code for leucine.
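Synonymous substitution can be sketched in a few lines of Python. This is only an illustration, assuming the standard genetic code; the helper names `translate` and `synonymous_codons` are hypothetical and do not come from the paper or its software.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ('*' marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate an RNA coding sequence into one-letter amino acid codes (hypothetical helper)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def synonymous_codons(codon):
    """Return every codon that encodes the same amino acid as `codon`, sorted alphabetically."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    print(synonymous_codons("CUU"))        # the leucine codons, including CU{U,C,A,G}
    print(translate("AGGAAACGGUAUAAU"))   # protein encoded by an example sequence from the slides
```

Any codon returned by `synonymous_codons` can replace the original without changing the translated protein, which is the constraint the optimizer has to respect.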
mRNA-optimizer • Objective: Avoiding stable secondary structures in mRNA molecules by maximizing the minimum free energy (MFE) while maintaining polypeptide primary structure.
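Written as a constrained optimization, the objective reads roughly as follows. The symbols $s$ (an mRNA coding sequence), $p$ (the target polypeptide) and $\mathcal{S}(p)$ (the set of synonymous sequences) are notation assumed here for illustration, not taken from the paper.

$$
s^{*} = \arg\max_{s \in \mathcal{S}(p)} \mathrm{MFE}(s), \qquad \mathcal{S}(p) = \{\, s : \mathrm{translate}(s) = p \,\}
$$

Since the MFE of a folded structure is at most zero, maximizing it pushes the sequence toward weaker, less stable secondary structures.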
Overview
• Metaheuristic approach to explore the space of possible synonymous codon sequences
• A fast algorithm to calculate a metric that is linearly dependent on the MFE
Synonymous Gene Exploration
• Simulated annealing approach [1]
• Previous successes:
  • Genetic algorithms combined with a Pareto archive were used for the gene synthetic redesign problem [2]
  • Eugene [3]
[1] Kirkpatrick S, Gelatt CD Jr., Vecchi MP. Optimization by simulated annealing. Science 1983;220:671-680.
[2] Oliveira J, Gaspar P. Advantages of a pareto-based genetic algorithm to solve the gene synthetic design problem. Curr. Bioinformatics 2012;7:304-309.
[3] Gaspar P, Oliveira JL, Frommlet J, Santos MA, Moura G. Eugene: maximizing synthetic gene design for heterologous expression. Bioinformatics 2012;28:2683-2684.
Coding sequences (3’-5’), e.g. AGGAAACGGUAUAAU
• Create the initial population
• Pick a coding sequence, e.g. AGGAAACGGUACAAC
• Select codons randomly
• Change them to synonymous ones, e.g. AGGAAACGAUAUAAU or AGGAAACGGUACAAC
• Pick the sequence with a probability that depends on:
  • the energy value of the current sequence
  • the energy value of the new sequence
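The loop above can be sketched as a small simulated annealing routine. This is a minimal sketch under stated assumptions, not the authors' implementation: the acceptance rule is the standard Metropolis criterion (the slide's own formula is not reproduced here), the geometric cooling schedule is arbitrary, and `toy_energy` is a deliberately crude stand-in for an MFE estimate. The names `anneal`, `toy_energy` and `translate` are hypothetical.

```python
import math
import random
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate an RNA coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def toy_energy(seq):
    """Stand-in score (NOT a real MFE): minus the number of G and C nucleotides."""
    return -float(seq.count("G") + seq.count("C"))


def anneal(seq, energy, iterations=4000, t_start=1.0, t_end=0.01, seed=0):
    """Maximize `energy` over synonymous sequences with a Metropolis acceptance rule (assumed)."""
    rng = random.Random(seed)
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    current, e_cur = seq, energy(seq)
    best, e_best = current, e_cur
    for step in range(iterations):
        # Geometric cooling from t_start down to t_end (an arbitrary choice).
        temp = t_start * (t_end / t_start) ** (step / max(1, iterations - 1))
        i = rng.randrange(len(codons))
        options = [c for c, aa in CODON_TABLE.items()
                   if aa == CODON_TABLE[codons[i]] and c != codons[i]]
        if not options:  # Met and Trp have a single codon
            continue
        candidate = codons[:]
        candidate[i] = rng.choice(options)
        new = "".join(candidate)
        e_new = energy(new)
        # Always accept improvements; accept worse moves with probability exp(dE / T).
        if e_new >= e_cur or rng.random() < math.exp((e_new - e_cur) / temp):
            codons, current, e_cur = candidate, new, e_new
            if e_cur > e_best:
                best, e_best = current, e_cur
    return best, e_best


if __name__ == "__main__":
    start = "AGGAAACGGUAUAAU"
    optimized, score = anneal(start, toy_energy)
    print(start, toy_energy(start))
    print(optimized, score, translate(optimized) == translate(start))
```

Passing the energy function as a parameter keeps the search independent of how the energy is estimated, so a faster approximation or a full folding tool could be swapped in.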
Simplistic approximation to MFE estimation
Illustration of the MFE estimation algorithm. All possible folds of a single stem–loop are considered, starting from the 3′ end. In each fold, the nucleotides close to the folding region are not considered to interact. The average of the nucleotide-pair contributions of all folds is the result.
• Does not consider multiple stem-loop structures or pseudo-knots
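Simplistic approximation to MFE estimation
Example sequence: 3’ AGGAGAUAC 5’
• Fold 1: 3’ AG, turn at G, 5’ CAUAGA. Pair score: 0. Energy: 0
• Fold 2: 3’ AGGA, turn at G, 5’ CAUA. Pair score: 2. Energy: -2
• Fold 3: 3’ AGG, turn at A, 5’ CAUAG. Pair scores: 2, 0. Energy: -2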
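The single stem-loop idea can be sketched as follows. This is a simplified reading of the slides, not the published algorithm: it assumes a one-nucleotide turn, scores each aligned pair by its hydrogen-bond count (G-C: 3, A-U: 2, G-U: 2), ignores the `skip` pairs nearest the turn, and averages over all turn positions. The function names and the `skip` parameter are hypothetical.

```python
# Hydrogen bonds per base pair, as listed on the slides.
H_BONDS = {frozenset("GC"): 3, frozenset("AU"): 2, frozenset("GU"): 2}


def pair_score(a, b):
    """Hydrogen-bond count for nucleotides a and b, or 0 if they do not pair."""
    return H_BONDS.get(frozenset((a, b)), 0)


def pseudo_energy(seq, skip=1):
    """Average energy over every single stem-loop fold of `seq` (given 5' to 3').

    Each fold bends the sequence around one turn nucleotide and pairs the two
    arms position by position; the `skip` pairs closest to the turn are ignored.
    """
    n = len(seq)
    energies = []
    for turn in range(1, n - 1):
        arm = min(turn, n - turn - 1)  # length of the shorter arm
        bonds = sum(pair_score(seq[turn - 1 - k], seq[turn + 1 + k])
                    for k in range(skip, arm))
        energies.append(-bonds)  # more hydrogen bonds means lower (more stable) energy
    return sum(energies) / len(energies) if energies else 0.0


if __name__ == "__main__":
    print(pseudo_energy("AAAAAAAAA"))    # no pairs possible
    print(pseudo_energy("CAUAGAGGA"))    # the slide's example, written 5' to 3'
    print(pseudo_energy("GGGGAAACCCC"))  # a strong G-C hairpin
```

The cost is quadratic in the sequence length, which is what makes such a score attractive inside an optimization loop that evaluates thousands of candidate sequences.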
Fine-Tuning
Enhance the statistical dependence with an accurate MFE measure
• RNAfold as the “gold standard”?
  • Highest performance among single-strand secondary structure predictors
  • Fastest
• 48 genes from six different species
• Simulated annealing to overcome the length bias
• Spearman’s rank correlation coefficient:
$$\rho = 1 - \frac{6 \sum_{i=1}^{n} (x_i - y_i)^2}{n(n^2 - 1)}$$
  • $x_i$: rank of the authors’ approximation
  • $y_i$: rank of RNAfold’s output
  • $n$: number of sequences used (48 genes)
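For reference, both correlation measures mentioned on these slides can be computed in plain Python. This is a generic sketch: the Spearman formula below is the textbook version that assumes no tied values, and the numbers in the usage example are toy values, not data from the paper.

```python
import math


def ranks(values):
    """Rank values from 1 (smallest) to n, assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(a, b):
    """Spearman's rank correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(a)
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def pearson(a, b):
    """Pearson's product-moment correlation coefficient."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    approx = [1.0, 2.0, 3.0, 4.0]      # toy values standing in for a fast estimate
    reference = [1.0, 4.0, 9.0, 16.0]  # toy values standing in for a reference MFE
    print(spearman(approx, reference))  # monotonic relation
    print(pearson(approx, reference))   # close to, but not exactly, linear
```

A Spearman coefficient of 1 only says the two measures rank sequences identically; Pearson's coefficient additionally indicates how close the relation is to a straight line, which is what justifies fitting a linear regression.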
• Spearman’s rank correlation coefficient = 1: monotonically related
• Pearson’s product-moment correlation = 0.91: almost linear dependence; linear regression
Results
Optimization Result
• 36 genes
• An average increase of 46% in the MFE (t-test probability 3 × 10⁻¹¹)
Optimization Result
• The number of GC pairs decreased by 60%.
Optimization results. In (b) and (c) the secondary structures of a Drosophila melanogaster gene are shown for the wild type and optimized mRNAs.
RNAfold Energy vs. Pseudo Energy 48 genes are selected from 6 different species, with the same length (100 codons)
Result: Time Complexity On Windows Server 2008, 2.67-GHz 4-core Intel Xeon and 4-GB RAM Experiments ran with 4000 iterations
Discussion
First strategy to optimize mRNA secondary structures, to increase or decrease the minimum free energy of a nucleotide sequence, without changing its resulting polypeptide
• Application
  • Can easily be used in combination with other factors that influence gene expression, e.g. codon usage, harmonization and GC content
  • Aids the redesign of genes to produce less structured mRNA
  • Approximation algorithm for RNAfold
• Future directions:
  • A randomized algorithm for finding secondary structure?
Demo • http://bioinformatics.ua.pt/software/
Related papers
• The double-stranded-RNA-binding motif: interference and much more
  • http://www.nature.com/nrm/journal/v5/n12/full/nrm1528.html
• Design of a synthetic riboswitch
  • http://nar.oxfordjournals.org/content/41/4/2541.full.pdf+html
• A comprehensive comparison of comparative RNA structure prediction approaches
  • http://www.biomedcentral.com/1471-2105/5/140
Related Tools
• Fast dynamic programming approach from Zuker and Stiegler
  • Attempts to find the structural base-pair configuration of an RNA sequence that yields the minimum possible free energy.
  • mFold, Vienna RNA
• Inverse RNA folding
  • Produces nucleotide configurations for a given secondary structure, regardless of any gene
  • RNAexinv, INFO-RNA, RNA-SSD
• How about an mRNA sequence that maintains the polypeptide primary structure and achieves minimal secondary structure?