A genome-wide perspective on translation of proteins. Dec 2012 Regulatory Genomics Lecturer: Prof. Yitzhak Pilpel. Teaching assistant: Idan Frumkin. idan.frumkin@weizmann.ac.il Submit Sunday at midnight. The Central Dogma of Molecular Biology Expressing the genome. RNA. Inactive DNA.
A genome-wide perspective on translation of proteins Dec 2012 Regulatory Genomics Lecturer: Prof. Yitzhak Pilpel
Teaching assistant: Idan Frumkin idan.frumkin@weizmann.ac.il Submit Sunday at midnight
The Central Dogma of Molecular Biology: expressing the genome. DNA → mRNA → Protein [Figure labels: RNA, inactive DNA]
The Lac Operon (Jacob and Monod) In the presence of Lactose http://esg-www.mit.edu:8001/esgbio/pge/lac.html
The basic logic of metabolic control • Catabolism (breakdown of molecules, e.g. lactose): gene is ON when substrate is present, gene is OFF when substrate is absent • Anabolism (synthesis of molecules, e.g. amino acids): gene is ON when the synthesized molecule is absent, gene is OFF when it is present
A combined transcription-translation control switch: the attenuation mechanism (Charles Yanofsky)
A negative control at the transcription level (similar to, yet different from, the lac operon)
How not to make too much tryptophan? • A fail-safe mechanism complements transcription control • At the translation level!
The upstream ORF structure of the trp operon • A uORF • Mutual palindromes: segments 1-2 are complementary, 2-3 are complementary, 3-4 are complementary
The various palindromic pairings • 1-2 and 3-4: transcription terminator! • 2-3: not a terminator!
The structure of the attenuation switch [Figure: two panels, High Trp and Low Trp, each showing RNA pol and the ribosome on the leader transcript]
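The switch logic on the last few slides can be written as a tiny decision function. This is only a qualitative sketch, not a kinetic model: the function name `trp_attenuation` and the returned keys are invented for illustration, and it relies on the textbook account that at low Trp the ribosome stalls on the Trp codons of the uORF (segment 1), leaving segment 2 free to pair with segment 3.

```python
def trp_attenuation(trp_is_high: bool) -> dict:
    """Toy logic model of the trp attenuator (qualitative only)."""
    # Assumption (textbook account): at low Trp, charged tRNA-Trp is scarce,
    # so the ribosome stalls on the Trp codons inside segment 1.
    ribosome_stalls_in_segment_1 = not trp_is_high

    if ribosome_stalls_in_segment_1:
        # Segment 1 is occupied, so segment 2 pairs with segment 3.
        pairing = "2-3"
    else:
        # The ribosome runs through the uORF and covers segment 2,
        # so segment 3 is left to pair with segment 4.
        pairing = "3-4"

    # Only the 3-4 hairpin acts as a transcription terminator.
    transcription_continues = pairing != "3-4"
    return {"pairing": pairing, "transcription_continues": transcription_continues}


for level in (True, False):
    label = "High Trp" if level else "Low Trp"
    print(label, trp_attenuation(level))
```

The function returns the 3-4 pairing (termination) for high Trp and the 2-3 pairing (read-through) for low Trp, matching the two panels of the slide.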
Could that be implemented in eukaryotes as well? • No! Because it requires coupled transcription and translation
Spatial organization of the flow of genetic information in bacteria (Llopis Nature 2010) [Figure legend: DNA, mRNA, Protein]
Translation consists of initiation, elongation and termination [Figure labels: 5’, 3’, codon, anti-codon, STOP]
The ribosome reads a nucleotide sequence and produces an amino acid sequence based on the genetic code. Some important properties of the code: • The code is (almost) universal • There are 61 amino acid codons and 3 STOP codons • The code is “redundant”: many amino acids have more than one codon • The genetic code is optimal with respect to many properties, such as error tolerance
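The counts on this slide can be checked directly from the standard genetic code. The sketch below builds the standard codon table from the usual 64-letter amino acid string (bases ordered U, C, A, G) and counts sense codons, STOP codons, and the number of codons per amino acid. The variable names are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated with the third base varying fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]

# Degeneracy: how many codons encode each amino acid (STOP excluded).
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print("sense codons:", len(sense_codons))
print("STOP codons:", stop_codons)
print("amino acids:", len(degeneracy))
print("codons for Ser, Cys, Met:", degeneracy["S"], degeneracy["C"], degeneracy["M"])
```

Only Met (AUG) and Trp (UGG) have a single codon; every other amino acid has between two and six.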
The tRNA: the generic form, a specific form, and in 3D
Aminoacyl tRNA synthetase: the really “smart” part • Error rate: 1/10,000-1/100,000 (in vitro; higher in vivo) • 20 amino acids, 61 codons, 20 aminoacyl tRNA synthetases
Possible mechanisms of translational regulation • optimality of ribosomal attachment site • mRNA secondary structure • codon usage
Multiple codons for the same amino acid (columns C1 to C6)
Serine: UCU UCC UCA UCG AGC AGU
Cysteine: UGU UGC
Methionine: AUG
STOP: UAA, UAG, UGA
Example peptide: G T R Y E C Q A S F D, each residue encoded by one of two codons (C1 or C2). Some of the possible encodings:
C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1
C2 C2 C2 C2 C2 C2 C2 C2 C2 C2 C2
C1 C1 C2 C1 C1 C2 C1 C1 C2 C1 C1
C2 C2 C2 C2 C1 C1 C1 C1 C1 C1 C1
C1 C1 C1 C1 C1 C1 C1 C2 C2 C2 C2
For a hypothetical protein of 300 amino acids with two codons each, there are 2^300 possible nucleotide sequences. These variants code for the same protein and are thus considered “synonymous”. Indeed, evolution would easily exchange between them. But are they all really equivalent??
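To get a feel for the size of 2^300, the count can be computed exactly with Python integers. The helper name `count_synonymous_sequences` is made up for this sketch, and it assumes, as the slide does, that every residue has the same number of synonymous codons.

```python
def count_synonymous_sequences(n_residues: int, codons_per_residue: int) -> int:
    """Number of distinct coding sequences for one fixed protein sequence,
    assuming every residue has the same number of synonymous codons."""
    return codons_per_residue ** n_residues


# The 11-residue example peptide with two codons per residue.
small = count_synonymous_sequences(11, 2)

# The hypothetical 300-residue protein from the slide.
large = count_synonymous_sequences(300, 2)

print("11 residues:", small)
print("300 residues: a number with", len(str(large)), "decimal digits")
```

Even the 11-residue example already has 2048 synonymous encodings; for 300 residues the count has 91 decimal digits, far too many to enumerate.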
Two potential types of sources for codon bias: mutation pattern (neutral) and selection
The effect of (or on?) GC content • Inter-genic composition (esp. in bacteria) explains codon bias [Diagram labels: mutation pressure, selection, amino acid composition, nucleotide composition, codon bias; figure compares inter-genic and coding regions]
Selection of codons might affect: accuracy, throughput, RNA structure, costs, folding
A simple model for translation efficiency
5’ … AAA CCA GAA UCG AAG … 3’
Amount of matching tRNA per codon: 8 2 5 4 1 (average: 4)
AA   Codon  Amount
Lys  AAA    8
Asn  AAC    6
Lys  AAG    1
Asn  AAU
Thr  ACA
Thr  ACC
.
.
Phe  UUU
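The averaging on this slide can be reproduced in a few lines. The amounts below are the five values shown for the example sequence; the dictionary name `TRNA_AMOUNT` and the function `mean_trna_amount` are illustrative, and codons whose amounts are not shown on the slide are simply left out.

```python
# tRNA amount per codon, taken from the example on the slide.
TRNA_AMOUNT = {"AAA": 8, "CCA": 2, "GAA": 5, "UCG": 4, "AAG": 1}


def mean_trna_amount(codons, amounts):
    """Simple efficiency score: arithmetic mean of the tRNA amount per codon."""
    values = [amounts[c] for c in codons]
    return sum(values) / len(values)


sequence = ["AAA", "CCA", "GAA", "UCG", "AAG"]
score = mean_trna_amount(sequence, TRNA_AMOUNT)
print("efficiency score:", score)
```

Swapping the rare Lys codon AAG (amount 1) for its synonym AAA (amount 8) would raise the score without changing the encoded protein, which is the point of the model.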
The same protein can be encoded in many ways… Amino acid sequence: MPKSNFRFGE. Alternative codons for the second residue (P), ranked by the relative concentration of the matching tRNA in the cell:
ATG CCT: intermediate efficiency
ATG CCC: least efficient
ATG CCA: most efficient
ATG CCG: intermediate efficiency
Scoring coding sequences for efficiency in translation (dos Reis et al. Nucleic Acids Res, 2004)
Coding sequence: … ATC CCA AAA TCG AAT …
tRNA gene copies: 10 10 7 2 6
Translation efficiency score: the (geometric) average of all tRNA gene copy numbers. Sequences are ranked as efficient, intermediate, or non-efficient.
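A minimal sketch of this score, using the five copy numbers shown for the example sequence. The names `GENE_COPIES` and `geometric_mean_score` are hypothetical, and wobble pairing is ignored here.

```python
import math

# tRNA gene copy number per codon, as shown on the slide.
GENE_COPIES = {"ATC": 10, "CCA": 10, "AAA": 7, "TCG": 2, "AAT": 6}


def geometric_mean_score(codons, copies):
    """Geometric mean of the tRNA gene copy numbers along a coding sequence."""
    values = [copies[c] for c in codons]
    return math.prod(values) ** (1.0 / len(values))


sequence = ["ATC", "CCA", "AAA", "TCG", "AAT"]
geo = geometric_mean_score(sequence, GENE_COPIES)
arith = sum(GENE_COPIES[c] for c in sequence) / len(sequence)

print(f"geometric mean:  {geo:.2f}")
print(f"arithmetic mean: {arith:.2f}")
```

The geometric mean comes out lower than the arithmetic mean of 7 because it is more sensitive to the one poorly served codon (TCG, 2 copies), so a single slow codon weighs more heavily on the score.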
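A simple model for translation efficiency: the tRNA Adaptation Index (tAI), dos Reis et al. NAR 2004
Example coding sequence: … ATC CCA AAA TCG AAT …
Wobble interaction: each codon i gets an absolute adaptiveness $W_i$, which is normalized to a relative adaptiveness $w_i$:
$$w_i = \begin{cases} W_i / W_{\max} & \text{if } W_i \neq 0 \\ w_{\text{mean}} & \text{else} \end{cases}$$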
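The normalization step can be sketched as follows. Several things here are assumptions rather than slide content: the absolute adaptiveness values `W` are taken to be the raw gene copy numbers from the previous slide (the wobble weighting is skipped), the zero entry for CCC is a made-up codon added to exercise the fallback branch, and `w_mean` is taken as the geometric mean of the nonzero relative values.

```python
import math

# Hypothetical absolute adaptiveness values W_i (wobble terms omitted).
# "CCC": 0 is an invented entry to show the w_mean fallback.
W = {"ATC": 10, "CCA": 10, "AAA": 7, "TCG": 2, "AAT": 6, "CCC": 0}


def relative_adaptiveness(abs_adapt):
    """w_i = W_i / W_max if W_i != 0, else w_mean."""
    w_max = max(abs_adapt.values())
    nonzero = [v / w_max for v in abs_adapt.values() if v != 0]
    # Assumption: w_mean is the geometric mean of the nonzero w_i.
    w_mean = math.exp(sum(math.log(x) for x in nonzero) / len(nonzero))
    return {c: (v / w_max if v != 0 else w_mean) for c, v in abs_adapt.items()}


def tai(codons, w):
    """tAI of a coding sequence: geometric mean of w_i over its codons."""
    return math.exp(sum(math.log(w[c]) for c in codons) / len(codons))


w = relative_adaptiveness(W)
tai_score = tai(["ATC", "CCA", "AAA", "TCG", "AAT"], w)
print({c: round(v, 3) for c, v in w.items()})
print("tAI:", round(tai_score, 3))
```

Dividing by W_max puts every weight in (0, 1], so tAI values are comparable between genes. The fallback avoids a zero weight, which would otherwise force the geometric mean of the whole gene to zero.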
Physiological: correlation of tAI with experimentally determined protein levels, r=0.63 [Plot axes: predicted translation efficiency vs. measured protein abundance (Ghaemmaghami et al. Nature 2003)]
The correlation is quite high, but why not even higher? The limitations of the model: • tRNA gene copy numbers are used as a stand-in for tRNA levels • The model only captures elongation • Differences in mRNA levels • Proteins are also degraded at different rates
The effective number of codons (Nc): a measure of overall synonymous codon usage bias
Gene1: highly biased synonymous codon usage (Nc=20)
AA     Codon  Count
Gly    GGT    0
Gly    GGC    12
Gly    GGA    0
Gly    GGG    0
Gene2: no bias in synonymous codon usage (Nc≥61)
AA     Codon  Count
Gly    GGT    3
Gly    GGC    3
Gly    GGA    3
Gly    GGG    3
Wright, F. (1990). "The 'effective number of codons' used in a gene." Gene 87(1): 23-9.
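The two glycine examples can be run through the per-amino-acid building block of Nc. As a hedged sketch: Wright's measure is based on a "homozygosity" F = (n Σ p_i² − 1) / (n − 1) for each amino acid, with 1/F read as the effective number of codons for that amino acid. The full Nc combines these values over all degeneracy classes, which is not done here. The function name `codon_homozygosity` is illustrative.

```python
def codon_homozygosity(counts):
    """Wright's F for one amino acid: (n * sum(p_i^2) - 1) / (n - 1),
    where n is the total codon count and p_i the codon frequencies."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two codons to estimate F")
    sum_p2 = sum((c / n) ** 2 for c in counts)
    return (n * sum_p2 - 1) / (n - 1)


# Glycine codon counts (GGT, GGC, GGA, GGG) from the two example genes.
gene1_gly = [0, 12, 0, 0]
gene2_gly = [3, 3, 3, 3]

f1 = codon_homozygosity(gene1_gly)
f2 = codon_homozygosity(gene2_gly)

print("Gene1: F =", f1, " effective Gly codons =", 1 / f1)
print("Gene2: F =", round(f2, 4), " effective Gly codons =", round(1 / f2, 2))
```

For Gene1 a single codon is effectively in use. For Gene2 the estimate comes out slightly above 4 because of the small-sample correction in the formula, which is probably why an unbiased gene can score at or above the nominal maximum of 61.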
Codon usage bias is correlated with translation efficiency: r=-0.79 (p<0.001) [Diagram: mutation pattern (neutral) and selection as sources of codon bias]
But not in all species (e.g. A. gossypii): r=-0.48 (p=0.218) [Diagram: mutation pattern (neutral) and selection as sources of codon bias]
Species compared: A. gossypii, D. hansenii, S. cerevisiae, S. bayanus, C. glabrata, C. albicans, Y. lipolytica, S. pombe. Translational selection acts in some but not all species (e.g. debate on human…)
Correlation does not imply causality!! r=0.63 [Plot axes: predicted translation efficiency vs. measured protein abundance (Ghaemmaghami et al. Nature 2003); diagram labels: physiological, evolutionary, Z]