Evolution of the Genetic Code: Before and After the LUCA
The genetic code evolved to its canonical form before the Last Universal Common Ancestor of Archaea, Bacteria and Eukaryotes, more than 3 billion years ago. It appears to be highly optimized. How did it get to be this way?
Numerous small changes have occurred to the canonical code since then. What is the mechanism of codon reassignment?
Codon Reassignment
The genetic code is variable in mitochondria (and also in some other types of genomes).
• UGA: Stop to Trp
• AUA: Ile to Met
• CUN: Leu to Thr
• CGN: Arg to unassigned
• AGR: Arg to Ser to Stop/Gly
• etc.
But how can this happen? It should be disadvantageous.
Reassignments in Metazoa
[Figure: phylogeny of the metazoan groups Porifera, Cnidaria, Arthropoda, Nematoda, Lophotrochozoa, Platyhelminthes, Echinodermata, Hemichordata, Urochordata, Cephalochordata and Craniata, with the following events marked on its branches.]
• Loss of tRNA-Ile(CAU) but AUA remains Ile
• Loss of tRNA-Arg(UCU) and AGR: Arg -> Ser
• Loss of many tRNAs + import from cytoplasm
• AUA: Ile -> Met
• AGR: Ser -> Stop
• AGR: Ser -> Gly
• AAA: Lys -> Asn
• AAA: Lys -> unassigned
Example 1: AUA was reassigned from Ile to Met during the early evolution of the mitochondrial genome.
Example 2: UGA was reassigned from Stop to Trp many times (12 times in mitochondria).
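The two examples above can be written as small overrides of the canonical code table. The following is a minimal Python sketch, not taken from the slides: the names `CANONICAL`, `variant_code` and `translate` are illustrative, and the test sequence is made up.

```python
from itertools import product

BASES = "UCAG"
# Canonical code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CANONICAL = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS))


def variant_code(reassignments):
    """Return a copy of the canonical code with some codons reassigned."""
    code = dict(CANONICAL)
    code.update(reassignments)
    return code


def translate(rna, code):
    """Map each complete codon of an RNA string to a one-letter amino acid."""
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return "".join(code[rna[i:i + 3]] for i in range(0, usable, 3))


# Example 1 and Example 2 from the slides: AUA Ile -> Met, UGA Stop -> Trp.
MITO_EXAMPLE = variant_code({"AUA": "M", "UGA": "W"})

print(translate("AUGAUAUGAUAA", CANONICAL))     # MI**
print(translate("AUGAUAUGAUAA", MITO_EXAMPLE))  # MMW*
```

The same made-up message reads differently under the two tables, which is why a reassignment is expected to be disadvantageous while the codon is still in use.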
The GAIN-LOSS framework (Sengupta & Higgs, Genetics 2005)
LOSS = deletion or loss of function of a tRNA or RF. GAIN = gain of a new tRNA or gain of function of an existing one.
[Diagram: two routes from the initial code to the new code.]
• Initial code. No problem.
• Route 1: GAIN gives an ambiguous codon (selective disadvantage); LOSS then gives the new code.
• Route 2: LOSS gives an unassigned codon (selective disadvantage); GAIN then gives the new code.
• New code, at first: selective disadvantage because codons are used in the wrong places.
• After mutations in coding sequences: new code with codons now used in the right places. No problem.
Note: the strength of the selective disadvantage depends on the number of times the codon is used. There is no disadvantage if the codon disappears.
Four possible mechanisms of codon reassignment.
1. Codon Disappearance: the codon disappears. The order of the gain and loss is irrelevant. For the other three mechanisms the codon does not disappear.
2. Ambiguous Intermediate: the gain happens before the loss. There is a period when the gain is fixed in the population and translation is ambiguous.
3. Unassigned Codon: the loss happens before the gain. There is a period when the loss is fixed in the population and the codon is unassigned.
4. Compensatory Change: the gain and loss are fixed in the population simultaneously (although they do not arise at the same time). There is no intermediate period between the old and the new codes. Compare the theory of compensatory substitutions in RNA helices.
Sengupta & Higgs (2005) showed that all four mechanisms work in a population genetics simulation.
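The four definitions above amount to a small decision rule on whether the codon disappeared and on the order in which the gain and the loss became fixed. The sketch below is only an illustration of that rule: the function name, its arguments and the use of fixation times are assumptions, not the simulation of Sengupta & Higgs.

```python
def classify_reassignment(codon_disappeared, gain_fixed_at, loss_fixed_at):
    """Name the mechanism of a codon reassignment.

    codon_disappeared: True if the codon was absent from the genome
        during the change.
    gain_fixed_at, loss_fixed_at: times (any consistent unit) at which the
        gain and the loss became fixed in the population.
    """
    if codon_disappeared:
        # Order of gain and loss is irrelevant when the codon is absent.
        return "Codon Disappearance"
    if gain_fixed_at < loss_fixed_at:
        # Gain first: translation is ambiguous in between.
        return "Ambiguous Intermediate"
    if loss_fixed_at < gain_fixed_at:
        # Loss first: the codon is unassigned in between.
        return "Unassigned Codon"
    # Fixed simultaneously: no intermediate period.
    return "Compensatory Change"


print(classify_reassignment(True, 5, 2))    # Codon Disappearance
print(classify_reassignment(False, 2, 5))   # Ambiguous Intermediate
print(classify_reassignment(False, 5, 2))   # Unassigned Codon
print(classify_reassignment(False, 3, 3))   # Compensatory Change
```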
Summary of Codon Reassignments in Mitochondria
The codon disappearance (CD) mechanism explains the reassignments of stop codons, because stop codons are rare initially. There are only a few examples of CD for sense codons. The unassigned codon (UC) and ambiguous intermediate (AI) mechanisms are important for sense codons.
Three examples in yeasts (mutation pressure GC to AU)
• CUN is rare (replaced by UUR). CUN: Leu to Thr.
• CGN is rare (replaced by AGR). CGN Arg codons become unassigned.
• AUA and AUU are common and AUC is rare. Nevertheless AUA is reassigned to Met. The codon does not disappear.
Leu and Arg codons in yeasts
Codon disappearance causes reassignments.
* CUN = Thr. Unusual tRNA-Thr present instead of tRNA-Leu.
** CGN = unassigned. tRNA-Arg is deleted.
AUA Ile to Met in Yeasts
codon   amino acid   anticodon
AUU     Ile          GAU
AUC     Ile          GAU
AUA     Ile          K2CAU
AUG     Met          CAU
Evolution of the canonical code - Before the LUCA
The canonical code seems to be optimized to reduce the effects of translational and mutational errors. Neighbouring codons code for similar amino acids.
Woese's polar requirement scale: measure the difference between amino acid properties by how far apart they are on this scale.
[Figure: amino acids placed along the polar requirement axis (ticks at 5, 7, 9, 11, 13), in increasing order: C, L/I, F, W, M, Y, V, P/T, A, S/G, H/Q, R, N/K, E, D.]
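Principal Component Analysis
Projects the 8-d space of amino acid properties into the two 'most important' dimensions.
[Figure: amino acids plotted on the first two principal components; axis labels: Big to Small, Hydrophobic to Hydrophilic.]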
The canonical code table (* = stop):
UUU Phe   UCU Ser   UAU Tyr   UGU Cys
UUC Phe   UCC Ser   UAC Tyr   UGC Cys
UUA Leu   UCA Ser   UAA *     UGA *
UUG Leu   UCG Ser   UAG *     UGG Trp
CUU Leu   CCU Pro   CAU His   CGU Arg
CUC Leu   CCC Pro   CAC His   CGC Arg
CUA Leu   CCA Pro   CAA Gln   CGA Arg
CUG Leu   CCG Pro   CAG Gln   CGG Arg
AUU Ile   ACU Thr   AAU Asn   AGU Ser
AUC Ile   ACC Thr   AAC Asn   AGC Ser
AUA Ile   ACA Thr   AAA Lys   AGA Arg
AUG Met   ACG Thr   AAG Lys   AGG Arg
GUU Val   GCU Ala   GAU Asp   GGU Gly
GUC Val   GCC Ala   GAC Asp   GGC Gly
GUA Val   GCA Ala   GAA Glu   GGA Gly
GUG Val   GCG Ala   GAG Glu   GGG Gly
Cost function g(a,b) for replacing amino acid a by amino acid b, e.g. difference in Polar Requirement.
r_ij = rate of mistaking codon i for codon j (= 1 for single-position mistakes, 0 otherwise).
E = measure of error associated with a code.
Generate random codes by permuting the 20 amino acids in the code table. E is smaller for the canonical code than for almost all random codes.
f ~ 10^-6: one in a million codes is better (Freeland and Hurst).
[Figure: distribution p(E) of E over random codes, with E_real of the canonical code marked; f is the fraction of random codes with E below E_real.]
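The permutation test described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Freeland and Hurst calculation: g(a,b) is taken as the squared difference in polar requirement, every single-position mistake has rate 1 regardless of position, mistakes to or from stop codons are ignored, and E is the mean of g over the remaining codon pairs. The polar requirement values are those commonly tabulated from Woese's work and should be checked against a primary source. With equal weights the fraction f comes out small but is not expected to reproduce the 10^-6 figure.

```python
import random
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
CANONICAL = dict(zip(CODONS, AMINO_ACIDS))

# Polar requirement per amino acid (assumed values, see lead-in).
POLAR = {
    "A": 7.0, "R": 9.1, "N": 10.0, "D": 13.0, "C": 4.8,
    "Q": 8.6, "E": 12.5, "G": 7.9, "H": 8.4, "I": 4.9,
    "L": 4.9, "K": 10.1, "M": 5.3, "F": 5.0, "P": 6.6,
    "S": 7.5, "T": 6.6, "W": 5.2, "Y": 5.4, "V": 5.6,
}


def error_cost(code):
    """Mean squared polar-requirement change over single-base mistakes."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                other = code[codon[:pos] + base + codon[pos + 1:]]
                if other == "*":
                    continue
                total += (POLAR[aa] - POLAR[other]) ** 2
                count += 1
    return total / count


def random_code(rng):
    """Permute the 20 amino acids among the canonical codon blocks."""
    names = sorted(POLAR)
    shuffled = names[:]
    rng.shuffle(shuffled)
    swap = dict(zip(names, shuffled))
    swap["*"] = "*"  # stop codons stay where they are
    return {codon: swap[aa] for codon, aa in CANONICAL.items()}


def fraction_better(n_samples=1000, seed=0):
    """Return (E_real, fraction of random codes with E below E_real)."""
    rng = random.Random(seed)
    e_real = error_cost(CANONICAL)
    better = sum(error_cost(random_code(rng)) < e_real for _ in range(n_samples))
    return e_real, better / n_samples


if __name__ == "__main__":
    e_real, f = fraction_better()
    print(f"E_real = {e_real:.2f}, fraction of better random codes = {f:.4f}")
```

Permuting amino acids among the existing blocks keeps the pattern of synonymous codons fixed, so the test asks only whether the placement of amino acids is unusual, not whether the block structure is.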
The statistical argument shows that the code is highly non-random but it does not explain how the code evolved to be that way. We need a step-by-step evolutionary argument that leads from a proposed first stage of the code to today's code.
• Random permutations: not possible.
• Random swaps: seems unlikely.
The earliest code probably had few amino acids. Which were the first? Selection acts when new amino acids are added.
Time scale for the origin of life
[Figure: timeline built from the following lines of evidence.]
• Dating of rocks and meteorites
• Lunar craters
• Last ocean-vaporizing impact
• Isotopic evidence for life
• Stromatolites
• Microfossil evidence
• Phylogenetic methods (divergence after LUCA)
The origin of the genetic code is the end of the RNA World. What preceded RNA? Another polymer? Metabolism only?
Prebiotic synthesis of organic molecules
Miller-Urey experiment (1953). Began with a mixture of CH4, NH3, H2O and H2. Energy source = electric spark or UV light. Obtained 10 amino acids.
Atmospheres and Chemistry
• Reducing: CH4, NH3, H2O, H2; or CO2, N2, H2; or CO, N2, H2. There is hydrogen gas and/or hydrogen is present combined with other elements (methane, ammonia, water).
• Neutral: CO or CO2, N2, H2O. No hydrogen or oxygen gas.
• Oxidizing: O2, CO2, N2. Oxygen gas present.
Prebiotic chemists favour reducing atmospheres. Yields in the Miller-Urey experiment are higher and more diverse in reducing than in neutral atmospheres. It doesn't work in an oxidizing atmosphere.
Planetary Atmospheres
The major element in the universe is H (Big Bang), so doesn't it make sense that the atmosphere was reducing?
Jupiter retains the original mixture: H2, He + small amounts of CH4, NH3, H2O.
Smaller planets lose H2. A new atmosphere is created by outgassing from the interior.
• Venus: about 90 Earth atmospheres pressure! Mostly CO2 and N2.
• Mars: very low pressure, mostly CO2 and N2.
• Current Earth: mostly N2, O2 + small amounts of CO2, H2O; changed by life.
• Carbonates in sedimentary rocks on Earth suggest previously lots of CO2.
Geologists and astronomers favour an intermediate atmosphere. So maybe Miller and Urey were wrong? :-(
Alternative suggestion: Hydrothermal vents
Sea water passes through vents. It is heated to 350 °C and cools to 2 °C in the surrounding ocean. Supply of H2, H2S, etc. There is fierce debate as to whether these conditions favour formation or breakup of organic molecules (Miller & Lazcano, 1995).
Organic compounds in meteorites
The most widely studied meteorite is the Murchison meteorite, a carbonaceous chondrite that fell in Australia in 1969.
• Contained both biological and non-biological amino acids.
• Both optical isomers are present (later shown to be in not quite equal amounts).
• The compounds are not contamination.
Just about all the building block molecules have now been found in carbonaceous meteorites (Sephton, 2002).
Astrochemistry: molecular clouds; icy grains; parent bodies of meteorites...
Delivery by: dust particles; meteorites; comets...
Was external delivery an important source of organic molecules?
The earliest code probably had few amino acids. Which were the first? Selection acts when new amino acids are added.
Prebiotic Synthesis of amino acids (Higgs and Pudritz (2009) Astrobiology)
Amino acids are found in
• Meteorites
• Atmospheric chemistry experiments (Miller-Urey)
• Hydrothermal synthesis
• Icy dust grains in space
Rank amino acids in order of decreasing frequency in each of 12 observations, then derive an overall ranking.
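One simple way to derive a ranking from several observations is to rank within each observation and average the ranks. The sketch below illustrates that idea only. The averaging rule and the treatment of amino acids missing from an observation (they share the mean of the unfilled ranks) are assumptions, and the three toy "observations" are made-up numbers used purely to exercise the function, not data from Higgs and Pudritz.

```python
def rank_observation(obs, all_aas):
    """Rank amino acids in one observation (1 = most abundant)."""
    present = sorted(obs, key=obs.get, reverse=True)
    ranks = {aa: i + 1 for i, aa in enumerate(present)}
    n_present, n_total = len(present), len(all_aas)
    if n_present < n_total:
        # Assumption: absent amino acids tie for the remaining ranks.
        tied = (n_present + 1 + n_total) / 2
        for aa in all_aas:
            ranks.setdefault(aa, tied)
    return ranks


def mean_rank_order(observations):
    """Return (amino acids ordered by mean rank, mean rank per amino acid)."""
    all_aas = sorted({aa for obs in observations for aa in obs})
    totals = {aa: 0.0 for aa in all_aas}
    for obs in observations:
        for aa, r in rank_observation(obs, all_aas).items():
            totals[aa] += r
    means = {aa: totals[aa] / len(observations) for aa in all_aas}
    order = sorted(all_aas, key=lambda aa: (means[aa], aa))
    return order, means


# Hypothetical concentrations relative to Gly (illustrative only).
TOY_OBSERVATIONS = [
    {"G": 1.0, "A": 0.8, "D": 0.3},
    {"G": 1.0, "A": 0.5, "V": 0.2},
    {"A": 1.2, "G": 1.0, "D": 0.1, "V": 0.05},
]

order, means = mean_rank_order(TOY_OBSERVATIONS)
print(order)  # ['G', 'A', 'D', 'V']
```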
Comparison of amino acid frequencies produced non-biologically (concentrations normalized relative to Gly)
10 amino acids are found in the Miller-Urey experiments. Very similar ones are also found in meteorites, an ice grain analogue experiment, and other places. These are 'early' amino acids that were available for use by the first organisms: G A D E V S I L P T.
The other 10 are not seen. These are 'late' amino acids that were only used when organisms evolved a means of synthesizing them biochemically: K R H F Q N Y W C M.
The earliest amino acids are those that are cheapest to form thermodynamically
Positions of early and late amino acids... What does this mean?
Maybe only the 2nd position was relevant initially. Late amino acids took over codons previously assigned to amino acids with similar properties.
Propose that the four earliest amino acids were Val, Ala, Asp, Gly: a four-column code (Higgs, Biol. Direct, 2009).
This is a triplet code but only the second base means anything. The second base is the most important for codon-anticodon recognition, so a mistake at the second position is unlikely. All first and third position mistakes are synonymous.
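The claim that all first and third position mistakes are synonymous can be checked directly. The sketch below is an illustration, with the column assignment (second base U = Val, C = Ala, A = Asp, G = Gly) inferred from where these four amino acids sit in the canonical code; the function names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Assumed column assignment: amino acid determined by the second base only.
FOUR_COLUMN = {"U": "Val", "C": "Ala", "A": "Asp", "G": "Gly"}


def decode(codon):
    """Translate a triplet using only its second base."""
    return FOUR_COLUMN[codon[1]]


def mistakes_are_synonymous(position):
    """True if every single-base mistake at `position` (0, 1 or 2)
    leaves the encoded amino acid unchanged, for all 64 codons."""
    for codon in ("".join(c) for c in product(BASES, repeat=3)):
        for base in BASES:
            mutant = codon[:position] + base + codon[position + 1:]
            if decode(mutant) != decode(codon):
                return False
    return True


print(mistakes_are_synonymous(0))  # True
print(mistakes_are_synonymous(1))  # False
print(mistakes_are_synonymous(2))  # True
```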
Code structure after addition of the 10 early amino acids. Add new amino acids in positions that were formerly occupied by amino acids with similar properties. This minimizes disruption to existing gene sequences.
Summary of my argument - Selection acts at the time of addition of new amino acids to the code. The new amino acid is assigned to codons that formerly coded for an amino acid with similar properties. This minimizes disruption to existing genes. The result is that codons in the same columns end up assigned to amino acids with similar properties. The column structure is retained from the earliest code. Hence the code appears to minimize translational error with respect to randomly reshuffled codes, even though translational error was not the main factor being selected.
Pathways of amino acid synthesis in modern organisms (from Di Giulio 2008)
Other points
• Column structure suggests that translational errors were more important than mutational errors (tRNA structure/RNA world).
• Precursor-product pairs tend to be neighbours (but there are doubts over statistical significance). Maybe late amino acids took over codons previously assigned to their biochemical precursors.
• Direct chemical interactions between RNA motifs and amino acids ("stereochemical theory"). In vitro selection experiments suggest binding sites of aptamers preferentially contain codon and anticodon sequences.
RNA World
First hypothesis: There was a stage of evolution when RNA molecules performed both genetic and catalytic roles. DNA later took over the genetic role and proteins took over the catalytic role. Verdict: almost certainly true.
• Translation depends on RNA: mRNA supplies the information for protein synthesis.
• The active ingredient of the ribosome is rRNA: 3D structures show the site of the peptidyl transferase reaction. Proteins were probably a late addition to the ribosome.
• tRNAs are also essential for translation.
Second hypothesis: The RNA world arose de novo in the form of self-replicating ribozymes. Verdict: the jury is still out.
RNA world idea originated in 60’s as a theoretical solution to the chicken and egg problem of DNA and proteins. Self-splicing introns. First RNA catalysts to be discovered. Tom Cech (1982). ‘RNA World’ term coined by Walter Gilbert (1986).
Example of an RNA catalyst: the hammerhead ribozyme
Cleaves RNA at a specific point. Used in the rolling circle mechanism of replication of virus-like RNAs in plants: chops the long strand into pieces.
What can ribozymes do? Ligases
T. A. Lincoln, G. F. Joyce, Science 323, 1229 (2009)
An Autocatalytic Set Made from Ligases
T. A. Lincoln, G. F. Joyce, Self-Sustained Replication of an RNA Enzyme, Science 323, 1229 (2009)
Given a supply of A, B, A’, B’, the E and E’ make more of themselves.
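A toy kinetic model shows how such a set amplifies itself until the substrates run out. The sketch below assumes a cross-catalytic scheme in which E' joins A and B to make E, and E joins A' and B' to make E'. The mass-action rate law, the rate constant `k`, the starting amounts and the simple Euler time-stepping are all illustrative assumptions rather than measured values from the paper.

```python
def simulate_ligase_pair(e0=0.01, substrate0=1.0, k=1.0, dt=0.01, steps=2000):
    """Euler integration of a cross-catalytic ligase pair.

    E' + A + B   -> E' + E    (rate k * [E'] * [A] * [B])
    E  + A' + B' -> E  + E'   (rate k * [E] * [A'] * [B'])
    Returns final amounts as a dict.
    """
    E = Ep = e0
    A = B = Ap = Bp = substrate0
    for _ in range(steps):
        # Amount of each product formed in this time step, capped so that
        # substrate amounts can never go negative.
        made_E = min(k * Ep * A * B * dt, A, B)
        made_Ep = min(k * E * Ap * Bp * dt, Ap, Bp)
        E, A, B = E + made_E, A - made_E, B - made_E
        Ep, Ap, Bp = Ep + made_Ep, Ap - made_Ep, Bp - made_Ep
    return {"E": E, "E'": Ep, "A": A, "B": B, "A'": Ap, "B'": Bp}


final = simulate_ligase_pair()
print({name: round(value, 3) for name, value in final.items()})
```

In this model growth is roughly exponential while substrates are plentiful and then levels off, and the total of each enzyme plus its unused substrate stays constant throughout.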
What can ribozymes do? Recombinases
E. J. Hayden, G. v. Kiedrowski & N. Lehman, Angew. Chem. Int. Edit. (2008) 120, 8552
The catalyst is autocatalytic given a supply of W, X, Y, Z. The non-covalent assembly is also a catalyst.
What can ribozymes do? Polymerases
[Figure colour key: black + blue = ribozyme; red = template; orange = primer.]
Primer extended by up to 14 nucleotides. Johnston et al. (2001) Science
Gradual improvement of Polymerases in the lab Wochner et al. (2011) Science - up to 95 nucleotides
What can ribozymes do? Nucleotide Synthetases
Unrau and Bartel (1998) Nature
Hypothetical pathway for RNA-catalyzed RNA synthesis (Joyce)
• Synthesis of nucleosides
• Phosphorylation
• Generation of NTPs
• Creation of activated nucleotides
• Stepwise polymerization
An RNA organism must have had a metabolism.
Clutter of RNA synthesis (Joyce) Why is this particular set of monomers used for nucleic acids? How is this set synthesized specifically? Where is the chemistry occurring? Earth, or space? Hydrothermal vents?
A new route to pyrimidine ribonucleotide assembly. M. W. Powner et al. Nature 459, 239-242 (2009) doi:10.1038/nature08013
[Figure caption:] Previously assumed synthesis of β-ribocytidine-2',3'-cyclic phosphate 1 (blue; note the failure of the step in which cytosine 3 and ribose 4 are proposed to condense together) and the successful new synthesis described here (green). p, pyranose; f, furanose.
Chemical synthesis of monomers and polymers must have occurred before the origin of ribozymes.
• Ferris (2002) Orig. Life Evol. Biosph.: montmorillonite-catalyzed synthesis of RNA oligonucleotides (30-50 mers).
• Rajamani et al. (2008) Orig. Life Evol. Biosph.: lipid-assisted synthesis of RNA-like polymers from mononucleotides.
• Costanzo et al. (2009) J. Biol. Chem.: synthesis of long RNA strands from cyclic nucleotides in water.
• Rajamani et al. (2010) J. Am. Chem. Soc.: measurements of error rates in non-enzymatic RNA replication.
There are still some experimental issues… But this is a logical necessity!
How could the RNA world have got started? Getting from chemistry to biology…. RNA replicators must have emerged from prebiotic synthesis of random sequences
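Jump-starting the RNA World. Wu & Higgs (2009) J. Mol. Evol.
[Diagram: Precursors -> (synthesis) -> Monomers -> (activation) -> Activated monomers -> (polymerization) -> Short polymers -> (polymerization) -> Long polymers. Ribozymes arise among the polymers; four arrows labelled "catalyze" run from the ribozymes to steps of the pathway.]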
Are there alternatives to RNA?
[Figure: RNA compared with (a) threose nucleic acid (TNA), (b) peptide nucleic acid (PNA), (c) glycerol-derived nucleic acid, (d) pyranosyl RNA.]
RNA hybridizes with other nucleic acids, so information is not lost. DNA-RNA hybrids: DNA takes over at the end of the RNA world. Maybe TNA or PNA preceded the RNA world, with information passed to RNA. One would need to show that the alternative was easier to synthesize than RNA.
Two scenarios from Segré & Lancet (2000) A – RNA first (strong RNA world hypothesis) B – Lipids first (lipid world hypothesis – compositional genomes – metabolism without genes)