Codon Usage Dan Graur
Because of the degeneracy of all genetic codes, 18-20 amino acids are encoded by more than one codon (2, 3, 4, or 6).
If synonymous mutations are strictly neutral, synonymous codons should be used randomly, in proportions dictated by genomic GC content.
The relative synonymous codon usage (RSCU) is the number of times a codon appears in a gene divided by the number of occurrences expected under equal codon usage:
$$\mathrm{RSCU}_i = \frac{X_i}{\frac{1}{n}\sum_{j=1}^{n} X_j}$$
where n = number of synonymous codons (1 ≤ n ≤ 6) for the amino acid under study and X_i = number of occurrences of codon i. If the synonymous codons of an amino acid are used with equal frequencies, their RSCU values will equal 1.
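The RSCU definition can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code and an in-frame DNA coding sequence; the function name `rscu` and the table-building shortcut are choices made here, not part of the original slides.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (an assumption: other codes differ).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq):
    """Return {codon: RSCU} for every amino acid that occurs in seq."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    # Count sense codons only; stop codons and unknown triplets are ignored.
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")

    # Group synonymous codons into families, one per amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the gene: RSCU undefined
        expected = total / len(family)  # occurrences under equal usage
        for c in family:
            values[c] = counts[c] / expected
    return values
```

For the toy sequence `GTTGTTGTAGTC` (four valine codons), GTT is used twice where equal usage would predict once, so its RSCU is 2, while the unused GTG gets 0.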
The codon adaptation index (CAI) measures the degree to which genes use preferred codons. We first compile a table of RSCU values for highly expressed genes. From this table, it is possible to identify the codons that are most frequently used for each amino acid. The relative adaptiveness of a codon (w_i) is computed as
$$w_i = \frac{\mathrm{RSCU}_i}{\mathrm{RSCU}_{\max}}$$
where RSCU_max = the RSCU value for the most frequently used codon for an amino acid.
The CAI value for a gene is calculated as the geometric mean of w_i values for all the codons used in that gene:
$$\mathrm{CAI} = \left(\prod_{i=1}^{L} w_i\right)^{1/L}$$
where L = number of codons.
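The two CAI steps (relative adaptiveness from a reference set, then a geometric mean over the gene) can be sketched as follows. This is an illustrative sketch, not a reference implementation: the function names are hypothetical, the reference set is passed as one concatenated coding sequence, and a codon with w = 0 simply drives the CAI to 0 (published implementations often substitute a small pseudo-count instead).

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in TCAG order (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(seq):
    """Cut an in-frame sequence into triplets, dropping a trailing partial codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seq):
    """w for each codon, from codon counts in highly expressed reference genes.

    Within a synonymous family RSCU_i / RSCU_max equals X_i / X_max,
    so the ratio is taken on raw counts.
    """
    counts = Counter(split_codons(reference_seq))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    w = {}
    for family in families.values():
        top = max(counts[c] for c in family)
        if top == 0:
            continue  # amino acid absent from the reference: w undefined
        for c in family:
            w[c] = counts[c] / top
    return w


def cai(seq, w):
    """Geometric mean of w over the codons of seq that have a defined w."""
    values = [w[c] for c in split_codons(seq) if c in w]
    if not values:
        raise ValueError("no codon in seq has a defined w value")
    if min(values) == 0:
        return 0.0  # one never-used codon makes the product zero
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

With a toy reference of `"GTT" * 3 + "GTA" + "AAA" * 2 + "AAG" * 2`, GTT has w = 1 and GTA has w = 1/3, so a two-codon gene `GTTGTA` scores the square root of 1/3.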
The effective number of codons (ENC)
$$\mathrm{ENC} = 2 + \frac{9}{F_2} + \frac{1}{F_3} + \frac{5}{F_4} + \frac{3}{F_6}$$
where F_i (i = 2, 3, 4, or 6) is the average probability that two randomly chosen codons for an amino acid with i codons will be identical. ENC values range from 20 (the number of amino acids), which means that the bias is at a maximum and only one codon is used from each synonymous-codon group, to 61 (the number of sense codons), which indicates no codon-usage bias.
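As a rough check on the two ENC extremes, the calculation can be sketched with the weights of the standard genetic code (2 single-codon amino acids, 9 two-fold, 1 three-fold, 5 four-fold, and 3 six-fold), following Wright's (1990) formulation. The function names are hypothetical, and the per-amino-acid estimate of F is taken here as the probability that two codons drawn without replacement are identical.

```python
def homozygosity(counts):
    """Probability that two codons drawn without replacement are identical.

    counts: occurrences of each synonymous codon of one amino acid in a gene.
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two codons for this amino acid")
    return (n * sum((x / n) ** 2 for x in counts) - 1) / (n - 1)


def enc(f2, f3, f4, f6):
    """Effective number of codons from the class-averaged F values.

    The constants are the numbers of amino acids per degeneracy class in
    the standard code: 2 one-codon, 9 two-fold, 1 three-fold, 5 four-fold,
    3 six-fold.
    """
    return 2 + 9 / f2 + 1 / f3 + 5 / f4 + 3 / f6
```

Setting every F to 1 (a single codon per amino acid) gives 20, and setting F_i = 1/i (equal usage) gives 61, matching the range quoted above. Genes that lack an entire degeneracy class need extra handling that this sketch omits.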
The genome hypothesis (Richard Grantham) All genes in a genome tend to have the same coding strategy. That is, they employ the codon catalog similarly and show similar choices between synonymous codons. Different taxa have different coding strategies.
Are there universal preferences? There are NO universally preferred or universally avoided codons. There may be some universal preferences and avoidances as far as codon neighbor pairs are concerned. For example, the pair NNG GNN, where N stands for all four possible nucleotides, seems to be preferred, while the pair NNG CNN seems to be avoided.
Biases in synonymous codon usage can be caused by: (1) mutational biases (2) selection favoring preferred codons (3) purifying selection against disfavored codons
Mutational Biases If the unequal codon-usage is due to biases in mutation patterns, then the expectation is that the magnitude and the direction of the bias will be more or less the same for all codon families and for all genes, regardless of function or expression levels.
Mutational Biases Let us assume that the mutation pattern in an organism tends to result in AT-rich sequences. Under such a mutational regime, it is expected that all four-fold degenerate codon families will exhibit a preference for codons ending in A or T. Thus, the preferred codons for valine should be GTA and GTT, and the preferred codons within the four-fold CGN block of arginine should be CGA and CGT.
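One simple way to probe this expectation is to tabulate, for each four-fold family, the fraction of codons ending in A or T; under a purely mutational explanation these fractions should be similar across families and across genes. The sketch below is illustrative only: the function name and the choice of families (the five strictly four-fold amino acids plus the CGN block of arginine) are assumptions made here.

```python
from collections import defaultdict

# First two bases of the four-fold blocks considered (standard genetic code).
FOURFOLD = {
    "GT": "Val", "GC": "Ala", "GG": "Gly",
    "CC": "Pro", "AC": "Thr", "CG": "Arg (CGN)",
}


def at_ending_fraction(seq):
    """Fraction of A/T-ending codons in each four-fold family found in seq."""
    tally = defaultdict(lambda: [0, 0])  # family -> [A/T-ending, total]
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        family = FOURFOLD.get(codon[:2])
        if family is None:
            continue  # not a four-fold block
        tally[family][1] += 1
        if codon[2] in "AT":
            tally[family][0] += 1
    return {family: at / total for family, (at, total) in tally.items()}
```

Comparing the resulting dictionaries between highly and weakly expressed genes is one way to see whether the bias is uniform, as a mutational explanation predicts, or varies with expression.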
Mutational Biases Some bacterial genomes (e.g., Mycoplasma capricolum) exhibit this type of consistent codon-usage bias.
Mutational Biases In Escherichia coli, there is no such consistent bias.
(2) positive selection favoring preferred codons (3) purifying selection against disfavored codons
(2) positive selection is expected to accelerate the rate of substitution (3) purifying selection is expected to slow down the rate of substitution There is a negative correlation between codon usage bias and rate of synonymous substitution, which is the pattern expected under purifying selection against disfavored codons.
Two selective factors have been convincingly invoked to explain codon usage bias. (1) translation optimization (2) folding stability of the mRNA
The translation efficiency of a codon is related to the relative quantity of tRNA molecules that recognize the particular codon.
Codon Usage is related to Translation Efficiency
Is codon usage bias uniform along the length of the mRNA? For many highly expressed genes, codons recognized by low-abundance tRNAs are overrepresented at the 5' end of the coding region. This pattern suggests that ribosomes translate more slowly over the initial 50 codons or so (the so-called ramp stage) and then translate the remainder of the mRNA at full speed.
What purpose does the ramp play in translation? Slowing translation elongation immediately after initiation effectively generates more uniform spacing between ribosomes further down the mRNA, which prevents ribosome congestion and translation stalling and termination.
Another potential role for the ramp involves protein folding. The length of the ramp corresponds well to the length of the polypeptide needed to fill the exit tunnel of the ribosome, so the nascent peptide chain should emerge from the ribosome as it transitions from the slow ramp stage to the fast stage of elongation. This raises the possibility that the slowdown in the ramp might increase the fraction of correctly folded product.
Folding stability of the mRNA RNA is synthesized as single strands of ribonucleotides. Intrastrand base pairing will produce two-dimensional (2D) structures.
Folding stability of the mRNA The stability of a secondary structure is quantified as the amount of free energy released or used to form it. A positive free energy means that work is required to form the structure. A negative free energy means that stored work is released when the structure forms. The more negative the free energy of a structure, the more likely is formation of that structure, because more stored energy is released.
Folding stability of the mRNA Free energies are additive, so one can determine the total free energy of a secondary structure by adding all the component free energies. We calculated the local folding energy, ΔG, along the mRNA sequence using a sliding window of 30 nucleotides (nt) in length, moving downstream from the start codon in steps of 10 nt (for a total of 13 windows). To quantify the deviation from expectation given a gene's amino-acid sequence and codon usage bias, we also calculated ΔG for 1000 permuted mRNA sequences. We obtained permuted sequences by randomly reshuffling synonymous codons within each gene. We then calculated a Z-score, ZΔG, by comparing the ΔG of the real mRNA segment to the distribution of ΔG values of the permuted sequences. ZΔG measures the extent to which local mRNA stability deviates from expectation. A positive ZΔG means that local mRNA stability is reduced, and a negative ZΔG means that it is increased. For each window, we calculated a genome-wide mean ZΔG by averaging the corresponding ZΔG values over all genes in a genome.
Folding stability of the mRNA ΔG = Local free energy of a sequence. Expectation = mean local free energy of 1000 permuted sequences. ZΔG = A measure of the extent to which a local ΔG value deviates from expectation.
Folding stability of the mRNA A positive ZΔG means that local mRNA stability is reduced. A negative ZΔG means that local mRNA stability is increased.
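The permutation scheme behind ZΔG can be sketched as below. The folding-energy calculation itself is not reproduced: `energy` is a hypothetical callable that returns ΔG for a nucleotide string, and in practice it would wrap an RNA folding program. The `toy_energy` function is only a stand-in (it counts G and C), included so the sketch runs on its own; it is not a real free-energy model. The Z-score is taken here as (ΔG_real − mean of permuted ΔG) / standard deviation of permuted ΔG, which is an assumption about the exact form used.

```python
import random
import statistics
from collections import defaultdict
from itertools import product

# Standard genetic code in TCAG order (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def shuffle_synonymous(codons, rng):
    """Reshuffle codons among the positions that encode the same amino acid.

    The protein sequence and the gene's overall codon usage are unchanged.
    """
    positions = defaultdict(list)
    for i, codon in enumerate(codons):
        positions[CODON_TABLE[codon]].append(i)
    shuffled = list(codons)
    for idx in positions.values():
        pool = [codons[i] for i in idx]
        rng.shuffle(pool)
        for i, codon in zip(idx, pool):
            shuffled[i] = codon
    return shuffled


def z_delta_g(seq, energy, start=0, window=30, n_perm=1000, seed=0):
    """Z-score of the local energy in seq[start:start+window].

    A positive value means the real window has a higher (less negative)
    energy than expected, i.e. reduced local stability.
    """
    rng = random.Random(seed)
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    real = energy(seq[start:start + window])
    permuted = []
    for _ in range(n_perm):
        s = "".join(shuffle_synonymous(codons, rng))
        permuted.append(energy(s[start:start + window]))
    sd = statistics.stdev(permuted)
    if sd == 0:
        return 0.0  # every permutation gave the same energy
    return (real - statistics.mean(permuted)) / sd


def toy_energy(segment):
    """Stand-in only: more G/C gives a more negative value. Not a real ΔG."""
    return -(segment.count("G") + segment.count("C"))
```

Calling `z_delta_g` with `start` set to 0, 10, 20, and so on up to 120 would cover the 13 windows described above, and averaging one window's values over all genes would give the genome-wide mean.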
Codon arrangement along the mRNA The arrangement of different codons along the length of the mRNA influences translation efficiency. In the autocorrelated pattern, when an amino acid recurs in the protein, there is a strong propensity to use the same codon the second time as that for the first occurrence of the amino acid. In the anticorrelated pattern, when an amino acid recurs in the protein, there is a strong tendency to use a different codon the second time from that used in the first occurrence of the amino acid.
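The two patterns can be told apart with a simple tally: for each amino acid, compare the codon at every occurrence with the codon used at the previous occurrence, and set the observed reuse fraction against what independent draws at the gene's own codon frequencies would give. This is a minimal sketch with a hypothetical function name; it assumes the standard genetic code and ignores amino acids encoded by a single codon, where reuse is forced.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in TCAG order (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_reuse(seq):
    """Return (observed, expected) fraction of same-codon reuse.

    observed: share of successive occurrences of an amino acid that use
              the same codon as the previous occurrence.
    expected: the same share if codons were drawn independently at the
              gene's own codon frequencies.
    """
    family_size = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
    by_aa = defaultdict(list)
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE.get(codon, "*")
        if aa != "*" and family_size[aa] > 1:
            by_aa[aa].append(codon)

    same = pairs = 0
    expected = 0.0
    for codons in by_aa.values():
        n = len(codons)
        if n < 2:
            continue
        pairs += n - 1
        same += sum(a == b for a, b in zip(codons, codons[1:]))
        freq = Counter(codons)
        expected += (n - 1) * sum((k / n) ** 2 for k in freq.values())
    if pairs == 0:
        raise ValueError("no amino acid recurs in seq")
    return same / pairs, expected / pairs
```

An observed value well above the expected one points toward the autocorrelated pattern, and a value well below it toward the anticorrelated pattern. For example, `GTAGTAGTAGTGGTGGTG` gives 0.8 against an expectation of 0.5, while `GTAGTGGTAGTGGTAGTG` gives 0.0 against the same expectation.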
Some organisms display biased codon usage; others do not. Certain organisms, such as the bacterium Helicobacter pylori and humans, present little evidence of translational selection, while others, such as the bacterium Escherichia coli, the yeast Saccharomyces cerevisiae, the nematode Caenorhabditis elegans, and the fly Drosophila melanogaster, show a marked codon bias due to selection.
A possible solution was suggested by dos Reis et al. (2004), who discovered that tRNA-gene redundancy and genome size are interacting forces in determining translational selection and codon-usage bias. They suggested that an optimal combination of these factors exists for which the action of translational selection is maximal.
The magnitude of selection was maximal in genomes 1-30 Mb in size that contain 150-600 tRNA-specifying genes. Both Helicobacter pylori and humans fall outside this range. The genome of Helicobacter pylori contains only 36 tRNA-coding genes (with only one tRNA gene present in two copies). The haploid genome size of humans is approximately 3,500 Mb.