More on translation
How DNA codes proteins The primary structure of each protein (the sequence of amino acids in the polypeptide chains that make up a protein) is represented as a sequence of codons. Each codon is a triplet of nucleotides. Traditionally, codons are written in the A, G, C, U alphabet of RNA (U takes the place of the T of DNA). Most organisms use the same genetic code, i.e., the same correspondence between codons and amino acids.
How DNA codes proteins There are 4 × 4 × 4 = 64 possible codons. Most amino acids can be coded by more than one codon. Three of the codons are STOP codons. They signal the end of a coding sequence. One codon (AUG) doubles as the START, or initiation, codon, which signals the beginning of a coding sequence.
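The counts on this slide can be checked with a short sketch. It assumes the standard genetic code (which, as noted above, not every organism or organelle uses) and writes amino acids in one-letter codes with `*` for STOP; the names `BASES`, `AMINO_ACIDS`, `CODON_TABLE`, `stop_codons` and `degeneracy` are chosen here for illustration only.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order
# (first base varies slowest, third base fastest). '*' marks STOP.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base U
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# The three STOP codons.
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

# How many codons code for each amino acid (and for STOP).
degeneracy = Counter(CODON_TABLE.values())

print(len(CODON_TABLE), "codons in total")
print("STOP codons:", stop_codons)
print("AUG codes for:", CODON_TABLE["AUG"])
print("codons per amino acid:", dict(degeneracy))
```

Only methionine (M) and tryptophan (W) have a single codon in this table; every other amino acid has at least two.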
URL of the day Codon Usage Database http://www.kazusa.or.jp/codon/ The frequency of codon usage in various organisms may be searched at this site.
Homework 2.1 • Look up the codon usage table for each of the following organisms:
- Mycoplasma gallisepticum R
- Haemophilus influenzae PittGG
- Homo sapiens (nuclear genome)
- Homo sapiens (mitochondrial genome)
and use it to calculate the average length of a polypeptide chain in that organism. Be sure to explain the ideas behind your calculations.
Reading frames Suppose we are given the following nucleotide sequence: 5’ … AAGTTTCCGAGCTGACGGGACT… 3’ Provided this is a protein coding region, which amino acids does it code? Let us see how we could parse the above sequence. 5’ … AAG TTT CCG AGC TGA CGG GAC T… 3’ which translates into: ... Lys Phe Pro Ser STOP
Reading frames Alternatively, we could have: 5’ … A AGT TTC CGA GCT GAC GGG ACT … 3’ which translates into: ... Ser Phe Arg Ala Asp Gly Thr … or: 5’ … AA GTT TCC GAG CTG ACG GGA CT … 3’ which translates into: … Val Ser Glu Leu Thr Gly ...
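The three parsings of the given strand can be reproduced with a small sketch. It assumes the standard genetic code, works directly in the DNA alphabet (T in place of U) to match the sequence on the slide, and stops at the first STOP codon; the function `translate` and its parameters `frame` and `to_stop` are illustrative names, not part of any library.

```python
from itertools import product

# Standard genetic code in the DNA alphabet, codons in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# One-letter code -> three-letter name, as used on the slides.
ONE_TO_THREE = dict(zip(
    "ARNDCQEGHILKMFPSTWYV*",
    "Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys Met "
    "Phe Pro Ser Thr Trp Tyr Val STOP".split(),
))

CODON_TABLE = {
    "".join(codon): ONE_TO_THREE[aa]
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0, to_stop=True):
    """Translate complete codons of seq, skipping the first `frame` bases."""
    names = []
    for i in range(frame, len(seq) - 2, 3):
        name = CODON_TABLE[seq[i:i + 3]]
        names.append(name)
        if to_stop and name == "STOP":
            break  # a STOP codon ends the coding sequence
    return names


SEQ = "AAGTTTCCGAGCTGACGGGACT"
for frame in range(3):
    print(f"frame {frame}:", " ".join(translate(SEQ, frame)))
```

Incomplete codons at either end of the fragment are simply ignored, since the neighbouring nucleotides are unknown.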
More reading frames But what if the coding is done by the other strand? The reverse complement to 5’ … AAGTTTCCGAGCTGACGGGACT … 3’ is: 5’ … AGTCCCGTCAGCTCGGAAACTT … 3’ This can be parsed as: 5’ … AGT CCC GTC AGC TCG GAA ACT T … 3’ which translates into: ... Ser Pro Val Ser Ser Glu Thr …
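Computing the reverse complement by hand is error-prone, so here is a minimal sketch of the operation: complement each base (A with T, C with G) and read the result backwards, so that the new strand is again written 5’ to 3’. The function name `reverse_complement` is an illustrative choice, and the sketch assumes an uppercase sequence containing only A, C, G and T.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


SEQ = "AAGTTTCCGAGCTGACGGGACT"
print(reverse_complement(SEQ))
```

Applying the function twice returns the original sequence, which is a quick sanity check on any hand-computed reverse complement.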
More reading frames Alternatively, we could have: 5’ … A GTC CCG TCA GCT CGG AAA CTT … 3’ which translates into: ... Val Pro Ser Ala Arg Lys Leu … or: 5’ … AG TCC CGT CAG CTC GGA AAC TT … 3’ which translates into: … Ser Arg Gln Leu Gly Asn ...
There are six different reading frames We have seen that each stretch of coding region could be translated in six different ways into amino acid sequences. These six different ways of parsing a coding sequence are called reading frames. If we search the genome for coding regions of genes, all six reading frames have to be considered.
Open reading frames (ORFs) If an organism does not have introns, each reasonably long stretch between an initiation codon and a STOP codon (called an open reading frame, or ORF) is potentially the coding region for a polypeptide chain. (But note that AUG is also the codon for methionine, so not every AUG marks the start of a coding sequence!) In organisms containing introns it is much more difficult to recognize coding regions. On top of being noncoding themselves, introns may also introduce frame shifts.
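For the intron-free case, the search for ORFs in all six reading frames can be sketched as follows. This is only an illustration under several assumptions: the standard STOP codons, the DNA alphabet (ATG for AUG), an ORF taken to begin at the first ATG after the previous STOP in the same frame, and a hypothetical threshold `min_codons` standing in for "reasonably long". The function names and the test sequence are made up for this example.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame, min_codons):
    """Yield (start, end) indices of ORFs in one frame; end includes the STOP."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == START_CODON:
            start = i  # first initiation codon since the last STOP
        elif start is not None and codon in STOP_CODONS:
            if (i - start) // 3 >= min_codons:  # codons before the STOP
                yield start, i + 3
            start = None


def find_orfs(seq, min_codons=2):
    """Return (strand, frame, orf_sequence) for all six reading frames."""
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame, min_codons):
                results.append((strand, frame, s[start:end]))
    return results


# Made-up example: a short ORF in frame 2 of the given strand.
print(find_orfs("CCATGAAATTTTAGGG"))
```

A realistic search would use a much larger `min_codons`, since short ORFs occur by chance in any sequence.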
Homework 2.2 Suppose the nucleotide sequence 5’ … AGGCTTCAAGGGT … 3’ is part of a coding region. Find its translation in all six reading frames.