
A Weighted Version of the Free Energy Ribosomal Algorithm



  1. A Weighted Version of the Free Energy Ribosomal Algorithm Using the MG1655 E. coli Genome

  2. Free Energy Ribosomal Algorithm

     [Figure: block diagram with input S, a correlator, codewords C1, C2, C3, C4, ..., C(L-N), and energies E1, E2, E3, E4, ..., E(L-N).]

     A sliding window is applied to the received noisy mRNA sequence to select sub-sequences (S) of length N and compare them with all (L-N) codewords in the codebook (L = 13). The codeword that results in the minimum distance metric is selected and the metric value is saved.

     Biologically, the ribosome achieves this by means of the complementary principle. The energetics involved in the rRNA-mRNA interaction tell the ribosome when a signal is detected and, thus, when translation should start. The codeword yielding the minimum free energy, i.e. the most complementary one, is the valid codeword. The minimum energies are stored and plotted to show the performance of the algorithm.
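The sliding-window comparison on slide 2 can be sketched in Python (the deck's own code, on slide 7, is MATLAB). This is only an illustration under stated assumptions, not the authors' implementation: the names `make_codebook`, `hamming`, and `slide` are hypothetical, the metric is a plain Hamming distance, and the codebook is assumed to be every length-N window of the 13-base sequence shown on slide 3. Under that assumption the codebook has L - N + 1 entries, whereas the slide counts L - N. The test mRNA fragment is taken from the Gene 19 sequence on slide 9.

```python
def make_codebook(template, n):
    """All length-n windows of the template sequence (assumed construction)."""
    return [template[i:i + n] for i in range(len(template) - n + 1)]


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def slide(mrna, codebook):
    """For each window of the mRNA, keep the smallest distance to any codeword."""
    n = len(codebook[0])
    return [
        min(hamming(mrna[i:i + n], c) for c in codebook)
        for i in range(len(mrna) - n + 1)
    ]


codebook = make_codebook("UAAGGAGGUGAUC", 5)      # 13-base sequence from slide 3
metrics = slide("AUAUAGGAGGCCUCGG", codebook)     # fragment of Gene 19 (slide 9)
best = min(range(len(metrics)), key=metrics.__getitem__)
print(len(codebook), metrics, best)
```

The window with the smallest metric marks the candidate signal position; the later slides replace the Hamming distance with a free energy metric.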

  3. For N = 5 (codeword length = window length)

     AGAGAAAUAAAAAGUGAAACAUCUGC … GCGUCCAUCAGUUUGA

     In this model, the method of free energy doublets presented in [12] is adopted to calculate a free energy distance metric in kcal/mol instead of minimum distance (see Table).

     5’ U A A G G A G G U G A U C ... 3’

     [12] D. Rosnick, Free Energy Periodicity and Memory Model for Genetic Coding. PhD thesis, North Carolina State University, Raleigh, 2001.

  4. E = 0 (Initialization)

     U A C G G : 5-base string of mRNA
     U A A G G : Codeword # 1

     mRNA doublet   Codeword doublet   Result     Update
     U A            U A                Match      E = E – 1.1
     A C            A A                Mismatch   E = E
     C G            A G                Mismatch   E = E
     G G            G G                Match      E = E – 2.9

     Total: E = – 1.1 – 0 – 0 – 2.9 = – 4

  5. U A A G G : 5-base string of mRNA
     U A A G G : Codeword # 1

     mRNA doublet   Codeword doublet   Result   Update
     U A            U A                Match    E = E – 1.1
     A A            A A                Match    E = E – 0.9
     A G            A G                Match    E = E – 2.3
     G G            G G                Match    E = E – 2.9

     Total: E = – 1.1 – 0.9 – 2.3 – 2.9 = – 7.2
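The two worked examples on slides 4 and 5 can be reproduced with a few lines of Python. This is a sketch, not the authors' code: `doublet_energy` is a hypothetical helper, and `DOUBLET_ENERGY` holds only the four doublet values that appear on those slides, since the full table from [12] is not part of this transcript.

```python
# Doublet energies (kcal/mol) that appear in the worked examples on slides 4 and 5.
# The full table from [12] is not reproduced here, so other doublets are absent.
DOUBLET_ENERGY = {"UA": -1.1, "AA": -0.9, "AG": -2.3, "GG": -2.9}


def doublet_energy(window, codeword):
    """Sum the energies of the doublets where window and codeword agree."""
    energy = 0.0
    for i in range(len(codeword) - 1):
        if window[i:i + 2] == codeword[i:i + 2]:
            energy += DOUBLET_ENERGY[codeword[i:i + 2]]
    return round(energy, 1)


print(doublet_energy("UACGG", "UAAGG"))  # slide 4 example
print(doublet_energy("UAAGG", "UAAGG"))  # slide 5 example
```

A matched doublet outside the four listed here raises a `KeyError`, which is deliberate: the missing values would have to come from the table in [12].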

  6. Free Energy Ribosome Decoding Algorithm

  7. L = length(c);
     N = length(c{1});
     E1(1:L) = 0; % Initialization
     E2(1:L) = 0; % Initialization
     for k = 1:L
         match = 0;
         for n = 1:N-1
             if isequal({c{k}(n:n+1)},{s{1}(n:n+1)})
                 match = match + 1;
                 FED = c{k}(n:n+1);
                 E1(k) = E1(k) + uniform_energy(FED);
                 if match == 1, factor = 1; end
                 if match == 2, factor = 2; end
                 if match == 3, factor = 5; end
                 if match == 4, factor = 10; end
                 E2(k) = E2(k) + factor * uniform_energy(FED);
             else
                 E1(k) = E1(k);
                 E2(k) = E2(k);
             end
         end
     end
     E_min1 = min(E1);
     E_min2 = min(E2);
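For readers without MATLAB, the inner loop of slide 7 can be ported to Python roughly as follows. Several things here are assumptions: `uniform_energy` is not shown in the slides, so it is replaced by a partial lookup table built from the values on slides 4 and 5; `energies` and `FACTORS` are hypothetical names; and, as in the MATLAB code, the weight simply stays at its last value once more than four doublets have matched.

```python
# Partial doublet table (kcal/mol), taken from the worked examples on slides 4 and 5.
DOUBLET_ENERGY = {"UA": -1.1, "AA": -0.9, "AG": -2.3, "GG": -2.9}
# Weight applied to the 1st, 2nd, 3rd and 4th matching doublet, as on slide 7.
FACTORS = {1: 1, 2: 2, 3: 5, 4: 10}


def energies(window, codeword):
    """Return (E1, E2): unweighted and weighted energy for one codeword."""
    e1 = e2 = 0.0
    match = 0
    factor = 1
    for n in range(len(codeword) - 1):
        fed = codeword[n:n + 2]            # free energy doublet of the codeword
        if fed == window[n:n + 2]:
            match += 1
            factor = FACTORS.get(match, factor)  # beyond 4 matches, keep the last factor
            e1 += DOUBLET_ENERGY[fed]
            e2 += factor * DOUBLET_ENERGY[fed]
    return round(e1, 1), round(e2, 1)


print(energies("UACGG", "UAAGG"))
print(energies("UAAGG", "UAAGG"))
```

The weighting counts matches cumulatively rather than consecutively, so a codeword with many matching doublets is rewarded far more than the unweighted sum would suggest. Taking the minimum of E1 and E2 over all codewords then gives `E_min1` and `E_min2`.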

  8. Remarks: Gene 1 in MG1655 E. Coli Genome: AUGAAACGCAUUAGCACCACCAUUACCACCACCAUCACCAUUACCACAGGUAACGGUGCGGGCUGA Why does not the ribosome stop translating at UAA above? Gene 2 in MG1655 E. Coli Genome: AUGCGAGUGUUGAAGUUCGGCGGUACAUCAGUGGCAAAUGCAGAACGUUUUCUGCGUGUUGCCGAUAUUCUGGAAAGCAAUGCCAGGCAGGGGCAGGUGGCCACCGUCCUCUCUGCCCCCGCCAAAAUCACCAACCACCUGGUGGCGAUGAUUGAAAAAACCAUUAGCGGCCAGGAUGCUUUACCCAAUAUCAGCGAUGCCGAACGUAUUUUUGCCGAACUUUUGACGGGACUCGCCGCCGCCCAGCCGGGGUUCCCGCUGGCGCAAUUGAAAACUUUCGUCGAUCAGGAAUUUGCCCAAAUAAAACAUGUCCUGCAUGGCAUUAGUUUGUUGGGGCAGUGCCCGGAUAGCAUCAACGCUGCGCUGAUUUGCCGUGGCGAGAAAAUGUCGAUCGCCAUUAUGGCCGGCGUAUUAGAAGCGCGCGGUCACAACGUUACUGUUAUCGAUCCGGUCGAAAAACUGCUGGCAGUGGGGCAUUACCUCGAAUCUACCGUCGAUAUUGCUGAGUCCACCCGCCGUAUUGCGGCAAGCCGCAUUCCGGCUGAUCACAUGGUGCUGAUGGCAGGUUUCACCGCCGGUAAUGAAAAAGGCGAACUGGUGGUGCUUGGACGCAACGGUUCCGACUACUCUGCUGCGGUGCUGGCUGCCUGUUUACGCGCCGAUUGUUGCGAGAUUUGGACGGACGUUGACGGGGUCUAUACCUGCGACCCGCGUCAGGUGCCCGAUGCGAGGUUGUUGAAGUCGAUGUCCUACCAGGAAGCGAUGGAGCUUUCCUACUUCGGCGCUAAAGUUCUUCACCCCCGCACCAUUACCCCCAUCGCCCAGUUCCAGAUCCCUUGCCUGAUUAAAAAUACCGGAAAUCCUCAAGCACCAGGUACGCUCAUUGGUGCCAGCCGUGAUGAAGACGAAUUACCGGUCAAGGGCAUUUCCAAUCUGAAUAACAUGGCAAUGUUCAGCGUUUCUGGUCCGGGGAUGAAAGGGAUGGUCGGCAUGGCGGCGCGCGUCUUUGCAGCGAUGUCACGCGCCCGUAUUUCCGUGGUGCUGAUUACGCAAUCAUCUUCCGAAUACAGCAUCAGUUUCUGCGUUCCACAAAGCGACUGUGUGCGAGCUGAACGGGCAAUGCAGGAAGAGUUCUACCUGGAACUGAAAGAAGGCUUACUGGAGCCGCUGGCAGUGACGGAACGGCUGGCCAUUAUCUCGGUGGUAGGUGAUGGUAUGCGCACCUUGCGUGGGAUCUCGGCGAAAUUCUUUGCCGCACUGGCCCGCGCCAAUAUCAACAUUGUCGCCAUUGCUCAGGGAUCUUCUGAACGCUCAAUCUCUGUCGUGGUAAAUAACGAUGAUGCGACCACUGGCGUGCGCGUUACUCAUCAGAUGCUGUUCAAUACCGAUCAGGUUAUCGAAGUGUUUGUGAUUGGCGUCGGUGGCGUUGGCGGUGCGCUGCUGGAGCAACUGAAGCGUCAGCAAAGCUGGCUGAAGAAUAAACAUAUCGACUUACGUGUCUGCGGUGUUGCCAACUCGAAGGCUCUGCUCACCAAUGUACAUGGCCUUAAUCUGGAAAACUGGCAGGAAGAACUGGCGCAAGCCAAAGAGCCGUUUAAUCUCGGGCGCUUAAUUCGCCUCGUGAAAGAAUAUCAUCUGCUGAACCCGGUCAUUGUUGACUGCACUUCCAGCCAGGCAGUGGCGGAUCAAUAUGCCGACUUCCUGCGCGAAGGUUUCCACGUUGUCACGCCGAACAAAAAGGCCAACACCUCGUCGAUGGAUUACUACCAUCAGUUGCGUUAUGCGGCGGAAAAAUC
GCGGCGUAAAUUCCUCUAUGACACCAACGUUGGGGCUGGAUUACCGGUUAUUGAGAACCUGCAAAAUCUGCUCAAUGCAGGUGAUGAAUUGAUGAAGUUCUCCGGCAUUCUUUCUGGUUCGCUUUCUUAUAUCUUCGGCAAGUUAGACGAAGGCAUGAGUUUCUCCGAGGCGACCACGCUGGCGCGGGAAAUGGGUUAUACCGAACCGGACCCGCGAGAUGAUCUUUCUGGUAUGGAUGUGGCGCGUAAACUAUUGAUUCUCGCUCGUGAAACGGGACGUGAACUGGAGCUGGCGGAUAUUGAAAUUGAACCUGUGCUGCCCGCAGAGUUUAACGCCGAGGGUGAUGUUGCCGCUUUUAUGGCGAAUCUGUCACAACUCGACGAUCUCUUUGCCGCGCGCGUGGCGAAGGCCCGUGAUGAAGGAAAAGUUUUGCGCUAUGUUGGCAAUAUUGAUGAAGAUGGCGUCUGCCGCGUGAAGAUUGCCGAAGUGGAUGGUAAUGAUCCGCUGUUCAAAGUGAAAAAUGGCGAAAACGCCCUGGCCUUCUAUAGCCACUAUUAUCAGCCGCUGCCGUUGGUACUGCGCGGAUAUGGUGCGGGCAAUGACGUUACAGCUGCCGGUGUCUUUGCUGAUCUGCUACGUACCCUCUCAUGGAAGUUAGGAGUCUGA
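One way to examine the question asked about Gene 1 on slide 8 is to check the reading frame. The sketch below is an illustration added here and not part of the original slides: it splits Gene 1 into codons starting at the AUG and checks whether the UAA triplet falls on a codon boundary. The variable names are hypothetical; the stop codon set (UAA, UAG, UGA) is the standard one.

```python
GENE1 = "AUGAAACGCAUUAGCACCACCAUUACCACCACCAUCACCAUUACCACAGGUAACGGUGCGGGCUGA"
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Read the gene in frame, three bases at a time, starting from the AUG.
codons = [GENE1[i:i + 3] for i in range(0, len(GENE1), 3)]
in_frame_stops = [i for i, c in enumerate(codons) if c in STOP_CODONS]

# Offset of the first UAA triplet; a remainder of 0 would mean it is in frame.
uaa_offset = GENE1.find("UAA")

print(codons)
print(in_frame_stops)
print(uaa_offset, uaa_offset % 3)
```

In this sequence the UAA straddles two codons (GGU and AAC), so a ribosome reading in frame from the AUG never sees it as a codon; the only in-frame stop is the final UGA.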

  9. Gene 19 in MG1655 E.Coli Genome: CUCGUUGUGGAGAAUAACAAAAAUGGUCAUCUGGAGCUUACAGGUGGCCAUUCGUGGGACAGUAUCCCUGACAGCCUACAAAACGCAAUUGAAGAACGCGAGGCAUCGUCUUAACGAGGCACCGAGGCGUCGCAUUCUUCAGAUGGUUCAACCCUUAAGUUAGCGCUUAUGGGAUCACUCCCCGCCGUUGCUCUUACUCGGAUUCGUAAGCCGUGAAAACAGCAACCUCCGUCUGGCCAGUUCGGAUGUGAACCUCACAGAGGUCUUUUCUCGUUACCAGCGCCGCCACUACGGCGGUGAUACAGAUGACGAUCAGGGCGACAAUCAUCGCCUUAUGCUGCUUCAUUGCUCUCUUCUCCUUGACCUUACGGUCAGUAAGAGGCACUCUACAUGUGUUCAGCAUAUAGGAGGCCUCGGGUUGAUGGUAAAAUAUCACUCGGGGCUUUUCUCUAUCUGCCGUUCAGCUAAUGCCUGAGACAGACAGCCUCAAGCACCCGCCGCUAUUAUAUCGCUCUCUUUAACCCAUUUUGUUUUAUCGAUUCUAAUCCUGAAGACGCCUCGCAUUUUUGUGGCGUAAUUUUUUAAUGAUUUAAUUAUUUAACUUUAAUUUAUCUCUUCAUCGCAAUUAUUGACGACAAGCUGGAUUAUUUUUGAAAUAUUGGCCUAACAAGCAUCGCCGACUGACAACAAAUUAAUUAUUACUUUUCCUAAUUAAUCCCUCAGGAAUCCUCACCUUAAGCUAUGAUUAUCUAGGCUUAGGGUCACUCGUGAGCGCUUACAGCCGUCAAAAACGCAUCUCACCGCUGAUGGCGCAAAUUCUUCAAUAGCUCGUAAAAAACGAAUUAUUCCUACACUAUAAUCUGAUUUUAACGAUGAUUCGUGCGGGGUAAAAUAGUAAAAACGAUCUAUUCACCUGAAAGAGAAAUAAAAAGUGAAACAUCUGCAUCGAUUCUUUAGCAGUGAUGCCUCGGGAGGCAUUAUUCUUAUCAUUGCCGCUAUCCUGGCGAUGAUUAUGGCCAACAGCGGCGCAACCAGUGGAUGGUAUCACGACUUUCUGGAGACGCCGGUUCAGCUCCGGGUUGGUUCACUCGAAAUCAACAAAAACAUGCUGUUAUGGAUAAAUGACGCGCUGAUGGCGGUAUUUUUCCUGUUAGUCGGUCUGGAAGUUAAACGUGAACUGAUGCAAGGAUCGCUAGCCAGCUUACGCCAGGCCGCAUUUCCAGUUAUCGCCGCUAUUGGUGGGAUGAUUGUGCCGGCAUUACUCUAUCUGGCUUUUAACUAUGCCGAUCCGAUUACCCGCGAAGGGUGGGCGAUCCCGGCGGCUACUGACAUUGCUUUUGCACUUGGUGUACUGGCGCUGUUGGGAAGUCGUGUUCCGUUAGCGCUGAAGAUCUUUUUGAUGGCUCUGGCUAUUAUCGACGAUCUUGGGGCCAUCAUUAUCAUCGCAUUGUUCUACACUAAUGACUUAUCGAUGGCCUCUCUUGGCGUCGCGGCUGUAGCAAUUGCGGUACUCGCGGUAUUGAAUCUGUGUGGUGCACGCCGCACGGGCGUCUAUAUUCUUGUUGGCGUGGUGUUGUGGACUGCGGUGUUGAAAUCGGGGGUUCACGCAACUCUGGCGGGGGUAAUUGUCGGCUUCUUUAUUCCUUUGAAAGAGAAGCAUGGGCGUUCUCCAGCGAAGCGACUGGAGCAUGUGUUGCACCCGUGGGUGGCGUAUCUGAUUUUGCCGCUGUUUGCAUUUGCUAAUGCUGGCGUUUCACUGCAAGGCGUCACGCUGGAUGGCUUGACCUCCAUUCUGCCAUUGGGGAUCAUCGCUGGCUUGCUGAUUGGCAAACCGCUGGGGAUUAGUCUGUUCUGCUGGUUGGCGCUGCGUUUGAAACUGGCGCAUCUGCCUGAGGGAACGACUUAUCAGCAAAUUAUGGUGGUGGGGAUCCUGUGCGGUAUCGGUUUUACUAUGUCUAUCUU
UAUUGCCAGCCUGGCCUUUGGUAGCGUAGAUCCAGAACUGAUUAACUGGGCGAAACUCGGUAUCCUGGUCGGUUCUAUCUCUUCGGCGGUAAUUGGAUACAGCUGGUUACGCGUUCGUUUGCGUCCAUCAGUUUGACAGGACGGUUUACCGGGGAGCCAUAAACGGCUCCCUUUUCAUUGUUAUCAGGGAGAGAA SD = 16963 IC = 17489 TC = 18655 This sequence starts at 16558 and ends at 18714

  10. Positions of the sequence start, SD, IC, TC, and sequence end:

                                    Start   SD      IC      TC      End
      Genome position               16558   16963   17489   18655   18714
      Offset in extracted sequence  0       406     932     2098    2157
      Position in 500-base sequence 0       90      101     398     500

      A 500-base-long sequence with the SD set at position 90, the initiation codon at position 101, and the termination codon at position 398:

      GGCGACAAUCAUCGCCUUAUGCUGCUUCAUUGCUCUCUUCUCCUUGACCUUACGGUCAGUAAGAGGCACUCUACAUGUGUUCAGCAUAUAGGAGGCCUCGGUGAAACAUCUGCAUCGAUUCUUUAGCAGUGAUGCCUCGGGAGGCAUUAUUCUUAUCAUUGCCGCUAUCCUGGCGAUGAUUAUGGCCAACAGCGGCGCAACCAGUGGAUGGUAUCACGACUUUCUGGAGACGCCGGUUCAGCUCCGGGUUGGUUCACUCGAAAUCAACAAAAACAUGCUGUUAUGGAUAAAUGACGCGCUGAUGGCGGUAUUUUUCCUGUUAGUCGGUCUGGAAGUUAAACGUGAACUGAUGCAAGGAUCGCUAGCCAGCUUACGCCAGGCCGCAUUUCCAGCAGUUUGACAGGACGGUUUACCGGGGAGCCAUAAACGGCUCCCUUUUCAUUGUUAUCAGGGAGAGAAAUGAGCAUGUCUCAUAUCAAUUACAACCACUUGUAUUACUUCUGG
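The second row of numbers on slide 10 (406, 932, 2098, 2157) is consistent with converting each genome coordinate to a 1-based position inside the extracted region. The sketch below assumes that is how the offsets were derived; `relative_position` is a hypothetical helper name, and the coordinates are the ones quoted for Gene 19.

```python
def relative_position(genome_coord, region_start):
    """1-based position of a genome coordinate inside an extracted region."""
    return genome_coord - region_start + 1


REGION_START = 16558  # first base of the extracted Gene 19 region
features = {"SD": 16963, "IC": 17489, "TC": 18655, "end": 18714}

relative = {name: relative_position(coord, REGION_START)
            for name, coord in features.items()}
print(relative)
```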

  11. Gene 129 in MG1655 E.Coli Genome: GGCUAUUUCCUCUCCUCUGGAUUUGGGGGAGAGGAGUUUUGACGGCUAUCACCCUUUAUCAACAAUGGUCAGGGUAGACUGAUUUUCGGCUAAGGAGGAAGGCGAUGUUAGGUUGGGUAAUUACCUGUCACGAUGACCGGGCGCAAGAGAUACUGGAUGCGCUGGAGAAAAAACAUGGGGCACUUCUUCAGUGCCGGGCCGUGAAUUUCUGGCGCGGAUUAAGCUCUAAUAUGCUCAGCCGCAUGAUGUGCGAUGCUCUGCAUGAAGCGGACUCUGGUGAGGGUGUCAUCUUCUUAACCGAUAUAGCCGGAGCGCCACCGUAUCGCGUGGCUUCAUUAUUAAGCCACAAACACUCCCGUUGCGAAGUGAUUUCUGGUGUCACGUUACCGUUAAUUGAACAGAUGAUGGCUUGCCGUGAAACCAUGACCAGUUCAGAGUUUCGCGAGCGUAUUGUCGAACUGGGUGCGCCGGAGGUGAGUAGUCUUUGGCACCAACAACAAAAAAAUCCGCCUUUCGUCCUCAAACAUAAUUUGUAUGAGUAUUAACCCGCGAUUCUGAUGGCGCUUUUGCUACAAUAAAAGCGUUGUUUCACCCUCGGUUAUUUUUUCA SD = 144565 IC = 144577 TC = 145017 This sequence starts at 144473 and ends at 145081

  12. N = 5

  13. N = 6

  14. N = 7

  15. N = 8

  16. N = 9

  17. N = 10

  18. N = 11

  19. Remarks

      • Number of genes in the MG1655 E. coli strain having the Shine-Dalgarno sequence in the 5’ to 3’ direction (out of 2094 genes):
        - “AGGAGG”: 90 positions (4.30%)
        - “AGGAG”: 454 positions (21.68%)
        - “GGAGG”: 335 positions (16.00%)
        - “GAGG”: 1313 positions (62.70%)
        - “GGAG”: 1518 positions (72.49%)
        - “AGGA”: 1836 positions (87.68%)
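The percentages on slide 19 follow directly from the counts and the total of 2094 genes. The sketch below recomputes them and shows how such a count could be obtained from a list of gene sequences. `count_genes_with_motif` is a hypothetical helper, and the simple substring test is an assumption about how the original counts were made.

```python
TOTAL_GENES = 2094
counts = {"AGGAGG": 90, "AGGAG": 454, "GGAGG": 335,
          "GAGG": 1313, "GGAG": 1518, "AGGA": 1836}

# Share of genes containing each motif, in percent.
percent = {motif: round(100 * n / TOTAL_GENES, 2) for motif, n in counts.items()}
print(percent)


def count_genes_with_motif(genes, motif):
    """Number of sequences that contain the motif at least once."""
    return sum(motif in gene for gene in genes)


# Toy input, not real genome data.
print(count_genes_with_motif(["AAGGAGGU", "CCCC"], "GGAGG"))
```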

  20. SD = 'GGAGG', a = 3. Plot annotations: 'UCUCUGGAGGGUGUUU' (positions 85 to 100); 'GGUCUGGUGAU' (positions 335 to 345).

  21. SD = 'GGAGG', a = 2. Plot annotations: 'UCUCUGGAGGGUGUUU' (positions 85 to 100); 'GGUCUGGUGAU' (positions 335 to 345).

  22. SD = 'AGGAG', a = 2.5. Plot annotations: 'CUGGGAUGGAGGUCAC' (positions 235 to 250); 'GAAAAAGGAGAAAUUC' (positions 85 to 100).

  23. SD = 'AGGAG', a = 2.5. Plot annotations: 'GAGCGAGGAGAACCGU' (positions 85 to 100); 'CUCACAGGAGC' (positions 40 to 50).

  24. SD = 'GAGG', a = 2.5. Plot annotations: 'GGGAAGAGGUAGGGGG' (positions 85 to 100); 'UUCAUAAGGAU' (positions 65 to 75).

  25. SD = 'GGAG', a = 2.5. Plot annotations: 'AUUAUGGAGAAAAAUG' (positions 85 to 100); 'CUGGGAUGGAGGUCAC' (positions 235 to 250).

  26. Shine-Dalgarno Sequence

      • The Shine-Dalgarno Sequence (AGGAGGU) is the signal for initiation of protein biosynthesis in bacterial mRNA. It is located 5' of the first coding AUG, and consists primarily, but not exclusively, of purines.
      • The complementary sequence (ACCUCCU), rich in pyrimidines, is called the Anti-Shine-Dalgarno Sequence and is located at the 3' end of the 16S rRNA in the ribosome.
      • Mutations in the Shine-Dalgarno Sequence can reduce translation. This reduction is due to a reduced mRNA-ribosome pairing efficiency, as evidenced by the fact that complementary mutations in the Anti-Shine-Dalgarno Sequence can restore translation.
      • The role of this sequence was first proposed by Australian scientists John Shine and Lynn Dalgarno.
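The relationship between the two sequences on slide 26 is a reverse complement: pair each base with its partner and read the result 5' to 3'. A minimal Python check follows; `reverse_complement` is a hypothetical helper that uses only standard Watson-Crick RNA pairing.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Complement each base and reverse, so the result also reads 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


print(reverse_complement("AGGAGGU"))
```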

  27. Shine-Dalgarno sequence vs. ribosomal S1 protein

      • In Gram-negative bacteria, however, the presence of a Shine-Dalgarno sequence is not obligatory for the ribosome to locate the initiator codon, since deletion of the Anti-Shine-Dalgarno sequence from 16S rRNA does not lead to translation initiation at non-authentic sites.
      • Moreover, numerous prokaryotic mRNAs do not possess Shine-Dalgarno sequences at all.
      • What principally attracts the ribosome to the mRNA initiation region is apparently ribosomal protein S1, which binds to AU-rich sequences found in many prokaryotic mRNAs 15-30 nucleotides upstream of the start codon.
      • It should be noted that S1 is only present in Gram-negative bacteria, being absent from Gram-positive species.

  28. Hi, Mohammad,

      As for the initiation of E. coli translation, the small subunit of the ribosome binds to a site "upstream" (on the 5' side) of the start of the message. It proceeds downstream (5' -> 3') until it encounters the start codon AUG. (The region between the cap and the AUG is known as the 5'-untranslated region [5'-UTR].) Here it is joined by the large subunit and a special initiator tRNA. The initiator tRNA binds to the P site (shown in pink) on the ribosome.

      There's a book called "Gene function: E. coli and its heritable elements", written by Robert E. Glass. It's an old book, but some fundamental ideas from it are rather illuminating. As for new information, I suggest you turn to papers (especially reviews) instead of books.

      Maybe we can schedule a meeting sometime, if you like.

      Siyun

  29. I also found another target you might be interested in:

      Expression of the pyrC gene in Escherichia coli K-12 is regulated by a translational control mechanism. The pyrC ribosome binding site is unusual in that it contains two potential SD sequences, designated SD1 and SD2, which are located 11 and 4 nucleotides upstream of the translational initiation codon, respectively. Mutations that inactivate either SD1 or SD2 were constructed and incorporated separately into a pyrC::lacZ protein fusion. The effects of the mutations on pyrC::lacZ expression, regulation, and transcript levels were determined. The results indicate that SD1 is the only functional pyrC SD sequence. The SD2 mutation did cause a small reduction in expression, but this effect appeared to be due to a decrease in transcript stability.

      Below is the nucleotide sequence of the translational initiation region. The ATG translational initiation codon is enclosed in a box. SD1 and SD2 are underlined.
