BioPhysics 101 Assignment 3B
Anugraha (ANU) Raman
Answers
• Problems 1, 2, 3 answers …. Page 2
• Problem 3 answer continued …. Page 3
• Problem 4 answer …. Page 4
Screenshots
• Screenshot for problems 1, 2, 3 output HTML file …. Page 6
• Screenshot for problem 4 output plot …. Page 7
• Screenshot for problem 4 output HTML file …. Page 8
BioPhysics 101 Assignment 3B: Problems 1, 2, 3 answers
Problem 1 Ans: GC % is: 52.9411764706
Problem 2 Ans: DNA reverse complement of p53seg is:
caccacgttggccacgctggtctggaactcctatcctcaagtaatccgcccgcctcggcctcccaaagtgcaggcgtgagccaaagtgcaggcgtgagccacagcgcccatcctgattccattctatatgaagttctccaacaggcaaaatggttatggagatcaaaataaaggtggggtcgggaatcgactgggaagagacgtgatgaaacgtttctgggacgatgaaaagggtctgtgacttggtaggcatcacggagcggttaggggccaaaactcatcttcctgtgcacttgctgtgtgcactggcgctgtgtgtaaatgccacctcgatttaggaaaaagatgacgtaagtacggcacaaagtggccggtacgcggcaggtgcatgggaagaaactgcggaatgaaacaaccgcgagctaagagatggggcagcgggagaaatgaattcgagttccgcctcctaccaggaagaaccggctcgggccgagggctgcacggaggaccacacggacgcctgcgggcccgccccttccgcttcacgacgttcagcctgcgtctggaactggaatggcctagcccaaagctagataacaggtagattgtttttcccgacaaattatcaaacgacccatcattgcactctttcaaaatttcattctcagacgtaccattcttttttttttttctccgggaagatgagatatgctcattcttgaaagtgcctccgggcttgccttctgcacacttctttccctccctgtctacgccatggtagcgtccgcctaggttgcaggcgacccggggggtggggcacaccattcaaagaaggggagggattgaggtttgcatcaaaacaaatacccctgcctttgcaaaggccataactaagtaatccagaaaaagaaatgcaggcggagaatagcagcctccctctgccaagtaagaggaaccggcctaaaggacattttctctctctctcctcccctctcatcgggtgaatagtgagctgctccg
Problem 3* Ans:
(+1) frame translation is:
RSSSLFTR*EGRRERENVL*AGSSYLAEGGCYSPPAFLFLDYLVMAFAKAGVFVLMQTSIPPLL*MVCPTPRVACNLGGRYHGVDREGKKCAEGKPGGTFKNEHISSSRRKKKKNGTSENEILKECNDGSFDNLSGKTIYLLSSFGLGHSSSRRRLNVVKRKGRARRRPCGPPCSPRPEPVLPGRRRNSNSFLPLPHLLARGCFIPQFLPMHLPRTGHFVPYLRHLFPKSRWHLHTAPVHTASAQEDEFWPLTAP*CLPSHRPFSSSQKRFITSLPSRFPTPPLF*SP*PFCLLENFI*NGIRMGAVAHACTLAHACTLGGRGGRIT*G*EFQTSVANVV
(+2) frame translation is:
GAAHYSPDERGGEREKMSFRPVPLTWQREAAILRLHFFFWIT*LWPLQRQGYLF*CKPQSLPFFEWCAPPPGSPAT*ADATMA*TGRERSVQKASPEALSRMSISHLPGEKKKRMVRLRMKF*KSAMMGRLIICREKQSTCYLALG*AIPVPDAG*TS*SGRGGPAGVRVVLRAALGPSRFFLVGGGTRIHFSRCPIS*LAVVSFRSFFPCTCRVPATLCRTYVIFFLNRGGIYTQRQCTQQVHRKMSFGP*PLRDAYQVTDPFHRPRNVSSRLFPVDSRPHLYFDLHNHFACWRTSYRMESGWALWLTPALWLTPALWEAEAGGLLEDRSSRPAWPTW?
(+3) frame translation is:
EQLTIHPMRGEERERKCPLGRFLLLGRGRLLFSACISFSGLLSYGLCKGRGICFDANLNPSPSLNGVPHPPGRLQPRRTLPWRRQGGKEVCRRQARRHFQE*AYLIFPEKKKKEWYV*E*NFERVQ*WVV**FVGKNNLPVI*LWARPFQFQTQAERREAEGAGPQASVWSSVQPSARAGSSW*EAELEFISPAAPSLSSRLFHSAVSSHAPAAYRPLCAVLTSSFS*IEVAFTHSASAHSKCTGR*VLAPNRSVMPTKSQTLFIVPETFHHVSSQSIPDPTFILISITILPVGELHIEWNQDGRCGSRLHFGSRLHFGRPRRADYLRIGVPDQRGQRG?
*Please note: "?" represents an insufficient number of bases to complete the translation of the final codon to an amino acid.
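The original script is not shown in the slides, so the following is only a minimal sketch of how the Problem 1 and Problem 2 answers could be computed in plain Python. The function names `gc_percent` and `reverse_complement` are hypothetical, and the sketch assumes the sequence contains only a/c/g/t.

```python
# Hypothetical helpers; not the author's script.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def gc_percent(seq):
    """Return the percentage of g and c bases in seq (case-insensitive)."""
    seq = seq.lower()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("g") + seq.count("c")) / len(seq)


def reverse_complement(seq):
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


# Last 24 bases of the reverse complement listed in the Problem 2 answer.
tail = "tcgggtgaatagtgagctgctccg"
# Reverse-complementing them recovers the first 24 bases of p53seg.
print(reverse_complement(tail))
print(gc_percent(tail))
```

For a 1020-base sequence, the reported 52.9411764706 corresponds to 540 g or c bases (540 / 1020).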
BioPhysics 101 Assignment 3B: Problem 3 answer continued
Problem 3* Ans. (continued):
(-1) frame translation is:
HHVGHAGLELLSSSNPPASASQSAGVSQSAGVSHSAHPDSILYEVLQQAKWLWRSK*RWGRESTGKRRDETFLGR*KGSVTW*ASRSG*GPKLIFLCTCCVHWRCV*MPPRFRKKMT*VRHKVAGTRQVHGKKLRNETTAS*EMGQREK*IRVPPPTRKNRLGPRAARRTTRTPAGPPLPLHDVQPASGTGMA*PKAR*QVDCFSRQIIKRPIIALFQNFILRRTILFFFSPGR*DMLILESASGLAFCTLLSLPVYAMVASA*VAGDPGGGAHHSKKGRD*GLHQNKYPCLCKGHN*VIQKKKCRRRIAASLCQVRGTGLKDIFSLSPPLSSGE**AAP
(-2) frame translation is:
TTLATLVWNSYPQVIRPPRPPKVQA*AKVQA*ATAPILIPFYMKFSNRQNGYGDQNKGGVGNRLGRDVMKRFWDDEKGL*LGRHHGAVRGQNSSSCALAVCTGAVCKCHLDLGKR*RKYGTKWPVRGRCMGRNCGMKQPRAKRWGSGRNEFEFRLLPGRTGSGRGLHGGPHGRLRARPFRFTTFSLRLELEWPSPKLDNR*IVFPDKLSNDPSLHSFKISFSDVPFFFFFLREDEICSFLKVPPGLPSAHFFPSLSTPW*RPPRLQATRGVGHTIQRRGGIEVCIKTNTPAFAKAITK*SRKRNAGGE*QPPSAK*EEPA*RTFSLSLLPSHRVNSELL?
(-3) frame translation is:
PRWPRWSGTPILK*SARLGLPKCRREPKCRREPQRPS*FHSI*SSPTGKMVMEIKIKVGSGIDWEET**NVSGTMKRVCDLVGITERLGAKTHLPVHLLCALALCVNATSI*EKDDVSTAQSGRYAAGAWEETAE*NNRELRDGAAGEMNSSSASYQEEPARAEGCTEDHTDACGPAPSASRRSACVWNWNGLAQS*ITGRLFFPTNYQTTHHCTLSKFHSQTYHSFFFFSGKMRYAHS*KCLRACLLHTSFPPCLRHGSVRLGCRRPGGWGTPFKEGEGLRFASKQIPLPLQRP*LSNPEKEMQAENSSLPLPSKRNRPKGHFLSLSSPLIG*IVSCS?
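The six translations above can be reproduced with a sketch along the following lines. It assumes the standard genetic code and the "?" convention from the note (a trailing "?" whenever one or two bases are left over). The names `translate` and `six_frames` are hypothetical, and "X" for codons containing anything other than a/c/g/t is an assumption of this sketch, not something the slides specify.

```python
# Standard genetic code, built from the conventional t/c/a/g ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("acgt", "tgca")


def translate(seq, offset=0):
    """Translate seq starting at `offset`; append '?' for an incomplete last codon."""
    seq = seq.lower()[offset:]
    leftover = len(seq) % 3
    usable = len(seq) - leftover
    # "X" marks codons with unexpected characters (assumption of this sketch).
    protein = "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )
    return protein + ("?" if leftover else "")


def six_frames(seq):
    """Return the +1..+3 and -1..-3 frame translations keyed by frame label."""
    reverse = seq.lower().translate(COMPLEMENT)[::-1]
    frames = {}
    for k in range(3):
        frames["+%d" % (k + 1)] = translate(seq, k)
        frames["-%d" % (k + 1)] = translate(reverse, k)
    return frames


# First 24 bases of p53seg (the reverse complement of the tail of the Problem 2 answer).
for label, protein in sorted(six_frames("cggagcagctcactattcacccga").items()):
    print(label, protein)
```

With a 1020-base input, frame +1 has exactly 340 codons, while +2 and +3 leave two and one bases over, which is why only those translations end in "?".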
BioPhysics 101 Assignment 3B: Problem 4 answer
Problem 4 Ans:
• The script applies single base-pair mutations to the p53seg gene at a rate of 1% (i.e. ~1 mutation every 100 base pairs). It then runs a simulation of 100 generations of mutation of the complete p53seg sequence. (I simulated 800 sets of these 100-generation runs to get the aggregated plot shown on page 7.) Judging from the graph on page 7, there are about four to five (4.65) premature terminations for every 1020 mutations.
• The results below document the changes to the protein sequence during a simulation.
• An entry such as
[Sim #: A |Run #: B] Stop Locations 8 19 64 255 Premature Termination @: 267 285 288 298 327 329
should be read as follows:
• This is Simulation A (in the example shown in the answers, A can take values from 0 to 799 and B can take values from 0 to 99).
• Run #B corresponds to the B'th evolution of the protein sequence.
• If output is not explicitly presented for a run, the single base-pair mutations in that run (as discussed above) did not result in a premature termination or a missing "Stop".
• The stops at locations 8, 19, 64 and 255 appear in the original protein (sans any mutation).
• The stop at 267 is due to a mutation creating a premature termination.
• The stops at locations 285, 288, 298, 327 and 329 appear in the original protein (sans any mutation).
• Mutation generations 0 through 9 did not generate a premature termination scenario.
• [Sim #: 0 |Run #: 10] Stop Locations 8 19 64 255 Premature Termination @: 267 285 288 298 327 329
• RSSSLFTR*EGRRERENVL*AGSSYLAEGGCYSPPAFLFLDYLVMAFAKAGVFVLMQTSIPPLL*MVCPTPRVACNLGGRYHGVDREGKKCAEGKPGGTFKNEHISSSRRKKKKNGTSENEILKECNDGSFDNLSGKTIYLLSSFGLGHSSSRRRLNVVKRKGRARRRPCGPPCSPRPEPVLPGRRRNSNSFLPLPHLLARGCFIPQFLPMHLPRTGHFVPYLRHLFPKSRWHLHTAPVHTASAQEDEFWPLTAP*CLPSHRPFSSS*KRFITSLPSRFPTPPLF*SP*PFCLLENFI*NGIRMGAVAHACTLAHACTLGGRGGRIT*G*EFQTSVANVV
• Mutation generations 11 through 16 did not generate a premature termination scenario.
• [Sim #: 0 |Run #: 17] Stop Locations 8 19 Premature Termination @: 30 64 255 285 288 298 327 329
• RSSSLFTR*EGRRERENVL*AGSSYLAEGG*YSPPAFLFLDYLVMAFAKAGVFVLMQTSIPPLL*MVCPTPRVACNLGGRYHGVDREGKKCAEGKPGGTFKNEHISSSRRKKKKNGTSENEILKECNDGSFDNLSGKTIYLLSSFGLGHSSSRRRLNVVKRKGRARRRPCGPPCSPRPEPVLPGRRRNSNSFLPLPHLLARGCFIPQFLPMHLPRTGHFVPYLRHLFPKSRWHLHTAPVHTASAQEDEFWPLTAP*CLPSHRPFSSSQKRFITSLPSRFPTPPLF*SP*PFCLLENFI*NGIRMGAVAHACTLAHACTLGGRGGRIT*G*EFQTSVANVV
• [Sim #: 0 |Run #: 18] Stop Locations 8 19 64 Premature Termination @: 112 255 285 288 298 327 329
• RSSSLFTR*EGRRERENVL*AGSSYLAEGGCYSPPAFLFLDYLVMAFAKAGVFVLMQTSIPPLL*MVCPTPRVACNLGGRYHGVDREGKKCAEGKPGGTFKNEHISSSRRKK*KNGTSENEILKECNDGSYDNLSGKTIYLLSSFGLGHSSSRRRLNVVKRKGRARRRPCGPPCSPRPEPVLPGRRRNSNSFLPLPHLLARGCFIPQFLPMHLPRTGHFAPYLRHLFPKSRWHLHTAPVHTASAQEDEFWPLTAP*CLPSHRPFSSSQKRFITSLPSRFPTPPLF*SP*PFCLLENFI*NGIRMGAVAHACTLAHACTLGGRGGRIT*G*EFQTSVANVV
• …
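The "Stop Locations" lines above can be produced by comparing the stop positions of a mutated protein with those of the unmutated one. The sketch below is one way to do that, not the original script; `stop_positions` and `report` are hypothetical names. Positions are 0-based, which matches the listing (the first "*" in the +1 frame follows eight residues and is reported as 8).

```python
def stop_positions(protein):
    """Return the 0-based indices of every '*' in a translated sequence."""
    return [i for i, residue in enumerate(protein) if residue == "*"]


def report(original, mutated):
    """List the stops of `mutated`, flagging any stop absent from `original`."""
    original_stops = set(stop_positions(original))
    parts = []
    for pos in stop_positions(mutated):
        if pos in original_stops:
            parts.append(str(pos))
        else:
            parts.append("Premature Termination @: %d" % pos)
    return "Stop Locations " + " ".join(parts)


# First 39 residues of the unmutated +1 frame and of the Run #17 sequence.
original = "RSSSLFTR*EGRRERENVL*AGSSYLAEGGCYSPPAFLF"
mutated = "RSSSLFTR*EGRRERENVL*AGSSYLAEGG*YSPPAFLF"
print(report(original, mutated))
# prints: Stop Locations 8 19 Premature Termination @: 30
```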
BioPhysics 101 Assignment 3B: Screenshot Problems 1,2,3 Output
BioPhysics 101 Assignment 3B: Screenshot Problem 4 Output
The given sequence is 1020 base pairs long. The script randomly mutates bases (a to t/g/c, t to a/g/c, etc.) with a probability of 0.01, i.e. about one mutation per 100 base pairs. It repeats this sequence mutation 100 times (= a single simulation). To arrive at the aggregated results shown in the plot, the script ran 800 such simulations. The output plot after 800 simulations shows about four to five (~4.65) premature terminations for every 1020 mutations (at this rate, 1020 bases × 0.01 × 100 generations ≈ 1020 mutations, i.e. roughly one simulation).
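A minimal sketch of such a simulation is given below. Several details are assumptions, because the slides do not show the script: each base mutates independently with probability `rate`, every generation mutates a fresh copy of the original sequence, and a premature termination is any +1-frame stop codon that is absent from the original. The names `mutate`, `stop_codon_indices` and `simulate` are hypothetical.

```python
import random

STOP_CODONS = {"taa", "tag", "tga"}


def mutate(seq, rate, rng):
    """Replace each base, with probability `rate`, by one of the other three bases."""
    out = []
    for base in seq.lower():
        if rng.random() < rate:
            base = rng.choice([b for b in "acgt" if b != base])
        out.append(base)
    return "".join(out)


def stop_codon_indices(seq):
    """Return the 0-based codon indices of stop codons in the +1 frame."""
    seq = seq.lower()
    return {i // 3 for i in range(0, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS}


def simulate(seq, generations=100, rate=0.01, seed=0):
    """Count new stop codons over `generations` independent mutation rounds."""
    rng = random.Random(seed)
    original_stops = stop_codon_indices(seq)
    premature = 0
    for _ in range(generations):
        mutated = mutate(seq, rate, rng)
        # Only stops that the unmutated sequence lacks count as premature.
        premature += len(stop_codon_indices(mutated) - original_stops)
    return premature


# Toy input: the first 24 bases of p53seg, not the full 1020-base sequence.
print(simulate("cggagcagctcactattcacccga", generations=100, rate=0.01, seed=1))
```

Averaging `simulate(p53seg, seed=s)` over 800 seeds would give a per-simulation figure comparable to the ~4.65 read off the plot, although the exact value depends on how the original script counts terminations.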
BioPhysics 101 Assignment 3B: Screenshot Problem 4 HTML File Output