CS 451 / 558 Week 3, Tue
In-class exercises – Perl 3 • Write a program to reverse transcribe RNA to DNA (exercise 4.5 from the book)
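One possible approach to this exercise is sketched below. It assumes "reverse transcribe" means rewriting the RNA string as DNA by replacing every U with T; it does not compute the complementary strand. The sub name `rna_to_dna` and the sample sequence are illustrative, not taken from the slides or the book.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reverse transcribe an RNA string to DNA by replacing every U with T.
# Case is preserved; all other characters are left unchanged.
sub rna_to_dna {
    my ($rna) = @_;
    (my $dna = $rna) =~ tr/Uu/Tt/;   # copy first, then transliterate the copy
    return $dna;
}

my $rna = 'ACGGGAGGACGGGAAAAUUACUACGGCAUUAGC';   # made-up sample data
my $dna = rna_to_dna($rna);

print "RNA: $rna\n";
print "DNA: $dna\n";
```

`tr///` is used here instead of `s/U/T/g` because it only swaps single characters and needs no regular expression; either would work for this task.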
In-class exercises – Perl 4 • Read two files of data (sequence1.fa and sequence2.fa). Print the contents of the first, then the contents of the second. (exercise 4.6 from the book)
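A minimal sketch of one way to do this follows. The file names come from the exercise; the helper `read_lines` is a hypothetical name, and the check for missing files is an addition so the script still runs when the data files are not present.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return all lines of a file as a list (newlines are kept).
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my @lines = <$fh>;
    close($fh);
    return @lines;
}

# Print the first file, then the second.
# A missing file is reported and skipped (an assumption, not part of the exercise).
for my $file ('sequence1.fa', 'sequence2.fa') {
    if (-e $file) {
        print read_lines($file);
    }
    else {
        warn "$file not found, skipping\n";
    }
}
```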
In-class exercises – Perl 5 • Write a program to read a file, then print its lines in reverse order (last line first). (exercise 4.7 from the book) • Options: reverse, or push/pop/shift/unshift (possibly with loops)
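Both options can be sketched as follows. To keep the example self-contained, a small in-memory list stands in for the lines read from a file (in the exercise these would come from something like `my @lines = <$fh>;`). The sub names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Option 1: the built-in reverse, used in list context.
sub reverse_with_builtin {
    my @lines = @_;
    my @reversed = reverse @lines;
    return @reversed;
}

# Option 2: pop each line off the end and push it onto a new array.
sub reverse_with_pop {
    my @lines = @_;
    my @reversed;
    while (@lines) {
        push @reversed, pop @lines;
    }
    return @reversed;
}

my @lines = ("line 1\n", "line 2\n", "line 3\n");   # stand-in for file contents

print reverse_with_builtin(@lines);
print reverse_with_pop(@lines);
```

The same result could be had with `unshift` and `shift` (take lines from the front, add them to the front of the new array); the loop has the same shape.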
Homework / Projects • Homework due next Thurs • Projects • See website (discuss in class / see dates) • Grad / undergrad projects are distinct
Homology • A similarity often attributable to common origin • Likeness in structure between parts of different organisms due to evolutionary differentiation from a corresponding part in a common ancestor • Example: a bat’s wing and a human’s arm are homologous; a bee’s wing is NOT homologous to either
Sequence Homology • Figure: an ancestral sequence, acgt, diverges through point mutations • Substitution g→a gives acat; a later insertion of cc gives acacct • Deletion of c gives agt; a later substitution a→g gives ggt • Aligned, the four descendant sequences are:
    acacct
    aca--t
    a-g--t
    g-g--t
Comparing sequences • AATCTATA vs. AAGATC
• Without gaps: slide AAGATC along AATCTATA and compare the letters at each offset
• With internal gaps (gaps inside a sequence):
    AATCTATA    AATCTATA    AATCTATA
    AAG-AT-C    AA-G-ATC    AA--GATC
• With terminal gaps (gaps at either end of a sequence; the last alignment also contains one internal gap):
    --AATCTATA    AATCTATA----
    AAGATC----    -----A-AGATA
Metrics on Strings • Hamming distance: the number of positions at which the corresponding symbols differ (defined for two strings of equal length) • Edit distance (Levenshtein distance): the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other • Contrast these distances, where smaller means more alike, with similarity scores, where larger means more alike, e.g. match/mismatch scores of +1/0 or +1/-1.
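The two distances and a simple similarity score can be sketched in Perl as below. The sub names are illustrative. The edit distance uses the standard dynamic-programming table, which the slides do not specify, and `match_score` assumes an ungapped comparison of equal-length strings with the first number as the match score and the second as the mismatch score.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(min);

# Hamming distance: count positions where the two strings differ.
sub hamming_distance {
    my ($s, $t) = @_;
    die "Hamming distance needs equal-length strings\n"
        unless length($s) == length($t);
    my $distance = 0;
    for my $i (0 .. length($s) - 1) {
        $distance++ if substr($s, $i, 1) ne substr($t, $i, 1);
    }
    return $distance;
}

# Edit (Levenshtein) distance via a dynamic-programming table:
# $d[$i][$j] is the distance between the first $i letters of $s
# and the first $j letters of $t.
sub edit_distance {
    my ($s, $t) = @_;
    my ($m, $n) = (length($s), length($t));
    my @d;
    $d[$_][0] = $_ for 0 .. $m;   # delete all $i letters
    $d[0][$_] = $_ for 0 .. $n;   # insert all $j letters
    for my $i (1 .. $m) {
        for my $j (1 .. $n) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            $d[$i][$j] = min(
                $d[$i - 1][$j] + 1,           # deletion
                $d[$i][$j - 1] + 1,           # insertion
                $d[$i - 1][$j - 1] + $cost,   # match or substitution
            );
        }
    }
    return $d[$m][$n];
}

# Similarity score for an ungapped, equal-length comparison.
sub match_score {
    my ($s, $t, $match, $mismatch) = @_;
    my $mismatches = hamming_distance($s, $t);
    return $match * (length($s) - $mismatches) + $mismatch * $mismatches;
}

my ($x, $y) = ('ACGT', 'ACAT');
printf "Hamming(%s, %s) = %d\n", $x, $y, hamming_distance($x, $y);
printf "Edit(%s, %s) = %d\n", 'AATCTATA', 'AAGATC',
    edit_distance('AATCTATA', 'AAGATC');
printf "Score(+1/-1) = %d\n", match_score($x, $y, 1, -1);
```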
How to find the best alignment (the one with the best similarity score) • Dot plot • Example: ACGTAAA. Where are the A’s? (pseudocode / Perl)
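One way to answer the question in Perl is to walk the string one letter at a time and record each index where the letter matches. The sketch below uses 0-based positions; the sub name `letter_positions` is illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the 0-based positions at which $letter occurs in $seq.
sub letter_positions {
    my ($seq, $letter) = @_;
    my @positions;
    for my $i (0 .. length($seq) - 1) {
        push @positions, $i if substr($seq, $i, 1) eq $letter;
    }
    return @positions;
}

my @a_positions = letter_positions('ACGTAAA', 'A');
print "A found at positions: @a_positions\n";   # prints: A found at positions: 0 4 5 6
```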
How to find the best alignment (the one with the best similarity score) • Dot plot • ACGTAAA vs. CGTACGT: print a 2D array of single-letter matches
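A minimal sketch of such a dot plot follows. It assumes the first sequence labels the rows and the second labels the columns, and it prints `*` for a match and `.` for a mismatch; the slides do not fix either choice. The sub names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a 2D array: $plot[$i][$j] is 1 when letter $i of $row_seq
# equals letter $j of $col_seq, and 0 otherwise.
sub dot_plot {
    my ($row_seq, $col_seq) = @_;
    my @plot;
    for my $i (0 .. length($row_seq) - 1) {
        for my $j (0 .. length($col_seq) - 1) {
            $plot[$i][$j] =
                substr($row_seq, $i, 1) eq substr($col_seq, $j, 1) ? 1 : 0;
        }
    }
    return @plot;
}

# Print the plot with the column sequence as a header line
# and each row prefixed by its letter from the row sequence.
sub print_dot_plot {
    my ($row_seq, $col_seq) = @_;
    my @plot = dot_plot($row_seq, $col_seq);
    print '  ', join(' ', split //, $col_seq), "\n";
    for my $i (0 .. $#plot) {
        my @cells = map { $_ ? '*' : '.' } @{ $plot[$i] };
        print substr($row_seq, $i, 1), ' ', join(' ', @cells), "\n";
    }
}

print_dot_plot('ACGTAAA', 'CGTACGT');
```

Runs of matches along a diagonal of the printed grid correspond to stretches where the two sequences agree at a fixed offset.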
How to find the best alignment (the one with the best similarity score) • Dot plot • ACGTAAA: where are the AA’s? (pseudocode / Perl)
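Finding a two-letter word works like finding a single letter, except that each comparison looks at a window of two letters. The sketch below assumes overlapping occurrences should all be reported and uses 0-based start positions; the sub name `word_positions` is illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the 0-based start positions of every (possibly overlapping)
# occurrence of $word in $seq.
sub word_positions {
    my ($seq, $word) = @_;
    my $k = length($word);
    return () if $k == 0;   # an empty word has no meaningful positions
    my @positions;
    for my $i (0 .. length($seq) - $k) {
        push @positions, $i if substr($seq, $i, $k) eq $word;
    }
    return @positions;
}

my @aa_positions = word_positions('ACGTAAA', 'AA');
print "AA found at positions: @aa_positions\n";   # prints: AA found at positions: 4 5
```

A sliding window is used rather than a global regular-expression match such as `/AA/g`, because the regex resumes after each match and would skip the overlapping occurrence at position 5.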
How to find the best alignment (the one with the best similarity score) • Dot plot • ACGTAAA vs. CGTACGT: print a 2D array of two-letter matches (Homework 1)