Sequence comparison: More dynamic programming
Genome 559: Introduction to Statistical and Computational Genomics
Prof. William Stafford Noble
One-minute responses
• WAY TOO FAST. Please walk around more during sample problems. I was completely lost.
• Today I felt a bit lost. Most times I was still trying to figure out one slide or problem, while the class was on the next one.
• It was fast today, but after the reading I was prepared to take things more quickly and I understood things much better today.
• I enjoyed class today. I thought it moved at a great pace.
• I thought the pace was good today.
• I liked the pace of the lecture – even though you said we spent too much time on the dynamic programming, it gave me time to understand.
• The pace is great and gives me time to explore.
• I thought this lecture built nicely on the last lecture. I struggled last class but it clicked today.
One-minute responses
• The matrix exercise was very helpful, even though I’m not fully clear on how it works yet.
• I found today’s class time much more understandable.
• I struggled a little bit to grasp the matrix, but by the end I had it. The pace and numerous examples helped.
• The DP matrix was simple to grasp after computing one or two matrix values, so the portion of the lecture could go faster.
• I like the sample problems.
• Dynamic programming reminded me of sudoku, which was fun.
• Going through the alignment table helped a lot.
• It was nice to do examples with DNA sequences.
• I’m feeling a lot better about it all. I really like going through examples.
• Again, the small steps with programming problems helped, although the first problem was overly challenging (when explained in a different way it was fine).
• I was a little confused when writing the program. I think more practice is required. The practice problems will help.
• Today’s class was much better since we had appropriate reading first. The sample problems were interesting since they actually relate to biology.
One-minute responses
• Is there a place to get more samples of simple code to use to help see patterns of how this works? Or is there plenty in the book?
  - There are lots of examples in the book. And of course, you can easily find lots of examples on the web. For a reference book with examples, try Python Cookbook, by Martelli, Ravenscroft and Ascher.
• I’m a little fuzzy about how dynamic programming differs from other sorts of programming, but everything else was really clear.
  - The term “dynamic programming” predates computers. There is no relationship between this use of the word “programming” and what we are learning to do in Python.
Three legal moves
• A diagonal move aligns a character from the left sequence with a character from the top sequence.
• A vertical move introduces a gap in the sequence along the top edge.
• A horizontal move introduces a gap in the sequence along the left edge.
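The three moves can be sketched in Python as a function that converts a path through the matrix into an alignment. This is only an illustration: the function name `moves_to_alignment` and the move codes "D", "V" and "H" are hypothetical, and placing GAATC along the top edge and CATAC along the left edge is an assumption.

```python
def moves_to_alignment(left, top, moves):
    """Convert a path of moves into an alignment (top row, left row).

    moves is a string over "D" (diagonal), "V" (vertical) and
    "H" (horizontal), read from the upper-left corner of the matrix
    to the lower-right corner. These codes are hypothetical.
    """
    top_row, left_row = [], []
    i = j = 0  # i indexes the left sequence, j indexes the top sequence
    for move in moves:
        if move == "D":    # align one character from each sequence
            left_row.append(left[i])
            top_row.append(top[j])
            i += 1
            j += 1
        elif move == "V":  # gap in the sequence along the top edge
            left_row.append(left[i])
            top_row.append("-")
            i += 1
        elif move == "H":  # gap in the sequence along the left edge
            left_row.append("-")
            top_row.append(top[j])
            j += 1
        else:
            raise ValueError(f"unknown move: {move!r}")
    if i != len(left) or j != len(top):
        raise ValueError("path does not end in the lower-right corner")
    return "".join(top_row), "".join(left_row)


# Assumed layout: GAATC along the top edge, CATAC along the left edge.
top_row, left_row = moves_to_alignment("CATAC", "GAATC", "DDVDHD")
print(top_row)
print(left_row)
```

Every path from the upper-left corner to the lower-right corner uses each character of both sequences exactly once, which is why each path corresponds to exactly one alignment.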
DP matrix
GA-ATC
CATA-C

DP matrix
GAAT-C
CA-TAC

DP matrix
GAAT-C
C-ATAC

DP matrix
GAAT-C
-CATAC
Multiple solutions
• When a program returns a sequence alignment, it may not be the only best alignment.
GA-ATC   GAAT-C   GAAT-C   GAAT-C
CATA-C   CA-TAC   C-ATAC   -CATAC
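A quick way to see that several alignments can tie is to score each one column by column. The sketch below does that for the four alignments above. The scoring scheme is an assumption, since the substitution matrix is not given in the text: match 10, transition (A/G or C/T) 0, transversion -5, and a gap penalty of -5 per gap position.

```python
PURINES = {"A", "G"}
GAP = -5  # assumed linear gap penalty


def s(a, b):
    """Assumed substitution score: match 10, transition 0, transversion -5."""
    if a == b:
        return 10
    if (a in PURINES) == (b in PURINES):
        return 0
    return -5


def score_alignment(row1, row2):
    """Sum the column scores of an alignment given as two gapped strings."""
    total = 0
    for a, b in zip(row1, row2):
        total += GAP if "-" in (a, b) else s(a, b)
    return total


alignments = [
    ("GA-ATC", "CATA-C"),
    ("GAAT-C", "CA-TAC"),
    ("GAAT-C", "C-ATAC"),
    ("GAAT-C", "-CATAC"),
]
scores = [score_alignment(a, b) for a, b in alignments]
print(scores)  # all four scores are equal under the assumed scheme
```

Under this assumed scheme each alignment has three matches, one transversion and two gap positions, so all four receive the same score.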
DP in equation form
• Align sequences x and y.
• F is the DP matrix; s is the substitution matrix; d is the linear gap penalty.
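Written with these symbols, the standard global-alignment (Needleman-Wunsch) recurrence can be sketched as follows. It assumes that d is a negative number added once per gap position, and that the first row and column are initialized with multiples of d. The indexing convention (i over x, j over y) is also an assumption.

```latex
F(0,0) = 0, \qquad F(i,0) = i\,d, \qquad F(0,j) = j\,d

F(i,j) = \max
\begin{cases}
F(i-1,\,j-1) + s(x_i, y_j) & \text{(diagonal move)} \\
F(i-1,\,j) + d             & \text{(vertical move)} \\
F(i,\,j-1) + d             & \text{(horizontal move)}
\end{cases}
```

Each case corresponds to one of the three legal moves into cell (i, j).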
A simple example
Find the optimal alignment of AAG and AGC. Use a gap penalty of d=-5.
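The matrix fill for this example can be sketched in Python. The substitution scores are not given in the text, so the function `s` below is an assumption (match 10, transition 0, transversion -5). The name `fill_matrix` and the choice to put AAG on the rows and AGC on the columns are also illustrative.

```python
PURINES = {"A", "G"}


def s(a, b):
    """Assumed substitution score: match 10, transition 0, transversion -5."""
    if a == b:
        return 10
    if (a in PURINES) == (b in PURINES):
        return 0
    return -5


def fill_matrix(x, y, d):
    """Return F, where F[i][j] is the best score for aligning x[:i] with y[:j]."""
    F = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i in range(1, len(x) + 1):
        F[i][0] = i * d  # first column: x[:i] aligned against gaps only
    for j in range(1, len(y) + 1):
        F[0][j] = j * d  # first row: y[:j] aligned against gaps only
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),  # diagonal move
                F[i - 1][j] + d,                          # vertical move
                F[i][j - 1] + d,                          # horizontal move
            )
    return F


F = fill_matrix("AAG", "AGC", d=-5)
for row in F:
    print(row)
```

The lower-right entry, `F[-1][-1]`, holds the score of the best global alignment under the assumed scores.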
Traceback
• Start from the lower right corner and trace back to the upper left.
• Each arrow introduces one character at the end of each aligned sequence.
• A horizontal move puts a gap in the left sequence.
• A vertical move puts a gap in the top sequence.
• A diagonal move uses one character from each sequence.
A simple example
Find the optimal alignment of AAG and AGC. Use a gap penalty of d=-5.
• Start from the lower right corner and trace back to the upper left.
• Each arrow introduces one character at the end of each aligned sequence.
• A horizontal move puts a gap in the left sequence.
• A vertical move puts a gap in the top sequence.
• A diagonal move uses one character from each sequence.
AAG-   AAG-
-AGC   A-GC
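The traceback rules can be sketched as a recursive walk that follows every move consistent with the matrix, so it returns all optimal alignments, not just one. As before, the substitution scores in `s` are an assumption (match 10, transition 0, transversion -5), and the names `fill_matrix` and `tracebacks` are hypothetical. AAG is taken as the left sequence and AGC as the top sequence.

```python
PURINES = {"A", "G"}


def s(a, b):
    """Assumed substitution score: match 10, transition 0, transversion -5."""
    if a == b:
        return 10
    if (a in PURINES) == (b in PURINES):
        return 0
    return -5


def fill_matrix(left, top, d):
    """F[i][j] is the best score for aligning left[:i] with top[:j]."""
    F = [[0] * (len(top) + 1) for _ in range(len(left) + 1)]
    for i in range(1, len(left) + 1):
        F[i][0] = i * d
    for j in range(1, len(top) + 1):
        F[0][j] = j * d
    for i in range(1, len(left) + 1):
        for j in range(1, len(top) + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + s(left[i - 1], top[j - 1]),
                F[i - 1][j] + d,
                F[i][j - 1] + d,
            )
    return F


def tracebacks(left, top, d):
    """Return every optimal alignment as a (left_row, top_row) pair."""
    F = fill_matrix(left, top, d)
    results = []

    def walk(i, j, left_row, top_row):
        if i == 0 and j == 0:
            # Rows were built from the end backwards, so reverse them.
            results.append((left_row[::-1], top_row[::-1]))
            return
        # Diagonal move: one character from each sequence.
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s(left[i - 1], top[j - 1]):
            walk(i - 1, j - 1, left_row + left[i - 1], top_row + top[j - 1])
        # Vertical move: gap in the top sequence.
        if i > 0 and F[i][j] == F[i - 1][j] + d:
            walk(i - 1, j, left_row + left[i - 1], top_row + "-")
        # Horizontal move: gap in the left sequence.
        if j > 0 and F[i][j] == F[i][j - 1] + d:
            walk(i, j - 1, left_row + "-", top_row + top[j - 1])

    walk(len(left), len(top), "", "")
    return results


for left_row, top_row in tracebacks("AAG", "AGC", d=-5):
    print(left_row)
    print(top_row)
    print()
```

Under the assumed scores, the walk finds the same two alignments shown above. Enumerating every path can take exponential time on long sequences, so a practical program would usually report a single path.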
Traceback problem #1
Write down the alignment corresponding to the circled score.
Solution #1
Write down the alignment corresponding to the circled score.
GA
CA
Traceback problem #2
Write down three alignments corresponding to the circled score.
Solution #2
Write down three alignments corresponding to the circled score.
GAATC   GAATC   GAATC
CA---   C-A--   -CA--