Sequence Alignment Oct 9, 2002 Joon Lee Genomics & Computational Biology
Dynamic Programming
• Optimization problems: the solution is built from a sequence of decisions, each chosen to be the best available
• Subproblems are not independent
• Subproblems share subsubproblems
• Solve each subproblem once and save its answer in a table
Four Steps of DP
• Characterize the structure of an optimal solution
• Recursively define the value of an optimal solution
• Compute the value of an optimal solution in a bottom-up fashion
• Construct an optimal solution from computed information
Sequence Alignment
Sequence 1: G A A T T C A G T T A
Sequence 2: G G A T C G A
Align or insert gap
G A A T T C A G T T A
|   |   | |   |     |
G G A _ T C _ G _ _ A

G _ A A T T C A G T T A
|     |   | |   |     |
G G _ A _ T C _ G _ _ A
Three Steps of Sequence Alignment (SA)
• Initialization: gap penalty
• Scoring: matrix fill
• Alignment: trace back
Step 1: Initialization
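The initialization figure did not survive, so here is a minimal sketch of what this step usually looks like for a global alignment. It assumes a linear gap penalty of -2 per position (the value used later in the deck) and that border cell k holds k times the gap penalty; the function name `init_matrix` is hypothetical.

```python
def init_matrix(a, b, gap=-2):
    """Return a (len(a)+1) x (len(b)+1) score matrix with only the borders filled.

    Assumption: linear gap penalty, so the k-th border cell costs k * gap.
    """
    rows, cols = len(a) + 1, len(b) + 1
    S = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        S[i][0] = i * gap  # a[:i] aligned against nothing but gaps
    for j in range(1, cols):
        S[0][j] = j * gap  # b[:j] aligned against nothing but gaps
    return S


S = init_matrix("GAATTCAGTTA", "GGATCGA")
print(S[0][:4])  # [0, -2, -4, -6]
```

The interior cells stay at 0 here only as placeholders; they are overwritten during the scoring step.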
Step 2: Scoring
• A = a_1 a_2 … a_n, B = b_1 b_2 … b_m
• S_ij : score at (i, j)
• s(a_i, b_j) : matching score between a_i and b_j
• w : gap penalty
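The figure carrying the recurrence is missing from this transcript. Using the symbols defined on this slide, the standard global-alignment recurrence with a linear gap penalty can be written as below. This is a reconstruction under the assumption that w is a negative number added for each gap, which agrees with the arithmetic on the worked scoring slides; the border conditions are likewise an assumption.

```latex
\[
S_{i,j} = \max
\begin{cases}
S_{i-1,j-1} + s(a_i, b_j) & \text{align } a_i \text{ with } b_j \\
S_{i-1,j} + w & \text{align } a_i \text{ with a gap} \\
S_{i,j-1} + w & \text{align } b_j \text{ with a gap}
\end{cases}
\qquad
S_{0,0} = 0,\quad S_{i,0} = i\,w,\quad S_{0,j} = j\,w
\]
```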
Step 2: Scoring
• Match: +2
• Mismatch: -1
• Gap: -2
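To make the scoring scheme concrete, the sketch below scores an alignment that is already written out with gaps. The helper name `score_alignment` and the use of `_` as the gap character are assumptions chosen to match the notation of the alignment slides.

```python
def score_alignment(top, bottom, match=2, mismatch=-1, gap=-2, gap_char="_"):
    """Sum the column scores of a gapped pairwise alignment."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    total = 0
    for x, y in zip(top, bottom):
        if x == gap_char or y == gap_char:
            total += gap       # one side is a gap
        elif x == y:
            total += match     # identical residues
        else:
            total += mismatch  # different residues
    return total


# The two example alignments of GAATTCAGTTA and GGATCGA:
print(score_alignment("GAATTCAGTTA", "GGA_TC_G__A"))    # 3
print(score_alignment("G_AATTCAGTTA", "GG_A_TC_G__A"))  # 0
```

The first alignment has 6 matches, 1 mismatch, and 4 gaps (12 - 1 - 8 = 3); the second has 6 matches and 6 gaps (12 - 12 = 0), so under this scheme the first is preferred.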
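Step 2: Scoring
0 + 2 = 2 (diagonal + match)
-2 + (-2) = -4 (neighbor + gap)
-2 + (-2) = -4 (neighbor + gap)
Cell score: the maximum, 2
Step 2: Scoring
-2 + (-1) = -3 (diagonal + mismatch)
-4 + (-2) = -6 (neighbor + gap)
2 + (-2) = 0 (neighbor + gap)
Cell score: the maximum, 0
Step 2: Scoring
-2 + 2 = 0 (diagonal + match)
2 + (-2) = 0 (neighbor + gap)
-4 + (-2) = -6 (neighbor + gap)
Cell score: the maximum, 0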
Step 2: Scoring
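The filled matrix itself is a figure that is not reproduced here. As a rough sketch of the matrix fill, assuming the +2 / -1 / -2 scheme and a linear gap penalty, each cell takes the best of its diagonal, upper, and left neighbors. The function name `nw_fill` is hypothetical.

```python
def nw_fill(a, b, match=2, mismatch=-1, gap=-2):
    """Fill the global-alignment score matrix bottom-up and return it."""
    rows, cols = len(a) + 1, len(b) + 1
    S = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # borders: cumulative gap penalties
        S[i][0] = i * gap
    for j in range(1, cols):
        S[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            diag = S[i - 1][j - 1] + sub   # align a[i-1] with b[j-1]
            up = S[i - 1][j] + gap         # a[i-1] against a gap
            left = S[i][j - 1] + gap       # b[j-1] against a gap
            S[i][j] = max(diag, up, left)
    return S


S = nw_fill("GAATTCAGTTA", "GGATCGA")
print(S[1][1], S[2][1], S[1][2])  # 2 0 0
print(S[-1][-1])                  # 3
```

The first three printed cells reproduce the worked arithmetic (2, 0, 0), and the bottom-right cell holds the score of the best global alignment.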
Step 3: Trace back
Step 3: Trace back
G A A T T C A G T T A
G G A _ T C _ G _ _ A

G A A T T C A G T T A
G G A T _ C _ G _ _ A
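A minimal sketch of the trace back, again assuming the +2 / -1 / -2 scheme: starting at the bottom-right cell, it repeatedly steps to whichever neighbor could have produced the current score. The name `nw_align` and the tie-breaking order (diagonal, then up, then left) are assumptions; a different order can return a different, equally good alignment, which is why more than one alignment appears above.

```python
def nw_align(a, b, match=2, mismatch=-1, gap=-2):
    """Return (aligned_a, aligned_b, score) for one optimal global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    S = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        S[i][0] = i * gap
    for j in range(1, cols):
        S[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + sub, S[i - 1][j] + gap, S[i][j - 1] + gap)

    # Trace back from the bottom-right corner to the top-left corner.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if S[i][j] == S[i - 1][j - 1] + sub:      # came from the diagonal
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and S[i][j] == S[i - 1][j] + gap:    # came from above
            top.append(a[i - 1])
            bottom.append("_")
            i -= 1
        else:                                         # came from the left
            top.append("_")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), S[-1][-1]


top, bottom, score = nw_align("GAATTCAGTTA", "GGATCGA")
print(top)
print(bottom)
print(score)  # 3
```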
Exercise
• Match: +2
• Mismatch: -1
• Gap: -2
Sequences to align:
G C A T C C G
G A T C G
Amino acids
• The match/mismatch scores are replaced by a substitution matrix
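As a sketch of how a substitution matrix plugs into the scoring step, the lookup below replaces the fixed match/mismatch values with a per-pair score. The numbers in `TOY_MATRIX` are invented for illustration and are not taken from BLOSUM, PAM, or any published matrix; the names `TOY_MATRIX` and `sub_score` are hypothetical.

```python
# Illustrative values only: NOT from any published substitution matrix.
TOY_MATRIX = {
    ("L", "L"): 4,
    ("I", "I"): 4,
    ("D", "D"): 6,
    ("L", "I"): 2,   # similar residues get a mild reward instead of a penalty
    ("L", "D"): -4,
    ("I", "D"): -3,
}


def sub_score(x, y, matrix=TOY_MATRIX):
    """Look up s(x, y); the matrix is symmetric, so try both key orders."""
    if (x, y) in matrix:
        return matrix[(x, y)]
    return matrix[(y, x)]


print(sub_score("L", "L"))  # 4
print(sub_score("I", "L"))  # 2
```

In the matrix fill, the diagonal term would then use `sub_score(a[i-1], b[j-1])` in place of the match/mismatch choice.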
Global & Local alignment
• Global: Needleman-Wunsch Algorithm
• Local: Smith-Waterman Algorithm
From Mount, Bioinformatics, Chap. 3
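To show how the local variant differs from the global one, here is a minimal score-only sketch of Smith-Waterman. It assumes the same +2 / -1 / -2 scheme used for the global example (the source does not state parameters for local alignment); the function name `sw_score` is hypothetical. The differences are that the borders start at 0, no cell drops below 0, and the result is the maximum over all cells instead of the bottom-right corner.

```python
def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # borders stay 0: no penalty for skipping ends
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                      # restart the alignment here
                H[i - 1][j - 1] + sub,  # extend along the diagonal
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            best = max(best, H[i][j])
    return best


print(sw_score("AAAGGG", "TTGGGCC"))  # 6
```

Here the only shared region is GGG, so the best local score is three matches (6), while the unrelated flanking residues are ignored.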
References
• Sequence alignment with Java applet: http://linneus20.ethz.ch:8080/5_4_5.html