Project 3: Dynamic Programming

Project 3: Dynamic Programming Optimal String Alignment

Development of a Dynamic Programming Algorithm
• Characterize the structure of a solution
• Recursively define the value of a solution
• Compute the value of a solution in a bottom-up fashion (using a table)
• Construct a solution from computed information

Example: Optimal String Alignment
• Images taken from http://www.sbc.su.se/~per/molbioinfo2001/dynprog/dynamic.html
• Want to find the optimal (best) alignment between two strings, where matches count as 1, mismatches count as 0, and “gaps” count as 0.
• Example strings:
    G A A T T C A G T T A
    G G A T C G A
• One optimal alignment (Score == 6):
    G _ A A T T C A G T T A
    G G _ A _ T C _ G _ _ A
• Notice that every alignment can start in exactly one of three ways:
    • Two non-gap characters
    • Non-gap above, gap below
    • Gap above, non-gap below
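The scoring rule in this example can be checked with a few lines of Python. This is a minimal sketch: the function name alignment_score and the use of "_" as the gap character are choices made here for illustration, not part of the original slides.

```python
def alignment_score(top: str, bottom: str, gap: str = "_") -> int:
    """Score two aligned rows: match = 1, mismatch = 0, gap = 0."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    # Only columns holding the same non-gap character contribute to the score.
    return sum(1 for a, b in zip(top, bottom) if a == b and a != gap)


# The alignment shown in the example, written without spaces.
print(alignment_score("G_AATTCAGTTA", "GG_A_TC_G__A"))  # 6
```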

Example: Optimal String Alignment
• Notice that in an optimal alignment of S = (s1,s2,…,sn) with T = (t1,t2,…,tm), one of these must be true, depending on how the alignment starts:
    • S2..n must be optimally aligned with T2..m (S1 is paired with T1)
    • S2..n must be optimally aligned with T1..m (S1 is paired with a gap)
    • S1..n must be optimally aligned with T2..m (T1 is paired with a gap)
• The score of each is then
    • Score(S2..n, T2..m) + Score(S1,T1)
    • Score(S2..n, T1..m) + Score(S1,gap)
    • Score(S1..n, T2..m) + Score(gap,T1)
• We want to select the maximum of these, giving
    Score(S1..n, T1..m) = max(Score(S2..n, T2..m) + Score(S1,T1),
                              Score(S2..n, T1..m) + Score(S1,gap),
                              Score(S1..n, T2..m) + Score(gap,T1))
• The base case is given by the scoring function
• This only gives the score of the optimal alignment, not the actual alignment
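The recurrence can be written almost directly as a recursive function. The sketch below assumes the scores from the example (match 1, mismatch 0, gap 0) and adds memoization with functools.lru_cache so that each pair of suffixes is solved only once. The names optimal_score and best are hypothetical, chosen for this illustration.

```python
from functools import lru_cache

MATCH, MISMATCH, GAP = 1, 0, 0  # scores stated in the example


def optimal_score(s: str, t: str) -> int:
    """Optimal alignment score of s and t (score only, no alignment)."""

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> int:
        # best(i, j) is the optimal score for aligning s[i:] with t[j:].
        if i == len(s):  # s is exhausted: the rest of t faces gaps
            return GAP * (len(t) - j)
        if j == len(t):  # t is exhausted: the rest of s faces gaps
            return GAP * (len(s) - i)
        pair = MATCH if s[i] == t[j] else MISMATCH
        return max(
            best(i + 1, j + 1) + pair,  # s[i] paired with t[j]
            best(i + 1, j) + GAP,       # s[i] paired with a gap
            best(i, j + 1) + GAP,       # t[j] paired with a gap
        )

    return best(0, 0)


print(optimal_score("GAATTCAGTTA", "GGATCGA"))  # 6
```

Without the cache, the same suffix pairs would be recomputed many times, which is the motivation for the table-based approach.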

Example: Optimal String Alignment
• Let’s build a table, M, of values such that M[i][j] is the optimal score for S1..i aligned with T1..j.
• If we have this, M[n][m] is the overall optimal score
• M is zero-indexed to allow for a beginning gap: row 0 and column 0 correspond to the empty prefix
• Notice that by our recurrence relation, applied to how an alignment ends rather than how it starts,
    M[i][j] = max(M[i-1][j-1] + Score(Si,Tj),
                  M[i][j-1] + Score(gap,Tj),
                  M[i-1][j] + Score(Si,gap))

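Example: Optimal String Alignment
Base Case
    M[0][0] = 0
    M[i][0] = M[i-1][0] + Score(Si,gap)
    M[0][j] = M[0][j-1] + Score(gap,Tj)
With gaps counting as 0, every entry in row 0 and column 0 is 0.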

Example: Optimal String Alignment
    M[i][j] = max(M[i-1][j-1] + Score(Si,Tj),
                  M[i][j-1] + Score(gap,Tj),
                  M[i-1][j] + Score(Si,gap))
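The table can be filled row by row once the base cases are in place. The following is a sketch under the example's scoring (match 1, mismatch 0, gap 0). The function name build_table and its keyword parameters are assumptions made for this illustration, and the first row and column are accumulated gap scores so the code also works when the gap score is not 0.

```python
def build_table(s: str, t: str, match: int = 1, mismatch: int = 0, gap: int = 0):
    """Return M where M[i][j] is the optimal score of s[:i] aligned with t[:j]."""
    n, m = len(s), len(t)
    M = [[0] * (m + 1) for _ in range(n + 1)]

    # Base cases: a prefix aligned against nothing but gaps.
    for i in range(1, n + 1):
        M[i][0] = M[i - 1][0] + gap
    for j in range(1, m + 1):
        M[0][j] = M[0][j - 1] + gap

    # Recurrence: each cell depends on its diagonal, left, and upper neighbors.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            M[i][j] = max(
                M[i - 1][j - 1] + pair,  # Si paired with Tj
                M[i][j - 1] + gap,       # gap paired with Tj
                M[i - 1][j] + gap,       # Si paired with gap
            )
    return M


M = build_table("GAATTCAGTTA", "GGATCGA")
print(M[-1][-1])  # 6, the overall optimal score M[n][m]
```

Filling the table takes time and space proportional to n × m, since each cell is computed once from three already computed neighbors.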

Example: Optimal String Alignment
• Still need to find the actual alignment that results in the maximum score
• Can be done by tracing back from the optimal value to find where it must have originated, with each step back representing an earlier optimal alignment
    G _ A A T T C A G T T A
    G G _ A _ T C _ G _ _ A
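The traceback can be sketched as follows: fill the table, then walk from M[n][m] back to M[0][0], at each cell choosing a neighbor that could have produced the current value. The function name align, the "_" gap character, and the tie-breaking order (diagonal first, then up, then left) are assumptions of this sketch. Because several alignments can share the optimal score, the rows it returns may differ from the alignment shown on the slide.

```python
def align(s: str, t: str, match: int = 1, mismatch: int = 0,
          gap: int = 0, gap_char: str = "_"):
    """Return (top_row, bottom_row, score) for one optimal alignment."""
    n, m = len(s), len(t)

    # Fill the score table bottom-up.
    M = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        M[i][0] = M[i - 1][0] + gap
    for j in range(1, m + 1):
        M[0][j] = M[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1] + pair,
                          M[i][j - 1] + gap,
                          M[i - 1][j] + gap)

    # Trace back from M[n][m], collecting columns in reverse order.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if s[i - 1] == t[j - 1] else mismatch
            if M[i][j] == M[i - 1][j - 1] + pair:
                top.append(s[i - 1])
                bottom.append(t[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and M[i][j] == M[i - 1][j] + gap:
            top.append(s[i - 1])      # Si paired with a gap
            bottom.append(gap_char)
            i -= 1
        else:
            top.append(gap_char)      # gap paired with Tj
            bottom.append(t[j - 1])
            j -= 1

    return "".join(reversed(top)), "".join(reversed(bottom)), M[n][m]


top, bottom, score = align("GAATTCAGTTA", "GGATCGA")
print(top)
print(bottom)
print(score)  # 6
```

Storing a back-pointer in each cell while filling the table is a common alternative to re-deriving the source cell during the traceback.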
