Lecture 7
• Dynamic Programming
Topics
Reference: Introduction to Algorithms by Cormen et al., Chapter 15: Dynamic Programming
Data Structure and Algorithm
Longest Common Subsequence (LCS)
• Biologists need to measure the similarity between DNA sequences to determine how closely related one organism is to another.
• They do this by treating DNA as strings over the letters A, C, G, T and comparing the strings.
• Formally, they look at common subsequences of the strings. A subsequence keeps characters in their original order, but they need not be contiguous.
• Example: X = ABCBDAB, Y = BDCABA
• Common subsequences include ABA, BCA, BCB, BBA, BCBA, BDAB, etc.
• The longest common subsequences (LCS) have length 4; BCBA and BDAB are two of them.
• How do we find an LCS efficiently?
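The definition of a common subsequence can be checked mechanically. The following is a minimal sketch; the helper names is_subsequence and is_common_subsequence are illustrative and do not come from the lecture.

```python
def is_subsequence(s: str, t: str) -> bool:
    """Return True if s can be obtained from t by deleting zero or more characters."""
    it = iter(t)
    # `ch in it` consumes the iterator up to the first match,
    # so later characters of s can only match later positions of t.
    return all(ch in it for ch in s)


def is_common_subsequence(s: str, x: str, y: str) -> bool:
    """Return True if s is a subsequence of both x and y."""
    return is_subsequence(s, x) and is_subsequence(s, y)


X, Y = "ABCBDAB", "BDCABA"
for candidate in ["ABA", "BCA", "BCB", "BBA", "BCBA", "BDAB"]:
    print(candidate, is_common_subsequence(candidate, X, Y))
```

Each candidate listed on the slide should be reported as a common subsequence of X and Y.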
Longest Common Subsequence (Brute-Force Approach)
• If |X| = m and |Y| = n, then X has 2^m subsequences; each must be checked against Y (O(n) comparisons each).
• So the running time of the brute-force algorithm is O(n 2^m), which makes it impractical for long sequences.
• The LCS problem has optimal substructure: solutions of subproblems are parts of the final solution.
• Subproblems: "find the LCS of pairs of prefixes of X and Y"
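To make the brute-force cost concrete, here is a small sketch that enumerates subsequences of X from longest to shortest and tests each against Y. The function name lcs_brute_force and the use of itertools.combinations to enumerate index sets are choices made for this illustration, not part of the lecture.

```python
from itertools import combinations


def is_subsequence(s: str, t: str) -> bool:
    """Return True if s is a subsequence of t."""
    it = iter(t)
    return all(ch in it for ch in s)


def lcs_brute_force(x: str, y: str) -> str:
    """Try every subsequence of x, longest first; return the first one found in y.

    In the worst case all 2^m index subsets of x are generated, and each
    check against y costs O(n), matching the O(n 2^m) bound.
    """
    for k in range(len(x), -1, -1):
        for idx in combinations(range(len(x)), k):
            candidate = "".join(x[i] for i in idx)
            if is_subsequence(candidate, y):
                return candidate
    return ""


print(lcs_brute_force("ABCB", "BDCAB"))
```

This is only usable for very short strings, which is exactly the point of the slide.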
LCS: Optimal Substructure
LCS: Setup for Dynamic Programming
• First we'll find the length of an LCS; along the way we will leave "clues" for recovering the subsequence itself.
• Define Xi and Yj to be the prefixes of X and Y of length i and j respectively.
• Define c[i,j] to be the length of an LCS of Xi and Yj.
• Then the length of an LCS of X and Y is c[m,n].
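LCS recurrence
• c[i,j] = 0 if i = 0 or j = 0
• c[i,j] = c[i-1,j-1] + 1 if i, j > 0 and x[i] = y[j]
• c[i,j] = max(c[i,j-1], c[i-1,j]) if i, j > 0 and x[i] != y[j]
• Notice the issues: evaluated as a plain recursion, this recurrence takes exponential time.
• We don't know ahead of time which argument of the max is larger, so both must be computed.
• The subproblems overlap: to find c[i,j] we need c[i,j-1] and c[i-1,j], which in turn share subproblems.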
LCS Algorithm
• First we'll find the length of the LCS. Later we'll modify the algorithm to find the LCS itself.
• Recall that Xi and Yj are the prefixes of X and Y of length i and j respectively,
• and that c[i,j] is the length of an LCS of Xi and Yj.
• Then the length of an LCS of X and Y is c[m,n].
LCS Recursive Solution
• We start with i = j = 0 (empty prefixes of X and Y).
• Since X0 and Y0 are empty strings, their LCS is empty (i.e. c[0,0] = 0).
• The LCS of the empty string and any other string is empty, so for every i and j: c[0,j] = c[i,0] = 0
LCS Recursive Solution
• When we calculate c[i,j], we consider two cases:
• First case: x[i] = y[j]. One more symbol of X and Y matches, so the length of the LCS of Xi and Yj equals the length of the LCS of the shorter prefixes Xi-1 and Yj-1, plus 1.
LCS Recursive Solution
• Second case: x[i] != y[j]
• As the symbols don't match, the solution is not improved, and the length of LCS(Xi, Yj) is the same as before, i.e. the maximum of LCS(Xi, Yj-1) and LCS(Xi-1, Yj).
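The base case and the two cases above translate directly into a recursive function. The sketch below is a literal, unoptimised transcription; the name lcs_length_naive and the call counter are additions for illustration. Python strings are 0-based, so x[i - 1] plays the role of x[i] on the slides.

```python
def lcs_length_naive(x: str, y: str):
    """Return (LCS length, number of recursive calls) using the plain recurrence."""
    calls = 0

    def c(i: int, j: int) -> int:
        nonlocal calls
        calls += 1
        if i == 0 or j == 0:           # base case: one prefix is empty
            return 0
        if x[i - 1] == y[j - 1]:       # first case: last symbols match
            return c(i - 1, j - 1) + 1
        # second case: no match, take the better of the two smaller problems
        return max(c(i, j - 1), c(i - 1, j))

    return c(len(x), len(y)), calls


length, calls = lcs_length_naive("ABCB", "BDCAB")
print(length, calls)
```

When few symbols match, the second case dominates and the call count grows exponentially, which motivates filling a table instead.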
LCS Example
We'll see how the LCS algorithm works on the following example:
X = ABCB
Y = BDCAB
What is the longest common subsequence of X and Y?
LCS(X, Y) = BCB
X = A B C B
Y = B D C A B
LCS Example (0)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |
 1  A  |
 2  B  |
 3  C  |
 4  B  |
X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate array c[0..4, 0..5], i.e. (m+1) x (n+1) = 5 x 6 entries
LCS Example (1)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |
 1  A  |     0
 2  B  |     0
 3  C  |     0
 4  B  |     0
for i = 1 to m: c[i,0] = 0
LCS Example (2)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0
 2  B  |     0
 3  C  |     0
 4  B  |     0
for j = 0 to n: c[0,j] = 0
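The allocation and the two initialisation loops can be sketched in one step. The helper name make_table is hypothetical; the table is a list of lists with m + 1 rows and n + 1 columns, so row 0 and column 0 hold the base cases.

```python
def make_table(m: int, n: int):
    """Allocate an (m+1) x (n+1) table filled with zeros.

    Row 0 and column 0 are the base cases c[0][j] = c[i][0] = 0;
    the remaining entries are overwritten later by the main loops.
    """
    # A fresh inner list per row; [[0] * (n + 1)] * (m + 1) would alias rows.
    return [[0] * (n + 1) for _ in range(m + 1)]


X, Y = "ABCB", "BDCAB"
c = make_table(len(X), len(Y))
for row in c:
    print(row)
```

For the running example this yields 5 rows of 6 zeros.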
LCS Example (3)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0
 2  B  |     0
 3  C  |     0
 4  B  |     0
case i=1 and j=1: A != B, but c[0,1] >= c[1,0], so c[1,1] = c[0,1] and b[1,1] = ↑
LCS Example (4)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0
 2  B  |     0
 3  C  |     0
 4  B  |     0
case i=1 and j=2: A != D, but c[0,2] >= c[1,1], so c[1,2] = c[0,2] and b[1,2] = ↑
LCS Example (5)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0
 2  B  |     0
 3  C  |     0
 4  B  |     0
case i=1 and j=3: A != C, but c[0,3] >= c[1,2], so c[1,3] = c[0,3] and b[1,3] = ↑
LCS Example (6)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1
 2  B  |     0
 3  C  |     0
 4  B  |     0
case i=1 and j=4: A = A, so c[1,4] = c[0,3] + 1 and b[1,4] = ↖
LCS Example (7)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0
 3  C  |     0
 4  B  |     0
case i=1 and j=5: A != B, and this time c[0,5] < c[1,4], so c[1,5] = c[1,4] and b[1,5] = ←
LCS Example (8)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1
 3  C  |     0
 4  B  |     0
case i=2 and j=1: B = B, so c[2,1] = c[1,0] + 1 and b[2,1] = ↖
LCS Example (9)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1
 3  C  |     0
 4  B  |     0
case i=2 and j=2: B != D and c[1,2] < c[2,1], so c[2,2] = c[2,1] and b[2,2] = ←
LCS Example (10)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1
 3  C  |     0
 4  B  |     0
case i=2 and j=3: B != C and c[1,3] < c[2,2], so c[2,3] = c[2,2] and b[2,3] = ←
LCS Example (11)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1
 3  C  |     0
 4  B  |     0
case i=2 and j=4: B != A and c[1,4] = c[2,3], so c[2,4] = c[1,4] and b[2,4] = ↑
LCS Example (12)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0
 4  B  |     0
case i=2 and j=5: B = B, so c[2,5] = c[1,4] + 1 and b[2,5] = ↖
LCS Example (13)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1
 4  B  |     0
case i=3 and j=1: C != B and c[2,1] > c[3,0], so c[3,1] = c[2,1] and b[3,1] = ↑
LCS Example (14)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1  1
 4  B  |     0
case i=3 and j=2: C != D and c[2,2] = c[3,1], so c[3,2] = c[2,2] and b[3,2] = ↑
LCS Example (15)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1  1  2
 4  B  |     0
case i=3 and j=3: C = C, so c[3,3] = c[2,2] + 1 and b[3,3] = ↖
LCS Example (16)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1  1  2  2
 4  B  |     0
case i=3 and j=4: C != A and c[2,4] < c[3,3], so c[3,4] = c[3,3] and b[3,4] = ←
LCS Example (17)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1  1  2  2  2
 4  B  |     0
case i=3 and j=5: C != B and c[2,5] = c[3,4], so c[3,5] = c[2,5] and b[3,5] = ↑
LCS Example (18)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1  1  2  2  2
 4  B  |     0  1
case i=4 and j=1: B = B, so c[4,1] = c[3,0] + 1 and b[4,1] = ↖
LCS Example (19)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1  1  2  2  2
 4  B  |     0  1  1
case i=4 and j=2: B != D and c[3,2] = c[4,1], so c[4,2] = c[3,2] and b[4,2] = ↑
LCS Example (20)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1  1  2  2  2
 4  B  |     0  1  1  2
case i=4 and j=3: B != C and c[3,3] > c[4,2], so c[4,3] = c[3,3] and b[4,3] = ↑
LCS Example (21)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1  1  2  2  2
 4  B  |     0  1  1  2  2
case i=4 and j=4: B != A and c[3,4] = c[4,3], so c[4,4] = c[3,4] and b[4,4] = ↑
LCS Example (22)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1  1  2  2  2
 4  B  |     0  1  1  2  2  3
case i=4 and j=5: B = B, so c[4,5] = c[3,4] + 1 and b[4,5] = ↖
LCS Algorithm
LCS-Length(X, Y)
  m = length(X), n = length(Y)
  for i = 1 to m do c[i, 0] = 0
  for j = 0 to n do c[0, j] = 0
  for i = 1 to m do
    for j = 1 to n do
      if (xi == yj) then
        c[i, j] = c[i - 1, j - 1] + 1;  b[i, j] = "↖"
      else if c[i - 1, j] >= c[i, j - 1] then
        c[i, j] = c[i - 1, j];  b[i, j] = "↑"
      else
        c[i, j] = c[i, j - 1];  b[i, j] = "←"
  return c and b
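A runnable counterpart of LCS-Length might look as follows. This is a sketch under stated assumptions: the arrow symbols stored in b ("↖" for a match, "↑" when the value comes from the row above, "←" otherwise) are chosen here for illustration, and Python's 0-based strings mean x[i - 1] stands for xi.

```python
def lcs_length(x: str, y: str):
    """Fill the length table c and the direction table b bottom-up."""
    m, n = len(x), len(y)
    # Row 0 and column 0 stay 0: the base cases of the recurrence.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "↖"                  # symbols match: diagonal step
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "↑"                  # value taken from the row above
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "←"                  # value taken from the left
    return c, b


c, b = lcs_length("ABCB", "BDCAB")
for row in c:
    print(row)
```

The printed rows should reproduce the table built step by step in the example, with c[4][5] holding the LCS length.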
LCS Algorithm Running Time
• The LCS algorithm calculates the value of every entry of the array c[0..m, 0..n].
• So the running time is O(mn), as each entry takes constant time (at most 3 steps).
• Now how do we get at the solution itself?
• We use the arrows we recorded in b to guide us.
• We simply follow the arrows back from c[m,n] to the base case 0.
Finding LCS
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1  1  2  2  2
 4  B  |     0  1  1  2  2  3
Finding LCS (2)
 i  Xi | j   0  1  2  3  4  5
       | Yj     B  D  C  A  B
 0     |     0  0  0  0  0  0
 1  A  |     0  0  0  0  1  1
 2  B  |     0  1  1  1  1  2
 3  C  |     0  1  1  2  2  2
 4  B  |     0  1  1  2  2  3
Path back from c[4,5]: c[4,5] ↖ c[3,4] ← c[3,3] ↖ c[2,2] ← c[2,1] ↖ c[1,0]
LCS: B C B
Finding LCS (3)
Print_LCS(X, i, j)
  if i = 0 or j = 0 then return
  if b[i, j] = "↖" then
    Print_LCS(X, i-1, j-1)
    Print X[i]
  elseif b[i, j] = "↑" then
    Print_LCS(X, i-1, j)
  else
    Print_LCS(X, i, j-1)
Cost: O(m+n)
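The trace-back can be sketched in Python as below. To keep the block self-contained it rebuilds the direction table first; the function name lcs and the arrow symbols are assumptions made for this illustration, and the result is returned as a string instead of being printed symbol by symbol.

```python
def lcs(x: str, y: str) -> str:
    """Build the tables, then follow the arrows back from (m, n)."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j], b[i][j] = c[i - 1][j - 1] + 1, "↖"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j], b[i][j] = c[i - 1][j], "↑"
            else:
                c[i][j], b[i][j] = c[i][j - 1], "←"

    out = []

    def walk(i: int, j: int) -> None:
        # Each call decreases i, j, or both, so at most m + n calls are made.
        if i == 0 or j == 0:
            return
        if b[i][j] == "↖":
            walk(i - 1, j - 1)
            out.append(x[i - 1])   # emit after recursing: symbols come out in order
        elif b[i][j] == "↑":
            walk(i - 1, j)
        else:
            walk(i, j - 1)

    walk(m, n)
    return "".join(out)


print(lcs("ABCB", "BDCAB"))
```

For very long inputs the recursion could be replaced by a loop that collects symbols in reverse, since Python limits recursion depth.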
Elements of Dynamic Programming
• Optimal Substructure
• Overlapping Subproblems
Optimal Substructure
• A problem exhibits optimal substructure if an optimal solution contains optimal solutions to its subproblems.
• We build an optimal solution from optimal solutions to subproblems.
• Solutions of subproblems are parts of the final solution.
• Example: Longest Common Subsequence. An LCS contains within it LCSs of prefixes of the two input sequences.
• This property is shared with greedy algorithms.
Overlapping Subproblems
• Divide-and-conquer is suitable when the recursion generates brand-new subproblems at each step.
• Dynamic-programming algorithms take advantage of overlapping subproblems by solving each subproblem once and storing the solution in a table, where it can be looked up when needed in constant time per lookup.
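One way to see the effect of storing subproblem solutions is to memoise the LCS recurrence. The sketch below uses functools.lru_cache from the Python standard library as the lookup table; the function name lcs_length_memo is hypothetical, and this top-down variant is shown for illustration alongside the bottom-up table used in the lecture.

```python
from functools import lru_cache


def lcs_length_memo(x: str, y: str):
    """Return (LCS length, number of distinct subproblems actually solved)."""

    @lru_cache(maxsize=None)
    def c(i: int, j: int) -> int:
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return c(i - 1, j - 1) + 1
        return max(c(i, j - 1), c(i - 1, j))

    length = c(len(x), len(y))
    # Every cache miss is a subproblem solved for the first time;
    # repeated requests are answered from the cache instead.
    return length, c.cache_info().misses


print(lcs_length_memo("ABCB", "BDCAB"))
```

The number of solved subproblems is bounded by (m+1)(n+1), in contrast to the exponential number of calls made by the uncached recursion.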
Dynamic Programming vs. Greedy
• Dynamic programming uses optimal substructure in a bottom-up fashion:
• First find optimal solutions to the subproblems and, having solved them, find an optimal solution to the problem.
• Greedy algorithms use optimal substructure in a top-down fashion:
• First make a choice (the choice that looks best at the time) and then solve the resulting subproblem.