CSC 213 Lecture 19: Dynamic Programming and LCS
Subsequences (§ 11.5.1)
• A subsequence of a string x_0 x_1 x_2 … x_{n-1} is a string of the form x_{i_1} x_{i_2} … x_{i_k}, where i_j < i_{j+1}
• This is not the same thing as a substring!
  • Subsequences can skip letters in the string
  • Substrings must use consecutive letters
• Example string: ABCDEFGHIJK
  • Subsequence (and substring): DEFGH
  • Subsequence (but NOT substring): ACEFHJ
  • Neither subsequence nor substring: DAGH
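The distinction between subsequence and substring can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the function names `is_subsequence` and `is_substring` are hypothetical helpers chosen for illustration.

```python
def is_subsequence(candidate: str, text: str) -> bool:
    """Return True if candidate can be formed by deleting zero or more characters of text."""
    remaining = iter(text)
    # Each `ch in remaining` consumes the iterator up to and including the
    # first match, so later characters must be found further to the right.
    return all(ch in remaining for ch in candidate)


def is_substring(candidate: str, text: str) -> bool:
    """Return True if candidate appears as consecutive characters of text."""
    return candidate in text


text = "ABCDEFGHIJK"
for candidate in ("DEFGH", "ACEFHJ", "DAGH"):
    print(candidate, is_subsequence(candidate, text), is_substring(candidate, text))
```

Run on the slide's example string, this reports DEFGH as both, ACEFHJ as a subsequence only, and DAGH as neither.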
Longest Common Subsequence (LCS) Problem
• Given two strings X and Y, find the longest subsequence common to both X and Y
• Applications in DNA testing (Σ = {A, C, G, T})
• Example: the LCS of ABCDEFG and XZACKDFWGH is ACDFG
Dynamic Programming
• Some problems appear hard
  • There does not seem to be a simple solution
  • They seem to require a brute-force approach: evaluate every candidate solution
  • This means repeatedly reevaluating many of the same options
  • Ultimately, this takes exponential time: O(2^n)
• For a class of these problems, however, the solution is dynamic programming
Dynamic Programming
• Works for problems with:
  • Simple subproblems: subproblems can be defined using only a few simple variables
  • Subproblem optimality: the solution to the problem can be defined in terms of the subproblem solutions
  • Subproblem overlap: subproblems overlap, so the solution to one subproblem can (help) solve later subproblems
How Not to Solve LCS in Your Lifetime
• Brute-force solution:
  • List all subsequences of X
  • Check each one to see if it is also a subsequence of Y
  • Return the longest of these
• Analysis:
  • If X has length n, it has 2^n subsequences
  • While waiting, you could not only get coffee, but first fly to Colombia and pick the beans!
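For illustration only, the brute-force approach can be sketched in Python as follows. The names `lcs_brute_force` and `is_subsequence` are hypothetical, and the sketch tries longer candidates first so it can stop at the first match, a small shortcut the slide does not mention. The worst case still examines all 2^n subsequences of X.

```python
from itertools import combinations


def is_subsequence(candidate: str, text: str) -> bool:
    """Return True if candidate is a subsequence of text."""
    remaining = iter(text)
    return all(ch in remaining for ch in candidate)


def lcs_brute_force(x: str, y: str) -> str:
    """Enumerate subsequences of x, longest first, and return the first one found in y."""
    for length in range(len(x), -1, -1):
        # Every choice of `length` positions in x yields one subsequence.
        for positions in combinations(range(len(x)), length):
            candidate = "".join(x[i] for i in positions)
            if is_subsequence(candidate, y):
                return candidate
    return ""  # not reached: the empty string always matches at length 0


print(lcs_brute_force("ABCDEFG", "XZACKDFWGH"))  # prints ACDFG
```

This is only practical for very short strings; each extra character in X doubles the number of candidates.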
How to Solve LCS Quickly
• If X and Y are each 1 character, the LCS length is 0 or 1
• If we then add 1 character to X and Y, the LCS length increases by at most 1
• Note that we do not need to rescan the first character
Dynamic-Programming Solution
• Use an array L to hold the solutions to subproblems
  • L[i,j] stores the length of the LCS of X[0..i] and Y[0..j]
• Define the array to include an index of -1
  • L[-1,*] holds the LCS length for X[0..-1] = “”
  • L[*,-1] holds the LCS length for Y[0..-1] = “”
  • L[-1,*] = 0 and L[*,-1] = 0, since there are no characters to match!
Dynamic-Programming Solution
• Solve for the remaining L[i,j] as follows:
  • If x_i = y_j, then L[i,j] = L[i-1, j-1] + 1
    • I.e., one more than the previous solution
  • If x_i ≠ y_j, then L[i,j] = max(L[i-1, j], L[i, j-1])
    • I.e., reuse the best result found so far
• The final result is stored in L[n-1, m-1], where n and m are the lengths of X and Y
[Figures: Case 1 and Case 2]
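To see why subproblem overlap matters, the recurrence above can be evaluated top-down, once without caching and once with it. This is a hedged sketch rather than the lecture's method: the function names and the call counters are assumptions added for illustration, and `functools.lru_cache` stands in for the array L.

```python
from functools import lru_cache
from typing import Tuple


def lcs_length_naive(x: str, y: str) -> Tuple[int, int]:
    """Return (LCS length, number of recursive calls) with no caching."""
    calls = 0

    def solve(i: int, j: int) -> int:
        nonlocal calls
        calls += 1
        if i < 0 or j < 0:  # index -1: one prefix is empty
            return 0
        if x[i] == y[j]:
            return solve(i - 1, j - 1) + 1
        return max(solve(i - 1, j), solve(i, j - 1))

    return solve(len(x) - 1, len(y) - 1), calls


def lcs_length_memo(x: str, y: str) -> Tuple[int, int]:
    """Same recurrence, but each (i, j) subproblem is computed only once."""
    calls = 0

    @lru_cache(maxsize=None)
    def solve(i: int, j: int) -> int:
        nonlocal calls
        calls += 1  # counts cache misses only
        if i < 0 or j < 0:
            return 0
        if x[i] == y[j]:
            return solve(i - 1, j - 1) + 1
        return max(solve(i - 1, j), solve(i, j - 1))

    return solve(len(x) - 1, len(y) - 1), calls


print(lcs_length_naive("ABCDEFG", "XZACKDFWGH"))
print(lcs_length_memo("ABCDEFG", "XZACKDFWGH"))
```

Both versions return the same length; the cached version never computes more than (n+1)(m+1) distinct subproblems, while the uncached one may revisit the same (i, j) pair many times.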
LCS Algorithm
Algorithm LCS(String X, String Y):
  for i ← -1 to n-1
    L[i, -1] ← 0
  for j ← 0 to m-1
    L[-1, j] ← 0
  for i ← 0 to n-1
    for j ← 0 to m-1
      if x_i = y_j then
        L[i, j] ← L[i-1, j-1] + 1
      else
        L[i, j] ← max(L[i-1, j], L[i, j-1])
  return L
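A runnable version of this pseudocode might look like the Python sketch below. It makes one assumption not in the slides: because Python lists have no index -1 in the intended sense, every index is shifted by one, so `table[i + 1][j + 1]` plays the role of L[i, j] and row 0 and column 0 stand in for index -1. The traceback function `lcs_string` is also an addition for illustration, since the pseudocode returns only the table of lengths.

```python
from typing import List


def lcs_table(x: str, y: str) -> List[List[int]]:
    """Fill the DP table; table[i + 1][j + 1] corresponds to L[i, j]."""
    n, m = len(x), len(y)
    # Row 0 and column 0 represent index -1 (an empty prefix) and stay 0.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            if x[i] == y[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    return table


def lcs_string(x: str, y: str) -> str:
    """Walk back from the last cell to recover one longest common subsequence."""
    table = lcs_table(x, y)
    i, j = len(x), len(y)
    chars = []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            chars.append(x[i - 1])  # matched character is part of the LCS
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(chars))


print(lcs_table("ABCDEFG", "XZACKDFWGH")[-1][-1])  # prints 5
print(lcs_string("ABCDEFG", "XZACKDFWGH"))         # prints ACDFG
```

The two nested loops visit each of the n·m cells once and do constant work per cell, so filling the table takes O(nm) time, compared with the exponential cost of brute force.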