Longest common subsequence (LCS) Problem
• Notation: let X_k denote the sequence <x_1, x_2, …, x_k>
• Given two sequences X_m = <x_1, x_2, …, x_m> and Y_n = <y_1, y_2, …, y_n>, find a longest subsequence common to them:
  • The elements of the common subsequence appear in both X_m and Y_n
  • These elements must appear in the same order, but not necessarily consecutively
• Example
  • X_7 = <A, B, C, B, D, A, B>, Y_6 = <B, D, C, A, B, A>
  • The sequence <B, C, A> is a common subsequence of X_7 and Y_6
  • Another common subsequence: <B, C, B, A>
  • Another one: <B, D, A, B>
  • What is the longest?
• The longer the common subsequence, the more similar the two sequences are
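The definition above can be checked mechanically. The following Python sketch tests whether a candidate keeps its elements in the same order, though not necessarily consecutively, inside both sequences. The helper names `is_subsequence` and `is_common_subsequence` are illustrative and do not come from the slides, and sequences are assumed to be given as strings.

```python
def is_subsequence(z, x):
    """Return True if z can be obtained from x by deleting zero or more elements."""
    it = iter(x)
    # `ch in it` advances the iterator until it finds ch, so order is enforced.
    return all(ch in it for ch in z)


def is_common_subsequence(z, x, y):
    """Return True if z is a subsequence of both x and y."""
    return is_subsequence(z, x) and is_subsequence(z, y)


X, Y = "ABCBDAB", "BDCABA"
for cand in ("BCA", "BCBA", "BDAB", "ABD"):
    print(cand, is_common_subsequence(cand, X, Y))
```

The first three candidates are the ones listed on the slide; `ABD` is included as a counterexample because it occurs in order in X but not in Y.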
Characterizing the problem
• Find an LCS of the two sequences X_m = <x_1, x_2, …, x_m> and Y_n = <y_1, y_2, …, y_n>
• Let Z_k = <z_1, z_2, …, z_k> be any LCS of X_m and Y_n
1. If x_m = y_n, then z_k = x_m = y_n, and we must find an LCS Z_(k-1) of X_(m-1) and Y_(n-1) (appending x_m = y_n to this LCS yields an LCS of X_m and Y_n)
2. If x_m ≠ y_n, then we must solve two sub-problems:
  • Finding an LCS of X_(m-1) and Y_n (if z_k ≠ x_m)
  • Finding an LCS of X_m and Y_(n-1) (if z_k ≠ y_n)
  Whichever of these two sequences is longer is an LCS of X_m and Y_n
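Computing the length of an LCS
• Let c[m][n] be the length of an LCS of X_m and Y_n
• c[m][n] = 0                              if m = 0 or n = 0
• c[m][n] = c[m-1][n-1] + 1                if m, n > 0 and x[m] = y[n]
• c[m][n] = max(c[m][n-1], c[m-1][n])      otherwise
[Figure: recursion tree rooted at C[3,4]. Its children are C[2,3], C[2,4] and C[3,3]; the next level is C[1,2], C[1,3], C[2,2] | C[1,3], C[1,4], C[2,3] | C[2,2], C[2,3], C[3,2]; one C[2,3] node is expanded again into C[1,2], C[1,3], C[2,2]. Sub-problems such as C[2,3] and C[2,2] appear in several branches.]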
A recursive solution

int LCS(Array X, Array Y, int m, int n) {
    // X[1..m] and Y[1..n] are indexed from 1
    if ((m == 0) || (n == 0))
        return 0;
    if (X[m] == Y[n])
        return LCS(X, Y, m-1, n-1) + 1;
    else
        return max(LCS(X, Y, m, n-1), LCS(X, Y, m-1, n));
}
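The plain recursion reaches the same (m, n) pair along many different paths and recomputes it each time. One common remedy, sketched here in Python, is to cache each result the first time it is computed. The name `lcs_length_memo` is illustrative, inputs are assumed to be strings, and Python's 0-based indexing replaces the 1-based X[m], Y[n] of the pseudocode.

```python
from functools import lru_cache


def lcs_length_memo(x, y):
    """Length of an LCS of x and y, computed top-down with caching."""

    @lru_cache(maxsize=None)
    def c(m, n):
        # c(m, n) is the LCS length of the prefixes x[:m] and y[:n].
        if m == 0 or n == 0:
            return 0
        if x[m - 1] == y[n - 1]:  # x_m == y_n in the 1-based notation
            return c(m - 1, n - 1) + 1
        return max(c(m, n - 1), c(m - 1, n))

    return c(len(x), len(y))


print(lcs_length_memo("ABCBDAB", "BDCABA"))  # 4
```

With the cache, each of the (m+1)(n+1) sub-problems is solved at most once. For long inputs the recursion depth may exceed Python's default limit, which is one reason a bottom-up table is often preferred.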
A DP solution

LCS-Length(X, Y, c) {
    m = length[X];
    n = length[Y];
    for i = 0 to m
        c[i][0] = 0;    // an empty prefix of Y has an empty LCS
    for j = 0 to n
        c[0][j] = 0;    // an empty prefix of X has an empty LCS
    for i = 1 to m
        for j = 1 to n
            if (x_i == y_j)
                c[i][j] = c[i-1][j-1] + 1;
            else
                c[i][j] = max(c[i-1][j], c[i][j-1]);
}
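As a runnable counterpart to the pseudocode, the sketch below fills the same table bottom-up in Python. The function name `lcs_table` is illustrative, inputs are assumed to be strings, and the table is returned rather than passed in as a parameter.

```python
def lcs_table(x, y):
    """Return the (m+1) x (n+1) table c, where c[i][j] is the LCS length of x[:i] and y[:j]."""
    m, n = len(x), len(y)
    # Row 0 and column 0 stay 0: an empty prefix has an empty LCS.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:  # x_i == y_j in the 1-based notation
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c


table = lcs_table("ABCBDAB", "BDCABA")
for row in table:
    print(*row)
print("length:", table[7][6])  # length: 4
```

The two nested loops visit every cell once, so both time and space are proportional to m·n.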
Computing an LCS
• The procedure decrements m or n (or both) on every call, so in the worst case its running time is O(m+n)

LCS-print(X, m, n, c) {
    if (c[m][n] == 0)
        return;
    if (c[m][n] == c[m-1][n])
        LCS-print(X, m-1, n, c);
    else if (c[m][n] == c[m][n-1])
        LCS-print(X, m, n-1, c);
    else {
        LCS-print(X, m-1, n-1, c);
        print x_m;
    }
}
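The traceback can also be written as a loop. The Python sketch below builds the table and then walks back from c[m][n], testing the cell above first, then the cell to the left, and otherwise taking the diagonal step. The names `lcs_table` and `lcs_from_table` are illustrative, inputs are assumed to be strings, and the result is returned instead of printed.

```python
def lcs_table(x, y):
    """Fill c so that c[i][j] is the LCS length of x[:i] and y[:j]."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c


def lcs_from_table(x, c):
    """Walk back from c[m][n] and return one LCS as a string."""
    i, j = len(c) - 1, len(c[0]) - 1
    out = []
    while c[i][j] != 0:  # a non-zero cell implies i >= 1 and j >= 1
        if c[i][j] == c[i - 1][j]:
            i -= 1                  # x_i is not needed
        elif c[i][j] == c[i][j - 1]:
            j -= 1                  # y_j is not needed
        else:
            out.append(x[i - 1])    # x_i == y_j belongs to the LCS
            i -= 1
            j -= 1
    return "".join(reversed(out))   # elements were collected back to front


X, Y = "ABCBDAB", "BDCABA"
print(lcs_from_table(X, lcs_table(X, Y)))  # BCBA
```

Each iteration decreases i, j, or both, so the loop runs at most m+n times. A different tie-breaking order would be equally valid and may return a different LCS of the same length.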
Example
• Let X_7 = <A, B, C, B, D, A, B> and Y_6 = <B, D, C, A, B, A>. Find an LCS of X_7 and Y_6.
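Because this example is tiny, the answer can be cross-checked by exhaustive search. The Python sketch below is a brute-force check, not the DP method from the slides: it tries every subsequence of Y, longest first, and keeps those that also occur in X. The names `is_subsequence` and `all_lcs_bruteforce` are illustrative, and the approach is exponential, so it is only suitable for very short inputs.

```python
from itertools import combinations


def is_subsequence(z, x):
    """Return True if the elements of z appear in x in the same order."""
    it = iter(x)
    return all(ch in it for ch in z)


def all_lcs_bruteforce(x, y):
    """List every longest common subsequence of x and y (empty list if nothing is shared)."""
    for k in range(min(len(x), len(y)), 0, -1):
        # combinations() keeps the original order, so each tuple is a subsequence of y.
        found = {"".join(z) for z in combinations(y, k) if is_subsequence(z, x)}
        if found:
            return sorted(found)
    return []


print(all_lcs_bruteforce("ABCBDAB", "BDCABA"))  # ['BCAB', 'BCBA', 'BDAB']
```

The longest common subsequences have length 4, and there are three of them, so an LCS is not unique in general.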