A Linear Space Algorithm for Computing Maximal Common Subsequences
Author: D.S. Hirschberg
Publisher: Communications of the ACM, 1975
Presenter: Han-Chen Chen
Date: 2010/04/07
Outline
• Introduction
• Algorithm A
• Algorithm B
• Algorithm C
Introduction
• The LCS (Longest Common Subsequence) problem for two strings has previously been solved in quadratic time and quadratic space.
• We present an algorithm that solves this problem in quadratic time and linear space.
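Algorithm A
• Input: strings $A_{1m}$ and $B_{1n}$
• Output: matrix $L$, where $L(i,j)$ is the maximum length of common subsequences of $A_{1i}$ and $B_{1j}$ ($0 \le i \le m$, $0 \le j \le n$)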
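
The slide's pseudocode is not reproduced in this transcript, so the following is only a minimal Python sketch of the standard quadratic-space table fill that Algorithm A describes. The function name `lcs_table` and the use of Python strings with 0-based indexing are assumptions made for illustration, not the notation of the paper.

```python
def lcs_table(a: str, b: str) -> list[list[int]]:
    """Fill the full (m+1) x (n+1) table.

    L[i][j] is the maximum length of a common subsequence of
    a[:i] and b[:j]; row 0 and column 0 stay at zero.
    """
    m, n = len(a), len(b)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                # Matching symbols extend the diagonal entry by one.
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                # Otherwise carry over the better of the two neighbours.
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L


if __name__ == "__main__":
    table = lcs_table("ABCBDAB", "BDCABA")
    print(table[-1][-1])  # L(m, n), the maximal common subsequence length
```

The loop body runs once per pair (i, j), and the whole table is kept in memory, which is what the analysis slide that follows counts.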
Analysis of Algorithm A
• Time complexity: the loop body executes $m \cdot n$ times → $O(mn)$
• Space complexity: the input arrays take $m + n$ and the output array takes $(m+1)(n+1)$ → $O(mn)$ space required
Algorithm B (I)
• Space required: $O(m+n)$
• It outputs the length of a maximal common subsequence but does not record the subsequence itself.
Algorithm B (II)
• Input: strings $A_{1m}$ and $B_{1n}$
• Output: vector $LL$ of length $n+1$, where $LL(j) = L(m,j)$
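
As a rough illustration of the linear-space idea behind Algorithm B, the sketch below keeps only the previous and current rows of the table and returns the last row. The name `lcs_last_row` and the two-list layout are assumptions for this example; the slide's own pseudocode is not available in the transcript.

```python
def lcs_last_row(a: str, b: str) -> list[int]:
    """Return LL, where LL[j] is the LCS length of a and b[:j].

    Only two rows of length n+1 are alive at any time, so the extra
    storage is proportional to n instead of m*n.
    """
    n = len(b)
    prev = [0] * (n + 1)  # row i-1 of the table
    for ch in a:
        cur = [0] * (n + 1)  # row i of the table
        for j in range(1, n + 1):
            if ch == b[j - 1]:
                cur[j] = prev[j - 1] + 1
            else:
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    return prev


if __name__ == "__main__":
    print(lcs_last_row("ABCBDAB", "BDCABA"))
```

Because earlier rows are discarded, the length can be read from the last entry, but there is nothing left to trace back through to recover the subsequence itself.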
Analysis of Algorithm B
• Time complexity: the loop body executes $m \cdot n$ times → $O(mn)$
• Space complexity: the input arrays take $m + n$ and the output array takes $n+1$ → $O(m+n)$ space required
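Algorithm C
• Divide and conquer
• [Figure: string A ($1 \dots m$) is split at $i = m/2$ into $A_{1i}$ and $A_{i+1,m}$; ALG B is run on each half to find $j$, which splits string B ($1 \dots n$) into $B_{1j}$ and $B_{j+1,n}$.]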
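Algorithm C
• $L(i,j)$, $j = 0, \dots, n$: the maximum length of common subsequences of $A_{1i}$ and $B_{1j}$
• $L^*(i,j)$, $j = 0, \dots, n$: the maximum length of common subsequences of $A_{i+1,m}$ and $B_{j+1,n}$ (equivalently, of the reversed strings $A_{m,i+1}$ and $B_{n,j+1}$)
• Define $M(i) = \max_{0 \le j \le n} \{ L(i,j) + L^*(i,j) \}$
• Theorem: $M(i) = L(m,n)$
• Proof:
• $L(i,j) + L^*(i,j) \le L(m,n)$ for all $j$:
  $S(i,j)$: any maximal common subsequence of $A_{1i}$ and $B_{1j}$
  $S^*(i,j)$: any maximal common subsequence of $A_{i+1,m}$ and $B_{j+1,n}$
  Then $C = S(i,j) \,\|\, S^*(i,j)$ is a common subsequence of $A_{1m}$ and $B_{1n}$ of length $L(i,j) + L^*(i,j)$.
  Thus $L(m,n) \ge L(i,j) + L^*(i,j)$ for every $j$, and therefore $L(m,n) \ge M(i)$.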
Algorithm C
• There exists some $j$ with $L(i,j) + L^*(i,j) \ge L(m,n)$:
  $S(m,n)$: any maximal common subsequence of $A_{1m}$ and $B_{1n}$
  $S(m,n)$ is a subsequence of $A_{1m}$, so $S(m,n) = S_1 \,\|\, S_2$, where $S_1$ is a subsequence of $A_{1i}$ and $S_2$ is a subsequence of $A_{i+1,m}$.
  $S(m,n)$ is also a subsequence of $B_{1n}$, so there exists $j$ such that $S_1$ is a subsequence of $B_{1j}$ and $S_2$ is a subsequence of $B_{j+1,n}$.
  By the definitions of $L$ and $L^*$, $|S_1| \le L(i,j)$ and $|S_2| \le L^*(i,j)$.
  Thus $L(m,n) = |S(m,n)| = |S_1| + |S_2| \le L(i,j) + L^*(i,j)$.
• So $M(i) = \max_{0 \le j \le n} \{ L(i,j) + L^*(i,j) \} = L(m,n)$
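
The theorem can be spot-checked numerically. The sketch below, under the assumption that a two-row helper (here called `lcs_last_row`, a hypothetical name) stands in for ALG B, computes $M(i)$ from a forward pass over the first part of A and a pass over the reversed remainder. The function name `m_of_i` is likewise made up for this illustration.

```python
def lcs_last_row(a: str, b: str) -> list[int]:
    """Two-row LCS pass: entry j is the LCS length of a and b[:j]."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if ch == b[j - 1]:
                cur[j] = prev[j - 1] + 1
            else:
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    return prev


def m_of_i(a: str, b: str, i: int) -> int:
    """M(i) = max over j of L(i, j) + L*(i, j)."""
    n = len(b)
    fwd = lcs_last_row(a[:i], b)              # fwd[j]     = L(i, j)
    bwd = lcs_last_row(a[i:][::-1], b[::-1])  # bwd[n - j] = L*(i, j)
    return max(fwd[j] + bwd[n - j] for j in range(n + 1))


if __name__ == "__main__":
    a, b = "ABCBDAB", "BDCABA"
    full = lcs_last_row(a, b)[-1]  # L(m, n)
    print([m_of_i(a, b, i) == full for i in range(len(a) + 1)])
```

Running the reversed pass on the reversed strings is what lets a single left-to-right routine produce $L^*(i,j)$ for every $j$ at once.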
Algorithm C
• [Pseudocode figure; the second ALG B call runs on the reversed string $A_{m,i+1}$.]
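
Since the pseudocode figure did not survive, here is a hedged Python sketch of the divide-and-conquer scheme the Algorithm C slides describe: split A in the middle, use two linear-space passes to pick the split point $j$ of B, and recurse. The names `hirschberg` and `lcs_last_row` are assumptions for this example, and string slicing is used for brevity where an index-based version would avoid copying.

```python
def lcs_last_row(a: str, b: str) -> list[int]:
    """Two-row LCS pass: entry j is the LCS length of a and b[:j]."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if ch == b[j - 1]:
                cur[j] = prev[j - 1] + 1
            else:
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    return prev


def hirschberg(a: str, b: str) -> str:
    """Return one maximal common subsequence of a and b."""
    m, n = len(a), len(b)
    if m == 0 or n == 0:
        return ""
    if m == 1:
        # Trivial problem: the single symbol either occurs in b or not.
        return a if a in b else ""
    i = m // 2
    fwd = lcs_last_row(a[:i], b)              # L(i, j) for all j
    bwd = lcs_last_row(a[i:][::-1], b[::-1])  # L*(i, j) is bwd[n - j]
    # Pick a j that attains M(i) = max_j L(i, j) + L*(i, j).
    j = max(range(n + 1), key=lambda k: fwd[k] + bwd[n - k])
    # Solve the two smaller problems and concatenate their answers.
    return hirschberg(a[:i], b[:j]) + hirschberg(a[i:], b[j:])


if __name__ == "__main__":
    print(hirschberg("ABCBDAB", "BDCABA"))
```

When several values of $j$ attain the maximum, this sketch takes the smallest; any of them yields a maximal common subsequence, though possibly a different one.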
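Analysis of Algorithm C (I)
• Time analysis: the top-level call spends $O(mn)$ time in ALG B. With $i = m/2$, the two subproblems together cover $i \cdot j + (m-i)(n-j) = mn/2$ table entries, and each further level of recursion halves the total again.
• $O(mn) + O(\tfrac{1}{2}mn) + O(\tfrac{1}{4}mn) + \dots = O(mn(1 + \tfrac{1}{2} + \tfrac{1}{4} + \dots)) = O(mn)$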
Analysis of Algorithm C (II)
• Space analysis: the calls to ALG B use temporary storage proportional to $m$ and $n$. Exclusive of recursive calls to ALG C, ALG C uses a constant amount of memory space. There are $2m-1$ calls to ALG C, so ALG C requires $O(m+n)$ memory space.
Proof of $2m-1$ calls to ALG C
• Let $m \le 2^r$
• For $m = 1$ there is $2 \cdot 1 - 1 = 1$ call to ALG C
• Assume that for $m = 2^r = M$ there are $2M-1$ calls to ALG C
• For $m' = 2^{r+1} = 2M$: the first call to ALG C partitions the problem into 2 parts, each of which makes $2M-1$ calls to ALG C. So there are $1 + (2M-1) + (2M-1) = 4M - 1 = 2m' - 1$ calls.
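
The call count can also be checked with a small recurrence. This sketch assumes that ALG C stops recursing only when $m = 1$ and splits A at $\lfloor m/2 \rfloor$; it ignores any early exit on an empty B, which could only reduce the number of calls. The name `count_calls` is hypothetical.

```python
def count_calls(m: int) -> int:
    """Number of ALG C invocations for a first string of length m >= 1."""
    if m <= 1:
        return 1  # base case: a single call, no recursion
    i = m // 2
    # One call for this level plus the calls made by the two halves.
    return 1 + count_calls(i) + count_calls(m - i)


if __name__ == "__main__":
    print(all(count_calls(m) == 2 * m - 1 for m in range(1, 65)))
```

Under these assumptions the count comes out to $2m-1$ for every $m \ge 1$, not only for powers of two.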