Simple and fast linear space computation of longest common subsequences

Presentation Transcript


  1. Simple and fast linear space computation of longest common subsequences. Claus Rick, 1999

  2. What is the LCS problem? [Strings shown: A A B A C / A B C] Finding a sequence of greatest possible length that can be obtained from both A and B by deleting zero or more (not necessarily adjacent) symbols.
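The definition on this slide can be made concrete with the textbook dynamic program. This is a minimal sketch, not the algorithm the talk presents: the function name `lcs_length` is hypothetical, and the example strings `AABAC` and `ABC` are an assumption about how the letters on the slide split into two strings. Only two rows of the table are kept, so the length is computed in linear space.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b.

    Textbook dynamic program that keeps only the previous and the
    current row, so it needs O(len(b)) space.
    """
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                # A match extends the best solution of the two shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last symbol of one of the prefixes.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# Assumed example strings from the slide.
print(lcs_length("AABAC", "ABC"))  # 3, e.g. the subsequence "ABC"
```

Keeping two rows yields the length only; recovering an actual subsequence in linear space is what the divide-and-conquer midpoint idea is for.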

  3. Some boring history…

  4. Pre-Info • Divide and conquer • Midpoint

  5. Some basic terms: Ordered pair (i,j). [Strings shown: A A B A C / A B C] Example: (2,3) = (A,C)

  6. Some basic terms: Match. [Strings shown: A A B A C / A B C]

  7. Some basic terms: Chain. [Strings shown: A A B A C / A B C]

  8. Rank k. [Strings shown: A A B A C / A B C]
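The slides show matches, chains and ranks only as pictures. The sketch below assumes the usual definitions: a match is a pair (i,j) with A[i] = B[j], a chain is a sequence of matches strictly increasing in both coordinates, and the rank of a match is the length of the longest chain ending at it. The name `match_ranks` and the 1-based indexing are choices made for this illustration.

```python
def match_ranks(a: str, b: str) -> dict:
    """Map every match (i, j), 1-based, to its rank.

    Assumed definition: the rank of a match is the length of the
    longest chain of matches that ends at it, which equals the LCS
    table entry L[i][j] at a matching position.
    """
    m, n = len(a), len(b)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    ranks = {}
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
                ranks[(i, j)] = L[i][j]
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return ranks


# Assumed example strings from the slides.
for match, rank in sorted(match_ranks("AABAC", "ABC").items()):
    print(match, rank)
```

This uses the full quadratic table for clarity; it is meant to pin down the terms, not to be space efficient.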

  9. Some basic terms: Matching matrix. [Figure: matching matrix of the strings c b a b b a c a c and a b a c b c b a]

  10. Some basic terms: Dominant matches. All upper-left matches in each rank.

  11. Dominant matches. [Figure: matching matrix of c b a b b a c a c and a b a c b c b a, ranks 1 to 5 marked]
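One way to read "all upper-left matches in each rank" is: a match of rank k is dominant if no other match of the same rank lies weakly above and to the left of it. The following sketch uses that reading as an assumption; `match_ranks` and `dominant_matches` are hypothetical names, and the quadratic filtering is chosen for readability rather than speed.

```python
from collections import defaultdict


def match_ranks(a: str, b: str) -> dict:
    """Map every match (i, j), 1-based, to the longest chain ending at it."""
    m, n = len(a), len(b)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    ranks = {}
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
                ranks[(i, j)] = L[i][j]
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return ranks


def dominant_matches(a: str, b: str) -> dict:
    """Map each rank k to the sorted list of its dominant matches.

    Assumed definition: a rank-k match (i, j) is dominant when no
    other rank-k match (p, q) satisfies p <= i and q <= j.
    """
    by_rank = defaultdict(list)
    for match, k in match_ranks(a, b).items():
        by_rank[k].append(match)
    dominant = {}
    for k, ms in sorted(by_rank.items()):
        dominant[k] = sorted(
            (i, j) for (i, j) in ms
            if not any((p, q) != (i, j) and p <= i and q <= j for (p, q) in ms)
        )
    return dominant


# Strings from the matching-matrix slides; one entry per rank 1..5.
for k, ms in dominant_matches("cbabbacac", "abacbcba").items():
    print(k, ms)
```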

  12. [Figure: strings A A B A C / A B C]

  13. [Figure: matching matrix of c b a b b a c a c and a b a c b c b a]

  14. Backward contours (BC). [Figure: matching matrix of c b a b b a c a c and a b a c b c b a, backward contours numbered 5 4 3 2 1]

  15. Some last basic terms: FCk, BCk

  16. Forward contours (FC). [Figure: matching matrix of c b a b b a c a c and a b a c b c b a, forward contours numbered 1 2 3 4 5]

  17. Backward contours (BC). [Figure: matching matrix of c b a b b a c a c and a b a c b c b a, backward contours numbered 5 4 3 2 1]

  18. Lemma 1. Let p be the length of an LCS of strings A and B. Then for every match (i,j) the following holds: there is an LCS containing (i,j) if and only if (i,j) is on the kth forward contour and on the (p-k+1)st backward contour.
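Lemma 1 can be checked on small inputs. The sketch below assumes that "on the kth forward contour" means the longest chain ending at the match has length k, and "on the kth backward contour" means the longest chain starting at the match has length k. Backward ranks are obtained by running the same computation on the reversed strings. All function names are hypothetical.

```python
def match_ranks(a: str, b: str) -> dict:
    """Map every match (i, j), 1-based, to the longest chain ending at it."""
    m, n = len(a), len(b)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    ranks = {}
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
                ranks[(i, j)] = L[i][j]
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return ranks


def matches_on_some_lcs(a: str, b: str):
    """Return (p, matches) where matches satisfy the condition of Lemma 1."""
    m, n = len(a), len(b)
    fwd = match_ranks(a, b)
    # Backward rank: longest chain starting at the match, computed on
    # the reversed strings and mapped back to original positions.
    bwd = {
        (m + 1 - i, n + 1 - j): k
        for (i, j), k in match_ranks(a[::-1], b[::-1]).items()
    }
    p = max(fwd.values(), default=0)
    on_lcs = sorted(ij for ij, k in fwd.items() if bwd[ij] == p - k + 1)
    return p, on_lcs


# Assumed example strings from the slides.
print(matches_on_some_lcs("AABAC", "ABC"))
# (3, [(1, 1), (2, 1), (3, 2), (5, 3)])
```

In this example the match (4,1) is excluded: it has forward rank 1 but backward rank 2, so the longest chain through it has length 2 rather than 3.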

  19. Lemma 1, proof. [Figure; labels shown: p, |FC| = k, |BC| = p-k+1, k, < (p-k+1)]

  20. Start calculating: FC1, BC1, FC2, BC2, … Sooner or later…

  21. Really, really last terms. Define sets Mi as: M_0 = M, M_1 = M_0 \ FC_1, M_2 = M_1 \ BC_1, …, M_(2i-1) = M_(2(i-1)) \ FC_i, M_(2i) = M_(2i-1) \ BC_i

  22. [Figure: matching matrices of c b a b b a c a c and a b a c b c b a, showing the set M]

  23. [Figure: matching matrices of c b a b b a c a c and a b a c b c b a, showing the sets M1, M2, M3, M4, M5]

  24. Let us call the first empty Mi: Mp'

  25. Lemma 2 • The length of an LCS is p', and each match in M(p'-1) is a possible midpoint.
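The alternating peeling behind Lemma 2 can be simulated directly from the definition of the sets Mi. In this sketch, FCi and BCi are assumed to be the sets of all matches with forward rank i and backward rank i respectively. The ranks come from full quadratic tables, so this illustrates the statement and not the linear-space method. All names are hypothetical.

```python
def match_ranks(a: str, b: str) -> dict:
    """Map every match (i, j), 1-based, to the longest chain ending at it."""
    m, n = len(a), len(b)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    ranks = {}
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
                ranks[(i, j)] = L[i][j]
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return ranks


def peel(a: str, b: str) -> list:
    """Return [M_0, M_1, ..., M_p'], ending with the first empty set."""
    m, n = len(a), len(b)
    fwd = match_ranks(a, b)
    bwd = {
        (m + 1 - i, n + 1 - j): k
        for (i, j), k in match_ranks(a[::-1], b[::-1]).items()
    }
    remaining = set(fwd)
    sets = [set(remaining)]  # M_0 = M
    step = 0
    while remaining:
        step += 1
        i = (step + 1) // 2  # contour index used in this step
        if step % 2 == 1:
            # Odd step 2i-1: remove the forward contour FC_i.
            remaining = {x for x in remaining if fwd[x] != i}
        else:
            # Even step 2i: remove the backward contour BC_i.
            remaining = {x for x in remaining if bwd[x] != i}
        sets.append(set(remaining))
    return sets


# Assumed example strings from the slides.
sets = peel("AABAC", "ABC")
print(len(sets) - 1)  # p' = 3
print(sets[-2])       # {(3, 2)}: the B in the middle of "ABC"
```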

  26. Lemma 2, proof. [Figure; labels shown: k = p; ranks k, k-1, k-2, …, 1, 0; sets M0, M1, M2, …, M(k-1), Mk]

  27. Little problem… • We can't keep track of each set: that is very expensive.

  28. [Figure: matching matrices of c b a b b a c a c and a b a c b c b a]

  29. What do we do? Keep only dominant matches… When we see a dominant match below, we are done.

  30. [Figure: matching matrices of c b a b b a c a c and a b a c b c b a]

  31. Let's define: • FCf', BCb': the minimal indices as stated above

  32. Lemma 3 • The length of an LCS is b' + f' - 1.

  33. Complexity. Finding the dominant matches of each contour: O(min(m, n-p)). Number of contours: p. Total: O(min(pm, p(n-p))).

  34. The End

  35. Simple and fast linear space computation of longest common subsequences. Written by: Claus Rick, 1999. Based on an algorithm by: D. Hirschberg, 1975. Cast: Matrices, Lines, Arrows, Squares, Blue, Red, Brown, Grey, Black, String A, String B. Presentation: Uri Scheiner. No dominant matches were harmed during the making of this presentation.

  36. Appendix • What is the LCS • Lemma 1 • Divide and conquer • Define M… • Match • Lemma 2 • Chain • Keep just dominant… • Dominant matches • FC • Lemma 3 • BC • Complexity