The Galil-Giancarlo algorithm. On the exact complexity of string matching: upper bounds, SIAM Journal on Computing, Vol. 21, No. 3, 1992, pp. 407-437. Galil, Z. and Giancarlo, R. Advisor: Prof. R. C. T. Lee. Speaker: S. Y. Tang.
The Galil-Giancarlo algorithm solves the string matching problem.
• String matching problem:
Input: a text string T of length n and a pattern string P of length m.
Output: all occurrences of P in T.
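The input/output contract above can be pinned down with a brute-force reference. This is only a minimal sketch of the problem statement, not the Galil-Giancarlo algorithm. The function name `find_occurrences`, the 0-based indexing, and the choice to return no matches for an empty pattern are assumptions made for illustration.

```python
def find_occurrences(text: str, pattern: str) -> list[int]:
    """Return every 0-based start index at which pattern occurs in text.

    Brute-force reference for the problem statement only; it is NOT the
    Galil-Giancarlo algorithm and takes O(n * m) time in the worst case.
    """
    n, m = len(text), len(pattern)
    # Assumption for this sketch: an empty pattern yields no occurrences.
    if m == 0 or m > n:
        return []
    hits = []
    for j in range(n - m + 1):
        # Compare the window text[j .. j+m-1] against the whole pattern.
        if text[j:j + m] == pattern:
            hits.append(j)
    return hits


if __name__ == "__main__":
    print(find_occurrences("GCAGGAGCAGCA", "GCA"))
```

A reference like this is mainly useful for checking the output of a faster matcher on small inputs, including overlapping occurrences.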
The Galil-Giancarlo algorithm (GG algorithm for short) improves the worst case of the Colussi algorithm.
• The GG algorithm has two phases: preprocessing and searching.
• The preprocessing phase is the same as in the Colussi algorithm.
• In the searching phase, the GG algorithm adds 5 cases that determine how to jump. This is the difference between the GG algorithm and the Colussi algorithm.
The five cases of the GG searching phase. In each figure, k is marked on the text and l on the pattern.
• Case 1: l > k. Shift. (Figure: k = 2, l = 5.)
• Case 2: l = k and p[l+1] ≠ t[j+k]. Shift. (Figure: k = 3, l = 3.)
• Case 3: l < k and p[l+1] ≠ t[j+k]. Shift. (Figure: k = 5, l = 2.)
• Case 4: l = k and p[l+1] = t[j+k]. No shift is needed. (Figure: k = 3, l = 3.)
• Case 5: l < k and p[l+1] = t[j+k]. Shift. (Figure: k = 5, l = 3.)
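The five conditions can be read as a small decision procedure. The sketch below only classifies a situation into one of the cases; it does not compute the shift itself, which the slides show only in figures. The function name `gg_case` and its parameters are hypothetical: `pattern_char` stands for p[l+1] and `text_char` for t[j+k].

```python
def gg_case(l: int, k: int, pattern_char: str, text_char: str) -> int:
    """Classify a situation into GG cases 1 to 5 as listed on the slides.

    pattern_char plays the role of p[l+1], text_char the role of t[j+k].
    Only the case number is returned; the shift amount is not modelled.
    """
    if l > k:
        return 1  # Case 1: l > k, characters are not consulted
    same = pattern_char == text_char
    if l == k:
        # Case 4 (equal characters, no shift needed) or Case 2 (mismatch)
        return 4 if same else 2
    # Remaining possibility: l < k
    return 5 if same else 3


if __name__ == "__main__":
    # The (l, k) pairs are the values shown in the case figures;
    # the characters are arbitrary placeholders.
    print(gg_case(5, 2, "a", "b"))
    print(gg_case(3, 5, "a", "a"))
```

Testing `l > k` first mirrors Case 1, which is the only case whose condition does not involve a character comparison.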
Example (1/7): T and P are aligned; a mismatch occurs and the pattern is shifted with Shift[4] = 4. We first compare the noholes using phase 1 of the Colussi algorithm and shift according to Shift[i].
Example (2/7): T and P are aligned; the comparison is a match.
Example (3/7): T and P are aligned; a mismatch occurs and the pattern is shifted with Shift[0] = 5. After all noholes have matched, we compare the holes using phase 2 of the Colussi algorithm and shift according to Shift[i].
Example (4/7): Here k = 2 on T and l = 3 on P, and the figure carries the annotation overlay < l. We shift using Case 1 of the GG algorithm, because the condition for applying the GG algorithm is satisfied and l > k.
Example (5/7): All noholes match; a mismatch then occurs and the pattern is shifted with Shift[2] = 5. After checking the cases of the GG algorithm, we return to the Colussi algorithm.
Example (6/7): Here k = 2 on T and l = 3 on P. We shift using Case 5 of the GG algorithm, because the condition for applying the GG algorithm is satisfied and l < k.
Example (7/7): T and P match exactly. After checking the cases of the GG algorithm, we return to the Colussi algorithm.
Cases in which the GG algorithm is not used
• Case 1: The pattern has only one period. The entire window is skipped. There is no way to know whether a prefix of the window is equal to a prefix of the pattern.
• Example: T: GCAGCGGGAC, P: GGAGC. The figure shows P before and after the shift (mismatch, then shift).
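• Case 2: A prefix of the pattern is already known to be equal to a prefix of the window.
• Example: T: GGACGGAACGCA, P: GGAGGGA. The figure shows P before and after the shift (mismatch, then shift).
• Example: T: GCAGGAGCAGCA, P: GGAGGAG. The figure shows P before and after the shift (mismatch, then shift).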
Time complexity
• The preprocessing phase takes O(m) time and space.
• The searching phase takes O(n) time.
• The algorithm performs at most (4/3)n text character comparisons in the worst case.
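To put the constant 4/3 in perspective, it can be set against the (3/2)n worst-case figure of the Colussi algorithm. The worked arithmetic below is only an illustration of the two upper bounds, with n = 1200 chosen arbitrarily; these are worst-case bounds, not measured comparison counts.

```latex
\[
\frac{3}{2}n - \frac{4}{3}n = \frac{9n - 8n}{6} = \frac{n}{6},
\qquad
n = 1200:\quad 1800 - 1600 = 200 .
\]
```

In other words, the worst-case bound drops by one sixth of the text length.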
Conclusion
• The Galil-Giancarlo algorithm is very similar to the Colussi algorithm. Colussi's algorithm performs very badly if the pattern starts and ends with a sequence of repetitions of the same symbol. For these patterns, Colussi's algorithm shifts by a single position and (3/2)n comparisons are actually performed. Galil and Giancarlo devised a way to avoid these shifts by a single position.
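One way to spot patterns of the kind described above is to measure the runs of a repeated symbol at both ends of the pattern. The helper below is a hypothetical illustration: the name `edge_run_lengths` is an assumption and it is not part of either algorithm's preprocessing.

```python
def edge_run_lengths(pattern: str) -> tuple[int, int]:
    """Return (leading_run, trailing_run) for the pattern.

    leading_run  = number of consecutive copies of pattern[0] at the start
    trailing_run = number of consecutive copies of pattern[-1] at the end
    Illustrative helper only; not a step of the Colussi or GG algorithm.
    """
    if not pattern:
        return (0, 0)
    lead = 1
    while lead < len(pattern) and pattern[lead] == pattern[0]:
        lead += 1
    trail = 1
    while trail < len(pattern) and pattern[-1 - trail] == pattern[-1]:
        trail += 1
    return (lead, trail)


if __name__ == "__main__":
    # "aaabaa" starts and ends with runs of the same symbol 'a'.
    print(edge_run_lengths("aaabaa"))
    # "GGAGC" is one of the patterns used in the slides.
    print(edge_run_lengths("GGAGC"))
```

A pattern such as "aaabaa", whose first and last symbols coincide and are both repeated, is the shape the conclusion refers to.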
References
• [B92] Breslauer, D., Efficient String Algorithmics, Ph.D. Thesis, Report CU-024-92, Computer Science Department, Columbia University, New York, NY, 1992.
• [GG92] Galil, Z. and Giancarlo, R., On the exact complexity of string matching: upper bounds, SIAM Journal on Computing, Vol. 21, No. 3, 1992, pp. 407-437.