String Matching: Knuth-Morris-Pratt algorithm Heather Takeguchi
What is String Matching? • Used for word find in documents, as well as in spell checkers and internet keyword searches • The basic problem: look for an exact match of a search string in a text • Real search tools are more complicated: searching for ‘string’ also returns ‘String’ as well as ‘stringbean’
How do you match strings? • Finite-state automata • Brute force • Knuth-Morris-Pratt (KMP) • Visualization tool for brute force and KMP: www.dcc.ufmg.br/~cassia/smaa/english/
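To make the difference between the brute-force and KMP approaches listed above concrete, here is a minimal Python sketch of both. The function names (`brute_force_search`, `prefix_function`, `kmp_search`) are illustrative choices, not taken from the slides. Brute force retries every shift from scratch, while KMP uses a precomputed prefix table so that it never moves backwards in the text.

```python
def brute_force_search(text, pattern):
    """Return every index where pattern occurs in text, testing each shift in turn."""
    n, m = len(text), len(pattern)
    return [s for s in range(n - m + 1) if text[s:s + m] == pattern]


def prefix_function(pattern):
    """pi[q] = length of the longest proper prefix of pattern[:q+1]
    that is also a suffix of pattern[:q+1]."""
    pi = [0] * len(pattern)
    k = 0  # number of pattern characters currently matched
    for q in range(1, len(pattern)):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]  # fall back to the next shorter border
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


def kmp_search(text, pattern):
    """Return every index where pattern occurs in text, scanning text once."""
    if not pattern:
        return list(range(len(text) + 1))  # same convention as brute force
    pi = prefix_function(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and pattern[k] != ch:
            k = pi[k - 1]  # reuse what is already known about the match
        if pattern[k] == ch:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = pi[k - 1]  # keep going to find overlapping matches
    return matches
```

Both functions return the same positions. For a text of length n and a pattern of length m, brute force can need on the order of n·m character comparisons in the worst case, while KMP needs on the order of n + m.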
Virus Detection • Detecting a virus often comes down to searching for a pattern string in a larger text. • Viral signature matching (searching for the contagious code segment) • Code enumeration (comparing against an old, known copy of the file) • Checksum methods (e.g. checking the size of the file)
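As a rough illustration of the first and third detection methods above, the sketch below scans a byte string for known signatures and computes a simple fingerprint of a file's contents. Everything here is hypothetical: the signature names and byte patterns are invented for the example, and the use of SHA-256 as the checksum is an assumption, since the slides do not name a specific checksum.

```python
import hashlib

# Hypothetical signature table: names and byte patterns are made up for illustration.
SIGNATURES = {
    "demo-virus-a": b"\xde\xad\xbe\xef",
    "demo-virus-b": b"EVIL_PAYLOAD",
}


def scan_bytes(data, signatures=SIGNATURES):
    """Return {signature name: first offset} for every signature found in data."""
    hits = {}
    for name, sig in signatures.items():
        offset = data.find(sig)  # bytes.find returns -1 when the pattern is absent
        if offset != -1:
            hits[name] = offset
    return hits


def fingerprint(data):
    """Return (size, SHA-256 hex digest); a change in either suggests the file was modified."""
    return len(data), hashlib.sha256(data).hexdigest()
```

A file could be flagged when `scan_bytes` returns a non-empty result, or when its `fingerprint` no longer matches the one recorded for a known copy.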
Variation-tolerant matching • Fast substring matching • Approximate string matching • Voice recognition • DNA sequencing
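One common way to make matching tolerant of variation is to allow a bounded number of edits (insertions, deletions, substitutions) between the pattern and the text. The sketch below is one possible dynamic-programming formulation of approximate string matching; it is not necessarily the method the slides had in mind, and the name `approximate_find` and its parameters are illustrative.

```python
def approximate_find(text, pattern, max_errors):
    """Return the end positions j (1-based, i.e. text[:j]) where some substring
    of text ending at j is within max_errors edits of pattern."""
    m = len(pattern)
    # column[i] = smallest edit distance between pattern[:i] and any
    # substring of text ending at the current text position.
    column = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = column[0]  # column[i - 1] as it was at the previous position
        column[0] = 0          # a match may start anywhere in the text
        for i in range(1, m + 1):
            old = column[i]
            cost = 0 if pattern[i - 1] == ch else 1
            column[i] = min(
                prev_diag + cost,   # match or substitute
                old + 1,            # extra character in the text
                column[i - 1] + 1,  # pattern character missing from the text
            )
            prev_diag = old
        if column[m] <= max_errors:
            ends.append(j)
    return ends
```

With `max_errors=0` this reduces to exact matching; raising the limit lets a misspelled pattern such as "strng" still find "string" in the text.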
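Summary • Exact string matching is well suited to tools such as grep and sed • String matching is used in word find and in internet keyword searches • KMP beats brute force in the worst case (O(n + m) versus O(nm) for a text of length n and a pattern of length m), though on typical text the gain is often only slight • Approximate string matching and fast substring matching extend string matching to a wider range of practical applications.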
Acknowledgements • Virus detection: www.cse.uta.edu/~holder/courses/cse5311/lectures/applets/je/a24.html • Speech recognition: www.kom.e-technik.tu-darmstadt.de/pr/workshop/chair/ACMMM98/electronic_proceedings/robertson/ • Approximate string matching: http://www-igm.univ-mlv.fr/~lecroq/seqcomp/node3.html • Cormen, chapter 34