Plagiarism detection Yesha Gupta
String Matching Algorithms: • KMP • LCSS • Rabin-Karp fingerprints (an algorithm of choice for multiple pattern search)
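For a sense of what LCSS involves, here is a minimal sketch. It assumes that LCSS stands for longest common subsequence, which the slides do not spell out, and the function name `lcs_length` is hypothetical. The dynamic-programming table costs time proportional to the product of the two input lengths, whereas the other two algorithms scan the text roughly once.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (assumed meaning of LCSS)."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence that ends before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop one character from either string and keep the better result.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]
```

A plagiarism check could compare `lcs_length(line, candidate)` against the line length to get a similarity ratio; that scoring choice is also an assumption, not something the slides describe.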
Testing text file information: • 21 lines • Each line (treated as a pattern) has a different length • Maximum line size: 370 • Minimum line size: 85
LCSS performed very slowly. Rabin-Karp performed better than KMP. Why? Efficient use of hashing techniques.
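The hashing idea can be sketched as follows. This is an illustrative multi-pattern Rabin-Karp, not the code used in these tests: the function name `rabin_karp_multi` and the `base` and `mod` values are assumptions. Each pattern is fingerprinted once, and a rolling hash over the text is looked up in that table, so all patterns are checked in a single pass.

```python
def rabin_karp_multi(text, patterns, base=256, mod=1_000_000_007):
    """Return (index, pattern) matches; every pattern must have the same length."""
    if not patterns:
        return []
    m = len(patterns[0])
    if any(len(p) != m for p in patterns):
        raise ValueError("all patterns must have the same length")
    if m == 0 or m > len(text):
        return []

    def fingerprint(s):
        h = 0
        for ch in s:
            h = (h * base + ord(ch)) % mod
        return h

    # Map each fingerprint to the set of patterns that produce it.
    table = {}
    for p in patterns:
        table.setdefault(fingerprint(p), set()).add(p)

    high = pow(base, m - 1, mod)  # weight of the character leaving the window
    h = fingerprint(text[:m])
    matches = []
    for i in range(len(text) - m + 1):
        if h in table:
            window = text[i:i + m]
            if window in table[h]:  # verify the text to rule out hash collisions
                matches.append((i, window))
        if i + m < len(text):
            # Roll the hash: remove text[i], shift, append text[i + m].
            h = ((h - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return matches
```

Because a single window length `m` drives the rolling hash, this sketch rejects pattern sets of mixed length instead of silently mishandling them.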
KMP generated optimum output; Rabin-Karp did not. Why? Rabin-Karp works with fixed-length patterns in a text, and these patterns differ in length.
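KMP has no such restriction, because each pattern gets its own failure table and is searched independently. A minimal sketch, with hypothetical function names (`build_failure`, `kmp_search`, `kmp_multi`) that do not come from the slides:

```python
def build_failure(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1] that is also its suffix."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return every start index at which pattern occurs in text."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    hits = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]  # fall back without re-reading the text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return hits


def kmp_multi(text, patterns):
    """Search each pattern separately; pattern lengths may differ."""
    return {p: kmp_search(text, p) for p in patterns}
```

The trade-off is one pass over the text per pattern, so 21 patterns mean 21 scans.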
Testing text file information: • 21 lines • Each line (treated as a pattern) has the same length
The results of Rabin-Karp and KMP are the same. Why? Each pattern has the same length.