Modern Information Retrieval Chapter 8 Indexing and Searching
Sequential searching
• Brute-force approach
  [Figure: brute-force comparison, pattern a b a c]
• Knuth-Morris-Pratt approach
  • Left-to-right scan
  • Shifting rule
  [Figure: text a b a b a b a c, with pattern a b a c shown at successive shifts]
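The two approaches on this slide can be sketched as follows. This is a minimal illustration, not the chapter's own code: the function names (`brute_force_search`, `failure_table`, `kmp_search`) are hypothetical, positions are 0-based, and the pattern is assumed to be non-empty. The KMP shifting rule is expressed here through the usual failure (border) table.

```python
def brute_force_search(text, pattern):
    """Try every alignment and compare the whole pattern: O(nm) worst case."""
    n, m = len(text), len(pattern)
    return [s for s in range(n - m + 1) if text[s:s + m] == pattern]


def failure_table(pattern):
    """fail[i] = length of the longest proper border of pattern[:i + 1]."""
    fail = [0] * len(pattern)
    k = 0  # length of the border currently being extended
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Left-to-right scan; on a mismatch, shift using the failure table
    so that no text character is ever re-read."""
    fail = failure_table(pattern)
    hits = []
    k = 0  # number of pattern characters currently matched
    for i, c in enumerate(text):
        while k > 0 and c != pattern[k]:
            k = fail[k - 1]          # shift the pattern, keep the border
        if c == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)   # start position of the occurrence
            k = fail[k - 1]
    return hits


print(kmp_search("abababac", "abac"))  # text and pattern from the slide
```

On the slide's example, both functions report the single occurrence of `abac` starting at position 4 of `abababac`.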
Boyer-Moore approach
• Right-to-left scan
• Bad character shift rule
• Good suffix shift rule
• Sub-linear expected time method
• Typically examines fewer than m+n characters
Right-to-left scan
• Shift one place when a mismatch occurs
• O(nm)
  T: xpbctbxabpqx
  P: tpabxab
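A small sketch of the right-to-left scan with the naive one-place shift, counting character comparisons to make the O(nm) behaviour visible. The function name `right_to_left_naive` and the comparison counter are illustrative assumptions; positions are 0-based.

```python
def right_to_left_naive(text, pattern):
    """Compare the pattern right to left at each alignment and always
    shift by one place. Returns (occurrence starts, comparison count)."""
    n, m = len(text), len(pattern)
    hits = []
    comparisons = 0
    for s in range(n - m + 1):
        j = m - 1                      # start at the right end of the pattern
        while j >= 0:
            comparisons += 1
            if pattern[j] != text[s + j]:
                break                  # mismatch: give up on this alignment
            j -= 1
        if j < 0:
            hits.append(s)             # every character matched
    return hits, comparisons


print(right_to_left_naive("xpbctbxabpqx", "tpabxab"))
print(right_to_left_naive("aaaa", "aa"))
```

The second call shows the worst case: every alignment compares the full pattern, so the count grows with the product of the two lengths.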
Bad character rule
• R(x): right-most position in P of each character x
• On a mismatch between P(i) = x and T(k) = y, look up R(T(k)) = R(y)
• R(y) < i: shift i-R(y) positions
• R(y) > i: shift 1 position
• R(y) = 0 (y does not occur in P): shift n-i+1 positions
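The rule can be sketched in code as below. This sketch uses the common textbook formulation, shift = max(1, i - R(y)), with 1-based pattern positions and R(y) = 0 for characters that do not occur in P. Folding the slide's separate R(y) = 0 case into that single formula is an assumption about the intended convention, and the function names are hypothetical.

```python
def rightmost_positions(pattern):
    """R(x): 1-based right-most position of each character x in the pattern.
    Characters that are absent are treated as R(x) = 0 by the caller."""
    return {c: i + 1 for i, c in enumerate(pattern)}


def bad_character_search(text, pattern):
    """Right-to-left scan using only the bad character rule."""
    n, m = len(text), len(pattern)
    R = rightmost_positions(pattern)
    hits = []
    start = 0                              # 0-based start of the alignment
    while start + m <= n:
        i = m                              # 1-based position in the pattern
        while i >= 1 and pattern[i - 1] == text[start + i - 1]:
            i -= 1
        if i == 0:
            hits.append(start)
            start += 1                     # simplest safe shift after a match
        else:
            y = text[start + i - 1]        # the mismatching text character
            start += max(1, i - R.get(y, 0))
    return hits


print(bad_character_search("xpbctbxabpqxtpabxab", "tpabxab"))
```

The example text extends the slide's string with one copy of the pattern so that there is an occurrence to find; that extension is not from the slides.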
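The strong good suffix rule
[Figure: T contains x followed by t; P contains z t’ and ends in y t; after the shift, t’ in P is aligned with t in T]
The strong good suffix rule
[Figure: labels x, t in T; y, t in P at successive shifts]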
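As a rough illustration of the strong good suffix rule, the sketch below computes each shift by brute force: for a matched suffix t, it takes the smallest shift under which the shifted pattern agrees with t and, where a character lands on the mismatch position, that character differs from the one that just failed. This is deliberately not the linear-time preprocessing used in practice; the function names are hypothetical, positions are 0-based, and a mismatch on the last pattern character is assumed to shift by one place.

```python
def good_suffix_shifts(pattern):
    """shifts[k] = shift to use when pattern[k:] matched and pattern[k - 1]
    mismatched (k == 0 means the whole pattern matched).
    Brute force, roughly O(m^3); written for clarity only."""
    m = len(pattern)
    shifts = [1] * (m + 1)                 # k == m: nothing matched, shift 1
    for k in range(m):
        for s in range(1, m + 1):
            # Shifted pattern must agree with the matched suffix t.
            suffix_ok = all(pattern[q - s] == pattern[q]
                            for q in range(max(k, s), m))
            # Strong condition: a different character must land on the
            # mismatch position (or none at all).
            strong_ok = k - 1 - s < 0 or pattern[k - 1 - s] != pattern[k - 1]
            if suffix_ok and strong_ok:
                shifts[k] = s
                break                      # s == m always succeeds
    return shifts


def search_good_suffix(text, pattern):
    """Right-to-left scan using only the strong good suffix shifts."""
    n, m = len(text), len(pattern)
    shifts = good_suffix_shifts(pattern)
    hits = []
    start = 0
    while start + m <= n:
        j = m - 1
        while j >= 0 and pattern[j] == text[start + j]:
            j -= 1
        if j < 0:
            hits.append(start)
        start += shifts[j + 1]             # j == -1 selects the full-match shift
    return hits


print(good_suffix_shifts("abab"))
print(search_good_suffix("abababab", "abab"))
```

A full Boyer-Moore search would take the larger of this shift and the bad character shift at each mismatch.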
Shift-Or approach
An example of the shift-or algorithm for p = aab and s = abcaaab

Table T (rows: characters of p; columns: text characters; 0 marks a match)
      a  b  c
  a   0  1  1
  a   0  1  1
  b   1  0  1

Steps (bit vectors read top to bottom; E is initially 111)
  text   S(E)   T[text]   E = S(E) OR T[text]
  a      011    001       011
  b      001    110       111
  c      011    111       111
  a      011    001       011
  a      001    001       001
  a      000    001       001
  b      000    110       110
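The example above can be reproduced with a short bit-parallel sketch. The function name `shift_or` is hypothetical, the pattern is assumed to be non-empty, and pattern position 1 is stored in the least significant bit, so the slide's downward shift S(E) becomes a left shift of the integer state.

```python
def shift_or(text, pattern):
    """Shift-Or search: a 0 bit at position i of the state means that
    pattern[:i + 1] matches the text ending at the current character."""
    m = len(pattern)
    all_ones = (1 << m) - 1

    # masks[c] has a 0 bit wherever the pattern holds character c.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, all_ones) & ~(1 << i)

    state = all_ones                       # E starts with every bit set
    hits = []
    for end, c in enumerate(text):
        # S(E) OR T[c]; characters outside the pattern use an all-ones mask.
        state = ((state << 1) | masks.get(c, all_ones)) & all_ones
        if state & (1 << (m - 1)) == 0:    # last pattern position matched
            hits.append(end - m + 1)
    return hits


print(shift_or("abcaaab", "aab"))
```

For p = aab and s = abcaaab the state clears its highest bit only after the final b, which corresponds to the occurrence starting at 0-based position 4.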