Parameterized Matching. Amir, Farach, Muthukrishnan. Orgad Keller, modified by Ariel Rosenfeld.
Parametrized Match Relation • Definition: Two strings over the alphabet , parametrized match (p-match) if the following 3 conditions apply: Orgad Keller - Algorithms 2 - Recitation 9
Conditions
Example • We can see it as a bijection:
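The bijection view can be sketched in code. This is a minimal sketch, assuming the usual reading of the three conditions: the alphabet splits into fixed symbols and parameter symbols, fixed symbols must line up and be equal, and parameter symbols must correspond one-to-one. The function name `p_match` and the argument `fixed` (the set of fixed symbols) are hypothetical names chosen for illustration.

```python
def p_match(s, t, fixed):
    """Return True if s and t p-match, given the set of fixed (non-parameter) symbols.

    Assumed conditions:
      1. a position holds a fixed symbol in s iff it holds one in t
      2. fixed symbols are equal position by position
      3. parameter symbols correspond under a single bijection
    """
    if len(s) != len(t):
        return False
    forward, backward = {}, {}  # the bijection and its inverse
    for a, b in zip(s, t):
        if (a in fixed) != (b in fixed):      # condition 1
            return False
        if a in fixed:
            if a != b:                        # condition 2
                return False
        else:
            # condition 3: a must always map to b, and b must always come from a
            if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
                return False
    return True
```

For instance, with fixed symbols {a, b}, the strings "axbyx" and "aybxy" p-match through the bijection x→y, y→x.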
Parametrized Matching • Input: a text and a pattern. • Output: All locations in the text where the pattern p-matches.
Observation • Given we’ll define : In linear time…
Observation • Now is over and is over and . • We get the algorithm for p-match: • Create • Find all the places appears in (using KMP) (cond. 1+2) • Find all the places m-matches in (we’ll show how later) (cond. 3) • Return
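One way to picture the exact-matching step (cond. 1+2) is the sketch below. It assumes the transformation replaces every parameter symbol by a single placeholder, so that an exact occurrence means the fixed symbols agree and the parameter positions line up. The placeholder `MASK` and the helper names are hypothetical, and Python's built-in `str.find` stands in for KMP here.

```python
MASK = "\0"  # hypothetical placeholder, assumed not to occur in the input

def mask_parameters(s, fixed):
    """Replace every non-fixed (parameter) symbol by the placeholder."""
    return "".join(ch if ch in fixed else MASK for ch in s)

def exact_occurrences(text, pattern):
    """All start indexes of pattern in text (str.find used in place of KMP)."""
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits

def candidate_locations(text, pattern, fixed):
    """Locations that pass the assumed conditions 1 and 2."""
    return exact_occurrences(mask_parameters(text, fixed),
                             mask_parameters(pattern, fixed))
```

The locations returned are only candidates: condition 3 still has to be checked on the parameter symbols.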
Exercise • Why is that enough? • In other words: Prove there is a p-match at location iff . (HW) • We are left with the question: How do we solve step 3 efficiently?
M-match
When is the last occurrence? • We’ll build an array : • So, if , we know hasn’t appeared before. Otherwise, we’ll know exactly where it last appeared. • Can we do this efficiently?
Building the Array • We’ll hold a balanced binary search tree for the symbols of the alphabet. Initially it will be empty. • We’ll go over the pattern. For each symbol, if it isn’t in the tree, we’ll add it with its index and update . Otherwise, we know exactly where it last appeared, so we’ll update and then update the symbol in the tree with the new index. • Time: where .
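The single pass described above might look like the following sketch. Two assumptions are made: a Python `dict` is swapped in for the balanced binary search tree (expected constant-time lookups instead of logarithmic ones), and `-1` is used as the "has not appeared before" value, which is a choice of this sketch. The name `last_occurrence_array` is hypothetical.

```python
def last_occurrence_array(pattern):
    """For each index i, the index of the previous occurrence of pattern[i], or -1."""
    last_seen = {}  # symbol -> most recent index (dict in place of the balanced BST)
    result = []
    for i, symbol in enumerate(pattern):
        result.append(last_seen.get(symbol, -1))  # -1: symbol not seen before (assumed sentinel)
        last_seen[symbol] = i                     # record the new most recent index
    return result
```

For "abacb" this gives [-1, -1, 0, -1, 1]: the second "a" points back to index 0 and the second "b" to index 1.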
The Matching Itself • We move forward if either • and . • We’ll hold and update a balanced BST as we go over the text as well. • Time: • So overall algorithm time is • Can we improve this further?
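A KMP-style scan in this spirit can be sketched as follows. It assumes that an m-match means the two strings agree up to a bijection of symbols (all symbols treated as parameters). It compares distances to the previous occurrence rather than absolute indexes, and it uses a `dict` in place of the balanced BST; both are choices of this sketch. A text symbol whose previous occurrence lies outside the current window is treated as new. All names are hypothetical.

```python
def prev_distances(s):
    """Distance from each position to the previous occurrence of its symbol (0 if none)."""
    last_seen, out = {}, []
    for i, symbol in enumerate(s):
        out.append(i - last_seen[symbol] if symbol in last_seen else 0)
        last_seen[symbol] = i
    return out

def clip(distance, window):
    """A previous occurrence farther back than the window counts as 'not seen'."""
    return distance if distance <= window else 0

def m_match_positions(text, pattern):
    m = len(pattern)
    if m == 0:
        return []
    pp, tp = prev_distances(pattern), prev_distances(text)

    # fail[q]: longest proper prefix of pattern[:q] that m-matches a suffix of it
    fail = [0] * (m + 1)
    k = 0
    for q in range(1, m):
        while k > 0 and clip(pp[q], k) != pp[k]:
            k = fail[k]
        if clip(pp[q], k) == pp[k]:
            k += 1
        fail[q + 1] = k

    hits, k = [], 0
    for j in range(len(text)):
        # fall back after a full match, or while the next symbol does not fit
        while k > 0 and (k == m or clip(tp[j], k) != pp[k]):
            k = fail[k]
        if clip(tp[j], k) == pp[k]:
            k += 1  # move forward
        if k == m:
            hits.append(j - m + 1)
    return hits
```

For example, `m_match_positions("abacdc", "xyx")` reports positions 0 and 3, where "aba" and "cdc" occur.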
The Trick • We’ll split the text into overlapping segments of size like this: • So every match in the text must lie entirely within one of the segments. • We’ll run the algorithm for each such segment. Time: where . • Overall for all segments:
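The segmentation can be sketched as below. The segment length (twice the pattern length) and the step between segment starts (one pattern length) are assumptions of this sketch; with them, every window of pattern length falls entirely inside some segment, and each segment holds at most 2m distinct symbols. A naive bijection check stands in for the per-segment algorithm to keep the example short. All names are hypothetical.

```python
def m_matches(a, b):
    """Naive check that two strings agree up to a bijection of symbols."""
    if len(a) != len(b):
        return False
    forward, backward = {}, {}
    for x, y in zip(a, b):
        if forward.setdefault(x, y) != y or backward.setdefault(y, x) != x:
            return False
    return True

def segment_bounds(n, m):
    """Yield (start, end) for segments of length at most 2*m, one starting every m symbols."""
    for start in range(0, n - m + 1, m):
        yield start, min(start + 2 * m, n)

def find_in_segments(text, pattern):
    m = len(pattern)
    if m == 0:
        return []
    found = []
    for start, end in segment_bounds(len(text), m):
        segment = text[start:end]
        # only local starts 0..m-1 are reported, so no location is reported twice
        for i in range(min(m, len(segment) - m + 1)):
            if m_matches(segment[i:i + m], pattern):
                found.append(start + i)
    return found
```

Restricting each segment to its first m start positions is what keeps the overlapping halves from producing duplicate reports.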