Parametrized Matching
Amir, Farach, Muthukrishnan
Orgad Keller
Parametrized Match Relation
• Definition: Two strings over the alphabet […] parametrized match (p-match) if the following 3 conditions apply:
Orgad Keller - Algorithms 2 - Recitation 9
Conditions
Example
• We can see it as a bijection:
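As a sketch, assuming the usual definition from the parameterized-matching literature (fixed symbols must agree position by position, parameters must sit opposite parameters, and the parameters must be related by a one-to-one renaming), the relation can be checked directly. The function name `p_match` and the `fixed` argument (the set of fixed symbols) are illustrative assumptions, not notation from the slides.

```python
def p_match(s, t, fixed):
    """Return True if s and t p-match, assuming the usual three conditions.

    `fixed` is the set of fixed (non-parameter) symbols; every other
    symbol is treated as a parameter. All names here are illustrative.
    """
    if len(s) != len(t):
        return False
    forward, backward = {}, {}  # the renaming and its inverse
    for a, b in zip(s, t):
        if a in fixed or b in fixed:
            # Fixed symbols must match exactly, position by position.
            if a != b:
                return False
            continue
        # Parameters: the renaming must be consistent and one-to-one.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


print(p_match("axbxa", "bxaxb", fixed={"x"}))  # renaming a->b, b->a
print(p_match("ab", "cc", fixed=set()))        # not one-to-one
```

Keeping both `forward` and `backward` is what enforces the bijection: the first dictionary alone would accept "ab" against "cc".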
Parametrized Matching
• Input: a text and a pattern.
• Output: All locations in the text where the pattern p-matches.
Observation
• We can reduce the problem to the same problem with […] (m-match).
• Given […], we’ll define […]:
Observation
• Now […] is over […] and […] is over […] and […].
• We get the algorithm for p-match:
  1. Create […]
  2. Find all the places […] appears in […] (using KMP)
  3. Find all the places […] m-matches in […] (We’ll show later how)
  4. Return […]
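The reduction can be sketched as follows, under the assumption that the two derived strings are obtained by replacing every parameter with one placeholder (for the exact-matching step) and every fixed symbol with another placeholder (for the m-matching step). The placeholders `PARAM` and `FIXED`, the helper names, and the naive quadratic window scans (standing in for KMP and for the m-match algorithm developed later) are illustrative choices, not the slides' notation.

```python
PARAM = object()  # hypothetical placeholder standing in for "some parameter"
FIXED = object()  # hypothetical placeholder standing in for "some fixed symbol"


def project(s, fixed):
    """Split s into a fixed-symbol view and a parameter view."""
    s_fixed = [c if c in fixed else PARAM for c in s]
    s_param = [FIXED if c in fixed else c for c in s]
    return s_fixed, s_param


def m_match(u, v):
    """True if equal-length u and v are related by a one-to-one renaming."""
    fwd, bwd = {}, {}
    return len(u) == len(v) and all(
        fwd.setdefault(a, b) == b and bwd.setdefault(b, a) == a
        for a, b in zip(u, v)
    )


def p_match_positions(text, pattern, fixed):
    """The four steps, with naive scans in place of linear-time matchers."""
    t_fixed, t_param = project(text, fixed)       # create the derived strings
    p_fixed, p_param = project(pattern, fixed)
    m = len(pattern)
    starts = range(len(text) - m + 1)
    # Exact occurrences of the fixed-symbol view.
    exact = {i for i in starts if t_fixed[i:i + m] == p_fixed}
    # m-matches of the parameter view.
    mapped = {i for i in starts if m_match(p_param, t_param[i:i + m])}
    # Only positions reported by both steps are returned.
    return sorted(exact & mapped)


print(p_match_positions("axbxaxa", "uxu", fixed={"x"}))
```

In this example the window "xbx" passes the m-match step on its own (the placeholder is renamed like any other symbol) and is rejected only by the exact-matching step, which is why both sets are needed.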
Exercise
• Why is that enough?
• In other words: Prove there is a p-match at location […] iff […].
• We are left with the question: How do we solve step 3 efficiently?
M-match
• Is m-match transitive?
• We can use a KMP-like automaton method.
• For each index in the pattern, we want to find the longest suffix that m-matches the prefix.
• For instance: […]
Failure Links
• Where to link the failure link from […]?
• In KMP it is simple: If […] then link to […]. Otherwise go back again and repeat.
• In our case:
• If […] never appeared before, i.e. […]: We link if […].
• Otherwise, we link if […] such that […], it holds that […].
Failure Links
• Can we do this efficiently?
• We’ll build an array […]:
• So, if […], we know […] hasn’t appeared before. Otherwise, we’ll know exactly where it had appeared last.
Building the Array
• We’ll hold a balanced binary search tree for the symbols of the alphabet. Initially it will be empty.
• We’ll go over the pattern. For each symbol, if it isn’t in the tree, we’ll add it with its index and update […]. Otherwise, we know exactly where it had last appeared, so we’ll update […] and then update the symbol in the tree with the new index.
• Time: […] where […].
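A minimal sketch of this construction, with two labelled departures from the slide: a Python `dict` stands in for the balanced binary search tree (so lookups take expected constant time rather than logarithmic time), and a symbol with no earlier occurrence is recorded as `None`, which is a convention chosen for this sketch. The name `last_occurrence_array` is hypothetical.

```python
def last_occurrence_array(pattern):
    """For each index i, the index of the previous occurrence of pattern[i].

    Entries are None where the symbol has not appeared before (a convention
    chosen for this sketch).
    """
    last_seen = {}  # symbol -> most recent index; plays the role of the tree
    result = []
    for i, symbol in enumerate(pattern):
        result.append(last_seen.get(symbol))  # None if the symbol is new
        last_seen[symbol] = i                 # insert or update the symbol
    return result


print(last_occurrence_array("abab"))  # [None, None, 0, 1]
```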
The Matching Itself
• We go forward in the automaton if either […]
• […] and […].
• We’ll hold and update a balanced BST as we go over the text as well.
• Time: […]
• So overall algorithm time is […]
• Can we improve this further?
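One way to realize the KMP-like matcher is sketched below. It is an illustration under stated assumptions rather than the slides' exact formulation: each symbol is encoded by the distance to its previous occurrence (0 if none), a `dict` replaces the balanced BST, and the names `prev_distance`, `extends`, and `m_match_positions` are hypothetical. A distance that reaches back past the start of the current window is treated as "never appeared before".

```python
def prev_distance(s):
    """d[i] = i minus the index of the previous occurrence of s[i], or 0."""
    last, d = {}, []
    for i, c in enumerate(s):
        d.append(i - last[c] if c in last else 0)
        last[c] = i
    return d


def m_match_positions(text, pattern):
    """Start indices where pattern m-matches text (empty pattern: no output)."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    dp, dt = prev_distance(pattern), prev_distance(text)

    def extends(q, d):
        # Can a match of length q be extended by a symbol whose previous
        # occurrence lies d positions back? Distances larger than q point
        # outside the window, so the symbol counts as new there.
        return dp[q] == (d if d <= q else 0)

    # fail[q]: longest proper prefix of the pattern that m-matches a
    # suffix of pattern[:q]; built by matching the pattern against itself.
    fail = [0] * (m + 1)
    k = 0
    for i in range(1, m):
        while k > 0 and not extends(k, dp[i]):
            k = fail[k]
        if extends(k, dp[i]):
            k += 1
        fail[i + 1] = k

    found, q = [], 0
    for j in range(len(text)):
        while q > 0 and not extends(q, dt[j]):
            q = fail[q]          # follow failure links
        if extends(q, dt[j]):
            q += 1               # go forward in the automaton
        if q == m:
            found.append(j - m + 1)
            q = fail[q]
    return found


print(m_match_positions("xxyyz", "aab"))  # [0, 2]
```

Following failure links is sound here because m-match is transitive: a text window that m-matches a pattern prefix inherits every m-match between a suffix of that prefix and a shorter prefix.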
The Trick
• We’ll split the text into overlapping segments of size […] like this: […]
• So every match in the text must appear in whole in one of the segments.
• We’ll run the algorithm for each such segment. Time: […] where […].
• Overall for all segments: […]
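The segmentation can be sketched as below, assuming segments of length 2m (m being the pattern length) that start every m symbols; that size is an assumption of this sketch. The function names are hypothetical, and a naive quadratic m-matcher is used as the per-segment algorithm only to keep the example self-contained.

```python
def m_match(u, v):
    """True if equal-length u and v are related by a one-to-one renaming."""
    fwd, bwd = {}, {}
    return len(u) == len(v) and all(
        fwd.setdefault(a, b) == b and bwd.setdefault(b, a) == a
        for a, b in zip(u, v)
    )


def naive_positions(text, pattern):
    """Reference matcher: test every window of the text."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if m_match(pattern, text[i:i + m])]


def positions_by_segments(text, pattern, find=naive_positions):
    """Run `find` on overlapping segments of length 2m, one every m symbols."""
    m = len(pattern)
    if m == 0:
        return []
    found = []
    for start in range(0, len(text), m):
        segment = text[start:start + 2 * m]
        for local in find(segment, pattern):
            # A window starting at offset >= m also lies entirely inside
            # the next segment, so it is reported there, not here.
            if local < m:
                found.append(start + local)
    return found


text, pattern = "xyxyzyzzw", "abab"
print(positions_by_segments(text, pattern) == naive_positions(text, pattern))
```

With this choice every window of length m lies wholly inside the segment that starts at the largest multiple of m not exceeding the window's start, and each segment contains at most 2m distinct symbols, which bounds the size of any per-segment search structure.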