A Regular Expression Matching Algorithm Using Transition Merging. Author: Jiekun Zhang, Dafang Zhang, Kun Huang Publisher: IEEE Pacific Rim International Symposium on Dependable Computing (PRDC 2009) Presenter: Sih-An Pan Date: 2014/6/18.
A Regular Expression Matching Algorithm Using Transition Merging. Authors: Jiekun Zhang, Dafang Zhang, Kun Huang. Publisher: IEEE Pacific Rim International Symposium on Dependable Computing (PRDC 2009). Presenter: Sih-An Pan. Date: 2014/6/18. Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C.
Introduction • The authors in [1] propose State Merging DFA (SM-DFA), a method that reduces the DFA memory requirement while still providing worst-case speed guarantees. • SM-DFA yields large memory reductions, but it only reduces the number of states, and it adds auxiliary label information to the transitions to do so. • The transitions themselves are not reduced, which increases the memory requirement.
STATE MERGING DFA • The transition [g-i]/0, j/1 indicates that the same next state, in this case state 5, is reached from state 3_4 upon receiving input characters g, h, i with label 0 or input character j with label 1.
STATE MERGING DFA • The transition a.0/0,1 from state 3_4 to state 1_2 means: • The transition carries a label 0 that tells its destination state 1_2 that the transition is meant for the underlying original state 1. • The transition is taken when its source state 3_4 receives input character a with label 0 or 1.
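The labelled transitions on the two slides above can be pictured as a lookup table. The following is a minimal sketch, not the authors' data structure: the dictionary layout, the helper names `add` and `step`, and the use of `None` as the outgoing label for the unmerged state 5 are all assumptions made for illustration.

```python
# Hypothetical encoding of SM-DFA labelled transitions (illustration only).
# Key:   (source state, input character, incoming label)
# Value: (next state, outgoing label carried to the next state)
transitions = {}


def add(src, chars, in_labels, dst, out_label):
    """Register one labelled transition for every (char, label) pair."""
    for ch in chars:
        for label in in_labels:
            transitions[(src, ch, label)] = (dst, out_label)


# [g-i]/0, j/1 : state 3_4 goes to state 5 on g, h, i with label 0,
# or on j with label 1. State 5 is not merged, so no outgoing label
# is assumed here (None).
add("3_4", "ghi", [0], "5", None)
add("3_4", "j", [1], "5", None)

# a.0/0,1 : on input a with incoming label 0 or 1, state 3_4 goes to
# state 1_2 and carries label 0 (meaning original state 1).
add("3_4", "a", [0, 1], "1_2", 0)


def step(state, label, ch):
    """Return (next state, outgoing label), or None if no transition applies."""
    return transitions.get((state, ch, label))
```

For example, `step("3_4", 1, "a")` yields `("1_2", 0)`, while `step("3_4", 1, "g")` yields `None`, because g is only accepted together with label 0. The sketch also makes the slide's point visible: every transition stores extra label fields, so merging states alone does not shrink the transition table.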
TRANSITION MERGING DFA • String: "acgacik" • Path: 0 -> 1_2 -> 3_4 -> 5 -> 1_2 -> 3_4 -> 5 -> 6
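The trace on this slide can be replayed with a small table. This is a sketch under stated assumptions: the edges below are reconstructed only from the path shown for "acgacik", the merged states are written 1_2 and 3_4 as on the earlier slides, and the transition labels are omitted. The names `EDGES` and `trace` are hypothetical, and the table is not the full automaton from the paper.

```python
# Edges inferred from the slide's trace only (labels omitted; illustration).
EDGES = {
    ("0", "a"): "1_2",
    ("1_2", "c"): "3_4",
    ("3_4", "g"): "5",
    ("3_4", "i"): "5",
    ("5", "a"): "1_2",
    ("5", "k"): "6",
}


def trace(text, start="0"):
    """Follow one edge per input character and return the visited states."""
    path = [start]
    state = start
    for ch in text:
        state = EDGES[(state, ch)]  # KeyError means the trace has no such edge
        path.append(state)
    return path


print(" -> ".join(trace("acgacik")))
# prints: 0 -> 1_2 -> 3_4 -> 5 -> 1_2 -> 3_4 -> 5 -> 6
```

Each of the seven input characters consumes exactly one transition, which is the one-transition-per-character behavior the worst-case speed guarantee relies on.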
EXPERIMENTAL RESULTS • The TM-DFA matching algorithm preserves pattern-matching speed while reducing memory consumption by 30% compared to SM-DFA and by 42% compared to the original DFA.
EXPERIMENTAL RESULTS • TM-DFA continues to reduce memory consumption, and Fig. 8 and Fig. 9 show that the proposed TM-DFA scheme is outperformed by the SM-DFA scheme as the rule length becomes larger.