High-Speed Regular Expression Matching Engine Using Multi-Character NFA. Authors: Norio Yamagaki, Reetinder Sidhu, Satoshi Kamiya. Publisher: International Conference on Field Programmable Logic and Applications 2008 (FPL 2008). Presenter: Chen-Rong Chang. Date: April 15, 2009.
Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C.
Outline • Proposed Method • Phase 1: Regular Expression to 1-Character NFA • Phase 2: 1-character to Multi-character NFA • Performance Comparison
Proposed Method • The NFA construction task consists of two sub-tasks, described in the following sections: • Phase 1: • Conversion of a regex in post-order form into an NFA graph that processes a single character every clock cycle. • Phase 2: • Conversion of the above 1-character NFA graph into an NFA graph that can process 2^k characters (for a desired natural number k) every clock cycle.
Phase 1: Regular Expression to 1-Character NFA(2/3) Regex : “a(bc)*(d|e)” Post-order : “abc·*·de|·”
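As a rough illustration of Phase 1, the sketch below first rewrites the regex into post-order form and then builds an ε-free 1-character NFA from it by evaluating the post-order string with a stack (a position-based, Glushkov-style construction). This is an assumption for illustration, not necessarily the construction used in the paper. The names `add_concat`, `to_postfix` and `postfix_to_nfa` are hypothetical, and only literals, parentheses, `|` and `*` are handled.

```python
CONCAT = "·"
PRECEDENCE = {"|": 1, CONCAT: 2}  # '*' is a postfix operator and is emitted directly

def add_concat(regex):
    """Insert the explicit concatenation operator between adjacent operands."""
    out = []
    for i, ch in enumerate(regex):
        out.append(ch)
        if i + 1 < len(regex):
            nxt = regex[i + 1]
            if ch not in "(|" and nxt not in ")|*":
                out.append(CONCAT)
    return "".join(out)

def to_postfix(regex):
    """Shunting-yard conversion of an infix regex into post-order form."""
    output, stack = [], []
    for ch in add_concat(regex):
        if ch == "(":
            stack.append(ch)
        elif ch == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the "("
        elif ch in PRECEDENCE:
            while stack and stack[-1] != "(" and PRECEDENCE[stack[-1]] >= PRECEDENCE[ch]:
                output.append(stack.pop())
            stack.append(ch)
        else:
            output.append(ch)  # literal, or the postfix operator '*'
    while stack:
        output.append(stack.pop())
    return "".join(output)

def postfix_to_nfa(postfix):
    """Return (edges, finals). State 0 is the start state; state i > 0 means
    'the i-th literal of the regex has just been consumed'."""
    symbols = [None]   # symbols[i] is the literal at position i
    follow = {}        # position -> positions that may be consumed next
    stack = []         # entries: (nullable, first positions, last positions)
    for ch in postfix:
        if ch == CONCAT:
            n2, f2, l2 = stack.pop()
            n1, f1, l1 = stack.pop()
            for p in l1:
                follow[p] |= f2
            stack.append((n1 and n2,
                          (f1 | f2) if n1 else f1,
                          (l1 | l2) if n2 else l2))
        elif ch == "|":
            n2, f2, l2 = stack.pop()
            n1, f1, l1 = stack.pop()
            stack.append((n1 or n2, f1 | f2, l1 | l2))
        elif ch == "*":
            n, f, l = stack.pop()
            for p in l:
                follow[p] |= f
            stack.append((True, f, l))
        else:
            symbols.append(ch)
            pos = len(symbols) - 1
            follow[pos] = set()
            stack.append((False, {pos}, {pos}))
    nullable, first, last = stack.pop()
    edges = {(0, symbols[q], q) for q in first}
    edges |= {(p, symbols[q], q) for p in follow for q in follow[p]}
    finals = set(last) | ({0} if nullable else set())
    return edges, finals

postfix = to_postfix("a(bc)*(d|e)")
edges, finals = postfix_to_nfa(postfix)
print(postfix)
print(sorted(edges), sorted(finals))
```

For the slide's regex this produces the post-order string shown above and an NFA with one state per literal (a, b, c, d, e) plus a start state. Each edge consumes exactly one character, which corresponds to one character per clock cycle in hardware.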
Example of NFA construction for the regex “a(bc)*(d|e)”
[Figure sequence, steps N=0 to N=6: the 2-character NFA over states 0 to 7 is built by adding edges step by step. Edge labels first appearing at each step:]
• N=0: XX
• N=1: Xa
• N=2: ab, cb
• N=3: bc
• N=4: cd, ad
• N=5: ce, ae
• N=6: dX, eX
Each final state corresponds to a character position, so multiple matches at different positions in the same clock cycle can be reported accurately.
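The 2-character edge labels in this example (XX, Xa, ab, cb, bc, cd, ce, ad, ae, dX, eX) can be reproduced with a small sketch that fuses pairs of consecutive 1-character transitions. The 1-character NFA below, its state numbering, and the names `ONE_CHAR`, `FINALS` and `to_two_char` are assumptions made for illustration; the figure in the slides uses its own states 0 to 7, and the paper's actual procedure may differ.

```python
ANY = "X"  # don't-care symbol, written X in the slides' edge labels

# Assumed epsilon-free 1-character NFA for a(bc)*(d|e), searching anywhere in the input.
# State 0 is the start state (it loops on any character); states 4 and 5 are final.
ONE_CHAR = [
    (0, ANY, 0), (0, "a", 1),
    (1, "b", 2), (2, "c", 3), (3, "b", 2),
    (1, "d", 4), (1, "e", 5), (3, "d", 4), (3, "e", 5),
]
FINALS = {4, 5}

def to_two_char(edges, finals):
    """Fuse pairs of consecutive 1-character edges into 2-character edges.

    Targets ("F", 1) and ("F", 2) are per-position final states: the match
    ended on the first or on the second character of the pair.
    """
    two = set()
    for (p, c1, q) in edges:
        if q in finals:
            # The match ends on the first character; the second one is don't-care.
            two.add((p, c1 + ANY, ("F", 1)))
            continue
        for (q2, c2, r) in edges:
            if q2 == q:
                two.add((p, c1 + c2, ("F", 2) if r in finals else r))
    return two

two_char = to_two_char(ONE_CHAR, FINALS)
labels = sorted({label for (_, label, _) in two_char})
print(labels)
```

Keeping two separate final targets is what lets a match ending on either character of a pair be reported with its exact position.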
Non-match mode • Used when the only information required is whether the input string matches the regex or not. [Figure: the 2-character NFA drawn with states 0 to 6 and edge labels XX, Xa, ab, cb, bc, cd/ce, ad/ae, dX, eX.]
4-character NFA [Figure: 4-character NFA over states 0 to 9. Edge labels: XXXX, XXab, Xabc, bcbc, abcb, cbcb, abcd/abce, cbcd/cbce, XXad/XXae, XadX, XaeX, bcdX, bceX, adXX/aeXX, cdXX/ceXX, dXXX, eXXX.]
Range Matching (1/2) Consider an n-bit input I composed of bits x_{n-1} (MSB) to x_0 (LSB). Now consider the Boolean function I ≦ C (where C is an n-bit constant, 0 ≦ C ≦ 2^n − 1), which is 1 for all I ≦ C and 0 otherwise. [Equations for I ≧ C and for I ≦ C: not recoverable from the extracted text.]
Range Matching (2/2) [Figure: bit values shown: 0 1 0 0 1 1 0 1.]
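One common way to evaluate I ≦ C and I ≧ C against a constant is a bit-by-bit recurrence, sketched below. It is a standard textbook construction and is offered only as an assumption about what the range-matching logic computes, not as the formula from the slides. The function names `le_const`, `ge_const` and `in_range` are hypothetical.

```python
def le_const(x, c, n):
    """True iff the n-bit value x satisfies x <= c, using one gate per bit."""
    result = True  # no bits examined yet: the values are equal so far
    for i in range(n):  # from LSB (bit 0) up to MSB (bit n-1)
        xi = (x >> i) & 1
        ci = (c >> i) & 1
        # c_i = 1: x_i = 0 makes x smaller at this bit; otherwise defer to lower bits.
        # c_i = 0: x_i = 1 makes x larger at this bit; otherwise defer to lower bits.
        result = ((not xi) or result) if ci else ((not xi) and result)
    return bool(result)

def ge_const(x, c, n):
    """True iff the n-bit value x satisfies x >= c."""
    result = True
    for i in range(n):
        xi = (x >> i) & 1
        ci = (c >> i) & 1
        result = (xi and result) if ci else (xi or result)
    return bool(result)

def in_range(x, lo, hi, n=8):
    """Range match lo <= x <= hi, e.g. for a character class such as [a-z]."""
    return ge_const(x, lo, n) and le_const(x, hi, n)

print(in_range(ord("m"), ord("a"), ord("z")))
print(in_range(ord("M"), ord("a"), ord("z")))
```

Because C is a constant, each bit position needs only a single AND or OR with the result from the lower bits, so the comparison maps to a short chain of gates instead of a full comparator.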
[Figure sequence, steps N=0 to N=7: the 4-character NFA over states 0 to 9 is built by adding edges step by step. Edge labels first appearing at each step:]
• N=0: XXXX
• N=1: XXab, XXad, XXae
• N=2: Xabc, bcbc
• N=3: abcb, cbcb, abcd, abce, cbcd, cbce
• N=4: XadX, bcdX
• N=5: XaeX, bceX
• N=6: adXX, aeXX, cdXX, ceXX
• N=7: dXXX, eXXX
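The same idea extends to wider strides. The sketch below enumerates all paths of m consecutive 1-character transitions, pads with X once a final state is reached, and then simulates the result m characters per step. The 1-character NFA, its numbering, and the names `expand` and `match_ends` are assumptions for illustration and may not match the paper's step-by-step procedure. With m = 4, the generated labels include all 22 labels visible in the 4-character figure, plus XXXa, which does not appear in the extracted figure text.

```python
ANY = "X"  # don't-care symbol, written X in the slides' edge labels

# Assumed 1-character NFA for a(bc)*(d|e): start state 0 loops on any character,
# states 4 and 5 are final.
ONE_CHAR = [
    (0, ANY, 0), (0, "a", 1),
    (1, "b", 2), (2, "c", 3), (3, "b", 2),
    (1, "d", 4), (1, "e", 5), (3, "d", 4), (3, "e", 5),
]
FINALS = {4, 5}

def expand(edges, finals, m):
    """Return m-character edges (src, label, dst). dst is ("F", pos) when a
    match ends on the pos-th character (1-based) of the m-character block."""
    out = set()

    def walk(src, state, label):
        if len(label) == m:
            out.add((src, label, state))
            return
        for (p, c, q) in edges:
            if p != state:
                continue
            if q in finals:
                pos = len(label) + 1
                out.add((src, (label + c).ljust(m, ANY), ("F", pos)))
            else:
                walk(src, q, label + c)

    for s in {p for (p, _, _) in edges}:
        walk(s, s, "")
    return out

def match_ends(edges, finals, m, text):
    """Process m characters per step; return sorted end offsets of all matches."""
    trans = expand(edges, finals, m)
    text = text + "\0" * (-len(text) % m)  # pad the last block
    active, ends = {0}, set()
    for i in range(0, len(text), m):
        block = text[i:i + m]
        nxt = set()
        for (p, label, q) in trans:
            if p in active and all(l in (ANY, ch) for l, ch in zip(label, block)):
                if isinstance(q, tuple):
                    ends.add(i + q[1])  # per-position final state
                else:
                    nxt.add(q)
        active = nxt
    return sorted(ends)

four_labels = {label for (_, label, _) in expand(ONE_CHAR, FINALS, 4)}
print(sorted(four_labels))
print(match_ends(ONE_CHAR, FINALS, 4, "adae"))
```

In the last call both "ad" and "ae" end inside the same 4-character block, so two different per-position final states fire in one step, which is the situation the per-position final states are meant to handle.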