Bit-Splitting method for regular expression
Presented by Tamer Abuhmed
Information Security Research Laboratory
http://seclab.inha.ac.kr/
Contents
• Introduction
• Applying the bit-split method to the Aho-Corasick algorithm
• The efficiency of bit-splitting
• Applying bit-splitting to regular expressions
• Examples of bit-splitting on regular expressions
• Analysis
• Discussion
Classical Aho-Corasick (AC) DFA: example 1
• A set of keywords: {he, her, him, his}
• Failure edges back to state 1 are shown as dashed lines.
• Failure edges back to state 0 are not shown.
[Figure: AC DFA for the keyword set, with one start state and four accept states]
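The keyword-set example above can be reproduced with a minimal sketch of Aho-Corasick in its goto/failure form. This is an illustration under assumptions, not the presenter's implementation: the names `build_aho_corasick` and `search` are hypothetical, and states are numbered in insertion order, which may differ from the numbering in the slide's figure.

```python
from collections import deque

def build_aho_corasick(keywords):
    """Build goto, failure, and output tables for a keyword set."""
    goto = [{}]        # goto[state][char] -> next state (trie edges)
    output = [set()]   # output[state] -> keywords recognized in that state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                output.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].add(word)

    # Breadth-first pass: depth-1 states fail to the start state (0).
    fail = [0] * len(goto)
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] |= output[fail[nxt]]
    return goto, fail, output

def search(text, goto, fail, output):
    """Return (start_index, keyword) for every keyword occurrence in text."""
    state, matches = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]          # follow failure edges
        state = goto[state].get(ch, 0)
        for word in output[state]:
            matches.append((i - len(word) + 1, word))
    return matches
```

For {he, her, him, his} this builds seven states (the start state plus one per distinct prefix), and `search("hers his him", ...)` reports he and her at index 0, his at index 5, and him at index 9.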
Bit-AC DFA (Lin’s Bit-Split)
• Needs 8 bit-DFAs, one for each bit of the 8-bit input character.
[Figure: simple one-bit example]
Lin Tan, "Bit-split string-matching engines for intrusion detection and prevention", ACM Transactions on Architecture and Code Optimization, 2006.
Bit-AC DFA (Lin’s Bit-Split), cont'd
[Figure: DFAs for the 4th and 5th bits]
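The bit-split idea can be sketched as follows, assuming 8 one-bit DFAs as stated on the slide. Each bit-DFA is obtained by a subset construction over the byte-level AC DFA, keeping only the transitions whose input byte has the given value at one bit position. Every bit-DFA state carries a partial match vector (the keywords of the AC states it contains), and a keyword is reported only when all 8 vectors agree. All function names here are hypothetical, and this is an illustration rather than the code from the cited paper.

```python
from collections import deque

def build_ac_dfa(keywords):
    """Full Aho-Corasick DFA over bytes: delta[state][byte] -> state."""
    trie, out = [{}], [set()]
    for idx, word in enumerate(keywords):
        s = 0
        for b in word.encode("ascii"):
            if b not in trie[s]:
                trie.append({})
                out.append(set())
                trie[s][b] = len(trie) - 1
            s = trie[s][b]
        out[s].add(idx)                      # outputs are keyword indices
    delta = [[0] * 256 for _ in trie]
    fail = [0] * len(trie)
    queue = deque()
    for b, nxt in trie[0].items():
        delta[0][b] = nxt
        queue.append(nxt)
    while queue:                             # breadth-first over the trie
        s = queue.popleft()
        out[s] |= out[fail[s]]
        for b in range(256):
            if b in trie[s]:
                nxt = trie[s][b]
                fail[nxt] = delta[fail[s]][b]
                delta[s][b] = nxt
                queue.append(nxt)
            else:
                delta[s][b] = delta[fail[s]][b]
    return delta, out

def split_bit(delta, out, bit):
    """Subset-construct the binary DFA that sees only one bit of each byte."""
    chars = {v: [c for c in range(256) if (c >> bit) & 1 == v] for v in (0, 1)}
    start = frozenset([0])
    ids = {start: 0}
    trans, pmv = [], []                      # pmv = partial match vectors
    work = deque([start])
    while work:
        subset = work.popleft()
        row = []
        for v in (0, 1):
            nxt = frozenset(delta[s][c] for s in subset for c in chars[v])
            if nxt not in ids:
                ids[nxt] = len(ids)
                work.append(nxt)
            row.append(ids[nxt])
        trans.append(row)
        pmv.append(set().union(*(out[s] for s in subset)))
    return trans, pmv

def bit_split_search(text, keywords):
    """Run the 8 bit-DFAs in lockstep and intersect their match vectors."""
    delta, out = build_ac_dfa(keywords)
    machines = [split_bit(delta, out, bit) for bit in range(8)]
    states = [0] * 8
    matches = []
    for pos, byte in enumerate(text.encode("ascii")):
        hit = set(range(len(keywords)))
        for bit, (trans, pmv) in enumerate(machines):
            states[bit] = trans[states[bit]][(byte >> bit) & 1]
            hit &= pmv[states[bit]]
        matches += [(pos - len(keywords[k]) + 1, keywords[k]) for k in sorted(hit)]
    return matches
```

With the keyword set {he, her, him, his}, `bit_split_search("hers his him", ["he", "her", "him", "his"])` returns the same matches as a byte-level Aho-Corasick scan: he and her at index 0, his at index 5, him at index 9. The sketch assumes ASCII input and keeps the 8 machines as plain Python lists; a hardware design would store each machine in its own small memory.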
Comparison
[Figure: memory implementation of Aho-Corasick vs. memory implementation of bit-split Aho-Corasick]
Benefits of the bit-split method
• Eliminates non-output states, so the resulting binary tree has fewer states than the original DFA.
• Reduces the next-state pointer size from 15 bits to 9 bits.
• Reduces the number of transition records from 27000 to 300.
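The link between state count, pointer width, and table size can be sketched with a little arithmetic. The helper names and the state counts below are hypothetical, chosen only so that the pointer widths come out at 15 and 9 bits; they are not the measurements behind the figures on the slide.

```python
def pointer_bits(num_states):
    """Bits needed to address any one of num_states states."""
    return max(1, (num_states - 1).bit_length())

def table_bits(num_states, fan_out):
    """Dense transition table: one next-state pointer per (state, symbol)."""
    return num_states * fan_out * pointer_bits(num_states)

# Hypothetical sizes, for illustration only.
byte_dfa_bits = table_bits(num_states=32768, fan_out=256)   # one byte-wide DFA
bit_dfas_bits = 8 * table_bits(num_states=512, fan_out=2)   # eight one-bit DFAs

print(pointer_bits(32768), pointer_bits(512))
print(byte_dfa_bits, bit_dfas_bits)
```

A DFA with up to 32768 states needs 15-bit pointers, while one with up to 512 states needs only 9-bit pointers, and a binary alphabet cuts each state's row from 256 entries to 2.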
Regular Expression Types
• Explicit string: ^ABCD
• With wildcards (.*): ^AB.*CD
[Figure: DFAs for both expressions over ASCII characters]
Fang Yu, "Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection", ACM, ANCS 2006.
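Regular Expression Types, cont'd
• At least: AB.{2+}CD (two or more arbitrary characters between AB and CD; written .{2,} in POSIX/PCRE syntax)
• At most: AB.{0,2}CD (at most two arbitrary characters between AB and CD)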
Regular Expression Types, cont'd
• Exact number of arbitrary characters: A.{2}CD
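The expression types listed on these slides can be tried out with Python's standard `re` module. This is a small sketch: the dictionary keys and the `matching_types` helper are hypothetical labels, and the slide's `.{2+}` is written as `.{2,}` on the assumption that it means "two or more".

```python
import re

# One pattern per expression type from the slides.
patterns = {
    "explicit": r"^ABCD",       # fixed string anchored at the start
    "wildcard": r"^AB.*CD",     # any number of characters between AB and CD
    "at_least": r"AB.{2,}CD",   # two or more characters in between
    "at_most":  r"AB.{0,2}CD",  # zero to two characters in between
    "exact":    r"A.{2}CD",     # exactly two characters between A and CD
}

def matching_types(text):
    """Return the sorted names of the expression types that match text."""
    return sorted(name for name, pat in patterns.items() if re.search(pat, text))

print(matching_types("ABCD"))      # ['at_most', 'explicit', 'wildcard']
print(matching_types("ABxyCD"))    # ['at_least', 'at_most', 'wildcard']
print(matching_types("ABxxxCD"))   # ['at_least', 'wildcard']
print(matching_types("AxyCD"))     # ['exact']
```

Note that `.` does not match a newline by default in Python, so these patterns treat each input as a single line.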
Statistics
Sailesh Kumar, "Algorithms to accelerate multiple regular expressions matching for deep packet inspection", SIGCOMM 2006.