Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection

Fang Yu, Microsoft Research, Silicon Valley. Work was done at UC Berkeley, jointly with Zhifeng Chen (Google Inc.), Yanlei Diao (UMass Amherst), T. V. Lakshman (Bell Labs), and Randy H. Katz (UC Berkeley).

Regular Expressions
• Flexible way to describe patterns
• Example: detecting Yahoo Messenger traffic
  ^(ymsg|ypns|yhoo).?.?.?.?.?.?.?[lwt].*\xc0\x80
• Used in many payload scanning applications
  • L7-filter: protocol identifiers
  • Bro: intrusion patterns
  • SNORT:
    • No regular expressions in April 2003
    • 1131 out of 4867 intrusion rules contain regular expressions as of Jan 2006

Challenges
• Features specific to packet scanning applications
• Large sets of patterns, on the order of hundreds or thousands

Design Space
Automata-based approaches
• NFA-based
  • A group of states can be activated simultaneously
  • NFA-based approaches can be slow, sometimes less than 1 Mb/s
• DFA-based
  • Only one state is activated
  • Repeated scan
    • Start scanning from one position; if there is no match, start again at the next position
    • Good for parsers
    • Packets may not contain any patterns
    • No guarantee of high speed
  • One-pass scan (space problem)
    • Scan the input only once
    • Fast and deterministic throughput
    • Add .* before patterns
    • High percentage of wildcards
    • Some patterns generate very large DFAs
    • m individual DFAs for m patterns: O(m) processing complexity for each input character
    • One composite DFA for m patterns: O(1) processing complexity for each input character
Example patterns: (A|B)C and (A|D)E
Contributions
• Selectively group patterns into k groups (e.g., k=3)
  • Avoid exponential memory growth
  • Further speed up the matching process
• Rewrite techniques to reduce memory usage
  • Make the DFA-based approach feasible
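The one-pass idea (prepend .* and consume each input character exactly once) can be sketched with a tiny hand-built DFA. This is only an illustration: the pattern .*AB, the state numbering, and the function names are assumptions made for this example, not the scanner described in the talk.

```python
def step(state, ch):
    """Transition function of a hand-built DFA for the pattern .*AB.

    States: 0 = no progress, 1 = last character was 'A', 2 = just saw "AB".
    """
    if ch == "A":
        return 1
    if ch == "B" and state == 1:
        return 2
    return 0


def one_pass_scan(data):
    """Scan data once and return the end offset of every "AB" occurrence."""
    state = 0
    match_ends = []
    for offset, ch in enumerate(data, start=1):
        state = step(state, ch)  # exactly one transition per input character
        if state == 2:
            match_ends.append(offset)
    return match_ends


print(one_pass_scan("xxABAAB"))  # [4, 7]
```

Because only one state is active at a time, the work per character does not depend on how many partial matches are in flight. The cost is the size of the transition table, which is the space problem noted above.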

DFA Sizes of Regular Expressions
• Typical patterns in network payload scanning applications
[Table not recovered. Surviving annotations: Rewrite Rule 1, Rewrite Rule 2, Focus of this talk]

Design Considerations
• Completeness of matching results for one pattern
  • Complete matching
    • Report all the possible matching substrings
    • E.g., for the pattern ab* and the input abbb, there are four possible matches: a, ab, abb, and abbb
  • Non-overlapping matching
    • Common practice: report the left-most longest match or the shortest match
• In most payload scanning applications, reporting non-overlapping matches for each pattern is sufficient
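The difference between the two reporting modes can be checked on the ab* example. The sketch below uses Python's re module, which is a backtracking engine and not the DFA scanner from the talk. The function name complete_matches is made up for this illustration, and the brute-force enumeration is only meant for short inputs.

```python
import re


def complete_matches(pattern, data):
    """Every substring of data matched in full by pattern (brute force)."""
    compiled = re.compile(pattern)
    found = []
    for start in range(len(data)):
        for end in range(start + 1, len(data) + 1):
            if compiled.fullmatch(data, start, end):
                found.append(data[start:end])
    return found


# Complete matching: all four substrings are reported.
print(complete_matches(r"ab*", "abbb"))     # ['a', 'ab', 'abb', 'abbb']

# Non-overlapping matching: one result per occurrence.
print(re.search(r"ab*", "abbb").group())    # 'abbb' (greedy, longest here)
print(re.search(r"ab*?", "abbb").group())   # 'a' (shortest)
```

For this pattern the greedy result coincides with the left-most longest match. That is not true of every pattern in a backtracking engine, so the example should not be read as a general definition.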

Patterns with Exponential DFA Sizes
• Often used for detecting buffer overflow attempts, e.g., .*AUTH\s[^\n]{100}
• The DFA needs to remember all the possible occurrences of AUTH\s
  • A second AUTH\s can either match [^\n]{100} or be counted as a new match of the start of the pattern AUTH\s
• Generates a DFA of >100,000 states
• Can’t be efficiently processed by an NFA-based approach either
Example input: AUTH\sAUTH\sAUTH\s\s AUTH\s\s\s …
[Figure: NFA for .*AUTH\s[^\n]{100}. Edge labels: ε, A, U, T, H, \s, then a chain of 100 states, each entered on [^\n]]
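To see why every earlier AUTH\s has to be remembered, it helps to simulate the NFA and look at which states are active at once. The sketch below is an assumption-laden illustration: the state numbering and the function name active_states are invented here, and the leading .* is modelled by keeping the start state always active.

```python
def active_states(data, count=100):
    r"""Simulate an NFA for .*AUTH\s[^\n]{count} and return its active states.

    State 0 is the start state, states 1-5 track progress through AUTH\s,
    and state 5 + j means that j characters of [^\n] have been consumed.
    State 5 + count is the accepting state.
    """
    active = {0}
    for ch in data:
        nxt = {0}  # the leading .* keeps the start state alive
        for s in active:
            if s < 4:
                if ch == "AUTH"[s]:
                    nxt.add(s + 1)
            elif s == 4:
                if ch.isspace():  # \s
                    nxt.add(5)
            elif s < 5 + count:
                if ch != "\n":  # [^\n]
                    nxt.add(s + 1)
        active = nxt
    return active


print(sorted(active_states("AUTH " * 3)))  # [0, 5, 10, 15]
```

After three repetitions, three different counting states are active at the same time, one per AUTH\s seen so far. A DFA has to encode each such combination as a separate state, which is where the blow-up comes from.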

Rewriting Intuition
• Only the first AUTH\s matters
  • If there is a '\n' within the next 100 bytes, none of the AUTH\s occurrences before that '\n' matches the pattern
  • Otherwise, the first AUTH\s and the following characters have already matched the pattern
• Rewrite the pattern to:
  ([^A]|A[^U]|AU[^T]|AUT[^H]|AUTH[^\s]|AUTH\s[^\n]{0,99}\n)*AUTH\s[^\n]{100}
  • Generates a DFA of only 106 states
• The rewritten pattern
  • Reports a different number of matches from the original pattern when identifying complete matches
  • Is equivalent when identifying non-overlapping matches
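A quick spot check of the rewrite on a few hand-picked inputs is sketched below. Several things here are assumptions: the count is reduced from 100 to a small K so the inputs stay short, Python's backtracking re engine stands in for a DFA, the leading .* of the original is modelled with search(), and the rewritten pattern is anchored with match(). This compares only the end of the first match on these samples and is not a proof of equivalence.

```python
import re

K = 5  # assumption: the slide uses 100; a small count keeps the inputs short

# Original pattern; search() plays the role of the leading .*
ORIGINAL = re.compile(r"AUTH\s[^\n]{%d}" % K)

# Rewritten pattern from the slide with 100 replaced by K.
REWRITTEN = re.compile(
    r"(?:[^A]|A[^U]|AU[^T]|AUT[^H]|AUTH[^\s]|AUTH\s[^\n]{0,%d}\n)*"
    r"AUTH\s[^\n]{%d}" % (K - 1, K)
)


def first_match_end(compiled, data, anchored):
    """End offset of the first match, or None if there is no match."""
    m = compiled.match(data) if anchored else compiled.search(data)
    return m.end() if m else None


SAMPLES = [
    "xxAUTH abcde",            # match right after some noise
    "AUTH ab\nAUTH abcdefg",   # first AUTH is cut short by a newline
    "AUTH ab\ncd",             # no match at all
    "AUTH AUTH AUTH xyz",      # overlapping AUTH occurrences
]

for data in SAMPLES:
    print(
        repr(data),
        first_match_end(ORIGINAL, data, anchored=False),
        first_match_end(REWRITTEN, data, anchored=True),
    )
```

On these four samples both patterns report the same first-match end offset (12, 18, None, and 10).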

Rewriting Effect on the SNORT Rule Set
• Created scripts to automatically rewrite patterns
• After rewriting, patterns in SNORT and Bro can be compiled into DFAs
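The authors' scripts are not shown in the slides. As a rough idea of what automating this kind of rewrite could look like, the hypothetical helper below builds the rewritten form for patterns shaped like .*PREFIX[^c]{n}. The function name, its parameters, and the restriction to single-character or class-escape tokens are all assumptions made for this sketch.

```python
def rewrite_counted_pattern(prefix_tokens, excluded, count):
    r"""Build the rewritten form of .*PREFIX[^excluded]{count}.

    prefix_tokens: regex tokens of the prefix, e.g. ["A", "U", "T", "H", r"\s"];
                   each must be valid inside a bracket expression.
    excluded:      the excluded character as a regex token, e.g. r"\n".
    """
    alternatives = []
    for i, token in enumerate(prefix_tokens):
        # Input that diverges from the prefix at token i.
        alternatives.append("".join(prefix_tokens[:i]) + "[^" + token + "]")
    prefix = "".join(prefix_tokens)
    # A full prefix that is cut short by the excluded character.
    alternatives.append(
        "%s[^%s]{0,%d}%s" % (prefix, excluded, count - 1, excluded)
    )
    return "(%s)*%s[^%s]{%d}" % ("|".join(alternatives), prefix, excluded, count)


print(rewrite_counted_pattern(["A", "U", "T", "H", r"\s"], r"\n", 100))
```

For the AUTH\s example this prints the same rewritten pattern that appears on the Rewriting Intuition slide.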

Design Choices
Automata-based approaches
• NFA-based
• DFA-based
  • Repeated scan
  • One-pass scan
    • m individual DFAs for m patterns: O(m) processing complexity for each input character
    • One composite DFA for m patterns: O(1) processing complexity for each input character
Contributions
• Selectively group patterns into k groups (e.g., k=3)
  • Further speed up the matching process
  • Avoid exponential memory growth
• Rewrite techniques to reduce memory usage
  • Make the DFA-based approach feasible

State Explosion Problem
• Randomly adding patterns from the L7-filter into one DFA

Interactions of Regular Expressions
• Some patterns, when combined, generate a DFA of exponential size
• E.g., a DFA for the patterns .*AB.*CD and .*EF.*GH
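The growth can be reproduced on a small scale by building a product DFA for patterns of the form .*XY.*ZW and counting its reachable states. The sketch below makes several assumptions: each pattern is written as four distinct letters (no letter shared between patterns), the match state is treated as absorbing, and the composite automaton is a plain product construction. The state counts it prints apply to this construction only.

```python
def step(state, ch, letters):
    """One transition of a 5-state DFA for .*XY.*ZW, with letters = "XYZW"."""
    x, y, z, w = letters
    if state == 0:
        return 1 if ch == x else 0
    if state == 1:
        return 2 if ch == y else (1 if ch == x else 0)
    if state == 2:
        return 3 if ch == z else 2
    if state == 3:
        return 4 if ch == w else (3 if ch == z else 2)
    return 4  # the match state is absorbing


def composite_states(patterns):
    """Count the reachable states of the product (composite) DFA."""
    alphabet = sorted(set("".join(patterns))) + ["?"]  # "?" = any other byte
    start = (0,) * len(patterns)
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for ch in alphabet:
            nxt = tuple(step(s, ch, p) for s, p in zip(current, patterns))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)


print(composite_states(["ABCD"]))                  # 5
print(composite_states(["ABCD", "EFGH"]))          # 21
print(composite_states(["ABCD", "EFGH", "IJKL"]))  # 81
```

Each pattern alone needs 5 states, but the composite needs 21 for two patterns and 81 for three, because the automaton has to track how far every pattern has progressed independently of the others.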

Grouping Algorithms
• Fixed local memory limitation (NPU or multi-core architectures)
  • Compute pairwise interaction results and form a graph
  • Keep adding patterns to a group until reaching the limit
    • Pick the pattern with the fewest interactions with the new group
• Fixed total memory limitation (general single-core CPU architecture)
  • First compute the DFAs of individual patterns and compute the leftover memory size
  • Distribute the leftover memory evenly among ungrouped expressions
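The fixed-local-memory variant can be read as a greedy loop over the interaction graph. The sketch below is one possible reading of the bullets, not the authors' implementation: the function names, the seed-selection rule, the tie-breaking by list order, and the toy cost model standing in for real DFA compilation are all assumptions.

```python
def group_patterns(patterns, interacts, composite_size, limit):
    """Greedily pack patterns into groups whose composite size fits the limit."""
    ungrouped = list(patterns)
    groups = []
    while ungrouped:
        # Assumed seed rule: start with the least-interacting ungrouped pattern.
        seed = min(
            ungrouped,
            key=lambda p: sum(interacts(p, q) for q in ungrouped if q != p),
        )
        ungrouped.remove(seed)
        group = [seed]
        while ungrouped:
            # Pick the pattern with the fewest interactions with the new group.
            cand = min(ungrouped, key=lambda p: sum(interacts(p, q) for q in group))
            if composite_size(group + [cand]) > limit:
                break  # limit reached: close this group
            group.append(cand)
            ungrouped.remove(cand)
        groups.append(group)
    return groups


# Toy data (hypothetical): p1/p2 interact, and so do p3/p4.
PAIRS = {frozenset(("p1", "p2")), frozenset(("p3", "p4"))}


def interacts(p, q):
    return frozenset((p, q)) in PAIRS


def toy_size(group):
    """Stand-in cost: 10 per pattern plus 100 per interacting pair."""
    pairs = sum(interacts(p, q) for i, p in enumerate(group) for q in group[i + 1:])
    return 10 * len(group) + 100 * pairs


print(group_patterns(["p1", "p2", "p3", "p4"], interacts, toy_size, limit=50))
# [['p1', 'p3'], ['p2', 'p4']]
```

In a real system composite_size would compile the group into a DFA and measure it, which is far more expensive than this toy cost function.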

Experimental Setup
• Regular expression pattern sets
  • Linux application layer filter (L7-filter): 70 regular expressions
  • Pattern sets from the Bro intrusion detection system
    • HTTP-related patterns: 648 patterns
    • Payload-related patterns: 223 patterns
• Packet traces
  • MIT dump: with viruses and worms
  • Berkeley dump: normal traffic
• Scanners
  • A generated one-pass scanning DFA scanner
  • An NFA-based scanner, Pcregrep
  • A repeated-scanning DFA parser generated by flex

Grouping Results for Patterns in L7-filter (70 patterns)
• Results of grouping algorithms for fixed total memory
[Table not recovered. Surviving annotations:]
• No grouping: sum of individual DFAs
• No extra memory cost
• 70/12 = 5.83 times less processing per character
• 70/3 = 23.3 times less processing per character
• 6.83 MB of memory

Throughput Analysis
• For Linux L7-filter (70 patterns)
• Using PCs with a 3 GHz single-core CPU and 4 GB of memory

Comparisons to Other Approaches
• DFA OP
  • Is 48 to 704 times faster than the NFA implementation
  • Is 12 to 42 times faster than the commonly used DFA-based parser
  • Uses 2.6 to 8.4 times the memory
• Legend
  • NFA: Pcregrep
  • DFA RP: flex-generated DFA-based repeated scan engine
  • DFA OP: our DFA one-pass scanning engine

Conclusions
• High-speed regular expression matching scheme
• Proposed two rewrite rules
  • DFA-based approach is possible with our rewriting rules
  • Can rewrite complicated patterns from our pattern sets
  • In other pattern sets, there may be patterns not covered by our rewriting rules
• Developed a grouping algorithm to selectively group patterns together
• Orders of magnitude faster than existing solutions
• Can be applied to FPGA- or ASIC-based approaches as well
