Series DFA for Memory-Efficient Regular Expression Matching

Series DFA for Memory-Efficient Regular Expression Matching. Authors: Tingwen Liu, Yong Sun, Li Guo, and Binxing Fang. Publisher: CIAA 2012 (International Conference on Implementation and Application of Automata). Presenter: Sih-An Pan. Date: 2014/5/7. Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C.

Introduction We focus on state reduction by cutting complex RegExes into well-designed, ordered RegEx fragments that can be compiled into compact DFAs. We propose Series DFA (SDFA), which concatenates the compact DFAs with epsilon transitions in the order of their appearance. Computer & Internet Architecture Lab

State Complexity for RegExes

Main Idea of SDFA RegEx1: ba[^a]*bad.{2}cd RegEx2: de[^e]{3} SDFA first locates all unconstrained and constrained repetitions in the two RegExes, and then cuts the RegExes into five fragments. Fragment1: ba Fragment2: ^[^a]*bad Fragment3: ^.{2}cd Fragment4: de Fragment5: ^[^e]{3}
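The cutting step can be sketched in a few lines of Python, using the two example RegExes ba[^a]*bad.{2}cd and de[^e]{3}. This is only an illustration under stated assumptions, not the authors' implementation: the function name cut_regex and the tokenizer are hypothetical, the sketch cuts at every repeated character range or wildcard, and it ignores backslash escapes, groups, and alternation.

```python
import re

# One atom: a bracketed class, a wildcard, or a single literal character,
# followed by an optional repetition operator (*, +, {n}, {n,m}).
ATOM = re.compile(r"(\[[^\]]*\]|\.|[^\[\]])(\*|\+|\{\d+(?:,\d*)?\})?")

def cut_regex(pattern):
    """Cut a RegEx in front of every repeated character range or wildcard.

    Hypothetical helper: fragments after the first are prefixed with '^'
    to mark that they must match right where the previous one ended.
    """
    fragments, current = [], ""
    for match in ATOM.finditer(pattern):
        atom, repeat = match.group(1), match.group(2)
        is_range = atom == "." or atom.startswith("[")
        # Start a new fragment at a repetition, unless nothing precedes it.
        if repeat and is_range and current:
            fragments.append(current)
            current = ""
        current += match.group(0)
    if current:
        fragments.append(current)
    if not fragments:
        return []
    return [fragments[0]] + ["^" + f for f in fragments[1:]]

print(cut_regex("ba[^a]*bad.{2}cd"))  # ['ba', '^[^a]*bad', '^.{2}cd']
print(cut_regex("de[^e]{3}"))         # ['de', '^[^e]{3}']
```

Together the two calls reproduce the five fragments listed on the slide.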

Main Idea of SDFA We call a RegEx the father of its fragments, and each fragment its son. For a given RegEx, the first (last) fragment is called its eldest son (youngest son); the other fragments are correspondingly non-eldest sons (non-youngest sons). Fragments ba and de, the eldest sons of the two RegExes, are compiled into a composite DFA.

Main Idea of SDFA

Main Idea of SDFA RegEx1: ba[^a]*bad.{2}cd RegEx2: de[^e]{3}

Optimization in Cutting Process Cutting at the repetitions of every character range yields a small memory size but high memory bandwidth, because each fragment is too short. In contrast, cutting only at the repetitions of wildcards yields low memory bandwidth but a large memory size. We introduce a threshold μ: if the size of a character range is more than μ, the range is considered large enough to be cut at the positions of its repetitions.
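The threshold test can be sketched as follows. The sketch assumes a byte-oriented alphabet of 256 symbols, treats the wildcard as matching all of them, and ignores backslash escapes; the names range_size and should_cut and the value μ = 128 are hypothetical and chosen only for illustration.

```python
ALPHABET_SIZE = 256  # assumption: input is processed byte by byte

def range_size(atom):
    """Number of symbols matched by one atom: '.', '[...]', or a literal."""
    if atom == ".":
        return ALPHABET_SIZE
    if not (atom.startswith("[") and atom.endswith("]")):
        return 1  # a single literal character
    body = atom[1:-1]
    negated = body.startswith("^")
    if negated:
        body = body[1:]
    members = set()
    i = 0
    while i < len(body):
        # 'x-y' inside the brackets denotes an inclusive range of symbols.
        if i + 2 < len(body) and body[i + 1] == "-":
            members.update(range(ord(body[i]), ord(body[i + 2]) + 1))
            i += 3
        else:
            members.add(ord(body[i]))
            i += 1
    return ALPHABET_SIZE - len(members) if negated else len(members)

def should_cut(atom, mu):
    """Cut at a repetition only if the repeated range has more than mu symbols."""
    return range_size(atom) > mu

MU = 128  # hypothetical threshold value
print(should_cut("[^a]", MU))   # True: 255 symbols
print(should_cut("[a-z]", MU))  # False: 26 symbols
```

With a larger μ only near-wildcard ranges trigger a cut, which moves the result toward fewer, longer fragments; with a smaller μ more ranges are cut, which moves it toward many short fragments.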

Optimization in Matching Process SDFA ensures that the fragment DFAs of one RegEx are never accessed by other RegExes. This property can be exploited to decrease memory bandwidth: because left-most matching is enough to know which RegExes fired, once a RegEx is reported it is safe to set all of its non-eldest-son DFAs inactive forever.
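A rough sketch of this bookkeeping is given below. It is not the SDFA data structure: Python's re module stands in for the compiled fragment DFAs, each RegEx is tracked separately rather than sharing a composite eldest-son DFA, and the function name match_series is hypothetical. Because nothing is shared in this sketch, a reported RegEx is simply skipped altogether, which also retires its non-eldest sons. The code favours clarity over efficiency.

```python
import re

def match_series(regex_fragments, data):
    """Report the end offset of the left-most match of each RegEx.

    regex_fragments maps a RegEx name to its ordered fragments; a leading
    '^' on a fragment is dropped because anchoring is done via offsets.
    """
    compiled = {
        name: [re.compile(f[1:] if f.startswith("^") else f, re.S) for f in frags]
        for name, frags in regex_fragments.items()
    }
    # Active fragment instances per RegEx: (fragment index, start offset).
    active = {name: set() for name in compiled}
    reported = {}
    for end in range(1, len(data) + 1):
        for name, frags in compiled.items():
            if name in reported:
                continue  # already fired: its fragments stay inactive forever
            active[name].add((0, end - 1))  # the eldest son may start anywhere
            work = list(active[name])
            while work:
                k, start = work.pop()
                if not frags[k].fullmatch(data, start, end):
                    continue
                if k + 1 == len(frags):  # the youngest son accepted
                    reported[name] = end
                    active[name].clear()
                    break
                nxt = (k + 1, end)  # epsilon transition to the next fragment
                if nxt not in active[name]:
                    active[name].add(nxt)
                    work.append(nxt)
    return reported

fragments = {
    "RegEx1": ["ba", "^[^a]*bad", "^.{2}cd"],
    "RegEx2": ["de", "^[^e]{3}"],
}
print(match_series(fragments, "baxbad12cdde123"))  # {'RegEx1': 10, 'RegEx2': 15}
```

Each RegEx only ever touches its own fragment list, mirroring the property that fragment DFAs of one RegEx are never accessed by another.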

Experimental Results
