StriD2FA Scalable Regular Expression Matching for Deep Packet Inspection

StriD2FA Scalable Regular Expression Matching for Deep Packet Inspection Author：Xiaofei Wang, Junchen Jiang, Yi Tang ,Yi Wang ,Bin Liu Xiaojun Wang Publisher：IEEE ICC2011 Presenter：Wen-Tse Liang Date：2011/11/16

Outline • INTRODUCTION • Stride-DFA (StriD2FA). • SYSTEM DESIGN PRINCIPLES AND CHALLENGES • Converting input stream into SL stream • An Example of StriD2FA • Benefits of LBM • Challenges • OPTIMIZATION OF FALSE POSITIVE • EVALUATION

Introduction A novel length-based matching (LBM) is presented for accelerating regex matching. In LBM, a packet as a byte stream is first converted into a much shorter stride-length (SL) stream (i.e., integer stream)before sending to StriD2FA. Therefore, the shorter the SL stream is, the higher the speedup can be achieved (in our system, 10 to 15 times speedup is achievable).

Introduction StriD2FA is not directly built from regex, but is built according to different kinds of SL streams. Therefore, the fundamental difference between StriD2FA and DFA is that in DFA a transition records a byte while in StriD2FA it records a length (i.e., integer).

Converting input stream into SL stream some specific characters are chosen, called a tag. A tag is used to get the distances between two adjacent tags which are called stride lengths (SL).

An Example of StriD2FA • regex rule is “.*abba.{2}caca” • Fa(.*abba) is (1 | 2 | 3)+3. • Since the length of the first a in “caca” relies on two arbitrary characters which could be [â][â], a[â], [â]a, or aa. • So possibly the SL stream of Fa(.{2}caca) = • 3 1 2 ([â] [â]caca) • 1 3 2 (a [â]caca) • 2 2 2 ([â] acaca) • 1 1 2 2 (aacaca)

An Example of StriD2FA .*abba.{2}caca (1 | 2 | 3)+ 3 (3 1 2 | 1 3 2 | 2 2 2 | 1 1 2 2)

Benefits of LBM • Increased speed • If the lengths between tags (properly chosen) are relatively long, the method can achieve quite a high speedup. • Small memory consumption: • Firstly, the number of states is generally less than traditional DFA • Secondly, the fanout of each state is controlled by the window size, and which is generally far smaller than the fanout of a traditional DFA (256 in standard ASCII).

OPTIMIZATION OF FALSE POSITIVE Intuitively, a rule is covered by one or more tags c1, . . . , ckwhen it is of very high probability that the rule is matched if StriD2FA1, . . . , StriD2FAkall report a match. it is easy to find that choosing “frequent” characters in a rule as tags can greatly reduce false positive rate. # of occurrences of c in regex r over the sum of lengths of all fixed substrings in r:

EVALUATION

StriD2FA Scalable Regular Expression Matching for Deep Packet Inspection

StriD2FA Scalable Regular Expression Matching for Deep Packet Inspection

Presentation Transcript

Advanced Algorithms for Fast and Scalable Deep Packet Inspection

Deep Packet Inspection Which Implementation Platform?

Network Forensics Deep Packet Inspection

StriD 2 FA: Scalable Regular Expression Matching for Deep Packet Inspection

Cache-Based Scalable Deep Packet Inspection with Predictive Automaton

Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection

Regular Expression Matching for Reconfigurable Packet Inspection

Hardware Architecture for High-Performance Regular Expression Matching

Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection

Regular Expression: Pattern Matching

Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection

Deep Packet Inspection with Regular Expression Matching

Regular Expression Matching for Reconfigurable Constraint Repetition Inspection

Hardware Implementation for Scalable Lookahead Regular Expression Detection

Algorithms to Accelerate Multiple Regular Expressions Matching for Deep Packet Inspection

Packet Scheduling for Deep Packet Inspection on Multi-Core Architectures

Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection

GPEP : Graphics Processing Enhanced Pattern-Matching for High-Performance Deep Packet Inspection

Variable-Stride Multi-Pattern Matching For Scalable Deep Packet Inspection

SI-DFA: Sub-expression Integrated Deterministic Finite Automata for Deep Packet Inspection

Variable-Stride Multi-Pattern Matching For Scalable Deep Packet Inspection

A Scalable Architecture For High-Throughput Regular-Expression Pattern Matching