Another CDFA Based Multi-Pattern Matching Algorithm and Architecture for Packet Inspection. Presenter: Shi-qu Yu. Date: 2011/09/21.
Introduction • Presents a method to optimize the memory usage of DFA-based algorithms for multi-pattern expression matching by combining DFA paths, named isomorphic path combination (IMPC) • Proposes a novel multi-pattern matching algorithm, called ACS
Cached DFA (CDFA) • Cached DFA was first proposed by T. Song [1]. It is a simple extension of the DFA model that adds one or more buffers (caches). • The extension is elegant and promising as a better theoretical basis for pattern matching algorithms.
The key contributions can be summarized as follows • The lower bound of traditional DFA-based pattern matching algorithms is presented and analyzed. • Isomorphic path combination (IMPC), an idea for optimizing pattern matching algorithms, is introduced. • A cached DFA (CDFA) based method is designed to achieve IMPC. Operational details are also addressed. • A novel pattern matching algorithm, ACS, which is based on CDFA and IMPC, is proposed. The related hardware design model is also presented. • Experimental results show that the ACS algorithm saves 78.6% of the states compared with a DFA-based solution.
PROBLEM ANALYSIS (DFA) • Anchored matching: a match occurs only when the pattern begins at a predefined location within the text to be matched.
PROBLEM ANALYSIS (DFA) • Anywhere matching: patterns may begin anywhere in the text, as in payload checking or spam filtering.
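The difference between the two matching modes can be illustrated with a minimal Python sketch. The function names `anchored_match` and `anywhere_match` are hypothetical and chosen only for this illustration; plain string comparison stands in for the DFA.

```python
def anchored_match(text, pattern, offset=0):
    """Anchored matching: the pattern must begin exactly at `offset`."""
    return text.startswith(pattern, offset)


def anywhere_match(text, pattern):
    """Anywhere matching: report every position where the pattern begins."""
    last_start = len(text) - len(pattern)
    return [i for i in range(last_start + 1) if text.startswith(pattern, i)]
```

With anchored matching the automaton can simply stop on the first mismatch. With anywhere matching every mismatch has to lead somewhere useful, which is where the extra transitions of an anywhere-matching DFA come from.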
PROBLEM ANALYSIS (DFA) • DFA transitions fall into four categories: basic transitions, cross transitions, failure transitions, and restartable transitions. • For anywhere matching, basic and cross transitions can cause memory explosion.
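The slide names the four categories without defining them. The sketch below builds a full anywhere-matching DFA with the standard Aho-Corasick construction and counts transitions under one assumed reading of the categories: basic transitions are the trie edges, failure transitions return to the initial state, restartable transitions lead to a depth-1 state, and all remaining transitions are cross transitions. These definitions are an assumption made for illustration and may differ from the ones used in the talk.

```python
from collections import deque


def build_dfa(patterns):
    """Build the pattern trie and the full DFA (standard Aho-Corasick)."""
    goto, depth = [{}], [0]          # goto[s]: char -> child, depth[s]: trie depth
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                depth.append(depth[s] + 1)
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
    alphabet = sorted({ch for p in patterns for ch in p})
    fail = [0] * len(goto)
    delta = [{} for _ in goto]       # delta[s]: char -> next state (total)
    queue = deque()
    for ch in alphabet:
        delta[0][ch] = goto[0].get(ch, 0)
        if ch in goto[0]:
            queue.append(goto[0][ch])
    while queue:                     # breadth-first, so shallower states are done
        s = queue.popleft()
        for ch in alphabet:
            if ch in goto[s]:
                t = goto[s][ch]
                fail[t] = delta[fail[s]][ch]
                delta[s][ch] = t
                queue.append(t)
            else:
                delta[s][ch] = delta[fail[s]][ch]
    return goto, depth, delta


def classify_transitions(patterns):
    """Count transitions per category (category definitions are assumed)."""
    goto, depth, delta = build_dfa(patterns)
    counts = {"basic": 0, "cross": 0, "failure": 0, "restartable": 0}
    for s, row in enumerate(delta):
        for ch, t in row.items():
            if goto[s].get(ch) == t:
                counts["basic"] += 1          # trie edge
            elif t == 0:
                counts["failure"] += 1        # back to the initial state
            elif depth[t] == 1:
                counts["restartable"] += 1    # restart at a first character
            else:
                counts["cross"] += 1          # jump into the middle of a pattern
    return counts
```

For the pattern set `["he", "she", "his", "hers"]` the trie has 10 states and the alphabet has 5 symbols, so the full DFA stores 50 transitions, of which only 9 are basic.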
IMPC IDEA • M = {K, Σ, δ, s0, F}
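For reference, the tuple above can be written down as a small data structure. This is a minimal sketch that assumes the usual reading of the symbols (K the state set, Σ the alphabet, δ the transition function, s0 the initial state, F the accepting states); the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    states: frozenset      # K
    alphabet: frozenset    # Σ
    delta: dict            # δ, stored as (state, symbol) -> state
    start: object          # s0
    accepting: frozenset   # F

    def accepts(self, text):
        """Run the automaton over `text` and test the final state."""
        state = self.start
        for symbol in text:
            state = self.delta[(state, symbol)]
        return state in self.accepting


# Example: strings over {a, b} that end with "ab".
ends_with_ab = DFA(
    states=frozenset({0, 1, 2}),
    alphabet=frozenset("ab"),
    delta={(0, "a"): 1, (0, "b"): 0,
           (1, "a"): 1, (1, "b"): 2,
           (2, "a"): 1, (2, "b"): 0},
    start=0,
    accepting=frozenset({2}),
)
```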
Implicit State Coloring • There are two ways to represent the colors of states. One is to explicitly store several additional bits for each state, which may cause memory overhead. The other is to take advantage of information that is already available and to color the states implicitly.
ACS ALGORITHM AND ARCHITECTURE • The algorithm is similar to the AC algorithm, with the addition of a method for finding isomorphic paths. We do not aim to find all isomorphic paths, only the efficient ones.
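As background for the comparison with AC, here is a minimal sketch of multi-pattern matching with goto and failure links, assuming that "AC" refers to the Aho-Corasick algorithm. It shows only the baseline; the isomorphic path step that distinguishes ACS is not included, and the function names are hypothetical.

```python
from collections import deque


def build_ac(patterns):
    """Build goto, failure, and output tables for the pattern set."""
    goto, fail, out = [{}], [0], [[]]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    queue = deque(goto[0].values())      # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]   # inherit matches ending here
            queue.append(t)
    return goto, fail, out


def search(text, patterns):
    """Return (start_index, pattern) for every occurrence in `text`."""
    goto, fail, out = build_ac(patterns)
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]                  # follow failure links on mismatch
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits
```

For example, `search("ushers", ["he", "she", "his", "hers"])` reports `she` at index 1, `he` at index 2, and `hers` at index 2.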
Find Isomorphic Path • For ease of implementation, some rules are given for finding isomorphic paths. They are not strict prerequisites for IMPC, but empirical rules that simplify the problem. The basic idea is that the isomorphic paths do not overlap and do not cause confusion when judging the next step at the diverging state.
Rules for Finding Isomorphic Path • R1: The first character of a pattern is never counted as part of an isomorphic path. • R2: For each converging state, there is only one corresponding diverging state. That is, for an isomorphic path, the only exit corresponds to all entrances. • R3: For one pattern, there may be many potential isomorphic paths shared with other patterns, but only those chosen to be combined are called isomorphic paths.
Rules for Finding Isomorphic Path • R4: For one pattern, there may be several isomorphic paths to be combined with other patterns. However, no two of them overlap. • R5: For one pattern, none of its isomorphic paths includes another one. • R6: Along an isomorphic path, there is no branch until the diverging state. • R7: Potential isomorphic paths can overlap and can be included in others. The algorithm for choosing isomorphic paths from the potential ones is discussed in the next section.
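Some of the rules above can be checked mechanically. The sketch below assumes that each chosen isomorphic path of a pattern is represented as a half-open index range `(start, end)` into that pattern; this representation and the function name `check_paths` are hypothetical. It checks only R1, R4, and R5. The other rules concern the structure of the automaton and are not covered.

```python
from itertools import combinations


def check_paths(paths):
    """Return a list of rule violations for the chosen paths of one pattern.

    paths: list of (start, end) half-open index ranges (assumed representation).
    """
    violations = []
    for start, end in paths:
        if start < 1:
            # R1: index 0 is the first character of the pattern.
            violations.append(("R1", (start, end)))
    for (s1, e1), (s2, e2) in combinations(sorted(paths), 2):
        if s2 < e1:  # the two ranges share at least one position
            # After sorting, s1 <= s2, so nesting means equal starts or e2 <= e1.
            nested = e2 <= e1 or s1 == s2
            violations.append(("R5" if nested else "R4", (s1, e1), (s2, e2)))
    return violations
```

A nested pair is reported under R5 only, although it also breaks R4, to keep the report short.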