LaFA: Lookahead Finite Automata for Scalable Regular Expression Detection
Masanori Bando, N. Sertac Artan, H. Jonathan Chao
PRESENTATION TO ANCS2009

Outline: Introduction, LaFA, Performance, Conclusion

Introduction
• Regular Expressions (RegEx) are widely used in NIDPS (Network Intrusion Detection and Prevention Systems)
• The number of RegEx signatures in NIDPS has been increasing
  • E.g., Snort RegEx (2006: 1131; 2009: 2290), http://www.snort.org
• RegEx detection needs to catch up with backbone network speeds
  • E.g., existing 10/40 Gbps, and 100 Gbps in the near future (100 Gbps Ethernet expected in 2009 or 2010)
• Needs a scalable high-speed RegEx detection system

Traditional RegEx Detection
• NFA
  • Pro: Compact
  • Con: Slow due to concurrent active states
• DFA
  • Pro: Fast
  • Con: Large memory
Example: RegEx1: abc; RegEx2: xy[^x]*z; input sequence: xyabc

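To make the NFA drawback concrete, here is a minimal sketch that simulates a hand-built NFA for the two example expressions (abc and xy[^x]*z) on the input xyabc and records which states are active after each character. The state names (`r1_0`, `r2_2`, and so on) and the unanchored-search convention are assumptions made for illustration; they are not taken from the slides.

```python
from typing import Callable, List, Set, Tuple

# A transition is (source state, predicate on the input character, destination state).
Transition = Tuple[str, Callable[[str], bool], str]


def build_example_nfa() -> Tuple[List[Transition], Set[str], Set[str]]:
    """Hand-built NFA for RegEx1 = abc and RegEx2 = xy[^x]*z (hypothetical state names)."""
    transitions: List[Transition] = [
        ("r1_0", lambda ch: ch == "a", "r1_1"),
        ("r1_1", lambda ch: ch == "b", "r1_2"),
        ("r1_2", lambda ch: ch == "c", "r1_acc"),
        ("r2_0", lambda ch: ch == "x", "r2_1"),
        ("r2_1", lambda ch: ch == "y", "r2_2"),
        ("r2_2", lambda ch: ch != "x", "r2_2"),   # self-loop for [^x]*
        ("r2_2", lambda ch: ch == "z", "r2_acc"),
    ]
    starts = {"r1_0", "r2_0"}
    accepts = {"r1_acc", "r2_acc"}
    return transitions, starts, accepts


def simulate(text: str) -> List[Tuple[str, List[str], List[str]]]:
    """Return (character, active states, accepting states) after each input character."""
    transitions, starts, accepts = build_example_nfa()
    active = set(starts)
    trace = []
    for ch in text:
        nxt = set(starts)  # unanchored search: start states stay active
        for src, pred, dst in transitions:
            if src in active and pred(ch):
                nxt.add(dst)
        active = nxt
        trace.append((ch, sorted(active), sorted(active & accepts)))
    return trace


for ch, active, matched in simulate("xyabc"):
    print(ch, active, matched)
```

After `xy` is consumed, the `[^x]*` self-loop keeps a RegEx2 state alive while RegEx1 advances through `abc`, so several states must be updated for every input character. That per-character work is the cost the NFA bullet refers to.
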
State-of-the-art in RegEx Detection
• Most research effort has focused on reducing DFA memory.
• We focus on reducing the concurrent active states in NFA.

LaFA achieves
• Efficient resource usage for better scalability
  • NFA-based finite automata
  • Optimized detection blocks
• High-speed inspection
  • Reduced simultaneous operations
  • Small memory usage => LaFA can be implemented with on-chip memory
• Flexible RegEx (rule) updates
  • Memory-based architecture
  • No hardware reconfiguration

1. Lookahead - Reduce concurrent states
R1: abc[a-z]op
R2: bc[a-z]qr
R3: bc[a-z]st
Input sequence: “abczop”
Original: 7 active states vs. Lookahead: 5 active states

2. Hardware Sharing
• Only one [a-z] detection is needed (for R1, R2 or R3)
  → [a-z] detection can be shared.
Input sequence: “abczop”

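The sharing idea can be sketched in a few lines: if each rule is decomposed by hand into a prefix string, a character class, and a suffix string, then grouping rules by character class shows how many distinct class detectors are actually needed. The `RULES` table and the function name are hypothetical and exist only for this illustration; the slides do not describe how the hardware performs the grouping.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical hand decomposition of the three slide rules into
# (prefix string, character class, suffix string).
RULES: Dict[str, Tuple[str, str, str]] = {
    "R1": ("abc", "[a-z]", "op"),
    "R2": ("bc", "[a-z]", "qr"),
    "R3": ("bc", "[a-z]", "st"),
}


def build_shared_detectors(rules: Dict[str, Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Map each distinct character class to the list of rules that use it."""
    shared: Dict[str, List[str]] = defaultdict(list)
    for name, (_prefix, char_class, _suffix) in rules.items():
        shared[char_class].append(name)
    return dict(shared)


detectors = build_shared_detectors(RULES)
print(detectors)  # {'[a-z]': ['R1', 'R2', 'R3']}
```

Three rules collapse onto a single `[a-z]` entry, which is the point of the slide: one detector instance can serve all of them.
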
3. String-based detection
• I. Character-based operation vs. II. Exact string-based operation
• Takes advantage of a previously proposed exact string detector [INFOCOM’07]

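As a rough illustration of string-based operation, the sketch below splits a RegEx into exact-string components and variable components, so that the exact strings could be handed to a string detector. It assumes a deliberately tiny RegEx subset (literal characters plus bracket classes with an optional `{n}`, `*`, `+` or `?` quantifier); the function name and the "string"/"variable" labels are illustrative choices, not terminology confirmed by the slides.

```python
import re
from typing import List, Tuple

# Matches one bracket class with an optional quantifier, e.g. [a-z], [^q]{3}, [^x]*.
_CLASS = re.compile(r"(\[[^\]]+\](?:\{\d+\}|[*+?])?)")


def split_components(regex: str) -> List[Tuple[str, str]]:
    """Split a RegEx from the supported subset into ("string" | "variable", text) parts."""
    parts: List[Tuple[str, str]] = []
    for piece in _CLASS.split(regex):
        if not piece:
            continue  # re.split yields empty strings around adjacent matches
        kind = "variable" if piece.startswith("[") else "string"
        parts.append((kind, piece))
    return parts


print(split_components("abc[a-z]op"))
print(split_components("abc[^q]{3}op"))
```

Escaped brackets, alternation, and grouping are outside this subset and would need a real RegEx parser.
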
Overview of the LaFA Architecture
• Event-based detection
• Separate
  • Detection Block
  • Correlation Block
• How about variable strings?

RegEx detection (Buffered Lookup 1)
RegEx: abc[a-z]op
Input: abc zop
Time:  123 456
S1: abc, S2: op
TLM: Time Lookup Module
[Figure: received characters are buffered by time slot (1-8); S1 and S2 are detected as events, and the TLM checks time 4 for [a-z].]
Result: RegEx 1 matched

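The Buffered Lookup 1 walkthrough can be imitated in software as follows. This is a minimal sketch under stated assumptions, not the hardware design: the exact strings "abc" (S1) and "op" (S2) are detected first, and only when S2 fires is the buffered character at the intermediate time slot looked up and checked against [a-z]. The helper names are hypothetical, and time slots are 1-based as on the slide.

```python
from typing import List


def find_ends(text: str, needle: str) -> List[int]:
    """Return the 1-based end time of every occurrence of needle in text."""
    ends = []
    start = text.find(needle)
    while start != -1:
        ends.append(start + len(needle))
        start = text.find(needle, start + 1)
    return ends


def match_abc_class_op(text: str) -> List[int]:
    """Match abc[a-z]op by detecting "abc" and "op" first, then looking up the gap."""
    s1_ends = set(find_ends(text, "abc"))   # S1 events (end times)
    matches = []
    for s2_end in find_ends(text, "op"):    # S2 events (end times)
        gap_time = s2_end - len("op")       # time slot of the single character before "op"
        if gap_time - 1 in s1_ends:         # S1 must end right before the gap
            ch = text[gap_time - 1]         # buffered lookup at that time slot
            if "a" <= ch <= "z":
                matches.append(s2_end)
    return matches


print(match_abc_class_op("abczop"))  # [6]
```

For the slide input, S1 ends at time 3, S2 ends at time 6, and the lookup at time 4 finds `z`, so the rule matches. The character class is evaluated once, on demand, instead of being tracked as an active state for every input character.
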
RegEx detection (Buffered Lookup 2)
RegEx: abc[^q]{3}op
Input: abc deq op
Time:  123 456 78
S1: abc, S2: op
CLM: Character Lookup Module
[Figure: the CLM keeps a timestamp per character (a, b, c, d, e, f, …, o, p, q, …) and is checked for [^q]{3} over time 4-6.]
Result: RegEx 2 not matched

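A comparable sketch for Buffered Lookup 2, assuming the CLM behaves like a table that stores the latest timestamp at which each character was seen. When S2 ("op") fires, the rule abc[^q]{3}op matches only if S1 ("abc") ended just before a three-slot gap and the latest timestamp of `q` does not fall inside that gap. The table layout and function name are assumptions for illustration; the slides only show that timestamps are kept per character.

```python
from typing import Dict, List


def match_abc_notq3_op(text: str) -> List[int]:
    """Match abc[^q]{3}op using a per-character latest-timestamp table (CLM-like)."""
    last_seen: Dict[str, int] = {}   # character -> latest 1-based timestamp
    s1_ends = set()                  # end times of "abc" (S1 events)
    matches = []
    for t, ch in enumerate(text, start=1):
        last_seen[ch] = t
        if text[max(0, t - 3):t] == "abc":
            s1_ends.add(t)
        if text[max(0, t - 2):t] == "op":          # S2 event ending at time t
            gap_start, gap_end = t - 4, t - 2      # the three time slots before "op"
            if gap_start - 1 in s1_ends:
                q_time = last_seen.get("q", 0)     # 0 means "q" was never seen
                if not (gap_start <= q_time <= gap_end):
                    matches.append(t)
    return matches


print(match_abc_notq3_op("abcdeqop"))  # []
print(match_abc_notq3_op("abcdefop"))  # [8]
```

Keeping only the latest timestamp is enough here because the two slots after the gap are occupied by `o` and `p`, so any `q` inside the gap is necessarily the most recent one. For the slide input, `q` was last seen at time 6, which lies in the 4-6 window, so the rule does not match.
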
LaFA Hardware

Number of Concurrent Operations
• Even for the largest rule set (SnortComb), the number of operations per event is at most nine.
• The data are taken from the rule sets (worst case is considered).

Memory Requirements
• Memory requirements are based on simulation results.
• Even for a large set of RegExes, LaFA needs little memory.
ESD: Exact String Detector

Comparison (Memory Requirements)
• Memory requirements increase almost linearly.

Speed
• Prototype on Xilinx Virtex-4
  • Operates at 250 MHz
  • Throughput: 2 Gbps per engine
• With multiple engines
  • Virtex-6 (36 Mbits Block RAM)
  • Expected to be 34 Gbps with more than 1000 RegExes.

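The speed figures are consistent with a simple back-of-the-envelope calculation, sketched below. It assumes each engine consumes one 8-bit character per clock cycle, which is not stated on the slide but reproduces the 2 Gbps per-engine number; the engine count for 34 Gbps is likewise an inference from dividing the two quoted throughputs, not a figure from the slides.

```python
import math


def per_engine_gbps(clock_hz: float, bits_per_cycle: int = 8) -> float:
    """Throughput of one engine in Gbps, assuming bits_per_cycle bits are consumed per clock."""
    return clock_hz * bits_per_cycle / 1e9


def engines_needed(target_gbps: float, engine_gbps: float) -> int:
    """Smallest number of parallel engines whose combined throughput reaches target_gbps."""
    return math.ceil(target_gbps / engine_gbps)


engine = per_engine_gbps(250e6)        # 250 MHz, one byte per cycle (assumption)
print(engine)                          # 2.0
print(engines_needed(34, engine))      # 17
```
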
Conclusion
• Today’s networks require scalable high-speed NIDPS with RegEx detection
• An NFA-based scalable RegEx detection architecture is proposed
• LaFA uses 3 techniques
  • State reordering (Lookahead FA)
  • Specially optimized hardware to detect variable strings
  • Event-driven RegEx detection
• Memory requirement one order of magnitude smaller than state-of-the-art RegEx detection approaches
• LaFA achieves competitive speed