Deep Packet Inspection: Where are We? CCW’08
Michela Becchi

Assumption
• The packet payload is not encrypted and can therefore be inspected

Background: Rule-set complexity
• Practical rule-sets:
  • Snort, as of November 2007
    • 8536 rules, 5549 Perl Compatible Regular Expressions
    • 99% with character ranges ([c1-ck], \s, \w, …)
    • 16.3% with dot-star terms (.*, [^c1..ck]*)
    • 44% with counting constraints (.{n,m}, [^c1..ck]{n,m})
• Rule-set proposals:
  • [R. Sommer and V. Paxson, CCS 2003]
  • [J. Newsome et al., Security and Privacy Symposium 2005]
  • [Y. Xie et al., SIGCOMM 2008]
• Deep packet inspection = regular expression matching at line rate
• Finite automata based techniques

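The three feature classes counted above (character ranges, dot-star terms, counting constraints) can be detected with a few heuristic patterns. The following is a minimal sketch, not the tool used to produce the Snort statistics: the detector expressions, the function names `classify` and `summarize`, and the sample patterns in the usage note are all assumptions made for illustration, and the detectors do not handle every PCRE escape.

```python
import re

# Heuristic detectors for the three feature classes named on the slide.
# Assumption: these are approximations, not a full PCRE parser
# (e.g. an escaped "\." followed by "*" is still counted as dot-star).
CHAR_RANGE = re.compile(r"\[[^\]]+\]|\\[swdSWD]")
DOT_STAR = re.compile(r"\.\*|\[\^[^\]]+\]\*")
COUNTING = re.compile(r"(?:\.|\[[^\]]+\])\{\d+(?:,\d*)?\}")


def classify(pattern):
    """Return the set of feature classes found in one regex string."""
    features = set()
    if CHAR_RANGE.search(pattern):
        features.add("char_range")
    if DOT_STAR.search(pattern):
        features.add("dot_star")
    if COUNTING.search(pattern):
        features.add("counting")
    return features


def summarize(patterns):
    """Percentage of patterns that contain each feature class."""
    counts = {"char_range": 0, "dot_star": 0, "counting": 0}
    for p in patterns:
        for f in classify(p):
            counts[f] += 1
    total = len(patterns)
    if total == 0:
        return {}
    return {f: 100.0 * n / total for f, n in counts.items()}
```

For example, `classify(r"cmd[^\n]{100,200}")` reports both a character range and a counting constraint, while `classify(r"abc.*def")` reports only a dot-star term.
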
Target Architectures
• Regex-matching engine
  • Memory-centric architectures
    • General-purpose processors
    • Network processors
    • FPGA / ASIC + memory
  • FPGA logic
• Diagram axis: available parallelism

Challenges
• DFA on memory-centric architectures (general-purpose processors, network processors, FPGA / ASIC + memory)
  • Memory space
  • Memory bandwidth
• NFA on FPGA logic
  • Logic cell utilization
  • Clock frequency

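The memory-space challenge for DFAs can be illustrated with a toy pattern that combines a dot-star term with a counting constraint. The sketch below is an assumption-laden example rather than anything from the talk: it uses a two-letter alphabet, the pattern `.*a.{n}`, and hypothetical helper names (`build_nfa`, `subset_construction`, and so on). It shows that the NFA stays small but keeps several states active at once, while the equivalent DFA needs one table lookup per character but far more states.

```python
def build_nfa(n):
    """NFA for the unanchored pattern .*a.{n} over the toy alphabet {a, b}.

    State 0 loops on every character, 'a' also moves 0 -> 1, and states
    1..n advance on any character. State n + 1 is accepting.
    """
    alphabet = "ab"
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for s in range(1, n + 1):
        for c in alphabet:
            delta[(s, c)] = {s + 1}
    return alphabet, delta, 0, n + 1


def nfa_step(delta, active, c):
    """Advance every active NFA state on character c."""
    nxt = set()
    for s in active:
        nxt |= delta.get((s, c), set())
    return frozenset(nxt)


def subset_construction(alphabet, delta, start):
    """Build the DFA whose states are the reachable sets of NFA states."""
    dfa = {}
    todo = [frozenset({start})]
    while todo:
        cur = todo.pop()
        if cur in dfa:
            continue
        dfa[cur] = {}
        for c in alphabet:
            nxt = nfa_step(delta, cur, c)
            dfa[cur][c] = nxt
            if nxt not in dfa:
                todo.append(nxt)
    return dfa


def nfa_match_positions(delta, start, accept, text):
    """Return (match end positions, peak number of active NFA states)."""
    active = frozenset({start})
    hits, peak = [], 1
    for i, c in enumerate(text):
        active = nfa_step(delta, active, c)
        peak = max(peak, len(active))
        if accept in active:
            hits.append(i)
    return hits, peak


def dfa_match_positions(dfa, start, accept, text):
    """Same matches as the NFA, with one table lookup per character."""
    state = frozenset({start})
    hits = []
    for i, c in enumerate(text):
        state = dfa[state][c]
        if accept in state:
            hits.append(i)
    return hits
```

For this pattern the NFA has n + 2 states and the DFA built by `subset_construction` has 2^(n+1) states, so doubling the count roughly squares the DFA size. Both matchers report the same positions on any input over the toy alphabet.
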
Directions for DFA-based solutions
• DFA on memory-centric architectures (general-purpose processors, network processors, FPGA / ASIC + memory)
• Generality
• Covered regex classes
• Automatability
• Suitable memory architecture
• Average and worst case bound
• Compression + encoding
  • Default transitions (D2FA)
  • Alphabet reduction
• State explosion
  • Multiple-DFA
  • Hybrid-FA
  • History-based-FA
  • XFA

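Of the compression techniques listed, alphabet reduction is the simplest to sketch: input symbols whose transition-table columns are identical in every state are merged into one class, and the table is indexed by class instead of by raw byte. The code below is a minimal illustration under stated assumptions, not the implementation from the talk. The table layout (`table[state][symbol]`), the function names, and the three-state example DFA for the keyword "ab" are all hypothetical.

```python
def reduce_alphabet(table):
    """Merge symbols with identical columns in table[state][symbol].

    Returns (symbol_to_class, reduced_table) such that
    reduced_table[state][symbol_to_class[symbol]] == table[state][symbol].
    """
    num_symbols = len(table[0])
    class_of_column = {}   # column contents -> class id
    symbol_to_class = []
    for sym in range(num_symbols):
        column = tuple(row[sym] for row in table)
        cls = class_of_column.setdefault(column, len(class_of_column))
        symbol_to_class.append(cls)
    num_classes = len(class_of_column)
    reduced = [[None] * num_classes for _ in table]
    for sym, cls in enumerate(symbol_to_class):
        for s, row in enumerate(table):
            reduced[s][cls] = row[sym]
    return symbol_to_class, reduced


def build_ab_dfa():
    """Hypothetical 3-state DFA over bytes that detects the keyword "ab".

    State 0: nothing matched, state 1: just saw 'a', state 2: just saw "ab".
    """
    table = [[0] * 256 for _ in range(3)]
    for s in range(3):
        table[s][ord("a")] = 1
    table[1][ord("b")] = 2
    return table


def step(symbol_to_class, reduced, state, byte):
    """One transition: translate the byte to its class, then look it up."""
    return reduced[state][symbol_to_class[byte]]
```

In this example the 3 x 256 table shrinks to 3 x 3 entries plus a 256-entry translation table, because every byte other than 'a' and 'b' behaves identically in all states. The price is one extra lookup per input character.
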
Multiple Flow Handling
• Can we aggregate throughput over multiple flows?
• Can we face denial of service attacks based on multiple flows?
• NFA on FPGA logic
  • Peak performance on single flow
  • No intrinsic multiple-flow support
• Memory-centric architectures (general-purpose processors, network processors, FPGA / ASIC + memory): Multiple-DFA, Hybrid-FA, History-based-FA, XFA
  • Amount of per-flow state
    • Active states
    • Counters
    • History bits
    • …

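The per-flow state mentioned above can be made concrete with a small flow table: when packets from different flows are interleaved, the matcher must save the automaton state of each flow and restore it when that flow's next packet arrives. The sketch below assumes a single DFA for one literal keyword, in-order packet delivery, and hypothetical names (`build_keyword_dfa`, `FlowMatcher`, the flow identifiers). It is an illustration of the idea, not the scheme evaluated in the talk.

```python
import string


def build_keyword_dfa(keyword, alphabet):
    """DFA that is in state len(keyword) whenever the input ends with keyword.

    Assumption: every character of keyword is in alphabet. Characters
    outside the alphabet are treated as a reset to state 0 by the matcher.
    """
    n = len(keyword)
    table = []
    for state in range(n + 1):
        row = {}
        prefix = keyword[:state]
        for c in alphabet:
            candidate = prefix + c
            # Longest suffix of candidate that is also a prefix of keyword.
            k = min(n, len(candidate))
            while k > 0 and candidate[-k:] != keyword[:k]:
                k -= 1
            row[c] = k
        table.append(row)
    return table


class FlowMatcher:
    """Keeps one DFA state per flow so matches can span packet boundaries."""

    def __init__(self, table, accept):
        self.table = table
        self.accept = accept
        self.flow_state = {}   # flow id -> saved DFA state (per-flow state)

    def process(self, flow_id, payload):
        """Scan one packet payload; return True if the keyword was seen."""
        state = self.flow_state.get(flow_id, 0)
        matched = False
        for c in payload:
            state = self.table[state].get(c, 0)
            if state == self.accept:
                matched = True
        self.flow_state[flow_id] = state
        return matched
```

With the keyword "evil", a flow that sends "xxev" and then "il!!" is reported on its second packet even if another flow's packet arrives in between, because the saved state (2, meaning "ev" seen) is restored from the table. Here the per-flow state is a single integer; with several DFAs run side by side, each flow would need one saved state per DFA, which is why the amount of per-flow state matters when many flows are active.
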
Some Results
• About 500 complex regexes from Snort NIDS
• FPGA logic (NFA) – Xilinx Virtex 5 – 1 flow
  • 6.1 Gbps, using 3,371 slices (46% utilization on XC5VLX50)
  • Note: XC5VLX330 device has 51,840 slices
• FPGA/ASIC + memory (projected)
  • Multiple-DFA: 13 DFAs, < 1 MB footprint each
  • 2 Gbps on single flow assuming SRAM @ 500 MHz
• NP: Intel IXP2800
  • 5 MicroEngines @ 1.4 GHz, 5 flows
  • 1 KB scratchpad, 5 MB SRAM, 128 MB DRAM
  • Multiple-DFA (13 DFAs): 20-25 Mbps
  • Hybrid-FA: 95-100 Mbps

Discussion
• FPGAs offer an easier way to support large data-sets of complex regular expressions
• On memory-based architectures
  • high parallelism, large memory bandwidth and low memory latency are necessary to guarantee high throughput
  • complex rule-sets bring data-structure/algorithmic challenges
• Multiple flow support is necessary
• Finite state machines' performance bottleneck:
  • One input character processed at each iteration
  • Open question: less complex patterns allowing tokenizers
• Payload encryption
  • Anomaly detection and probabilistic based methods
  • Deep packet inspection still available as filtering/classification tool within private networks

Code available at: http://regex.wustl.edu