
Scalable High-Performance Parallel Design for NIDS on Many-Core Processors

Haiyang Jiang, Gaogang Xie, Kave Salamatian and Laurent Mathy


Presentation Transcript


  1. Scalable High-Performance Parallel Design for NIDS on Many-Core Processors — Haiyang Jiang, Gaogang Xie, Kave Salamatian and Laurent Mathy

  2. Outline • Background & Motivation • Our Approach • Evaluation • Conclusion

  3. Network Intrusion Detection Systems • Signature-based NIDS is the de-facto standard • Deep Packet Inspection (DPI) is a crucial component of NIDS • DPI consumes 70%–80% of processing time

  4. Performance Challenges • Driven by growth in both traffic volume and ruleset size (Traffic ↑, Ruleset ↑)

  5. Many-core Processors • Going beyond the single-core processor • Offer powerful parallelism • [Chart: The Mother of All CPU Charts 2005/2006, Bert Töpelt, Daniel Schuhmann, Frank Völkel, Tom's Hardware Guide, Nov. 2005]

  6. The State of the Art • Many-core processor-based NIDS • Higher flexibility and lower cost • But lower performance than other solutions

  7. Limitations of Prior Art • Two kinds of parallel models for NIDS • Data parallelism • Advantages: thread isolation • Disadvantages: memory consumption, reference locality

  8. Limitations of Prior Art • Function parallelism • Advantages: fine-grained, reference locality • Disadvantages: stage contentions, message transfer among stages

  9. Parallel Design Issues • The communication contention bottleneck

  10. Features of Many-core Processors • Dozens of cores (TILE-Gx36 with 36 cores) • Accelerated hardware modules • mPIPE: packet-capturing engine • User Dynamic Network (UDN): on-chip communication network among cores • [Figure: example many-core processor, TILE-Gx36]

  11. Our Approach • Goals: high performance, flexibility, scalability, low cost • Two schemes: a hybrid parallel scheme and a hybrid load balancing scheme • [Chart: performance vs. flexibility — hardware designs are high-performance but inflexible, expensive and unscalable; software designs are flexible and inexpensive; the goal is to combine high performance, flexibility, scalability and low cost]

  12. Hybrid Parallel Scheme • Combination of the two models • Data parallelism among Packet Processing Modules (PPMs) • Function parallelism within each PPM
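
The hybrid model above can be sketched in a few lines of C: a flow hash data-partitions packets across identical PPMs, and inside a PPM each packet walks a fixed pipeline of function stages. All names, the PPM count, and the stage set here are illustrative, not the paper's code.

```c
#include <stdint.h>

#define N_PPMS 4  /* data parallelism: several identical PPMs (count is an example) */

/* Function parallelism: the stage pipeline inside one PPM. */
typedef enum { STAGE_CAPTURE, STAGE_PROTO, STAGE_DETECT, STAGE_DONE } stage_t;

/* Data parallelism: a flow hash picks one of the identical PPMs,
   so all packets of a flow land in the same PPM. */
static unsigned pick_ppm(uint32_t flow_hash) {
    return flow_hash % N_PPMS;
}

/* Function parallelism: inside the chosen PPM, a packet advances
   through the stages; each stage runs on its own thread(s). */
static stage_t next_stage(stage_t s) {
    switch (s) {
    case STAGE_CAPTURE: return STAGE_PROTO;
    case STAGE_PROTO:   return STAGE_DETECT;
    default:            return STAGE_DONE;
    }
}
```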

  13. Hybrid Parallel Scheme • Shared resource among PPMs: the message (MSG) pool

  14. MSG Pool Contentions • Caused by the lock on the MSG pool • Exploit mPIPE to access the MSG pool in parallel • Each packet has an individual MSG structure • The lock on the MSG pool is eliminated, as each raw packet has its own corresponding MSG
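
A minimal sketch of the lock elimination idea: if every raw-packet buffer is paired 1:1 with a slot in a preallocated MSG array, no two packets ever contend for the same MSG and no lock is needed. The structure fields, pool size, and `msg_for_packet` helper are assumptions for illustration, not the paper's data layout.

```c
#include <stdint.h>

#define POOL_SIZE 4096  /* one MSG slot per raw-packet buffer (size is an example) */

typedef struct {
    uint32_t pkt_id;   /* index of the paired raw-packet buffer */
    uint16_t proto;    /* filled in by protocol-processing threads */
    uint8_t  verdict;  /* set by a detection engine */
} msg_t;

static msg_t msg_pool[POOL_SIZE];

/* Packet buffer i owns msg_pool[i]: allocation is a plain index lookup,
   so the shared pool needs no lock. (In a real system pkt_id would be
   the hardware buffer index, already < POOL_SIZE.) */
static msg_t *msg_for_packet(uint32_t pkt_id) {
    msg_t *m = &msg_pool[pkt_id % POOL_SIZE];
    m->pkt_id = pkt_id;
    return m;
}
```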

  15. MSG Propagation Contentions • Caused by MSG propagation among stages • Exploit the UDN to transfer MSGs • Higher bandwidth and lower latency
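
The UDN gives cores low-latency hardware channels for passing small messages. As a portable stand-in, the sketch below moves MSG pointers between two pipeline stages over a bounded single-producer/single-consumer ring; this is an assumed software analogue, not the Tilera UDN API.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 256  /* power of two; capacity is an example */

/* One ring per producer/consumer stage pair (SPSC). */
typedef struct {
    void *slot[RING_SLOTS];
    atomic_size_t head;  /* written by the producer stage */
    atomic_size_t tail;  /* written by the consumer stage */
} msg_ring_t;

/* Producer stage: hand a MSG pointer to the next stage. */
static bool ring_push(msg_ring_t *r, void *msg) {
    size_t h = atomic_load(&r->head);
    size_t t = atomic_load(&r->tail);
    if (h - t == RING_SLOTS) return false;      /* ring full */
    r->slot[h % RING_SLOTS] = msg;
    atomic_store(&r->head, h + 1);
    return true;
}

/* Consumer stage: take the next MSG, or NULL if none is pending. */
static void *ring_pop(msg_ring_t *r) {
    size_t t = atomic_load(&r->tail);
    if (t == atomic_load(&r->head)) return NULL; /* ring empty */
    void *msg = r->slot[t % RING_SLOTS];
    atomic_store(&r->tail, t + 1);
    return msg;
}
```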

  16. Hybrid Load Balancing Scheme • First level, PPMs: flow-based hashing for load balancing in mPIPE • Second level, protocol-processing threads: flow-based hashing for load balancing in the pipeline • Third level, detection-engine threads: rule partition balancing (RPB)
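
Flow-based hashing at the first two levels can be sketched as hashing the 5-tuple and taking it modulo the worker count, so every packet of a flow reaches the same PPM and the same protocol-processing thread. The FNV-style mixing below is a generic example, not mPIPE's actual hash function.

```c
#include <stdint.h>

typedef struct {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
} flow_key_t;

/* FNV-1a style mixing over the 5-tuple (an illustrative choice). */
static uint32_t flow_hash(const flow_key_t *k) {
    uint32_t h = 2166136261u;
    h = (h ^ k->src_ip)   * 16777619u;
    h = (h ^ k->dst_ip)   * 16777619u;
    h = (h ^ k->src_port) * 16777619u;
    h = (h ^ k->dst_port) * 16777619u;
    h = (h ^ k->proto)    * 16777619u;
    return h;
}

/* Level 1: pick a PPM; level 2: pick a protocol-processing thread.
   Same flow -> same hash -> same worker, preserving flow affinity. */
static unsigned pick_worker(const flow_key_t *k, unsigned n_workers) {
    return flow_hash(k) % n_workers;
}
```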

  17. Rule Partition Balancing (RPB) • Each engine works on a sub-ruleset • Offline partitioning • Small detection engines • Packet skipping: if one engine finds an intrusion in a packet, the other engines can skip it • See our paper for details
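
A toy sketch of RPB with packet skipping: each engine scans only its sub-ruleset, and once any engine flags a packet the others stop evaluating it. The per-packet atomic flag, literal-substring "rules", and all names are assumptions for illustration, not the paper's detection engine.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *pattern;  /* toy signature: a literal substring */
} rule_t;

typedef struct {
    const char *payload;
    atomic_bool matched;  /* set by whichever engine matches first */
} packet_t;

/* One detection engine: scans only its own sub-ruleset, and skips the
   packet if another engine already matched it. Returns how many rules
   it actually evaluated. */
static size_t engine_scan(packet_t *p, const rule_t *sub, size_t n) {
    size_t evaluated = 0;
    for (size_t i = 0; i < n; i++) {
        if (atomic_load(&p->matched))
            break;  /* packet skipping: another engine found an intrusion */
        evaluated++;
        if (strstr(p->payload, sub[i].pattern)) {
            atomic_store(&p->matched, true);
            break;
        }
    }
    return evaluated;
}
```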

  18. Optimal Thread Allocation per PPM • 1.5 Mpps with 9 cores • 1 packet-capture thread • 2 protocol-processing threads • 6 detection-engine threads

  19. Outline • Background & Motivation • Our Approach • Evaluation • Conclusion

  20. Evaluation Platform • TILE-Gx36 processor: 36 cores at 1.2 GHz • Suricata (open-source NIDS) implementation • Snort ruleset: 7,571 rules • Synthetic traffic generator

  21. Throughput (9 cores per PPM, 4 PPMs) • 7.2 Gbps with 100-byte packets

  22. Comparison

  23. Throughput per Cost • 17.40 Mbps/$ • 8× that of MIDeA • 3× that of Kargus

  24. Conclusion • Two parallel designs: hybrid parallel scheme, hybrid load balancing scheme • NIDS evaluation on the TILE-Gx36 • High throughput per dollar of cost

  25. Thank you!
