
Intrusion Detection Processor with Packet Content Matching




  1. Intrusion Detection Processor with Packet Content Matching JC Ho ECE 594

  2. Topics • Background • Algorithm and Data Structure • Memory Architecture • Processor Design

  3. Background

  4. String Matching Algorithms • Boyer-Moore • Good for single-pattern • Wu-Manber • Best average-case performance • Aho-Corasick • O(n) worst-case performance
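The O(n) worst-case bound listed for Aho-Corasick follows from scanning the input once against an automaton built from all patterns. The following is a minimal textbook-style sketch, not code from the presentation; the names `build_automaton` and `search` are illustrative, and the dictionary-based tables stand in for the compressed node format discussed on later slides.

```python
from collections import deque

def build_automaton(patterns):
    """Build goto, failure, and output tables for a list of byte patterns."""
    goto = [{}]      # goto[state][byte] -> next state
    fail = [0]       # failure link for each state
    out = [set()]    # patterns recognized on entering each state
    for p in patterns:
        s = 0
        for b in p:
            if b not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[s][b] = len(goto) - 1
            s = goto[s][b]
        out[s].add(p)
    # Breadth-first pass: depth-1 states keep failure link 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for b, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and b not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(b, 0)
            out[t] |= out[fail[t]]
    return goto, fail, out

def search(data, automaton):
    """Return (start_offset, pattern) for every pattern occurrence in data."""
    goto, fail, out = automaton
    s = 0
    hits = []
    for i, b in enumerate(data):
        while s and b not in goto[s]:
            s = fail[s]          # follow failure links on a mismatch
        s = goto[s].get(b, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits
```

For example, `search(b"ushers", build_automaton([b"he", b"she", b"his", b"hers"]))` reports `she` at offset 1 and both `he` and `hers` at offset 2.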

  5. Data Structure for Aho-Corasick • Unoptimized • 1028 bytes per node, 53MB • Bitmap Compression • 41 bytes per node, 2.8MB • Path Compression • 20 bytes per node (average), 1.1MB • Data structure size is reduced without rules

  6. Adaptation • Aho-Corasick with Bitmap Compression • Separation of signature and rules database in different storage units • Smaller next node, failure, and rules pointers • 24 bits each • Result • 41 bytes per node • Same performance [Figure: node layout: 32-byte bitmap | next node pointer | failure pointer | rules pointer]
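The 41-byte figure is the sum of the 32-byte bitmap and three 24-bit (3-byte) pointers. A small sketch of that layout follows; the field order is taken from the slide's node diagram, while the big-endian pointer encoding and the function names `pack_node` and `unpack_node` are assumptions made for illustration.

```python
BITMAP_BYTES = 32                        # 256 bits, one per possible next byte
POINTER_BYTES = 3                        # 24-bit pointers
NODE_BYTES = BITMAP_BYTES + 3 * POINTER_BYTES   # 32 + 9 = 41

def pack_node(bitmap, next_ptr, fail_ptr, rules_ptr):
    """Serialize one node as bitmap | next | failure | rules."""
    if len(bitmap) != BITMAP_BYTES:
        raise ValueError("bitmap must be 32 bytes")
    pointers = (next_ptr, fail_ptr, rules_ptr)
    if any(not 0 <= p < (1 << 24) for p in pointers):
        raise ValueError("pointers must fit in 24 bits")
    # Big-endian byte order is an assumption; the slides do not specify it.
    return bytes(bitmap) + b"".join(p.to_bytes(POINTER_BYTES, "big") for p in pointers)

def unpack_node(raw):
    """Inverse of pack_node: return (bitmap, next, failure, rules)."""
    bitmap = raw[:BITMAP_BYTES]
    ptrs = [
        int.from_bytes(raw[BITMAP_BYTES + 3 * i: BITMAP_BYTES + 3 * (i + 1)], "big")
        for i in range(3)
    ]
    return (bitmap, ptrs[0], ptrs[1], ptrs[2])
```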

  7. Considerations • Complete or partial match [Figure: four packet cases labeled No match, Complete signature, Partial signature, Partial signature]

  8. Considerations—Cont. • Case 1: No match • Failure pointers eventually go to the root • Tag as safe

  9. Considerations—Cont. • Case 2: Complete signature • Easy to handle • Start from the beginning of packet • Failure pointers go back to the root • Mark root node visited • Beginning of signature eventually goes down the right path • Traverse entire path and tag as full match

  10. Considerations—Cont. • Case 3: Partial signature • Similar to Case 2 • Beginning of the signature eventually goes down the right path • Mark root node visited • When end of packet reached, tag as partial match

  11. Considerations—Cont. • Case 4: Partial signature • Very different from cases 2 and 3 • Needs to start from the middle of the data structure • Needs to find the first instance of the first byte in the data structure • Traverse the path of the signature to reach the leaf, mark as partial match since root is not visited

  12. Considerations—Cont. • Result • Case 4 can be the general case • Cases 1, 2, and 3 are special situations of case 4 • Start from the middle of the data structure every time for each packet • Cases 1, 2, and 3 will eventually be redirected back to the root and will operate as if they started from the root

  13. Memory Architecture • Guarantee worst-case performance • On-chip storage for data structure • Similar to cache design • Wide word reference • For ASIC design, memory references can use a node-addressable scheme to reduce pointer size further

  14. Memory Architecture—Cont. • Node size = Line width • 64 bytes in theory • 41 bytes in reality • Remaining bytes are not constructed [Figure: one memory line with byte offsets 0, 40, and 63 marked; Address 23:6]
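One way to read the "Address 23:6" label is that a 24-bit byte address selects a 64-byte line with its upper 18 bits, leaving the low 6 bits as the offset inside the line. The sketch below assumes that reading; the function names are hypothetical. Under the same assumption, a node-addressable scheme as mentioned on slide 13 would store only the 18-bit line index rather than the full byte address.

```python
ADDR_BITS = 24      # pointer width from slide 6
OFFSET_BITS = 6     # 2**6 = 64-byte line

def line_index(addr):
    """Bits 23:6 of a byte address: which 64-byte line (node) it falls in."""
    return (addr & ((1 << ADDR_BITS) - 1)) >> OFFSET_BITS

def node_address(index):
    """Byte address of the first byte of the node stored in line `index`."""
    return index << OFFSET_BITS
```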

  15. Processor Design • Preprocessing • Load Data • Effective Memory Address Resolution • Address Check • Signature Storage Unit Access • Bitmap Processing • Next Node Address Calculation • Data Check • Match Check • Next Round Preparation • Post-processing

  16. Processor Design—Preprocessing • Multiple packets are buffered • Contents are loaded to queues on-chip • Each byte of the content is accessed sequentially • Head and tail pointers required for enqueue and dequeue • Start and end pointers required to indicate start and end of packet

  17. Processor Design—Preprocessing Cont. • Packets are assumed to be independent • Data from the same packet always occupies the same queue • Number of queues is proportional to number of stages in data path • Size of queues can be inversely proportional to number of queues

  18. Processor Design—Core • Load data • A counter determines from which queue data is loaded • 1 byte is loaded from a different queue each cycle • No data dependency in the data path • Counter value is passed to the pipeline register along with data byte to keep track of queue
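The load-data stage can be pictured as a counter rotating over the on-chip queues, so that consecutive pipeline slots always carry bytes from different packets. The sketch below is a software model under that reading; `load_stage` is a hypothetical name, and using `None` to represent an empty queue is an assumption the slides do not address.

```python
from collections import deque

def load_stage(queues, cycles):
    """Return the (queue_id, byte) pair issued on each cycle."""
    issued = []
    counter = 0
    for _ in range(cycles):
        queue = queues[counter]
        # None models a pipeline bubble when the selected queue is empty.
        byte = queue.popleft() if queue else None
        # The counter value travels with the byte so later stages know the queue.
        issued.append((counter, byte))
        counter = (counter + 1) % len(queues)
    return issued
```

With two queues holding `b"ab"` and `b"xy"`, four cycles issue the bytes in the interleaved order a, x, b, y, each tagged with its queue number.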

  19. Processor Design—Core Cont. • Effective memory address resolution • Check start pointer to determine whether this is the starting byte of a packet • Starting byte of a packet • Use byte to index into a table to find the address of the first instance of this byte in data structure • Reset all flags associated with this queue • Not the starting byte • Use the next node address computed from previous byte
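The address-resolution step is a two-way choice, which the following sketch models. The 256-entry `first_instance` table, the `flags` dictionary, and the function name are illustrative assumptions; the slides state only that a table maps the starting byte to the first instance of that byte in the data structure.

```python
def effective_address(byte, is_start, first_instance, next_node_addr, flags):
    """Pick the node address to fetch for the current byte."""
    if is_start:
        # First byte of a packet: clear this queue's flags and enter the
        # data structure at the first node reached by this byte value.
        for name in flags:
            flags[name] = 0
        return first_instance[byte]
    # Otherwise continue from the address computed for the previous byte.
    return next_node_addr
```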

  20. Processor Design—Core Cont. • Address check • Determine if effective address is root node • Set root flag (RF)

  21. Processor Design—Core Cont. • Signature storage unit access • Bitmap loaded into 8 bitmap registers (BMR0-7), each 32 bits • Next node pointer loaded to next node register (NNR), 24 bits • Failure pointer loaded to failure register (FR) • Rules pointer loaded to rules register (RR)

  22. Processor Design—Core Cont. • Bitmap processing • 8 independent popcount units to count the 1’s in BMR0-7 • Bits 0-4 of current data byte are used to load a bit from each BMR • Bits 5-7 of current data byte are used to select the proper bit and load value of this BMR to PCR • Check if bit is 1 and set BMF (flag) value

  23. Processor Design—Core Cont. • Next node address calculation • If (BMF = 0) next node address = FR • If (BMF = 1) • Perform popcount on PCR to the proper bit (based on bits 0-4 of current byte) • Sum all popcount values up to proper bit • Next node address = (this sum * node size) + NNR • Use saturated add • Value is stored back to NNR
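The bitmap-processing and next-node steps together amount to a rank query on a 256-bit bitmap. The sketch below models them in software under several stated assumptions: bit 0 is the least significant bit of each register, only set bits strictly below the selected bit are counted (so the first child sits at NNR itself), the stride defaults to the 64-byte line width, and saturation clamps to the 24-bit address range. The function name is hypothetical.

```python
ADDR_MAX = (1 << 24) - 1     # assumed saturation limit for 24-bit addresses

def popcount(x):
    return bin(x).count("1")

def next_node_address(bmr, byte, nnr, fr, node_size=64):
    """Return (next_address, BMF) for the current byte.

    bmr is a list of eight 32-bit integers (BMR0-7).
    """
    select = byte >> 5        # bits 5-7 choose one of the 8 registers
    bit = byte & 0x1F         # bits 0-4 choose a bit inside a 32-bit register
    pcr = bmr[select]
    bmf = (pcr >> bit) & 1
    if bmf == 0:
        return fr, bmf        # no child for this byte: follow the failure pointer
    # Count the set bits in all lower registers, then below `bit` in the PCR.
    rank = sum(popcount(r) for r in bmr[:select])
    rank += popcount(pcr & ((1 << bit) - 1))
    return min(nnr + rank * node_size, ADDR_MAX), bmf
```

In hardware the eight popcounts run in parallel; the loop here only reproduces the arithmetic.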

  24. Processor Design—Core Cont. • Data check • Check end pointer to determine if current byte is end of packet • Set end flag (EF) • Check NNR value to determine if leaf node is reached • Set match flag (MF)

  25. Processor Design—Core Cont. • Match check • Case 2: If (RF = 1) and (MF = 1) • Set complete match flag (CMF) • Case 4: If (RF = 0) and (MF = 1) • Set partial match flag (PMF) • Case 3: If (EF = 1) and (current node != root node) and (NNR != FR) • Set PMF
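The three conditions on this slide translate directly into boolean logic. The sketch below is a literal transcription with a hypothetical function name; it adds no priority between the cases, so any interaction between them is left exactly as the slide states it.

```python
def match_check(rf, mf, ef, at_root, nnr, fr):
    """Return (CMF, PMF) from the root, match, and end flags."""
    # Case 2: the root was visited and a leaf was reached.
    cmf = int(rf == 1 and mf == 1)
    # Case 4: a leaf was reached without visiting the root.
    case4 = rf == 0 and mf == 1
    # Case 3: the packet ended partway down a signature path.
    case3 = ef == 1 and not at_root and nnr != fr
    pmf = int(case4 or case3)
    return cmf, pmf
```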

  26. Processor Design—Core Cont. • Next round preparation • Route NNR value back to load data stage • If (CMF = 1) • Set flush flag (FF) to signal to preprocessing unit to load new packet to this queue • If ignore flag (IF) is set • Ignore processing result • Reset CMF, PMF, EF

  27. Processor Design—Post-processing • If (CMF = 1) or (PMF = 1) • Use RR value to access rules database • Perform actions according to rule • If (EF = 1) and (CMF = 0) and (PMF = 0) • Release packet to router • If (FF = 1) • Set IF to invalidate subsequent data from this queue • Reset FF
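The post-processing decisions can be summarized as a small dispatch on the flags. In the sketch below the action names are placeholders invented for illustration; the slides do not describe the interface to the rules database or the router.

```python
def post_process(cmf, pmf, ef, ff, rr):
    """Return the list of actions implied by the flags for one result."""
    actions = []
    if cmf or pmf:
        # A complete or partial match: look up the rule that RR points to.
        actions.append(("lookup_rule", rr))
    if ef and not cmf and not pmf:
        # End of packet with no match: the packet is safe to forward.
        actions.append(("release_packet", None))
    if ff:
        # Flush requested: ignore the rest of this queue, then clear FF.
        actions.append(("set_ignore_flag", None))
        actions.append(("reset_flush_flag", None))
    return actions
```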

  28. Preliminary Results • 2MB signature storage unit • 3.6 ns access time using CACTI • Assume storage unit access is critical path • Translate to 250 MHz conservatively • Support up to 2Gbps
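The figures on this slide can be checked with a few lines of arithmetic, assuming the pipeline consumes one byte per clock cycle as described on slide 18. The function names are illustrative.

```python
def max_clock_mhz(access_ns):
    """Highest clock rate if one storage access must fit in one cycle."""
    return 1e3 / access_ns

def throughput_gbps(clock_mhz, bytes_per_cycle=1):
    """Sustained throughput at the given clock, in gigabits per second."""
    return clock_mhz * 1e6 * bytes_per_cycle * 8 / 1e9
```

A 3.6 ns access allows roughly 278 MHz, so 250 MHz leaves some margin, and 250 MHz at one byte per cycle gives 2 Gbps.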

  29. Conclusion • Algorithm is optimized for hardware implementation • Memory requirements can be met by current technology • Implementation is feasible
