Parallel tree search: An algorithmic approach for multi-field packet classification

Authors: Derek Pao and Cutson Liu. Publisher: Computer Communications, 2007. Presenter: Chen-Yu Lin (林呈俞). Date: Oct. 25, 2007.


Presentation Transcript


  1. Parallel tree search: An algorithmic approach for multi-field packet classification. Authors: Derek Pao and Cutson Liu. Publisher: Computer Communications, 2007. Presenter: Chen-Yu Lin (林呈俞). Date: Oct. 25, 2007

  2. Outline • Introduction • Packet classification algorithm • Filter decomposition algorithm • Architecture of the search engine • Performance of the search engine

  3. Introduction • Adopt the two-stage classification approach: • 1. Determine the best matching source and destination prefix pair. • 2. Determine the highest priority matching filter by comparing the remaining fields against a short list of candidate rules in parallel. • Filter decomposition • Reduce the 2-D search problem to a pseudo 1-D problem. • The decomposed filters are organized as a height-balanced search tree. • This method is scalable to large filter databases and IPv6.

  4. Packet classification algorithm • 2-stage classification approach • 1. Determine the best matching source-destination prefix pair <ps, pd>. • 2. Search the list of filters associated with the selected prefix pair for the highest priority filter. • Convert the 2-D search problem to a 1-D problem. • The source axis is partitioned into disjoint segments, called bands. • The decomposed filters are organized as a height-balanced search tree.
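
The second stage can be illustrated with a minimal sketch. It assumes stage 1 has already produced the short candidate list for the selected prefix pair. The `Rule` fields, the port and protocol values, and the convention that a smaller `priority` number wins are all assumptions made for illustration; the slides do not specify them. The hardware compares the candidates in parallel, whereas this sketch scans them sequentially.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical candidate rule: only the fields left after prefix matching."""
    name: str
    priority: int                 # assumption: smaller number = higher priority
    src_ports: Tuple[int, int]    # inclusive (low, high) range
    dst_ports: Tuple[int, int]    # inclusive (low, high) range
    protocol: Optional[int]       # None acts as a wildcard


def matches(rule: Rule, src_port: int, dst_port: int, protocol: int) -> bool:
    """Check the remaining fields of one candidate rule."""
    return (rule.src_ports[0] <= src_port <= rule.src_ports[1]
            and rule.dst_ports[0] <= dst_port <= rule.dst_ports[1]
            and rule.protocol in (None, protocol))


def stage_two(candidates: Sequence[Rule], src_port: int, dst_port: int,
              protocol: int) -> Optional[Rule]:
    """Return the highest priority matching candidate, or None if none match."""
    hits = [r for r in candidates if matches(r, src_port, dst_port, protocol)]
    return min(hits, key=lambda r: r.priority, default=None)


# Illustrative candidate list for one prefix pair.
candidates = [
    Rule("web", 2, (0, 65535), (80, 80), 6),
    Rule("any-tcp", 5, (0, 65535), (0, 65535), 6),
    Rule("dns", 1, (0, 65535), (53, 53), 17),
]
best = stage_two(candidates, 40000, 80, 6)
```

Both "web" and "any-tcp" match the sample packet; "web" is returned because its priority number is smaller.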

  5. Packet classification algorithm • A decomposed filter is represented by a pair of end points. • Lower-left corner (LC) • Pad both the source and destination prefixes with 0s. • Lower-right corner (RC) • Pad the source prefix with 0s and the destination prefix with 1s. • End point definition • <ps, ls, pd, ld, type> (a 5-tuple) • ps (pd): extended source (destination) prefix. • ls (ld): original length of the source (destination) prefix. • type: type of end point. • Given a source-destination address pair, the search algorithm will • Determine the unique matching source band. • Find the best matching filter.
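
The padding rule for the two end points can be sketched as follows. This is a minimal illustration assuming 8-bit addresses and prefixes written as bit strings ending in `*`; the `EndPoint` class and the `pad` and `end_points` helpers are hypothetical names, not taken from the paper.

```python
from typing import NamedTuple, Tuple


class EndPoint(NamedTuple):
    """The 5-tuple <ps, ls, pd, ld, type> described above."""
    ps: str    # extended source prefix
    ls: int    # original source prefix length
    pd: str    # extended destination prefix
    ld: int    # original destination prefix length
    type: str  # "LC" or "RC"


def pad(prefix: str, width: int, bit: str) -> str:
    """Extend a prefix such as '0110*' to `width` bits using `bit`."""
    bits = prefix.rstrip("*")
    return bits + bit * (width - len(bits))


def end_points(src_prefix: str, dst_prefix: str,
               width: int = 8) -> Tuple[EndPoint, EndPoint]:
    """Return the (LC, RC) end points of a filter <src_prefix, dst_prefix>."""
    ls = len(src_prefix.rstrip("*"))
    ld = len(dst_prefix.rstrip("*"))
    ps = pad(src_prefix, width, "0")  # the source is padded with 0s in both points
    lc = EndPoint(ps, ls, pad(dst_prefix, width, "0"), ld, "LC")
    rc = EndPoint(ps, ls, pad(dst_prefix, width, "1"), ld, "RC")
    return lc, rc


lc, rc = end_points("0010*", "0110*")
```

For the filter <0010*, 0110*>, the destination parts of the two points are 01100000 and 01101111.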

  6. Packet classification algorithm (figure) Conflicting filters: <0010*, 0110*>, <00110*, 0110*>, <00111*, 0110*>. Destination end points shown: 01100000 and 01101111.

  7. Packet classification algorithm If the LC-point and RC-point of a filter are adjacent to each other in the sorted list, the pair is replaced by a single “merge” point whose value is equal to that of the LC-point.
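
The merge step can be sketched as a single pass over the sorted list. The tuple layout `(key, filter_id, kind)` and the function name are assumptions for illustration, and the list is assumed to be already sorted by key.

```python
from typing import List, Tuple

Point = Tuple[str, str, str]  # (key, filter_id, kind), kind is "LC", "RC" or "merge"


def merge_adjacent(points: List[Point]) -> List[Point]:
    """Replace each adjacent LC/RC pair of the same filter with one merge point."""
    out: List[Point] = []
    i = 0
    while i < len(points):
        current = points[i]
        nxt = points[i + 1] if i + 1 < len(points) else None
        if (nxt is not None and current[2] == "LC" and nxt[2] == "RC"
                and current[1] == nxt[1]):
            # The merge point takes the value of the LC-point.
            out.append((current[0], current[1], "merge"))
            i += 2
        else:
            out.append(current)
            i += 1
    return out


# Illustrative input: F2 lies inside F1, so only F2's points are adjacent.
points = [
    ("01000000", "F1", "LC"),
    ("01100000", "F2", "LC"),
    ("01101111", "F2", "RC"),
    ("01111111", "F1", "RC"),
]
merged = merge_adjacent(points)
```

Only F2 is merged here, because F1's two points are separated by the points of F2.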

  8. Packet classification algorithm • The composite key stored in a node is <ps, pd>, where the two prefixes have lengths ls and ld. • An input address pair (S, D) is compared with the composite key.

  9. Packet classification algorithm

  10. Packet classification algorithm • External node • An external node provides the reference to the best matching filter, if any, when the search operation falls off the tree. • The role of the external node is similar to the pre-computed best matching prefix in the range search approach.

  11. Packet classification algorithm

  12. Filter decomposition algorithm • Extract all non-wildcard source prefixes from the filter table and put them into a list. • A prefix whose length is less than the full address length is then replaced by its lower and higher end points.

  13. Filter decomposition algorithm
      List of end-points   Regions                Source bands
      00100000 L           00100000 ~ 00101111    0010*
      00110000 L           00110000 ~ 00110111    00110*
      00110111 H           00111000 ~ 00111111    00111*
      00111111 H
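
The way end points split the source axis into regions can be sketched as below. The input prefixes 001* and 00110* are inferred from the end-point list (they are not stated on the slide), the 8-bit address width is an assumption, and the function names are hypothetical. The sketch only computes the regions; it does not convert them back to prefix notation.

```python
from typing import List, Tuple


def prefix_range(prefix: str, width: int = 8) -> Tuple[int, int]:
    """Lower (L) and higher (H) end points of a prefix, as integers."""
    bits = prefix.rstrip("*")
    pad = width - len(bits)
    return int(bits + "0" * pad, 2), int(bits + "1" * pad, 2)


def source_regions(prefixes: List[str], width: int = 8) -> List[Tuple[str, str]]:
    """Split the covered part of the source axis into disjoint regions."""
    ranges = [prefix_range(p, width) for p in prefixes]
    # A region starts at every L point and just after every H point.
    starts = sorted({lo for lo, _ in ranges} | {hi + 1 for _, hi in ranges})
    regions = []
    for lo, next_start in zip(starts, starts[1:]):
        hi = next_start - 1
        # Keep only regions that lie inside at least one source prefix.
        if any(a <= lo and hi <= b for a, b in ranges):
            regions.append((format(lo, f"0{width}b"), format(hi, f"0{width}b")))
    return regions


regions = source_regions(["001*", "00110*"])
```

With these two prefixes the result is the three regions listed in the table above.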

  14. Architecture of the search engine • The search operation is sped up by pipelining. • Nodes of the search tree are stored in separate memory modules according to their level number. • A linear pipeline can achieve optimal throughput. • Drawback: incremental updates cannot be supported. • Rebalancing a search tree may require shifting up to 75% of the nodes. • Node width: 134 bits for IPv4, 330 bits for IPv6.

  15. Architecture of the search engine • Flag • If the left (right) flag is set, the left (right) pointer refers to the left (right) external node. • Otherwise, the left (right) pointer refers to the left (right) subtree. • The amount of memory needed to support a 256K-node search tree is around 4.2MB for IPv4 and 10.3MB for IPv6. • The depth of a height-balanced search tree with 256K nodes is no more than 19. • There are only 255 nodes in the first eight levels. • A tag is assigned to a task when it is submitted to the search engine.
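
The figures on this slide can be checked with a few lines of arithmetic. The sketch treats 134 and 330 bits as the per-node widths for IPv4 and IPv6 (an interpretation of the numbers on the previous slide) and reads MB as 2^20 bytes; both are assumptions. The level count is a consistency check for a tree packed as tightly as possible, not a derivation of the paper's bound.

```python
NODES = 256 * 1024  # 256K nodes


def memory_mb(node_bits: int, nodes: int = NODES) -> float:
    """Total node storage in MB, taking 1 MB = 2**20 bytes (assumption)."""
    return nodes * node_bits / 8 / 2**20


def min_levels(nodes: int) -> int:
    """Fewest levels of a binary tree that can hold `nodes` nodes.

    h levels hold at most 2**h - 1 nodes, so the answer is the bit length.
    """
    return nodes.bit_length()


def nodes_in_top_levels(levels: int) -> int:
    """Number of nodes in the first `levels` levels when they are all full."""
    return 2**levels - 1


ipv4_mb = memory_mb(134)          # 4.1875
ipv6_mb = memory_mb(330)          # 10.3125
levels = min_levels(NODES)        # 19
top8 = nodes_in_top_levels(8)     # 255
```

The results round to the 4.2MB and 10.3MB quoted above, and the 255 nodes of the first eight levels match the 255-entry fast buffer shown on the next slide.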

  16. Architecture of the search engine (figure labels) Comparator + fast buffer (SRAM): 255 entries. DRAM: 32K entries.

  17. Architecture of the search engine • Output buffers of the PUs • L1 and L2 are equipped with double output buffers. • Input buffers of the PUs • L1 has a single-slot buffer. • L2 is equipped with a multi-slot, dual-write-port buffer (FIFO). • Assume the size of the L2 PU input buffer is B slots. • If the number of concurrent tasks is larger than 2B+3, deadlock occurs. • To resolve the deadlock problem • Limit the number of tags to no more than 2B+3. • Implement a deflection routing scheme.
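
The tag-limiting remedy amounts to simple admission control, sketched below. The `TagPool` class and its methods are hypothetical names for illustration; the slides only state the 2B+3 limit, not how tags are managed.

```python
from typing import List, Optional


class TagPool:
    """Hands out at most 2B+3 tags, so at most 2B+3 tasks are in flight."""

    def __init__(self, buffer_slots: int) -> None:
        self.limit = 2 * buffer_slots + 3
        self._free: List[int] = list(range(self.limit))

    def acquire(self) -> Optional[int]:
        """Return a free tag, or None if the task must wait."""
        return self._free.pop() if self._free else None

    def release(self, tag: int) -> None:
        """Return a tag to the pool once its task has completed."""
        self._free.append(tag)


pool = TagPool(buffer_slots=32)                  # B = 32, so the limit is 67
tags = [pool.acquire() for _ in range(pool.limit)]
rejected = pool.acquire()                        # pool exhausted: None
pool.release(tags[0])
reissued = pool.acquire()                        # the released tag is reused
```

A new task is admitted only when `acquire` returns a tag, which keeps the number of concurrent tasks at or below 2B+3.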

  18. Performance of the search engine • Simulation • k and M = 8 • ACLX-5 with about 46K rules • Depth of search tree = 17 • L2 PU input buffer size = B

  19. Performance of the search engine • Best performance • Select a buffer size of B = 32. • Limit the number of tags to 2B+3 (67 for B = 32).
