Packet Classification using Extended TCAMs

Edward W. Spitznagel, Jonathan S. Turner, David E. Taylor. Supported by NSF ANI-9813723, DARPA N660001-01-1-8930.

Presentation Transcript


  1. Packet Classification using Extended TCAMs
     Edward W. Spitznagel, Jonathan S. Turner, David E. Taylor
     Supported by NSF ANI-9813723, DARPA N660001-01-1-8930

  2. Packet Classification Problem
     Filter table with columns Filter, Source Address, Destination Address, Source Port, Destination Port, Protocol, Action and four filters a, b, c, d. Cell values in extraction order: fwd 7 fwd 2 11xx 0101 1101 01xx 0010 01xx 101x xxxx - 2-4 3 3-15 - 3-15 * 0-15 UDP ICMP TCP * deny fwd 5
     • Suppose you are a firewall, or QoS router, or network monitor ...
     • You are given a list of rules (filters) to determine how to process incoming packets, based on the packet header fields
     • Some fields in the rules are specified with bit masks; others with ranges
     • Goal: when a packet arrives, find the first rule that matches the packet’s header fields

  3. Packet Classification Problem
     Same filter table as slide 2.
     • Example: packet arrives with header (0101, 0010, 3, 5, UDP)
       • classification result: filter b is matched
       • filter c also matches, but b occurs before c in the list
     • Easy to do when we have only a few rules; very difficult when we have 100,000 rules and packets arrive at 40 Gb/s

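  4. Geometric Representation
     • Filters with K fields can be represented geometrically in K dimensions
     • Example: three filters a, b, c over two fields. Table columns: Filter, Source Address, Source Port. Extracted values: Source Address 010, xx1, xxx; Source Port 2-3, 0-7, 7
     Figure: the three filters drawn as regions in a plane whose axes are Source Address and Source Port.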

  5. Related Work
     • TCAM-based parallel classification
       • CoolCAMs (Narlikar, Basu, Zane) for IP lookup
     • SRAM-based sequential classification
       • Recursive Flow Classification (Gupta, McKeown)
       • HiCuts (Gupta, McKeown)
       • Extended Grid of Tries (Baboescu, Singh, Varghese)
       • HyperCuts (Singh, Baboescu, Varghese, Wang)
     • SRAM: 6 transistors per bit (vs. 16 for TCAM), but the SRAM approaches use more bits per filter

  6. Ternary CAMs
     • Most popular practical approach to high-performance packet classification
     • Hardware compares query word (packet header) to all stored words (filters) in parallel
       • each bit of a stored word can be 0, 1, or X (don’t care)
     • Very fast, but not without drawbacks:
       • high power consumption limits scalability
       • inefficient representation of ranges

  7. Ternary CAM - Example
     Packet: Src. Addr. 1110, Dest. Addr. 0110. Query: 11100110
     Filters (Source Address, Destination Address): a is (11xx, xxxx); b and c are (0xxx, 01xx) and (xxxx, 0110)
     TCAM:
       Address   Contents    Result for query 11100110
       0         11xxxxxx    Match!
       1         0xxx01xx    Doesn’t Match
       2         xxxx0110    Match!
     Entry 0 (filter a) is the first matching filter
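
     The lookup in this example can be imitated in software. The following is only a minimal sketch: real TCAM hardware compares all entries in parallel, while this loop visits them in address order, and the names `ternary_match` and `tcam_lookup` are made up for illustration.

```python
def ternary_match(stored: str, query: str) -> bool:
    """True if every stored bit is 'x' (don't care) or equals the query bit."""
    return len(stored) == len(query) and all(
        s in ("x", q) for s, q in zip(stored, query)
    )


def tcam_lookup(entries, query):
    """Return the lowest address whose stored word matches, or None.

    Hardware checks every entry at once; iterating in address order
    gives the same answer because the first match wins.
    """
    for address, stored in enumerate(entries):
        if ternary_match(stored, query):
            return address
    return None


# TCAM contents from the slide, addresses 0, 1, 2.
entries = ["11xxxxxx", "0xxx01xx", "xxxx0110"]
query = "1110" + "0110"  # source address followed by destination address

matches = [ternary_match(e, query) for e in entries]
first = tcam_lookup(entries, query)
print(matches, first)  # [True, False, True] 0
```

     Entries 0 and 2 match and entry 0 is reported, which agrees with the slide.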

  8. Range Matching in TCAMs
     Filter F: Source Port 1-4, Destination Port 3-5
     • Convert ranges into sets of prefixes
       • 1-4 becomes 001, 01*, and 100
       • 3-5 becomes 011 and 10*
     Figure: F drawn as a rectangle in the Source Port / Destination Port plane.
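
     One common way to do this conversion is to repeatedly peel off the largest aligned power-of-two block that starts at the low end of the range. The sketch below assumes that approach; the slides do not say which method the authors used, and `range_to_prefixes` is a hypothetical helper name. It writes one `*` per don't-care bit, which coincides with the slide's notation for these 3-bit examples.

```python
def range_to_prefixes(lo: int, hi: int, width: int) -> list:
    """Split the inclusive range [lo, hi] into ternary prefixes of `width` bits."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block aligned at lo (lo == 0 is aligned to everything).
        size = lo & -lo if lo else 1 << width
        # Shrink the block until it no longer runs past hi.
        while size > hi - lo + 1:
            size >>= 1
        wild = size.bit_length() - 1  # number of trailing don't-care bits
        fixed = width - wild          # number of leading fixed bits
        bits = format(lo >> wild, "b").zfill(fixed) if fixed else ""
        prefixes.append(bits + "*" * wild)
        lo += size
    return prefixes


source_port = range_to_prefixes(1, 4, 3)
destination_port = range_to_prefixes(3, 5, 3)
print(source_port)       # ['001', '01*', '100']
print(destination_port)  # ['011', '10*']
```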

  9. Range Matching in TCAMs
     Filter F becomes six TCAM entries, a through f: one for each pairing of a Source Port prefix (001, 01*, 100) with a Destination Port prefix (011, 10*).
     • With two 16-bit range fields, a single rule could require up to 900 TCAM entries!
     • Typical case: entire filter set expands by a factor of 2 to 6
     Figure: the six entries drawn as rectangles in the Source Port / Destination Port plane.
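
     The figure of 900 follows from the cross product of the two prefix sets. As a rough check, the sketch below counts prefixes with the same aligned-block method as a standard range-to-prefix conversion (an assumption, not necessarily the authors' procedure) and uses the range 1 to 65534, a well-known worst case for a 16-bit field. `prefix_count` is an illustrative name.

```python
def prefix_count(lo: int, hi: int, width: int) -> int:
    """Number of prefixes needed to cover the inclusive range [lo, hi]."""
    count = 0
    while lo <= hi:
        size = lo & -lo if lo else 1 << width  # largest block aligned at lo
        while size > hi - lo + 1:              # must not run past hi
            size >>= 1
        count += 1
        lo += size
    return count


# Filter F from the previous slide: 3 prefixes x 2 prefixes.
entries_for_f = prefix_count(1, 4, 3) * prefix_count(3, 5, 3)

# Worst case for one 16-bit field, then for two such fields in one rule.
worst_single = prefix_count(1, 65534, 16)
worst_rule = worst_single * worst_single

print(entries_for_f, worst_single, worst_rule)  # 6 30 900
```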

  10. Extended TCAMs
     • Extend standard TCAM architecture to enable classification with larger rulesets
     • Partitioned TCAM, for reduced power
       • inspired by CoolCAMs
       • differences in indexing, search and partitioning algorithms
     • Support range matching directly in hardware

  11. Use of Partitioned TCAM
     • Main component of power use in TCAM search is proportional to number of entries searched
     • Partitioning the TCAM:
       • divide TCAM into blocks of entries
       • each block is enabled for search via an associated index filter

  12. Use of Partitioned TCAM
     • Example: suppose we are given the following filters:
       a. 1-13, 001x
       b. 2-3, 00xx
       c. 9-10, xxx1
       d. 11-14, 011x
       e. 12-13, 0xxx
       f. 0-14, 1010
       g. 7-7, 110x
       h. 0-5, 1110
       i. 1-2, 1x1x
       j. 13-14, 11xx
       k. 11-15, 111x
     Figure (index filters and the filter block each one enables, values as drawn on the slide):
       index filter 0-15, 0xxx: 1-13, 001x; 2-3, 00xx; 11-14, 011x; 12-12, 01xx
       index filter 0-6, 1xxx: 0-5, 1110; 1-2, 11xx
       index filter 7-15, 1xxx: 7-7, 110x; 13-14, 11xx; 11-15, 111x
       index filter 0-15, xxxx: 9-10, xxxx; 0-14, 1010
     A real Extended TCAM would have more blocks, and more filters per block.

  13. Use of Partitioned TCAM
     Figure: index filters and filter blocks as on slide 12.
     • Example: classify packet with header values (2, 1010)
       • index block: second and fourth filters match
       • search second and fourth filter blocks
       • find matching filters (1-2, 1x1x) and (0-14, 1010)
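
     The two-stage search can be sketched in a few lines. This is a software illustration only: the filter patterns come from the lettered list on slide 12, the assignment of filters to blocks follows the index table in the grouping slides, and the names `matches`, `index`, `blocks` and `classify` are invented for this sketch.

```python
def matches(rule, packet) -> bool:
    """A rule is ((lo, hi), ternary pattern); a packet is (value, bit string)."""
    (lo, hi), pattern = rule
    value, bits = packet
    in_range = lo <= value <= hi
    ternary_ok = all(p in ("x", b) for p, b in zip(pattern, bits))
    return in_range and ternary_ok


# One index filter per block, in block order.
index = [((0, 15), "0xxx"), ((0, 6), "1xxx"), ((7, 15), "1xxx"), ((0, 15), "xxxx")]

blocks = [
    {"a": ((1, 13), "001x"), "b": ((2, 3), "00xx"),
     "d": ((11, 14), "011x"), "e": ((12, 13), "0xxx")},
    {"h": ((0, 5), "1110"), "i": ((1, 2), "1x1x")},
    {"g": ((7, 7), "110x"), "j": ((13, 14), "11xx"), "k": ((11, 15), "111x")},
    {"c": ((9, 10), "xxx1"), "f": ((0, 14), "1010")},
]


def classify(packet):
    """Search the index first, then only the blocks whose index filter matched."""
    searched = [n for n, region in enumerate(index) if matches(region, packet)]
    found = [name for n in searched
             for name, rule in blocks[n].items() if matches(rule, packet)]
    return searched, found


searched, found = classify((2, "1010"))
print(searched, found)  # [1, 3] ['i', 'f']
```

     Blocks 1 and 3 (counting from 0) are the second and fourth blocks, and the filters found are i (1-2, 1x1x) and f (0-14, 1010), as on the slide. The first and third blocks are never searched, which is where the power saving comes from.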

  14. Use of Partitioned TCAM
     Figure: index filters and filter blocks as on slide 12.
     • The key to minimizing power consumption: organize filters so that only a few TCAM blocks must be searched to find the filters matching a packet.
     • Use a filter grouping algorithm

  15. Figure: filters a through k drawn as regions on a two-dimensional grid (axis ticks 0 to 14).
     Filters:
       a. 1-13, 001x
       b. 2-3, 00xx
       c. 9-10, xxxx
       d. 11-14, 011x
       e. 12-13, 0xxx
       f. 0-14, 1010
       g. 7-7, 110x
       h. 0-5, 1110
       i. 1-2, 11xx
       j. 13-14, 11xx
       k. 11-15, 111x
     Index entry and the filters it covers:
       0-15, 0xxx: a, b, d, e
     29 October 2014

  16. Figure: same grid and filter list; a, b, d, e no longer drawn (c, f, g, h, i, j, k remain).
     Index entry and the filters it covers:
       0-15, 0xxx: a, b, d, e
       0-6, 1xxx: h, i

  17. Figure: same grid and filter list; c, f, g, j, k remain.
     Index entry and the filters it covers:
       0-15, 0xxx: a, b, d, e
       0-6, 1xxx: h, i
       7-15, 1xxx: g, j, k

  18. Figure: same grid and filter list; c and f remain.
     Index entry and the filters it covers:
       0-15, 0xxx: a, b, d, e
       0-6, 1xxx: h, i
       7-15, 1xxx: g, j, k
     Next phase:
       0-15, xxxx: c, f

  19. Figure: same grid and filter list; no filters remain.
     Index entry and the filters it covers:
       0-15, 0xxx: a, b, d, e
       0-6, 1xxx: h, i
       7-15, 1xxx: g, j, k
     Next phase:
       0-15, xxxx: c, f

  20. Creating a set of partitions
     • At most k filters per region (k = block size)
     • Regions within the same partition do not overlap
     • Total number of regions equals the index size
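
     These conditions can be checked mechanically for the grouping built in the preceding slides. The sketch below is a checker, not the grouping algorithm itself. It assumes that each phase of the walkthrough yields one partition and that the block size is k = 4 (the largest block in the example); the helper names are hypothetical.

```python
from itertools import combinations


def contains(region, rule) -> bool:
    """True if the rule lies entirely inside the index region."""
    (rlo, rhi), rpat = region
    (lo, hi), pat = rule
    return rlo <= lo and hi <= rhi and all(r in ("x", p) for r, p in zip(rpat, pat))


def overlaps(r1, r2) -> bool:
    """True if some packet could match both regions."""
    (lo1, hi1), p1 = r1
    (lo2, hi2), p2 = r2
    ranges_meet = lo1 <= hi2 and lo2 <= hi1
    patterns_meet = all("x" in (a, b) or a == b for a, b in zip(p1, p2))
    return ranges_meet and patterns_meet


# Each partition maps an index region to the filters in its block.
partitions = [
    {  # first phase
        ((0, 15), "0xxx"): [((1, 13), "001x"), ((2, 3), "00xx"),
                            ((11, 14), "011x"), ((12, 13), "0xxx")],
        ((0, 6), "1xxx"): [((0, 5), "1110"), ((1, 2), "11xx")],
        ((7, 15), "1xxx"): [((7, 7), "110x"), ((13, 14), "11xx"), ((11, 15), "111x")],
    },
    {  # next phase
        ((0, 15), "xxxx"): [((9, 10), "xxxx"), ((0, 14), "1010")],
    },
]


def valid(partitions, k: int) -> bool:
    for partition in partitions:
        for region, filters in partition.items():
            if len(filters) > k or not all(contains(region, f) for f in filters):
                return False
        if any(overlaps(a, b) for a, b in combinations(partition, 2)):
            return False
    return True


index_size = sum(len(p) for p in partitions)  # total number of regions
ok = valid(partitions, k=4)
print(ok, index_size)  # True 4
```

     Because regions in one partition never overlap, a packet can enable at most one block per partition.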

  21. Range Matching
     • Store a pair of values (lo, hi) for each range match field
     • Range check circuitry compares query values against lo and hi to determine whether the query is in range
     • The transistor count per bit of a range field is twice that of an ordinary TCAM
     • But, for typical IPv4 applications, this results in just a 22% increase in overall transistor count
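
     In software terms, a range field reduces to two comparisons. The sketch below models only the matching behavior, not the circuit; `RangeField` and `ExtendedEntry` are illustrative names, and the entry holds the two port ranges of filter F from the earlier range-matching slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeField:
    lo: int
    hi: int

    def matches(self, value: int) -> bool:
        # The range check: the query must satisfy lo <= value <= hi.
        return self.lo <= value <= self.hi


@dataclass(frozen=True)
class ExtendedEntry:
    source_port: RangeField
    destination_port: RangeField

    def matches(self, src_port: int, dst_port: int) -> bool:
        return (self.source_port.matches(src_port)
                and self.destination_port.matches(dst_port))


# Filter F (Source Port 1-4, Destination Port 3-5) fits in a single entry.
filter_f = ExtendedEntry(RangeField(1, 4), RangeField(3, 5))
hit = filter_f.matches(2, 4)
miss = filter_f.matches(5, 4)
print(hit, miss)  # True False
```

     With prefix expansion the same filter needed six entries; stored as (lo, hi) pairs it needs one.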

  22. Performance Metrics
     • Power Fraction:
       $$\text{Power Fraction} = \frac{\text{index size} + (\#\text{ of partitions})(\text{block size})}{\text{number of filters}}$$
       • a measure of power usage, relative to a standard TCAM
       • smaller is better
     • Storage Efficiency:
       $$\text{Storage Efficiency} = \frac{\text{number of filters}}{\text{index size} + (\#\text{ of blocks})(\text{block size})}$$
       • higher is better; 1 is optimal
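
     As a worked instance, the two metrics can be evaluated for the 11-filter example from the grouping slides: 4 index entries, 4 blocks, 2 partitions. The block size of 4 is an assumption (the largest block in the example holds 4 filters), and the function names are made up for this sketch.

```python
def power_fraction(index_size: int, partitions: int, block_size: int, filters: int) -> float:
    """Entries searched per lookup, relative to searching every filter."""
    return (index_size + partitions * block_size) / filters


def storage_efficiency(index_size: int, blocks: int, block_size: int, filters: int) -> float:
    """Filters stored per TCAM entry provisioned (index plus all blocks)."""
    return filters / (index_size + blocks * block_size)


pf = power_fraction(index_size=4, partitions=2, block_size=4, filters=11)
se = storage_efficiency(index_size=4, blocks=4, block_size=4, filters=11)
print(round(pf, 3), round(se, 3))  # 1.091 0.55
```

     The power fraction comes out above 1 here only because the toy example is so small that the index and two blocks together exceed the 11 filters; the formula rewards partitioning once the filter count is large compared with the index and block sizes.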

  23. Different Block Sizes
     Figure panels: block size = 16, 32, 64, 128, 256

  24. Results: Power Fraction
     Figure: Basic Algorithm and Refined, with curves for block size = 32, 64, 128, 256

  25. Results: Storage Efficiency
     Figure: Basic Algorithm and Refined, with curves for block size = 32, 64, 128, 256

  26. Current/Future Work
     • Computational complexity of filter grouping problem
     • Filter updates (add/delete operations)
     • Multi-level indices
     • Different partitioning algorithms
     • Application to SRAM/DRAM-based classification techniques

  27. Summary
     • Packet classification is important for many advanced network services
     • TCAMs scale poorly due to power consumption and inefficient range match representations
     • Extended TCAMs solve these issues by using partitioned TCAM and hardware support for range matching
       • power consumption greatly reduced (typically to 5% or less of power used by a standard TCAM)
       • range match hardware: avoid inefficiency in representing ranges

  28. Questions?
