Address Lookup and Classification. EE384Y May 25, 2006. Pankaj Gupta Principal Architect and Member of Technical Staff, Netlogic Microsystems pankaj@netlogicmicro.com http://klamath.stanford.edu/~pankaj. Outline. Routing Lookups Packet Classification Motivation and problem definition
Address Lookup and Classification EE384Y May 25, 2006 Pankaj Gupta Principal Architect and Member of Technical Staff, Netlogic Microsystems pankaj@netlogicmicro.com http://klamath.stanford.edu/~pankaj
Outline • Routing Lookups • Packet Classification • Motivation and problem definition • Classification algorithms • Linear search • Associative search (TCAM) • Trie-based techniques • Crossproducting • Tradeoffs in classification • Heuristic algorithms • References
Motivation: Desire for Additional Services • [Figure: network topology with nodes E1, X, Y and Z, providers ISP1, ISP2 and ISP3, and a NAP] • Other examples: Accounting & billing, rate-limiting, etc.
Special Processing Requires Identification of Flows • All packets of a flow obey a pre-defined rule and are processed similarly by the router • E.g. a flow = (src-IP-address, dst-IP-address), or a flow = (dst-IP-prefix, protocol) etc. • Router needs to identify the flow of every incoming packet and then perform appropriate special processing based on negotiated service agreements Classification Rules or policies (aka ACL entries, filters)
Flow-aware Router: Basic Architectural Components • Control: Routing, resource reservation, admission control, SLAs • Datapath (per-packet processing): Packet classification, routing lookup, special processing, switching, scheduling
Multi-field Packet Classification • Header fields: L3-DA, L3-SA, L4-PROT • Example: packet (5.168.3.32, 152.133.171.71, …, TCP) • Packet Classification: Find the action associated with the highest priority rule matching an incoming packet header.
Formal Problem Definition • Given a classifier C with N rules, Rj, 1 ≤ j ≤ N, where Rj consists of three entities: • A regular expression Rj[i], 1 ≤ i ≤ d, on each of the d header fields, • A number, pri(Rj), indicating the priority of the rule in the classifier, and • An action, referred to as action(Rj). For an incoming packet P with the header considered as a d-tuple of points (P1, P2, …, Pd), the d-dimensional packet classification problem is to find the rule Rm with the highest priority among all the rules Rj matching the d-tuple; i.e., pri(Rm) > pri(Rj) for all j ≠ m, 1 ≤ j ≤ N, such that Pi matches Rj[i] for all i, 1 ≤ i ≤ d. We call rule Rm the best matching rule for packet P.
Routing Lookup: Instance of 1D Classification • One-dimension (destination address) • Forwarding table = classifier • Routing table entry = rule • Outgoing interface = action • Prefix-length = priority
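The mapping above can be sketched in a few lines of Python. This is only an illustration: the table contents, the interface names (`if0`, `if1`, `if2`) and the function name `lpm_lookup` are hypothetical, and a real router would use a trie or TCAM rather than a scan.

```python
import ipaddress

# Hypothetical forwarding table (the classifier): each entry is a rule
# (destination prefix) with an action (outgoing interface).
FORWARDING_TABLE = [
    ("0.0.0.0/0", "if0"),
    ("152.133.0.0/16", "if1"),
    ("152.133.171.0/24", "if2"),
]

def lpm_lookup(table, dst):
    """Return the action of the matching rule with the longest prefix.

    Prefix length plays the role of rule priority.
    """
    addr = ipaddress.ip_address(dst)
    best_len, best_action = -1, None
    for prefix, action in table:
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best_action = net.prefixlen, action
    return best_action
```

With this table, a lookup of 152.133.171.71 selects the /24 entry over the /16 and the default route.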
Geometric Interpretation • Packet classification problem: Find the highest priority rectangle containing an incoming point • [Figure: rules R1-R7 drawn as rectangles with Dimension 1 and Dimension 2 as axes, and packets P1 and P2 as points; example rules: (128.16.46.23, *) and (144.24/24, 64/16)]
Metrics for Classification Algorithms • Speed • Storage requirements • Ability to handle large classifiers • Low preprocessing time • Update time • Scalability in the number of header fields • Flexibility in rule specification
Size/Update-rate of Classifier? • Micro-flow recognition • 128K-1M flows in a metro/edge router • Also requires high update rate (but have few wildcards) • Firewall applications • <2K rules per interface • Requires low update rate (usually configured at start-up/boot-up time) • Depends heavily on the type of router
Linear Search • Keep rules in a linked list • O(N) storage, O(N) lookup time, O(1) update complexity
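A minimal sketch of linear search over a priority-ordered rule list, assuming three fields (destination prefix, source prefix, protocol). The `Rule` type, the sample rules and the action strings are hypothetical and are not taken from the lecture.

```python
import ipaddress
from typing import NamedTuple, Optional

class Rule(NamedTuple):
    dst: str               # destination prefix, e.g. "152.133.0.0/16"
    src: str               # source prefix
    proto: Optional[str]   # None acts as a wildcard
    action: str

# Hypothetical classifier, stored highest priority first.
RULES = [
    Rule("152.133.171.0/24", "0.0.0.0/0", "tcp", "deny"),
    Rule("152.133.0.0/16", "0.0.0.0/0", None, "permit"),
]

def matches(rule, dst, src, proto):
    """True if the packet header matches the rule in every field."""
    return (ipaddress.ip_address(dst) in ipaddress.ip_network(rule.dst)
            and ipaddress.ip_address(src) in ipaddress.ip_network(rule.src)
            and rule.proto in (None, proto))

def classify(rules, dst, src, proto):
    """O(N) scan: the first match is the highest priority match."""
    for rule in rules:
        if matches(rule, dst, src, proto):
            return rule.action
    return None
```

Because the list is kept in priority order, the scan can stop at the first match. Note that the O(1) update quoted above fits a list where position does not encode priority; keeping the list sorted, as this sketch does, makes insertion cost more.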
Ternary Match Operation • Each TCAM entry stores a value, V, and mask, M • Hence, two bits (Vi and Mi) for each bit position i (i=1..W) • For an incoming packet header, H = {Hi}, the TCAM entry outputs a match if Hi matches Vi in each bit position for which Mi equals ‘1’. • Optional Exercise: What is the logic equation for Z (boolean variable denoting whether a TCAM entry matched)? • Optional Exercise: What is the logic equation for Z if, instead of (Vi, Mi), we store (Ai, Bi), where (0,0) = always match, (1,1) = always mismatch, (0,1) = match0, and (1,0) = match1?
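One way to express the ternary match in software is with word-wide bit operations. The sketch below is one possible formulation under the definitions above, treating H, V, M, A and B as integers of W bits; the function names are hypothetical. Readers who want to attempt the optional exercises may prefer to skip it.

```python
def tcam_match(h, v, m):
    """(V, M) encoding: match if H equals V wherever the mask bit is 1.

    Per bit, Z is the AND over i of (NOT Mi OR (Hi XNOR Vi)).
    """
    return ((h ^ v) & m) == 0

def tcam_match_ab(h, a, b, width):
    """(A, B) encoding: (0,0) always match, (1,1) always mismatch,
    (0,1) match0, (1,0) match1.

    Bit i mismatches if (Ai AND NOT Hi) OR (Bi AND Hi).
    """
    full = (1 << width) - 1
    mismatch = (a & ~h) | (b & h)
    return (mismatch & full) == 0
```

For example, the entry 1.23.x.x corresponds to V = 0x01170000 with M = 0xFFFF0000, which matches the header 1.23.11.3 (0x01170B03).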
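Lookups/Classification with Ternary CAM • [Figure: the packet header is compared against every entry of the TCAM memory array (entries 0 to M, e.g., "1.23.11.3, tcp" and "1.23.x.x, x"); the per-entry match bits feed a priority encoder, whose output indexes the action memory in RAM to produce the action. For LPM, entries are ordered by prefix length, from P32 and P31 down to P8.]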
Range-to-prefix Blowup • Maximum memory blowup = factor of (2W-2)^d • Luckily, real-life does not see too many arbitrary ranges.
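To see where the 2W-2 factor comes from, the sketch below splits an inclusive range on a W-bit field into the minimal set of prefixes. The function name `range_to_prefixes` is hypothetical; the greedy splitting itself is a standard technique. With d range fields per rule the per-field counts multiply, which gives the (2W-2)^d worst case.

```python
def range_to_prefixes(lo, hi, width):
    """Split the inclusive range [lo, hi] on a width-bit field into a
    minimal list of (value, prefix_length) pairs."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block aligned at lo (the whole space if lo == 0).
        size = lo & -lo if lo else 1 << width
        # Shrink the block until it no longer overshoots hi.
        while lo + size - 1 > hi:
            size >>= 1
        prefixes.append((lo, width - (size.bit_length() - 1)))
        lo += size
    return prefixes
```

The range [1, 2^W - 2] is the worst case: for W = 16 it expands to 30 prefixes, while the common port range [1024, 65535] needs only 6.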
TCAMs • Advantages • Extensible to multiple fields • Fast: 6-8 ns today (133-150 million searches per second), going to 250 Msps • Simple to understand and use • Disadvantages • Inflexible: range-to-prefix blowup • Power: ~15-20W @ 100Msps • Cost: $200-$250 for ~2MByte • Density: largest available in 2006 is ~2MB, i.e., 128K x 128 bits (can be cascaded) • Tough memory soft-error problem
Hierarchical Tries • Search (000,010) • O(NW) memory • O(W^2) lookup • [Figure: a Dimension DA trie whose nodes point to Dimension SA tries storing rules R1-R7]
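A minimal two-dimensional hierarchical trie can be sketched as follows. The class and function names, the rule set and the convention that a lower number means higher priority are all assumptions made for illustration; the rules are not the ones in the slide's figure.

```python
class Node:
    def __init__(self):
        self.child = {}        # '0' / '1' -> Node
        self.next_trie = None  # root of the SA trie (set on DA-trie nodes)
        self.rules = []        # (priority, name) pairs (set on SA-trie nodes)

def insert(root, da, sa, priority, name):
    """Insert a rule given as bit-string prefixes on DA and SA."""
    node = root
    for bit in da:
        node = node.child.setdefault(bit, Node())
    if node.next_trie is None:
        node.next_trie = Node()
    node = node.next_trie
    for bit in sa:
        node = node.child.setdefault(bit, Node())
    node.rules.append((priority, name))

def walk(root, bits):
    """Yield every node on the path from root along the given bits."""
    node = root
    yield node
    for bit in bits:
        node = node.child.get(bit)
        if node is None:
            return
        yield node

def search(root, da, sa):
    """Visit each DA prefix of the packet and search its SA trie: O(W^2)."""
    best = None
    for da_node in walk(root, da):
        if da_node.next_trie is None:
            continue
        for sa_node in walk(da_node.next_trie, sa):
            for cand in sa_node.rules:
                if best is None or cand < best:
                    best = cand
    return best[1] if best else None
```

The nested walk is the backtracking that makes the lookup O(W^2): every matching DA prefix triggers a fresh SA-trie search. Set-pruning tries and grid-of-tries remove this cost in different ways.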
Set-pruning Tries [Tsuchiya, Sri98] • Search (000,010) • O(N^2) memory • O(2W) lookup • [Figure: a Dimension DA trie whose nodes point to Dimension SA tries, with rules such as R7, R2 and R1 replicated into several SA tries]
Grid-of-Tries [Sri98] • Search (000,010) • O(NW) memory • O(2W) lookup • [Figure: a Dimension DA trie whose nodes point to Dimension SA tries storing rules R1-R7, with additional pointers on 0-bits linking the SA tries]
Grid-of-Tries • 20K 2D rules: 2MB, 9 memory accesses (with prefix-expansion) • Advantages • Good solution for two dimensions • Disadvantages • Difficult to carry out updates • Not easily extensible to more than two dimensions
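Crossproducting [Sri98] • [Figure: rules R1-R4 drawn as rectangles on a grid with a horizontal axis from 1 to 9 and a vertical axis from 1 to 6; the points (8,4) and (1,3) are marked]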
Crossproducting • Need: d 1-D lookups + 1 memory access, O(N^d) space • 50 rules: 1.5MB, need caching (on-demand crossproducting) for bigger classifiers • Advantages • Fast accesses • Suitable for multiple fields • Disadvantages • Large amount of memory • Need caching for bigger classifiers (> 50 rules)
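The "d 1-D lookups + 1 memory access" structure can be sketched as below, assuming rules given as inclusive integer ranges per field and non-negative field values. The rule set and function names are hypothetical, and the per-dimension lookup uses a sorted list with `bisect` as a stand-in for whatever 1-D lookup structure an implementation would choose.

```python
from bisect import bisect_right
from itertools import product

# Hypothetical 2-D classifier, highest priority first.
# Each rule holds one inclusive (lo, hi) range per dimension.
RULES = [
    ("R1", [(1, 3), (4, 6)]),
    ("R2", [(2, 8), (5, 6)]),
    ("R3", [(1, 9), (1, 3)]),
]

def build(rules, d):
    """Precompute per-dimension elementary intervals and the crossproduct table."""
    starts = []
    for i in range(d):
        pts = {0}
        for _, ranges in rules:
            lo, hi = ranges[i]
            pts.update((lo, hi + 1))   # a range begins at lo and ends before hi + 1
        starts.append(sorted(pts))
    table = {}
    for combo in product(*(range(len(s)) for s in starts)):
        # Any point of an elementary interval is representative; use its start.
        point = [starts[i][k] for i, k in enumerate(combo)]
        table[combo] = next(
            (name for name, ranges in rules
             if all(lo <= p <= hi for p, (lo, hi) in zip(point, ranges))),
            None)
    return starts, table

def classify(starts, table, packet):
    """d independent 1-D lookups, then a single crossproduct table access."""
    combo = tuple(bisect_right(s, p) - 1 for s, p in zip(starts, packet))
    return table[combo]
```

The table holds one entry per combination of elementary intervals, which is where the O(N^d) space comes from; on-demand crossproducting would fill `table` lazily instead of enumerating every combination up front.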
Classification Algorithms: Speed vs. Storage Tradeoff • Lower bounds for Point Location in N regions with d dimensions from Computational Geometry: O(log N) time with O(N^d) storage, or O(log^(d-1) N) time with O(N) storage • N = 100, d = 4: N^d = 100 MBytes and log^(d-1) N ≈ 350 memory accesses
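The figures for N = 100 and d = 4 can be checked with a short calculation. Two assumptions are made here that the slide does not state: one byte per table entry, and base-2 logarithms rounded up to whole memory accesses.

```python
import math

N, d = 100, 4

# O(N^d) storage: 100^4 entries, i.e. 100 MBytes at one byte per entry (assumed).
storage_entries = N ** d

# O(log^(d-1) N) time: ceil(log2(100)) = 7 accesses per level, cubed for d - 1 = 3.
accesses = math.ceil(math.log2(N)) ** (d - 1)
```

Under these assumptions the storage is 10^8 entries and the access count is 7^3 = 343, which the slide rounds to 350.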
One Solution: Heuristics that “seem to work well in real-life” • Properties of real-life classifiers: • Hierarchy (to at least some level) • Structure • Recursive Flow Classification [Gupta, McKeown 1999] • Generalization of crossproducting to conserve storage • Hierarchical Intelligent Cuttings [Gupta, McKeown 1999] • Aggregated Bit-vector [Baboescu, Varghese 2001] • HyperCuts [Singh, Baboescu, Varghese 2003] • Good heuristics do better than worst-case bounds for real-life datasets.
How Well Do Heuristics Do? • Very well at low speeds • E.g., HyperCuts can process ~20K rules in five dimensions using about 9Mb of memory in ~20 memory accesses (i.e., ~15 million searches per second) • At high speeds, they occupy too much (and classifier-dependent) storage • E.g., RFC can process ~1K rules in five dimensions using ~16Mb of memory in ~6 memory accesses (i.e., ~50 million searches per second)
Classification: What’s Used Out There? • Majority of hardware platforms: TCAMs • High performance, cost, power, deterministic worst-case • Some others: Modifications of RFC • Low speed, low cost DRAM-based, heuristic • Works well in software platforms • Some others: HyperCuts/HiCuts • Others: nothing/linear search/simulated-parallel-search etc.
Lookup: What’s Used Out There? • Overwhelming majority of routers: • Modifications of multi-bit tries (h/w optimized trie algorithms) • DRAM (sometimes SRAM) based, large number of routes (>0.25M) • Parallelism required for speed/storage becomes an issue • Others mostly TCAM based • Allows sharing the same TCAM for both lookup and classification
Packet Classification: References • F. Baboescu and G. Varghese, “Scalable packet classification,” Proc. Sigcomm 2001 • [Lak98] T.V. Lakshman and D. Stiliadis, “High speed policy based packet forwarding using efficient multi-dimensional range matching,” Sigcomm 1998, pp 191-202 • K. Lakshminarayanan, A. Rangarajan and S. Venkatachary, “Algorithms for advanced packet classification with Ternary CAMs,” Sigcomm 2005 • [Sri98] V. Srinivasan, S. Suri, G. Varghese and M. Waldvogel, “Fast and scalable layer 4 switching,” Sigcomm 1998, pp 203-214 [Grid-of-tries, crossproducting] • V. Srinivasan, G. Varghese and S. Suri, “Fast packet classification using tuple space search,” Sigcomm 1999, pp 135-146 • P. Gupta and N. McKeown, “Packet classification using hierarchical intelligent cuttings,” Hot Interconnects VII, 1999 • [Gupta99] P. Gupta and N. McKeown, “Packet classification on multiple fields,” Sigcomm 1999, pp 147-160 [RFC]
Packet Classification: References (contd.) • P. Gupta, “Algorithms for routing lookups and packet classification,” PhD Thesis, Ch 1 and 4, Dec 2000, available at http://yuba.stanford.edu/~pankaj/phd.html [Background and introduction to Classification] • P. Gupta and N. McKeown, “Algorithms for packet classification,” IEEE Network, March/April 2001, vol. 15, no. 2, pp 24-32 • S. Singh, F. Baboescu, G. Varghese and J. Wang, “Packet classification using multidimensional cutting,” Proc. ACM Sigcomm 2003 [HyperCuts] • S. Iyer, R.R. Kompella, and A. Shelat, “ClassiPI: An architecture for fast and flexible packet classification,” IEEE Network, March/April 2001, vol. 15, no. 2, pp 33-41 • TCAM vendors: netlogicmicro.com, idt.com