Routing Lookups and Packet Classification: Theory and Practice August 18, 2000 Hot Interconnects 8 Pankaj Gupta Department of Computer Science Stanford University pankaj@stanford.edu http://www.stanford.edu/~pankaj
Tutorial Outline • Introduction • What this tutorial is about • Routing lookups • Background, lookup schemes • Packet Classification • Background, classification schemes • Implementation choices for given design requirements
Request to you • Please ask lots of questions! • But I may not be able to answer all of them right now • I am here to learn, so please share your experiences, thoughts and opinions freely
Internet: Mesh of Routers • [Figure: the Internet core, edge routers, and a campus area network]
RFC 1812: Requirements for IPv4 Routers • Must perform an IP datagram forwarding decision (called forwarding) • Must send the datagram out the appropriate interface (called switching) Optionally: a router MAY choose to perform special processing on incoming packets
Examples of special processing • Filtering packets for security reasons • Delivering packets according to a pre-agreed delay guarantee • Treating high priority packets preferentially • Maintaining statistics on the number of packets sent by various routers
Special Processing Requires Identification of Flows • All packets of a flow obey a pre-defined rule and are processed similarly by the router • E.g. a flow = (src-IP-address, dst-IP-address), or a flow = (dst-IP-prefix, protocol) etc. • Router needs to identify the flow of every incoming packet and then perform appropriate special processing
Flow-aware vs Flow-unaware Routers • Flow-aware router: keeps track of flows and performs similar processing on packets in a flow • Flow-unaware router (packet-by-packet router): treats each incoming packet individually
What this tutorial is about: • Algorithms and techniques that an IP router uses to decide where to forward the packets next (routing lookup) • Algorithms and techniques that a flow-aware router uses to classify packets into flows (packet classification)
Routing Lookups: Outline • Background and problem definition • Lookup schemes • Comparative evaluation
Lookup in an IP Router • [Figure: the forwarding engine performs a unicast destination-address-based lookup of the incoming packet's header in the forwarding table (dstn-prefix → next hop) to compute the next hop]
Packet-by-packet Router • [Figure: linecards, each with a forwarding table and forwarding decision logic, a routing processor, and an interconnect]
Packet-by-packet Router: Basic Architectural Components • Control: routing • Datapath (per-packet processing): routing lookup, switching, scheduling
ATM and MPLS Switches: Direct Lookup • [Figure: the incoming (port, vci/label) is used directly as a memory address; the data read out is the outgoing (port, vci/label)]
IPv4 Addresses • 32-bit addresses • Dotted quad notation: e.g. 12.33.32.1 • Can be represented as integers on the IP number line [0, 2^32-1]: a.b.c.d denotes the integer (a*2^24 + b*2^16 + c*2^8 + d) • [Figure: the IP number line from 0.0.0.0 to 255.255.255.255]
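A minimal Python sketch of this dotted-quad-to-integer mapping (illustrative only, not part of the original slides):

    # Map a dotted-quad IPv4 address to its position on the IP number line.
    def ip_to_int(addr):
        a, b, c, d = (int(x) for x in addr.split("."))
        return (a << 24) | (b << 16) | (c << 8) | d

    assert ip_to_int("12.33.32.1") == 12 * 2**24 + 33 * 2**16 + 32 * 2**8 + 1
    assert ip_to_int("255.255.255.255") == 2**32 - 1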
Class-based Addressing • [Figure: the IP number line starting at 0.0.0.0 divided into classes A, B, C, D and E, with class B beginning at 128.0.0.0 and class C at 192.0.0.0]
Lookups with Class-based Addresses • Exact match on the netid • [Figure: table of netid → port#, e.g. 23 (class A) → Port 1, 186.21 (class B) → Port 2, 192.33.32 (class C) → Port 3; example lookup of 192.33.32.1]
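For illustration (not from the original slides), a small Python sketch of a class-based lookup: the leading address bits select the class and hence the netid length, and the netid is then matched exactly. The table values reuse the example on this slide.

    # Extract the netid of a class A/B/C address, then look it up exactly.
    def netid(addr):
        if addr >> 31 == 0b0:        # Class A: 8-bit netid
            return addr >> 24
        if addr >> 30 == 0b10:       # Class B: 16-bit netid
            return addr >> 16
        if addr >> 29 == 0b110:      # Class C: 24-bit netid
            return addr >> 8
        raise ValueError("class D/E address")

    table = {
        23: "Port 1",                               # class A netid 23
        (186 << 8) | 21: "Port 2",                  # class B netid 186.21
        (192 << 16) | (33 << 8) | 32: "Port 3",     # class C netid 192.33.32
    }
    assert table[netid((192 << 24) | (33 << 16) | (32 << 8) | 1)] == "Port 3"  # 192.33.32.1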
Problems with Class-based Addressing • Fixed netid-hostid boundaries too inflexible: rapid depletion of address space • Exponential growth in size of routing tables
Exponential Growth in Routing Table Sizes • [Figure: number of BGP routes advertised over time]
Classless Addressing (and CIDR) • Eliminated class boundaries • Introduced the notion of a variable length prefix between 0 and 32 bits long • Prefixes represented by P/l: e.g., 122/8, 212.128/13, 34.43.32/22, 10.32.32.2/32 etc. • An l-bit prefix represents an aggregation of 2^(32-l) IP addresses
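An illustrative sketch (helper name and framing are mine, not the tutorial's) of the address range covered by an l-bit prefix:

    # An l-bit prefix P/l covers 2^(32-l) consecutive addresses.
    def prefix_to_range(net, l):
        parts = (net.split(".") + ["0", "0", "0"])[:4]   # pad, e.g. "192.2.0" -> 192.2.0.0
        base = 0
        for p in parts:
            base = (base << 8) | int(p)
        span = 1 << (32 - l)
        first = base & ~(span - 1)                       # clear the host bits
        return first, first + span - 1

    first, last = prefix_to_range("192.2.0", 22)
    assert last - first + 1 == 1024                      # a /22 aggregates 2^10 addresses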
CIDR: Hierarchical Route Aggregation • [Figure: sites S (192.2.1/24) and T (192.2.2/24) connect to ISP P, which advertises the aggregate 192.2.0/22; ISP Q advertises 200.11.0/22; the backbone routing table holds only 192.2.0/22 → R2 and 200.11.0/22, shown on the IP number line]
Size of the Routing Table • [Figure: number of active BGP prefixes vs. date; source: http://www.telstra.net/ops/bgptable.html]
Classless Addressing • [Figure: the IP number line from 0.0.0.0 to 255.255.255.255; class-based division into A, B, C vs. classless prefixes of varying length, e.g. 23/8, 191/8, 191.23/16, 191.23.14/23, 191.128.192/18]
Non-aggregatable Prefixes: (1) Multi-homed Networks • [Figure: a site with prefix 192.2.2/24 is homed both to ISP P (192.2.0/22, reached via R2) and to another provider via R3; the backbone routing table must carry both 192.2.0/22 → R2 and the more specific 192.2.2/24 → R3]
Non-aggregatable Prefixes: (2) Change of Provider • [Figure: site T (192.2.2/24) moves from ISP P (192.2.0/22) to ISP Q (200.11.0/22); its prefix can no longer be aggregated, so the backbone routing table carries both 192.2.0/22 → R2 and 192.2.2/24 → R3, shown on the IP number line]
Routing Lookups with CIDR • [Figure: prefixes 192.2.0/22 → R2, 192.2.2/24 → R3 and 200.11.0/22 → R4 on the IP number line, with example addresses 192.2.0.1, 192.2.2.100 and 200.11.0.33] • Find the most specific route, i.e. the longest matching prefix among all the prefixes matching the destination address of an incoming packet
Longest Prefix Match is Harder than Exact Match • The destination address of an arriving packet does not carry with it the information needed to determine the length of the longest matching prefix • Hence, one needs to search both the space of all prefix lengths and the space of all prefixes of a given length
Metrics for Lookup Algorithms • Speed • Storage requirements • Low update time • Ability to handle large routing tables • Flexibility in implementation • Low preprocessing time
Maximum Bandwidth per Installed Fiber • [Figure; source: Lucent]
Maximum Bandwidth per Router Port, and Lookup Performance Required • [Figure]
Size of Routing Table? • Currently, 85K entries • At 25K new prefixes per year, need 230-256K prefixes over the next 5 years • Decreasing costs of transmission may increase the rate of routing table growth • At 50K per year, need 350-400K prefixes over the next 5 years
Routing Update Rate? • Currently a peak of a few hundred BGP updates per second • Hence, 1K per second is a must • 5-10K updates/second seems to be safe • BGP limitations may be a bottleneck first • Updates should be atomic, and should interfere little with normal lookups
Routing Lookups: Outline • Background and problem definition • Lookup schemes • Comparative evaluation
Linear Search • Keep prefixes in a linked list • O(N) storage, O(N) lookup time, O(1) update complexity • Improve average lookup time by keeping the list sorted in order of decreasing prefix length, so the first match found is the longest match (see the sketch below)
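A hedged sketch of this linear search (the table entries reuse the earlier CIDR example; names are illustrative, not from the tutorial):

    # O(N) lookup: scan (prefix_value, length, next_hop) entries, kept sorted
    # by decreasing prefix length so the first match is the longest match.
    def lookup_linear(table, addr):
        for value, length, next_hop in table:
            shift = 32 - length
            if addr >> shift == value >> shift:   # compare only the top `length` bits
                return next_hop
        return None                               # no matching prefix

    table = [
        (0xC0020200, 24, "R3"),   # 192.2.2/24
        (0xC0020000, 22, "R2"),   # 192.2.0/22
        (0xC80B0000, 22, "R4"),   # 200.11.0/22
    ]
    assert lookup_linear(table, 0xC0020264) == "R3"   # 192.2.2.100 -> most specific route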
Caching Addresses • [Figure: line cards with MACs and local buffer memory, DMA to shared buffer memory and the CPU; a fast path through the line card and a slow path through the CPU]
Caching Addresses • Advantages • Increased average lookup performance • Disadvantages • Decreased locality in backbone traffic • Cache size • Cache management overhead • Hardware implementation difficult
Radix Trie • [Figure: binary trie storing prefixes P1-P4; each trie node has a left-ptr, right-ptr and a next-hop-ptr if it marks a prefix; example lookup of 10111 and insertion of P5 = 1110*]
Radix Trie • W-bit prefixes: O(W) lookup, O(NW) storage and O(W) update complexity • Advantages • Simplicity • Extensible to wider fields • Disadvantages • Worst case lookup slow • Wastage of storage space in chains
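A simplified Python sketch of the 1-bit radix trie walk (the prefixes below are illustrative, not the exact P1-P4 of the figure): descend one address bit per step, remembering the last prefix seen.

    class Node:
        def __init__(self):
            self.left = self.right = None
            self.next_hop = None                  # set only if this node marks a prefix

    def insert(root, prefix_bits, next_hop):
        node = root
        for b in prefix_bits:
            child = node.left if b == "0" else node.right
            if child is None:
                child = Node()
                if b == "0":
                    node.left = child
                else:
                    node.right = child
            node = child
        node.next_hop = next_hop

    def lookup(root, addr_bits):
        node, best = root, None
        for b in addr_bits:
            if node.next_hop is not None:
                best = node.next_hop              # remember the longest match so far
            node = node.left if b == "0" else node.right
            if node is None:
                return best
        return node.next_hop or best

    root = Node()
    for bits, hop in [("0", "P1"), ("10", "P2"), ("110", "P3"), ("1111", "P4")]:
        insert(root, bits, hop)
    assert lookup(root, "10111") == "P2"          # longest matching prefix of 10111 is 10*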
Leaf-pushed Binary Trie • [Figure: binary trie for P1-P4 in which next hops are pushed down to the leaves, so each node holds either a left-ptr/right-ptr or a next hop]
PATRICIA • [Figure: Patricia tree for P1-P4; each internal node stores the bit position to test plus a left-ptr and right-ptr; example lookup of 10111]
PATRICIA • W-bit prefixes: O(W2) lookup, O(N) storage and O(W) update complexity • Advantages • Decreased storage • Extensible to wider fields • Disadvantages • Worst case lookup slow • Backtracking makes implementation complex
Path-compressed Tree • [Figure: path-compressed tree for P1-P4; each node stores a variable-length bitstring, a bit position, left-ptr and right-ptr, plus a next hop if a prefix is present; example lookup of 10111]
Path-compressed Tree • W-bit prefixes: O(W) lookup, O(N) storage and O(W) update complexity • Advantages • Decreased storage • Disadvantages • Worst case lookup slow
Early Lookup Schemes • BSD Unix [sklower91]: Patricia, expected lookup time = 1.44 log N • Dynamic prefix trie [doeringer96]: Patricia variant, complex insertion/deletion; 40K entries consumed 2MB at 0.3-0.5 Mpps
Multi-bit Tries • Binary trie: stride = 1 bit, degree = 2, depth = W • Multi-bit trie: stride = k bits, degree = 2^k, depth = W/k
Prefix Expansion with Multi-bit Tries • If stride = k bits, prefix lengths that are not a multiple of k need to be expanded to the next multiple of k; e.g., for k = 2, the prefix 1* is expanded to 10* and 11* • Maximum number of expanded prefixes corresponding to one non-expanded prefix = 2^(k-1)
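An illustrative sketch of this controlled prefix expansion (function name and framing are mine, not the tutorial's):

    # Expand a prefix whose length is not a multiple of the stride k into all
    # prefixes of the next multiple-of-k length that it covers (at most 2^(k-1)).
    def expand(prefix_bits, k):
        l = len(prefix_bits)
        target = ((l + k - 1) // k) * k           # round l up to a multiple of k
        pad = target - l
        if pad == 0:
            return [prefix_bits]
        return [prefix_bits + bin(i)[2:].zfill(pad) for i in range(1 << pad)]

    assert expand("1", 2) == ["10", "11"]         # k=2: 1* becomes 2^(2-1) = 2 prefixes
    assert expand("101", 2) == ["1010", "1011"]
    assert expand("10", 2) == ["10"]              # already a multiple of k, unchanged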
Four-ary Trie (k=2) • [Figure: four-ary trie storing the expanded prefixes P11, P12, P2, P3, P41 and P42; each node has ptr00, ptr01, ptr10, ptr11 and a next-hop-ptr if a prefix ends there; example lookup of 10111]