Fast Incremental Updates on Ternary-CAMs for Routing Lookups and Packet Classification
August 17, 2000, Hot Interconnects 8
Devavrat Shah and Pankaj Gupta, Department of Computer Science, Stanford University
{devavrat, pankaj}@stanford.edu http://www.stanford.edu/~pankaj
Lookup in an IP Router
[Figure: the forwarding engine takes the destination address from the header of the incoming packet and computes the next hop from the forwarding table, which maps destination prefixes to next hops.]
Unicast destination-address-based lookup
IP Lookup = Longest Prefix Matching
Forwarding Table:
Prefix           Next-hop
100/9            10.0.0.111
103.23/16        171.3.2.4
103.23.122/23    171.3.2.22
101.20/13        320.3.3.1
101.1/16         120.33.32.98
Find the longest prefix matching the incoming destination address.
Example: destination 103.23.122.7 matches both 103.23/16 and 103.23.122/23; the longer prefix wins, so the next hop is 171.3.2.22.
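The lookup rule on this slide can be sketched in a few lines of Python. This is only an illustrative linear scan, not a scheme from the talk; the names `TABLE` and `longest_prefix_match` are made up for the example, and the prefixes are written out in full dotted form so the standard `ipaddress` module can parse them. Next hops are kept as plain strings, copied verbatim from the slide.

```python
import ipaddress

# Forwarding table from the slide, with prefixes expanded to dotted form.
# Next hops are stored as strings exactly as they appear on the slide.
TABLE = [
    ("100.0.0.0/9", "10.0.0.111"),
    ("103.23.0.0/16", "171.3.2.4"),
    ("103.23.122.0/23", "171.3.2.22"),
    ("101.20.0.0/13", "320.3.3.1"),
    ("101.1.0.0/16", "120.33.32.98"),
]

def longest_prefix_match(addr, table=TABLE):
    """Return the next hop of the longest matching prefix, or None."""
    ip = ipaddress.ip_address(addr)
    best_len, best_hop = -1, None
    for prefix, hop in table:
        # strict=False masks off any bits beyond the prefix length.
        net = ipaddress.ip_network(prefix, strict=False)
        if ip in net and net.prefixlen > best_len:
            best_len, best_hop = net.prefixlen, hop
    return best_hop

print(longest_prefix_match("103.23.122.7"))
```

A linear scan like this costs one comparison per table entry, which is what the hardware and algorithmic schemes in the rest of the talk are meant to avoid.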
Requirements of a Route Lookup Scheme
• High speed: tens of millions of lookups per second
• Low storage: ~100K entries
• Fast updates: a few thousand per second, but ideally at lookup speed
Route Lookup Schemes
• Various algorithms: come to the tutorial tomorrow if interested
• This paper is about ternary CAMs
Content-addressable Memory (CAM)
• Fully associative memory
• Exact match (fixed-length) search operation in a single clock cycle
TCAM: stores a 0, 1 or X in each cell, which is useful for wildcard matching
Route Lookup Using TCAM
Lookup key: 103.23.122.7
Location  Entry  Prefix          Next-hop       Match
0         P1     103.23.122/23   171.3.2.22     1
1         P2     103.23/16       171.3.2.4      1
2         P3     101.1/16        120.33.32.98   0
3         P4     101.20/13       320.3.3.1      0
4         P5     100/9           10.0.0.111     0
5         (empty)                               0
6         (empty)                               0
The priority encoder selects the matching entry at the lowest location, here P1.
To find the longest prefix cheaply, the entries need to be kept sorted in order of decreasing prefix length.
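The TCAM lookup on this slide can be mimicked in software. The sketch below is a hypothetical model, not a description of any real device: each entry is stored as a (value, mask, next hop) triple, where a 0 bit in the mask plays the role of an X cell, and the helper names `to_int`, `entry` and `tcam_lookup` are invented for the example.

```python
def to_int(dotted):
    """Convert a dotted-quad string to a 32-bit integer."""
    a, b, c, d = (int(x) for x in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def entry(prefix, length, next_hop):
    """Build a ternary entry: mask bits set to 0 act as X (don't care)."""
    mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
    return (to_int(prefix) & mask, mask, next_hop)

# Locations 0..6, sorted by decreasing prefix length; None marks an empty slot.
TCAM = [
    entry("103.23.122.0", 23, "171.3.2.22"),   # P1
    entry("103.23.0.0", 16, "171.3.2.4"),      # P2
    entry("101.1.0.0", 16, "120.33.32.98"),    # P3
    entry("101.20.0.0", 13, "320.3.3.1"),      # P4 (next hop as on the slide)
    entry("100.0.0.0", 9, "10.0.0.111"),       # P5
    None,
    None,
]

def tcam_lookup(addr, tcam=TCAM):
    """Return (match vector, next hop chosen by the priority encoder)."""
    key = to_int(addr)
    # In hardware all locations are compared in parallel; here we loop.
    match = [int(e is not None and (key & e[1]) == e[0]) for e in tcam]
    if 1 not in match:
        return match, None
    # Priority encoder: the lowest matching location wins.
    return match, tcam[match.index(1)][2]

print(tcam_lookup("103.23.122.7"))
```

Because the encoder always picks the lowest matching location, the result is the longest prefix only if longer prefixes sit at lower locations, which is the sorting requirement stated on the slide.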
General TCAM Configuration for Longest Prefix Matching
[Figure: TCAM entries grouped by prefix length, in the order 32-bit prefixes, 31-bit prefixes, 30-bit prefixes, ..., 10-bit prefixes, 9-bit prefixes, 8-bit prefixes.]
This arrangement is the prefix-length ordering constraint (PLO).
Incremental Update Problem
• Updates:
  • Insert a new prefix
  • Delete an old prefix
• Problem: how to keep the sorting invariant (e.g., the PLO) under updates
Target Update Rate?
• Many are happy with a few hundred thousand per second
• Others want (and claim) single clock-cycle updates
• Our goal: make them as fast as possible (ideally single-cycle)
Common Solution: O(N)
[Figure: adding a new 30-bit prefix to a TCAM holding 32-bit, 31-bit, 30-bit, ..., 10-bit, 9-bit and 8-bit prefixes in that order, followed by the empty space; figure labels M and N.]
Problem: how to manage the empty space for the best update time and TCAM utilization?
Better Average Update Rate
[Figure: adding a new 30-bit prefix to a TCAM holding 32-bit, 31-bit, 30-bit, ..., 9-bit and 8-bit prefixes.]
Worst case is still O(N)
An L-solution (L=32)
[Figure: adding a prefix to a TCAM holding 32-bit, 31-bit, 30-bit, ..., 10-bit, 9-bit and 8-bit prefixes in that order, followed by the empty space.]
Two prefixes of the same length can be in any order
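One way to read the L-solution is sketched below, assuming the layout on the slide (entries in decreasing length order, all free slots at the end). Since prefixes of the same length can be in any order, making room for a new prefix only requires moving one entry per shorter length group, so at most L entries move. The list-of-tuples model and the name `l_solution_insert` are assumptions made for this illustration.

```python
def l_solution_insert(tcam, length, name):
    """Insert (length, name) into a TCAM modelled as a list.

    Entries are (prefix_length, name) tuples sorted by decreasing length,
    followed by None for free slots. Returns the number of memory writes.
    """
    used = sum(e is not None for e in tcam)
    if used == len(tcam):
        raise OverflowError("TCAM is full")
    hole = used          # first free slot, just below the last entry
    writes = 0
    # While the group just above the hole is shorter than the new prefix,
    # move that group's first entry into the hole; the hole then sits at
    # the top of that group. Order within a group does not matter.
    while hole > 0 and tcam[hole - 1][0] < length:
        group = tcam[hole - 1][0]
        first = hole - 1
        while first > 0 and tcam[first - 1][0] == group:
            first -= 1
        tcam[hole] = tcam[first]
        writes += 1
        hole = first
    tcam[hole] = (length, name)
    writes += 1
    return writes

tcam = [(32, "a"), (32, "b"), (30, "c"), (9, "d"), (9, "e"), (8, "f"), None, None]
print(l_solution_insert(tcam, 30, "x"), tcam)
```

In this small run the new 30-bit prefix costs three writes: one move for the 8-bit group, one for the 9-bit group, and the write of the new entry itself.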
Routing Table for Simulation
Snapshot + 3-hour updates on the original table
Source: www.merit.edu, March 1, 2000
Performance of L-solution
[Plot: average number of memory writes vs. number of entries]
Outline of Rest of the Talk
• Algorithm PLO_OPT: worst case L/2 memory shifts (provably optimal)
• Algorithm CAO_OPT: even better (conjectured to be optimal)
PLO_OPT
[Figure: adding a prefix to a TCAM holding 32-bit, 31-bit, ..., 21-bit prefixes, then the empty space, then 20-bit, ..., 9-bit and 8-bit prefixes.]
Worst case: L/2 memory shifts
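The effect of moving the empty space to the middle can be sketched as follows. This is a simplified model under stated assumptions rather than the algorithm as published: the split point is fixed (lengths of 21 bits and above live above the free space, as in the figure), the TCAM is a Python list, and the name `plo_opt_insert` and the `split` parameter are invented for the example. A new prefix only disturbs the length groups between its own group and the free space, which is at most about L/2 groups.

```python
def plo_opt_insert(tcam, length, name, split=21):
    """Insert (length, name); free slots (None) sit in the middle of the list.

    Prefixes with length >= split are kept above the free space, the rest
    below it, all in decreasing length order. Returns the number of writes.
    """
    free = [i for i, e in enumerate(tcam) if e is None]
    if not free:
        raise OverflowError("TCAM is full")
    writes = 0
    if length >= split:
        # Use the top of the free space and push the hole upward.
        hole = free[0]
        while hole > 0 and tcam[hole - 1][0] < length:
            group = tcam[hole - 1][0]
            first = hole - 1
            while first > 0 and tcam[first - 1][0] == group:
                first -= 1
            tcam[hole] = tcam[first]   # move first entry of the group down
            writes += 1
            hole = first
    else:
        # Use the bottom of the free space and push the hole downward.
        hole = free[-1]
        while hole < len(tcam) - 1 and tcam[hole + 1][0] > length:
            group = tcam[hole + 1][0]
            last = hole + 1
            while last < len(tcam) - 1 and tcam[last + 1][0] == group:
                last += 1
            tcam[hole] = tcam[last]    # move last entry of the group up
            writes += 1
            hole = last
    tcam[hole] = (length, name)
    return writes + 1

tcam = [(32, "a"), (31, "b"), (21, "c"), None, None, (20, "d"), (9, "e"), (8, "f")]
print(plo_opt_insert(tcam, 31, "x"), plo_opt_insert(tcam, 9, "y"), tcam)
```

The sketch does not rebalance the free space after many inserts on one side; a fuller treatment would have to handle that case.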
Better Algorithm?
• PLO_OPT is optimal under the PLO constraint
• Question: can we relax the constraint and still achieve correct lookup operation?
Yes: PLO Constraint is More Restrictive Than Needed
[Figure: four prefixes with lengths P4 = 31 bits, P3 = 29 bits, P2 = 15 bits, P1 = 8 bits.]
• Chain ancestor ordering constraint: P4 < P3 < P1, P2 < P1
• Maximal chain: P4 < P3 < P1
• P2 has no ordering constraint with P3 or P4
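The relaxed constraint can be checked mechanically: only a prefix and its ancestors (shorter prefixes that cover it) need to be ordered relative to each other. The sketch below uses hypothetical bit strings chosen to reproduce the lengths and nesting on the slide (P1 = 8 bits, P2 = 15, P3 = 29, P4 = 31); the function names are invented for the illustration.

```python
def is_ancestor(p, q):
    """True if prefix p (a bit string) is a proper prefix of q."""
    return len(p) < len(q) and q.startswith(p)

def satisfies_cao(order):
    """Check a TCAM order (index 0 = highest priority).

    The order is valid if no prefix appears before one of its descendants.
    """
    for i, earlier in enumerate(order):
        for later in order[i + 1:]:
            if is_ancestor(earlier, later):
                return False
    return True

def max_chain_length(prefixes):
    """Length D of the longest chain of nested prefixes.

    The ancestors of any prefix are themselves nested, so the longest
    chain ending at q is q plus all of its ancestors in the set.
    """
    return max(1 + sum(is_ancestor(p, q) for p in prefixes) for q in prefixes)

# Hypothetical bit patterns with the lengths shown on the slide.
P1 = "01100111"          # 8 bits
P2 = P1 + "0" * 7        # 15 bits, covered by P1 only
P3 = P1 + "1" * 21       # 29 bits, covered by P1
P4 = P3 + "00"           # 31 bits, covered by P3 and P1

print(satisfies_cao([P4, P2, P3, P1]))   # P2 before P3 breaks the PLO only
print(satisfies_cao([P3, P4, P2, P1]))   # P3 before P4 breaks lookups
print(max_chain_length([P1, P2, P3, P4]))
```

The first order places the 15-bit P2 ahead of the 29-bit P3, which the PLO forbids, yet every lookup still returns the longest match because no address can match both P2 and P3.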
Algorithm CAO_OPT
• Maintain the free space pool in the “middle” of the maximal chain
• Basic idea: for every prefix, the longest chain that this prefix belongs to should be split around the free space pool as equally as possible
CAO_OPT: Example
[Figure: TCAM placement of P4, P2, P3 and P1 under the constraints P4 < P3 < P1, P2 < P1.]
CAO_OPT: Updates
• Insertion: find the maximal chain to which the new entry belongs and insert it such that this chain is distributed as equally as possible around the free space: D/2 operations, where D is the maximal-chain length
• Deletion: the reverse operation, with the update possibly using another chain
Auxiliary Data Structure
• Trie of prefixes with two additional fields per node
• Update operation takes L memory writes in software and D/2 in TCAM
Maximal-chain Length (D) Distribution
[Plot: number of chains vs. chain length]
CAO_OPT (MAE-EAST)
[Plot: average number of memory writes vs. number of entries]
Summary of Simulation Results
[Table: simulation results]
Hence, 1-2 cycle updates can be achieved