An Efficient, Hardware-based Multi-Hash Scheme for High Speed IP Lookup. Socrates Demetriades, Michel Hanna, Sangyeun Cho and Rami Melhem. Hot Interconnects 2008.
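Background: IP Lookup in a Core Router
[Figure: an incoming packet with destination address 10101110 triggers an IP address lookup; the matching prefix 1010**** (Port 2) gives the next hop IP address, and the packet leaves on the outgoing link at Port 2.]
Longest Prefix Matching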
Motivation: Increasing Internet Traffic
• High-speed links
  - Optical technology -> link rates ~100 Gbps
• High-speed routers
  - TCAM-based forwarding engines
• Larger forwarding tables
  - TCAMs FAIL to scale.
IP Lookup Schemes
• TCAM-based schemes [idt, netlogic, micron, CoolCAM]
  - Fast and constant lookup time
  - High cost and power consumption
• Trie-based schemes [Eatherton04, Devroye03,…]
  - Multi-cycle lookup latencies and low worst-case throughput
  - Performance and scalability are fundamentally tied to the IP address length
• Hash-based schemes [Srinivasan98, Hasan06, Kaxiras05,…]
  - Key-length independent latencies
  - Easy to implement in hardware
  - Hashing collisions -> space inefficiency
  - Hash keys (prefixes) include "don't care" bits, which complicate hashing
Overview
Problem: Hash-based schemes can be power and cost efficient but are still space inefficient or slow.
Goal: A hardware-based forwarding engine that has:
1. Constant and high-speed lookup throughput.
2. Space efficiency.
3. Good scalability with increasing forwarding table sizes.
4. Low cost and power consumption.
Proposal: A h/w-based multi-hash architecture that achieves high throughput (1 packet lookup per memory cycle) and is at the same time space and power efficient.
Outline • Introduction • High Speed and Space Efficient Implementation • Selecting hashing bits / Dealing with wildcard bits • Experimental Evaluation • Summary
h/w Hash-based IP Lookup
• Much more power efficient than TCAM.
• High throughput.
[Figure: the key (IP address) feeds a hash index generator that selects one of 2^R rows in a C-way associative memory array (C entries per row); the C keys fetched (key1, key2, …, keyC) are compared by matching processors (match1, match2, …, matchC), followed by LPM logic.]
Hash-based IP Lookup: Example
[Figure: the key (IP address) 10101110 / 8 bits goes through the hash index generator and selects one of 2^R rows; the C keys fetched from that row are 1111****, 10100001 and 1010****. The matching entry 1010**** gives the next hop.]
Hash-based IP Lookup: LPM
[Figure: the key (IP address) 10101110 / 8 bits goes through the hash index generator and selects one of 2^R rows; the C keys fetched from that row are 1010****, 101011** and 1010111*. The LPM (Longest Prefix Match) logic selects 1010111*, which gives the next hop.]
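The LPM step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: prefixes are stored as strings holding their defined bits only, and the next-hop labels ("port 2", "port 5", "port 7") are hypothetical, not taken from the talk.

```python
from typing import Optional

# One stored entry: (defined prefix bits, next hop). Next hops are made up.
Entry = tuple[str, str]

def lpm_in_bucket(address: str, bucket: list[Entry]) -> Optional[str]:
    """Compare the address with every fetched entry and return the next hop
    of the longest matching prefix, or None if no entry matches."""
    best: Optional[Entry] = None
    for prefix, next_hop in bucket:
        # A prefix matches when the address starts with its defined bits.
        if address.startswith(prefix):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, next_hop)
    return best[1] if best else None

# The three keys fetched in the slide: 1010****, 101011**, 1010111*
bucket = [("1010", "port 2"), ("101011", "port 5"), ("1010111", "port 7")]
print(lpm_in_bucket("10101110", bucket))
```

In hardware the C comparisons run in parallel; the loop here only models the selection rule.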
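Hash Index Generation
[Figure: a bit-select mechanism picks N selected bits from the IP prefix or incoming IP address; the N bits are split into F and R bits (R = N - F) and passed through an XOR hash function (skew XOR) to produce the R-bit hash index.]
Simple XOR-folding hash function.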
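A possible reading of the XOR-folding step, written as a Python sketch. The slide does not spell out the exact fold, so the following is assumed: the N selected bits are split into the upper R bits and the lower F = N - R bits, F is at most R, and the "skew" is modelled as a rotation of the F-bit chunk before the XOR. The function names are hypothetical.

```python
def select_bits(address: int, positions: list[int], width: int = 32) -> int:
    """Gather the bits at the given positions (0 = leftmost bit) into one value."""
    value = 0
    for pos in positions:
        value = (value << 1) | ((address >> (width - 1 - pos)) & 1)
    return value

def xor_fold_index(selected: int, n: int, r: int, skew: int = 0) -> int:
    """Fold N selected bits into an R-bit index (assumed scheme, see lead-in)."""
    f = n - r
    if not (0 < r <= n) or f > r:
        raise ValueError("this sketch assumes 0 < R <= N and F = N - R <= R")
    high = selected >> f                 # upper R bits
    low = selected & ((1 << f) - 1)      # lower F bits
    mask = (1 << r) - 1
    skew %= r
    # Rotate the F-bit chunk left by `skew` inside an R-bit word, then XOR.
    rotated = ((low << skew) | (low >> (r - skew))) & mask
    return high ^ rotated

address = 0b10101110 << 24                       # 10101110 followed by 24 zero bits
selected = select_bits(address, list(range(8)))  # take the 8 leftmost bits
print(xor_fold_index(selected, n=8, r=5))          # 19
print(xor_fold_index(selected, n=8, r=5, skew=1))  # 25
```

Using a different skew per table is one cheap way to obtain several hash functions from the same XOR logic, which is what a multi-hash design needs.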
Inserting / Hashing IP Prefixes
Balanced is better.
[Figure: bucket load vs. bucket index for a single hash table, showing used memory space against total available memory space. Space utilization = 30%.]
How to Improve the Utilization of the Hash Table
• Powerful hash functions
  -> Complexity -> delay on the critical lookup path.
• Adaptive perfect or semi-perfect hash functions
  -> Rehashing of the whole routing table is needed periodically, a very time-consuming process.
• Using multiple hash functions (MHT)
  -> Increase of space efficiency.
• Our proposal: multi-hashing scheme (MHT) + items are allowed to migrate during the insertion operation.
IP Prefix Insertion (Multi-hashing)
[Figure: three hash tables indexed by h1, h2 and h3; used entries are marked.]
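Multi-hash insertion can be sketched as follows. The table sizes, the three toy hash functions and the "least-loaded candidate bucket" policy are assumptions made for illustration; the slide only shows that each prefix has one candidate bucket per hash function.

```python
ROWS = 4       # buckets per table (hypothetical size)
CAPACITY = 1   # entries per bucket (hypothetical size)

# Three toy hash functions standing in for h1, h2, h3 (not the talk's functions).
def h1(key): return key % ROWS
def h2(key): return (key // ROWS) % ROWS
def h3(key): return (key // ROWS ** 2) % ROWS

HASHES = [h1, h2, h3]

def new_tables():
    """One list of empty buckets per hash function."""
    return [[[] for _ in range(ROWS)] for _ in HASHES]

def mht_insert(tables, key):
    """Insert key into the least-loaded of its candidate buckets.
    Returns the table number used, or None if every candidate bucket is full."""
    loads = [(len(tables[i][h(key)]), i) for i, h in enumerate(HASHES)]
    load, i = min(loads)              # ties go to the lowest table number
    if load >= CAPACITY:
        return None                   # unresolved collision
    tables[i][HASHES[i](key)].append(key)
    return i

tables = new_tables()
for key in (0, 16, 64, 128):
    print(key, "->", mht_insert(tables, key))
```

With these toy functions the keys 0, 16 and 64 land in tables 0, 1 and 2, while 128 finds all three candidate buckets full.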
Hashing IP Prefixes: Multi-hashing
[Figure: bucket load vs. bucket index. Space utilization: single hashing (single hash table) 30%; multi-hashing with 3 hash tables 50%.]
Migrations are allowed during the insertion operation
[Figure: entries migrating between the tables of h1, h2 and h3.]
Insertion time?
Hashing Prefixes: MHT + Migrations
[Figure: space utilization. (a) Single hashing (single hash table): 30%. (b) Multi-hashing with 3 hash tables: 50%. (c) Multi-hashing with 3 hash tables + migrations: 70%.]
Crisis: Handling Unresolved Collisions
[Figure: the tables of h1, h2 and h3 backed by a victim TCAM.]
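Insertion with migrations and a victim store might look like the sketch below. The talk does not give the exact migration policy, so a single-level migration is assumed here: when all candidate buckets of a new key are full, one resident is moved to one of its own alternative buckets, and only if that fails does the key go to the victim store (a plain list standing in for the victim TCAM). Sizes and hash functions are hypothetical.

```python
ROWS, CAPACITY = 4, 1   # hypothetical sizes

# Toy stand-ins for h1, h2, h3 (assumptions, not the talk's functions).
HASHES = [
    lambda k: k % ROWS,
    lambda k: (k // ROWS) % ROWS,
    lambda k: (k // ROWS ** 2) % ROWS,
]

def new_tables():
    return [[[] for _ in range(ROWS)] for _ in HASHES]

def try_place(tables, key, skip=None):
    """Put key in its first candidate bucket with a free slot, ignoring table `skip`."""
    for i, h in enumerate(HASHES):
        bucket = tables[i][h(key)]
        if i != skip and len(bucket) < CAPACITY:
            bucket.append(key)
            return True
    return False

def insert(tables, victim, key):
    """Return 'placed', 'migrated' or 'victim' depending on how key was stored."""
    if try_place(tables, key):
        return "placed"
    # All candidate buckets are full: try to move one resident elsewhere.
    for i, h in enumerate(HASHES):
        bucket = tables[i][h(key)]
        for resident in list(bucket):
            if try_place(tables, resident, skip=i):
                bucket.remove(resident)
                bucket.append(key)
                return "migrated"
    victim.append(key)    # unresolved collision
    return "victim"

tables, victim = new_tables(), []
for key in (0, 16, 64, 128, 192):
    print(key, insert(tables, victim, key))
```

Deeper migration chains would resolve more collisions at the cost of longer insertion time, which is the trade-off raised by the "Insertion time?" question earlier in the deck.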
Outline • Introduction • High Speed and Space Efficient Implementation. • Selecting Hashing Bits / Dealing with wildcard bits. • Experimental Evaluation. • Summary
Selecting Hashing Bits from Prefixes
10101110************************ / length = 8 bits
1010111000110111**************** / length = 16 bits
101011100011011110001110******** / length = 24 bits
- No prefix has length < 8 bits.
- Rightmost bits have higher entropy and are more suitable for hashing.
- Routing tables become larger when wildcard bits participate in hashing.
Supporting Wildcard Bits in Hashing
• Current technique: convert each prefix of length x to a set of new prefixes of length L = x + k, so the wildcard bits are eliminated up to length L. Then hash the whole new expanded set of prefixes. [Srinivasan et al.]
  -> Each prefix expands the table by 2^k prefixes.
1010111000110111**************** / length = 16 bits
10101110001101110000 /16
10101110001101110001 /16
…
10101110001101111110 /16
10101110001101111111 /16
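The expansion described on this slide can be reproduced with a short Python sketch. The function name `expand_prefix` is hypothetical; prefixes are represented by their defined bits only.

```python
from itertools import product

def expand_prefix(prefix: str, target_len: int) -> list[str]:
    """Expand a prefix of length x into all 2^k prefixes of length
    L = x + k by enumerating every value of the k missing bits."""
    k = target_len - len(prefix)
    if k <= 0:
        return [prefix]   # already at least L bits long, nothing to expand
    return [prefix + "".join(bits) for bits in product("01", repeat=k)]

expanded = expand_prefix("1010111000110111", 20)   # x = 16, k = 4
print(len(expanded))              # 16
print(expanded[0], expanded[-1])  # 10101110001101110000 10101110001101111111
```

The 2^k growth per prefix is the reason routing tables become larger when wildcard bits take part in hashing.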
Control Wildcard Resolution (CWR)
CWR: Select bits from any carefully predefined positions.
[Figure: for the 16-bit prefix 1010111000110111, resolving the wildcard bits by expansion (10101110001101110000 … 10101110001101111111) gives 16 keys to be inserted; with CWR the prefix maps to the indices 0111011011100, 0111011011101, 0111011011110 and 0111011011111, i.e. 4 keys to be inserted.]
• CWR -> allows a sensitivity analysis that can find optimal configuration points for maximum space efficiency.
• CWR -> faster insertion time per prefix.
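The effect of CWR on the number of inserted keys can be illustrated in Python. In this sketch the selected bit positions (5 to 17) are a hypothetical choice, so the resulting bit patterns differ from the indices shown in the slide; only the counts (4 versus 16) are meant to correspond.

```python
from itertools import product

def cwr_selected_values(prefix: str, positions: list[int]) -> list[str]:
    """Return every value the selected bit positions can take for a prefix
    written with '*' wildcards. Only wildcards that fall inside the selected
    positions are resolved; the others never reach the hash function."""
    chosen = [prefix[p] for p in positions]
    wild = [i for i, bit in enumerate(chosen) if bit == "*"]
    values = []
    for fill in product("01", repeat=len(wild)):
        bits = chosen[:]
        for i, bit in zip(wild, fill):
            bits[i] = bit
        values.append("".join(bits))
    return values

prefix = "1010111000110111" + "*" * 16     # the /16 prefix used in the slides

# Hypothetical bit-select configuration: positions 5..17 (two wildcard bits).
print(cwr_selected_values(prefix, list(range(5, 18))))

# Selecting the first 20 bits instead resolves four wildcard bits.
print(len(cwr_selected_values(prefix, list(range(20)))))   # 16
```

Moving the selected window left or right changes how many wildcard bits are resolved per prefix, which is the knob a sensitivity analysis over bit-select configurations would turn.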
Outline • Introduction • High Speed and Space Efficient Implementation. • Selecting hashing bits / Dealing with wildcard bits • Experimental Evaluation. • Summary
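Lookup Architecture
[Figure: a bit-select mechanism takes the incoming packet's IP address (32 bits) and extracts R+F bits (selected bits for index generation), which produce the R-bit hash index, and T+F bits (TAG), which form the tag to match; the matching results feed the LPM logic.]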
Sensitivity Analysis: Different Bit-select Configurations
• Advantage over the standard MHT scheme.
• Very small deviation of the points around the trend line -> a practical guarantee that the unresolved collisions will not be far from an estimated value.
Space Efficiency: Comparison
Load Factor = Routing table size / Available space capacity
Power Consumption
Even with load factor = 0.5:
- 8x more power efficient than TCAM
- 2x more power efficient than IPStash
Victim TCAM Space Requirements
The percentage of "unresolved collisions" is an accurate estimator of the victim space that is required for the corresponding load factor.
Summary
• IP lookup using TCAMs is expensive.
• Current hash-based approaches are promising but are either space inefficient or limited by low lookup throughput.
The proposed h/w-based multi-hash lookup scheme has:
1. High-speed lookup throughput. Requires 1 memory access time per packet lookup.
2. Space efficiency. Effective load factor of 70% with < 5% victim TCAM.
3. Low power consumption and cost. 8x less power than dynamic TCAMs. Best among hash-based schemes. Simple and easy hardware implementation.
4. Scalability to future routing table sizes and the IPv6 transition. All methods and techniques used scale well.
Questions
Source code: www.cs.pitt.edu/~socrates/HBip