250 likes | 491 Views
FlashTrie : Hash-based Prefix-Compressed Trie for IP Route Lookup Beyond 100Gbps. Author : Masanori Bando and H. Jonathan Chao Publisher : INFOCOM, 2010 Presenter : Jo-Ning Yu Date : 2011/02/16. Introduction FlashTrie Prefix-Compressed Trie HashTune Membership Query Module
E N D
FlashTrie: Hash-based Prefix-Compressed Triefor IP Route Lookup Beyond 100Gbps Author : Masanori Bando and H. Jonathan Chao Publisher : INFOCOM, 2010 Presenter : Jo-Ning Yu Date : 2011/02/16
Introduction • FlashTrie • Prefix-Compressed Trie • HashTune • Membership Query Module • Lookup operation • Architecture • Update • Performance evaluation Outline
Tree Bitmap have been used in today’s high-end routers. However, their large data structure often requires multiple external memory accesses for each route lookup. In this paper, we propose a new IP route lookup architecture called FlashTrie that overcomes the shortcomings of the multibit-trie based approach. We also develop a new data structure called Prefix-Compressed Trie that reduces the size of a bitmap by more than 80%. We use a hash-based membership query to limit off-chip memory accesses per lookup to one and to balance memory utilization among the memory modules. Introduction
overview FlashTrie Architecture
Construction of the PC-Trie has two rules: All sibling nodes must be filled with NHI if at least one node in the node set contains NHI. The parents node set can be deleted if two child node sets exist. Prefix-Compressed Trie(PC-Trie)
The main difference between Tree Bitmap and our data structure is that a bit in the Tree Bitmap represents only one node, while a bit represents more than one node in PC-Trie. Prefix-Compressed Trie(PC-Trie)
Advantage Tree Bitmap, the internal bitmap : 2stride+1−1bits. PC-Trie n, 2(s+1−log(n))−1 bits, s is the stride size (in bits). For example, a PC-Trie 8 and 8-bit stride (n = 8, s = 8) : Tree Bitmap = 28+1−1 = 29−1 = 511 bits PC-Trie = 2(8+1−log(8))−1 = 2(9−3)−1 = 26−1 = 63 bits Drawback NHI may need to be duplicated as the figure shows. However, the number of memory slots needed for the NHI table is reduced. Because the size of a NHI is so small that multiple NHI can be stored into a single memory slot. Prefix-Compressed Trie(PC-Trie)
The hash table has two different entries: no-collision: Least-Significant Bit of the hash table entry is set to “0”. collision: Least-Significant Bit of the hash table entry is set to “1”. The collided items are stored in Black Sheep (BS) memory. Membership Query Module
no-collision : If the Verify Bits are matched with the input IP address, then the Verify Bits plus the hash result becomes the PC-Trie address for the input IP address. collision : We have multiple on-chip BS memory modules that are accessed in parallel to void access one BS memory multiple times. Based on simulation results, in the worst case IPv4 : five BS memory modules are needed. IPv6 : seven BS memory modules are needed. Membership Query Module
We use HashTune as the hash function because it has a compact data structure and better memory utilization. Advantages: Key distribution is more uniform over the entire hash table. The size of Verify Bits can be reduced. HashTune
The entire hash table is segmented into multiple small hash tables called groups and all groups have the same number of bins. Each group is allowed to select a different hash function from the pool of hash functions. The selected hash function ID is stored in a Hash ID Table, which is also stored in on-chip memory and used for query operations. HashTune
Each group is assigned an ID called Group ID and the ID is selected from LSBs of root nodes in each sub-trie. For example : Resolving 17 bits input and choosing 8 LSBs as the group ID gives us the remaining 9 bits to be stored as Verify Bits. As a result, the Verify Bit size and on-chip memory requirements are reduced. HashTune
IPv4 Prefix Distribution from 2002 to 2009. Sources from the RouteViews project , Oregon, USA. Prefix Distribution • We select three levels based on the prefix distribution of figure. • They are IPv4/16 (16 Most-Significant Bits of an IPv4 address), IPv4/17 (MSB 17 bits to 24 bits), and IPv4/25 (MSB 25 bits to 32 bits).
IPv6 Prefix Distribution. Sources from the RouteViews project , Oregon, USA and expected future IPv6 routing tables. Prefix Distribution • The number of routes increases every 8 bits, and we can observe major prefix lengths in that 8-bit region (e.g., /32, /40, and /48). • Thus, we also select stride size 8 for IPv6.
The input 32-bit IPv4 address is categorized in IPv4/16, IPv4/17, and IPv4/25. IPv4/16 is resolved using Direct Lookup(on-chip), and IPv4/17 and IPv4/25 are resolved using the Membership Query(on-chip) and PC-Trie(off-chip). Lookup Operation
FlashTrie Architecture for IPv4 The contents of the bitmap is “1,” which means NHI exists for the input “100”. Architecture Suppose “100,” is the input to this PC-Trie Since the LSB of the input is “0”, E is selected as the final NHI. • For PC-Trien, more than one (log2(n)) bit from the LSB of the input destination IP address is used to find the final NHI.
Performance Evaluation stride size = 8 bits 279, 117 routes 279, 117 routes
Memory Requirements On-Chip Memory: Direct Lookup for up to /16. Hash table for membership queries. Hash ID table (storing a hash ID for each group). Black Sheep memories for collided items in the hash table. Performance Evaluation
Memory Requirements • Off-Chip Memory: • PC-Trie • sub-trie size : (stride size = 8 bits) • Tree Bitmap = 1063 bits (internal bitmap + external bitmap + pointers) • PC-Trie8 = 83 bits (PC-Trie bitmap + pointer) Performance Evaluation
Memory Requirements Off-Chip Memory: NHI Performance Evaluation
Memory Requirements It is evident from the result that the reduction in bitmap size is more than 80% (for higher compression degree of PC-Trie). Performance Evaluation
Lookup Speed and Timing Analysis Performance Evaluation