290 likes | 474 Views
An On-Chip IP Address Lookup Algorithm. Authors:- Xuehong Sun Yiquiang Q. Zhao. Why is this algorithm in demand. Internet system consists of internet nodes and transmission media. Use of Optical fibers in transmission media. Goal of Optical transmission media. Advantages.
E N D
An On-Chip IP Address Lookup Algorithm Authors:- Xuehong Sun Yiquiang Q. Zhao
Why is this algorithm in demand • Internet system consists of internet nodes and transmission media. • Use of Optical fibers in transmission media. • Goal of Optical transmission media.
Advantages • On-Chip memory access latency is very low. • Larger bus width to on-chip memory than to off-chip memory. • Number of pins in the chip is smaller if the memory goes on chip rather than off chip. • This algorithm uses small amount of memory.
Limitations • Memory cannot be large for on-chip system.
Prefix Compression Algorithm • This IP address lookup algorithm uses small amount of memory. With this merit, this can be implemented on one single chip. • Main approach of author is to convert longest prefix match problem into a range search problem and use a tree structure to do this search. • One of the main contribution for this is the development of a prefix compression algorithm for compactly storing IP address lookup table.
Techniques used in prefix compression algorithm • Compress keys in the tree node. • Use a shared pointer in a tree node. • Use bottom-up process from leaf to root scheme to build the tree.
IP Address Lookup problem • Internet Protocol defines a mechanism to forward internet packets. Each packet has an IP address. • In an Internet node (router), there is an IP address lookup table (forwarding table) which associates any IP destination address with an output port number (or next hop address). • So IP address lookup problem is to study how to construct data structure to accommodate the routing table so that we can find the output port number quickly.
IPv4 Address • An IPv4address is 32 bits long. • It can be represented in dotted-decimal notation: 32 bits are divided into 4 groups of 8bits with each group separated by dots. • IP address is partitioned into 2 parts: a network prefix and host number on that network.
IPv6 Address • This has solved IPv4 space exhaustion problem. • IPv6 is 128 bits long. • It has 8 16-bit pieces of address which uses hexadecimal format.
Converting longest prefix match to Range search problem • Prefix represents address(es) in a range. • If [b,e) is range of prefix, then b,e are endpoints. • Two distinct prefixes share atmost one endpoint. • If two consecutive ranges have same port, shared endpoint can be eliminated and 2 ranges can be merged into 1 range.
A mini routing table implement range search conversion problem
Endpoints with corresponding ports obtained by longest prefix match
Put the tree in memory After putting tree in memory, endpoints are compressed, shared pointer is used and bottom up approach is used to build a range tree in the memory. Compressed endpoints:- the more leading bits and trailing zeros there are, the better the endpoints are compressed. Real-Life iPv4 endpoints
Treeinmemory Date structure for a tree node
Discussion of above data structure • First field is 1-bit, indicates if node is internal node or a leaf node. • Number of keys, is number of endpoints stored in node. • Number of Skip bits, indicates number of head bits to skip in IP address. • Number of trailing zeros, are number of 0’s to ignore at the trail of IP address. • Keys, will have endpoints stored. • Next tree pointer, is pointer indexing to next level node of tree structure.
Build the tree from bottom up • Given a set of sorted endpoints, we create a tree structure from it. For this we adopt bottom up approach. First, we assign endpoints to leaf nodes and then to next highest level until root level.
Variants for tree creation • Several variants will be discussed for tree creation. They differ in calculating compressed keys and select endpoints to store in higher level nodes. • All these variants will have trade-offs between memory size and memory access.
Variant one • if (e1,e2,e3,…,en} be set of endpoints stored in a tree. Assume the first 4 endpoints {e1,e2,e3,e4} are stored in a leaf node, then endpoint {e5} is stored in next higher level node. Assume if next 3 endpoints {e6,e7,e8} are stored in second leaf node, then next endpoint {e9} will be stored in next higher level an so on. • For this scheme, endpoints in next higher level node must be involved in leaf nodes to calculate compressed keys. • Time to create tree structure is O(N). • Uses less memory and has high memory access rate.
Variant two • Difference between variants one and two is that a new endpoint is created to store in the higher level node rather than an existing endpoint. • If {e1,e2,e3,…,en} are endpoints. Assume {e1,e2,e3} are stored in 1st leaf node, {e4,e5,e6} are stored in 2nd leaf node and {e7,e8} are stored in next leaf node and so on. • 1st endpoint to be stored in higher level is simply the common leading bits of {e1,e2,e3} padded with trailing 0’s to form a 32 bit endpoint h1, which is stored in higher level node. 2nd endpoint is created according to e3,e4 and number of common leading bits of {e4,e5,e6}. And this goes on till tree is built to root.
Variant three • This is a combination of variants one and two. We use variant one first. When we are in a level of tree approaching the root, we may change to variant two.
Variantfour • This uses a frontend array. This divides all endpoints into groups depending up on their leading bits. • In each group, a tree structure is created. • Needs more memory, but less memory access time. • When number of empty groups is large, large amount of memory is wasted. So tree is used in frontend instead of array. • So, variant one is used to create search tree. Search result is a pointer to a subtree instead of port entry. Then subtree is created using variant one or two. So this is a layered approach which can be applied recursively.
Optimizations • level one (root) always occupy first row and level2 always start from next row. So pointer in root always points to next row, therefore, is not necessary. Removing this pointer saves 20 bits for root so that we can store more end points for root. • Put the root in the registers. In general number of endpoints in root could be small. In this case we can save endpoints in registers, thus saving one memory access.
Experimental study • If two consecutive intervals have same port, the two consecutive intervals can be merged into one interval.
Conclusion • The main merit of this algorithm is that, it has a very small memory requirement. • With this merit a routing table can be put into a single chip, thus, the memory latency to access the routing table can be reduced. • Experimental analysis shows that, given memory width of 144 bits, our algorithm needs only 400kb memory for storing a 20k entry IPv4 routing table and 5 memory accesses for a search.