Parallel IP Lookup using Multiple SRAM-based Pipelines

This paper presents a parallel SRAM-based architecture for IP lookup that uses multiple pipelines for improved throughput. The architecture balances memory across pipelines and traffic among them, the latter with a flow pre-caching scheme, and maintains intra-flow packet order via payload exchange. Experimental results validate the effectiveness of the proposed design in addressing major challenges for IP lookup engines. Future work includes extending the architecture to multidimensional packet classification and deep packet inspection.

Presentation Transcript


  1. Parallel IP Lookup using Multiple SRAM-based Pipelines Authors: Weirong Jiang and Viktor K. Prasanna Presenter: Yi-Sheng Lin (林意勝) Date: 2008.12.10 Publisher/Conf.: IEEE International Symposium on Parallel and Distributed Processing (IPDPS 2008) Dept. of Computer Science and Information Engineering, National Cheng Kung University, Taiwan, R.O.C.

  2. Outline • Introduction • Related Work • Architecture Overview • Memory Balancing • Traffic Balancing • Performance Evaluation • Conclusion

  3. Introduction • Multiple pipelines can be utilized in parallel to further improve throughput. • The memory distribution over the different pipelines, as well as across the stages of each pipeline, must be balanced. • The traffic among the pipelines should also be balanced; IP/prefix caching can exploit the locality of Internet traffic [1, 14]. [1] M. J. Akhbarizadeh, M. Nourani, R. Panigrahy, and S. Sharma. A TCAM-based parallel architecture for high-speed packet forwarding. IEEE Trans. Comput., 56(1):58–72, 2007. [14] D. Lin, Y. Zhang, C. Hu, B. Liu, X. Zhang, and D. Pao. Route table partitioning and load balancing for parallel searching with TCAMs. In Proc. IPDPS '07, pages 1–10.

  4. Introduction • Caching may fail to capture traffic locality due to the long pipeline delay. → A flow pre-caching scheme benefits from deep pipelining, since it utilizes the caching inherent in the architecture. • Intra-flow packets may go out of order. → An approach called payload exchange, which exploits the pipeline delay, is used to maintain the intra-flow packet order.

  5. Related Work

  6. Related Work • Most published parallel IP lookup engines are TCAM-based: they partition the full routing table into several blocks and search the blocks in parallel. Ex: trie-based approaches split the trie by carving out subtries. [24] F. Zane, G. J. Narlikar, and A. Basu. CoolCAMs: Power-efficient TCAMs for forwarding engines. In Proc. INFOCOM '03, pages 42–52.
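
To make the "carving subtries" idea concrete, here is a minimal Python sketch that builds a uni-bit binary trie and cuts it into subtries at a fixed depth. The TrieNode class, the split depth, and the sample prefixes are illustrative assumptions; the partitioning heuristics used in [24] and in this paper are more involved.

```python
# Minimal sketch: build a uni-bit binary trie and carve subtries at a
# fixed depth. All names and parameters here are illustrative.

class TrieNode:
    def __init__(self):
        self.children = {}    # '0' / '1' -> TrieNode
        self.next_hop = None  # set if a prefix ends at this node

def insert(root, prefix, next_hop):
    """Insert a binary prefix string such as '110' into the trie."""
    node = root
    for bit in prefix:
        node = node.children.setdefault(bit, TrieNode())
    node.next_hop = next_hop

def carve_subtries(root, depth):
    """Return (path, subtrie_root) pairs by cutting the trie at `depth`."""
    subtries, stack = [], [("", root)]
    while stack:
        path, node = stack.pop()
        if len(path) == depth or not node.children:
            subtries.append((path, node))  # this node roots a subtrie
            continue
        for bit, child in node.children.items():
            stack.append((path + bit, child))
    return subtries

root = TrieNode()
for p, nh in [("0", 1), ("10", 2), ("110", 3), ("111", 4)]:
    insert(root, p, nh)
print(sorted(path for path, _ in carve_subtries(root, 2)))  # ['0', '10', '11']
```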

  7. Architecture Overview

  8. Architecture Overview • Lookup Engines: • The routing table is constructed as a leaf-pushed uni-bit trie. • Several small memories called Destination Index Tables (DITs) store the mapping between subtries and pipelines. • By searching the DIT, a packet also retrieves the address of its subtrie's root in the first stage of the assigned pipeline. • Each pipeline employs a multi-port queue to handle access conflicts when multiple incoming packets are directed to the same pipeline.
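
A rough sketch of the DIT dispatch step described above, assuming the initial bits of the destination address index the table; the stride K, the entry layout, and the sample values are illustrative, not the paper's exact design.

```python
# Illustrative DIT dispatch: the top K bits of the destination IP select a
# DIT entry, which names the pipeline and the subtrie root's address in
# that pipeline's first stage. K and the entries below are assumptions.

K = 4  # hypothetical initial stride used to index the DIT

dit = {
    0b0000: (0, 0x00),  # subtrie rooted at address 0x00 of pipeline 0
    0b0001: (0, 0x10),
    0b1010: (1, 0x00),
    # ... one entry per subtrie
}

def dispatch(dst_ip: int):
    """Pick the pipeline and subtrie-root address for a 32-bit address."""
    index = dst_ip >> (32 - K)           # top K bits select the subtrie
    pipeline_id, root_addr = dit[index]  # mapping stored in the DIT
    return pipeline_id, root_addr

print(dispatch(0b1010 << 28))  # -> (1, 0), i.e. pipeline 1, root at 0x00
```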

  9. Architecture Overview • Load Balancer: • Caching is an efficient way to exploit Internet traffic locality for parallel IP lookup. • Define a sequence of packets with the same destination IP address as a flow. • We propose a scheme called flow pre-caching, which allows the destination IP address of a flow to be cached before its next-hop information is retrieved. • If an intra-flow out-of-order packet is detected, a task to exchange the payloads of the out-of-order packets is initiated.
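
Below is a minimal sketch of the flow pre-caching idea: a flow's destination IP enters the cache as soon as its first packet is dispatched to a pipeline, before the next hop is known, so trailing packets of the same flow do not launch duplicate lookups. The dictionary cache and the function names are assumptions for illustration; a hardware load balancer would use a fixed-size cache with its own replacement policy.

```python
# Illustrative flow pre-caching. cache maps dst_ip -> next_hop, with None
# meaning the lookup for this flow is still in flight in some pipeline.

cache = {}

def on_packet(dst_ip):
    if dst_ip in cache:
        if cache[dst_ip] is not None:
            return ("hit", cache[dst_ip])  # resolved: forward immediately
        return ("in_flight", None)         # pre-cached: ride the pending lookup
    cache[dst_ip] = None                   # pre-cache before any result exists
    return ("miss", None)                  # dispatch into a pipeline

def on_lookup_done(dst_ip, next_hop):
    cache[dst_ip] = next_hop               # fill the entry when a result returns

print(on_packet(0x0A000001))   # ('miss', None): first packet of the flow
print(on_packet(0x0A000001))   # ('in_flight', None): pre-cache hit
on_lookup_done(0x0A000001, 7)
print(on_packet(0x0A000001))   # ('hit', 7)
```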

  10. Memory Balancing
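
The memory-balancing slides are figure-based. As an illustrative stand-in for the balancing goal (not necessarily the algorithm evaluated in these figures), a greedy largest-first assignment of subtries to pipelines keeps the per-pipeline memory roughly even:

```python
# Greedy largest-first balancing sketch: repeatedly give the largest
# remaining subtrie to the least-loaded pipeline. Sizes are made up.

import heapq

def assign_subtries(subtrie_sizes, num_pipelines):
    """Map each subtrie index to a pipeline, largest subtries first."""
    heap = [(0, p) for p in range(num_pipelines)]  # (used memory, pipeline)
    heapq.heapify(heap)
    mapping = {}
    for i in sorted(range(len(subtrie_sizes)),
                    key=lambda i: subtrie_sizes[i], reverse=True):
        used, p = heapq.heappop(heap)  # least-loaded pipeline so far
        mapping[i] = p
        heapq.heappush(heap, (used + subtrie_sizes[i], p))
    return mapping

print(assign_subtries([90, 60, 50, 30, 20], num_pipelines=2))
# -> {0: 0, 1: 1, 2: 1, 3: 0, 4: 1}: pipeline totals 120 vs. 130
```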

  11. Memory Balancing • Experimental results

  12. Memory Balancing

  13. Memory Balancing • Experimental results

  14. Traffic Balancing • Flow Pre-Caching

  15. Traffic Balancing • Detecting out-of-order packets
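
As an illustration of what detecting out-of-order packets involves, here is a minimal sketch that stamps each packet with a per-flow sequence number at pipeline ingress and checks it at egress; the counters and the trigger are assumptions, not the paper's exact mechanism, and the payload exchange itself is omitted.

```python
# Illustrative per-flow order check: stamp at ingress, verify at egress.

from collections import defaultdict

ingress_seq = defaultdict(int)  # next sequence number to stamp, per flow
egress_seq = defaultdict(int)   # next sequence number expected, per flow

def stamp(flow_id):
    """At ingress: tag the packet with its position within its flow."""
    seq = ingress_seq[flow_id]
    ingress_seq[flow_id] += 1
    return seq

def check(flow_id, seq):
    """At egress: flag any packet departing out of its flow's order."""
    expected = egress_seq[flow_id]
    egress_seq[flow_id] += 1
    return seq == expected  # False would trigger a payload exchange

a = stamp("flowA")
b = stamp("flowA")
print(check("flowA", b))  # False: packet b departed ahead of packet a
print(check("flowA", a))  # False: packet a departed after its slot
```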

  16. Performance Evaluation

  17. Performance Evaluation

  18. Performance Evaluation

  19. Performance Evaluation

  20. Performance Evaluation

  21. Conclusion • This paper proposed a parallel SRAM-based multi-pipeline architecture for terabit trie-based IP lookup. • Memory balancing, traffic balancing, and intra-flow packet ordering were identified as the three major problems. • Our future work includes applying the SRAM-based pipeline architecture to multidimensional packet classification and deep packet inspection.
