
Memory-Efficient and Scalable Virtual Routers Using FPGA



Presentation Transcript


  1. Memory-Efficient and Scalable Virtual Routers Using FPGA Authors: Hoang Le, Thilan Ganegedara and Viktor K. Prasanna Publisher: ACM/SIGDA FPGA '11 Presenter: Zi-Yang Ou Date: 2011/10/12

  2. About FPGA’11

  3. Introduction • Network virtualization is a technique that consolidates multiple networking devices onto a single hardware platform. • It makes efficient use of networking resources. • It abstracts the network functionality away from the underlying physical network. • Two designs are compared: separated (one lookup table per virtual router) and merged (a single shared lookup table).

  4. Contributions • A simple merging algorithm that makes the total required memory depend on the total number of virtual prefixes rather than on the number of routing tables (a sketch of the idea follows the Merging Algorithm slides below). • A tree-based architecture for IP lookup in a virtual router that achieves high throughput and supports quick updates. • Use of external SRAMs to support large virtual routing tables of up to 16M virtual prefixes. • A scalable design with linear storage complexity and resource requirements.

  5. Set-Bounded Leaf-Pushing Algorithm • To use tree search algorithms, the given set of prefixes must be processed to eliminate overlap between prefixes. This elimination produces a set of disjoint prefixes. • 1. Build a trie from the given routing table. • 2. Grow the trie into a full tree. • 3. Push all prefixes to the leaf nodes. • The prefix tables expand by about 1.6x after leaf pushing. • About 90% of the prefixes are leaf prefixes.

  6. Set-Bounded Leaf-Pushing Algorithm • There are 3 steps in the algorithm: 1. Move the leaves of the trie into Set 1. 2. Trim the leaf-removed trie. 3. Leaf-push the resulting trie and move its leaves into Set 2. • N_S1 = l·N, N_S2 = k·N·(1 − l) • N′ = N_S1 + N_S2 = l·N + k·N·(1 − l) = N·(k + l − k·l) • With l = 0.9 and k = 1.6: N′ = 1.06N
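A minimal Python sketch of these three steps, assuming a plain binary trie over bit-string prefixes; the names (TrieNode, set_bounded_leaf_push, etc.) are illustrative and not from the paper, and any overlap between a Set 1 prefix and a pushed Set 2 prefix is assumed to be resolved at lookup time by preferring the more specific Set 1 match.

    class TrieNode:
        def __init__(self):
            self.child = [None, None]   # 0-branch and 1-branch
            self.nhi = None             # next-hop info if a prefix ends here

    def build_trie(table):
        """table: list of (prefix_bits, next_hop), e.g. ("1101", "P1")."""
        root = TrieNode()
        for bits, nhi in table:
            node = root
            for b in bits:
                i = int(b)
                if node.child[i] is None:
                    node.child[i] = TrieNode()
                node = node.child[i]
            node.nhi = nhi
        return root

    def is_leaf(n):
        return n.child[0] is None and n.child[1] is None

    def set_bounded_leaf_push(root):
        set1, set2 = [], []

        # Step 1: move the prefixes stored at the leaves of the trie into Set 1.
        def collect(n, path):
            if is_leaf(n):
                if n.nhi is not None:
                    set1.append((path, n.nhi))
                return
            for i in (0, 1):
                if n.child[i] is not None:
                    collect(n.child[i], path + str(i))
        collect(root, "")

        # Step 2: trim the leaf-removed trie.
        def trim(n):
            if n is None or is_leaf(n):
                return None
            n.child[0] = trim(n.child[0])
            n.child[1] = trim(n.child[1])
            return n
        root = trim(root)

        # Step 3: leaf-push the trimmed trie (expand it into a full tree,
        # pushing each prefix down) and move the resulting leaves into Set 2.
        def push(n, path, inherited):
            if n is None:               # expanded branch: emit inherited prefix
                if inherited is not None:
                    set2.append((path, inherited))
                return
            nhi = n.nhi if n.nhi is not None else inherited
            if is_leaf(n):
                if nhi is not None:
                    set2.append((path, nhi))
                return
            push(n.child[0], path + "0", nhi)
            push(n.child[1], path + "1", nhi)
        push(root, "", None)

        return set1, set2

For example, with table = [("0", "A"), ("01", "B"), ("011", "C")], the leaf prefix 011→C lands in Set 1, while 0→A and 01→B are leaf-pushed into the disjoint Set 2 prefixes 00→A and 01→B.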

  7. 2-3 Tree

  8. Merging Algorithm

  9. Merging Algorithm

  10. Merging Algorithm

  11. Merging Algorithm

  12. Merging Algorithm

  13. Merging Algorithm
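The Merging Algorithm slides above are figures in the original deck. As a rough, hedged illustration of the idea stated on the Contributions slide, the sketch below tags each disjoint prefix with its virtual router ID and stores all tagged entries in one shared structure; the function names and the flat sorted list are stand-ins for the paper's 2-3 tree, not its exact algorithm.

    def merge_virtual_tables(tables):
        """tables: dict {vid: [(prefix_bits, next_hop), ...]} of disjoint prefixes."""
        merged = []
        for vid, prefixes in tables.items():
            for bits, nhi in prefixes:
                # Key each entry by (VID, prefix) so prefixes from different
                # virtual routers can never collide in the shared structure.
                merged.append((vid, bits, nhi))
        # One shared, sorted structure; a sorted list stands in for the paper's
        # 2-3 tree. Its size tracks the total number of virtual prefixes,
        # not the number of routing tables that contributed them.
        merged.sort(key=lambda e: (e[0], e[1]))
        return merged

    def lookup(merged, vid, addr_bits):
        """Match one virtual router's disjoint prefixes; at most one entry can
        match, so a linear scan stands in for the tree search."""
        for v, bits, nhi in merged:
            if v == vid and addr_bits.startswith(bits):
                return nhi
        return None

For instance, with tables = {0: [("00", "A"), ("01", "B")], 1: [("0", "C")]}, lookup(merge_virtual_tables(tables), 1, "0110") returns "C".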

  14. IP Lookup Algorithm for Virtual Router

  15. IP Lookup Algorithm for Virtual Router

  16. Memory Requirement

  17. Memory Requirement • M1 = |S1|(L + L_P + L_VID + L_NHI + 2L_Ptr1), M2 = |S2|(L + L_P + L_VID + L_NHI + 2L_Ptr2) • M = M1 + M2 = (|S1| + |S2|)(L + L_P + L_VID + L_NHI) + 2|S1|L_Ptr1 + 2|S2|L_Ptr2 • M = (|S1| + |S2|)(L + L_P + log m + L_NHI) + |S1|L_Ptr1 + |S2|L_Ptr2 • With |S1| + |S2| = 1.06N, the memory requirement is O(N × log m).

  18. Overall Architecture

  19. Overall Architecture

  20. Overall Architecture

  21. Memory Management

  22. Virtual Routing Table Update

  23. Scalability • The per-level memory requirement grows exponentially from one level of the tree to the next. Therefore, the last few pipeline stages can be moved onto external SRAMs.
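A quick back-of-the-envelope check of that claim, using a full binary tree as a stand-in (one tree level per pipeline stage; the actual structure is a 2-3 tree, so real per-level counts lie between 2^i and 3^i):

    # For a full binary tree of depth D, level i holds 2**i nodes, so the last
    # few levels account for nearly all of the memory.
    D = 20
    per_level = [2 ** i for i in range(D + 1)]
    share = sum(per_level[-3:]) / sum(per_level)
    print(f"last 3 of {D + 1} levels hold {100 * share:.1f}% of the nodes")
    # -> about 87.5%, which is why moving only the last few stages to external
    #    SRAM frees most of the on-chip memory.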

  24. IMPLEMENTATION • M = (|S1| + |S2|)(L + L_P + L_VID + L_NHI) + |S1|L_Ptr1 + |S2|L_Ptr2 • M = N_S(L + L_P + L_VID + L_NHI + L_Ptr) • M_IPv4 = N_S(32 + 5 + 5 + 6 + 20) = 68N_S • M_IPv6 = N_S(128 + 7 + 5 + 6 + 20) = 164N_S • A state-of-the-art FPGA device with 36 Mb of on-chip memory (e.g. Xilinx Virtex-6) can support up to 530K prefixes for IPv4, or up to 220K prefixes for IPv6, without using external SRAM.
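The on-chip capacity numbers follow from the per-entry widths stated above; a quick check, assuming the slide's 36 Mb means 36 × 10^6 bits:

    # Per-entry widths and on-chip memory as given on the slide.
    bits_per_entry = {"IPv4": 68, "IPv6": 164}   # M = N_S * bits_per_entry
    on_chip_bits = 36 * 10**6                    # 36 Mb BRAM, e.g. Xilinx Virtex-6

    for proto, width in bits_per_entry.items():
        capacity = on_chip_bits // width         # prefixes that fit on chip
        print(f"{proto}: {width} bits/entry -> ~{capacity:,} prefixes")
    # -> IPv4: 68 bits/entry -> ~529,411 prefixes (the slide's ~530K)
    # -> IPv6: 164 bits/entry -> ~219,512 prefixes (the slide's ~220K)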

  25. IMPLEMENTATION • In our design, external SRAMs can be used to handle even larger routing tables by moving the last stages of the pipelines onto external SRAMs. • With external SRAM, the architecture can support up to 16M prefixes for IPv4, or up to 880K prefixes for IPv6.

  26. Experimental Setup

  27. Experimental Setup

  28. Performance Comparison • Candidates: trie overlapping (A1) and trie braiding (A2). • Time complexity: our algorithm O(N); A1: O(N log N); A2: O(N^2). • Memory efficiency: our algorithm: 15 MB for 1.3M prefixes; A1: 9 MB for 290K prefixes; A2: 4.5 MB for 290K prefixes. • Quick-update capability: A1 and A2 must reconstruct the entire lookup data structure on an update.
