
ECE 526 – Network Processing Systems Design




Presentation Transcript


  1. ECE 526 – Network Processing Systems Design • Network Processor Tradeoffs and Examples • Chapter: D. E. Comer

  2. Outline • Network Processor design tradeoffs • Sample Network Processor

  3. NP Architecture • Numerous different design goals • Performance • Cost • Functionality • Programmability • Numerous different system choices • Use of parallelism • Types of memories • Types of interfaces • Etc. • We consider • Design tradeoffs at a high level (qualitative tradeoffs) • Commercial Network Processors

  4. Processor Topologies • How can processors be arranged on an NP? • Consider heterogeneity of processing resources and workload • Multiprocessor • Parallel processors with shared interconnect • Problems? • Pipeline • Multiple processors per data path • Problems? • Data Flow Architecture • Extreme form of pipelining • Problems? • Heterogeneous Architectures
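
The multiprocessor and pipeline arrangements on this slide can be compared with a rough throughput model. The sketch below is illustrative only: the function names and stage times are hypothetical, and it ignores the shared-interconnect contention and inter-stage hand-off costs that the slide's "Problems?" prompts point at.

```python
def pool_throughput(stage_times, n_processors):
    """Parallel pool: every processor runs all stages on its own packet."""
    return n_processors / sum(stage_times)


def pipeline_throughput(stage_times):
    """Pipeline: one processor per stage, so the slowest stage sets the rate."""
    return 1.0 / max(stage_times)


# Hypothetical per-stage processing times in microseconds (not from the slides).
stage_times_us = [2.0, 5.0, 3.0]

pool = pool_throughput(stage_times_us, n_processors=3)
pipe = pipeline_throughput(stage_times_us)
print(f"pool: {pool:.2f} packets/us, pipeline: {pipe:.2f} packets/us")
```

With unbalanced stages the pipeline is held back by its slowest stage, while the pool is not. With perfectly balanced stages the two models give the same rate, so in this simplified view the choice depends on how evenly the workload splits.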

  5. Design Tradeoffs (1) • Low development cost vs. performance • ASICs give higher performance, but take time to develop • NPs allow faster development, but might give lower performance • Programmability vs. processing speed • Similar to tradeoff between ASIC and NP • Co-processors pose the same tradeoffs • Complexity of instruction set • Performance: packet rate, data rate, and bursts • Difficult to assess the performance of a system • Even more difficult to compare different systems • Per-interface rate vs. aggregate data rate • NP usually limited to one port

  6. Design Tradeoffs (2) • NP speed vs. bandwidth • How much processing power per bandwidth is necessary? • Depends on application complexity • Coprocessor design: look-aside vs. flow-through • Look-aside: “called” from main processor, needs state transfer • Flow-through: all traffic streams through coprocessor • Pipelining: uniform vs. synchronized • Pipeline stages can take different times • Tradeoff between slowing all stages down and adding synchronization • Explicit parallelism vs. cost and programmability • Hidden parallelism is easier to program • Explicit parallelism is cheaper to implement

  7. Design Tradeoffs (3) • Parallelism: scale vs. packet ordering • Why is packet order important? • Giving up the packet order constraint gives better throughput • Parallelism: speed vs. stateful classification • Shared state requires synchronization • Limits parallelism • Memory: speed vs. programmability • Using different types of memory improves performance • Increases difficulty in programming • I/O performance vs. pin count • Packaging can be major cost factor • More pins give higher performance
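
The cost of keeping packet order under parallelism can be illustrated with a small reorder buffer. This is a minimal sketch under stated assumptions, not a mechanism described in the slides: packets are assumed to carry consecutive sequence numbers starting at 0, and `restore_order` is a hypothetical helper name.

```python
def restore_order(completions):
    """Release packets in sequence order given out-of-order completions.

    Returns the released sequence and the largest number of packets that
    had to be held at once (a rough proxy for buffering cost).
    """
    pending = set()
    next_seq = 0
    released = []
    max_held = 0
    for seq in completions:
        pending.add(seq)
        max_held = max(max_held, len(pending))
        # Drain every packet that is now next in line.
        while next_seq in pending:
            pending.remove(next_seq)
            released.append(next_seq)
            next_seq += 1
    return released, max_held


# Hypothetical completion order from three parallel processors.
order, held = restore_order([2, 1, 0, 3])
print(order, held)
```

One slow packet forces every later packet to wait in the buffer, which is why dropping the ordering constraint can raise throughput. The usual reason to keep it is that some transport protocols, TCP for example, can interpret reordered segments as a sign of loss.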

  8. Design Tradeoffs (4) • Programming languages • Ease of programming vs. functionality vs. speed • Multithreading: throughput vs. programmability • Threads improve performance • Threads require more complex programs and synchronization • Traffic management vs. blind forwarding at low cost • Traffic management is desirable but requires processing • Generality vs. specific architecture role • NPs can be specialized for access, edge, core • NPs can be specialized towards certain protocols • Memory type: special-purpose vs. general-purpose • SRAM and DRAM vs. CAM
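
Why threads improve performance can be shown with a simple latency-hiding model: while one thread waits for memory, another can use the processor. The sketch below is an idealized model, not a description of any particular NP. It assumes zero-cost context switches and that every thread alternates a fixed compute burst with a fixed memory wait; the cycle counts are hypothetical.

```python
import math


def engine_utilization(n_threads, compute_cycles, memory_wait_cycles):
    """Fraction of time the processor does useful work with n_threads threads."""
    busy = n_threads * compute_cycles
    period = compute_cycles + memory_wait_cycles
    return min(1.0, busy / period)


def threads_to_saturate(compute_cycles, memory_wait_cycles):
    """Smallest thread count that keeps the processor fully busy in this model."""
    return math.ceil((compute_cycles + memory_wait_cycles) / compute_cycles)


# Hypothetical figures: 20 compute cycles per 100-cycle memory access.
for n in (1, 4, 8):
    print(n, round(engine_utilization(n, 20, 100), 3))
print("threads needed:", threads_to_saturate(20, 100))
```

In this model a single thread leaves the processor idle most of the time, and adding threads helps only until the memory wait is fully covered. The programmability cost noted on the slide, synchronization between threads, is not captured here.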

  9. Design Tradeoffs (5) • Backward compatibility vs. architectural advances • On component level: e.g., memories (DDR DRAM) • On system level: NP needs to fit into overall router system • Parallelism vs. pipelining • Depends on usage of NP • Summary: • Lots of choices • Most decisions require some insight into expected NP usage • Tradeoffs are all qualitative • Let's look at commercial designs

  10. Novel Areas of NP Use • TCP/IP offloading on high-performance servers • Security processing: SSL offloading • Storage area networks • Many others: IDSs, etc.

  11. Performance Bottlenecks • Memory • Bandwidth available, but access time too slow • Increasing delay for off-chip memory • I/O • High-speed interfaces available • Cost problem with optical interfaces • Otherwise no problem • Processing power • Individual cores are getting more complex • Problems with access to shared resources • Control processor can become bottleneck
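
The effect of contended shared resources on processing power can be approximated with Amdahl's law, a standard model that the slides do not name. In the sketch below, `serial_fraction` is a hypothetical parameter standing for the share of per-packet work that must run one processor at a time, for example while holding a lock on shared state.

```python
def speedup(serial_fraction, n_processors):
    """Amdahl's law: speedup over one processor when part of the work is serialized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_processors)


# Hypothetical case: 10% of per-packet work touches a shared resource.
for n in (1, 8, 64):
    print(n, round(speedup(0.10, n), 2))
```

Under this model the speedup can never exceed `1 / serial_fraction`, however many processors are added, which is one way to read the slide's warning that shared resources and the control processor can become the bottleneck.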

  12. Limitations on Scalability • What are the limitations on how fast NPs need to get? • Link rates (optical bandwidth limits) • Application complexity (core vs. edge) • What are the limitations on how fast NPs can get? • Parallelism in networks • Power consumption • Chip area

  13. Commercial Network Processors • Commercial NPs • Large variety of architectures • Different applications and performance spaces • Lots of implementation details and practical issues • General Themes • Type and number of processors • Homogeneous vs. heterogeneous • Type and size of memories • Internal and external communication channels • Mechanisms of scalability: parallelism and pipelining • Generality vs. specialization

  14. Intel IXP1200: external connection

  15. Intel IXP1200: internal architecture

  16. Cisco PXF

  17. Motorola C-Port: conceptual design

  18. Motorola C-Port: internal architecture

  19. Motorola C-Port: channel processor

  20. IXP2400 • XScale (ARM-compliant) embedded control processor • Instruction and data caches • 8 microengines • 400 or 600 MHz • 8 threads per microengine • Multiple instruction stores with 4k instructions • 256 general-purpose registers • 512 transfer registers • 2GB addressable DDR-DRAM memory (19.2 Gbps) • 32MB addressable QDR-SRAM memory (12 Gbps r+w) • 16 words of Next Neighbor Registers • 16kB scratchpad
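
The microengine count and clock rates on this slide allow a back-of-the-envelope answer to the earlier question of how much processing power per bandwidth is available. The sketch below is a rough estimate under assumptions that are not in the slides: the 2.5 Gbps link rate and 64-byte packet size are hypothetical inputs, work is assumed to spread evenly over all microengines, and framing overhead and memory stalls are ignored.

```python
def cycles_per_packet(n_engines, clock_hz, link_bps, packet_bytes):
    """Aggregate microengine cycles available per packet at a given line rate."""
    packets_per_second = link_bps / (packet_bytes * 8)
    return n_engines * clock_hz / packets_per_second


# 8 microengines at 400 or 600 MHz are from the slide.
# Link rate and packet size below are assumptions for illustration.
for clock_hz in (400e6, 600e6):
    budget = cycles_per_packet(8, clock_hz, link_bps=2.5e9, packet_bytes=64)
    print(f"{clock_hz / 1e6:.0f} MHz: {budget:.2f} cycles per packet")
```

The budget scales linearly with packet size, so the minimum-size packet is the worst case. Any cycles spent waiting on DRAM or SRAM come out of the same budget unless other threads can fill the gap.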

  21. IXP2400 • Interconnects • Coprocessor bus added (incl. access to T-CAM) • Flow control bus for two-chip configurations (e.g., ingress and egress) • Switch Fabrics • No IX bus • Utopia 1, 2, 3 • CSIX-L1 • SPI-3 (POS-PHY 2/3)

  22. Two-Chip Configurations • Flow control needed between ingress and egress • 1 Gbps over flow control bus (not shown)

  23. IXP2400 Internal Architecture

  24. IXP2400 Microengine • Enhancements over IXP1200 microengines: • Multiplier unit • Pseudo-random number generator • CRC calculator • Four 32-bit timers and timer signaling • 16-entry CAM for inter-thread communication • Time stamping unit • Generalized thread signaling • 640 words of local memory • Simultaneous access to packet queues without mutual exclusion • Functional units for ATM segmentation and reassembly • Automated byte-alignment • Microengines divided into two clusters with independent command and SRAM buses

  25. Software • Support for software pipelining • “Reflector Mode Pathways” for communication • Next Neighbor Registers as programming abstraction • SDK 4.0 • Simulator, debugger, profiler, traffic generator • Portable modules • Provides better infrastructure support • C compiler

  26. Summary • Network Processor design space is big due to • Varying design goals • Varying implementation choices • Qualitative tradeoffs • Survey of commercial NPs • Network processors are getting more features • Main architecture characteristic is still parallelism • Software support is becoming more important
