An Architecture and Prototype Implementation for TCP/IP Hardware Support
Mirko Benz
Dresden University of Technology, Germany
TERENA 2001
Motivation – Demanding Services
• 1000+ Instructions per Packet
• [Figure: Required Processing Power versus Application Complexity; labelled services: Internet Security Provision, Quality of Service Support, Routing, Switching]
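To put the "1000+ Instructions per Packet" figure into perspective, the following sketch estimates the instruction rate a host needs just for protocol processing at line rate. The link speed (1 Gbit/s) and frame size (1500 bytes) are assumptions chosen for illustration, not values from the slides, and Ethernet framing overhead is ignored.

```python
# Assumptions (not from the slides): a 1 Gbit/s link and 1500-byte frames.
LINK_BITS_PER_S = 1_000_000_000
FRAME_BYTES = 1500
INSTRUCTIONS_PER_PACKET = 1000  # lower bound quoted on the slide ("1000+")


def packets_per_second(link_bits_per_s: int, frame_bytes: int) -> float:
    """Packets per second when the link is saturated with equal-size frames."""
    return link_bits_per_s / (frame_bytes * 8)


def required_mips(link_bits_per_s: int, frame_bytes: int,
                  instr_per_packet: int) -> float:
    """Millions of instructions per second spent on protocol processing."""
    pps = packets_per_second(link_bits_per_s, frame_bytes)
    return pps * instr_per_packet / 1e6


if __name__ == "__main__":
    pps = packets_per_second(LINK_BITS_PER_S, FRAME_BYTES)
    mips = required_mips(LINK_BITS_PER_S, FRAME_BYTES, INSTRUCTIONS_PER_PACKET)
    print(f"{pps:,.0f} packets/s -> at least {mips:.1f} MIPS")
```

Under these assumptions the budget is roughly 83 MIPS for protocol processing alone. Smaller frames raise the packet rate and therefore the required instruction rate proportionally.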
Motivation – MIPS versus Bandwidth Trend
• Processor Performance Evolution: ~100% / 18 months
• Hardware Support for Protocol Processing Acceleration
• Case Study: TCP
• [Figure: MIPS Performance / Bandwidth plotted against Technological Progress / Time; curves: Processor Performance Evolution, Available Bandwidth]
Project Overview
• Assumptions & Preconditions
  • Restriction to Local Area Networks (e.g. Gigabit Ethernet)
  • High Bandwidth and Low Error Probability
  • Concentration on Host Implementations
• [Figure: project structure with the labels Protocol Analysis, TCP/IP Partitioning, System Simulation, Prototype Variants, Evaluation, Optimisation, Efficient OS Integration, Flexible Protocol Engine, Domain Specific Methodology]
Talk Outline
• TCP Protocol Performance Evaluation
• TCP Acceleration Approach
• System Simulation Environment
• Operating System Integration
• Hardware Implementation Directions
• Myrinet Implementation and Results
• Conclusions and Outlook
TCP Protocol Performance Evaluation
• TCP Software Implementation Structure
• Sources of Protocol Processing Overhead
  • Communication, Synchronisation
  • Operating System Call Overhead
  • Copy Operation
  • Classification: Per-Byte / Per-Packet
• Optimisation Opportunities
  • Interrupt Suppression
  • Zero Copy Mechanisms
  • User Level Networking
  • Checksum Offloading (e.g. Task Offload)
  • Extending Frame Sizes (e.g. Jumbo Frames)
• [Figure: software stack layers: Application, Socket, TCP, IP, Driver, Network]
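The per-byte / per-packet classification can be illustrated with a simple linear cost model: each packet costs a fixed amount of time plus an amount proportional to its size. The sketch below is only an illustration; the cost constants are hypothetical and were not measured in the talk.

```python
def packet_time_us(payload_bytes: int, per_packet_us: float,
                   per_byte_us: float) -> float:
    """Processing time for one packet under a linear cost model."""
    return per_packet_us + per_byte_us * payload_bytes


def throughput_mbit_s(payload_bytes: int, per_packet_us: float,
                      per_byte_us: float) -> float:
    """Achievable throughput if the host CPU is the only bottleneck.

    Bits per microsecond is numerically equal to Mbit/s.
    """
    time_us = packet_time_us(payload_bytes, per_packet_us, per_byte_us)
    return payload_bytes * 8 / time_us


if __name__ == "__main__":
    # Hypothetical costs: 20 us per packet (interrupts, system calls),
    # 0.01 us per byte (copying, checksumming).
    for size in (1500, 9000):
        rate = throughput_mbit_s(size, per_packet_us=20.0, per_byte_us=0.01)
        print(f"{size:5d}-byte frames: {rate:.0f} Mbit/s")
```

In this model, larger frames and interrupt suppression attack the per-packet term, while zero copy mechanisms and checksum offloading attack the per-byte term.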
TCP Protocol Performance Evaluation
• Performance: TCP versus Myrinet GM
  • Throughput: 335/967 Mbit/s (TCP/Myrinet)
  • Latency: 81/29 µs (TCP/Myrinet)
  • 100% CPU Utilisation
  • Test System: RedHat Linux 6.2 / PIII 500 MHz
Goals
• Software Implementation as a Foundation
• Achieve On-Wire Compatibility
• Consider Different Target Architectures
• Develop Reusable Hardware Components
• Integration of High Level Tools
• System Wide Optimisation
• Efficient, Transparent Operating System Integration
• [Figure: Flexible Protocol Engine, Domain Specific Methodology]
TCP Acceleration Approach
• TCP SW Stack Complexity
  • General Purpose Protocol
  • Not Designed for High End Networking
  • Many Interdependent Algorithms
  • Often Modified, Adapted, Optimised
  • ~15,000 Lines of C
• Approach
  • TCP Partitioning -> Fast Path Extraction
  • Hardware Support -> Acceleration
  • Operating System Bypass
  • HW/SW Synchronisation
    • Initialisation, Termination/Error
  • Transparent Integration
    • Socket Level Switch
• [Figure: stack layers Application, Socket, TCP, IP, Driver, Network, with a block labelled PE]
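One way to picture the socket level switch and the HW/SW synchronisation points (initialisation, termination/error) is a per-socket dispatcher that routes traffic either to the software stack or to the fast path. The class and method names below are hypothetical and only sketch the idea; the slides do not describe the actual interface.

```python
from enum import Enum, auto


class Path(Enum):
    SOFTWARE = auto()  # complete TCP stack in the host operating system
    FAST = auto()      # accelerated path for user data exchange


class SocketSwitch:
    """Hypothetical per-socket dispatcher between the two paths."""

    def __init__(self) -> None:
        # Connection setup is always handled by the software stack.
        self.path = Path.SOFTWARE

    def on_established(self) -> None:
        # Initialisation: the software stack has completed the handshake
        # and hands the connection over to the fast path.
        self.path = Path.FAST

    def on_error_or_close(self) -> None:
        # Termination/error: control returns to the software stack.
        self.path = Path.SOFTWARE

    def send(self, data: bytes) -> Path:
        # The application keeps using the same socket call; only the
        # path behind it changes, which keeps the switch transparent.
        return self.path


if __name__ == "__main__":
    sock = SocketSwitch()
    print(sock.send(b"before handshake"))
    sock.on_established()
    print(sock.send(b"bulk data"))
    sock.on_error_or_close()
    print(sock.send(b"after error"))
```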
Fast Path Protocol Processing
• Only for User Data Exchange
• No Connection Management
• No Error Recovery – Only Detection
• Complexity ~10% of SW Stack
• [Figure: Sender and Receiver, each with a Connection Context and TCP Send / TCP Recv blocks, exchanging Data and Ack over the Network]
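The restrictions listed above suggest a per-segment test that decides whether a received segment may stay on the fast path. The sketch below is similar in spirit to the header prediction found in software TCP stacks; the exact criteria used in this project are not given on the slides, so the checks, class names, and field names here are assumptions. The flag bit values are the standard TCP ones.

```python
from dataclasses import dataclass

# Standard TCP header flag bits.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


@dataclass
class ConnectionContext:
    rcv_nxt: int  # next sequence number expected from the peer


@dataclass
class Segment:
    seq: int
    flags: int
    payload: bytes
    checksum_ok: bool = True


def takes_fast_path(seg: Segment, ctx: ConnectionContext) -> bool:
    """Return True if the segment is plain in-order user data."""
    if not seg.checksum_ok:
        return False  # error detected; recovery is left to software
    if seg.flags & (SYN | FIN | RST | URG):
        return False  # connection management is not handled here
    if not seg.flags & ACK:
        return False
    return seg.seq == ctx.rcv_nxt  # out-of-order data needs recovery


def receive(seg: Segment, ctx: ConnectionContext) -> str:
    """Process a segment and report which path handled it."""
    if takes_fast_path(seg, ctx):
        ctx.rcv_nxt = (ctx.rcv_nxt + len(seg.payload)) % 2**32
        return "fast"
    return "slow"  # handed to the full software stack, context untouched


if __name__ == "__main__":
    ctx = ConnectionContext(rcv_nxt=1000)
    print(receive(Segment(1000, ACK | PSH, b"hello"), ctx))
    print(receive(Segment(5000, ACK, b"out of order"), ctx))
```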
System Simulation Environment
• Complex Communication System
• Real Applications, Operating System (User Mode Linux)
• Network Simulation – Error Injection
• Fast Path Implementation: Hardware/Software
• System Evaluation: Functionality & Performance
• [Figure: simulation setup with the labels Netperf, Netserver, Socket, TCP Fast Path HW, TCP Fast Path SW, User Mode Linux, Evaluation, VHDL Simulator, ISA Simulator, CORBA, Network Simulator]
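Error injection in a network simulation can be as simple as dropping packets with a configurable probability. The following is a minimal sketch of that idea, not the simulator used in the project; the class name and its interface are hypothetical. A fixed seed keeps simulation runs reproducible.

```python
import random


class LossyLink:
    """Hypothetical simulated link that drops packets at random."""

    def __init__(self, drop_probability: float, seed: int = 0) -> None:
        if not 0.0 <= drop_probability <= 1.0:
            raise ValueError("drop_probability must be within [0, 1]")
        self.drop_probability = drop_probability
        self._rng = random.Random(seed)  # private, seeded generator
        self.delivered: list[bytes] = []
        self.dropped = 0

    def send(self, packet: bytes) -> bool:
        """Transmit one packet; return True if it reached the receiver."""
        if self._rng.random() < self.drop_probability:
            self.dropped += 1
            return False
        self.delivered.append(packet)
        return True


if __name__ == "__main__":
    link = LossyLink(drop_probability=0.1, seed=42)
    for i in range(1000):
        link.send(i.to_bytes(4, "big"))
    print(f"delivered {len(link.delivered)}, dropped {link.dropped}")
```

Running the same protocol implementation against such a link with different drop probabilities exercises the error detection of the fast path and the handover to software error recovery.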
Fast Path Hardware Implementation Directions
• Embedded RISC Processor
  • LEON Sparc 33 MHz, INTEL StrongARM 200 MHz
  • OS: ucLinux, GNU C Environment
• Intelligent Network Adapter (Myrinet)
  • RISC Core with User/Network Interface, DMA Engines
  • Control Program Modification, no Operating System
• Network Processor (INTEL IXP1200)
  • 6 Multithreaded Microengines
  • Development: IXP Assembler, Simulator
• Specific Hardware
  • High Level FPGA Design Flow, XILINX Virtex
  • SYNOPSYS Protocol Compiler
• [Figure: the four directions arranged along an axis from Software to Hardware]
Myrinet Implementation Platform
• Technology
  • Packet-Communication and Switching Technology
  • High-Performance, Highly Reliable
  • System-Area Network, Cluster Interconnect
  • Intelligent Network Adapter
• [Figure: adapter block diagram with the labels Host Interface, PCI Bridge, DMA Controller, 64 bit, 33 MHz, RISC, LANai 7, LOCAL SRAM, 64 bit, Packet Interface, Myrinet Link, 1280 Mbit/s]
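The figures in the block diagram allow a quick check of whether the host bus can keep up with the link. The sketch below assumes that "64 bit, 33 MHz" describes the PCI host interface and that "1280 Mbit/s" is the Myrinet link rate; the flattened diagram does not state this mapping explicitly.

```python
def bus_peak_mbit_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak transfer rate of a parallel bus in Mbit/s."""
    return width_bits * clock_mhz


if __name__ == "__main__":
    pci_peak = bus_peak_mbit_s(width_bits=64, clock_mhz=33)
    link_rate = 1280  # Mbit/s, taken from the diagram
    print(f"PCI peak: {pci_peak:.0f} Mbit/s")
    print(f"Headroom over the link: {pci_peak / link_rate:.2f}x")
```

This is a peak figure only. Arbitration, burst setup, and other traffic on the bus reduce what is achievable in practice, so the headroom is smaller than the ratio suggests.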
TCP Fast Path / Myrinet
• Development Environment
  • Host SW: GM (message passing); Firmware: MCP (open source)
  • GNU C Suite, no OS, one context only, no interrupts
• Implementation
  • MCP: 4 Event Driven State Machines
  • Fast Path Integration within Network Send & Recv Code
  • Exploitation of Hardware Support for Checksum Computation
  • No Specific Optimisations, Some Limitations
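The checksum computation that the adapter hardware takes over is the 16-bit ones' complement sum used by IP, TCP, and UDP. For reference, a plain software version following RFC 1071 is sketched below. It covers only the data passed in; a real TCP checksum also includes a pseudo header with the IP addresses, which this sketch omits.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte

    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words

    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    return ~total & 0xFFFF


if __name__ == "__main__":
    sample = bytes.fromhex("0001f203f4f5f6f7")  # example bytes from RFC 1071
    checksum = internet_checksum(sample)
    print(f"checksum: 0x{checksum:04x}")
    # A receiver verifies by summing data and checksum together.
    print(internet_checksum(sample + checksum.to_bytes(2, "big")) == 0)
```

The loop touches every byte of the payload, which is why this operation counts as per-byte overhead and is a natural candidate for hardware support.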
TCP Fast Path / Myrinet Performance Results
• Test Setup: INTEL PIII/500 MHz, Myrinet LAN Adapter, Linux OS
• Netperf Benchmark: Throughput/Delay
• Peak Throughput: 967, 816, 333 Mbit/s (GM, Fast Path, TCP)
• Minimum Delay: 16.5, 49, 81 µs (GM, Fast Path, TCP)
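The relative gains follow directly from the reported numbers. The short calculation below uses only the values from this slide; the dictionary layout and function name are illustrative.

```python
# (peak throughput in Mbit/s, minimum delay in microseconds), from the slide
RESULTS = {
    "GM": (967, 16.5),
    "Fast Path": (816, 49),
    "TCP": (333, 81),
}


def gain_over_tcp(name: str) -> tuple[float, float]:
    """Return (throughput gain, delay reduction factor) relative to TCP."""
    throughput, delay = RESULTS[name]
    tcp_throughput, tcp_delay = RESULTS["TCP"]
    return throughput / tcp_throughput, tcp_delay / delay


if __name__ == "__main__":
    thr_gain, delay_gain = gain_over_tcp("Fast Path")
    share_of_gm = RESULTS["Fast Path"][0] / RESULTS["GM"][0]
    print(f"Fast Path throughput: {thr_gain:.2f}x TCP")
    print(f"Fast Path delay: {delay_gain:.2f}x lower than TCP")
    print(f"Fast Path reaches {share_of_gm:.0%} of GM peak throughput")
```

The fast path thus delivers about 2.45 times the throughput of the software TCP stack and about 84% of the native GM peak, while its minimum delay remains roughly three times that of GM.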
Summary & Outlook
• Integrated Architecture and Design Flow for Protocol Processing Acceleration
  • TCP Partitioning
  • System Simulation Environment
  • Integration with Existing SW TCP Stack & OS
• Prototype with Promising Performance
• Present Work
  • Fast Path HW Implementation and SoC Integration
• [Figure: project structure with the labels Protocol Analysis, TCP/IP Partitioning, System Simulation, Prototype Variants, Evaluation, Optimisation, Efficient OS Integration, Flexible Configurable Protocol Engine]