TCP Offload Through Connection Handoff

Hyong-youb Kim and Scott Rixner, Rice University. April 20, 2006.

Presentation Transcript


  1. TCP Offload Through Connection Handoff
     Hyong-youb Kim and Scott Rixner, Rice University
     April 20, 2006

  2. Full TCP Offloading
     • Move all TCP/IP processing to the network interface
     • Computation
       • Saves processing resources on the host
       • NIC can be customized for TCP/IP processing
     • Memory
       • Reduces host memory references
       • Network interface can exploit small, fast, local memory
     • Problems
       • Network interface can become a performance bottleneck
         • Limited computation on NIC
         • Limited memory capacity on NIC
       • Complicates global resource management in the stack

  3. Solution: Connection Handoff
     • Only hand off established connections to the NIC
     • Operating system controls the division of work
       • Only TCP send and receive on the NIC
       • OS performs connection establishment, routing, …
     • No changes to the sockets API
     • SPECweb99 performance
       • 17% and 32% reduction in cycles per packet
       • 15% and 27% improved throughput

  4. Unmodified Network Stack
     [Diagram: user requests flow from the User Application through Socket, TCP, IP, Ethernet, and Driver in the Host OS, then to the NIC as Ethernet frames. Host OS labels: protocol/socket operations, packet generation, receive processing. NIC labels: Transmit, Receive.]
     • ~3100 instructions per packet
     • ~50% of all operations are memory references

  5. Network Stack with Connection Handoff
     [Diagram: two cases under the User Application, "Connection in OS" and "Connection on NIC". Host OS components: Socket, TCP, Bypass, IP, Ethernet, Driver. NIC components: Socket, TCP, IP, Ethernet, Lookup, Transmit/Receive. Labels: protocol/socket operations, packet generation, receive processing.]
     • Packet send: same as unmodified stack
     • Packet receive: now goes through lookup
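
The lookup step on the receive path can be pictured with a small sketch. This is only an illustration under assumptions: the slides do not say how the lookup is keyed or implemented, so the 4-tuple key, the `NicLookup` class, and the `"nic_tcp"` / `"host_stack"` labels are all hypothetical.

```python
from typing import NamedTuple


class FourTuple(NamedTuple):
    """Assumed lookup key; the slides do not specify the key format."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class NicLookup:
    """Hypothetical table of connections currently handed off to the NIC."""

    def __init__(self) -> None:
        self._offloaded: set[FourTuple] = set()

    def handoff(self, conn: FourTuple) -> None:
        # The OS moved this connection to the NIC.
        self._offloaded.add(conn)

    def restore(self, conn: FourTuple) -> None:
        # The connection moved back to the OS.
        self._offloaded.discard(conn)

    def classify(self, conn: FourTuple) -> str:
        # A received packet goes to the NIC's TCP only if its connection
        # was handed off; everything else goes up to the host stack.
        return "nic_tcp" if conn in self._offloaded else "host_stack"


table = NicLookup()
conn = FourTuple("10.0.0.2", 40000, "10.0.0.1", 80)
before = table.classify(conn)
table.handoff(conn)
after = table.classify(conn)
```

In this sketch, packets for connections that were never handed off take the same path as in the unmodified stack.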

  6. Handoff Interface
     • Extend driver/OS API
     • Move connections
       • Handoff (OS): move connection from OS to NIC
       • Restore (OS, NIC): move connection from NIC to OS
     • Relay socket operations between OS and NIC
       • Send (OS): insert send data into NIC's socket
       • Acknowledge (NIC): remove ack'ed data from OS's socket
       • Receive (NIC): insert received data into OS's socket
       • Received (OS): remove received data from NIC's socket
       • Control (OS, NIC): change socket states, etc.
     • Misc.
       • Forward (OS), Post (OS), Resource (NIC)
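
The command list above names, in parentheses, which side may issue each command. A minimal sketch of that mapping follows; the command names and initiators are taken from the slide, while the `INITIATORS` dictionary and the `may_issue` helper are hypothetical names introduced only for illustration, not part of the authors' driver API.

```python
# Which side may issue each handoff command, as listed on the slide.
INITIATORS = {
    "handoff": {"OS"},
    "restore": {"OS", "NIC"},
    "send": {"OS"},
    "acknowledge": {"NIC"},
    "receive": {"NIC"},
    "received": {"OS"},
    "control": {"OS", "NIC"},
    "forward": {"OS"},
    "post": {"OS"},
    "resource": {"NIC"},
}


def may_issue(side: str, command: str) -> bool:
    """Return True if `side` ("OS" or "NIC") is listed as an initiator."""
    return side in INITIATORS.get(command, set())


# Commands only the NIC initiates, according to the slide.
nic_only = sorted(c for c, sides in INITIATORS.items() if sides == {"NIC"})
```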

  7. Example Use
     Accept connection, receive request, send response, close connection.
     Each line gives the handoff command, then the action on each side:
     • handoff: Host OS: Accept; NIC: Allocate connection
     • receive: NIC: Receive data; Host OS: Enqueue data
     • received: Host OS: Read data; NIC: Dequeue data
     • send: Host OS: Write data; NIC: Enqueue data
     • acknowledge: NIC: Receive ACK; Host OS: Drop sent data
     • control: NIC: Receive FIN; Host OS: Change socket state
     • control: Host OS: Close; NIC: Send FIN
     • control: Host OS: Destroy connection; NIC: Destroy connection
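
The example sequence can be traced with a toy model. This is a sketch under stated assumptions, not the prototype's code: the `Side` class and `run_example` function are hypothetical, both sides hold full copies of the data for simplicity, and the state names (`CLOSE_WAIT`, `LAST_ACK`, `CLOSED`) come from standard TCP rather than from the slides, which only say "change socket state".

```python
from collections import deque


class Side:
    """Toy socket state for one side (Host OS or NIC). Hypothetical."""

    def __init__(self) -> None:
        self.recv_buf: deque = deque()
        self.send_buf: deque = deque()
        self.state = "ESTABLISHED"


def run_example():
    log = []
    host = Side()           # OS accepts the connection
    nic = Side()            # handoff: NIC allocates its connection
    log.append("handoff")

    nic.recv_buf.append(b"request")    # NIC receives data from the wire
    host.recv_buf.append(b"request")   # receive: OS enqueues the data
    log.append("receive")

    host.recv_buf.popleft()            # application reads the data
    nic.recv_buf.popleft()             # received: NIC dequeues it
    log.append("received")

    host.send_buf.append(b"response")  # application writes data
    nic.send_buf.append(b"response")   # send: NIC enqueues it
    log.append("send")

    nic.send_buf.popleft()             # NIC receives the ACK
    host.send_buf.popleft()            # acknowledge: OS drops sent data
    log.append("acknowledge")

    host.state = nic.state = "CLOSE_WAIT"  # control: NIC received FIN
    log.append("control")
    host.state = nic.state = "LAST_ACK"    # control: OS closes, NIC sends FIN
    log.append("control")
    host.state = nic.state = "CLOSED"      # control: both destroy connection
    log.append("control")
    return log, host, nic


log, host, nic = run_example()
```

At the end of the trace all four buffers are empty, which matches the pairing of the relay commands: every enqueue on one side has a matching dequeue or drop on the other.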

  8. Real Prototype
     • Modified FreeBSD 4.7
     • AMD Athlon XP 2800+ CPU
     • Alteon programmable Gigabit Ethernet NIC
       • 1MB memory
         • Limited to 256 connections
         • Actual socket buffer data only in main memory
       • 88MHz processor
         • Limits maximum throughput

  9. TCP Send
     [Chart: cycles per packet (axis 0 to 7000) and L2 misses per packet (axis 0 to 9), broken down by System Call, TCP, IP, Ethernet, Driver, Bypass, and Total. Series for both metrics: No Handoff with 1 connection, No Handoff with 256 connections, Handoff with 256 connections.]

  10. Simulated Machine
      • Prototype NIC is too slow
      • Simics full-system simulator
        • Boots unmodified operating systems
        • Use same software as real prototype
      • Simulated processor
        • 1GHz functional x86 processor
        • Timed memory to mimic Athlon XP 2800+
      • Simulated NIC
        • 450MHz functional processor
        • Timed 1Gb/s Ethernet wire

  11. SPECweb99, 1024 Connections
      [Chart: cycles per packet (axis 0 to 14000), broken down by System Call, TCP, IP, Ethernet, Driver, Bypass, and Total. Series: No Handoff; Handoff 1024 connections; Static (No Handoff); Static (Handoff 1024 connections). Annotations: 15% increase in HTTP throughput (Mb/s); 27% increase in HTTP throughput (Mb/s).]

  12. Summary
      • Memory behavior limits TCP performance
        • Connection state accesses cause cache pressure
      • Offload can help, but full offload is problematic
      • Connection handoff: offloading made practical
        • OS in charge of division of work
        • Host network stack largely unaffected
      • Ongoing work: OS handoff policies
