
CS 838: NetFPGA Tutorial


Presentation Transcript


  1. CS 838: NetFPGA Tutorial Theophilus Benson

  2. Outline • Background: What is the NetFPGA? • Life cycle of a packet through a NetFPGA • Demo

  3. What is the NetFPGA? A hardware accelerator built with a Field Programmable Gate Array (FPGA) driving Gigabit network links. [Diagram: a NetFPGA board (FPGA, memory, four 1GE ports) attached over PCI to a standard PC (CPU, memory) running networking software.]

  4. Function • 4 Gigabit Ethernet ports • Fully programmable FPGA hardware • Open-source FPGA hardware: Verilog base design • Open-source software: Linux user-level drivers in C and C++ [Figure label: NetFPGA Router]

  5. NetFPGA Platform Major Components • Interfaces • 4 Gigabit Ethernet Ports • PCI Host Interface • Memories • 36 Mbits Static RAM • 512 Mbits DDR2 Dynamic RAM • FPGA Resources • Block RAMs • Configurable Logic Blocks (CLBs) • Memory Mapped Registers

  6. NetFPGA: Router Design • Pipeline of modules • FIFO queues between each module • Inter-module communication • CTRL: Sent on ctrl bus (8 bits) • Metadata about the data being sent • DATA: Sent on data bus (64 bits) • RDY: Signifies ready to receive packet (1 bit) • WR: Signifies packet being sent (1 bit)
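
The rdy/wr handshake above can be modeled in software. The following is a minimal sketch, not the NetFPGA Verilog: the type and function names (`bus_word`, `fifo`, `bus_cycle`) and the FIFO depth are assumptions made for illustration. It only shows the rule that a word moves in a cycle when the sender asserts wr and the receiver asserts rdy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One bus beat between two pipeline modules: 8-bit ctrl plus 64-bit data. */
typedef struct {
    uint8_t  ctrl;
    uint64_t data;
} bus_word;

/* Hypothetical model of the receiving module's input FIFO (depth is assumed). */
#define FIFO_DEPTH 4
typedef struct {
    bus_word slots[FIFO_DEPTH];
    int count;
} fifo;

/* rdy: the receiver can accept another word. */
static int fifo_rdy(const fifo *f) {
    assert(f != NULL);
    return f->count < FIFO_DEPTH;
}

/* One clock cycle: the word moves only if wr (sender) and rdy (receiver)
 * are both asserted. Returns 1 if the word was transferred, 0 otherwise. */
static int bus_cycle(fifo *rx, int wr, bus_word w) {
    assert(rx != NULL);
    if (wr && fifo_rdy(rx)) {
        rx->slots[rx->count++] = w;
        return 1;
    }
    return 0;
}
```

In this model a sender that keeps wr high against a full FIFO makes no progress, so it has to hold the word until the receiver raises rdy again.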

  7. NetFPGA: Software and Hardware • Software: Linux user-level processes • Hardware: Verilog on NetFPGA PCI board (FPGA Modules 1, FPGA Modules 2)

  8. Example: An IP Router on NetFPGA • Software (Linux user-level processes): Management & CLI, Routing Protocols, Exception Processing, Routing Table • Hardware (Verilog on NetFPGA PCI board): Forwarding Table, Switching

  9. Life of a Packet through the hardware [Diagram: an IP packet crossing the router between hosts 192.168.102.y and 192.168.101.x via port0 and port2.]

  10. Router Stages • Inputs: 4 × (MAC RxQ, CPU RxQ) • Input Arbiter • Output Port Lookup • Output Queues • Outputs: 4 × (MAC TxQ, CPU TxQ)

  11. Inter-module Communication Using “Module Headers” • Each bus word pairs a Ctrl Word (8 bits) with a Data Word (64 bits) • ctrl = x: Module Hdr (module headers contain information such as packet length, input port, output port, …) • … • ctrl = y: Last Module Hdr • ctrl = 0: Eth Hdr • ctrl = 0: IP Hdr • ctrl = 0: … • ctrl = 0x10: Last word of packet
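
As a rough illustration of the framing above, the sketch below walks a packet as a sequence of (ctrl, data) words and splits it into module headers and packet words. It assumes only what the slide shows: module headers carry a nonzero ctrl, packet words carry ctrl = 0, and the last word of the packet carries a nonzero ctrl again. The names `bus_word`, `pkt_layout`, and `parse_layout` are hypothetical, and the meaning of the specific end-of-packet value is not interpreted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  ctrl;   /* 8-bit ctrl word */
    uint64_t data;   /* 64-bit data word */
} bus_word;

typedef struct {
    size_t n_headers;       /* module headers in front of the packet */
    size_t n_packet_words;  /* Eth hdr, IP hdr, data, including the last word */
} pkt_layout;

/* Returns 0 and fills *out for a well-formed stream, -1 otherwise. */
static int parse_layout(const bus_word *w, size_t n, pkt_layout *out) {
    assert(w != NULL && out != NULL);
    size_t i = 0;
    out->n_headers = 0;
    out->n_packet_words = 0;

    /* Leading words with nonzero ctrl are module headers. */
    while (i < n && w[i].ctrl != 0) { out->n_headers++; i++; }

    /* Words with ctrl == 0 belong to the packet itself. */
    while (i < n && w[i].ctrl == 0) { out->n_packet_words++; i++; }

    /* The next word must exist and be the nonzero-ctrl last word. */
    if (i >= n || out->n_packet_words == 0)
        return -1;
    out->n_packet_words++;
    return (i + 1 == n) ? 0 : -1;
}
```

A stream that ends without a nonzero-ctrl final word is reported as malformed, which is how a downstream module could detect a truncated packet in this model.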

  12. Inter-module Communication [Diagram: Module i drives data, ctrl, and wr to Module i+1; Module i+1 drives rdy back to Module i.]

  13. MAC Rx Queue • Eth Hdr: Dst MAC = port 0, Ethertype = IP • IP Hdr: IP Dst: 192.168.2.3, TTL: 64, Csum: 0x3ab4 • Data

  14. Rx Queue (ctrl word: data word) • 0xff: Pkt length, input port = 0 • 0: Eth Hdr: Dst MAC = port 0, Ethertype = IP • 0: IP Hdr: IP Dst: 192.168.2.3, TTL: 64, Csum: 0x3ab4 • 0: Data

  15. Input Arbiter [Diagram: packets waiting in Rx Q 0, Rx Q 1, …, Rx Q 7 feed the Input Arbiter.]
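
The slide does not say how the arbiter picks among the eight Rx queues. The sketch below assumes a simple round-robin policy, which is one common choice, and the names `NUM_RX_QUEUES` and `arbiter_next` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_RX_QUEUES 8   /* Rx Q 0 .. Rx Q 7, as on the slide */

/* pending[i] is the number of packets waiting in Rx queue i.
 * last is the queue served most recently, or -1 if none has been served.
 * Returns the next non-empty queue after `last` in round-robin order,
 * or -1 if every queue is empty. */
static int arbiter_next(const int *pending, int last) {
    assert(pending != NULL);
    assert(last >= -1 && last < NUM_RX_QUEUES);
    for (int step = 1; step <= NUM_RX_QUEUES; step++) {
        int q = (last + step) % NUM_RX_QUEUES;
        if (pending[q] > 0)
            return q;
    }
    return -1;
}
```

Starting the scan one past the last served queue keeps a busy queue from starving the others under this assumed policy.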

  16. Output Port Lookup (ctrl word: data word) • 0xff: Pkt length, input port = 0 • 0: Eth Hdr: Dst MAC = 0, Src MAC = x, Ethertype = IP • 0: IP Hdr: IP Dst: 192.168.2.3, TTL: 64, Csum: 0x3ab4 • 0: Data

  17. Output Port Lookup • 1- Check input port matches Dst MAC • 2- Check TTL, checksum • 3- Lookup next hop IP & output port (LPM) • 4- Lookup next hop MAC address (ARP) • 5- Add output port module header • 6- Modify MAC Dst and Src addresses • 7- Decrement TTL and update checksum • Packet before (ctrl word: data word): 0xff: Pkt length, input port = 0; 0: Eth Hdr: Dst MAC = 0, Src MAC = x, Ethertype = IP; 0: IP Hdr: IP Dst: 192.168.2.3, TTL: 64, Csum: 0x3ab4; 0: Data • Packet after: 0x04: output port = 4; 0xff: Pkt length, input port = 0; 0: Eth Hdr: Dst MAC = nextHop, Src MAC = port 4, Ethertype = IP; 0: IP Hdr: IP Dst: 192.168.2.3, TTL: 63, Csum: 0x3ac2; 0: Data
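
Two of the steps above, the longest-prefix-match lookup and the TTL decrement with checksum update, can be sketched in software as follows. This is an illustration under assumptions, not the hardware implementation: the `route` layout, the linear-scan lookup, and all function names are hypothetical, and the header is held as ten 16-bit words in host order (word 4 is TTL and protocol, word 5 is the checksum). The checksum update uses the incremental form from RFC 1624, HC' = ~(~HC + ~m + m').

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routing table entry. */
typedef struct {
    uint32_t prefix;    /* network address */
    uint32_t mask;      /* contiguous netmask */
    uint32_t next_hop;
    int      out_port;
} route;

/* Longest-prefix match by linear scan: among matching entries, keep the
 * one with the longest (numerically largest) mask. Returns NULL if none. */
static const route *lpm_lookup(const route *tbl, size_t n, uint32_t dst) {
    const route *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if ((dst & tbl[i].mask) == tbl[i].prefix &&
            (best == NULL || tbl[i].mask > best->mask))
            best = &tbl[i];
    }
    return best;
}

/* Fold a 32-bit sum into 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum) {
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full Internet checksum over n 16-bit header words. */
static uint16_t ip_checksum(const uint16_t *words, size_t n) {
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += words[i];
    return (uint16_t)~fold16(sum);
}

/* Decrement TTL and patch the checksum incrementally (RFC 1624). */
static void dec_ttl(uint16_t *hdr) {
    assert(hdr != NULL);
    assert((hdr[4] >> 8) > 1);          /* TTL <= 1 takes the exception path */
    uint16_t old_word = hdr[4];
    uint16_t new_word = (uint16_t)(old_word - 0x0100u);
    uint32_t sum = (uint16_t)~hdr[5];
    sum += (uint16_t)~old_word;
    sum += new_word;
    hdr[4] = new_word;
    hdr[5] = (uint16_t)~fold16(sum);
}
```

The incremental update touches only the changed word, which is why it suits a pipeline stage that cannot afford to re-sum the whole header.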

  18. Output Queues [Diagram: output queues OQ0 through OQ7; the packet is placed in OQ4.]

  19. MAC Tx Queue (ctrl word: data word) • 0x04: output port = 4 • 0xff: Pkt length, input port = 0 • 0: Eth Hdr: Dst MAC = nextHop, Src MAC = port 4, Ethertype = IP • 0: IP Hdr: IP Dst: 192.168.2.3, TTL: 63, Csum: 0x3ac2 • 0: Data


  21. NetFPGA-Host Interaction • Linux driver interfaces with hardware • Packet interface via standard Linux network stack • Register reads/writes via ioctl system call (with convenience wrapper functions) • readReg(nf2device *dev, int address, unsigned *rd_data) • writeReg(nf2device *dev, int address, unsigned *wr_data) • e.g.: readReg(&nf2, OQ_NUM_PKTS_STORED_0, &val);
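
To show how software might use wrappers of this shape without NetFPGA hardware, here is a sketch built on an in-memory stand-in for the device. Everything in it is hypothetical: `fake_dev`, `fake_read_reg`, `fake_write_reg`, the register count, the register addresses, and the return convention (0 on success, -1 on a bad address) are assumptions, and none of it is the real driver API. The argument order simply mirrors the wrapper signatures listed on the slide.

```c
#include <assert.h>
#include <stddef.h>

#define FAKE_NUM_REGS 16   /* assumed size of the stand-in register file */

/* In-memory stand-in for the device; not the real nf2device. */
typedef struct {
    unsigned regs[FAKE_NUM_REGS];
} fake_dev;

/* Same argument shape as the slide's readReg. */
static int fake_read_reg(fake_dev *dev, unsigned address, unsigned *rd_data) {
    assert(dev != NULL && rd_data != NULL);
    if (address >= FAKE_NUM_REGS)
        return -1;
    *rd_data = dev->regs[address];
    return 0;
}

/* Same argument shape as the slide's writeReg. */
static int fake_write_reg(fake_dev *dev, unsigned address, unsigned *wr_data) {
    assert(dev != NULL && wr_data != NULL);
    if (address >= FAKE_NUM_REGS)
        return -1;
    dev->regs[address] = *wr_data;
    return 0;
}

/* Sum per-queue "packets stored" counters found at the given addresses.
 * Returns 0 and fills *total on success, -1 if any read fails. */
static int total_pkts_stored(fake_dev *dev, const unsigned *addrs, size_t n,
                             unsigned *total) {
    assert(addrs != NULL && total != NULL);
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned val;
        if (fake_read_reg(dev, addrs[i], &val) != 0)
            return -1;
        sum += val;
    }
    *total = sum;
    return 0;
}
```

Checking every return value matters more on real hardware, where a register read is a system call that can fail.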

  22. NetFPGA-Host Interaction: Register access • 1. Software makes ioctl call on network socket; the ioctl is passed to the driver. • 2. Driver performs PCI memory read/write across the PCI bus.

  23. NetFPGA-Host Interaction • Packet transfers shown using DMA interface • Alternative: use programmed IO to transfer packets via register reads/writes • slower but eliminates the need to deal with network sockets

  24. DEMO: Life of a Packet through the hardware [Diagram: an IP packet crossing the router between hosts 192.168.2.y and 192.168.1.x via port0 and port2.]

  25. Programming the FPGA with your code • nf2_download NF2/bitfiles/reference_router.bit • Mirror linuxarp • ./NF2/projects/router_kit/sw/rkd • Helpful tool • ./NFlib/C/router/cli • Shows forwarding tables {arp table, ip table} • Allows modifying the tables

  26. Useful Links • NetFPGA Website • NetFPGA Wiki • NetFPGA Guide • Walkthrough the Reference Designs • The Verilog Golden Reference Guide

  27. Questions

  28. Verilog

  29. Hardware Description Languages • Concurrent • By default, Verilog statements are evaluated concurrently • Express fine-grain parallelism • Allows gate-level parallelism • Provides precise description • Eliminates ambiguity about operation • Synthesizable • Generates hardware from description

  30. Verilog Data Types
      reg [7:0] A;              // 8-bit register, indices descending: A[7] is the MSB
                                // (preferred bit order for NetFPGA)
      reg [0:15] B;             // 16-bit register, indices ascending: B[0] is the MSB
      B = {A[7:0], A[7:0]};     // concatenation: part-selects must follow the declared
                                // index direction, so A[0:7] is not legal for A[7:0]
      reg [31:0] Mem [0:1023];  // 1K-word memory
      integer Count;            // simple signed 32-bit integer
      integer K[1:64];          // an array of 64 integers
      time Start, Stop;         // two 64-bit time variables
      From: CSCI 320 Computer Architecture Handbook on Verilog HDL, by Dr. Daniel C. Hyde: http://eesun.free.fr/DOC/VERILOG/verilog-manual.html

  31. Signal Multiplexers • Two input multiplexer (using if / else)
      reg y;
      always @*
        if (select)
          y = a;
        else
          y = b;
      • Two input multiplexer (using ternary operator ?:)
      wire t = (select ? a : b);
      From: http://eesun.free.fr/DOC/VERILOG/synvlg.html

  32. Larger Multiplexers • Three input multiplexer
      reg s;
      always @*
        begin
          case (select2)
            2'b00:   s = a;
            2'b01:   s = b;
            default: s = c;
          endcase
        end

  33. Synchronous Storage Elements • Values change at times governed by clock • Clock • Input to circuit • Clock Event • Example: Rising edge • Flip/Flop • Transfers value from Din to Dout on clock event [Timing diagram: a D flip-flop (Din, D, Q, Dout, Clock); Din takes values A, B, C at t=0, t=1, t=2, and Dout follows Din at each clock transition, starting from S0.]

  34. Finite State Machines

  35. Synthesizable Verilog: Delay Flip/Flops • D-type flip flop
      reg q;
      always @ (posedge clk)
        q <= d;
      • D-type flip flop with data enable
      reg q;
      always @ (posedge clk)
        if (enable)
          q <= d;
      From: http://eesun.free.fr/DOC/VERILOG/synvlg.html
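
For readers more used to software, the behavior of the enabled flip-flop can be mimicked with a small C model. This is only a behavioral sketch with hypothetical names (`dff`, `dff_step`): the stored value changes only when the sampled clock goes from 0 to 1 while enable is high, which is what `always @ (posedge clk)` with `if (enable)` expresses in the Verilog.

```c
#include <assert.h>
#include <stddef.h>

/* Behavioral model of a D flip-flop with data enable. */
typedef struct {
    int q;         /* stored output value */
    int prev_clk;  /* clock level seen at the previous sample */
} dff;

/* Call once per clock sample. q captures d only on a rising edge
 * (prev_clk == 0, clk == 1) and only if enable is set. */
static void dff_step(dff *f, int clk, int enable, int d) {
    assert(f != NULL);
    if (clk && !f->prev_clk && enable)
        f->q = d;
    f->prev_clk = clk;
}
```

Changes on d between rising edges are ignored by the model, matching the slide's point that values change only at times governed by the clock.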

  36. More on NetFPGA System

  37. NetFPGA System [Diagram: user space (CAD Tools, Monitor Software, Web & Video Server, Browser & Video Client) above the Linux kernel (Packet Forwarding Table, four VI blocks); a NIC on PCI-e provides GE ports eth1 .. 2, and the NetFPGA Router Hardware on PCI provides GE ports nf2c0 .. 3.]

  38. NetFPGA System Implementation • NetFPGA Blocks • Virtex-2 Pro FPGA • 4.5MB ZBT SRAM • 64MB DDR2 DRAM • PCI Host Interface • 4 Gigabit Ethernet ports • Intranet Test Ports • Dual or Quad Gigabit Ethernets on PCI-e • Internet • Gigabit Ethernet on Motherboard • Processor • Dual-Core CPU • Operating System • Linux CentOS 4.4

  39. NetFPGA Lab Setup [Diagram: a server (CPU x2) running CAD Tools and NetFPGA Control SW. Dual NIC on PCI-e (GE, eth1 .. 2): Eth2: Server, Eth1: Local host. Net-FPGA Internet Router Hardware on PCI (GE): Nf2c3: Adj. Server, Nf2c2: Local Host, Nf2c1: Adjacent, Nf2c0: Adjacent. A Client is also shown.]

  40. Exception Path

  41. Exception Packet • Example: TTL = 0 or TTL = 1 • Packet has to be sent to the CPU which will generate an ICMP packet as a response • Difference starts at the Output Port lookup stage
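
The decision described above can be sketched as a single function. The queue numbering is an assumption inferred from the deck's examples (MAC port k is queue 2k and its paired CPU queue is 2k + 1), and the names `NUM_QUEUES` and `choose_output_queue` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_QUEUES 8  /* assumed: MAC port k is queue 2k, its CPU queue is 2k + 1 */

/* input_queue:  MAC Rx queue the packet arrived on (even number).
 * ttl:          TTL from the IP header as received.
 * lookup_queue: output queue chosen by the normal forwarding lookup.
 * Returns the queue the packet should be sent to. */
static int choose_output_queue(int input_queue, uint8_t ttl, int lookup_queue) {
    assert(input_queue >= 0 && input_queue < NUM_QUEUES);
    assert(input_queue % 2 == 0);
    if (ttl <= 1)
        return input_queue + 1;   /* exception: hand to the paired CPU queue */
    return lookup_queue;          /* normal forwarding */
}
```

With input port 0 and TTL 1, the function returns queue 1, which matches the "output port = 1" shown on the exception-path slides.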

  42. Exception Packet Path [Diagram: Ethernet connects to four MAC RxQ/MAC TxQ pairs on the NetFPGA user data path; four CPU RxQ/CPU TxQ pairs connect through DMA and Registers (nf2_reg_grp) over the PCI Bus to the driver (nf2c0, nf2c1, nf2c2, nf2c3; ioctl) and to software (PW-OSPF, Java GUI).]

  43. Output Port Lookup • 1- Check input port matches Dst MAC • 2- Check TTL, checksum: EXCEPTION! • 3- Add output port module header • Packet (ctrl word: data word): 0x04: output port = 1; 0xff: Pkt length, input port = 0; 0: Eth Hdr: Dst MAC = 0, Src MAC = x, Ethertype = IP; 0: IP Hdr: IP Dst: 192.168.2.3, TTL: 1, Csum: 0x3ab4; 0: Data

  44. Output Queues [Diagram: output queues OQ0, OQ1, OQ2, …, OQ7, with the packet queued for its output port.]

  45. CPU Tx Queue (ctrl word: data word) • 0x04: output port = 1 • 0xff: Pkt length, input port = 0 • 0: Eth Hdr: Dst MAC = 0, Src MAC = x, Ethertype = IP • 0: IP Hdr: IP Dst: 192.168.2.3, TTL: 1, Csum: 0x3ab4 • 0: Data


  47. ICMP Packet • The ICMP packet arrives at the CPU Rx Queue from the PCI Bus. • It follows the same path as a packet from the MAC until the Output Port Lookup (OPL). • The OPL module, seeing that the packet came from CPU Rx Queue 1, sets the output port directly to 0. • The packet then continues on the same path as the non-exception packet to the Output Queues and then to MAC Tx Queue 0.
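
The shortcut the OPL takes for CPU-originated packets can be sketched as below. It assumes the interleaved numbering implied by the example (CPU Rx Queue 1 pairs with MAC queue 0, so CPU queue 2k + 1 pairs with MAC queue 2k); the function names are hypothetical.

```c
#include <assert.h>

#define NUM_QUEUES 8  /* assumed: even queues are MAC ports, odd queues are CPU queues */

/* True if the queue number denotes a CPU queue under the assumed numbering. */
static int is_cpu_queue(int q) {
    assert(q >= 0 && q < NUM_QUEUES);
    return q % 2 == 1;
}

/* For a packet that entered from a CPU Rx queue, skip the forwarding lookup
 * and return the paired MAC queue as the output port. */
static int opl_output_for_cpu_packet(int cpu_rx_queue) {
    assert(is_cpu_queue(cpu_rx_queue));
    return cpu_rx_queue - 1;
}
```

Under this mapping, software chooses the egress port simply by picking which CPU queue it sends the packet on.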

  48. ICMP Packet Path [Diagram: Ethernet connects to four MAC RxQ/MAC TxQ pairs on the NetFPGA user data path; four CPU RxQ/CPU TxQ pairs connect through DMA and Registers (nf2_reg_grp) over the PCI Bus to the driver (nf2c0, nf2c1, nf2c2, nf2c3; ioctl) and to software (PW-OSPF, Java GUI).]

  49. NetFPGA-Host Interaction: NetFPGA to host packet transfer (over the PCI Bus) • 1. Packet arrives; forwarding table sends it to the CPU queue • 2. Interrupt notifies driver of packet arrival • 3. Driver sets up and initiates DMA transfer

  50. NetFPGA-Host Interaction: NetFPGA to host packet transfer (cont) • 4. NetFPGA transfers packet via DMA • 5. Interrupt signals completion of DMA • 6. Driver passes packet to network stack
