R&D on data transmission FPGA → PC using UDP over 10-Gigabit Ethernet
Domenico Galli, Università di Bologna and INFN, Sezione di Bologna
XII SuperB Project Workshop, Annecy-le-Vieux, 18th March 2010
Commodity Links
• More and more often used in HEP for DAQ, Event Building and High Level Trigger systems:
  • Limited costs;
  • Maintainability;
  • Upgradability.
• Demand for data throughput in HEP is increasing, driven by:
  • Physical event rate;
  • Number of electronic channels;
  • Reduction of the on-line event filter (trigger) stages.
• Industry has moved on since the design of the DAQ for the LHC experiments:
  • 10 Gigabit Ethernet well established;
  • 4x DDR Infiniband (16 Gb/s) ready;
  • 100 Gigabit Ethernet is being actively worked on.
Evaluation of New Commercial Link Technologies
• The Bologna group, in its spare time, is constantly evaluating new commodity link technologies:
  • With a view to their use in DAQ/EB/HLT.
• Evaluated parameters:
  • Maximum throughput;
  • Maximum datagram rate;
  • CPU load;
  • Datagram loss rate.
• Recently tested links:
  • Gigabit Ethernet (presented at IEEE RT-05);
  • 10-Gigabit Ethernet (presented at IEEE RT-09);
  • Infiniband (2010).
• The choice of technology for the experiment must be delayed as much as possible.
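Three of the four evaluated parameters can be derived from simple counters collected during a test run. The sketch below is only an illustration: the names `LinkTestResult` and `summarize` are hypothetical, and CPU load is left out because it would be read from the operating system rather than computed from these counters.

```python
from dataclasses import dataclass


@dataclass
class LinkTestResult:
    throughput_gbps: float    # received payload bits per second, in Gb/s
    datagram_rate_khz: float  # received datagrams per second, in kHz
    loss_fraction: float      # fraction of sent datagrams that never arrived


def summarize(sent: int, received: int, payload_bytes: int, seconds: float) -> LinkTestResult:
    """Turn raw counters from one test run into the evaluated parameters."""
    if sent <= 0 or seconds <= 0:
        raise ValueError("sent and seconds must be positive")
    return LinkTestResult(
        throughput_gbps=received * payload_bytes * 8 / seconds / 1e9,
        datagram_rate_khz=received / seconds / 1e3,
        loss_fraction=(sent - received) / sent,
    )


# Made-up counters, for illustration only (not measured data).
print(summarize(sent=1_000_000, received=999_000, payload_bytes=1472, seconds=2.0))
```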
10-GbE Point-to-Point Tests
• Technology evaluation starts from PC-to-PC tests.
• NICs mounted on the PCI-E bus of commodity PCs act as transmitters and receivers.
• Under real operating conditions, the maximum transfer rate is limited not only by the capacity of the link itself, but also:
  • by the capacity of the data buses (PCI and FSB/QPI);
  • by the ability of the CPUs and of the OS to handle, in due time, the packet processing and the interrupt rates raised by the network interface cards.
[Figure label: 10GBase-SR]
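The shape of such a transmitter/receiver test can be sketched on a single machine using the loopback interface. This is a toy illustration, not the test program used for the measurements: the function name, the 4-byte sequence number and the default sizes are all assumptions.

```python
import socket
import struct


def loopback_udp_test(count: int = 20, payload_size: int = 1024):
    """Send `count` numbered datagrams over loopback; return (sent, received)."""
    if payload_size < 4:
        raise ValueError("payload must hold a 4-byte sequence number")
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        rx.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        rx.settimeout(0.5)         # stop waiting once no more datagrams arrive
        addr = rx.getsockname()
        padding = bytes(payload_size - 4)
        for seq in range(count):
            # Each datagram starts with its sequence number (network byte order).
            tx.sendto(struct.pack("!I", seq) + padding, addr)
        seen = set()
        try:
            while len(seen) < count:
                data, _ = rx.recvfrom(65535)
                seen.add(struct.unpack("!I", data[:4])[0])
        except socket.timeout:
            pass  # whatever has not arrived by now is counted as lost
        return count, len(seen)
    finally:
        rx.close()
        tx.close()


sent, received = loopback_udp_test()
print(f"sent={sent} received={received} lost={sent - received}")
```

A real test would run the sender and the receiver as separate processes on two hosts, so that bus capacity, CPU load and interrupt handling on each side are actually exercised.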
10-GbE Network I/O
• “Fast network, slow host” scenario:
  • Already seen in the transition to 1 Gigabit Ethernet.
• 3 major system bottlenecks may limit the efficiency of high-performance I/O adapters:
  • The peripheral bus bandwidth:
    • PCI-X (peak throughput 8.5 Gbit/s in its 133 MHz flavor) replaced by PCI-E (20 Gbit/s peak throughput in its x8 flavor).
  • The memory bandwidth:
    • The FSB clock has increased from 533 MHz to 1600 MHz; the FSB has then been replaced by AMD HyperTransport and Intel QuickPath Interconnect.
  • The CPU utilization:
    • Multi-core architectures.
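The two peak figures quoted for the peripheral bus can be reproduced with a short calculation. The sketch assumes a 64-bit PCI-X bus and first-generation PCI-E signalling (2.5 GT/s per lane, 8b/10b line coding); the slide does not state either explicitly, so both are assumptions.

```python
def parallel_bus_gbps(width_bits: int, clock_mhz: float) -> float:
    """Peak rate of a parallel bus moving `width_bits` per clock cycle."""
    return width_bits * clock_mhz * 1e6 / 1e9


def pcie_gbps(lanes: int, gt_per_s: float = 2.5, after_8b10b: bool = False) -> float:
    """Peak rate per direction of a PCI-E link (assumed generation 1 by default)."""
    raw = lanes * gt_per_s
    # 8b/10b coding carries 8 data bits in every 10 transmitted bits.
    return raw * 8 / 10 if after_8b10b else raw


print(f"PCI-X 64 bit @ 133 MHz: {parallel_bus_gbps(64, 133.0):.1f} Gbit/s")
print(f"PCI-E x8, raw:          {pcie_gbps(8):.1f} Gbit/s")
print(f"PCI-E x8, after 8b/10b: {pcie_gbps(8, after_8b10b=True):.1f} Gbit/s")
```

Under these assumptions the 20 Gbit/s figure is the raw signalling rate; the rate left for data after line coding is lower, and still comfortably above 10 Gbit/s.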
CPU Affinity Settings
CPU Affinity Settings (II)
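As a rough illustration of what a CPU affinity setting looks like on a Linux host, the sketch below pins the calling process to a single core using Python's `os.sched_setaffinity`. The helper name `pin_to_cpu` is hypothetical, and this is not the configuration used in the tests.

```python
import os


def pin_to_cpu(cpu: int) -> set:
    """Restrict the calling process to one CPU core; return the new affinity set."""
    os.sched_setaffinity(0, {cpu})  # pid 0 means the calling process
    return set(os.sched_getaffinity(0))


# os.sched_setaffinity is only available on some platforms (notably Linux).
if hasattr(os, "sched_setaffinity"):
    allowed = os.sched_getaffinity(0)    # cores this process may currently use
    print("now running on:", pin_to_cpu(min(allowed)))
    os.sched_setaffinity(0, allowed)     # restore the original setting
```

On Linux, interrupt handling can be steered to specific cores in a similar spirit through `/proc/irq/<n>/smp_affinity`.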
UDP protocol
• UDP/IP is the simplest IP protocol that can be implemented in an FPGA.
  • It does not hide the network problems at lower layers.
  • SCTP/IP (Stream Control Transmission Protocol) could be an alternative.
• TCP/IP is too complex:
  • Thousands of connections (and buffers) would need to be kept open on the FPGA side.
  • Too many mechanisms that slow down the data flow would have to be tuned:
    • Congestion control, slow start, sliding windows, retransmission timer, Nagle’s algorithm, etc.
  • Large protocol overhead.
  • Retransmission timer to be tuned in order to keep the latency low.
• Experience in DAQ shows that a protocol stack as complete as possible is very useful to simplify debugging in the commissioning phase:
  • Including ARP, RARP, ICMP (ping), etc.
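Part of what makes UDP simple to generate in hardware is its fixed 8-byte header (source port, destination port, length, checksum) and the absence of any per-connection state. The sketch below builds such a datagram in software; the function name and port numbers are illustrative, and the checksum is left at zero, which IPv4 permits and which means "no checksum computed".

```python
import struct


def udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend the 8-byte UDP header to `payload` (checksum left at 0)."""
    length = 8 + len(payload)  # the length field covers header plus payload
    if length > 0xFFFF:
        raise ValueError("payload too large for a single UDP datagram")
    # Four 16-bit fields in network byte order: ports, length, checksum.
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    return header + payload


datagram = udp_datagram(5000, 6000, b"abcd")
print(datagram.hex())
```

A sender only has to fill in these four fields for every datagram, whereas TCP requires the sequence numbers, windows and timers listed above to be tracked for each connection.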
UDP – Standard Frames
• 1500 B MTU (Maximum Transmission Unit).
• UDP datagrams sent as fast as they can be sent.
• Bottleneck: sender CPU core 2 (sender process 100% system load).
[Plots; legend: User, System, IRQ, Soft IRQ, Total. Annotations: ~4.8 Gb/s; ~440 kHz; softIRQ (4/5), IRQ (1/5); 100% (bottleneck); fake softIRQ; softIRQ (~50%), system (~50%); markers at 2, 3 and 4 frames.]
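The "2 frames", "3 frames" and "4 frames" labels appear to mark datagram sizes at which a UDP datagram no longer fits in one Ethernet frame and is split by IP fragmentation. A minimal sketch of that count follows, assuming IPv4 with a 20-byte header and no options; the function name is hypothetical.

```python
IP_HEADER = 20   # IPv4 header without options (assumption)
UDP_HEADER = 8


def frames_per_datagram(udp_payload: int, mtu: int = 1500) -> int:
    """Number of Ethernet frames needed to carry one UDP datagram."""
    ip_payload = udp_payload + UDP_HEADER
    max_ip_payload = mtu - IP_HEADER
    if ip_payload <= max_ip_payload:
        return 1  # fits in a single, unfragmented packet
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = max_ip_payload // 8 * 8
    remaining = ip_payload - max_ip_payload  # what the last fragment cannot hold
    return 1 + (remaining + per_fragment - 1) // per_fragment


for size in (1472, 1473, 2952, 2953):
    print(size, "B payload ->", frames_per_datagram(size), "frame(s)")
```

With a 1500 B MTU the largest single-frame UDP payload is 1472 B, so a datagram only one byte larger already costs a second frame.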
UDP – Jumbo Frames
• 9000 B MTU.
• Significant improvement with respect to the 1500 B MTU.
[Plots; legend: User, System, IRQ, Soft IRQ, Total. Annotations: ~9.7 Gb/s; ~440 kHz; softIRQ (4/5), IRQ (1/5); 100% (bottleneck); fake softIRQ; softIRQ (~50%), system (~50%); markers at 2, 3 and 4 frames; markers at 2 and 3 PCI-E frames.]
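One reason larger frames help is that the fixed per-frame overhead is spread over more payload. The sketch below estimates the ideal UDP payload rate on a 10 Gb/s link for both MTU values, assuming untagged Ethernet (8 B preamble, 14 B header, 4 B FCS, 12 B inter-frame gap) and IPv4 without options. These are theoretical ceilings, not measurements, and the function name is hypothetical.

```python
ETH_OVERHEAD = 8 + 14 + 4 + 12  # preamble+SFD, header, FCS, inter-frame gap (bytes)
IP_UDP_HEADERS = 20 + 8         # IPv4 header without options, plus UDP header


def max_udp_goodput_gbps(mtu: int, line_rate_gbps: float = 10.0) -> float:
    """Ideal UDP payload rate when every frame is filled up to the MTU."""
    payload = mtu - IP_UDP_HEADERS
    on_wire = mtu + ETH_OVERHEAD
    return line_rate_gbps * payload / on_wire


for mtu in (1500, 9000):
    print(f"MTU {mtu} B: at most {max_udp_goodput_gbps(mtu):.2f} Gb/s of UDP payload")
```

Beyond this wire-level gain, a larger MTU also means fewer frames, and therefore less per-packet work for the host, at a given throughput.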
UDP – Jumbo Frames, 2 Sender Processes
• Twice as many CPU cycles available on the sender PC.
• 10 GbE fully saturated.
• Receiver (playing against 2 senders) not yet saturated.
[Plots; legend: User, System, IRQ, Soft IRQ, Total. Annotations: ~10 Gb/s; ~600 kHz; softIRQ (4/5), IRQ (1/5); 200% (bottleneck); fake softIRQ; softIRQ (25-75%), system (75-90%); ~3 KiB; ~5 KiB, no more CPU bottleneck; markers at 2, 3 and 4 frames.]
R&D Project
• An R&D project (PRIN) has been funded by the Italian Ministry of Education and Research (MIUR):
  • TeraDAQ:
    • Prototype demonstrator of a high-performance data acquisition system based on a PC cluster and using ultra-high-speed networking standards. The project targets particle physics experiments at next-generation accelerators of very high luminosity.
  • INFN Bologna, Bologna University and Roma Tor Vergata University.
  • 51,700 €.
Electronics
• Evaluation kit Xilinx ML605:
  • Equipped with a latest-generation Xilinx Virtex-6 FPGA;
  • FPGA Mezzanine Connector (FMC).
• Connectivity board FMC XM104:
  • 10-GbE CX4 connector.
[Block diagram: Xilinx Virtex-6 FPGA on the ML605 evaluation board (VHDL UDP/IP software core, 10-GbE MAC software core, XAUI SERDES) → FMC → Xilinx FMC XM104 connectivity card with CX4 connector → 10GBASE-CX4 (max 10 m), 10 Gb/s → 10-GbE PC.]
Electronics (II)
• FMC XM104 Connectivity Card:
  • Designed to provide access to eight serial transceivers on the FMC HPC connector found on Xilinx FMC-supported boards, including the Virtex-6 ML605.
[Photo: ML605 board]
Software
• XAUI SERDES and 10-GbE MAC:
  • Available free of charge as evaluation software.
• UDP/IP:
  • Evaluating possible solutions.
Domenico Galli
Dipartimento di Fisica, Alma Mater Studiorum - Università di Bologna and INFN, Sezione di Bologna
domenico.galli@bo.infn.it
http://www.unibo.it/docenti/domenico.galli
Test Platform
Settings