High Rate Event Building with Gigabit Ethernet

• Introduction
• Transport protocols
• Methods to enhance link utilisation
• Test bed measurements
• Conclusions

A. Barczyk, J-P Dufey, B. Jost, N. Neufeld
CERN, Geneva

RT2003, 22.05.03
Introduction

• Typical applications of Gigabit networks in DAQ:
  • Fragment sizes O(kB)
  • Fragment rates O(10–100 kHz)
  • Good use of protocols (high user data occupancy)
• At higher rates:
  • Frame size limited by link bandwidth
  • Protocol overheads sizeable
  • User data bandwidth occupancy becomes smaller
• We studied the use of Gigabit Ethernet technology for 1 MHz readout:
  • Ethernet protocol (Layer 2)
  • IP protocol (Layer 3)
Protocols – Ethernet

• Ethernet (802.3) frame format (field sizes in bytes):

  Preamble (8) | Dst. Addr. (6) | Src. Addr. (6) | Type (2) | Payload (46…1500) | FCS (4)

• Total overhead: 26 Bytes (fixed)
• At 1 MHz: 26 B per frame correspond to 26 MB/s, roughly 21% of the Gigabit link, spent on protocol overhead alone (sketch below)
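To make the overhead cost explicit, a minimal C sketch: it encodes the field sizes from the frame layout above and works out the bandwidth consumed by headers alone at 1 MHz, assuming one frame per event and a 125 MB/s Gigabit link (names are illustrative).

```c
#include <stdio.h>

/* Ethernet (802.3) per-frame overhead in bytes, from the frame layout above */
enum {
    ETH_PREAMBLE = 8,
    ETH_DST_ADDR = 6,
    ETH_SRC_ADDR = 6,
    ETH_TYPE     = 2,
    ETH_FCS      = 4,
    ETH_OVERHEAD = ETH_PREAMBLE + ETH_DST_ADDR + ETH_SRC_ADDR + ETH_TYPE + ETH_FCS  /* 26 */
};

int main(void)
{
    const double frame_rate = 1e6;      /* one frame per event at 1 MHz       */
    const double link_bw    = 125e6;    /* Gigabit Ethernet capacity, bytes/s */

    double overhead_bw = ETH_OVERHEAD * frame_rate;   /* bytes/s of headers   */
    printf("Overhead: %.0f MB/s = %.0f%% of the link\n",
           overhead_bw / 1e6, 100.0 * overhead_bw / link_bw);
    return 0;
}
```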
Protocols – IP

• IP (over Ethernet) frame format (field sizes in bytes):

  Ethernet Header (22) | IP Header (20) | Payload (26…1480) | FCS (4)

• Total overhead: 46 Bytes (26 Eth. + 20 IP)
• At 1 MHz: 46 B per frame correspond to 46 MB/s, roughly 37% of the Gigabit link
• A consideration for the choice of the switching hardware
Protocol overheads & occupancy

• Max. fragment payload given by (evaluated in the sketch below):

  payload_max = (L × C) / F − ov

  • L  = link load [0,1]
  • C  = link capacity [B/s] (125 MB/s for Gigabit Ethernet)
  • F  = frame rate [Hz]
  • ov = protocol overhead [B]
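A minimal C sketch of this payload budget, assuming a 125 MB/s Gigabit link and the ~70% load target used on the following slides; max_payload is an illustrative helper, not code from the system.

```c
#include <stdio.h>

/* Maximum user payload per frame for a given link load target.
 * load     : target link load (0..1)
 * capacity : link capacity in bytes/s (125e6 for Gigabit Ethernet)
 * rate     : frame rate in Hz
 * overhead : fixed protocol overhead per frame in bytes */
double max_payload(double load, double capacity, double rate, double overhead)
{
    return load * capacity / rate - overhead;
}

int main(void)
{
    const double capacity = 125e6, rate = 1e6;   /* GbE link, 1 MHz frame rate */

    printf("Ethernet (ov = 26 B), 70%% load: %.1f B\n",
           max_payload(0.7, capacity, rate, 26.0));
    printf("IP/Eth   (ov = 46 B), 70%% load: %.1f B\n",
           max_payload(0.7, capacity, rate, 46.0));
    return 0;
}
```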
Fragment aggregation

  (Diagram: front-end modules (FE) feeding the event-building switch)

• No higher level protocols (only Layer 2/3)
• Avoid congestion in the switch (packet drop)
• Keep link occupancy below ~70%
• Need to enhance user data bandwidth occupancy
• 2 methods:
  • Aggregation of consecutive event fragments (vertical aggregation)
  • Aggregation of fragments from different sources (horizontal aggregation)
Vertical aggregation

• In the first approach, each event fragment is packed into one Ethernet frame
• Aggregating N consecutive events into one frame at the source reduces the overhead by (N−1) × ov bytes (packing sketch below)
• Implementation: front-end hardware (FPGA)
• Pros:
  • Higher user data occupancy ((N−1) × ov bytes less overhead)
  • Reduced frame rate (by factor 1/N)
• Cons:
  • Increased latency (1st event has to wait for the Nth event before transmission)
  • Larger transport delays (longer frames)
  • N limited by max. Ethernet frame length (segmentation re-introduces overheads)
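A minimal C sketch of the packing step, with a hypothetical fragment descriptor; the real implementation lives in front-end FPGA firmware, so this only illustrates the byte accounting.

```c
#include <string.h>
#include <stdint.h>
#include <stddef.h>

#define ETH_OVERHEAD 26      /* fixed per-frame overhead, bytes           */
#define MAX_PAYLOAD  1500    /* maximum Ethernet payload per frame, bytes */

/* Hypothetical fragment descriptor: 'len' bytes of detector data */
struct fragment {
    const uint8_t *data;
    size_t         len;
};

/* Pack up to n consecutive event fragments into one frame payload.
 * Returns the number of fragments actually packed; compared to sending
 * one frame per fragment this saves (packed - 1) * ETH_OVERHEAD bytes,
 * at the price of the first event waiting for the last one. */
size_t pack_vertical(const struct fragment *frag, size_t n,
                     uint8_t *payload, size_t *payload_len)
{
    size_t used = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (used + frag[i].len > MAX_PAYLOAD)
            break;                                   /* frame full */
        memcpy(payload + used, frag[i].data, frag[i].len);
        used += frag[i].len;
    }
    *payload_len = used;
    return i;
}
```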
Horizontal aggregation

• Aggregate fragments from several sources (N:1)
• Increase output bandwidth by use of several output ports (N:M multiplexing; see the sketch below)
• Implementation: dedicated Readout Unit between front-end and switch
• Pros:
  • Higher user data occupancy ((N−1) × ov bytes less overhead)
  • Reduced frame rate (by factor 1/M)
  • No additional latency in event building
• Cons:
  • Needs dedicated hardware (e.g. Network Processor based) with enough processing power to handle the full input rate
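A minimal sketch of the N:M accounting, under the assumption that the readout unit sends the combined frame of event n on output port n mod M; output_port and port_load are illustrative helpers, not the NP picocode.

```c
#include <stdint.h>

/* Readout unit view of N:M aggregation: the fragments of event 'evt_nr'
 * from N sources are merged into one frame, which is assumed to leave on
 * output port (evt_nr % M), so each output port carries F/M frames/s. */
unsigned output_port(uint32_t evt_nr, unsigned m_outputs)
{
    return (unsigned)(evt_nr % m_outputs);
}

/* Per-output-port link load: each output frame carries N payloads plus one
 * fixed overhead, and each port transmits F/M frames per second.
 * Example: port_load(2, 2, 75.0, 26.0, 1e6, 125e6) is about 0.70, matching
 * the ~75 B threshold quoted on the results slide. */
double port_load(unsigned n_inputs, unsigned m_outputs, double payload,
                 double overhead, double frame_rate, double link_bw)
{
    return (n_inputs * payload + overhead) * (frame_rate / m_outputs) / link_bw;
}
```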
Case studies

• We have studied horizontal aggregation in a test bed using the IBM NP4GS3 Network Processor reference kit
• 2 cases:
  • 2:2 multiplexing on a single NP
  • 4:2 multiplexing with 2 NPs
2:2 Multiplexing – Setup

  (Diagram: generation NP → 2 input links @ 1 MHz → multiplexing NP → 2 output links @ 0.5 MHz)

• We used one NP to:
  • Aggregate frames from 2 input ports (on ingress):
    • Strip off headers
    • Concatenate payloads
  • Distribute the combined frames over 2 output ports (round-robin)
• A second NP generated frames with:
  • Variable payload, and at
  • Variable rate
Aggregation code

  (Diagram: circular buffer in local memory, slot address given by event number, each slot holding EVT NR and FCBA)

• Multi-threaded code (hardware dispatch)
• Independent threads, but shared memory for bookkeeping: a circular buffer storing Frame Control Block information, addressed by event number (sketch below)
• Frames are merged once two fragments with the same event number have arrived:
  • Data store coprocessor fetches blocks of 64 B into (thread-)local memory
  • String copy coprocessor re-arranges the data
  • All coprocessor calls are asynchronous
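A highly simplified, single-threaded C sketch of this bookkeeping; the real code is NP4GS3 picocode with hardware thread dispatch and asynchronous coprocessor calls, and all names and sizes here are illustrative.

```c
#include <stdint.h>
#include <stdbool.h>

#define SLOTS 1024   /* circular buffer size (illustrative) */

/* One slot of the shared bookkeeping buffer: the event number and the
 * Frame Control Block address of the first fragment seen for that event. */
struct slot {
    uint32_t evt_nr;
    uint32_t fcba;      /* FCB address of the waiting fragment */
    bool     occupied;
};

static struct slot ring[SLOTS];

/* Called for every incoming fragment. Returns true (and the partner's FCBA)
 * when both fragments of an event are present and the frames can be merged. */
bool book_fragment(uint32_t evt_nr, uint32_t fcba, uint32_t *partner_fcba)
{
    struct slot *s = &ring[evt_nr % SLOTS];   /* slot address given by event number */

    if (s->occupied && s->evt_nr == evt_nr) { /* second fragment: merge now */
        *partner_fcba = s->fcba;
        s->occupied = false;
        return true;
    }
    s->evt_nr   = evt_nr;                     /* first fragment: park it */
    s->fcba     = fcba;
    s->occupied = true;
    return false;
}
```

When book_fragment reports a match, the payloads behind the two FCBs are concatenated (data store and string copy coprocessors in the real code) and the combined frame is sent out round-robin.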
2:2 Multiplexing – Results

  (Plot: output link load vs. input fragment payload at 1 MHz input rate, with 100% and 70% load lines)

• Link load at 1 MHz input rate:
  • Single output port: link load above 70% for > 30 B input fragment payload
  • Two output ports: load per link is below 70% at up to ~75 B payload (theory)
• Measured up to 56 B input payload:
  • 500 kHz output rate per link
  • 56% output link utilisation
• Perfect agreement with calculations
• To be extended to higher payloads
4:2 Multiplexing

  (Diagram: two NPs, each with a 2:2 ingress and a 2:1 egress stage, connected by DASL; Ethernet input @ 1 MHz, output @ 0.5 MHz)

• Use 2:2 blocks to perform 4:2 multiplexing with 2 NPs
• Each processor:
  • Aggregates 2 input fragments on ingress
  • Sends every 2nd frame to "the other" NP (over the DASL interconnect)
  • Aggregates further on egress (at half the rate, twice the payload)
4:2 Test bed

  (Diagram: generation NP feeding the device under test with 2 × 1 MHz over Ethernet and 1 × 0.5 MHz over DASL; output @ 0.5 MHz)

• Run the full code on one NP (ingress & egress processing)
• Used the second processor to generate traffic:
  • 2 × 1 MHz over Ethernet
  • 1 × 0.5 MHz over DASL (double payload)
• Sustained aggregation at 1 MHz input rate with up to 46 Bytes input payload (output link occupancy: 84% per link; check below)
• Only a fraction of the processor resources used (8 out of 32 threads on average)
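The quoted 84% per-link occupancy follows from the overhead accounting above; a quick check, assuming 4 input payloads of 46 B plus one 26 B Ethernet overhead per output frame, at 0.5 MHz per output port on a 125 MB/s link.

```c
#include <stdio.h>

int main(void)
{
    const double payload  = 46.0, overhead = 26.0;   /* bytes                   */
    const double out_rate = 0.5e6;                   /* frames/s per output port */
    const double link_bw  = 125e6;                   /* bytes/s (Gigabit Ethernet) */

    double frame_bytes = 4 * payload + overhead;     /* 4:2 aggregation: 210 B  */
    double load = frame_bytes * out_rate / link_bw;  /* 0.84 -> 84% per link    */
    printf("Output frame: %.0f B, link load: %.0f%%\n", frame_bytes, 100 * load);
    return 0;
}
```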
Conclusions

• At 1 MHz, protocol overheads "eat up" a significant fraction of the link bandwidth
• 2 methods proposed for increasing the bandwidth fraction available to user data and reducing packet rates:
  • Aggregation of consecutive event fragments
  • Aggregation of fragments from different sources
• N:M multiplexing increases the total available output bandwidth
• Test bed results confirm the calculations for aggregation and multiplexing: "horizontal" aggregation and 4:2 multiplexing with Network Processors are feasible at 1 MHz
Appendix A: Ingress Data Structures

  (Diagram: FCBs → linked lists of BCBs → 64-byte buffers)

• Each Ethernet frame is assigned (see the model below):
  • 1 Frame Control Block (FCB)
  • A linked list of Buffer Control Blocks (BCBs)
• Data stored in buffers of 64 Bytes
• Ingress Data Store:
  • Internal
  • 2048 FCBs / BCBs / buffers
• Current 2:2 implementation: merged frames of up to 2 buffers, i.e. max. 57 Bytes input payload
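A plain-C model of these structures; the actual NP4GS3 control blocks are hardware defined, so the field names here are only illustrative.

```c
#include <stdint.h>

#define N_BUFFERS   2048
#define BUFFER_SIZE 64

/* 64-byte data buffer in the internal ingress data store */
struct ing_buffer {
    uint8_t data[BUFFER_SIZE];
};

/* Buffer Control Block: one per buffer, chained into a list per frame */
struct bcb {
    struct ing_buffer *buffer;
    struct bcb        *next;        /* next BCB of the same frame, or NULL */
    uint16_t           valid_bytes; /* bytes of frame data in this buffer  */
};

/* Frame Control Block: one per Ethernet frame */
struct fcb {
    struct bcb *first_bcb;          /* head of the BCB chain   */
    uint16_t    frame_length;       /* total bytes in the frame */
};
```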
Appendix B: Egress Data Structures

  (Diagram: linked list of twin-buffers, each holding a cell header (CH), DASL frame header (FH), link pointer (LP) and data)

• Linked list of "twin"-buffers
• Reflects the DASL cell structure
• Data stored together with the "bookkeeping" information
• Egress Data Store:
  • External (DDRAM)
  • Up to 512k twin-buffers
• Current 4:2 implementation: merged frames of up to 2 twin-buffers, i.e. max. 56 Bytes input payload
• CH: Cell Header, FH: Frame Header, LP: Link Pointer