
Time measurement of network data transfer



  1. Time measurement of network data transfer R. Fantechi, G. Lamanna 25/5/2011

  2. Outline
  • Motivations
  • Hardware setup
  • Software tools
  • Measurements and their (possible) interpretation
  • Prospects

  3. Motivations
  • Network transfers to L1 and L2 need low latency
  • For both TEL62-PC and PC-PC transfers, do we know how much it is?
  • Which network protocol gives the lowest latency?
  • How does it depend on the computer HW?
  • How does it depend on the network interface?
  • How large are the latency fluctuations? GPUs are sensitive…
  • Knowing the fluctuations is important to stay within the 1 ms budget
  • Standard software monitoring tools give only averages
  • Try to use hardware signals, generated at strategic points inside the software
  • Correlate signals from a sender with those from a receiver

  4. Hardware setup
  • Two PCs with GbE interfaces
  • A is a Pentium 4 2.4 GHz, called PCATE
  • B is a 2×4-core Xeon, called PCGPU
  • Direct Ethernet connection on a hidden network
  • Each PC is equipped with a parallel port interface, used to generate timing pulses (see the sketch below)
  • LeCroy scope for time measurements, histograms, and storage of screenshots
  (Photos: PCATE, PCGPU, adapter for the parallel port)
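
A minimal sketch of how such a timing pulse can be generated from user space on Linux (0x378 is the legacy LPT1 data register, as used in the slides; ioperm() requires root privileges):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/io.h>      /* ioperm(), outb() -- x86 Linux */

    #define LPT_BASE 0x378   /* parallel port data register */

    int main(void)
    {
        /* Ask the kernel for access to the parallel port data register */
        if (ioperm(LPT_BASE, 1, 1) < 0) {
            perror("ioperm");
            exit(1);
        }
        /* Raise and immediately lower a data line:
           the rising edge is what the scope timestamps */
        outb(0x01, LPT_BASE);
        outb(0x00, LPT_BASE);
        return 0;
    }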

  5. Software tools
  • Investigate three “protocols” (the corresponding socket calls are sketched after this list):
  • Raw Ethernet packets (socket PF_PACKET, SOCK_RAW)
  • IP packets (socket PF_INET, SOCK_RAW)
  • TCP packets (socket PF_INET, SOCK_STREAM)
  • Three pairs of simple senders/receivers
  • The sender gets from the command line the packet size, number of packets, delay between packets, and downscaling factor (see later)
  • It initializes the socket and goes into a tight loop, with a delay inside
  • Inside the loop, before and after the send command, it writes a pulse on the parallel port
  • The receiver, after initialization, goes into a receive loop and writes a pulse on the parallel port after having received a packet
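
As an illustration, a sketch of how the three socket flavours can be opened; PROBE_PROT is a placeholder protocol number (its actual value is not shown in the slides), and ETH_P_ALL is one possible choice for the raw-Ethernet case:

    #include <stdio.h>
    #include <sys/socket.h>
    #include <arpa/inet.h>        /* htons() */
    #include <linux/if_ether.h>   /* ETH_P_ALL */

    #define PROBE_PROT 200        /* hypothetical IP protocol number */

    int main(void)
    {
        /* Raw Ethernet frames, bypassing the IP stack (needs root/CAP_NET_RAW) */
        int s_eth = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

        /* Raw IP packets: the kernel builds only the Ethernet header */
        int s_ip = socket(PF_INET, SOCK_RAW, PROBE_PROT);

        /* TCP stream: full stack, including segmentation and flow control */
        int s_tcp = socket(PF_INET, SOCK_STREAM, 0);

        if (s_eth < 0 || s_ip < 0 || s_tcp < 0)
            perror("socket");
        return 0;
    }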

  6. Code example
  • Sender
    /* Create raw socket */
    sock = socket(AF_INET, SOCK_RAW, PROBE_PROT);
    if (sock < 0) {
        perror("opening raw socket");
        exit(1);
    }
    …
    if (iloop < 0) iloop = 1000000000;
    for (i = 0; i < iloop; i++) {
        if (i % 50 == 0) {                  /* downscaled: mark this packet */
            buf[0] = 0x01;
            out = 0x01; outb(out, 0x378);   /* send a pulse */
            out = 0x00; outb(out, 0x378);
        } else
            buf[0] = 0x00;
        if (sendto(sock, buf, buflen, 0, &server,
                   sizeof(struct sockaddr_in)) < 0)
            perror("writing on stream socket");
        out = 0x02; outb(out, 0x378);       /* send a pulse */
        out = 0x00; outb(out, 0x378);
        for (k = 0; k < conv_time; k++);    /* delay loop */
    }
  • Receiver
    /* Create socket */
    sock = socket(AF_INET, SOCK_RAW, PROBE_PROT);
    if (sock < 0) {
        perror("opening raw socket");
        exit(1);
    }
    …
    serv_size = sizeof(server);
    do {
        if ((rval = recvfrom(sock, buf, BUFFER_SIZE, 0,
                             (struct sockaddr *)&server, &serv_size)) < 0)
            perror("reading stream message");
        if (rval == 0)
            printf("Ending connection\n");
        else if (rval == BUFFER_SIZE) {     /* send a pulse */
            outb(0x01, 0x378);
            outb(0x00, 0x378);
        }
        printf("-->%d\n", rval);
    } while (rval != 0);

  7. Software tools
  • Maximum rate
  • On the sender, some time is spent in the code execution
  • The minimum achievable delay between packets varies from ~6 µs to ~10 µs, depending on machine speed, type of protocol, etc. (one way to calibrate the delay loop is sketched after this list)
  • Downscaling factor
  • Needed to operate the scope properly at high rates
  • If the loop index modulo the downscaling factor is 0, the packet carries the pattern to be written by the receiver on the parallel port, otherwise 0
  • Packets are sent at the specified rate, but the scope registers only a fraction
  • Additional tools used
  • Wireshark and tcpdump to check packet arrival
  • ifconfig and /proc/interrupts to count packet and interrupt loss
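
The slides do not show how the delay-loop constant (conv_time in slide 6) is obtained; one plausible way, sketched below under that assumption, is to calibrate the empty loop against the system clock and convert the requested inter-packet delay to loop iterations:

    #include <stdio.h>
    #include <time.h>

    /* Measure how many empty-loop iterations fit in one microsecond;
       conv_time would then be delay_us * loops_per_us (an assumption:
       the slides do not show how the senders derive it). */
    static long loops_per_us(void)
    {
        const long n = 100000000L;
        volatile long k;
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (k = 0; k < n; k++)
            ;                        /* same empty loop as in the sender */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double us = (t1.tv_sec - t0.tv_sec) * 1e6
                  + (t1.tv_nsec - t0.tv_nsec) / 1e3;
        return (long)(n / us);
    }

    int main(void)
    {
        printf("loops per microsecond: %ld\n", loops_per_us());
        return 0;
    }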

  8. Basic method check
  • Are these pulses reliable?
  • A simple check: histogram the width of the pulse generated by the sender
  • Pulse width: ~1.22 µs, sdev 0.04 µs; watch out for the maximum

  9. Parameters used in the tests
  • Packet size: small packets (200 bytes) or large packets (1300 bytes)
  • Protocols: the 3 mentioned before
  • Delay between packets: usually from 10 ms down to the minimum
  • Typical sequence: 10, 5, 2, 1 ms, 100, 50, 20, 10 µs
  • Measurements: store interesting screenshots; record time difference, sigma, max value
  • Time difference = time of rx pulse − time of tx pulse

  10. Lost packets and interrupts
  • No lost packets observed at any rate (checked with ifconfig at source and destination)
  • Interrupt behaviour via /proc/interrupts (a counting sketch follows this list)
  • At high rates the number of interrupts decreases
  • Well-known phenomenon of “interrupt coalescence” in the driver: packets received too close together are buffered and the CPU is interrupted only once
  • For TCP at high rates and 200-byte buffers, interrupts are reduced also because TCP packs many buffers into one Ethernet packet
  • Anyway, measuring TCP performance is harder, as the protocol is free to segment user buffers as it likes (i.e. flow control)
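
A sketch of how the interrupt counts can be read programmatically from /proc/interrupts (the interface name "eth0" is an assumption; the per-CPU columns are summed, and the difference between two readings taken around a run gives the interrupt count):

    #include <stdio.h>
    #include <string.h>

    /* Sum the per-CPU counters on the /proc/interrupts line whose
       description contains the given NIC name. Returns -1 on error. */
    static long nic_interrupts(const char *nic)
    {
        char line[1024];
        long total = -1;
        FILE *f = fopen("/proc/interrupts", "r");
        if (!f)
            return -1;
        while (fgets(line, sizeof(line), f)) {
            if (!strstr(line, nic))
                continue;
            total = 0;
            char *p = strchr(line, ':');   /* skip the "NN:" IRQ field */
            long v;
            int used;
            for (p = p ? p + 1 : line;
                 sscanf(p, "%ld%n", &v, &used) == 1; p += used)
                total += v;                /* one column per CPU */
            break;
        }
        fclose(f);
        return total;
    }

    int main(void)
    {
        printf("eth0 interrupts: %ld\n", nic_interrupts("eth0"));
        return 0;
    }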

  11. RX interrupts - PCGPU

  12. Interrupt coalescence
  Two examples, at 15 µs (left) and 12 µs (right); 1300 bytes, PCATE->PCGPU

  13. CPU usage
  (Plots: CPU usage on the sender and on the receiver)

  14. Time across sendto
  Time difference between a pulse after sendto and one before (both pulses on the same machine)

  15. Time across sendto - Fluctuations
  Count how many times the time is over 20 µs (w.r.t. all times), on PCATE as sender:
  • Raw: ~5/26000
  • IP: ~13/26000
  • TCP: min ~8/20000 (1 ms delay), max ~402/20000 (100 µs delay) at 1300 bytes; 18/26000 at 200 bytes
  (Plots: a quiet example vs. moving the mouse… only 15 values > 4500; a software cross-check sketch follows)
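
As a purely software cross-check of these hardware measurements, the time spent inside sendto() can also be sampled with clock_gettime(); a minimal sketch (sock, buf, buflen and server are assumed to be set up as in slide 6):

    #include <stdio.h>
    #include <time.h>
    #include <sys/socket.h>

    /* Return the wall-clock time, in microseconds, spent in one sendto() */
    static double sendto_us(int sock, const void *buf, size_t buflen,
                            const struct sockaddr *server, socklen_t len)
    {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (sendto(sock, buf, buflen, 0, server, len) < 0)
            perror("sendto");
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return (t1.tv_sec - t0.tv_sec) * 1e6
             + (t1.tv_nsec - t0.tv_nsec) / 1e3;
    }

Histogramming this value over many packets would show the same fluctuation tail as the scope, without needing the parallel-port pulses.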

  16. Transfer time
  As a function of time, for different buffer sizes (the critical zone is marked on the plot)

  17. Transfer time
  As a function of packet size, for different delay settings, PCATE->PCGPU

  18. Transfer time
  PCATE -> PCGPU, raw, 1300 bytes (panels at delays of 5 ms, 2 ms, 1 ms, 200 µs, 100 µs, 500 µs)

  19. Transfer time
  PCGPU->PCATE at 5 ms delay (panels for 200 bytes and 1300 bytes; ~8 µs)

  20. Transfer time trending
  PCGPU->PCATE, raw (panels: 200 bytes @ 50 µs, 1000 bytes @ 50 µs, 1300 bytes @ 40 µs, 1300 bytes @ 20 µs, 200 bytes @ 20 µs, 1000 bytes @ 20 µs)

  21. Summary
  • Hardware timing system: reliable, not interfering with the measurement (at the level of max 10 µs)
  • Time spent in the sender: a fraction (<10%) of the total transfer time; varies with the protocol type; stable with the packet rate
  • Transfer time: down to 50 µs delay it varies little as a function of packet rate, staying between 50 and 120 µs
  • Below 20 µs delay it increases (up to 2 ms) for raw, but not for IP
  • This setup does not work below ~10 µs, where we are most interested

  22. To be done
  • Complete the measurements: both directions, all protocols (TCP, maybe new ones)
  • Performance as a function of CPU power: use different PCs; add load on the machines
  • Test multiple interfaces and switches
  • Change the sender to an object driven by an FPGA (TEL62 or TALK)
  • Investigate different protocol features: new protocols or switch features of the old ones
  • Test more complex transfer software (e.g. TDBIO)
  • Some work hopefully done by USA summer students…
