MB - NG High Throughput: Progress and Current Results • Lots of people helped: MB-NG team at UCL, MB-NG team at Manchester, Andrew McNab • DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester
What’s Involved in Performance • End Hosts • NIC: operation & PCI design – use of modern PCI commands • Chipset & PCI-X • CPU, memory & memory-bus • OS kernel: drivers & TCP/UDP/IP stack • Disk sub-system: disks, controllers & interconnects • Routers • Blades: operation • Switching / routing fabric: bus, crossbar, non-blocking operation • Policies • The Network(s) • Framing • Bandwidth • Load & Congestion
MB – NG SuperJANET4 Development Network (22 Mar 02) • [Network diagram; labels: MAN MCC, UCL, OSM-1OC48-POS-SS, Gigabit Ethernet, 2.5 Gbit POS access, 2.5 Gbit POS core, MPLS, Admin. Domains, SJ4 Dev, PC, 3ware RAID0]
Initial Measurements • UDP throughput very variable • TCP throughput very variable • Lon → Man < Man → Lon? • Investigations: • Packet loss several % • Errors on interface (ifconfig) • Overruns
txqueuelen vs sendstalls • TCP throughput very variable • Lon → Man < Man → Lon? • throughput very variable
Interrupt Coalescence Investigations • TCP mem-mem lon2-man1 • Tx 64, Tx-abs 64, Rx 0, Rx-abs 128: 820-980 Mbit/s ± 50 Mbit/s • Tx 64, Tx-abs 64, Rx 20, Rx-abs 128: 937-940 Mbit/s ± 1.5 Mbit/s • Tx 64, Tx-abs 64, Rx 80, Rx-abs 128: 937-939 Mbit/s ± 1 Mbit/s
24 Hours HighSpeed TCP mem-mem • TCP mem-mem lon2-man1 • Tx 64, Tx-abs 64, Rx 64, Rx-abs 128: 941.5 Mbit/s ± 0.5 Mbit/s
RAID0 Performance (1) • Maxtor 3.5 Series DiamondMax Plus 9 120 GB ATA/133 • Write: slight increase with number of disks • Read: 3 disks OK • Write 100 MBytes/s • Read 130 MBytes/s
RAID0 Performance (2) • Maxtor 3.5 Series DiamondMax Plus 9 120 GB ATA/133 • No difference for write • Larger stripe lowers the performance • Write 100 MBytes/s • Read 120 MBytes/s
GridFTP Throughput HighSpeed TCP • Int Coal 64 128 • Txqueuelen 2000 • TCP buffer 1 Mbyte (rtt*BW = 750 kbytes) • Interface throughput • Acks received • Data moved • 520 Mbit/s • Same for B2B tests • So it's not that simple!
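The rtt*BW figure quoted on this slide is a bandwidth-delay product (BDP), and the arithmetic can be checked with a short Python sketch. This is an illustration only, not part of the talk: the 1 Gbit/s line rate and the 6 ms round-trip time are assumptions chosen to be consistent with the quoted 750 kbytes; the slide states neither value directly.

```python
def bdp_bytes(bandwidth_bit_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes: data in flight needed to fill the path."""
    return bandwidth_bit_per_s * rtt_s / 8


LINE_RATE = 1e9    # assumption: Gigabit Ethernet end hosts
RTT = 6e-3         # assumption: 6 ms, the rtt that yields 750 kbytes at 1 Gbit/s
TCP_BUFFER = 1e6   # 1 Mbyte TCP buffer, as quoted on the slide

bdp = bdp_bytes(LINE_RATE, RTT)
print(f"BDP          = {bdp / 1e3:.0f} kbytes")
print(f"buffer / BDP = {TCP_BUFFER / bdp:.2f}")
```

Under these assumptions the 1 Mbyte buffer is larger than the BDP, so the socket buffer size alone would not be expected to hold the transfer at 520 Mbit/s, which fits the slide's closing remark.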
GridFTP Throughput + Web100 • Throughput Mbit/s: • See alternating 600/800 Mbit/s and zero • Cwnd smooth • No dup ACK / send stall / timeouts
GridFTP Throughput + Web100 • Throughput Mbit/s vs Recv Window Size • Zero throughput independent of Recv Window Size • Bytes sent • Bytes received • Waits 0.4 s at start!
http data transfers HighSpeed TCP • Apache web server out of the box! • Prototype client: curl http library • 1 Mbyte TCP buffers • 2 Gbyte file • Throughput 72 MBytes/s • Cwnd: some variation • No dup ACK / send stall / timeouts
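The http result is quoted in MBytes/s while the memory-to-memory and GridFTP results are in Mbit/s. The following Python sketch puts them on a common scale; the helper names are hypothetical, and decimal prefixes (1 MByte = 10^6 bytes, 1 Gbyte = 10^9 bytes) are an assumption, since the slide does not say which convention was used.

```python
def mbytes_to_mbit(rate_mbytes_per_s: float) -> float:
    """Convert a rate in MBytes/s to Mbit/s (8 bits per byte)."""
    return rate_mbytes_per_s * 8


def transfer_time_s(size_bytes: float, rate_mbytes_per_s: float) -> float:
    """Time to move size_bytes at a steady rate given in MBytes/s."""
    return size_bytes / (rate_mbytes_per_s * 1e6)


HTTP_RATE = 72    # MBytes/s, from the slide
FILE_SIZE = 2e9   # 2 Gbyte file, assuming decimal prefixes

print(f"http rate     = {mbytes_to_mbit(HTTP_RATE)} Mbit/s")
print(f"transfer time = {transfer_time_s(FILE_SIZE, HTTP_RATE):.1f} s")
```

With these assumptions, 72 MBytes/s corresponds to 576 Mbit/s, and the 2 Gbyte file would take a little under 28 s at that steady rate.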
http data transfers (2) • Limited by: • Sender • Receive window size
TCP sharing man1-lon2 mem-mem web100 • Int Coal 64 128 • Txqueuelen 500 (no stall in rtt) • TCP buffer 750 kbytes (rtt*BW = 750 kbytes) • 1 stream every 60 s: • man1 → lon2 • man2 → lon2 • man3 → lon2 • Sample every 10 ms • Send rates: • 940 Mbit/s • 450 Mbit/s • 300 Mbit/s
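The three send rates can be compared with an ideal equal split of the link. The sketch below assumes, as one reading of the slide, that 940, 450 and 300 Mbit/s are the per-stream rates seen with one, two and three streams active; the slide does not state this explicitly, and the names used are hypothetical.

```python
LINK_RATE = 940.0  # Mbit/s, the single-stream rate from the slide

# Assumption: per-stream send rate observed with n streams active.
observed = {1: 940.0, 2: 450.0, 3: 300.0}


def equal_share(link_rate: float, n_streams: int) -> float:
    """Per-stream rate if n_streams divide the link perfectly evenly."""
    return link_rate / n_streams


for n, rate in observed.items():
    share = equal_share(LINK_RATE, n)
    aggregate = n * rate
    print(f"{n} stream(s): observed {rate:.0f}, equal share {share:.0f}, "
          f"aggregate {aggregate:.0f} Mbit/s")
```

Under that reading, the observed rates sit slightly below the ideal shares of 470 and about 313 Mbit/s, giving an aggregate of roughly 900 Mbit/s once the link is shared.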
TCP sharing man1-lon2 mem-mem web100 • Int Coal 64 128 • Txqueuelen 500 (no stall in rtt) • TCP buffer 750 kbytes (rtt*BW = 750 kbytes) • 1 stream every 60 s: • man1 → lon2 • man2 → lon2 • man3 → lon2 • Sample every 10 ms • Time in send limit: • Sender • Cwnd • Recv wind
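The Txqueuelen and TCP buffer settings listed on this slide can be related by counting how many frames fit in one bandwidth-delay product. The sketch below is one plausible reading of "no stall in rtt", not a statement of how the value was chosen in the talk; the 1500-byte frame size is an assumption (standard Ethernet MTU), and protocol headers are ignored.

```python
import math

BDP_BYTES = 750_000   # rtt*BW from the slide
FRAME_BYTES = 1500    # assumption: standard Ethernet MTU, headers ignored
TXQUEUELEN = 500      # transmit queue length (packets) from the slide

# Number of full-size frames needed to carry one BDP of data.
packets_per_rtt = math.ceil(BDP_BYTES / FRAME_BYTES)

print(f"packets in one rtt*BW = {packets_per_rtt}")
print(f"queue holds a full window: {TXQUEUELEN >= packets_per_rtt}")
```

With these assumptions a queue of 500 packets is just large enough to hold one full window of data, which would be consistent with the sender not stalling within a round trip.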
TCP sharing man1-lon2 - the WHY? • 1 stream: • No Dup ACKs • No SACKs • No sendstalls • Why does Cwnd vary?
2 TCP streams man1-lon2 - the WHY? • 2 streams: • Many Dup ACKs • Many SACKs • Why does Cwnd have large variations?
2 TCP streams man1-lon2 - the WHY? (2) • 2 streams: • Dips in throughput due to Dup ACK • ~4 losses/sec • A bit regular? • Cwnd decreases: • 1 point 33% • Ramp starts at 62% • Slope 70 Bytes/µs • Plot time scale: 1 sec
3 TCP streams man1-lon2 - the WHY? • 3 streams: • Dips in throughput due to Dup ACK • Plot time scale: 10 sec
TCP sharing man1-lon2 - the WHY? • There is (a) correlation