Network Measurement & Characterisation and the Challenge of SuperComputing SC200x
ESLEA, Bedfont Lakes, Dec 04 - Richard Hughes-Jones
Bandwidth Lust at SC2003
• Working with S2io, Cisco & folks
• The SC Network
• At the SLAC Booth, running the BW Challenge
The Bandwidth Challenge at SC2003
• The peak aggregate bandwidth from the 3 booths was 23.21 Gbit/s
• One-way link utilisations of >90%
• 6.6 TBytes transferred in 48 minutes (see the rate check below)
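As a sanity check on those headline numbers, a minimal sketch (assuming decimal TBytes and treating the 48 minutes as one continuous transfer) relates the volume shipped to the average aggregate rate:

```python
# Rough sanity check: relate data volume to average aggregate rate.
# Assumes decimal units (1 TByte = 1e12 bytes) and continuous transfer.

volume_bytes = 6.6e12          # 6.6 TBytes shipped during the challenge
duration_s = 48 * 60           # 48 minutes

avg_gbit_s = volume_bytes * 8 / duration_s / 1e9
print(f"Average aggregate rate: {avg_gbit_s:.1f} Gbit/s")     # ~18.3 Gbit/s

peak_gbit_s = 23.21
print(f"Peak/average ratio: {peak_gbit_s / avg_gbit_s:.2f}")  # ~1.27
```

An average of roughly 18 Gbit/s over the whole run is consistent with a 23.21 Gbit/s peak and link utilisations above 90% for much of the period.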
Multi-Gigabit flows at SC2003 BW Challenge
• Three server systems with 10 Gigabit Ethernet NICs
• Used the DataTAG altAIMD stack, 9000 byte MTU
• Sent memory-to-memory iperf TCP streams from the SLAC/FNAL booth in Phoenix to:
  • Palo Alto PAIX
    • rtt 17 ms, window 30 MB
    • Shared with the Caltech booth
    • 4.37 Gbit HighSpeed TCP I=5%
    • Then 2.87 Gbit I=16%
    • Fell when 10 Gbit was on the link
    • 3.3 Gbit Scalable TCP I=8%
    • Tested 2 flows, sum 1.9 Gbit I=39%
  • Chicago Starlight
    • rtt 65 ms, window 60 MB
    • Phoenix CPU 2.2 GHz
    • 3.1 Gbit HighSpeed TCP I=1.6%
  • Amsterdam SARA
    • rtt 175 ms, window 200 MB
    • Phoenix CPU 2.2 GHz
    • 4.35 Gbit HighSpeed TCP I=6.9%
    • Very stable
    • Both used Abilene to Chicago
• The window sizes roughly track each path's bandwidth-delay product (see the sketch below)
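The TCP windows quoted above follow the bandwidth-delay product of each path. A minimal sketch (assuming a 10 Gbit/s target rate for every path; illustrative only, not part of the original measurements) compares the windows used with the BDPs implied by the round-trip times:

```python
# Bandwidth-delay product: the window a single TCP stream needs to keep
# a path full is roughly rate * RTT.  Illustrative only; assumes a
# 10 Gbit/s target rate on every path.

paths = {
    "Palo Alto PAIX":    (0.017, 30),   # (RTT in s, window used in MB)
    "Chicago Starlight": (0.065, 60),
    "Amsterdam SARA":    (0.175, 200),
}

rate_bps = 10e9
for name, (rtt, window_mb) in paths.items():
    bdp_mb = rate_bps * rtt / 8 / 1e6
    print(f"{name:18s} RTT {rtt*1e3:5.0f} ms  "
          f"10G BDP ~{bdp_mb:5.0f} MB  window used {window_mb} MB")
```

Note that the 60 MB window on the Chicago path is below the 10 Gbit/s BDP of about 81 MB, so on that path the window alone would cap a single stream at roughly window/RTT, about 7.4 Gbit/s, still well above the 3.1 Gbit/s actually achieved.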
Super Computing 2004
UKLight at SC2004
• UK e-Science researchers from Manchester, UCL & ULCC involved in the Bandwidth Challenge
• Collaborated with scientists & engineers from Caltech, CERN, FERMI, SLAC, Starlight, UKERNA & U. of Florida
• Worked on:
  • 10 Gbit Ethernet link from SC2004 to the ESnet/QWest PoP in Sunnyvale
  • 10 Gbit Ethernet link from SC2004 to the CENIC/NLR/Level(3) PoP in Sunnyvale
  • 10 Gbit Ethernet link from SC2004 to Chicago and on to UKLight
• UKLight focused on disk-to-disk transfers between UK sites and Pittsburgh
• UK had generous support from Boston Ltd, who loaned the servers
• The BWC collaboration had support from:
  • S2io (NICs)
  • Chelsio (TOE)
  • Sun, who loaned servers
• Essential support from Cisco
Collaboration at SC2004
• Working with S2io, Sun, Chelsio
• Setting up the BW Bunker
• SCinet
• The BW Challenge at the SLAC Booth
The Bandwidth Challenge – SC2004
• The peak aggregate bandwidth from the booths was 101.13 Gbit/s
• That is about 3 full-length DVDs per second (see the check below)
• Saturated ten 10 GE waves
• SLAC Booth: Sunnyvale to Pittsburgh, LA to Pittsburgh and Chicago to Pittsburgh (with UKLight)
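The "DVDs per second" figure is a simple unit conversion; a minimal check (assuming a 4.7 GB single-layer DVD, decimal gigabytes) is:

```python
# Convert the peak aggregate bandwidth into "full-length DVDs per second".
# Assumes a single-layer DVD of 4.7 GB (decimal gigabytes).

peak_gbit_s = 101.13
dvd_bytes = 4.7e9

dvds_per_second = peak_gbit_s * 1e9 / 8 / dvd_bytes
print(f"{dvds_per_second:.1f} DVDs per second")   # ~2.7, i.e. roughly 3
```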
[Network diagram: SC2004 UKLIGHT setup, focused on disk-to-disk transfers. Labels include: SLAC Booth SC2004 (Cisco 6509), Caltech Booth (7600 OSR), UltraLight IP, MB-NG Managed Bandwidth, MB-NG 7600 OSR, NLR Lambda NLR-PITT-STAR-10GE-16, Chicago Starlight, UKlight 10G, SURFnet/EuroLink 10G, CERN 7600, ULCC UKlight, UCL network, UCL HEP, Manchester, Amsterdam; four 1 GE channels and two 1 GE channels on the UK segments.]
Transatlantic Ethernet: TCP Throughput Tests
• Supermicro X5DPE-G2 PCs
• Dual 2.9 GHz Xeon CPUs, FSB 533 MHz
• 1500 byte MTU
• 2.6.6 Linux kernel
• Memory-to-memory TCP throughput
• Standard TCP
• Wire-rate throughput of 940 Mbit/s (see the ceiling calculation below)
• First 10 sec
• Work in progress to study:
  • Implementation detail
  • Advanced stacks
  • Packet loss
  • Sharing
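The 940 Mbit/s figure is essentially the theoretical ceiling for standard TCP over Gigabit Ethernet with a 1500-byte MTU. A minimal sketch (assuming TCP timestamps are enabled, i.e. 12 bytes of TCP options per segment) reproduces it from the per-packet overheads:

```python
# Theoretical TCP goodput ceiling on Gigabit Ethernet with a 1500-byte MTU.
# Assumes TCP timestamps (12 bytes of options), so 1448 bytes of payload
# per segment, and counts the full on-the-wire framing overhead.

line_rate_bps = 1e9
mtu = 1500
ip_hdr, tcp_hdr, tcp_opts = 20, 20, 12
payload = mtu - ip_hdr - tcp_hdr - tcp_opts          # 1448 bytes

eth_hdr, fcs, preamble, ifg = 14, 4, 8, 12
wire_bytes = mtu + eth_hdr + fcs + preamble + ifg    # 1538 bytes per frame

goodput = line_rate_bps * payload / wire_bytes
print(f"Ceiling: {goodput/1e6:.0f} Mbit/s")          # ~941 Mbit/s
```

So the measured 940 Mbit/s is effectively wire rate for this configuration.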
Transatlantic Ethernet: Disk-to-Disk Tests
• Supermicro X5DPE-G2 PCs
• Dual 2.9 GHz Xeon CPUs, FSB 533 MHz
• 1500 byte MTU
• 2.6.6 Linux kernel
• RAID0 (6 SATA disks)
• bbftp (disk-to-disk) throughput
• Standard TCP
• Throughput of 436 Mbit/s
• First 10 sec
• Work in progress to study:
  • Throughput limitations
  • Help real users
10 Gigabit Ethernet: UDP Throughput Tests
• 1500 byte MTU gives ~2 Gbit/s
• Used 16144 byte MTU, max user length 16080 (see the per-packet cost sketch below)
• DataTAG Supermicro PCs
  • Dual 2.2 GHz Xeon CPUs, FSB 400 MHz
  • PCI-X mmrbc 512 bytes
  • Wire-rate throughput of 2.9 Gbit/s
• CERN OpenLab HP Itanium PCs
  • Dual 1.0 GHz 64-bit Itanium CPUs, FSB 400 MHz
  • PCI-X mmrbc 512 bytes
  • Wire rate of 5.7 Gbit/s
• SLAC Dell PCs
  • Dual 3.0 GHz Xeon CPUs, FSB 533 MHz
  • PCI-X mmrbc 4096 bytes
  • Wire rate of 5.4 Gbit/s
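The jump from ~2 Gbit/s at a 1500-byte MTU to several Gbit/s with 16 kB packets is what you would expect when throughput is limited by a roughly fixed per-packet cost (interrupts, DMA setup, protocol processing) rather than by the link. A minimal sketch of that model follows; both the 4.5 µs per-packet cost and the 8 Gbit/s effective host transfer rate are hypothetical figures chosen for illustration, not measured values:

```python
# Simple per-packet-cost model of sender-limited UDP throughput:
# throughput = packet bits / (fixed per-packet cost + serialisation time).
# The 4.5 us per-packet cost is a hypothetical figure chosen so that a
# 1500-byte packet gives ~2 Gbit/s; the 8 Gbit/s effective host/bus rate
# is likewise assumed, not measured.

per_packet_cost_s = 4.5e-6
bus_rate_bps = 8e9

def model_throughput(packet_bytes: int) -> float:
    serialisation = packet_bytes * 8 / bus_rate_bps
    return packet_bytes * 8 / (per_packet_cost_s + serialisation)

for size in (1500, 16080):
    print(f"{size:5d} bytes -> {model_throughput(size)/1e9:.1f} Gbit/s")
```

Under these assumptions the 1500-byte case sits near 2 Gbit/s while the 16080-byte case moves into the few-Gbit/s range actually observed, where the PCI-X bus and host become the limit.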
10 Gigabit Ethernet: Tuning PCI-X
[Logic-analyser traces of the PCI-X sequence (CSR access, data transfer, interrupt & CSR update) for mmrbc = 512, 1024, 2048 and 4096 bytes; 5.7 Gbit/s reached at mmrbc 4096.]
• 16080 byte packets every 200 µs
• Intel PRO/10GbE LR Adapter
• PCI-X bus occupancy vs mmrbc
• Measured times
• Times based on PCI-X times from the logic analyser
• Expected throughput ~7 Gbit/s
• Measured 5.7 Gbit/s
• A simple bus-occupancy model of the mmrbc effect is sketched below
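The effect of mmrbc (the PCI-X maximum memory read byte count) can be approximated with a simple bus-occupancy model: each 16080-byte packet crosses the 133 MHz, 64-bit PCI-X bus as a series of bursts of at most mmrbc bytes, and every burst carries some fixed overhead. The 40-cycle per-burst overhead below is a hypothetical figure for illustration, not a value taken from the logic-analyser measurements:

```python
# Rough PCI-X bus-occupancy model for one 16080-byte packet.
# PCI-X at 133 MHz, 64 bits wide moves 8 bytes per cycle; the packet is
# read from host memory in bursts of at most mmrbc bytes.  The 40-cycle
# per-burst overhead is an assumed figure, not a measured one.

import math

BUS_HZ = 133e6
BYTES_PER_CYCLE = 8
PACKET = 16080
OVERHEAD_CYCLES_PER_BURST = 40

for mmrbc in (512, 1024, 2048, 4096):
    bursts = math.ceil(PACKET / mmrbc)
    cycles = PACKET / BYTES_PER_CYCLE + bursts * OVERHEAD_CYCLES_PER_BURST
    occupancy_us = cycles / BUS_HZ * 1e6
    ceiling_gbit = PACKET * 8 / (occupancy_us * 1e-6) / 1e9
    print(f"mmrbc {mmrbc:4d}: {bursts:2d} bursts, "
          f"bus occupancy ~{occupancy_us:4.1f} us, ceiling ~{ceiling_gbit:.1f} Gbit/s")
```

The trend, larger mmrbc means fewer bursts and less per-packet overhead, is what the tuning exploits; the measured 5.7 Gbit/s at mmrbc 4096 sits below such an idealised ceiling because of the other parts of each PCI-X sequence (CSR accesses, interrupt handling) visible in the traces.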
10 Gigabit Ethernet: SC2004 TCP Tests
• Sun V20z AMD Opteron compute servers
• Chelsio TOE tests between Linux 2.6.6 hosts
  • 10 Gbit Ethernet link from SC2004 to the CENIC/NLR/Level(3) PoP in Sunnyvale
  • Two 2.4 GHz AMD 64-bit Opteron processors with 4 GB of RAM at SC2004
  • 1500B MTU, all Linux 2.6.6
  • In one direction 9.43G, i.e. 9.07G goodput
  • And in the reverse direction 5.65G, i.e. 5.44G goodput
  • Total of 15+G on the wire
• 10 Gbit Ethernet link from SC2004 to the ESnet/QWest PoP in Sunnyvale
  • One 2.4 GHz AMD 64-bit Opteron at each end
  • 2 MByte window, 16 streams, 1500B MTU, all Linux 2.6.6
  • In one direction 7.72 Gbit/s, i.e. 7.42 Gbit/s goodput
  • 120 mins (6.6 Tbits shipped)
• S2io NICs with Solaris 10 in a 4 x 2.2 GHz Opteron CPU V40z, to one or more S2io or Chelsio NICs with Linux 2.6.5 or 2.6.6 in 2 x 2.4 GHz V20zs
  • LAN 1: S2io NIC back to back: 7.46 Gbit/s
  • LAN 2: S2io in V40z to 2 V20zs: each NIC ~6 Gbit/s, total 12.08 Gbit/s
• The gap between wire rate and goodput is protocol overhead (see the sketch below)
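The wire-rate versus goodput gap above is consistent with plain per-segment header overhead. A minimal sketch (assuming 1460-byte TCP payloads with no options inside 1518-byte Ethernet frames, i.e. counting the frame header and FCS but not preamble or inter-frame gap; this is an interpretation of how the quoted figures were counted, not stated in the original) closely reproduces the ratios:

```python
# Relate on-the-wire rate to TCP goodput for a 1500-byte MTU.
# Assumes 1460-byte payload per segment (no TCP options) inside a
# 1518-byte Ethernet frame (header + FCS, excluding preamble/IFG);
# this accounting is an assumption about how the figures were quoted.

payload, frame = 1460, 1518
ratio = payload / frame                      # ~0.962

for wire_gbit in (9.43, 5.65, 7.72):
    print(f"wire {wire_gbit:.2f} Gbit/s -> goodput ~{wire_gbit * ratio:.2f} Gbit/s")
# -> ~9.07, ~5.43, ~7.43, close to the quoted goodput figures
```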
UKLight and ESLEA
• Collaboration forming for SC2005
  • Caltech, CERN, FERMI, SLAC, Starlight, UKLight, …
• Current proposals include:
  • Bandwidth Challenge with even faster disk-to-disk transfers between UK sites and SC2005
  • Radio astronomy demo at 512 Mbit or 1 Gbit user data: Japan, Haystack (MIT), Jodrell Bank, JIVE
  • High-bandwidth link-up between UK and US HPC systems
  • 10 Gig NLR wave to Seattle
  • Set up a 10 Gigabit Ethernet test bench
• Experiments (CALICE) need to investigate >25 Gbit to the processor
• ESLEA/UKlight need resources to study:
  • New protocols and congestion / sharing
  • The interaction between protocol processing, applications and storage
  • Monitoring L1/L2 behaviour in hybrid networks