Experiences from SLAC SC2004 Bandwidth Challenge
Les Cottrell, SLAC
www.slac.stanford.edu/grp/scs/net/talk03/bwc-04.ppt
SLAC/SC04 Bandwidth Challenge (plan C)
[Network diagram. Show floor (PSC SciNet), SLAC/FNAL booth (2418): 2 Sun Opteron/Chelsio-10GE, 2 Sun Opteron/S2io-10GE, 6 Sun Opteron/Chelsio-10GE, 6 Boston file servers (1 GE), 2 Sun file servers (1 GE), loaned Cisco router. Wide area: two NLR-PITT-SUNN-10GE-17 waves through a Juniper T320 at the NLR demarc; 10 Gbps from NLR (via SEA, DEN, CHI) to Sunnyvale/Level(3)/NLR at 1380 Kifer; ESnet/QWest OC192/SONET over 15808/15540/15454 optical gear to Sunnyvale/Qwest/ESnet at 1400 Kifer. SLAC end: SLAC Cisco router, 1 Sun Opteron/Chelsio-10GE, 1 Sun file server (1 GE).]
SC2004: Tenth of a Terabit/s Challenge
• Joint Caltech, SLAC, FNAL, CERN, UF, SDSC, BR, KR, …
• 10 × 10 Gbps waves to HEP on the show floor
• Bandwidth challenge: aggregate throughput of 101.13 Gbps
• FAST TCP (selection sketch below)
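FAST TCP ran as a Caltech research patch to the Linux kernel. Purely as a hedged illustration, this is how a pluggable congestion-control module is selected per socket on a modern Linux host; note the TCP_CONGESTION socket option only appeared in kernel 2.6.13, after the 2.6.6 kernels used at SC04, and the module name "fast" is an assumption, not a real mainline module:

```python
import socket

# Minimal sketch (Linux only): selecting a TCP congestion-control module
# per socket via the TCP_CONGESTION socket option. Illustrative only:
# FAST TCP shipped as a Caltech research kernel patch, the module name
# "fast" is an assumption, and this option postdates the SC04-era kernels.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"fast")
except OSError:
    # Module not built/loaded; the kernel keeps its default (e.g. reno/cubic).
    print("FAST TCP module not available, using kernel default")
```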
Components
[Photos: Sun v20z and v40z Opteron servers, Chelsio and S2io 10 Gbps NICs with SR XENPAK optics, a 3510 disk array at SVL/NLR, and, for contrast, a 1982-vintage 10 Mbps 3COM NIC.]
Challenge aggregates from SciNet
• Aggregate of Caltech & SLAC booths, in & out
• 7 lambdas to Caltech, 3 to SLAC
Challenge aggregates from MonALISA
• Sustained ~10 Gbps for extended periods
To/From SLAC booth
• NLR: 9.43 Gbps (9.07 Gbps goodput) + 5.65 Gbps (5.44 Gbps goodput) in reverse
  – Two hosts to two hosts
• ESnet: 7.72 Gbps (7.43 Gbps goodput)
  – Only one 10 Gbps host at SVL
• Single V40Z host with 2×10GE NICs to 2×V20Z across country got 11.4 Gbps
• S2io and Chelsio (& Cisco & Juniper) all interwork
• Chelsio worked stably on uncongested paths
(The throughput vs. goodput gap is the per-packet header tax; see the sketch below.)
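A minimal sketch of that header arithmetic, assuming standard 1500-byte frames with IPv4 and TCP-with-timestamps headers; the exact ratio depends on where the throughput counter sits (NIC, router, or application), so this approximates rather than reproduces the measured numbers:

```python
# Rough sketch: why goodput trails raw throughput with 1500-byte frames.
# Assumes standard IPv4/TCP header sizes with TCP timestamps enabled;
# an estimate, not a reproduction of the SC04 measurements.

MTU = 1500                # bytes per IP packet (no jumbo frames)
IP_HDR = 20               # IPv4 header
TCP_HDR = 20 + 12         # TCP header + timestamp option
payload = MTU - IP_HDR - TCP_HDR   # 1448 bytes of application data

fraction = payload / MTU
print(f"useful payload per packet: {payload} B ({fraction:.1%})")
print(f"9.43 Gbps at the IP layer -> ~{9.43 * fraction:.2f} Gbps goodput")
# With 9000-byte jumbo frames the per-packet header tax shrinks ~6x:
print(f"jumbo-frame payload fraction: {(9000 - IP_HDR - TCP_HDR) / 9000:.1%}")
```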
TOE
• Chelsio NIC provided a TCP Offload Engine (TOE)
• Plotted CPU utilization as a function of throughput & number of parallel streams (metric sketched below)
• Reduced CPU load by a factor of ~3 c.f. the non-TOE S2io NIC
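The comparison metric behind that plot can be sketched as CPU consumed per unit of throughput; the numbers below are illustrative placeholders, not SC04 measurements:

```python
# Minimal sketch of the TOE comparison metric: CPU burned per Gbps delivered,
# for a TOE NIC (Chelsio) vs. a non-TOE NIC (S2io). Sample figures are
# hypothetical placeholders chosen only to show the calculation.

def cpu_per_gbps(cpu_util_pct: float, throughput_gbps: float) -> float:
    """CPU percentage points consumed per Gbps of sustained throughput."""
    return cpu_util_pct / throughput_gbps

toe     = cpu_per_gbps(cpu_util_pct=30.0, throughput_gbps=7.5)  # hypothetical
non_toe = cpu_per_gbps(cpu_util_pct=90.0, throughput_gbps=7.5)  # hypothetical

print(f"TOE: {toe:.1f} %CPU/Gbps, non-TOE: {non_toe:.1f} %CPU/Gbps")
print(f"reduction factor: {non_toe / toe:.1f}x")  # slide reports ~3x
```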
Challenges
• Could not get 10 Gbps waves to SLAC itself, only to SVL (Sunnyvale)
• Equipment in 3 locations
  – Keeping configs in lock-step (no NFS, no name service)
• Security concerns, used iptables
• Machines only available 2 weeks before, some not until we got to SC04
• Jumbo frames not configured correctly at SLAC booth, so mainly used 1500B frames (see the packet-rate sketch below)
• Mix of hdw/swr: Opterons with various GHz & disks, Xeons; Solaris 10, Linux 2.4, 2.6
• Coordination between booths (separated by 100 yds)
• Everything state of the art (Linux 2.6.6, SR XENPAKs, NICs)
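Why the jumbo-frame misconfiguration mattered: at 10 Gbps line rate, 1500-byte frames mean roughly six times the packet rate of 9000-byte jumbos, and hence roughly six times the per-packet host work. A back-of-envelope sketch:

```python
# Back-of-envelope sketch: packet-rate penalty of 1500-byte frames vs.
# 9000-byte jumbo frames at 10 Gbps line rate, counting full Ethernet
# on-the-wire overhead.

LINE_RATE = 10e9      # bits/s
WIRE_OVERHEAD = 38    # Ethernet header (14) + FCS (4) + preamble (8) + IFG (12)

for mtu in (1500, 9000):
    wire_bytes = mtu + WIRE_OVERHEAD
    pps = LINE_RATE / (wire_bytes * 8)
    print(f"MTU {mtu}: ~{pps / 1e3:.0f} kpps")
# ~813 kpps at 1500 B vs ~138 kpps at 9000 B: ~6x more per-packet work
# (interrupts, checksums, TCP processing) for the end hosts.
```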