Ultimate Integration
Joseph Lappa, Pittsburgh Supercomputing Center
ESCC/Internet2 Joint Techs Workshop
Agenda
• Supercomputing 2004 Conference
• Application
• Ultimate Integration
• Resource Overview
• Did it work?
• What did we take from it?
Supercomputing 2004
• Annual conference
  • Supercomputers
  • Storage
  • Network hardware
• Original reason for the application: the Bandwidth Challenge
  • Didn't apply due to time constraints
Application Requirements
• Runs on Lemieux (PSC's supercomputer)
• Uses the Application Gateways (AGWs)
• Uses the Cisco CRS-1
  • 40 Gb/sec OC-768 cards
  • Few of these exist
• A single application
• Can be used with another demo on the show floor if possible
Ultimate Integration Application
• Checkpoint Recovery System
• Program
  • Garden-variety Laplace solver instrumented to save its memory state in checkpoint files
  • Checkpoints memory to remote network clients (a minimal sketch of the pattern follows below)
  • Runs on 34 Lemieux nodes
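The real checkpoint code ran on Lemieux and streamed over the Quadrics-attached AGWs; a minimal Python sketch of the pattern, with a hypothetical receiver host, port, and checkpoint interval, might look like this:

```python
# Minimal sketch of the checkpoint pattern described above: a toy Jacobi/Laplace
# iteration that periodically ships its in-memory state to a remote checkpoint
# server over TCP.  CHECKPOINT_HOST/PORT and the interval are hypothetical; the
# production solver ran on Lemieux in compiled code over the Quadrics/AGW path.
import socket
import struct
import numpy as np

CHECKPOINT_HOST = "checkpoint-server.example.org"   # hypothetical receiver
CHECKPOINT_PORT = 5001
CHECKPOINT_EVERY = 100                              # iterations between checkpoints

def send_checkpoint(grid: np.ndarray, step: int) -> None:
    """Stream the solver's state (iteration number + raw grid bytes) to the receiver."""
    payload = grid.astype(np.float64).tobytes()
    header = struct.pack("!II", step, len(payload))  # iteration, byte count
    with socket.create_connection((CHECKPOINT_HOST, CHECKPOINT_PORT)) as sock:
        sock.sendall(header + payload)

def laplace_solve(n: int = 512, iterations: int = 1000) -> np.ndarray:
    """Garden-variety Jacobi relaxation with a fixed boundary condition."""
    grid = np.zeros((n, n))
    grid[0, :] = 100.0                               # hot top edge as a simple boundary
    for step in range(1, iterations + 1):
        # Replace every interior point with the average of its four neighbours.
        grid[1:-1, 1:-1] = 0.25 * (grid[:-2, 1:-1] + grid[2:, 1:-1] +
                                   grid[1:-1, :-2] + grid[1:-1, 2:])
        if step % CHECKPOINT_EVERY == 0:
            send_checkpoint(grid, step)
    return grid

if __name__ == "__main__":
    laplace_solve()
```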
Lemieux TCS System
• 750 Compaq AlphaServer ES45 nodes
  • SMP: four 1 GHz Alpha processors
  • 4 GB of memory
• Interconnection
  • Quadrics cluster interconnect
  • Shared-memory library
Application Gateways
• 750 GigE connections are very expensive
• Reuse the Quadrics network to attach cheap Linux boxes with GigE
• 15 AGWs
  • Single-processor Xeons
  • 1 Quadrics card
  • 2 Intel GigE NICs
• Each GigE card maxes out at 990 Mb/sec
• Only need 30 GigE interfaces to fill the link to the TeraGrid (see the arithmetic below)
• Web100 kernel
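A back-of-the-envelope check of the AGW sizing, assuming ~990 Mb/sec of goodput per GigE interface (as quoted above) and a roughly 30 Gb/s path toward the TeraGrid (the exact wide-area capacity is an assumption here):

```python
# Back-of-the-envelope check of the AGW sizing above.  Assumes ~990 Mb/s of
# goodput per GigE interface (from the slide) and a ~30 Gb/s path toward the
# TeraGrid; the precise link capacity is an assumption for illustration.
PER_GIGE_MBPS = 990
LINK_GBPS = 30          # assumed TeraGrid path capacity

interfaces_needed = (LINK_GBPS * 1000) / PER_GIGE_MBPS
print(f"GigE interfaces to fill {LINK_GBPS} Gb/s: {interfaces_needed:.1f}")
# ~30.3 interfaces -> 15 AGWs x 2 Intel GigE ports covers it.
```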
Network
• Cisco 6509
  • Sup720
  • WS-X6748-SFP
  • Two WS-X6704-10GE
    • Used 4 10GE interfaces
• OSPF load balancing was my real worry: >30 GigE streams over 4 links (see the sketch below)
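The worry was that per-flow hashing across four equal-cost OSPF paths can leave links unevenly loaded when only ~30 flows exist. A toy illustration with made-up addresses and a stand-in hash (real Cisco ECMP hashing works differently):

```python
# Toy illustration of the OSPF equal-cost load-balancing worry: per-flow hashing
# can spread ~30 flows unevenly across four equal-cost links.  Addresses and the
# hash function are made up; actual Cisco ECMP hashing differs.
import hashlib
import random
from collections import Counter

LINKS = 4
FLOWS = 32   # roughly one per AGW GigE interface
random.seed(1)

def flow_to_link(src: str, dst: str) -> int:
    """Stand-in for a per-flow ECMP hash over the source/destination pair."""
    return hashlib.md5(f"{src}-{dst}".encode()).digest()[0] % LINKS

flows = [(f"10.0.0.{i}", f"192.168.1.{random.randint(1, 10)}") for i in range(FLOWS)]
load = Counter(flow_to_link(src, dst) for src, dst in flows)

for link in range(LINKS):
    gbps = load[link] * 0.99      # ~990 Mb/s per flow
    print(f"link {link}: {load[link]:2d} flows ~ {gbps:.1f} Gb/s")
# With only tens of flows, a skewed hash can push one 10GE link toward saturation
# while another sits half idle -- hence the concern.
```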
Network
• Cisco CRS-1
  • 40 Gb/sec per slot, 16 slots
• For the demo
  • Two OC-768 cards
    • Ken Goodwin's and Kevin McGratten's big worry was the OC-768 transport
  • Two 8-port 10 GE cards
  • Running production IOS-XR code
    • Had problems with tracking hardware
    • Ran both with 2 switching fabrics removed, with no effect on traffic
Network
• Cisco CRS-1
  • One in the Westinghouse machine room
  • One on the show floor
    • A forklift was needed to place it
  • 7 feet tall
  • 939 lbs empty, 1657 lbs fully loaded
The Magic Box
• StrataLight OTS 4040 transponder "compresses" the 40 Gb/s signal to fit into the spectral bandwidth of a traditional 10G wave
  • http://www.stratalight.com/
  • Uses proprietary encoding techniques
• The StrataLight transponder was connected to the mux/demux of the 15454 as an alien wavelength
Time Dependencies
• The OC-768 link wasn't worked on until one week before the conference
Where Does the Data Land?
• Lustre filesystem
  • http://www.lustre.org/
  • Developed by Cluster File Systems
    • http://www.clusterfs.com/
  • POSIX-compliant, open-source, parallel file system
  • Separates metadata and data objects to allow for speed and scaling
The Show Floor
• 8 checkpoint servers with 10 GigE and InfiniBand connections
• 5 Lustre OSTs connected via InfiniBand with 2 SCSI disk shelves (RAID5)
• Lustre metadata server (MDS) connected via InfiniBand
How well did it run?
• Laplace solver with checkpoint recovery
  • Using 16 Application Gateways (32 GigE connections): 31.1 Gb/s
  • Only 32 Lemieux nodes were available
• iperf
  • Using 17 Application Gateways + 3 single-GigE-attached machines: 35 Gb/s
• Zero SONET errors reported on the interface
• Over 44 TB were transferred (a rough transfer-time check follows below)
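As a rough sanity check on the 44 TB figure, assuming the aggregate rate stayed near the quoted 31-35 Gb/s (the actual demo duration isn't given in the slides):

```python
# Rough sanity check on the 44 TB figure, assuming the aggregate rate stayed
# near the quoted 31-35 Gb/s; the actual demo duration is not in the slides.
DATA_TB = 44
bits = DATA_TB * 1e12 * 8            # decimal terabytes -> bits

for gbps in (31.1, 35.0):
    seconds = bits / (gbps * 1e9)
    print(f"{DATA_TB} TB at {gbps} Gb/s ~ {seconds / 3600:.1f} hours")
# ~2.8-3.1 hours of sustained transfer per 44 TB at these rates.
```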
Just Demoware?
• AGWs
  • The qsub command now has an AGW option
  • Can do accounting (and possibly billing)
    • MySQL database with Web100 stats (a hedged sketch follows below)
  • Validated that the AGW was a cost-effective solution
• OC-768 metro transport can be done by mere mortals
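The slides only say that Web100 stats land in a MySQL database for accounting; a hedged sketch of that idea, assuming the per-connection counters have already been collected from the Web100-instrumented kernel, and using pymysql with hypothetical table and column names:

```python
# Hedged sketch of the "MySQL database with Web100 stats" idea: assumes the
# per-connection Web100 counters have already been collected from the
# instrumented kernel (that collection path is not reproduced here).  The
# table/column names and pymysql as the client library are assumptions.
import pymysql

def record_stats(job_id: str, stats: dict) -> None:
    """Insert one connection's Web100 counters for later accounting/billing."""
    conn = pymysql.connect(host="db.example.org", user="agw",
                           password="secret", database="agw_accounting")
    try:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO web100_stats (job_id, data_bytes_out, duration_ms)"
                " VALUES (%s, %s, %s)",
                (job_id, stats["DataBytesOut"], stats["Duration"]),
            )
        conn.commit()
    finally:
        conn.close()

# Example with made-up numbers:
record_stats("lemieux.12345", {"DataBytesOut": 4_400_000_000, "Duration": 36_000})
```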
Just Demoware??
• Application receiver
  • Laplace solver ran at PSC
  • Checkpoint receiver program tested and run at both NCSA and SDSC
    • Ten IA64 compute nodes as receivers
    • ~10 Gb/sec network to network (to /dev/null): 990 Mb/sec × 10 streams (a receiver sketch follows below)
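A minimal sketch of the receiver side of the network-to-/dev/null test: a TCP sink that accepts one connection per stream and discards the payload. The port is hypothetical and matches the sender sketch earlier; the real receiver ran on IA64 nodes at NCSA and SDSC.

```python
# Minimal sketch of the checkpoint-receiver side described above: a TCP sink
# that accepts one connection per stream and discards the payload, which is
# the "network to /dev/null" test mode.  The port matches the hypothetical
# sender sketch earlier; the production receiver ran on IA64 nodes.
import socket
import threading

LISTEN_PORT = 5001          # hypothetical, matches the sender sketch
CHUNK = 1 << 20             # read 1 MiB at a time

def drain(conn: socket.socket, peer) -> None:
    """Read until the sender closes, counting bytes but keeping nothing."""
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)
    print(f"{peer}: discarded {total / 1e9:.2f} GB")

def serve() -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", LISTEN_PORT))
        srv.listen()
        while True:
            conn, peer = srv.accept()
            threading.Thread(target=drain, args=(conn, peer), daemon=True).start()

if __name__ == "__main__":
    serve()
```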