CLAN: A High Performance, Dual Operating System Compute Server Based on Intel Processors. Kathy Benninger, John Kochmar, Mike Levine, Juan Leon, Paul Nowoczynski, J. Ray Scott. Pittsburgh Supercomputing Center
Agenda • History/Background • Design Issues • Prototype • Software • Production Cluster • Research Areas
The Pittsburgh Supercomputing Center • Past: one of four NSF-funded centers in the “Centers” program • Present: • Active in providing high-end cycles for science research • Advancing Computational Science • Education
PSC, continued • Organizational unit of Carnegie Mellon University • Joint operation by CMU, University of Pittsburgh, and Westinghouse Electric Company • www.psc.edu
PSC Partners • Department of Energy • National Institutes of Health • NIST • National Science Foundation • State of Pennsylvania • Industrial Affiliates
PSC Workstation Cluster History • 1988 - VAX 2000 • 1992 - Cluster • Digital MIPS-based workstations • 1994 - SuperCluster • Digital Alpha-based workstations
Intel Project History • Spring 1997: Intel Technology for Education 2000 Grant • Microsoft Support • Prototype Hardware, 1998 • Production Hardware, 1999
Design Issues • SMP vs Single CPU • Quad vs Dual • Linux vs NT • Why pick!! • Management • emphasize build replication • NT Screen-centric design • Interconnect - VIA Spec on the horizon
“First study the enemy. Seek weakness.” -- Romulan Commander, stardate 1709.2
Prototype • Dual Boot • Replication • MPI • Early Demonstrations • Scientist Development Systems
Software • Base System • NT 4.0 • Linux RH 5.2, Kernel 2.2.2 • Language Support • Digital Visual Fortran (NT) • MS Visual C/C++ (NT) • File system • NFS • Samba
“Madness has no purpose. Or reason. But it may have a goal.” -- Spock, stardate 3088.7
Software, cont. • Message Passing • MPI • PVM • NT threads • Scheduler • Compaq Batch Scheduler • PBS • (more at the end…)
Management • Software Installation • Replication • Configuration Control • User Administration • Verification • SNMP / Java applet • Remote Management
“Conquest is easy. Control is not.” -- Kirk
Production Cluster • Hardware Delivered January 1999 • Installation in Mellon Institute Building
“We Klingons believe as you do -- the sick should die. Only the strong should live.”
Cluster Hardware Detail • Quad 400MHz Pentium II Xeon Processor • 512KB L2 cache • 1GB memory • One 18GB disk • Rackmounted cases, 7 PCI slots: 4 64-bit, 3 32-bit • Dell Remote Assistant card version 2.0 • 14/32X SCSI CD-ROM • Intel EtherExpress PRO/100B PCI fast Ethernet LAN card • (RAID SCSI adapter -- not fully utilized yet)
Cluster Hardware Detail (cont’d) • Redundant hot swap power supply • Redundant hot swap cooling and CPU fans • 17" monitor • Keyboard with integrated trackball and buttons • Two cascaded 8-port APEX PC Solutions KVM switches • Intel Fast Ethernet switch
Users • Yang Wang - LSMS • John Urbanic - Performance • Mike Kopko - Sloan Sky Search • Marcela Madrid - Amber • Dave Yaron - Organic Displays • Nick Nystrom - Games • Westinghouse CNFD
“A little suffering is good for the soul.” -- Kirk, stardate 1514.0
Research Areas • Scheduling in mixed OS environment • “mpi” aware scheduling • PBS port to NT • CBS Sentinel • Genius • not LSF • Usage Accounting • scheduling support • Interconnect • Giganet
Research Areas, cont. • Data Issues • Database/Data Mining • Shared File System • Parallel File System • Middleware • Java • CORBA • Agents/aglets • Oracle 8i?
Plans (Run-Away Train) • National Institutes of Health Biomedical Cluster Workshop, June 1999 • 40 more CPUs • quads?