Chiba City An Open Source Computer Science Testbed. http://www.mcs.anl.gov/chiba/ Mathematics & Computer Science Division Argonne National Laboratory. The Chiba City Project. Chiba City is a Linux cluster built of 314 computers. It was installed at MCS in October of 1999.
Chiba City: An Open Source Computer Science Testbed
http://www.mcs.anl.gov/chiba/
Mathematics & Computer Science Division, Argonne National Laboratory
The Chiba City Project
• Chiba City is a Linux cluster built of 314 computers. It was installed at MCS in October of 1999.
• The primary purpose of Chiba City is to be a scalability testbed, built from open source components, for the High Performance Computing and Computer Science communities.
• Chiba City is a first step towards a many-thousand node system.
Cluster Projects at MCS
• Windows NT SuperCluster (1996-1998) (8 heterogeneous nodes) (r.i.p.)
• Windows NT MPI porting project (8 nodes)
  • Microsoft supported (PIIs, SMP, 256 MB, 8 GB disk, Giganet)
• Collage and ActiveMural (22 nodes)
  • Linux Cluster to Drive (PIIs, 256 MB, 4 GB disk, display adapters)
• Chiba City Test Clusters (4-16 nodes)
  • various development environments
• Chiba City Main System (300+ nodes)
  • 256 core nodes (dual PIIIs, 512 MB RAM, 9 GB disk, Myrinet, Gig-E)
  • 32 visualization nodes (PIIIs, 512 MB RAM, 9 GB disk, Myrinet, Matrox G200s)
  • 8 storage nodes (Xeon, 300 GB disk/node)
  • 18 management nodes (PIIs, multiple networks, 18 GB disk)
• The DSL Data Grid Node (20 nodes), 4 TB aggregate
  • PIII, 512 MB RAM, 200 GB disk, Fast and Gig-E
• Alpha Cluster for Computational Biology
  • 18 XP1000’s (633 MHz, Ev6), 512 MB RAM, 10 GB disk, ServerNetII, Management Node
Cluster-Related Research at MCS
• Scalable Systems Management
  • Chiba City Management Model (w/LANL, LBNL)
  • Msys and City Software for Linux Clusters and Large Unix Environments
• MPI and Communications Software
  • MPICH
  • GigaNet, Myrinet, ServerNetII
• Data Management and Grid Services
  • Globus Services on Linux (w/LBNL, ISI)
• Visualization and Collaboration Tools
  • Parallel OpenGL server (w/Princeton, UIUC)
  • vTK and CAVE Software for Linux Clusters
  • Scalable Media Server (FL Voyager Server on Linux Cluster)
• Scalable Display Environment and Tools
  • Virtual Frame Buffer Software (w/Princeton)
  • VNC (ATT) modifications for ActiveMural
• Parallel I/O
  • MPI-IO and Parallel Filesystems Developments (w/Clemson, PVFS)
• Plus many other MCS research projects that focus on parallel computing in general, with clusters being a specific case of that: PETSc, ALICE, ADIFOR, Neos, ...
Chiba City User Community
• Computer Scientists
• Computational Scientists
• Industry and Educational Partners
• Open Source development groups
Chiba City Design Goals
• Support Computer Scientists and Open Source Developers
  • Dedicated visualization nodes
  • Dedicated storage nodes
  • Extremely flexible node configuration and recovery
• Support Computational Science
  • Computation nodes for message passing parallel applications
  • High performance network
  • Production environment: reliable system, scheduled allocated projects
• Prototype Scalable Systems Software
  • Management fabric
  • Hierarchical, central management system
  • Database driven configuration
Chiba City: The Argonne Scalable Cluster (27 Sep 1999)
• 8 Computing Towns: 256 dual Pentium III systems
• 1 Storage Town: 8 Xeon systems with 300 GB disk each
• 1 Visualization Town: 32 Pentium III systems with Matrox G400 cards
• Cluster Management: 12 PIII mayor systems, 4 PIII front end systems, 2 Xeon file servers, 3.4 TB disk
• Management Net: Gigabit and Fast Ethernet, Gigabit external link
• High Performance Net: 64-bit Myrinet
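As a quick cross-check of the figures on this slide, the per-role system counts can be tallied. The sketch below only restates the numbers listed above; the dictionary keys and the function name are illustrative and are not part of any Chiba City software.

```python
# System counts as listed on the slide. The role labels and the
# helper name are illustrative assumptions, not Chiba City tooling.
CHIBA_SYSTEMS = {
    "compute nodes (8 towns)": 256,
    "storage nodes": 8,
    "visualization nodes": 32,
    "mayors": 12,
    "front ends": 4,
    "file servers": 2,
}


def total_systems(inventory):
    """Sum the per-role system counts."""
    return sum(inventory.values())


if __name__ == "__main__":
    for role, count in CHIBA_SYSTEMS.items():
        print(f"{count:4d}  {role}")
    print(f"{total_systems(CHIBA_SYSTEMS):4d}  total")
```

The total comes to 314, which agrees with the "314 computers" quoted for the cluster, and the 12 + 4 + 2 management systems account for the 18 management nodes.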
Chiba Computing Systems
• A “town” is the basic cluster building unit.
  • 8-32 systems, for the actual work.
  • 1 mayor, for management: OS loading, monitoring, file service.
  • Network and management gear.
• 8 compute towns: each has 32 dual PIII 500 compute nodes that run user jobs.
• 1 storage town: 8 Xeon systems with 300 GB disk, for storage-related research and eventually for production global storage.
• 1 visualization town: 32 nodes for dedicated visualization experiments.
[Diagram: a town, shown as one mayor and its nodes]
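The town model described above (one mayor managing 8-32 working systems) can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the host names (`node000`, `mayor0`, `town0`), the class, and the helper functions are hypothetical and do not reflect the actual Chiba City naming scheme or management software.

```python
from dataclasses import dataclass, field

MIN_NODES, MAX_NODES = 8, 32  # town size range given on the slide


@dataclass
class Town:
    """One mayor plus the nodes it manages (hypothetical model)."""
    name: str
    mayor: str
    nodes: list = field(default_factory=list)

    def __post_init__(self):
        if not MIN_NODES <= len(self.nodes) <= MAX_NODES:
            raise ValueError(
                f"{self.name}: a town has {MIN_NODES}-{MAX_NODES} systems"
            )


def build_compute_towns(num_towns=8, nodes_per_town=32):
    """Build the compute towns with made-up, sequential host names."""
    towns = []
    for t in range(num_towns):
        first = t * nodes_per_town
        nodes = [f"node{first + i:03d}" for i in range(nodes_per_town)]
        towns.append(Town(name=f"town{t}", mayor=f"mayor{t}", nodes=nodes))
    return towns


def mayor_of(towns, node):
    """Return the mayor responsible for a given node."""
    for town in towns:
        if node in town.nodes:
            return town.mayor
    raise KeyError(node)
```

With the defaults, `build_compute_towns()` yields 8 towns holding 256 nodes in total, and `mayor_of(towns, "node032")` resolves to the second town's mayor. Grouping nodes under a mayor keeps management work (OS loading, monitoring) local to each town.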
Chiba Networks
• High Performance Network (not shown)
  • 64-bit Myrinet.
  • All systems connected.
  • Flat topology.
• Management Network
  • Switched Fast and Gigabit Ethernet.
  • Primarily for builds and monitoring.
  • Fast Ethernet: each individual node.
  • Bonded Gigabit Ethernet: mayors, servers, and login nodes; town interconnects; external links.
• IP Topology: 1 flat IP subnet.
[Diagram: each mayor and its nodes (n) attach to an Ethernet switch; the town switches connect to a Gigabit Ethernet switch, which also links the control systems, front ends, test clusters, and the ANL network. Legend: Gigabit Ethernet, Fast Ethernet.]
Chiba Deployment Schedule
• The Chiba City Barnraising: October 1999
• SC99: November 1999
• System Shakedown, Myrinet Installation: December 1999
• Early Users: January 2000
• Production mode: Spring 2000
MPI and Parallel I/O on Chiba
• Using MPICH, ROMIO, and PVFS (Clemson)
• Redesigning MPICH internals for better scalability, support for faster networking technologies (VIA, Myrinet), and for MPI-2
• PVFS servers will run on the storage town
• Collaborating with Clemson on PVFS:
  • scalability and reliability
  • eliminated dependency on NFS for metadata storage
  • redesigning PVFS to support faster communication mechanisms
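To illustrate why running the file system servers on a dedicated storage town helps parallel I/O, here is a minimal sketch of round-robin file striping across I/O servers. It assumes 8 servers (one per storage-town node) and a hypothetical 64 KiB stripe size; it shows the general idea behind a striped parallel file system and is not taken from PVFS itself. The function name and parameters are illustrative.

```python
def locate(offset, num_servers=8, stripe_size=64 * 1024):
    """Map a logical file offset to (server index, offset on that server).

    Assumes plain round-robin striping: stripe 0 goes to server 0,
    stripe 1 to server 1, and so on, wrapping around.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    stripe = offset // stripe_size          # which stripe holds the byte
    server = stripe % num_servers           # round-robin server choice
    local_stripe = stripe // num_servers    # stripes already on that server
    return server, local_stripe * stripe_size + offset % stripe_size
```

Under these assumptions, consecutive stripes of one large file land on different servers, so many compute nodes reading different regions of the file spread their requests across all 8 storage nodes instead of queuing on one.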