Designing a PC Farm to Simultaneously Process Separate Computations Through Different Network Topologies Patrick Dreher MIT
A Multi-Purpose PC Farm • Goals of the Project • Functionality and Constraints • Hardware Selection • Software Selection • Operation
Goals of the Project • User requirements: a production machine for the experimentalists, for Monte Carlo simulations and physics analysis of experimental data; and a development and testing platform for the theorists, to examine the performance characteristics of the x86 chip design • Design a way for experimentalists and theorists to coexist peacefully, sharing the existing PC farm hardware at the same time
Existing PC Farm Hardware • 20 machines, each configured with: dual Pentium II 400 MHz CPUs; 384 Mbytes memory; 13 Gbytes disk space; fast Ethernet • PCs interconnected by Kingston EtherRx 100BASE-TX fast Ethernet stackable hubs • A front-end PC connecting the farm nodes to the Internet
PC Farm Software • Operating system is Red Hat Linux 5.2 (x86) • Linux kernel configured for SMP operation • Batch job production managed through the Network Queuing System, NQS (http://www.gnqs.org)
[Figure: network diagram of the PC farm; labels: HUB, LAN]
Constraints for the Project • No new funds were available to purchase additional CPUs for the existing PC farm • No new funds were available to purchase a separate PC farm for development and testing of theory codes • No funds were available at the level needed to purchase high performance network switches (such as Myrinet) • Small amounts of funds were available for additional peripherals
Modified PC Farm - Functionality • Original configuration had 20 machines (40 CPUs) available under a batch queuing system (NQS) • Modified configuration set aside 4 of the machines (8 CPUs) for the theorists, leaving the other 16 machines (32 CPUs) for production work and analysis of experimental data • Four 4-port Adaptec network cards were purchased, and one was installed in each of the four machines • The four machines were networked together in a two-dimensional torus
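The two-dimensional torus mentioned above can be illustrated with a short sketch. It assumes the four theory machines form a 2 x 2 grid numbered 0 to 3 (the numbering and the function name `torus_neighbors` are hypothetical, not taken from the slides). On a 2 x 2 torus the "+x" and "-x" neighbours of a node are the same machine, so each node has four links going to only two distinct neighbours, which is one plausible reading of why a 4-port card per machine was sufficient.

```python
def torus_neighbors(node, nx, ny):
    """Return the +x, -x, +y, -y neighbours of `node` on an nx-by-ny torus.

    Nodes are numbered row by row: node = x + y * nx (an assumed convention).
    """
    x, y = node % nx, node // nx
    return {
        "+x": ((x + 1) % nx) + y * nx,
        "-x": ((x - 1) % nx) + y * nx,
        "+y": x + ((y + 1) % ny) * nx,
        "-y": x + ((y - 1) % ny) * nx,
    }


# The four theory machines as a 2 x 2 torus (hypothetical numbering 0..3).
for node in range(4):
    links = torus_neighbors(node, 2, 2)
    distinct = sorted(set(links.values()))
    print(f"node {node}: links {links}, distinct neighbours {distinct}")
```

Each node reports four links but only two distinct neighbours, because the wrap-around connections on a grid of width 2 point back to the same machine.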
[Figure: network diagram of the PC farm; labels: HUB, LAN]
Modes of Operation • Production operation for the experimentalists involved configuring NQS so that it identified 40 CPUs available for production and analysis of data • An alternate NQS configuration was built that identified only 32 CPUs available for production • Only one of these two configurations could be installed and operational on the PC farm at a given time
Modes of Operation (cont’d) • When the alternate NQS configuration was loaded: experimentalists continued to use the 32 CPUs; theorists first logged on to the front end and then used ssh to log on to one of the 4 machines not grouped under the alternate NQS configuration • From this point, theory codes could be started using one, two, or all four machines
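The CPU accounting behind the two modes can be checked with a small model. This is only a sketch of the bookkeeping, not NQS configuration syntax: the hostnames `node01` to `node20`, the mode names, and the choice of the last four machines as the theory nodes are all assumptions made for illustration.

```python
CPUS_PER_NODE = 2  # each machine is a dual-CPU PC

# Hypothetical hostnames; the slides do not name the machines.
ALL_NODES = [f"node{i:02d}" for i in range(1, 21)]
# Which four machines were set aside is not stated; the last four are assumed.
THEORY_NODES = ALL_NODES[-4:]


def batch_nodes(mode):
    """Return the machines visible to the batch system in the given mode."""
    if mode == "production":  # full farm under the batch system
        return list(ALL_NODES)
    if mode == "shared":      # alternate configuration: theory nodes excluded
        return [n for n in ALL_NODES if n not in THEORY_NODES]
    raise ValueError(f"unknown mode: {mode}")


def cpu_count(nodes):
    """Total CPUs provided by a list of machines."""
    return len(nodes) * CPUS_PER_NODE


for mode in ("production", "shared"):
    print(mode, "batch CPUs:", cpu_count(batch_nodes(mode)))
print("theory CPUs in shared mode:", cpu_count(THEORY_NODES))
```

In this model the two modes expose 40 and 32 batch CPUs respectively, with 8 CPUs left outside the batch system in the shared mode, matching the figures on the slides.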
Results • Theorists: gathered data as part of a larger program to compare performance among x86, Alpha 21164, and Alpha 21264 chips; interprocessor communication using MPI; tests of memory bandwidth; tests of lattice size versus L2 cache for certain computation routines • Experimentalists: continued Monte Carlo production and analysis of experimental data
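As a rough illustration of the kind of measurement listed for the theorists, the sketch below times a contiguous memory copy for a range of buffer sizes. It is not the benchmark used in the project, whose codes are not shown in the slides; the function name `copy_bandwidth`, the buffer sizes, and the repeat count are assumptions. On many machines the estimated bandwidth drops once the buffer no longer fits in cache, which is the effect a "problem size versus L2 cache" test looks for, although interpreter overhead makes the numbers indicative only.

```python
import time


def copy_bandwidth(size_bytes, repeats=20):
    """Estimate copy bandwidth in MB/s for a buffer of `size_bytes` bytes.

    Takes the best of `repeats` timings to reduce noise from other processes.
    """
    src = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one contiguous copy of the whole buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    best = max(best, 1e-9)  # guard against a zero reading from the timer
    return size_bytes / best / 1e6


# Sweep buffer sizes from 64 KiB to 8 MiB (sizes chosen for illustration).
size = 64 * 1024
while size <= 8 * 1024 * 1024:
    print(f"{size // 1024:6d} KiB: {copy_bandwidth(size):10.1f} MB/s")
    size *= 2
```

Results vary widely between machines and runs, so the sweep is best read as a qualitative picture of where the working set outgrows the cache.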