Cluster Computing • Presented by Harsha Madduri
What is a Cluster? • A single administrative domain • A cluster network • Multi-mode • An example: Beowulf-class system
Why Clusters? • Ordinary consumer H/W and S/W lead to HPC • Very inexpensive but very powerful
Some interesting facts • Clusters can exhibit a price-performance advantage approaching a factor of 50. • Real-world applications -> a factor of 3 • In general, twice the performance of a very expensive counterpart. • Advancements in network technologies (LAN) • Gordon Bell
Hardware • Nodes and interconnection network: • Processor • Memory • Secondary Storage • Interface • Hub • Advancements: cLAN, InfiniBand
Software • Operating system: omnipresent, at the heart of every node • Application Programming Environments • PVM • LAM • MPI
Building a cluster • Very simple; it just needs some care • Beowulf cluster building illustrated...
What is Beowulf? • High performance parallel computer • Built out of commodity H/W components, running a free S/W OS
Where do I get Beowulf “software”? • There isn't a software package called Beowulf.
What you need • A few PCs/workstations • A hub • A few Ethernet cables • Any Linux distribution (Slackware, Red Hat, ...) • An Internet connection (or an interface to put some files onto the head node)
Installing OS • Install complete OS (full bundle) on the head node (GUI if needed) • Base system installations on the rest • Cloning can be easy (building slave nodes)
Connections • Connect all the PCs using a hub/switch • Connect monitor, keyboard & mouse (if required) • Power connections: • Use a 230 Watt power supply • ATX and AT boxes are not to be combined
Network address • Choose from the private network ranges • 192.168.x.x • 10.x.x.x • If an Internet connection is available, configure a gateway server
Machine Naming • Use obvious naming schemes (node01, node02, etc.) • A correspondence between names and IP addresses can help • Example: • server - 10.0.0.1 • node01 - 10.0.0.101 • node02 - 10.0.0.102
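The naming scheme on this slide can be turned into /etc/hosts entries mechanically. The following is a minimal sketch, assuming the slide's 10.0.0.x layout (server at .1, nodeNN at .100+NN) and at most 154 compute nodes; the function name `hosts_entries` is hypothetical and not part of any standard tool.

```shell
#!/bin/sh
# Print /etc/hosts-style lines for a head node plus N compute nodes.
# Assumed layout (from the slide's example): server = 10.0.0.1,
# nodeNN = 10.0.0.(100+NN). hosts_entries is a hypothetical helper name.
hosts_entries() {
    count=$1
    printf '10.0.0.1\tserver\n'
    i=1
    while [ "$i" -le "$count" ]; do
        # %02d zero-pads the node number: node01, node02, ...
        printf '10.0.0.%d\tnode%02d\n' $((100 + i)) "$i"
        i=$((i + 1))
    done
}

# Example: entries for a head node and two compute nodes.
hosts_entries 2
```

Appending this output to /etc/hosts on every machine is one way to let the nodes reach each other by name.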
File Sharing • Set up the nodes so that no floppy drive or flash drive is required later. • Connect the systems • Start the NFS server • Example: /sbin/service nfs start
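Before the NFS server is started, the head node has to list what it shares in /etc/exports. The sketch below only prints such a line; the shared directory (/home), the subnet (10.0.0.0/24) and the helper name `export_line` are assumptions for illustration, not something the slides specify.

```shell
#!/bin/sh
# Print an /etc/exports line sharing a directory read-write with a subnet.
# export_line is a hypothetical helper; /home and 10.0.0.0/24 are assumed values.
export_line() {
    dir=$1
    subnet=$2
    printf '%s %s(rw,sync)\n' "$dir" "$subnet"
}

export_line /home 10.0.0.0/24

# On the head node (as root) one might then run:
#   export_line /home 10.0.0.0/24 >> /etc/exports
#   /sbin/service nfs start
# and on each compute node:
#   mount -t nfs server:/home /home
```

Sharing the home directory this way means files placed on the head node are visible on every node without copying.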
MPI Installation • What we need • MPI package (e.g. mpich2) • A C compiler (gcc works well) • A C++ compiler (like g++) for the MPI C++ bindings • Python 2.2 (for the MPD process manager)
Simple steps to install • Uncompress the package • Choose a directory to install into (create it if required) • Configure mpich2 (specify the installation directory) • Build mpich2 (using “make”) • Install the mpich2 commands (e.g. make install) • Set the environment variable “PATH” (add the 'bin' subdirectory)
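The install steps on this slide map onto a short command sequence. The sketch below prints the commands by default (a dry run) and executes them only when DRY_RUN=0 is set. The tarball name, the unpacked source directory and the install prefix are placeholders, since the slides do not name a version or location; `run` and `install_mpich2` are hypothetical helper names.

```shell
#!/bin/sh
# Dry-run sketch of the install steps. Set DRY_RUN=0 to actually execute.
# run prints a command and executes it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# All three arguments are placeholders chosen by the installer:
#   $1 tarball, $2 directory the tarball unpacks to, $3 install prefix
install_mpich2() {
    tarball=$1
    srcdir=$2
    prefix=$3
    run tar xzf "$tarball" &&          # uncompress the package
    run mkdir -p "$prefix" &&          # create the install directory
    run cd "$srcdir" &&
    run ./configure --prefix="$prefix" &&
    run make &&                        # build
    run make install &&                # install the commands
    echo "export PATH=$prefix/bin:\$PATH   # add to the shell startup file"
}

# Example dry run with assumed names:
DRY_RUN=1 install_mpich2 mpich2.tar.gz mpich2-src "$HOME/mpich2-install"
```

Keeping the prefix as a parameter makes it easy to install into a directory that is later shared or copied to the other nodes.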
Checking the installation • Use the commands • which mpd • which mpiexec • which mpirun • These should refer to the commands in the bin subdirectory of the installation directory
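The same check can be scripted so that a wrong PATH is flagged explicitly. This is a minimal sketch; `check_mpi_path` is a hypothetical helper, and the bin directory passed to it is whatever installation directory was chosen earlier.

```shell
#!/bin/sh
# Report whether mpd, mpiexec and mpirun resolve inside the expected bin
# directory. check_mpi_path is a hypothetical helper; $1 is the bin directory.
check_mpi_path() {
    bindir=$1
    status=0
    for cmd in mpd mpiexec mpirun; do
        # command -v prints the path the shell would execute for this name
        found=$(command -v "$cmd" 2>/dev/null) || found=""
        if [ "$found" = "$bindir/$cmd" ]; then
            echo "ok: $cmd -> $found"
        else
            echo "MISMATCH: $cmd -> ${found:-not found}"
            status=1
        fi
    done
    return "$status"
}

# Usage (assumed install location):
#   check_mpi_path "$HOME/mpich2-install/bin"
```

A mismatch usually means PATH was not updated, or another MPI installation comes first in PATH.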
The Final step • Duplicate this directory onto all the slaves • Machines should be reachable using ssh/rsh without a password • Bringing “up” MPD (a process manager) • mpd & • mpdtrace • mpdallexit • Alternative: mpd & • mpdtrace -l • mpd -h <hostname> -p <portnumber>
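The passwordless-login requirement is worth verifying before MPD is started on the slaves. The sketch below assumes ssh (not rsh) and uses the node names from the naming example; `check_passwordless` is a hypothetical helper, and the SSH variable exists only so the command can be swapped out for testing.

```shell
#!/bin/sh
# Check that each node accepts a login without prompting for a password.
# BatchMode=yes makes ssh fail instead of asking for one.
# check_passwordless is a hypothetical helper; SSH may be overridden.
check_passwordless() {
    status=0
    for node in "$@"; do
        if ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$node" true </dev/null; then
            echo "ok: $node"
        else
            echo "FAILED: $node"
            status=1
        fi
    done
    return "$status"
}

# Usage (node names assumed from the naming scheme):
#   check_passwordless node01 node02
```

Any node reported as FAILED needs its ssh keys set up before `mpd -h <hostname> -p <portnumber>` can be started on it remotely.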
Testing the speed • Command: • mpdringtest • mpdringtest 100 (number of times) • Testing whether it can run a multiprocess job • mpiexec -n <number> hostname
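When the multiprocess test runs, each process prints the name of the host it ran on, so counting the names shows how the job was spread across nodes. A small sketch, with `count_by_host` as a hypothetical helper and sample host names assumed from the naming scheme:

```shell
#!/bin/sh
# Count how many processes ran on each host, given one hostname per line on
# standard input (the format produced by `mpiexec -n <number> hostname`).
# count_by_host is a hypothetical helper name.
count_by_host() {
    sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Usage on a running cluster:
#   mpiexec -n 4 hostname | count_by_host
# Demonstration with sample input:
printf 'node01\nnode02\nnode01\nnode02\n' | count_by_host
```

If every process reports the same host, the MPD ring probably contains only one machine.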
Conclusion • Clusters are high-performance systems at a low cost • They are easy to build • If latency were eliminated, massively parallel processing could undoubtedly be done using clusters
References • [1] Cluster Computing White Paper (Dec 2000) by Thomas Sterling, Steve Chapin, Erich Schikuta, Daniel Katz • [2] Building a Beowulf Cluster (May 2001) by Asmund Odegard