
Beowulf Clusters


Presentation Transcript


  1. Beowulf Clusters Matthew Doney

  2. What is a cluster? • A cluster is a group of several connected computers • There are several different ways of connecting them • Distributed • Computers widely separated, connected over the Internet • Used by projects such as SETI@home and GIMPS • Workstation Cluster • A collection of workstations loosely connected by a LAN • Cluster Farm • PCs connected over a LAN that perform work when idle

  3. What is a Beowulf Cluster • A Beowulf Cluster is one class of cluster computer • Uses Commercial Off The Shelf (COTS) hardware • Typically contains both master and slave nodes • Not defined by a specific piece of hardware Image Source: http://www.cse.mtu.edu/Common/cluster.jpg

  4. What is a Beowulf Cluster • The origin of the name “Beowulf” • The hero of the Old English epic poem of the same name • Described in the poem: “he has thirty men’s heft of grasp in the gripe of his hand, the bold-in-battle”. Image Source: http://www.teachingcollegeenglish.com/wp-content/uploads/2011/06/lynd-ward-17-jnanam-dot-net.jpg

  5. Cluster Computer History – 1950’s • SAGE, one of the first cluster computers • Developed by IBM for NORAD • Linked radar stations together to form the first early-warning detection system Image Source: http://www.ieeeghn.org/wiki/images/3/34/Sage_nomination.jpg

  6. Cluster Computer History – 1970’s • Technological Advancements • VLSI (Very Large Scale Integration) • Ethernet • UNIX Operating System

  7. Cluster Computer History – 1980’s • Increased interest in cluster computing • Ex: NSA connected 160 Apollo workstations in a cluster configuration • First widely used clustering product: VAXcluster • Development of task scheduling software • Condor package developed by UW-Madison • Development of parallel programming software • PVM (Parallel Virtual Machine)

  8. Cluster Computer History – 1990’s • NOW (Network of Workstations) project at UC Berkeley • First cluster on the TOP500 list • Development of the Myrinet LAN system • Beowulf project started at NASA’s Goddard Space Flight Center Image Source: http://www.cs.berkeley.edu/~pattrsn/Arch/NOW2.jpg

  9. Cluster Computer History – Beowulf • Developed by Thomas Sterling and Donald Becker • 16 individual nodes • 100 MHz Intel 80486 processors • 16 MB memory, 500 MB hard drive • Two 10 Mbps Ethernet ports • Early version of Linux • Used the PVM library

  10. Cluster Computer History – 1990’s • MPI standard developed • Created as a global standard to replace existing message passing protocols • DOE, NASA, and California Institute of Technology collaboration • Developed a Beowulf system with sustained performance of 1 Gflops • Cost $50,000 • Awarded the Gordon Bell prize for price/performance • 28 clusters were on the TOP500 list by the end of the decade

  11. Beowulf Cluster Advantages • Price/Performance • Using COTS hardware greatly reduces associated costs • Scalability • By using individual nodes, more can easily be added by slightly altering the network • Convergence Architecture • Using commodity hardware has standardized operating systems, instruction sets, and communication protocols • Code portability has greatly increased

  12. Beowulf Cluster Advantages • Flexibility of Configuration and Upgrades • Large variety of COTS components • Standardization of COTS components allows for easy upgrades • Technology Tracking • Can use new components as soon as they come out • No delay time waiting for manufacturers to integrate components • High Availability • System will continue to run if an individual node fails

  13. Beowulf Cluster Advantages • Level of Control • System is easily configured to the user’s liking • Development Cost and Time • No special hardware needs to be designed • Less time spent designing the system; just pick the parts to be used • Cheaper mass-market components

  14. Beowulf Cluster Disadvantages • Programming Difficulty • Programs need to be highly parallelized to take advantage of the hardware design • Distributed Memory • Program data is split across the individual nodes • Network speed can bottleneck performance • Results may need to be gathered by a single node

  15. Beowulf Cluster Architecture • Master-Slave configuration • Master Node • Job scheduling • System monitoring • Resource management • Slave Node • Does assigned work • Communicates with other slave nodes • Sends results to master node

  16. Node Hardware • Typically desktop PCs • Can consist of other types of computers, e.g. • Rack-mount servers • Case-less motherboards • PS3s • Raspberry Pi boards

  17. Node Software • Operating System • Resource Manager • Message Passing Software

  18. Resource Management Software • Condor • Developed by UW-Madison • Allows distributed job submission • PBS (Portable Batch System) • Initially developed by NASA • Developed to schedule jobs on parallel compute clusters • Maui • Adds enhanced monitoring to an existing job scheduler (e.g. PBS) • Allows administrators to set individual and group job priorities

  19. Sample Condor Submit File • Submits 150 copies of the program foo • Each copy of the program has its own input, output, and error message file • All of the log information from Condor goes to one file
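
  The submit file itself did not survive the transcript. A minimal sketch of what such a file might look like, assuming the executable is named foo and using HTCondor's $(Process) macro to give each of the 150 copies its own input, output, and error files:

      # submit 150 instances of foo; one shared log file for all of them
      universe   = vanilla
      executable = foo
      input      = foo.$(Process).in
      output     = foo.$(Process).out
      error      = foo.$(Process).err
      log        = foo.log
      queue 150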

  20. Sample Maui Configuration File • User yangq will have the highest priority, with users of the group ART having the lowest • Members of group CS_SE are limited to 20 jobs which use no more than 100 nodes
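
  The configuration file itself is not reproduced in the transcript. A minimal maui.cfg fragment expressing those policies might look like the following; the specific priority values are illustrative assumptions:

      # give user yangq the highest priority, group ART the lowest
      USERCFG[yangq]   PRIORITY=1000
      GROUPCFG[ART]    PRIORITY=-1000

      # cap group CS_SE at 20 jobs using no more than 100 nodes
      GROUPCFG[CS_SE]  MAXJOB=20 MAXNODE=100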

  21. Sample PBS Submit File • Submits job “my_job_name” that needs 1 hour and 4 CPUs with 2GB of memory • Uses file “my_job_name.in” as input • Uses file “my_job_name.log” as output • Uses file “my_job_name.err” as error output
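
  The script itself is also missing from the transcript. A plausible sketch matching the description; the program name my_program is an assumption:

      #!/bin/bash
      #PBS -N my_job_name
      #PBS -l walltime=01:00:00
      #PBS -l nodes=1:ppn=4
      #PBS -l mem=2gb
      #PBS -o my_job_name.log
      #PBS -e my_job_name.err

      # run from the directory the job was submitted from
      cd $PBS_O_WORKDIR
      ./my_program < my_job_name.in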

  22. Message Passing Software • MPI (Message Passing Interface) • Widely used in the HPC community • Specification is controlled by the MPI Forum • Available for free • PVM (Parallel Virtual Machine) • First message passing protocol to be widely used • Provided fault-tolerant operation

  23. MPI Hello World Example
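
  The code listing did not survive the transcript. A standard MPI hello-world in C, along the lines of what the slide presumably showed:

      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char *argv[])
      {
          int rank, size;

          MPI_Init(&argc, &argv);                /* start the MPI runtime */
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id */
          MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total process count */

          printf("Hello world from process %d of %d\n", rank, size);

          MPI_Finalize();                        /* shut down MPI */
          return 0;
      }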

  24. MPI Hello World Example (cont)
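
  The continuation slide is likewise missing. Building and running such a program typically looks like this, assuming the source file is named hello_mpi.c; the order of the output lines varies from run to run:

      $ mpicc hello_mpi.c -o hello_mpi
      $ mpirun -np 4 ./hello_mpi
      Hello world from process 2 of 4
      Hello world from process 0 of 4
      Hello world from process 3 of 4
      Hello world from process 1 of 4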

  25. PVM Hello World Example
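
  The listing is missing from the transcript. The canonical PVM hello example is a master program that spawns a companion task and prints its reply; a sketch along those lines, with the file and task names as assumptions:

      /* hello.c - master: spawns hello_other and prints its reply */
      #include <stdio.h>
      #include "pvm3.h"

      int main()
      {
          int cc, tid;
          char buf[100];

          printf("i'm t%x\n", pvm_mytid());

          /* start one copy of the companion task */
          cc = pvm_spawn("hello_other", (char **)0, 0, "", 1, &tid);

          if (cc == 1) {
              cc = pvm_recv(-1, -1);                     /* wait for any message */
              pvm_bufinfo(cc, (int *)0, (int *)0, &tid); /* who sent it? */
              pvm_upkstr(buf);                           /* unpack the string */
              printf("from t%x: %s\n", tid, buf);
          } else {
              printf("can't start hello_other\n");
          }

          pvm_exit();
          return 0;
      }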

  26. PVM Hello World Example (cont)
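
  The companion task that goes with the master above, again a sketch of the canonical example:

      /* hello_other.c - companion: sends a greeting to its parent */
      #include <string.h>
      #include <unistd.h>
      #include "pvm3.h"

      int main()
      {
          int ptid;
          char buf[100];

          ptid = pvm_parent();              /* tid of the spawning master */

          strcpy(buf, "hello, world from ");
          gethostname(buf + strlen(buf), 64);

          pvm_initsend(PvmDataDefault);     /* prepare a send buffer */
          pvm_pkstr(buf);                   /* pack the string */
          pvm_send(ptid, 1);                /* send with message tag 1 */

          pvm_exit();
          return 0;
      }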

  27. Interconnection Hardware • Two main choices – technology and topology • Main Technologies • Ethernet, with speeds up to 10 Gbps • InfiniBand, with speeds up to 300 Gbps Image Source: http://www.sierra-cables.com/Cables/Images/12X-Infiniband-R.jpg

  28. Interconnection Topology • Bus Network • Torus Network • Flat Neighborhood Network


  31. Questions???
