Beowulf Clusters

Presentation Transcript


  1. Beowulf Clusters
     Paul Tymann
     Computer Science Department
     Rochester Institute of Technology
     ptt@cs.rit.edu

  2. Parallel Computers (Summary)
     • In the mid-70s and 80s, high-performance computing was dominated by systems that were contained in a single “box”.
     • Architectures were specialized and very different from each other.
     • There is no equivalent of the von Neumann architecture for parallel computers.
     • If the software and hardware architectures matched, you could attain significant improvements in performance.
     • Difficult (almost impossible in some cases) to port programs.
     • Programmers had to be specialized.
     • Very expensive.
     364 - Beowulf Clusters

  3. Seymour Cray (1925-1996)
     • Packaging, including heat removal.
     • High level bit plumbing… getting the bits from I/O, into memory through a processor and back to memory and to I/O.
     • Parallelism.
     • Programming: O/S and compiler.
     • Problems being solved.
     • Established the template for vector supercomputer architecture.

  4. Cray X-MP/4

  5. Cray-2

  6. Thinking Machines
     • Company founded by Danny Hillis, Guy Steele and others.
     • Thinking Machines was the leader in scalable computing, with software and applications running on parallel systems ranging from 16 to 1024 processors.
     • In developing the Connection Machine system, Thinking Machines also did pioneering work in parallel software.

  7. Basic Organization
     • Host sends commands & data to microcontroller.
     • Microcontroller broadcasts control signals and data to the array.
     • Microcontroller collects data from the processor array.
     [Figure: Host Computer, Microcontroller, CM Processors and Memories]
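
The host / microcontroller / processor-array flow on this slide can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not Connection Machine software: the names `ProcessingElement`, `Microcontroller`, `broadcast`, `collect`, and `host_program` are hypothetical, and each processor's memory is reduced to a single integer.

```python
from typing import Callable, List


class ProcessingElement:
    """One processor with its own local memory (a single value here)."""

    def __init__(self, value: int) -> None:
        self.memory = value

    def execute(self, instruction: Callable[[int], int]) -> None:
        # Every element applies the same instruction to its own data.
        self.memory = instruction(self.memory)


class Microcontroller:
    """Relays host commands to the whole array (hypothetical model)."""

    def __init__(self, array: List[ProcessingElement]) -> None:
        self.array = array

    def broadcast(self, instruction: Callable[[int], int]) -> None:
        # Broadcast: the same instruction reaches every processing element.
        for pe in self.array:
            pe.execute(instruction)

    def collect(self) -> List[int]:
        # Collect: gather each element's local result for the host.
        return [pe.memory for pe in self.array]


def host_program(data: List[int]) -> List[int]:
    """The host supplies data and commands; the microcontroller does the rest."""
    controller = Microcontroller([ProcessingElement(x) for x in data])
    controller.broadcast(lambda x: x * 2)  # command 1: double every element
    controller.broadcast(lambda x: x + 1)  # command 2: add one to every element
    return controller.collect()


result = host_program([1, 2, 3, 4])
print(result)
```

The loop in `broadcast` runs sequentially here; on the real hardware the elements would act on the instruction at the same time. The point of the sketch is only the division of roles: one instruction stream, many local memories.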

  8. CM-2

  9. CM-5

  10. SPMD Computing
     • SPMD stands for single program, multiple data.
     • The same program is run on all the processors of a MIMD machine.
     • Occasionally the processors may synchronize.
     • Because an entire program is executed on separate data, different processors may take different branches, leading to asynchronous parallelism.
     • SPMD came about from a desire to do SIMD-like calculations on MIMD machines.
     • SPMD is not a hardware paradigm; it is the software equivalent of SIMD.
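
A minimal sketch of the SPMD idea, assuming Python threads stand in for the processors of a MIMD machine (they are not truly parallel in CPython, so this shows the structure only). The names `spmd_program`, `run_spmd`, `rank`, and `size` are illustrative choices, not taken from the slides.

```python
import threading
from typing import List


def spmd_program(rank: int, size: int, data: List[int],
                 partial: List[int], barrier: threading.Barrier) -> None:
    """The single program: every worker runs this same function."""
    # Each worker picks its own share of the data from its rank.
    chunk = data[rank::size]

    # Data-dependent branch: workers may take different paths here,
    # which is what separates SPMD from lock-step SIMD execution.
    if sum(chunk) % 2 == 0:
        partial[rank] = sum(chunk)
    else:
        partial[rank] = sum(x * x for x in chunk)

    # Occasional synchronization point shared by all workers.
    barrier.wait()


def run_spmd(data: List[int], size: int) -> List[int]:
    """Start `size` workers that all execute spmd_program on different data."""
    partial = [0] * size
    barrier = threading.Barrier(size)
    workers = [
        threading.Thread(target=spmd_program,
                         args=(rank, size, data, partial, barrier))
        for rank in range(size)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return partial


results = run_spmd(list(range(1, 10)), size=3)
print(results)  # [12, 93, 18]
```

With this input, ranks 0 and 2 take the first branch and rank 1 takes the second, so the three workers run the same program but follow different control paths before meeting at the barrier.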

  11. Distributed Systems
     • A collection of autonomous computers linked by a network, with software designed to produce an integrated computing facility.
     • The introduction of LANs at the beginning of the 1970s triggered the development of distributed systems.
     • As an alternative to expensive parallel systems, many researchers began to “build” parallel computers using distributed computing technology.
     [Figure: computers linked by a Local Area Network]

  12. Distributed vs. High Performance
     • Distributed systems and distributed software are in common use today:
         • Web servers
         • ATM networks
         • Cell phone system
         • …
     • These systems use distributed computing for architectural reasons (reliability, modularity, …), not necessarily for speed.
     • High-performance distributed computing uses distributed computing to reduce the run time of an application:
         • Primary interest is speed
         • Primary use is parallel computing

  13. Common Systems
     • Beowulf – conqueror of computationally intensive problems
     • COWS – Clusters of Workstations

  14. Clusters of Workstations
     • Cycle vampires
         • Use wasted compute cycles on the desktop
     • Utilize equipment that is not designed for distributed computing
         • 100 Mbps may be fine for mail…
     • Must work with an OS that is designed for general-purpose computing
     • Typically suspend computation when the workstation becomes active
     • Some common software environments include:
         • Condor
         • PVM/P4
         • Autorun
         • …

  15. Communication
     • Communication is vital in any kind of distributed application.
     • Initially most people wrote their own protocols.
         • Tower of Babel effect
     • Eventually standards appeared:
         • Parallel Virtual Machine (PVM)
         • Message Passing Interface (MPI)

  16. What Is a Beowulf Cluster?
     • “It's a kind of high-performance massively parallel computer built primarily out of commodity hardware components, running a free-software operating system like Linux or FreeBSD, interconnected by a private high-speed network.” – Beowulf FAQ.
     • A key feature of a Beowulf cluster is that the machines in the cluster are dedicated to running high-performance computing tasks.
     • The cluster is on a private network.
     • It is usually connected to the outside world through only a single node.

  17. Beowulf Architecture
     [Figure: External Network, Control, Nodes …, cluster of dedicated machines on a separate network]

  18. Origins of Beowulf
     • In the early 1990s, NASA researchers Becker & Sterling identified these problems:
         • Computing projects needed more power.
         • Budgets were increasingly tight.
         • Supercomputer manufacturers were going bust:
             • Maintenance contracts voided.
             • Proprietary hardware no longer upgradeable.
             • Proprietary software no longer maintainable.
     “Learning the peculiarities of a specific vendor only enslaves you to that vendor.”

  19. 1994: Wiglaf
     • Becker & Sterling named their prototype system Wiglaf:
         • 16 nodes, each with:
             • 486-DX4 CPU (100 MHz)
             • 16 MB RAM (60 ns)
             • 540 MB or 1 GB disk
             • Three 10-Mbps Ethernet cards (communication load spread across three distinct Ethernets)
         • Triple-bus topology
         • 42 Mflops (measured)
     Source: Joel Adams, http://www.calvin.edu/~adams/

  20. 1995: Hrothgar
     • They named their next system Hrothgar:
         • 16 nodes, each with:
             • Pentium CPU (100 MHz)
             • 32 MB RAM
             • 1.2 GB disk
             • Two 100-Mbps NICs
         • Two 100-Mbps switches (double-bus topology)
         • 280 Mflops (measured)
     Source: Joel Adams, http://www.calvin.edu/~adams/
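
The figures quoted for Wiglaf and Hrothgar can be compared with a little arithmetic. The sketch below uses only numbers from the two slides; the "nominal link capacity per node" is an assumption that simply multiplies NIC count by NIC speed, and it says nothing about the throughput actually achieved.

```python
from typing import Dict

# Figures copied from the Wiglaf (1994) and Hrothgar (1995) slides.
clusters: Dict[str, Dict[str, float]] = {
    "Wiglaf":   {"nodes": 16, "mflops": 42.0,  "nics": 3, "nic_mbps": 10},
    "Hrothgar": {"nodes": 16, "mflops": 280.0, "nics": 2, "nic_mbps": 100},
}


def per_node_mflops(cluster: Dict[str, float]) -> float:
    """Measured aggregate Mflops divided evenly across the nodes."""
    return cluster["mflops"] / cluster["nodes"]


def nominal_link_mbps(cluster: Dict[str, float]) -> float:
    """Assumed upper bound: every NIC on a node used at full rate at once."""
    return cluster["nics"] * cluster["nic_mbps"]


# Ratio of measured aggregate performance between the two systems.
speedup = clusters["Hrothgar"]["mflops"] / clusters["Wiglaf"]["mflops"]

for name, c in clusters.items():
    print(f"{name}: {per_node_mflops(c):.3f} Mflops/node, "
          f"{nominal_link_mbps(c):.0f} Mbps nominal per node")
print(f"Hrothgar / Wiglaf measured speedup: {speedup:.2f}x")
```

With the same node count, the one-year jump from 42 to 280 Mflops is roughly a 6.7x improvement, or 2.625 versus 17.5 Mflops per node.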

  21. 1997: Stone Soupercomputer
     • Hoffman & Hargrove built ORNL’s Stone Soupercomputer:
         • Donated/castoff nodes
             • 486s, Pentiums, ...
             • Whatever RAM, disk
             • One 10-Mbps NIC
         • One 10-Mbps Ethernet (bus topology)
         • Feb 2002: 133 nodes
         • Total cost: $0
         • Performance/Price ratio: ∞
     Source: Joel Adams, http://www.calvin.edu/~adams/

  22. Stone Soupercomputer
     Source: Joel Adams, http://www.calvin.edu/~adams/

  23. plexus.lac.rit.edu
     • 53 dual-processor (106 CPUs) PIII 1.4 GHz boxes, each with:
         • 36 GB SCSI drives
         • 512 MB of RAM
         • Gigabit Ethernet card
     • A management node.
     • A storage/server node:
         • An attached storage array with 14 73 GB SCSI drives (approx. 1 TB of storage).
     • The cluster is connected by switched Gigabit Ethernet.
     • A 100 Mbps Ethernet is used for administration.

  24. Beowulf Resources
     • The Beowulf Project
         • http://www.beowulf.org
     • The Beowulf Underground
         • http://www.beowulf-underground.org/
     • The Beowulf HOWTO
         • http://www.linux.com/howto/Beowulf-HOWTO.html
     • The Scyld Computing Corporation
         • http://www.scyld.com

  25. Communication
     • Communication is vital in any kind of distributed application.
     • Initially most people wrote their own protocols.
         • Tower of Babel effect.
     • Eventually standards appeared:
         • Parallel Virtual Machine (PVM).
         • Message Passing Interface (MPI).

  26. What is MPI?
     • A message-passing library specification
         • Message-passing model
         • Not a compiler specification (i.e. not a language)
         • Not a specific product
     • Designed for parallel computers, clusters, and heterogeneous networks
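
To make the message-passing model concrete, here is a small simulation in plain Python. It is deliberately not the MPI API: the `Mailboxes` class and its `send` and `recv` methods are hypothetical names, threads stand in for separate processes, and in-memory queues stand in for the network. What it does share with the model is the rule that workers exchange data only through explicit messages.

```python
import queue
import threading
from typing import Dict, List


class Mailboxes:
    """One inbox per rank; the only channel between workers (hypothetical)."""

    def __init__(self, size: int) -> None:
        self.size = size
        self._inbox: Dict[int, queue.Queue] = {r: queue.Queue() for r in range(size)}

    def send(self, dest: int, message: object) -> None:
        self._inbox[dest].put(message)

    def recv(self, rank: int) -> object:
        return self._inbox[rank].get()  # blocks until a message arrives


def worker(rank: int, comm: Mailboxes, numbers: List[int], out: List[int]) -> None:
    if rank == 0:
        # Rank 0 sends one slice to every other rank, then gathers partial sums.
        helpers = comm.size - 1
        for dest in range(1, comm.size):
            comm.send(dest, numbers[dest - 1::helpers])
        out.append(sum(comm.recv(0) for _ in range(helpers)))
    else:
        # Every other rank receives its slice and replies with a partial sum.
        chunk = comm.recv(rank)
        comm.send(0, sum(chunk))


def run(numbers: List[int], size: int) -> int:
    """Sum `numbers` using `size` workers (size must be at least 2)."""
    comm = Mailboxes(size)
    out: List[int] = []
    threads = [threading.Thread(target=worker, args=(r, comm, numbers, out))
               for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out[0]


total = run(list(range(1, 101)), size=4)
print(total)  # 5050
```

Only rank 0 reads the full input list; the other ranks see nothing but the slice they receive. A real MPI implementation would replace the queues with library calls for point-to-point communication between processes, possibly on different machines.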

  27. The MPI Process
     • Development began in early 1992
     • Open process / broad participation
         • IBM, Intel, TMC, Meiko, Cray, Convex, nCUBE
         • PVM, p4, Express, Linda, …
         • Laboratories, universities, government
     • Final version of draft in May 1994
     • Public and vendor implementations are now widely available
