Cluster Computing: An Introduction 金仲達, Department of Computer Science, National Tsing Hua University king@cs.nthu.edu.tw
What is a Cluster? • A collection of independent computer systems working together as if they were a single system • Coupled through a scalable, high-bandwidth, low-latency interconnect • The nodes can sit in a single cabinet or be separated and connected via a network • Faster, closer connection than a network (LAN) • Looser connection than a symmetric multiprocessor (SMP)
Outline • Motivations for Cluster Computing • Cluster Classifications • Cluster Architecture and its Components • Cluster Middleware • Representative Cluster Systems • Task Forces on Clusters • Resources and Conclusions
How to Run Applications Faster? • There are three ways to improve performance: • Work harder • Work smarter • Get help • Computer analogy: • Use faster hardware, e.g. reduce the time per instruction (clock cycle) • Use optimized algorithms and techniques • Use multiple computers to solve the problem => the techniques of parallel processing are mature and can be exploited commercially
Motivation for Using Clusters • Performance of workstations and PCs is rapidly improving • Communication bandwidth between computers is increasing • Vast numbers of under-utilized workstations offer huge numbers of unused processor cycles • Organizations are reluctant to buy large, high-performance computers because of their high cost and short useful life span
Motivation for Using Clusters • Workstation clusters are thus a cheap and readily available approach to high-performance computing • Clusters are easier to integrate into existing networks • Development tools for workstations are mature • Threads, PVM, MPI, DSM, C, C++, Java, etc. • Using clusters as a distributed compute resource is cost-effective and allows incremental growth of the system • Individual node performance can be improved by adding resources (e.g. memory, disks) • New nodes can be added and existing nodes removed • Clusters of clusters and metacomputing
Key Benefits of Clusters • High performance: running cluster-enabled programs • Scalability: add servers to the cluster, CPUs to an SMP node, or more clusters to the network as the need arises • High throughput • System availability (HA): clusters offer inherently high availability through the redundancy of hardware, operating systems, and applications • Cost effectiveness
Hardware and Software Trends • Important advances have taken place in the last five years • Network performance has increased while cost has dropped • Workstation performance has improved • The average number of transistors on a chip grows about 40% per year • Clock frequency grows about 30% per year • Expect 700-MHz processors with 100M transistors in early 2000 • Availability of powerful and stable operating systems (Linux, FreeBSD) with source code access
Why Clusters NOW? • Clusters gained momentum when three technologies converged: • Very high performance microprocessors • workstation performance = yesterday's supercomputers • High-speed communication • Standard tools for parallel/distributed computing and their growing popularity • Time to market => performance • Internet services: huge demand for scalable, available, dedicated Internet servers • big I/O, big compute
Efficient Communication • The key enabling technology: from killer micro to killer switch • Single-chip building block for scalable networks • high bandwidth • low latency • very reliable • Challenges for clusters • greater routing delay and less than complete reliability • constraints on where the network connects into the node • UNIX has a rigid device and scheduling interface
Putting Them Together ... • Building block = complete computers (HW & SW) shipped in 100,000s: Killer micro, Killer DRAM, Killer disk, Killer OS, Killer packaging, Killer investment • Leverage the billion-dollar-per-year investment • Interconnecting building blocks => Killer Net • High bandwidth • Low latency • Reliable • Commodity (ATM, Gigabit Ethernet, Myrinet)
Windows of Opportunity • The resources available in the average cluster offer a number of research opportunities, such as • Parallel processing: use multiple computers to build MPP/DSM-like systems for parallel computing • Network RAM: use the memory associated with each workstation as an aggregate DRAM cache • Software RAID: use the arrays of workstation disks to provide cheap, highly available, and scalable file storage • Multipath communication: use the multiple networks for parallel data transfer between nodes
Windows of Opportunity • Most high-end scalable WWW servers are clusters • end services (data, web, enhanced information services, reliability) • Network mediation services are also cluster-based • Inktomi traffic server, etc. • Clustered proxy caches, clustered firewalls, etc. • => These object web applications are increasingly compute-intensive • => These applications are an increasing part of “scientific computing”
Clusters Classification 1 • Based on Focus (in Market) • High performance (HP) clusters • Grand challenge applications • High availability (HA) clusters • Mission-critical applications
Clusters Classification 2 • Based on Workstation/PC Ownership • Dedicated clusters • Non-dedicated clusters • Adaptive parallel computing • Can be used for CPU cycle stealing
Clusters Classification 3 • Based on Node Architecture • Clusters of PCs (CoPs) • Clusters of Workstations (COWs) • Clusters of SMPs (CLUMPs)
Clusters Classification 4 • Based on Node Components Architecture & Configuration: • Homogeneous clusters • All nodes have a similar configuration • Heterogeneous clusters • Nodes based on different processors and running different operating systems
Clusters Classification 5 • Based on Levels of Clustering: • Group clusters (# nodes: 2-99) • A set of dedicated/non-dedicated computers, mainly connected by a SAN such as Myrinet • Departmental clusters (# nodes: 99-999) • Organizational clusters (# nodes: many 100s) • Internet-wide clusters = global clusters (# nodes: 1000s to many millions) • Metacomputing
Cluster Components…1a: Nodes • Multiple high performance components: • PCs • Workstations • SMPs (CLUMPS) • Distributed HPC systems leading to metacomputing • They can be based on different architectures and run different operating systems
Cluster Components…1b: Processors • There are many (CISC/RISC/VLIW/Vector, ...) • Intel: Pentiums, Xeon, Merced, ... • Sun: SPARC, UltraSPARC • HP PA • IBM RS6000/PowerPC • SGI MIPS • Digital Alphas • Integrating memory, processing, and networking into a single chip • IRAM (CPU & Mem): (http://iram.cs.berkeley.edu) • Alpha 21364 (CPU, memory controller, NI)
Cluster Components…2: Operating Systems • State-of-the-art OS: • Tend to be modular: can easily be extended, and new subsystems can be added without modifying the underlying OS structure • Multithreading has added a new dimension to parallel processing • Popular OSes used on cluster nodes: • Linux (Beowulf) • Microsoft NT (Illinois HPVM) • Sun Solaris (Berkeley NOW) • IBM AIX (IBM SP2) • ...
Cluster Components…3: High-Performance Networks • Ethernet (10 Mbps) • Fast Ethernet (100 Mbps) • Gigabit Ethernet (1 Gbps) • SCI (Dolphin, ~12 µs MPI latency) • ATM • Myrinet (1.2 Gbps) • Digital Memory Channel • FDDI
Cluster Components…4: Network Interfaces • Dedicated processing power and storage embedded in the network interface • An I/O card today • Tomorrow on chip? • [Diagram: a Sun Ultra 170 node (processor, memory, cache) with a Myricom NIC attached to the 50 MB/s S-Bus I/O bus, connecting to the 160 MB/s Myricom network]
Cluster Components…4: Network Interfaces (cont.) • Network interface card • Myrinet has its own NIC • User-level access support: VIA • The Alpha 21364 processor integrates processing, memory controller, and network interface into a single chip
Cluster Components…5: Communication Software • Traditional OS-supported facilities (but heavyweight due to protocol processing) • Sockets (TCP/IP), pipes, etc. • Lightweight protocols (user-level): minimal interface into the OS • Users transmit into and receive from the network directly, without OS intervention • Communication protection domains established by the interface card and the OS • Treat message loss as an infrequent case • Active Messages (Berkeley), Fast Messages (Illinois), ... (see the sketch below)
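For contrast, here is a minimal sketch in C of the traditional, kernel-mediated path the slide refers to: a plain TCP socket send, in which every operation traps into the OS and runs the full protocol stack. This is exactly the per-message overhead that user-level schemes such as Active Messages try to avoid. The peer address and port are hypothetical placeholders, not taken from the slides.

```c
/* Sketch: traditional kernel-mediated communication over TCP sockets.
 * Every socket(), connect(), and send() is a system call into the OS. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);        /* kernel allocates the socket  */
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in peer = {0};
    peer.sin_family = AF_INET;
    peer.sin_port   = htons(5000);                    /* hypothetical port            */
    inet_pton(AF_INET, "192.168.1.2", &peer.sin_addr);/* hypothetical peer node       */

    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        perror("connect");
        return 1;
    }

    const char *msg = "hello from node A";
    send(fd, msg, strlen(msg), 0);                    /* traps into the OS, runs TCP/IP */
    close(fd);
    return 0;
}
```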
Cluster Components…6a: Cluster Middleware • Resides between the OS and applications and offers an infrastructure for supporting: • Single System Image (SSI) • System Availability (SA) • SSI makes a collection of computers appear as a single machine (globalized view of system resources) • SA supports checkpointing, process migration, etc.
Cluster Components…6b: Middleware Components • Hardware • DEC Memory Channel, DSM (Alewife, DASH), SMP techniques • OS/gluing layers • Solaris MC, UnixWare, GLUnix • Applications and subsystems • System management and electronic forms • Runtime systems (software DSM, PFS, etc.) • Resource management and scheduling (RMS): • CODINE, LSF, PBS, NQS, etc.
Cluster Components…7a: Programming Environments • Threads (PCs, SMPs, NOW, ...) • POSIX Threads • Java Threads • MPI (see the sketch below) • Linux, NT, and many supercomputers • PVM • Software DSMs (Shmem)
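As a concrete illustration of the MPI environment listed above, here is a minimal C sketch in which each cluster node runs one copy of the program and reports its rank. The compile and launch commands (e.g. mpicc, mpirun) depend on the local MPI installation and are not part of the original slides.

```c
/* Minimal MPI program: one process per cluster node (or per CPU). */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                    /* join the parallel job        */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);      /* this process's id            */
    MPI_Comm_size(MPI_COMM_WORLD, &size);      /* total number of processes    */

    printf("Hello from process %d of %d\n", rank, size);

    MPI_Finalize();                            /* leave the parallel job       */
    return 0;
}
```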
Cluster Components…7b: Development Tools • Compilers • C/C++/Java • RAD (rapid application development) tools: GUI-based tools for parallel processing modeling • Debuggers • Performance monitoring and analysis tools • Visualization tools
Cluster Components…8: Applications • Sequential • Parallel/distributed (cluster-aware applications) • Grand challenge applications • Weather forecasting • Quantum chemistry • Molecular biology modeling • Engineering analysis (CAD/CAM) • ... • Web servers, data mining
Middleware Design Goals • Complete transparency • Let users see a single cluster system • Single entry point, ftp, telnet, software loading, ... • Scalable performance • Easy growth of the cluster • no change of API; automatic load distribution • Enhanced availability • Automatic recovery from failures • Employ checkpointing and fault-tolerance technologies • Handle consistency of data when replicated
Single System Image (SSI) • A single system image is the illusion, created by software or hardware, that a collection of computers appears as a single computing resource • Benefits: • Transparent use of system resources • Improved reliability and higher availability • Simplified system management • Reduction in the risk of operator errors • Users need not be aware of the underlying system architecture to use these machines effectively
Desired SSI Services • Single entry point • telnet cluster.my_institute.edu (rather than telnet node1.cluster.my_institute.edu) • Single file hierarchy: AFS, Solaris MC Proxy • Single control point: manage from a single GUI • Single virtual networking • Single memory space: DSM • Single job management: GLUnix, Codine, LSF • Single user interface: like a workstation/PC windowing environment
SSI Levels • Single-system support can exist at different levels within a system, and one level can be built on another: • Application and subsystem level • Operating system kernel level • Hardware level
Availability Support Functions • Single I/O space (SIO): • Any node can access any peripheral or disk device without knowledge of its physical location • Single process space (SPS): • Any process can create processes on any node, and they can communicate through signals, pipes, etc., as if they were on a single node • Checkpointing and process migration (see the sketch below) • Save the process state and intermediate results in memory or on disk; migrate processes for load balancing
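To make the checkpointing idea concrete, here is a minimal application-level sketch in C: the program periodically saves its state to disk so that a restarted (or migrated) process can resume from intermediate results rather than from scratch. The checkpoint file name and the state structure are hypothetical and not taken from any particular cluster middleware, which would typically checkpoint transparently at the system level.

```c
/* Sketch: application-level checkpoint and restart. */
#include <stdio.h>

struct state { long iteration; double partial_result; };

static void checkpoint(const struct state *s)
{
    FILE *f = fopen("ckpt.dat", "wb");         /* hypothetical checkpoint file  */
    if (f) { fwrite(s, sizeof(*s), 1, f); fclose(f); }
}

static int restore(struct state *s)
{
    FILE *f = fopen("ckpt.dat", "rb");
    if (!f) return 0;                          /* no checkpoint: start fresh    */
    int ok = fread(s, sizeof(*s), 1, f) == 1;
    fclose(f);
    return ok;
}

int main(void)
{
    struct state s = {0, 0.0};
    restore(&s);                               /* resume if a checkpoint exists */

    for (; s.iteration < 1000000; s.iteration++) {
        s.partial_result += 1.0 / (s.iteration + 1);   /* stand-in for real work */
        if (s.iteration % 100000 == 0)
            checkpoint(&s);                    /* save intermediate results     */
    }
    printf("result = %f\n", s.partial_result);
    return 0;
}
```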
Strategies for SSI • Build as a layer on top of the existing OS (e.g. GLUnix) • Benefits: • Makes the system quickly portable, tracks vendor software upgrades, and reduces development time • New systems can be built quickly by mapping new services onto the functionality provided by the layer beneath, e.g. GLUnix/Solaris MC • Build SSI at the kernel level (a true cluster OS) • Good, but can't leverage OS improvements by the vendor • e.g. UnixWare and MOSIX (built using BSD Unix)
Research Projects on Clusters • Beowulf: CalTech, JPL, and NASA • Condor: University of Wisconsin-Madison • DQS (Distributed Queuing System): Florida State U. • HPVM (High Performance Virtual Machine): UIUC & UCSB • Gardens: Queensland U. of Technology, Australia • NOW (Network of Workstations): UC Berkeley • PRM (Prospero Resource Manager): USC
Commercial Cluster Software • Codine (Computing in Distributed Network Environment): GENIAS GmbH, Germany • LoadLeveler: IBM Corp. • LSF (Load Sharing Facility): Platform Computing • NQE (Network Queuing Environment): CraySoft • RWPC: Real World Computing Partnership, Japan • UnixWare: SCO • Solaris MC: Sun Microsystems