Maximizing Application Performance with Cluster Computing

Learn about cluster computing, parallel computing, middleware, virtualization, resource sharing, and more. Discover how to run applications faster and improve performance. Understand different parallel computing architectures and enabling technologies for high-performance computing. Explore the history of cluster computing and its benefits for modern applications and demanding industries. Discover the potential of commodity clusters and how they outperform specialized systems. Dive into the taxonomy of parallel computing and explore the cluster architecture for efficient computing solutions.


Presentation Transcript


  1. Technologies for Cluster Computing Oren Laadan Columbia University <orenl@cs.columbia.edu> ECI, July 2005

  2. Course Overview (contd) • What is Cluster Computing ? • Parallel computing, enabling technologies, definition of a cluster, taxonomy • Middleware • SSI, operating system support, software support • Virtualization & Process Migration • Resource sharing • Job assignment, load balancing, information dissemination • Grids

  3. Motivation • Demanding Applications • Modeling and simulations (physics, weather, CAD, aero-dynamics, finance, pharmaceutical) • Business and E-commerce (Ebay, Oracle) • Internet (Google, eAnything) • Number crunching (encryption, data mining) • Entertainment (animation, simulators) • CPUs are reaching physical limits • Dimensions • Heat dissipation

  4. How to Run Applications Faster ? • 3 ways to improve performance: • Work Harder • Work Smarter • Get Help • And in computers: • Using faster hardware • Optimized algorithms and techniques • Multiple computers to solve a particular task

  5. Parallel Computing • Hardware: Instructions or Data ? • SISD – classic CPU • SIMD – vector computers • MISD – pipelined computers • MIMD – general purpose parallelism • Software adjustments • Parallel programming: multiple processes collaborating, with communication and synchronization between them • Operating systems, compilers etc.
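
The parallel-programming bullet on this slide (several workers collaborating, with communication and synchronization between them) can be sketched in a few lines. This is a minimal illustration and not part of the original slides: Python threads stand in for processes, a `queue.Queue` carries the partial results (communication), and `join()` is the synchronization point. The names `worker` and `parallel_sum` are hypothetical.

```python
import queue
import threading


def worker(chunk, results):
    """Compute a partial sum and send it back over the shared queue."""
    results.put(sum(chunk))


def parallel_sum(data, n_workers=4):
    """Split data among n_workers threads and combine their partial sums."""
    results = queue.Queue()
    # Strided slicing gives each worker a disjoint share of the data.
    chunks = [data[i::n_workers] for i in range(n_workers)]
    threads = [threading.Thread(target=worker, args=(c, results)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # synchronization: wait until every worker has finished
    # Communication: collect one partial result per worker.
    return sum(results.get() for _ in threads)


if __name__ == "__main__":
    print(parallel_sum(list(range(1, 101))))
```

In CPython, threads do not speed up CPU-bound work, so the sketch shows only the structure (split, compute, communicate, synchronize); on a cluster the workers would be processes on separate nodes.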

  6. Parallel Computer Architectures: Taxonomy of MIMD • SMP - Symmetric Multi Processing • MPP - Massively Parallel Processors • CC-NUMA - Cache-Coherent Non-Uniform Memory Access • Distributed Systems • COTS – Commodity Off The Shelf • NOW – Network of Workstations • Clusters

  7. Taxonomy of MIMD (contd) • SMP • 2-64 processors today • Everything shared • Single copy of OS • Scalability issues (hardware, software) • MPP • Nothing shared • Several hundred nodes • Fast interconnection • Inferior cost/performance ratio

  8. Taxonomy of MIMD (contd) • CC-NUMA • Scalable multiprocessor system • Global view of memory at each node • Distributed systems • Conventional networks of independent nodes • Multiple system images and OS • Each node can be of any type (SMP, MPP etc) • Difficult to use and extract performance

  9. Taxonomy of MIMD (contd) • Clusters • Nodes connected with high-speed network • Operate as an integrated collection of resources • Single system image • High performance computing – commodity supercomputing • High availability computing – mission-critical applications

  10. Taxonomy of MIMD - summary

  11. Enabling Technologies • Performance of individual components • Microprocessor (x2 every 18 months) • Memory capacity (x4 every 3 years) • Storage (capacity same !) – SAN, NAS • Network (scalable gigabit networks) • OS, Programming environments • Applications • Rate of performance improvements exceeds specialized systems
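
The growth rates quoted on this slide can be put on a common per-year footing with a little arithmetic. The sketch below is an illustration added here, not something from the slides; the helper names `annual_factor` and `growth_after` are hypothetical.

```python
def annual_factor(multiplier, period_years):
    """Per-year growth factor for a quantity that grows by `multiplier` every `period_years`."""
    return multiplier ** (1.0 / period_years)


def growth_after(multiplier, period_years, years):
    """Total growth factor after `years` at that rate."""
    return multiplier ** (years / period_years)


if __name__ == "__main__":
    cpu = annual_factor(2, 1.5)  # x2 every 18 months
    mem = annual_factor(4, 3)    # x4 every 3 years
    print(f"CPU per year: x{cpu:.2f}, memory per year: x{mem:.2f}")
    print(f"CPU over 6 years: x{growth_after(2, 1.5, 6):.0f}")
```

Both quoted rates work out to the same yearly factor, since 2^(1/1.5) and 4^(1/3) are both 2^(2/3), roughly 1.59 per year.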

  12. The “killer” workstation • Traditional usage • Workstations w/ Unix for science & industry • PCs for administrative work & word processing • Recent trend • Rapid convergence in processor performance and kernel-level functionality of PCs vs workstations • Killer CPU, killer memory, killer network, killer OS, killer applications…

  13. Computer Food Chain

  14. Towards Commodity HPC • Link together multiple computers to jointly solve a computational problem • Ubiquitous availability of commodity high performance components • Out: expensive and specialized proprietary and parallel computers • In: cheaper clusters of loosely coupled workstations

  15. History of Cluster Computing [Timeline figure; surviving labels: 1960, 1980s, 1990, 1995+, 2000+, PDA Clusters]

  16. Why PC/WS Clustering Now ? • Individual PCs/WSs are becoming increasingly powerful • Development cycle of supercomputers too long • Commodity network bandwidth is increasing and latency is decreasing • Easier to integrate into existing networks • Typical low user utilization of PCs/WSs ( < 10% ) • Development tools for PCs/WSs are more mature • PC/WS clusters are cheap and readily available • Clusters can leverage future technologies and grow easily

  17. What is a Cluster ? • Cluster - a parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers cooperatively working together as a single, integrated computing resource. • Each node in the cluster is • A UP/MP system with memory, I/O facilities, & OS • Connected via fast interconnect or LAN • Appear as a single system to users and applications

  18. Cluster Architecture (layers, top to bottom) • Sequential Applications and Parallel Applications • Parallel Programming Environment • Cluster Middleware (Single System Image and Availability Infrastructure) • Nodes (four shown), each a PC/Workstation with Communications Software and Network Interface Hardware • Cluster Interconnection Network/Switch

  19. A Winning Symbiosis • Parallel Processing: create MPP- or DSM-like parallel processing systems • Network RAM: use cluster-wide available memory to aggregate a substantial cache in RAM • Software RAID: use arrays of WS disks to provide cheap, highly available and scalable storage and parallel I/O • Multi-path communications: use multiple networks for parallel file transfer

  20. Design Issues • Cost/performance ratio • Increased Availability • Single System Image (look-and-feel of one system) • Scalability (physical, size, performance, capacity) • Fast communication (network and protocols) • Resource balancing (cpu, network, memory, storage) • Security and privacy • Manageability (administration and control) • Usability and applicability (programming environment, cluster-aware apps)

  21. Cluster Objectives • High performance: usually dedicated clusters for HPC; partitioning between users • High throughput: steal idle cycles (cycle harvesting); maximum utilization of available resources • High availability: fail-over configuration; heartbeat connections • Combined: HP and HA
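
The high-availability items on this slide (heartbeat connections and a fail-over configuration) can be illustrated with a small sketch. It is an assumption-laden toy, not how any particular HA product works: `HeartbeatMonitor`, `choose_active`, the node names, and the timeout value are all hypothetical, and timestamps are passed in explicitly so the logic stays deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time of each node (hypothetical helper)."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence before a node is presumed failed
        self.last_seen = {}     # node name -> timestamp of its last heartbeat

    def beat(self, node, now):
        """Record a heartbeat from `node` at time `now`."""
        self.last_seen[node] = now

    def failed(self, now):
        """Return the nodes whose last heartbeat is older than the timeout."""
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


def choose_active(primary, standby, monitor, now):
    """Fail over to the standby when the primary has missed its heartbeats."""
    return standby if primary in monitor.failed(now) else primary


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout=3.0)
    monitor.beat("node-a", 0.0)
    monitor.beat("node-b", 0.0)
    monitor.beat("node-b", 4.0)  # node-a has gone silent
    print(choose_active("node-a", "node-b", monitor, now=5.0))
```

A timeout-based detector like this cannot distinguish a crashed node from a slow or partitioned one, which is one reason real fail-over setups often use redundant heartbeat links.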

  22. Example: MOSIX at HUJI

  23. Example: Berkeley NOW

  24. Cluster Components • Nodes • Operating System • Network Interconnects • Communication protocols & services • Middleware • Programming models • Applications

  25. Cluster Components: Nodes • Multiple High Performance Computers • PCs • Workstations • SMPs (CLUMPS) • Processors • Intel/AMD x86 Processors • IBM PowerPC • Digital Alpha • Sun SPARC

  26. Cluster Components: OS • Basic services: • Easy access to hardware • Share hardware resources seamlessly • Concurrency (multiple threads of control) • Operating Systems: • Linux (Beowulf, and many more) • Microsoft NT (Illinois HPVM, Cornell Velocity) • SUN Solaris (Berkeley NOW, C-DAC PARAM) • Mach (µ-kernel) (CMU) • Cluster OS (Solaris MC, MOSIX) • OS gluing layers (Berkeley Glunix)

  27. Cluster Components: Network • High Performance Networks/Switches • Ethernet (10 Mbps), Fast Ethernet (100 Mbps), Gigabit Ethernet (1 Gbps) • SCI (Scalable Coherent Interface, 12 µs latency) • ATM (Asynchronous Transfer Mode) • Myrinet (1.2 Gbps) • QsNet (5 µs latency for MPI messages) • FDDI (Fiber Distributed Data Interface) • Digital Memory Channel • InfiniBand

  28. Cluster Components: Interconnects • Standard Ethernet • 10 Mbps, cheap, easy to deploy • Bandwidth & latency don’t match CPU capabilities • Fast Ethernet and Gigabit Ethernet • Fast Ethernet – 100 Mbps • Gigabit Ethernet – 1000 Mbps • Myrinet • 1.28 Gbps full duplex interconnect, 5-10 µs latency • Programmable on-board processor • Leverages MPP technology

  29. Interconnects (contd) • InfiniBand • Latency < 7 µs • Industry standard based on VIA • Connects components within a system • SCI – Scalable Coherent Interface • Interconnection technology for clusters • Directory based cache scheme • VIA – Virtual Interface Architecture • Standard for low-latency communications software interface
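
To see why the interconnect slides quote both latency and bandwidth, a common first-order model estimates message time as latency plus size divided by bandwidth. The sketch below is an added illustration under that simple model (it ignores protocol overhead and contention). It uses the Myrinet bandwidth from the slides (1.28 Gbps) and assumes a latency of 10 microseconds, the upper end of the range given for Myrinet; `transfer_time` and the constant names are hypothetical.

```python
def transfer_time(size_bytes, latency_s, bandwidth_bps):
    """Estimated one-way time: fixed latency plus time to push the bits onto the wire."""
    return latency_s + (size_bytes * 8) / bandwidth_bps


MYRINET_BPS = 1.28e9       # bandwidth quoted on the slides
MYRINET_LATENCY_S = 10e-6  # assumption: upper end of the quoted latency range

if __name__ == "__main__":
    for size in (64, 1_000_000):
        t = transfer_time(size, MYRINET_LATENCY_S, MYRINET_BPS)
        print(f"{size} bytes: {t * 1e6:.1f} microseconds")
```

Under this model a 64-byte message is dominated by latency, while a 1 MB transfer is dominated by bandwidth, so fine-grained parallel codes care most about latency and bulk data movers about bandwidth.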

  30. Cluster Interconnects: Comparison

  31. Cluster Components: Communication protocols • Fast Communication Protocols (and user Level Communication): • Standard TCP/IP, 0-Copy TCP/IP • Active Messages (Berkeley) • Fast Messages (Illinois) • U-net (Cornell) • XTP (Virginia) • Virtual Interface Architecture (VIA)

  32. Cluster Components: Communication services • Communication infrastructure • Bulk-data transport • Streaming data • Group communications • Provide important QoS parameters • Latency, bandwidth, reliability, fault-tolerance • Wide range of communication methodologies • RPC • DSM • Stream-based and message passing (e.g., MPI, PVM)

  33. Cluster Components: Middleware • Resides between the OS and the applications • Provides infrastructure to transparently support: • Single System Image (SSI): makes the collection appear as a single machine • System Availability (SA): monitoring, checkpoint, restart, migration • Resource Management and Scheduling (RMS)
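
The checkpoint and restart services listed under System Availability can be illustrated at application level. The sketch below is a toy added for illustration, not a description of how cluster middleware does it transparently: a job saves its state with `pickle` after every step and resumes from the saved state after a simulated failure. The names `run` and `demo`, the file name, and the `crash_at` parameter are hypothetical, and the checkpoint write is not atomic.

```python
import os
import pickle
import tempfile


def run(total_steps, ckpt_path, crash_at=None):
    """Sum 0..total_steps-1, checkpointing after each step; resume if a checkpoint exists."""
    state = {"step": 0, "acc": 0}
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            state = pickle.load(f)  # restart: continue from the saved state
    while state["step"] < total_steps:
        if crash_at is not None and state["step"] == crash_at:
            raise RuntimeError("simulated node failure")
        state["acc"] += state["step"]
        state["step"] += 1
        with open(ckpt_path, "wb") as f:
            pickle.dump(state, f)   # checkpoint: persist progress so far
    return state["acc"]


def demo():
    """Crash part-way through, then restart and finish from the checkpoint."""
    path = os.path.join(tempfile.mkdtemp(), "job.ckpt")
    try:
        run(10, path, crash_at=6)
    except RuntimeError:
        pass  # the first attempt dies at step 6
    return run(10, path)


if __name__ == "__main__":
    print(demo())
```

The restarted run repeats none of the completed steps and produces the same result as an uninterrupted run, which is the property checkpoint/restart is meant to provide.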

  34. Cluster Components: Programming Models • Threads (PCs, SMPs, NOW..) • POSIX Threads, Java Threads • OpenMP • MPI (Message Passing Interface) • PVM (Parallel Virtual Machine) • Software DSMs (Shmem) • Compilers • Parallel code generators, C/C++/Java/Fortran • Performance Analysis Tools • Visualization Tools

  35. Cluster Components: Applications • Sequential • Parametric Modeling • Embarrassingly parallel • Parallel / Distributed • Cluster-aware • Grand Challenge applications • Web servers, data-mining
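
Parametric modeling is the textbook embarrassingly parallel workload: the same sequential program runs once per parameter value, and the runs never communicate. A minimal sketch, added here for illustration, uses a thread pool as a stand-in for cluster nodes; `simulate` is a hypothetical placeholder for a real model run and `sweep` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(param):
    """Stand-in for one independent model run (hypothetical)."""
    return param * param


def sweep(params, n_workers=4):
    """Run simulate() once per parameter; runs share no state and never communicate."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(simulate, params))  # results keep the input order


if __name__ == "__main__":
    print(sweep([1, 2, 3, 4]))
```

Because the runs are independent, such jobs need no parallel programming model at all; a scheduler that farms out sequential runs to idle nodes is enough.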

  36. Clusters Classification (I) • Application Target • High Performance (HP) Clusters • Grand Challenge Applications • High Availability (HA) Clusters • Mission Critical applications

  37. Clusters Classification (II) • Node Ownership • Dedicated Clusters • Non-dedicated clusters • Adaptive parallel computing • Communal multiprocessing

  38. Clusters Classification (III) • Node Hardware • Clusters of PCs (CoPs) • Piles of PCs (PoPs) • Clusters of Workstations (COWs) • Clusters of SMPs (CLUMPs)

  39. Clusters Classification (IV) • Node Operating System • Linux Clusters (e.g., Beowulf) • Solaris Clusters (e.g., Berkeley NOW) • NT Clusters (e.g., HPVM) • AIX Clusters (e.g., IBM SP2) • SCO/Compaq Clusters (Unixware) • Digital VMS Clusters • HP-UX clusters • Microsoft Wolfpack clusters

  40. Clusters Classification (V) • Node Configuration • Homogeneous Clusters • All nodes will have similar architectures and run the same OS • Semi-Homogeneous Clusters • Similar architectures and OS, varying performance capabilities • Heterogeneous Clusters • All nodes will have different architectures and run different OSs

  41. Clusters Classification (VI) • Levels of Clustering • Group Clusters (#nodes: 2-99) • Departmental Clusters (#nodes: 10s to 100s) • Organizational Clusters (#nodes: many 100s) • National Metacomputers (WAN/Internet) • International Metacomputers (Internet-based, #nodes: 1000s to many millions) • Grid Computing • Web-based Computing • Peer-to-Peer Computing

  42. Summary: Key Benefits • High Performance: with cluster-aware applications • High Throughput: resource balancing and sharing • High Availability: redundancy in hardware, OS, applications • Expandability and Scalability: expand on demand by adding HW
