High Performance Cluster Computing CSI668 Xinyang(Joy) Zhang
Outline • Overview of Parallel Computing • Cluster Architecture & its Components • Several Technical Areas • Representative Cluster Systems • Resources and Conclusions CSI668 HPCC
Overview of Parallel Computing CSI668 HPCC
Computing Power (HPC) Drivers • Life Science • Digital Biology • Military Applications • E-commerce/anything CSI668 HPCC
How to Run Applications Faster? • Use faster hardware: e.g. reduce the time per instruction (clock cycle). • Use optimized algorithms and techniques. • Use multiple computers to solve the problem: that is, increase the number of instructions executed per clock cycle. CSI668 HPCC
Parallel Processing • Limitations of traditional sequential supercomputers • physical limits on speed • production cost • Rapid increase in the performance of commodity processors • Intel x86 architecture chips • RISC CSI668 HPCC
Parallel Architecture • Processors • number of processors • processor type • MIPS, HP PA 8000, Digital Alpha, IBM RIOS, Intel Pentium • Memories • Distributed Memory, Shared Memory, Distributed Shared Memory (DSM) • Processor/Memory Interaction • SIMD, MIMD • Interconnection Network • Bus, Ring, Hybrid, etc. CSI668 HPCC
HPC Examples CSI668 HPCC
The Need for Alternative Supercomputing Resources • Vast numbers of underutilized workstations are available. • Huge numbers of unused processor cycles and resources could be put to good use in a wide variety of application areas. • Reluctance to buy supercomputers due to their cost. • Distributed compute resources “fit” better into today's funding model. CSI668 HPCC
What is a cluster? A cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone/complete computers cooperatively working together as a single, integrated computing resource. CSI668 HPCC
Motivation for using Clusters • Recent advances in high speed networks • Performance of workstations and PCs is rapidly improving • Workstation clusters are a cheap and readily available alternative to specialized High Performance Computing (HPC) platforms. • Standard tools for parallel/ distributed computing & their growing popularity CSI668 HPCC
Towards Inexpensive Supercomputing • Linux cluster: 17 IBM Netfinity servers (36 Pentium II chips) • Cray T3E-900-AC64 • Costs: IBM cluster $1.5 million, Cray $5.5 million CSI668 HPCC
Cluster Computer and its Components CSI668 HPCC
Cluster Computer Architecture CSI668 HPCC
Cluster Components...1a Nodes • Multiple High Performance Components: • PCs • Workstations • SMPs (CLUMPS) • They can be based on different architectures and run different operating systems. CSI668 HPCC
Cluster Components...1b Processors • There are many (CISC/RISC/VLIW/Vector..) • Intel: Pentiums • Sun: SPARC, ULTRASPARC • HP PA • IBM RS6000/PowerPC • SGI MIPS • Digital Alphas CSI668 HPCC
Cluster Components…2 OS • State of the art OS: • Linux (Beowulf) • Microsoft NT (Illinois HPVM) • Sun Solaris (Berkeley NOW) • IBM AIX (IBM SP2) • Cluster Operating Systems (Solaris MC, MOSIX (academic project) ) • OS gluing layers: (Berkeley Glunix) CSI668 HPCC
Cluster Components…3 High Performance Networks • Ethernet (10Mbps) • Fast Ethernet (100Mbps) • Gigabit Ethernet (1Gbps) • SCI (Dolphin, ~12 microsecond MPI latency) • Myrinet (1.2Gbps) • Digital Memory Channel • FDDI CSI668 HPCC
Cluster Components…4 Communication Software • Traditional OS-supported facilities (heavyweight due to protocol processing) • Sockets (TCP/IP), Pipes, etc. • Lightweight protocols (user level) • Active Messages (Berkeley) • Fast Messages (Illinois) • U-net (Cornell) • XTP (Virginia) • Systems can be built on top of the above protocols CSI668 HPCC
Cluster Components…5 Cluster Middleware • Resides between the OS and applications and offers an infrastructure for supporting: • Single System Image (SSI) • System Availability (SA) • SSI makes the cluster appear as a single machine (globalizes the view of system resources). • SA - checkpointing and process migration. CSI668 HPCC
Cluster Components…6a Programming environments • Shared Memory Based • DSM • OpenMP (enabled for clusters) • Message Passing Based • PVM • MPI (portable to SM based as well) CSI668 HPCC
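To make the shared-memory style concrete, here is a minimal OpenMP sketch in C (an illustrative example, not from the original slides); the loop iterations are divided among threads that all read and write one shared array. The message-passing style is shown by the MPI sample program later in this deck.

#include <stdio.h>
#include <omp.h>

int main(void)
{
    double a[1000];
    int i;

    /* Iterations are split among the threads; all threads share a[] directly. */
    #pragma omp parallel for
    for (i = 0; i < 1000; i++)
        a[i] = i * 2.0;

    printf("a[999] = %f, max threads = %d\n", a[999], omp_get_max_threads());
    return 0;
}

Compile with an OpenMP-aware compiler, e.g. cc -fopenmp hello_omp.c (the exact flag varies by compiler).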
Cluster Components…6b Development Tools • Compilers • C / C++ / Java • Parallel programming with C++ (MIT Press book) • Debuggers • Performance Analysis Tools • Visualization Tools CSI668 HPCC
Several Topics in Cluster Computing CSI668 HPCC
Several Topics in CC • MPI (Message Passing Interface) • SSI (Single System Image) • Parallel I/O & Parallel File System CSI668 HPCC
Message-Passing Model • A Process is a program counter and address space • Interprocess communication consists of • Synchronization • Movement of data from one process’s address space to another CSI668 HPCC
What is MPI • A message-passing library specification • extends the message-passing model • not a language or product • For parallel computers, clusters, and heterogeneous networks • Designed to provide access to advanced parallel hardware for • end users, library writers, tool developers CSI668 HPCC
Some Basic Concepts • Processes can be collected into groups • Each message is sent in a context and must be received in the same context • A group and a context together form a communicator • A process is identified by its rank in the group associated with a communicator • There is a default communicator, MPI_COMM_WORLD, whose group contains all initial processes CSI668 HPCC
Basic Set of Functions • MPI_INIT • MPI_FINALIZE • MPI_COMM_SIZE • MPI_COMM_RANK • MPI_SEND • MPI_RECV • MPI_BCAST • MPI_REDUCE CSI668 HPCC
A Sample MPI Program...

#include <stdio.h>
#include <string.h>
#include "mpi.h"

int main( int argc, char *argv[] )
{
  int my_rank;           /* process rank */
  int p;                 /* no. of processes */
  int source;            /* rank of sender */
  int dest;              /* rank of receiver */
  int tag = 0;           /* message tag, like an "email subject" */
  char message[100];     /* message buffer */
  MPI_Status status;     /* receive status */

  /* Start up MPI */
  MPI_Init( &argc, &argv );

  /* Find our process rank/id */
  MPI_Comm_rank( MPI_COMM_WORLD, &my_rank );

  /* Find out how many processes/tasks are part of this run */
  MPI_Comm_size( MPI_COMM_WORLD, &p );

CSI668 HPCC
A Sample MPI Program

  if( my_rank == 0 )     /* Master process */
  {
    for( source = 1; source < p; source++ )
    {
      MPI_Recv( message, 100, MPI_CHAR, source, tag,
                MPI_COMM_WORLD, &status );
      printf( "%s\n", message );
    }
  }
  else                   /* Worker process */
  {
    sprintf( message, "Hello, I am your worker process %d!", my_rank );
    dest = 0;
    MPI_Send( message, strlen(message)+1, MPI_CHAR, dest, tag,
              MPI_COMM_WORLD );
  }

  /* Shut down the MPI environment */
  MPI_Finalize();
  return 0;
}

CSI668 HPCC
Execution

% cc -o hello hello.c -lmpi
% mpirun -p2 hello
Hello, I am your worker process 1!
% mpirun -p4 hello
Hello, I am your worker process 1!
Hello, I am your worker process 2!
Hello, I am your worker process 3!
% mpirun hello
(no output: only one process runs, so there are no workers and no greetings)

CSI668 HPCC
Single System Image • Problem • each node has a certain amount of resources that can only be used from that node • this restriction limits the power of a cluster • Solution • implement a middleware layer that glues together the operating systems on all nodes • offer unified access to system resources CSI668 HPCC
What is a Single System Image (SSI)? • A single system image is the illusion, created by software or hardware, that presents a collection of resources as one, more powerful resource. • SSI makes the cluster appear like a single machine to the user, to applications, and to the network. • A cluster without an SSI is not a cluster. CSI668 HPCC
Key SSI Services • Single Entry Point • telnet cluster.my_institute.edu rather than telnet node1.cluster.my_institute.edu • Single File Hierarchy: Solaris MC Proxy • Single Control Point: management from a single GUI • Single virtual networking • Single memory space - Network RAM / DSM • Single Job Management: Glunix • Single User Interface: like a workstation/PC windowing environment (CDE in Solaris/NT) CSI668 HPCC
Implementing Layers • Hardware Layers • hardware DSM • Gluing layer (operating system) • single file system, software DSM, • e.g. Sun Solaris-MC • Applications and subsystem layer • Single window GUI based tool CSI668 HPCC
Parallel I/O • Needed for I/O intensive applications • Multiple processes participate. • Application is aware of parallelism • Preferably the “file” is itself stored on a parallel file system with multiple disks • That is, I/O is parallel at both ends: • application program • I/O hardware CSI668 HPCC
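A minimal MPI-IO sketch of this idea (an assumed example, not from the original slides; the file name "datafile" is hypothetical): each process writes its own non-overlapping block of a common file, so the I/O is parallel at the application end and can be striped by a parallel file system underneath.

#include "mpi.h"

int main( int argc, char *argv[] )
{
  int rank, i, buf[1024];
  MPI_File fh;

  MPI_Init( &argc, &argv );
  MPI_Comm_rank( MPI_COMM_WORLD, &rank );

  for( i = 0; i < 1024; i++ )
    buf[i] = rank;                        /* data owned by this process */

  MPI_File_open( MPI_COMM_WORLD, "datafile",
                 MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh );

  /* Each rank writes at its own offset, so the writes do not overlap */
  MPI_File_write_at( fh, (MPI_Offset) rank * sizeof(buf), buf, 1024,
                     MPI_INT, MPI_STATUS_IGNORE );

  MPI_File_close( &fh );
  MPI_Finalize();
  return 0;
}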
Parallel File System • A typical PFS: • Compute nodes • I/O nodes • Interconnect • Physical distribution of data across multiple disks in multiple cluster nodes • Sample PFSs • Galley Parallel File System (Dartmouth) • PVFS (Clemson) CSI668 HPCC
PVFS - Parallel Virtual File System • File System • allows users to store and retrieve data using common file access methods (open, close, read, write, ...) • Parallel • stores data on multiple independent machines with separate network connections • Virtual • exists as a set of user-space daemons storing data on the local file system CSI668 HPCC
PVFS Components... • Two Servers: • mgr - file manager, handles metadata for files • iods - I/O servers, store and retrieve file data • libpvfs: • links clients to PVFS servers • hides details of PVFS access from App. Tasks • multiple interfaces CSI668 HPCC
…PVFS Components • PVFS Linux kernel support • PVFS kernel module registers PVFS file system type • PVFS file system can be mounted • Converts VFS operations to PVFS operations • Requests pass through device file CSI668 HPCC
Access PVFS Files Through VFS • I/O operations pass through the VFS • PVFS code in the kernel passes operations through the device file • The daemon pvfsd reads requests from /dev/pvfsd • Requests are converted to PVFS operations by libpvfs and sent to the servers • Data is passed back through the device CSI668 HPCC
Advantages of PVFS • Provides high bandwidth for concurrent read/write operations from multiple processes or threads to a common file • Supports multiple APIs: • native PVFS API • UNIX/POSIX I/O API • MPI-IO (ROMIO) • Common Unix shell commands work with PVFS files • ls, cp, rm... • Robust and scalable • Easy to install and use CSI668 HPCC
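Because PVFS exposes the standard UNIX/POSIX I/O API, ordinary programs work on PVFS files unchanged. A minimal sketch (an illustrative example; the mount point /mnt/pvfs is an assumption, not from the original slides):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
  const char *msg = "hello from a PVFS file\n";
  char back[64];
  ssize_t n;
  int fd;

  /* Ordinary open/write/read/close on a file in a mounted PVFS volume;
     the data is striped across the iod servers behind the scenes. */
  fd = open( "/mnt/pvfs/example.txt", O_RDWR | O_CREAT, 0644 );
  if( fd < 0 ) { perror("open"); return 1; }

  write( fd, msg, strlen(msg) );
  lseek( fd, 0, SEEK_SET );
  n = read( fd, back, sizeof(back) - 1 );
  if( n > 0 ) { back[n] = '\0'; printf( "%s", back ); }

  close( fd );
  return 0;
}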
A Lot More... • Algorithms and Applications • Java Technologies • Software Engineering • Storage Technology • Etc.. CSI668 HPCC
Representative Cluster Systems CSI668 HPCC
Berkeley NOW • 100 Sun UltraSparcs • 200 disks • Myrinet SAN (160 MB/s) • Fast communication • AM, MPI, ... • Global OS CSI668 HPCC
Cluster of SMPs (CLUMPS) • 4 Sun E5000s, each with 8 processors and 4 Myricom NICs • Multiprocessor, Multi-NIC, Multi-Protocol CSI668 HPCC
Beowulf Cluster at SUNY Albany • Particle physics group • Beowulf cluster with: • 8 dual-processor Pentium III nodes • Red Hat Linux • MPI • Monte Carlo package • Used for data analysis CSI668 HPCC
Resources And Conclusion CSI668 HPCC
Resources • IEEE Task Force on Cluster Computing • http://www.ieeetfcc.org • Beowulf: • http://www.beowulf.org • PFS & Parallel I/O • http://www.cs.dartmouth.edu/pario/ • PVFS • http://parlweb.parl.clemson.edu/pvfs/ CSI668 HPCC
Conclusions Clusters are promising: they offer incremental growth and match today's funding patterns. New trends in hardware and software technologies are likely to make clusters even more attractive, so that cluster-based supercomputers can be seen everywhere! CSI668 HPCC