
Account Setup and MPI Introduction



  1. Account Setup and MPI Introduction Parallel Computing & Bioinformatics Lab Sylvain Pitre (spitre@scs.carleton.ca) Web: http://cgmlab.carleton.ca

  2. Overview • CGM Cluster specs • Account Creation • Logging in Remotely (Putty, X-Win32) • Account Setup for MPI • Checking Cluster Load • Listing Your Jobs • MPI Introduction and Basics

  3. CGM Lab Cluster

  4. CGM Lab Cluster (2) • 8 dual-core workstations (total of 16 processors) • Named cgm01, cgm02…cgm08. • Intel Core 2 Duo 1.6GHz, 4GB DDR2 RAM, 320GB disks • Server (cgm01) has an extra terabyte (1TB) of disk space. • Connected through a dedicated gigabit switch. • Running Fedora 8 (64-bit). • OpenMPI (http://www.open-mpi.org/) • cgmXX.carleton.ca (SSH, where XX=01 to 08) • Putty (terminal): http://www.putty.nl/download.html • WinSCP (file transfer): http://winscp.net/eng/index.php • X-Win32 (http://www.starnet.com/)

  5. CGM Lab Cluster (3) • Accounts are handled by LDAP (Lightweight Directory Access Protocol) on the server. • User files are stored on the server and accessed by every workstation using NFS (Network File System). • The same login and password work on any workstation.

  6. CGM Lab Cluster (4) [Cluster diagram: cgm01 runs the NFS and LDAP servers; cgm01 to cgm08 are connected to each other and to the Carleton network.]

  7. Account Creation • To get an account, send an email to Sylvain Pitre (spitre@scs.carleton.ca) • Include in your email: • your full name • your email address (if different from the one used to send the email) • your supervisor's name (or course professor's) • your preferred login name (8 characters max)

  8. Logging In Remotely • You can log in remotely to the cluster by SSH (Secure Shell). • Users familiar with unix/linux should already know how to do this. • Windows users can use Putty, a lightweight SSH client (see link on slide 4). • Windows users can also log in via X-Win32. • DNS names: cgmXX.carleton.ca (XX=01 to 08) • Log in to any node except cgm01 (the server).

  9. Logging in with Putty • Under Host Name, enter the cgm machine you want to log into (cgm03 in this case), then click Open. • A terminal will open and ask you for your username, then your password. • That’s it! You are logged into one of the cgm nodes.

  10. Login with X-Win32 • You can also log in to the nodes using X-Win32. • Open the X-Win32 Configuration program (X-Config). • Under the Sessions tab, click on Wizard. • Enter a name for the session (ex: cgm03) and under Type click on ssh, then click Next. • As host, enter the name of the node you wish to connect to (ex: cgm03.carleton.ca), then click Next. • Enter your login name and password and click Next. • For Command, click on Linux, then click Finish. • The new session is now added to your Sessions window.

  11. Login with X-Win32 (2) • Click on the newly created session, then click on Launch. • After launching the session, you might be asked to accept a key (click on “yes”). • You should now be at a terminal. • You can work in this terminal if you wish (like in Putty), but if you want the visual interface, type: • gnome-session & • After a few seconds the visual interface will start up. • Now you have access to all the menus and windows of the Fedora 8 interface (using GNOME).

  12. Login with X-Win32 (3) Demonstration

  13. Account Setup • First time login: • Once you have your account (send me an email to get one) and log in, change your password with the passwd command. • If you are unfamiliar with unix/linux: • I strongly recommend reading some tutorials and playing around with commands (but be careful!). • I assume you have some basic unix/linux knowledge in the rest of the slides.

  14. “Password-less” SSH • In order to run MPI on different nodes transparently, we need to set up SSH so that it doesn’t constantly ask us for a password. Type:
ssh-keygen -t rsa
cd .ssh
cp id_rsa.pub authorized_keys2
chmod go-rwx authorized_keys2
ssh-agent $SHELL
ssh-add
cd ..

  15. “Password-less” SSH (2) • Now, after your initial login, you should be able to SSH into any other cgmXX machine without a password. SSH to every workstation in order to add that node to your known_hosts. Type:
ssh cgm01 date (answer “yes” when asked)
ssh cgm02 date
…
ssh cgm08 date

  16. Ready for MPI! • After completing the steps above, your account is ready to run MPI jobs. • Running big jobs on multiple processors • Since there is no job scheduler, jobs are launched manually, so please be considerate. Use nodes that are not in use or that have a lower load (I’ll show you how to check). • If you need all the nodes for a longer period of time, we’ll try to reserve them for you.

  17. Network Vs. Local Files • If you need to do a lot of disk I/O, it is preferable to use the local disk’s /tmp directory. • Since your account is mounted by NFS, all files written to your home directory are sent to the server (network bottleneck). • To reduce network transfers, place your large input/output files in /tmp on your local node. • Make the filename unique (e.g. include your username) so it does not clash with other users’ files.

  18. Checking Cluster Load • To check the load on each workstation, type the command: load

  19. Listing Your Jobs • To check all of your jobs (processes) across the cluster, type: listjobs

  20. MPI Introduction • Message Passing Interface (MPI) • Portable message-passing standard that facilitates the development of parallel applications and libraries. • For parallel computers, clusters… • Not a language in its own right. It is used as a package with another language, like C or Fortran. • Different implementations: OpenMPI, LAM/MPI, MPICH… • Portable = not limited to a specific architecture.

  21. MPI Basics • Every node (process) executes the same code. • Nodes can follow different paths (master/slave model), but don’t abuse this! • Communication is done by message passing. • Every node has a unique rank (ID) from 0 to p-1, where p is the total number of nodes. • The total number of nodes is known to every node. • Synchronous or asynchronous messages. • Thread safe.
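As a minimal sketch of these basics (a hypothetical example, not from the original slides), every process runs the same program, queries its own rank and the total number of processes, and branches on the rank to follow a master/slave path:

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int rank, wsize;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's ID: 0..p-1 */
        MPI_Comm_size(MPI_COMM_WORLD, &wsize);  /* total number of processes p */
        if (rank == 0) {
            printf("I am the master (rank 0 of %d).\n", wsize);
            /* distribute work, collect results... */
        } else {
            printf("I am a slave (rank %d of %d).\n", rank, wsize);
            /* do my share of the work... */
        }
        MPI_Finalize();
        return 0;
    }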

  22. Compiling/Running MPI Programs • Compiler: mpicc • Command line: mpirun -n <p> --hostfile <hostfile> <prog> <params> Where <p> is the number of processes you want to use. Can be greater than the number of processors available (used for overloading or simulation).

  23. Hostfile • For running a job on more than one node, a hostfile must be used. • What’s in a hostfile: • Node name or IP. • How many processors on each node (1 by default). • Example: cgm01 slots=2 cgm02 slots=2 …

  24. MPI Startup/Finalize
#include "mpi.h"
int main(int argc, char *argv[])
{
    int rank, wsize;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &wsize);
    /* CODE */
    MPI_Finalize();
    return 0;
}

  25. MPI Types [Table of MPI datatypes and their C equivalents, e.g. MPI_CHAR (char), MPI_INT (int), MPI_FLOAT (float), MPI_DOUBLE (double)]

  26. MPI Functions • Send/receive • Broadcast • All to all • Gather/Scatter • Reduce • Barrier • Other

  27. MPI Send/Receive (synch)

  28. MPI Send/Receive (synch) • Communication between nodes (processors). • Blocking
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
*buf      send/receive buffer address
count     number of entries in buffer
datatype  data type of entries
dest      destination process rank
source    source process rank
tag       message tag
comm      communicator
*status   status after operation (returned)
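A minimal sketch of blocking point-to-point communication (buffer name, tag, and payload are illustrative, not from the slides): rank 0 sends one integer to rank 1, which blocks in MPI_Recv until the message arrives.

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int rank, value = 0;
        MPI_Status status;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            value = 42;                                              /* illustrative payload */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);      /* dest = 1, tag = 0 */
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);  /* source = 0 */
            printf("Rank 1 received %d from rank 0.\n", value);
        }
        MPI_Finalize();
        return 0;
    }

Run it with at least two processes, e.g. mpirun -n 2 --hostfile <hostfile> <prog>.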

  29. MPI Send/Receive (asynch) • A buffer can be used with asynchronous messages. • Problems occur when the sender finds the buffer full or the receiver finds it empty.

  30. MPI Send/Receive (asynch) • Non-blocking (the call returns immediately; delivery is not guaranteed until the request completes)
int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request)
int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Request *request)
Parameters are the same as MPI_Send() and MPI_Recv(), except that both take an MPI_Request handle (checked later with MPI_Wait()) and MPI_Irecv() has no MPI_Status argument.
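A minimal sketch of non-blocking communication (variable names illustrative): the calls return immediately, other work can be done in the meantime, and MPI_Wait() guarantees the transfer has completed before the buffer is reused.

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int rank, value = 0;
        MPI_Request request;
        MPI_Status status;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            value = 7;                                                   /* illustrative payload */
            MPI_Isend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);
            /* ... overlap computation with communication here ... */
            MPI_Wait(&request, &status);   /* don't reuse 'value' before this completes */
        } else if (rank == 1) {
            MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
            /* ... do other work ... */
            MPI_Wait(&request, &status);   /* 'value' is valid only after the wait */
            printf("Rank 1 received %d.\n", value);
        }
        MPI_Finalize();
        return 0;
    }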

  31. MPI Broadcast • One to all (including itself).

  32. MPI Broadcast (syntax)
int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
*buf      send buffer address
count     number of entries in buffer
datatype  data type of entries
root      rank of root (sending) process
comm      communicator
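A minimal sketch (the variable n is illustrative): only the root sets the value, and after the broadcast every process, including the root, holds a copy.

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int rank, n = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0)
            n = 100;                                    /* only the root sets the value */
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* root = 0 */
        printf("Rank %d has n = %d.\n", rank, n);       /* every rank prints 100 */
        MPI_Finalize();
        return 0;
    }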

  33. MPI All to All • Every process sends a distinct block of data to every process (and receives one from each).
int MPI_Alltoall(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm)
*sendbuf   send buffer address
sendcount  number of elements sent to each process
sendtype   data type of send elements
*recvbuf   receive buffer address (loaded)
recvcount  number of elements received from each process
recvtype   data type of receive elements
comm       communicator
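A minimal sketch (array names illustrative): each process sends one integer to every process; after the call, recvbuf[i] on a given rank holds what rank i sent to it.

    #include <stdio.h>
    #include <stdlib.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int rank, wsize, i;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &wsize);
        int *sendbuf = malloc(wsize * sizeof(int));
        int *recvbuf = malloc(wsize * sizeof(int));
        for (i = 0; i < wsize; i++)
            sendbuf[i] = rank * 100 + i;      /* element i is destined for process i */
        MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);
        for (i = 0; i < wsize; i++)
            printf("Rank %d got %d from rank %d.\n", rank, recvbuf[i], i);
        free(sendbuf);
        free(recvbuf);
        MPI_Finalize();
        return 0;
    }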

  34. MPI All to All

  35. MPI All to All (alternative) • MPI_Alltoallv() • Sends data to all processes, with per-process counts and displacements. MPI_Alltoallv(void *sendbuf, int *sendcounts, int *sdispls, MPI_Datatype sendtype, void *recvbuf, int *recvcnts, int *rdispls, MPI_Datatype recvtype, MPI_Comm comm)

  36. MPI Gather (Description) • MPI_Gather() • Each process in comm sends the contents of sendbuf to the process with rank root. The root process concatenates the received data in process rank order in recvbuf: the data from process 0 is followed by the data from process 1, which is followed by the data from process 2, etc. The recv arguments are significant only on the process with rank root. The argument recvcount indicates the number of items received from each process, not the total number received.

  37. MPI Scatter (Description) • MPI_Scatter() • The process with rank root distributes the contents of sendbuf among the processes. The contents of sendbuf are split into p segments, each consisting of sendcount items. The first segment goes to process 0, the second to process 1, etc. The send arguments are significant only on process root.

  38. MPI Gather/Scatter [Diagrams of the gather and scatter operations]

  39. MPI Gather/Scatter (syntax)
int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
*sendbuf   send buffer address
sendcount  number of elements sent from each process (gather) or to each process (scatter)
sendtype   data type of send elements
*recvbuf   receive buffer address (loaded)
recvcount  number of elements received from each process (gather) or by each process (scatter)
recvtype   data type of receive elements
root       rank of sending (scatter) or receiving (gather) process
comm       communicator
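A minimal sketch (array names illustrative): the root scatters one integer to each process, every process modifies its element, and the root gathers the results back in rank order.

    #include <stdio.h>
    #include <stdlib.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int rank, wsize, i, mine;
        int *data = NULL;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &wsize);
        if (rank == 0) {                       /* only the root needs the full array */
            data = malloc(wsize * sizeof(int));
            for (i = 0; i < wsize; i++)
                data[i] = i * 10;
        }
        MPI_Scatter(data, 1, MPI_INT, &mine, 1, MPI_INT, 0, MPI_COMM_WORLD);
        mine += 1;                             /* each process works on its own element */
        MPI_Gather(&mine, 1, MPI_INT, data, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (rank == 0) {
            for (i = 0; i < wsize; i++)
                printf("data[%d] = %d\n", i, data[i]);
            free(data);
        }
        MPI_Finalize();
        return 0;
    }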

  40. MPI Gatherv/Scatterv • Similar to gather/scatter, but they allow varying amounts of data to be sent instead of a fixed amount. • For example, varying parts of an array can be scattered/gathered in one step. • See the Parallel Image Processing example to see how they can be used.

  41. MPI Gatherv/Scatterv (Syntax)
int MPI_Scatterv(void *sendbuf, int *sendcounts, int *displs, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm);
int MPI_Gatherv(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int *recvcounts, int *displs, MPI_Datatype recvtype, int root, MPI_Comm comm);
*sendcounts  number of send buffer elements for each process (scatterv)
*recvcounts  number of elements received from each process (gatherv)
*displs      displacement (offset into the buffer) for each process
Other parameters are the same as gather/scatter.

  42. MPI Reduce • Gather results and reduce them to one value using an operation (Max, Min, Sum, Product).

  43. MPI Reduce (syntax)
int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
*sendbuf   send buffer address
*recvbuf   receive buffer address
count      number of send buffer elements
datatype   data type of send elements
op         reduce operation: MPI_MAX (maximum), MPI_MIN (minimum), MPI_SUM (sum), MPI_PROD (product)
root       root process rank for result
comm       communicator
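A minimal sketch (variable names illustrative): every process contributes its own rank, and the root ends up with the sum of all ranks.

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int rank, sum = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        /* combine each process's rank with MPI_SUM; the result lands on root (rank 0) */
        MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("Sum of all ranks = %d\n", sum);   /* 0 + 1 + ... + (p-1) */
        MPI_Finalize();
        return 0;
    }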

  44. MPI Barrier • Blocks until all processes in the communicator have called it. int MPI_Barrier(MPI_Comm comm) comm communicator

  45. Other MPI Routines • MPI_Allgather(): Gather values and distribute to all. • MPI_Allgatherv(): Gather values into specified locations and distribute to all. • MPI_Reduce_scatter(): Combine values and scatter results. • MPI_Wait(): Waits for a non-blocking MPI send/receive to complete, then returns.

  46. Parallel Problem Examples • Embarrassingly Parallel • Simple Image Processing (Brightness, Negative…) • Pipelined computations • Sorting • Synchronous computations • Heat Distribution Problem • Cellular Automata • Divide and Conquer • N-Body Problem

  47. MPI Hello World!
#include <stdio.h>
#include "mpi.h"
int main(int argc, char *argv[])
{
    int rank, wsize;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &wsize);
    printf("Hello World!, I am processor %d.\n", rank);
    MPI_Finalize();
    return 0;
}

  48. Parallel Image processing • Input: Image of size MxN. • Output: Negative of the image. • Each processor should have an equal share of the work, roughly (MxN)/P. • Master/slave model • The master will read in the image and distribute the pixels to the slave nodes. Once done, the slaves will return the results to the master, which will output the negative image.

  49. Parallel Image processing (2) • Workload • If we have 32 pixels to process and 4 CPUs, each CPU will process 8 pixels. • For P0, the work will start at pixel 0 (displacement) and cover 8 pixels (count), as in the sketch below.
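A minimal sketch of computing per-process counts and displacements, including the case where the number of pixels is not evenly divisible by the number of processes (the helper name make_counts is illustrative, not from the slides):

    #include <stdio.h>
    #include <stdlib.h>

    /* Split 'total' items among 'p' processes: counts[i] and displs[i] give
       how many items process i handles and where its block starts. */
    void make_counts(int total, int p, int *counts, int *displs)
    {
        int i, base = total / p, extra = total % p, offset = 0;
        for (i = 0; i < p; i++) {
            counts[i] = base + (i < extra ? 1 : 0);  /* first 'extra' ranks get one more */
            displs[i] = offset;
            offset += counts[i];
        }
    }

    int main(void)
    {
        int counts[4], displs[4], i;
        make_counts(32, 4, counts, displs);          /* 32 pixels over 4 CPUs */
        for (i = 0; i < 4; i++)
            printf("P%d: start %d, count %d\n", i, displs[i], counts[i]);
        return 0;
    }

With 32 pixels and 4 CPUs this prints starts 0, 8, 16, 24 with a count of 8 each, matching the slide.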

  50. Parallel Image processing (3) • Find the displacement/count for each processor. • Master processor scatters the image: • Execute the negative operation • Gather the results on the master processor. • Displacement (displs) tells you where to start; count (counts) tells you how many to do. MPI_Scatterv(image, counts, displs, MPI_CHAR, image, counts[myId], MPI_CHAR, 0, MPI_COMM_WORLD); MPI_Gatherv(image, counts[myId], MPI_CHAR, image, counts, displs, MPI_CHAR, 0, MPI_COMM_WORLD);
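Putting the pieces together, here is a minimal sketch of the negative-image pipeline. The image size, the synthetic pixel data, the use of a separate local buffer, and the choice of unsigned char with MPI_UNSIGNED_CHAR are illustrative assumptions, not part of the original slides (which scatter and gather in place with MPI_CHAR).

    #include <stdio.h>
    #include <stdlib.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int rank, wsize, i;
        int total = 32;                       /* illustrative image size (MxN pixels) */
        unsigned char *image = NULL, *local;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &wsize);

        int *counts = malloc(wsize * sizeof(int));
        int *displs = malloc(wsize * sizeof(int));
        /* same splitting logic as the counts/displs sketch above */
        for (i = 0; i < wsize; i++) {
            counts[i] = total / wsize + (i < total % wsize ? 1 : 0);
            displs[i] = (i == 0) ? 0 : displs[i-1] + counts[i-1];
        }

        if (rank == 0) {                      /* master reads (here: fabricates) the image */
            image = malloc(total);
            for (i = 0; i < total; i++)
                image[i] = (unsigned char)i;
        }
        local = malloc(counts[rank]);

        /* scatter each process's share of pixels */
        MPI_Scatterv(image, counts, displs, MPI_UNSIGNED_CHAR,
                     local, counts[rank], MPI_UNSIGNED_CHAR, 0, MPI_COMM_WORLD);
        for (i = 0; i < counts[rank]; i++)
            local[i] = 255 - local[i];        /* the negative operation */
        /* gather the processed pixels back on the master */
        MPI_Gatherv(local, counts[rank], MPI_UNSIGNED_CHAR,
                    image, counts, displs, MPI_UNSIGNED_CHAR, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            printf("First pixel after negation: %d\n", image[0]);  /* 255 - 0 */
            free(image);
        }
        free(local);
        free(counts);
        free(displs);
        MPI_Finalize();
        return 0;
    }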
