
Programming the CoW!


iokina

Presentation Transcript


  1. Programming the CoW! Tools to start with on the new cluster.

  2. What’s it good for? Net DOOM? It should be good for computation, and to a lesser extent visualization.

  3. It’s a shame about Ray.

  4. So the CoW is ~4x faster right?

  5. Fortunately for SGI, no!

  6. So it should be great for coarse-grained computations. That is, design your programs to have long processing cycles and infrequent inter-node communication, and you should be just fine.

  7. How do we program it?
       Shared Memory: a global memory space is available to all nodes. Nodes use synchronization primitives to avoid contention.
       [Diagram: five NODEs attached to one shared MEMORY]
       Message Passing: every node has only a private memory space. All communication between nodes has to be explicitly directed.
       [Diagram: five NODEs, each with its own MEM]

  8. Thread Matrix Multiply. Workers split L, and each multiplies its part with all of R to get a part of RES. [Diagram: L x R = RES]

  9. Thread Matrix Multiply Example
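The code from this slide did not survive in the transcript. As a rough stand-in, here is a minimal sketch of the scheme described on slide 8, assuming POSIX threads and row-major matrices. The names `mm_job`, `mm_worker`, `mm_threads`, and `MM_MAX_WORKERS` are made up for illustration and are not from the original talk.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MM_MAX_WORKERS 16   /* arbitrary cap chosen for this sketch */

/* Hypothetical job record: one worker computes rows [row_begin, row_end) of RES. */
struct mm_job {
    const double *L;        /* n x m, row-major */
    const double *R;        /* m x p, row-major, read by every worker */
    double *RES;            /* n x p, row-major */
    size_t m, p;
    size_t row_begin, row_end;
};

/* Thread body: multiply this worker's slice of L with all of R. */
static void *mm_worker(void *arg)
{
    struct mm_job *job = arg;
    for (size_t i = job->row_begin; i < job->row_end; i++) {
        for (size_t j = 0; j < job->p; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < job->m; k++)
                sum += job->L[i * job->m + k] * job->R[k * job->p + j];
            job->RES[i * job->p + j] = sum;
        }
    }
    return NULL;
}

/* Split the rows of L across nworkers threads. Returns 0 on success, -1 on failure. */
int mm_threads(const double *L, const double *R, double *RES,
               size_t n, size_t m, size_t p, size_t nworkers)
{
    pthread_t tid[MM_MAX_WORKERS];
    struct mm_job job[MM_MAX_WORKERS];
    size_t started = 0;
    int status = 0;

    assert(nworkers >= 1 && nworkers <= MM_MAX_WORKERS);

    for (size_t w = 0; w < nworkers; w++) {
        job[w] = (struct mm_job){ L, R, RES, m, p,
                                  w * n / nworkers, (w + 1) * n / nworkers };
        if (pthread_create(&tid[w], NULL, mm_worker, &job[w]) != 0) {
            status = -1;
            break;
        }
        started++;
    }
    /* Workers write disjoint rows of RES, so no locking is needed; just join. */
    for (size_t w = 0; w < started; w++)
        pthread_join(tid[w], NULL);
    return status;
}
```

Because each thread writes only its own rows of RES and only reads L and R, the shared-memory version needs no synchronization beyond the final joins.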

  10. On the cluster we have no hardware support for SM, so MP is the natural alternative. Unix supports sockets for MP. People have built higher level MP libraries out of sockets that make life easier. Two that I am familiar with are PVM and MPI.

  11. PVM: Parallel Virtual Machine. Started in 1989. http://www.csm.ornl.gov/pvm A PVM is a virtual machine made of a collection of independent nodes. It has a lot of support for heterogeneous clusters. It is easy to use, though it may perform worse than MPI.

  12. PVM. Each node runs one pvmd daemon. Each node can run one or more tasks. Tasks use the pvmd to communicate with other tasks. Tasks can start new tasks, stop tasks, or delete nodes from the PVM at will. Tasks can be grouped. PVM comes with a console program that lets you control the PVM easily.

  13. PVM: Setup
        #Where PVM is installed.
        setenv PVM_ROOT /home/demarle/sci/distrib/mps/pvm3
        #What type of machine this node is.
        setenv PVM_ARCH LINUX
        #Where the ssh command is.
        setenv PVM_RSH /local/bin/ssh
        #Where your PVM applications are.
        setenv PVMBIN $PVM_ROOT/bin/LINUX
        #Where the pvm executables are.
        setenv PATH ${PATH}:$PVM_ROOT/lib
        setenv PATH ${PATH}:$PVM_ROOT/bin/LINUX

  14. PVM CONSOLE:
        [demarle@labnix13 scisem]$ pvm
        pvm> add labnix14
        add labnix14
        1 successful
            HOST        DTID
            labnix14    80000
        pvm> conf
        conf
        2 hosts, 1 data format
            HOST        DTID     ARCH    SPEED   DSIG
            labnix14    40000    LINUX   1000    0x00408841
            labnix13    80000    LINUX   1000    0x00408841
        pvm> quit
        quit
        Console: exit handler called
        pvmd still running.
        [demarle@labnix13 scisem]$

  15. PVM CONSOLE, continued:
        [demarle@labnix13 scisem]$ cord_racer
        Suspended
        [demarle@labnix13 scisem]$ pvm
        pvmd already running.
        pvm> ps
        ps
            HOST        TID      FLAG 0x    COMMAND
            labnix13    40016    4/c        -
            labnix13    40017    6/c,f      adsmd
      Use "pvm> help" to get a list of commands.
      Use "pvm> kill" to kill tasks.
      Use "pvm> delete" to delete nodes from the PVM.
      Use "pvm> halt" to stop every pvm task and daemon.

  16. PVM: IMPORTANT LIBRARY CALLS

  17. PVM: IMPORTANT LIBRARY CALLS

  18. PVM: IMPORTANT LIBRARY CALLS

  19. PVM: IMPORTANT LIBRARY CALLS

  20. Message Passing Matrix Multiply. Workers split L and R. They always multiply their L’, and take turns broadcasting their R’. [Diagram: L x R = RES]
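To make the data flow concrete, here is a single-process sketch of that scheme. It is not PVM or MPI code: the workers are simulated by a loop, and a `memcpy` into a receive buffer stands in for the broadcast. The names `mp_matmul_sim`, `mp_first_row`, and `MP_MAX_BLOCK` are hypothetical, and the row-block split is an assumption about how L and R are divided.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MP_MAX_BLOCK 64   /* capacity of the simulated receive buffer, in doubles */

/* First row owned by worker w when `rows` rows are split across nworkers. */
static size_t mp_first_row(size_t w, size_t rows, size_t nworkers)
{
    return w * rows / nworkers;
}

/* L is n x m, R is m x p, RES is n x p, all row-major.
 * Worker w owns a row block of L (its L') and a row block of R (its R'). */
void mp_matmul_sim(const double *L, const double *R, double *RES,
                   size_t n, size_t m, size_t p, size_t nworkers)
{
    double rbuf[MP_MAX_BLOCK];

    assert(nworkers >= 1);
    memset(RES, 0, n * p * sizeof(double));

    /* Round k: worker k takes its turn broadcasting its R'. */
    for (size_t k = 0; k < nworkers; k++) {
        size_t kb = mp_first_row(k, m, nworkers);
        size_t ke = mp_first_row(k + 1, m, nworkers);
        size_t block = (ke - kb) * p;

        assert(block <= MP_MAX_BLOCK);
        /* Stand-in for the broadcast: every worker "receives" rbuf. */
        memcpy(rbuf, R + kb * p, block * sizeof(double));

        /* Each worker multiplies the matching columns of its L' by the
         * received block and accumulates into its own rows of RES. */
        for (size_t w = 0; w < nworkers; w++) {
            size_t ib = mp_first_row(w, n, nworkers);
            size_t ie = mp_first_row(w + 1, n, nworkers);
            for (size_t i = ib; i < ie; i++)
                for (size_t kk = kb; kk < ke; kk++)
                    for (size_t j = 0; j < p; j++)
                        RES[i * p + j] += L[i * m + kk] * rbuf[(kk - kb) * p + j];
        }
    }
}
```

In a real message-passing version the inner loop over `w` disappears, since each node runs only its own iteration, and the `memcpy` becomes a send from worker k plus a receive on everyone else.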

  21. PVM Matrix Multiply Example

  22. MPI: Message Passing Interface. Started in 1992. http://www-unix.mcs.anl.gov/mpi/index.html Goal: to standardize message passing so that parallel code can be portable. Unlike PVM, it does not specify the virtual machine environment. For instance, it does not say how to start a program. It has more basic operations than PVM. It's supposed to be lower level and faster.

  23. MPICH. A free implementation of the MPI standard. http://www-unix.mcs.anl.gov/mpi/mpich It also comes with some extras, like scripts that give you some of PVM’s niceties. • mpirun: a script to start your programs with. • mpicc, mpiCC, mpif77, and mpif90: compiler wrappers. • MPE: a set of performance analysis and program visualization tools.

  24. MPI: Setup
        #where MPI is installed.
        setenv MYMPI /home/demarle/sci/distrib/mps/mpi/mpich-1.2.3
        #Where the ssh command is.
        setenv RSHCOMMAND /local/bin/ssh
        #where the executables are.
        setenv PATH ${PATH}:${MYMPI}/bin
      Uses a file to specify which machines you can use:
        ${MYMPI}/util/machines/machines.LINUX
      To start an executable:
        mpirun <-dbg-gdb> -np # filename

  25. MPI: IMPORTANT LIBRARY CALLS

  26. MPI: IMPORTANT LIBRARY CALLS

  27. MPI: IMPORTANT LIBRARY CALLS

  28. If you don't want the overhead of the PVM and MPI libraries and daemons, you can do essentially the same thing with sockets. Sockets will be faster, but also harder to use. They don’t come with groups, barriers, reductions, etc. You have to create these yourself.

  29. SOCKETS
      Think of file descriptors: sock = socket() ~ fd = open()
        int sock = socket(Domain, Type, Protocol);
      Domain
        AF_INET      over the net
        AF_UNIX      local to a node
      Type
        SOCK_STREAM  two-ended connections, reliable, no message size limit, i.e. TCP
        SOCK_DGRAM   connectionless, unreliable, ~1500 bytes, i.e. UDP
      Protocol
        like a flavor of the domain; these two just take 0

  30. Basic Process for a Master Task
        //open a socket, like a file descriptor
        sock = socket(AF_INET, SOCK_STREAM, 0);
        //bind your end to this machine's IP address and this program's PORT
        int ret = bind(sock, (struct sockaddr *) &servAddr, sizeof(servAddr));
        //let the socket listen for connections from remote machines
        ret = listen(sock, BACKLOG);
        //start remote programs
        system("ssh labnix14 worker.exe");
      TO BE CONTINUED …
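The slide uses `servAddr` without showing how it is filled in. The following is one plausible way to do it, assuming IPv4 and listening on all local interfaces. The helper name `make_listener` and the use of `SO_REUSEADDR` are additions for illustration, not part of the original code.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to `port` on every local
 * interface and put it in the listening state.
 * Returns the socket descriptor, or -1 on failure. */
int make_listener(unsigned short port, int backlog)
{
    struct sockaddr_in servAddr;
    int yes = 1;
    int sock;

    assert(backlog > 0);

    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;

    /* Let the master be restarted quickly without "address already in use". */
    setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

    /* Address fields must be in network byte order. */
    memset(&servAddr, 0, sizeof(servAddr));
    servAddr.sin_family = AF_INET;
    servAddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servAddr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *) &servAddr, sizeof(servAddr)) < 0 ||
        listen(sock, backlog) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

Passing port 0 asks the kernel to pick a free port, which can then be read back with `getsockname`. A master that starts its workers over ssh would more likely use a fixed port known to both sides.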

  31. Basic Process for a Worker
        //put yourself in background and nohup, to let the master continue
        ret = daemon(1, 0);
        //open a socket
        int sock = socket(AF_INET, SOCK_STREAM, 0);
        //bind your end with this machine's IP address and this program's PORT
        ret = bind(sock, (struct sockaddr *) &cliAddr, sizeof(cliAddr));
        //connect this socket to the listening one in the master
        ret = connect(sock, (struct sockaddr *) &servAddr, sizeof(servAddr));
      TO BE CONTINUED …

  32. Basic Process for a Master Task, cont.
        //accept each worker's connection to finish a new two-ended socket
        children[c].sock = accept(sock, (struct sockaddr *) &children[c].cliAddr, &children[c].cliAddrLen);
        //send and receive over the socket as you like
        ret = send(children[c].sock, parms, 8*sizeof(double), 0);
        ret = recv(children[c].sock, RES+rr*rsc, rpr*rpc, MSG_WAITALL);
        //close the sockets when you are done with them
        close(children[c].sock);

  33. Basic Process for a Worker, cont.
        //send and receive data as you please
        ret = recv(sock, parms, 7*sizeof(int), 0);
        ret = send(sock, (void *)RET, len2, 0);
        //close the socket when you are done with it
        close(sock);
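One thing the snippets above gloss over: on a stream socket, `send` and `recv` may transfer fewer bytes than requested, so the return value has to be checked. Below is a minimal sketch of wrappers that loop until the whole buffer has moved. `send_all` and `recv_all` are hypothetical helper names, not part of the sockets API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling send() until all len bytes are written.
 * Returns 0 on success, -1 on error. */
int send_all(int sock, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = send(sock, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        p += n;
        len -= (size_t) n;
    }
    return 0;
}

/* Keep calling recv() until all len bytes have arrived.
 * Returns 0 on success, -1 on error or if the peer closed early. */
int recv_all(int sock, void *buf, size_t len)
{
    char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = recv(sock, p, len, 0);
        if (n == 0)
            return -1;          /* connection closed before len bytes came in */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t) n;
    }
    return 0;
}
```

With these, the master's parameter transfer could read `send_all(children[c].sock, parms, 8*sizeof(double))`, and the matching receive on the worker would need to ask for the same number of bytes.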

  34. Shared Memory on cluster? SM code was so much simpler. So a lot of people have built DSM Systems. • Adsmith, CRL, CVM, DIPC, DSM-PM2, PVMSYNC, Quarks, SENSE, TreadMarks to name a few…

  35. Two types of Software DSMs

  36. PAGE Based DSMs. They make use of the Virtual Memory Manager.
      • Install a signal handler to catch segfaults.
      • Use mprotect to protect virtual memory pages assigned to remote nodes.
      • On a segfault, the process blocks, the segfault handler gets the page from a remote node, and control returns to the process.
      This approach suffers from false sharing: two or more nodes that want to write to different and unrelated places on the same memory page still contend for that page.
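The mechanism in those bullets can be tried on a single machine. The sketch below assumes Linux and one page, and replaces the network fetch with a local `memset`. All names (`dsm_init`, `dsm_fetch_page`, `dsm_segv_handler`, and the `dsm_` variables) are invented for illustration; a real DSM would also handle write-backs, many pages, and ownership.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *dsm_page;                      /* one page "owned by a remote node" */
static size_t dsm_page_size;
static volatile sig_atomic_t dsm_faults;    /* how many times we fetched the page */

/* Stand-in for asking a remote node for the page contents. */
static void dsm_fetch_page(char *dst, size_t len)
{
    memset(dst, 'R', len);
}

/* SIGSEGV handler: if the fault hit our protected page, unprotect it,
 * fill it in, and return so the faulting instruction is retried. */
static void dsm_segv_handler(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t addr = (uintptr_t) info->si_addr;
    uintptr_t base = (uintptr_t) dsm_page;

    (void) sig;
    (void) ctx;
    if (addr < base || addr >= base + dsm_page_size)
        _exit(1);               /* a genuine crash somewhere else */
    if (mprotect(dsm_page, dsm_page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
    dsm_fetch_page(dsm_page, dsm_page_size);
    dsm_faults++;
}

/* Map one inaccessible page and install the handler. Returns 0 on success. */
int dsm_init(void)
{
    struct sigaction sa;

    dsm_page_size = (size_t) sysconf(_SC_PAGESIZE);
    dsm_page = mmap(NULL, dsm_page_size, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (dsm_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = dsm_segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

After `dsm_init()`, the first read or write anywhere in `dsm_page` traps once, the handler fills the page, and later accesses run at full speed. Calling `mprotect` from a signal handler is not guaranteed by POSIX, though it is the usual trick on Linux for this kind of system.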

  37. Object Based DSMs • Let the programmer define the unit of sharing and then provide each shared object with something like load, modify and save methods. • They can eliminate false sharing, but they often aren’t as easy to use.

  38. DIPC • Distributed Inter Process Communication. • Page based. • It’s an extension to the Linux kernel. Specifically, it extends SYSTEM V IPC.

  39. SYSTEM V IPC? • Like an alternative to threads, it lets arbitrary unrelated processes work together. • Threads share the program's entire global space. • For shmem, processes explicitly declare what is shared. • SYSTEM V IPC also means messages and semaphores.

  40. Basic idea
        //create an object to share
        volatile struct shared { int i; } *shared;
        //make the object shareable
        int shmid = shmget(IPC_PRIVATE, sizeof(struct shared), (IPC_CREAT | 0600));
        shared = ((volatile struct shared *) shmat(shmid, 0, 0));
        //mark the segment for removal once every process has detached
        shmctl(shmid, IPC_RMID, 0);
        //start children; they don't get copies of "shared", they all actually access the original one
        fork();
        //all children can access the shared object whenever they want
        shared->i = 0;
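To see that parent and child really touch the same memory, the fragment can be wrapped into a small function with error checks and a `waitpid`. This is a sketch under the assumption of a Linux system with SYSTEM V shared memory enabled. `shm_roundtrip` is a hypothetical name, and the function restricts itself to non-negative values so that -1 can signal an error.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct shared { int i; };

/* Fork one child that stores `value` in a shared segment.
 * Returns the value the parent reads back, or -1 on error. */
int shm_roundtrip(int value)
{
    volatile struct shared *shared;
    int shmid, status, seen;
    pid_t pid;

    assert(value >= 0);

    shmid = shmget(IPC_PRIVATE, sizeof(struct shared), IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;
    shared = shmat(shmid, NULL, 0);
    if (shared == (void *) -1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    /* Mark for removal now; the segment lives until the last detach. */
    shmctl(shmid, IPC_RMID, NULL);

    shared->i = 0;
    pid = fork();
    if (pid < 0) {
        shmdt((const void *) shared);
        return -1;
    }
    if (pid == 0) {
        shared->i = value;      /* child writes into the parent's segment */
        _exit(0);
    }
    waitpid(pid, &status, 0);   /* wait so the write is done before reading */
    seen = shared->i;
    shmdt((const void *) shared);
    return seen;
}
```

If the child had only a private copy, as it would with an ordinary global variable after `fork`, the parent would still read 0 here.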

  41. How would this change for DIPC?
        #define IPC_DIPC 00010000
        shmid = shmget(IPC_PRIVATE, sizeof(struct shared), (IPC_CREAT | IPC_DIPC | 0600));
        //Same thing applies for semget and msgget.

  42. DIPC works by adding a small modification to the Linux kernel. The kernel looks for IPC_DIPC structures and bumps them out to a user-level daemon. Structures without the flag are treated normally. The daemon satisfies the request over the network and then returns the data to the kernel, which in turn returns the data to the user process.

  43. The great thing about DIPC is that it is very compatible with normal Linux. A DIPC program will run just fine on an isolated machine without DIPC; the flag will just be ignored. This means you can develop your software off the cluster and then just throw it on to make use of all the CPUs.

  44. DIPC Problems? It enforces strict sequential consistency, which is very easy to program against but generates a lot of network traffic. The version for the 2.4.X kernel isn't finished yet.

  45. Summary • CPU , COMMUNICATIONS  • MP: PVM, MPI, SOCKETS • DSM: DIPC?, Quarks?, …

  46. REFERENCES
        PVM    http://www.csm.ornl.gov/pvm
        MPI    http://www-unix.mcs.anl.gov/mpi/index.html
        MPICH  http://www-unix.mcs.anl.gov/mpi/mpich
        DIPC   http://wallybox.cei.net/dipc
