Distributed & Parallel Computing Cluster


  1. Distributed & Parallel Computing Cluster Patrick McGuigan mcguigan@cse.uta.edu 2/12/04

  2. DPCC Background • NSF-funded Major Research Instrumentation (MRI) grant • Goals • Personnel • PI • Co-PIs • Senior Personnel • Systems Administrator

  3. DPCC Goals • Establish a regional Distributed and Parallel Computing Cluster at UTA (DPCC@UTA) • An inter-departmental and inter-institutional facility • Facilitate collaborative research that requires large-scale storage (terabytes to petabytes), high-speed access (gigabit or more), and massive processing (hundreds of processors)

  4. DPCC Research Areas • Data mining / KDD • Association rules, graph mining, stream processing, etc. • High-energy physics • Simulation, moving towards a regional DØ center • Dermatology / skin cancer • Image database, lesion detection and monitoring • Distributed computing • Grid computing, PICO • Networking • Non-intrusive network performance evaluation • Software engineering • Formal specification and verification • Multimedia • Video streaming, scene analysis • Facilitate collaborative efforts that need high-performance computing

  5. DPCC Personnel • PI: Dr. Chakravarthy • Co-PIs: Drs. Aslandogan, Das, Holder, Yu • Senior Personnel: Paul Bergstresser, Kaushik De, Farhad Kamangar, David Kung, Mohan Kumar, David Levine, Jung-Hwan Oh, Gergely Zaruba • Systems Administrator: Patrick McGuigan

  6. DPCC Components • Establish a distributed-memory cluster (150+ processors) • Establish a symmetric (shared-memory) multiprocessor system • Establish large, shareable, high-speed storage (hundreds of terabytes)

  7. DPCC Cluster as of 2/1/2004 • Located in 101 GACB • Inauguration on 2/23 as part of E-Week • 5 racks of equipment + UPS

  8–11. Photos of the cluster (scaled for presentation; images not reproduced in this transcript)

  12. DPCC Resources • 97 machines • 81 worker nodes • 2 interactive nodes • 10 IDE-based RAID servers • 4 nodes supporting the Fibre Channel SAN • 50+ TB storage • 4.5 TB in each IDE RAID server • 5.2 TB in the FC SAN

  13. DPCC Resources (continued) • 1 Gb/s network interconnections • core switch • satellite switches • 1 Gb/s SAN network • UPS

  14. DPCC Layout

  15. DPCC Resource Details • Worker nodes • Dual Xeon processors • 32 machines @ 2.4 GHz • 49 machines @ 2.6 GHz • 2 GB RAM • IDE storage • 32 machines @ 60 GB • 49 machines @ 80 GB • Red Hat Linux 7.3 (2.4.20 kernel)

  16. DPCC Resource Details (cont.) • RAID servers • Dual Xeon processors (2.4 GHz) • 2 GB RAM • 4 RAID controllers • 2-port controller (qty 1) for mirrored OS disks • 8-port controllers (qty 3) for RAID5 with hot spare • 24 × 250 GB disks • 2 × 40 GB disks • NFS used to serve storage to the worker nodes

  17. DPCC Resource Details (cont.) • FC SAN • RAID5 array • 42 × 142 GB FC disks • FC switch • 3 GFS nodes • Dual Xeon (2.4 GHz) • 2 GB RAM • Global File System (GFS) • Served to the cluster via NFS • 1 GFS lock server

  18. Using DPCC • Two nodes available for interactive use • master.dpcc.uta.edu • grid.dpcc.uta.edu • More nodes are likely to be added to support other services (web, DB access) • Access through SSH (version 2 client) • Freeware Windows clients are available (ssh.com) • File transfers through SCP/SFTP (see the example below)
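
A minimal example of connecting and moving files from a Unix-like workstation; the hostnames come from the slide above, while the username and file name are placeholders:

    # Log in to an interactive node with an SSH version 2 client
    ssh yourlogin@master.dpcc.uta.edu

    # Copy a data file to your home directory on the cluster
    scp input.dat yourlogin@master.dpcc.uta.edu:~/

    # Or use an interactive SFTP session
    sftp yourlogin@grid.dpcc.uta.edu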

  19. Using DPCC (continued) • User quotas are not yet implemented on home directories, so be sensible in your usage • Large data sets should be stored on the RAID servers (requires coordination with the systems administrator) • All storage is visible to all nodes

  20. Getting Accounts • Have your supervisor request an account • The account will be created • Bring your ID to 101 GACB to receive your password • Keep your password safe • Log in to either interactive machine • master.dpcc.uta.edu • grid.dpcc.uta.edu • Use the yppasswd command to change your password (see below) • If you forget your password • See me in my office and I will reset it (bring ID) • Or call / e-mail me and I will reset it to the original password
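
Because account passwords are managed with the NIS (yp) tools rather than the local passwd command, changing a password is simply (prompt wording may differ slightly):

    yppasswd    # asks for your current password, then the new password twice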

  21. User environment • Default shell is bash • Change it with ypchsh • Customize your environment using startup files • .bash_profile (login sessions) • .bashrc (non-login sessions) • Customize with statements like: • export <variable>=<value> • source <shell file> • Much more information in the bash man page
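
As an illustration, a minimal ~/.bash_profile might look like the sketch below; the MPICH path is taken from the later MPI slides, and the EDITOR setting is just an example:

    # ~/.bash_profile -- read by login shells
    export PATH=$PATH:/usr/local/mpich-1.2.5/bin   # make mpicc/mpirun easy to invoke
    export EDITOR=vi                               # example of export <variable>=<value>
    source ~/.bashrc                               # also pick up non-login settings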

  22. Program development tools • GCC 2.96 • C • C++ • Java (gcj) • Objective-C • CHILL • Java • Sun J2SDK (JDK) 1.4.2
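
Compiling with these toolchains works as you would expect; the file names here are placeholders:

    gcc -o hello hello.c        # C with GCC
    g++ -o hello hello.cpp      # C++ with GCC
    javac Hello.java            # Java with the Sun J2SDK
    java Hello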

  23. Development tools (cont.) • Python • python = version 1.5.2 • python2 = version 2.2.2 • Perl • Version 5.6.1 • Flex, Bison, gdb… • If your favorite tool is not available, we’ll consider adding it!

  24. Batch Queue System • OpenPBS • Server runs on master • pbs_mom runs on the worker nodes • Scheduler runs on master • Jobs can be submitted from any interactive node • User commands • qsub – submit a job for execution • qstat – determine the status of a job, queue, or server • qdel – delete a job from the queue • qalter – modify attributes of a job • Single queue (workq)

  25. PBS qsub • qsub is used to submit jobs to PBS • A job is represented by a shell script • The shell script can alter the environment and proceed with execution • The script may contain embedded PBS directives • The script is responsible for starting parallel processes (not PBS)

  26. Hello World
    [mcguigan@master pbs_examples]$ cat helloworld
    echo Hello World from $HOSTNAME
    [mcguigan@master pbs_examples]$ qsub helloworld
    15795.master.cluster
    [mcguigan@master pbs_examples]$ ls
    helloworld  helloworld.e15795  helloworld.o15795
    [mcguigan@master pbs_examples]$ more helloworld.o15795
    Hello World from node1.cluster

  27. Hello World (continued) • The job ID is returned from qsub • Default attributes allow the job to run • 1 node • 1 CPU • 36:00:00 CPU time • Standard output and standard error streams are returned as files

  28. Hello World (continued) • Environment of the job • Defaults to your login shell (override with #! or the -S switch) • Login environment variable list with PBS additions: • PBS_O_HOST • PBS_O_QUEUE • PBS_O_WORKDIR • PBS_ENVIRONMENT • PBS_JOBID • PBS_JOBNAME • PBS_NODEFILE • PBS_QUEUE • Additional environment variables may be passed using the “-v” switch (see the sketch below)
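
A quick sketch of the “-v” switch; DATASET is a made-up variable name and myjob a made-up job script:

    # Pass a specific variable into the job's environment
    qsub -v DATASET=run01 myjob

    # Inside the myjob script:
    #   cd $PBS_O_WORKDIR           # return to the directory qsub was run from
    #   echo "Processing $DATASET"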

  29. PBS Environment Variables • PBS_ENVIRONMENT • PBS_JOBCOOKIE • PBS_JOBID • PBS_JOBNAME • PBS_MOMPORT • PBS_NODENUM • PBS_O_HOME • PBS_O_HOST • PBS_O_LANG • PBS_O_LOGNAME • PBS_O_MAIL • PBS_O_PATH • PBS_O_QUEUE • PBS_O_SHELL • PBS_O_WORKDIR • PBS_QUEUE (e.g. workq) • PBS_TASKNUM (e.g. 1)

  30. qsub options • Output streams: • -e (error output path) • -o (standard output path) • -j (join error and output streams as either output or error) • Mail options • -m [aben] when to mail (abort, begin, end, none) • -M who to mail • Name of the job • -N (15 printable characters max; first character must be alphabetic) • Which queue to submit the job to • -q [name] (unimportant for now) • Environment variables • -v pass specific variables • -V pass qsub’s entire environment to the job • Additional attributes • -W specify dependencies (see the example below)
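
Combining a few of these switches; the job name, output file, script name, and e-mail address are placeholders:

    # Name the job, merge stderr into stdout, and send mail only if the job aborts
    qsub -N myjob -j oe -o myjob.out -m a -M yourlogin@cse.uta.edu myjob.sh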

  31. qsub options (continued) • The -l switch is used to specify needed resources • Number of nodes • nodes=x • Number of processors • ncpus=x • CPU time • cput=hh:mm:ss • Wall time • walltime=hh:mm:ss • See the man page for pbs_resources

  32. Hello World • On the command line: qsub -l nodes=1 -l ncpus=1 -l cput=36:00:00 -N helloworld -m a -q workq helloworld • Options can be included in the script instead:
    #PBS -l nodes=1
    #PBS -l ncpus=1
    #PBS -m a
    #PBS -N helloworld2
    #PBS -l cput=36:00:00
    echo Hello World from $HOSTNAME

  33. qstat • Used to determine the status of jobs, queues, and the server • $ qstat • $ qstat <job id> • Switches • -u <user> list jobs of a user • -f provides full output • -n shows the nodes given to a job • -q shows the status of the queue • -i shows idle jobs (see the examples below)
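
A few concrete invocations, reusing the sample job ID from the earlier Hello World slide and a placeholder username:

    qstat -u yourlogin               # only your jobs
    qstat -q                         # summary of the workq queue
    qstat -n 15795.master.cluster    # which nodes the job was assigned
    qstat -f 15795.master.cluster    # full details for one job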

  34. qdel & qalter • qdel is used to remove a job from a queue • qdel <job ID> • qalter is used to alter the attributes of a currently queued job • qalter <job id> <attributes> (attributes are similar to qsub’s)
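
For example, again using the sample job ID from the Hello World slide:

    qalter -l walltime=02:00:00 15795.master.cluster   # change a resource limit on a queued job
    qdel 15795.master.cluster                          # remove the job from the queue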

  35. Processing on a worker node • All RAID storage is visible to all nodes • /dataxy, where x is the RAID server ID and y is the volume (1-3) • /gfsx, where x is the GFS volume (1-3) • Local storage on each worker node • /scratch • Data-intensive applications should copy input data (when possible) to /scratch for manipulation and copy results back to RAID storage (see the sketch below)
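
A sketch of that staging pattern as a PBS job script; /data11 follows the /dataxy naming scheme above, but the project directory, input file, and analysis program are all hypothetical:

    #!/bin/sh
    #PBS -l nodes=1
    #PBS -l cput=01:00:00

    # Stage the input from shared RAID storage onto the node-local scratch disk
    mkdir -p /scratch/$USER
    cp /data11/myproject/input.dat /scratch/$USER/

    # Run against the local copy
    cd /scratch/$USER
    ~/myproject/analyze input.dat > results.out

    # Copy results back to shared storage and clean up
    cp results.out /data11/myproject/
    rm -rf /scratch/$USER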

  36. Parallel Processing • MPI installed on the interactive and worker nodes • MPICH 1.2.5 • Path: /usr/local/mpich-1.2.5 • Asking for multiple processors • -l nodes=x • -l ncpus=2x (two CPUs per dual-processor node; see the example below)
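
So a request for four of the dual-processor nodes might look like this (my_mpi_job is a placeholder script):

    # x = 4 nodes, so ncpus = 2x = 8
    qsub -l nodes=4 -l ncpus=8 my_mpi_job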

  37. Parallel Processing (continued) • A PBS node file is created when the job executes • Available to the job via $PBS_NODEFILE • Used to start processes on remote nodes • mpirun • rsh

  38. Using the node file (example job)
    #!/bin/sh
    #PBS -m n
    #PBS -l nodes=3:ppn=2
    #PBS -l walltime=00:30:00
    #PBS -j oe
    #PBS -o helloworld.out
    #PBS -N helloworld_mpi
    NN=`cat $PBS_NODEFILE | wc -l`
    echo "Processors received = "$NN
    echo "script running on host `hostname`"
    cd $PBS_O_WORKDIR
    echo
    echo "PBS NODE FILE"
    cat $PBS_NODEFILE
    echo
    /usr/local/mpich-1.2.5/bin/mpirun -machinefile $PBS_NODEFILE -np $NN ~/mpi-example/helloworld

  39. MPI • Shared memory vs. message passing • MPI • A C-based library that allows programs to communicate • Each cooperating process runs the same program image • Different processes can do different computations based on the notion of rank • MPI primitives allow construction of more sophisticated synchronization mechanisms (barrier, mutex)

  40. helloworld.c
    #include <stdio.h>
    #include <unistd.h>
    #include <string.h>
    #include "mpi.h"

    int main(int argc, char **argv)
    {
        int rank, size;
        char host[256];
        int val;

        val = gethostname(host, 255);
        if (val != 0) {
            strcpy(host, "UNKNOWN");
        }

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        printf("Hello world from node %s: process %d of %d\n", host, rank, size);
        MPI_Finalize();
        return 0;
    }

  41. Using MPI programs • Compiling • $ /usr/local/mpich-1.2.5/bin/mpicc -o helloworld helloworld.c • Executing • $ /usr/local/mpich-1.2.5/bin/mpirun <options> helloworld • Common options: • -np number of processes to create • -machinefile list of nodes to run on

  42. Resources for MPI • http://www-hep.uta.edu/~mcguigan/dpcc/mpi • MPI documentation • http://www-unix.mcs.anl.gov/mpi/indexold.html • Links to various tutorials • Parallel programming course
