This guide provides information on the IITJ HPC system software, including the operating system RHEL 6.9, job management system PBS with Torque, MPI libraries, Intel/GNU compilers, SSH login, Linux commands, job submission and monitoring, and OpenMP programming.
Software List
• Operating System: RHEL 6.9
• Job Management System: Portable Batch System (PBS) with Torque
• Compilers, Languages: Intel Fortran/C/C++ Compiler for Linux V11; GNU Fortran/C/C++ compilers
Message Passing Interface (MPI) Libraries
• MVAPICH
• Open MPI
Hostnames
• Master node (use for job submission): hpc-login (192.168.1.33)
• Compute nodes: node01 ... node42
Basic Login
• Remote login to the master node using secure shell:
  ssh -X username@192.168.1.33
• Login from a Windows host: PuTTY or MobaXterm
Login from Windows Host
• Use PuTTY to set up a secure connection: Host Name = 192.168.1.33
Linux Commands
• Both master and compute nodes are installed with Linux.
• Frequently used Linux commands apply throughout the HPC cluster.
Job Submission Procedure
• Prepare and compile a program, e.g. mpicc -o hello hello.c
• Prepare a job submission script, e.g. Qhello.pbs
• Submit the job using qsub, e.g. qsub Qhello.pbs
• Note the jobID and monitor with qstat.
• Examine the error and output files, e.g. hello.oJobID, hello.eJobID
Compiling & Running MPI Programs
• Using Open MPI, set the path at the command prompt:
  export PATH=/opt/openmpi/bin:$PATH
• Compile using mpicc, mpiCC, mpif77 or mpif90, e.g. mpicc -o hello hello.c
• Prepare a hostfile (e.g. machines) listing the nodes to use:
  hpc-login
  node01
  node02
  node03
• Run the program with the desired number of processes:
  mpirun -np 4 -hostfile machines ./hello
Prepare the parallel job script, Qhello.pbs:

#!/bin/sh
### Job name
#PBS -N hello
#PBS -q large
#PBS -l nodes=2:ppn=12
#PBS -l walltime=700:00:00
#PBS -e error
#PBS -o output
cd $PBS_O_WORKDIR
mpirun -np 24 -machinefile $PBS_NODEFILE <Your executable PATH> <-any option> <Your input file name> <Your output file name>
Job Submission and Monitoring
• Submit the job: qsub <your job submit file>
• Note the jobID, e.g. 15238.NAME
• Monitor with qstat, e.g. qstat 15238

Job id           Name   User   Time Use   S   Queue
---------------  -----  -----  --------   -   -----
15238.hpc-login  hello  test   0          R   large
Job Monitoring
• qstat: show the status of submitted jobs
• Delete a job with qdel, e.g. qdel <JobID>
OpenMP
The OpenMP Application Program Interface (API) supports multi-platform shared-memory parallel programming in C/C++ and Fortran on all architectures, including Unix and Windows NT platforms. Jointly defined by a group of major computer hardware and software vendors, OpenMP is a portable, scalable model that gives shared-memory parallel programmers a simple and flexible interface for developing parallel applications for platforms ranging from the desktop to the supercomputer.
Sample OpenMP example:

#include <omp.h>
#include <stdio.h>

int main()
{
    #pragma omp parallel
    printf("Hello from thread %d, nthreads %d\n",
           omp_get_thread_num(), omp_get_num_threads());
}
Good Practice in using IITJ HPC
• Every user shall apply for his/her own user account to log in to the master node of the HPC.
• Accounts and passwords must not be shared with other users.
• Every user must submit jobs to the HPC cluster from the master node (hpc-login) via the PBS (qsub) job queuing system. Automatic dispatching of jobs using scripts is not allowed.
• Foreground jobs on the HPC cluster are restricted to program testing, and the duration should not exceed 1 minute of CPU time per job.
Good Practice in using IITJ HPC
• Log out from the master node/compute nodes after use.
• Delete unused files or compress temporary data.
• Estimate the walltime for running jobs and request just enough.
• Never run foreground jobs on the master node or the compute nodes.
• Report abnormal behaviour.