Access and manage remote HPC systems, run applications, handle files, enhance research, and get help. Includes interactive and batch jobs with Slurm.
Requesting Resources on an HPC Facility (Using the Slurm Workload Manager) Michael Griffiths and Norbert Gyenge Corporate Information and Computing Services The University of Sheffield www.sheffield.ac.uk/cics/research
Review: Objectives • Understand what High Performance Computing is • Be able to access remote HPC systems by different methods • Run applications on a remote HPC system • Manage files using the Linux operating system • Know how to use the different kinds of file storage systems • Run applications using a scheduling system or workload manager • Know how to get more resources and how to get resources dedicated to your research • Know how to enhance your research through shell scripting • Know how to get help and training
Outline • Using the Job Scheduler – Interactive Jobs • Batch Jobs • Task arrays • Running Parallel Jobs • Beyond Bessemer Accessing tier 2 resources • Course examples available using • git clone --single-branch --branch bessemer https://github.com/rcgsheffield/hpc_intro
1. Using the Job Scheduler Interactive Jobs https://docs.hpc.shef.ac.uk/en/latest/bessemer/slurm.html#request-an-interactive-shell Batch Jobs https://docs.hpc.shef.ac.uk/en/latest/bessemer/slurm.html#submitting-non-interactive-jobs SLURM Documentation https://slurm.schedmd.com/pdfs/summary.pdf https://slurm.schedmd.com/man_index.html
Running Jobs: A note on interactive jobs. Software that requires intensive computing should be run on the worker nodes and not the head node. Run compute-intensive interactive jobs on the worker nodes by using the command srun --pty bash -i The maximum (and also default) time limit for interactive jobs is 8 hours.
SLURM Bessemer login nodes are gateways to the cluster of worker nodes. The main purpose of the login nodes is to allow access to the worker nodes, NOT to run CPU-intensive programs. All CPU-intensive computations must be performed on the worker nodes. This is achieved by: srun --pty bash -i for interactive jobs; sbatch submission.sh for batch jobs. Once you log into Bessemer, you can take advantage of the power of a worker node for interactive work simply by typing srun --pty bash -i and working in the shell window. The next set of slides assumes that you are already working on one of the worker nodes.
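One way to confirm that a shell is running inside a Slurm job on a worker node, rather than on a login node, is to look for a job ID in the environment. The sketch below assumes that Slurm exports SLURM_JOB_ID to shells started by srun or sbatch; the function name on_worker_node is hypothetical and not part of the course material.

```shell
# Succeeds only when the shell is running inside a Slurm job.
# Assumption: Slurm exports SLURM_JOB_ID to shells started by srun or sbatch.
on_worker_node() {
  [ -n "${SLURM_JOB_ID:-}" ]
}

if on_worker_node; then
  echo "Inside Slurm job ${SLURM_JOB_ID} on $(hostname)"
else
  echo "Not inside a Slurm job; start one with: srun --pty bash -i"
fi
```

Running this on a login node should print the reminder; running it after srun --pty bash -i should report the job ID and the worker node's name.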
Practice Session 1: Running Applications on BESSEMER (Problem 1) • Case Studies • Analysis of Patient Inflammation Data • Running an R application how to submit jobs and run R interactively • List available and loaded modules load the module for the R package • Start the R Application and plot the inflammation data
Managing Your Jobs SLURM Overview SLURM is the workload management system, job scheduling and batch control system. (Others available such as PBS, Torque/Maui, Platform LSF ) Starts up interactive jobs on available workers Schedules all batch orientated ‘i.e. non-interactive’ jobs Fault Tolerant, highly scalable cluster management and job scheduling system Optimizes resource utilization
Scheduling batch jobs on the cluster [Figure: diagram of a SLURM master node (queues, policies, priorities, share/tickets, resources, users/projects) dispatching jobs N, O, U, X, Y and Z from Queue-A, Queue-B and Queue-C to slots on five SLURM worker nodes.]
Managing Jobs: monitoring and controlling your jobs. There are a number of commands for querying and modifying the status of a job that is running or waiting to run. These are: squeue (query job status): squeue --jobs jobid • squeue --users username • squeue (with no options, lists the jobs of all users) scancel (delete a job): scancel jobid
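The monitoring commands need a job ID, which sbatch reports when a job is submitted in a message of the form "Submitted batch job <id>". The sketch below shows one way to capture that ID for later use with squeue and scancel; the helper name job_id_from_sbatch is hypothetical, and the cluster commands are left as comments because they only work where Slurm is installed.

```shell
# Extract the numeric job ID from sbatch's "Submitted batch job <id>" message.
job_id_from_sbatch() {
  awk '/^Submitted batch job/ {print $4}'
}

# On the cluster, the helper could be used like this:
#   jobid=$(sbatch myjob | job_id_from_sbatch)
#   squeue --jobs "$jobid"     # query the job's status
#   scancel "$jobid"           # delete the job if it is no longer needed

# Offline demonstration with a sample message:
echo "Submitted batch job 12345" | job_id_from_sbatch    # prints 12345
```

sbatch also accepts a --parsable option that prints the job ID without the surrounding text, which may be simpler where it is available.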
Running Jobs: batch job example. Demonstration 1: Using the R package to analyse patient data. sbatch example: sbatch myjob The first few lines of the submit script myjob contain:
#!/bin/bash
#SBATCH --time=10:00:00
#SBATCH --output=myoutputfile
#SBATCH --error=myerroroutput
and you simply type: sbatch myjob
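For reference, a complete minimal submit script might look like the sketch below. The #SBATCH options are the ones named on the slide, with a shorter time limit; the workload (printing the job ID and host name) is a placeholder assumption and should be replaced with the real commands, for example loading a module and running an R script.

```shell
#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --output=myoutputfile
#SBATCH --error=myerroroutput

# Placeholder workload (an assumption): replace with your own commands.
# SLURM_JOB_ID is set by Slurm inside a job; "none" is shown when run by hand.
msg="Job ${SLURM_JOB_ID:-none} running on $(hostname)"
echo "$msg"
```

Saved as myjob, the script is submitted with sbatch myjob. Bash treats the #SBATCH lines as comments, so the script can also be run directly with bash myjob to check the commands before submitting.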
Practice Session: Submitting Jobs To BESSEMER (Problem 2 & 3) • Patient Inflammation Study run the R example as a batch job • Case Study • Fish population simulation • Submitting jobs to SLURM • Instructions are in the readme file in the slurm folder of the course examples • From an interactive session • Load the compiler module • Compile the fish program • Run test1, test2 and test3
Managing Jobs: Reasons for job failures • SLURM cannot find the binary file specified in the job script • You ran out of file storage. It is possible to exceed your filestore allocation limits during a job that is producing large output files. Use the quota command to check this. • Required input files are missing from the startup directory • An environment variable is not set correctly (LM_LICENSE_FILE etc.) • Hardware failure
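Two of these failures, a missing binary and missing input files, can be caught at the top of a job script before any compute time is used. The following is a minimal sketch of such a check; the function name preflight and the file names in the comment are hypothetical.

```shell
# Check that the program is executable and every input file is readable.
# Returns non-zero (and reports each problem on stderr) if anything is missing.
preflight() {
  local binary="$1"
  shift
  local f status=0
  if [ ! -x "$binary" ]; then
    echo "missing or non-executable binary: $binary" >&2
    status=1
  fi
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "missing input file: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Example: /bin/sh exists on most systems, so this check passes.
preflight /bin/sh && echo "pre-flight checks passed"

# In a job script the call might look like:
#   preflight ./fish input.dat || exit 1
```

Stopping early with a clear message in the error file is usually easier to diagnose than a failure part-way through the run.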
Finding out the memory requirements of a job. Real memory limits: the default real memory allocation is 2 GB. To request 64 GB of memory in a batch file: #SBATCH --mem=64000 (the value is in megabytes). Real memory can also be requested with a unit suffix, as --mem=NNG, where NN is the number of gigabytes. To determine the memory requirements of a job: scontrol show jobid -dd <jobid>
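The relationship between the two ways of writing a memory request can be checked with a little arithmetic. The sketch below assumes that Slurm treats the G suffix as 1024 MB, in which case --mem=64G corresponds to 65536 MB, slightly more than the rounded 64000 used on the slide. The helper name gb_to_mb is hypothetical.

```shell
# Convert a size in GB to the MB figure that a bare --mem=<number> expects.
# Assumption: Slurm treats the G suffix as 1024 MB.
gb_to_mb() {
  echo $(( $1 * 1024 ))
}

gb_to_mb 64    # prints 65536

# On the cluster, after a job has finished, accounting data (where enabled)
# can show the peak memory used, for comparison with the request:
#   sacct -j <jobid> --format=JobID,JobName,ReqMem,MaxRSS,Elapsed,State
```

Comparing MaxRSS with ReqMem for a short test run is one way to choose a sensible --mem value for the full-size job.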
Managing Jobs: Running CPU-parallel jobs. Many-processor tasks: • Shared memory • Distributed memory
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40
#SBATCH --mem=64000
#SBATCH --mail-user=username@sheffield.ac.uk
module load apps/openmpi/4.0.1/binary
Jobs are limited to a single node with a maximum of 40 tasks. Compilers that support MPI: PGI, Intel, GNU
Demonstration 3: Running a parallel job. Test 6 provides an opportunity to practice submitting parallel jobs to the scheduler. To run testmpi6, first compile the MPI example: • Load the OpenMPI compiler module: module load apps/openmpi/4.0.1/binary • Compile the diffuse program: mpicc diffuse.c -o diffuse -lm • Submit the job: sbatch testmpi6 • Use squeue to monitor the job • Examine the output
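The contents of testmpi6 are in the course examples and are not reproduced here. As an illustration only, a submit script for the compiled diffuse program could look like the sketch below. The task count, time limit and output file name are assumptions, as is the use of SLURM_NTASKS to pass the number of MPI ranks to mpirun.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00
#SBATCH --output=diffuse_%j.out

# Load the MPI module when the module command is available (i.e. on the cluster).
if command -v module >/dev/null 2>&1; then
  module load apps/openmpi/4.0.1/binary
fi

# Assumption: Slurm exports SLURM_NTASKS inside the job; fall back to 1 elsewhere.
ntasks="${SLURM_NTASKS:-1}"

if command -v mpirun >/dev/null 2>&1 && [ -x ./diffuse ]; then
  mpirun -np "$ntasks" ./diffuse
else
  echo "mpirun or ./diffuse not found: load the module and compile first"
fi
```

The %j in the output file name is replaced by the job ID, which keeps the output of repeated runs separate.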
Managing Jobs: Running arrays of jobs. Many processors each run a copy of a task independently. Add the --array parameter to the script file (with #SBATCH at the beginning of the line). Example: #SBATCH --array=1-4:1 This will create 4 tasks from one job. Each task will have its environment variable $SLURM_ARRAY_TASK_ID set to a single unique value ranging from 1 to 4. There is no guarantee that task number m will start before task number n, where m<n. https://slurm.schedmd.com/job_array.html
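A common use of $SLURM_ARRAY_TASK_ID is to pick a different input file for each task. The sketch below assumes input files named input_1.dat to input_4.dat; that naming scheme, the time limit and the output file pattern are illustrative and not taken from the course examples.

```shell
#!/bin/bash
#SBATCH --array=1-4:1
#SBATCH --time=00:10:00
#SBATCH --output=array_%A_%a.out

# Slurm sets SLURM_ARRAY_TASK_ID for each task; default to 1 when run by hand.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Hypothetical naming scheme: one input file per task.
input_file="input_${task_id}.dat"

echo "Task ${task_id} would process ${input_file}"
```

In the output file name, %A is replaced by the array's job ID and %a by the task index, so each task writes to its own file. Because tasks may start in any order, no task should depend on the output of another.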
Practice Session: Submitting A Task Array To BESSEMER (Problem 4) • Case Study • Fish population simulation • Submitting jobs to Slurm • Instructions are in the readme file in the Slurm folder of the course examples • From an interactive session • Run the Slurm task array example • Run test4, test5
Beyond BESSEMER • Bessemer and ShARC are suitable for many compute problems • Purchasing dedicated resource • National tier 2 facility for more demanding compute problems • Archer: larger facility for grand challenge problems (peer review process to access) https://www.sheffield.ac.uk/cics/research/hpc/costs
High Performance Computing Tiers • Tier 1 computing • Archer • Tier 2 Computing • Peta-5, jade • Tier 3 Computing • Bessemer, ShARC
Purchasing Resource https://www.sheffield.ac.uk/cics/research/hpc/costs • Buying nodes using the framework • Research groups purchase HPC equipment against their research grant; this hardware is integrated with the Iceberg cluster • Buying a slice of time • Research groups can purchase servers for a length of time specified by the research group (cost is 1.0p per core per hour) • Servers are reserved for dedicated usage by the research group using a provided project name • When reserved nodes are idle they become available to the general short queues. They are quickly released for use by the research group when required. • For information e-mail research-it@Sheffield.ac.uk
National HPC Services • Tier-2 Facilities • http://www.hpc-uk.ac.uk/ • https://goo.gl/j7UvBa • Archer • UK National Supercomputing Service • Hardware – CRAY XC30 • 2632 standard nodes • Each node contains two Intel E5-2697 v2 12-core processors • Therefore 2632 × 2 × 12 = 63168 cores • 64 GB of memory per node • 376 high-memory nodes with 128 GB of memory • Nodes connected to each other via the ARIES low-latency interconnect • Research Data File System – 7.8 PB disk • http://www.archer.ac.uk/ • EPCC • HPCC Facilities • http://www.epcc.ed.ac.uk/facilities/national-facilities • Training and expertise in parallel computing
Links for Software Downloads • Moba X-term https://mobaxterm.mobatek.net/ • Putty http://www.chiark.greenend.org.uk/~sgtatham/putty/ • WinSCP http://winscp.net/eng/download.php • TigerVNC http://sourceforge.net/projects/tigervnc/