
Using Kure and Killdevil

Using Kure and Killdevil. Mark Reed Grant Murphy ITS Research Computing. Outline. Compute Clusters Killdevil Kure Logging In File Spaces User Environment and Applications, Compiling Job Management. Logistics. Course Format Lab Exercises Breaks. Links. UNC Research Computing


Presentation Transcript


  1. Using Kure and Killdevil Mark Reed Grant Murphy ITS Research Computing

  2. Outline • Compute Clusters • Killdevil • Kure • Logging In • File Spaces • User Environment and Applications, Compiling • Job Management

  3. Logistics • Course Format • Lab Exercises • Breaks

  4. Links • UNC Research Computing • http://its.unc.edu/research • Getting started Killdevil page • http://help.unc.edu/CCM3_031537 • Killdevil FAQ • http://help.unc.edu/CCM3_031548 • Getting started Kure page • http://help.unc.edu/ccm3_015682

  5. What is a compute cluster? What exactly is Killdevil? Kure?

  6. What is a compute cluster? Some Typical Components • Compute Nodes • Interconnect • Shared File System • Software • Operating System (OS) • Job Scheduler/Manager • Mass Storage

  7. Compute Cluster Advantages • fast interconnect, tightly coupled • aggregated compute resources • can run parallel jobs to access more compute power and more memory • large (scratch) file spaces • installed software base • scheduling and job management • high availability • data backup

  8. Multi-Core Computing • The trend in High Performance Computing is towards multi-core or many core computing. • More cores at slower clock speeds for less heat • Dual and quad core processors are now common. • Soon 64+ core processors will be common • And these may be heterogeneous!

  9. The Heat Problem Taken From: Jack Dongarra, UT

  10. More Parallelism Taken From: Jack Dongarra, UT

  11. Kure • An HPC/HTC research compute cluster in RC • Named after the beach in North Carolina • It’s pronounced like the Nobel Prize-winning physicist and chemist, Madame Curie

  12. Kure Compute Cluster • Heterogeneous Research Cluster • Hewlett Packard Blades • 200+ Compute Nodes, mostly • Xeon 5560 2.8 GHz • Nehalem Microarchitecture • Dual socket, quad core • 48 GB memory • over 1800 cores • some higher memory nodes • Infiniband 4x QDR • priority usage for patrons • Buy in is cheap • Storage • Scratch space same as Emerald • No AFS home

  13. Kure Cont. • The original configuration of Kure was mostly homogeneous but it became increasingly heterogeneous as patrons added to it. • Most (non-patron) compute nodes are 48 GB but there are additional high memory nodes • 3 nodes each with 128 GB of memory • 2 nodes each with 96 GB of memory • patron nodes with 72 GB of memory

  14. Multi-Purpose Killdevil Cluster • High Performance Computing • Large parallel jobs, high speed interconnect • High Throughput Computing (HTC) • high volume serial jobs • Large memory jobs • special nodes for extreme memory • GPGPU computing • computing on Nvidia processors

  15. Killdevil Nodes • Three types of nodes: • compute nodes • large memory nodes • GPGPU nodes

  16. Killdevil Cluster – Compute Nodes • Intel Xeon processors, Model X5670 • Dual socket hex core (12 cores per node) • 2.93 GHz processors for each core • 12M L3 cache per socket • 604 nodes with 48 GB memory per node • 68 nodes with 64 GB memory • total of 672 nodes with 8064 cores • plus GPU and large memory nodes

  17. Killdevil Extreme Memory Nodes • 2 nodes each with 1 TB of memory • extremely large shared memory node! • Intel Xeon Model X7550 • 32 cores per node • 2.0 GHz processors • Use the bigmem queue

  18. Killdevil GPGPU Computing • General Purpose computing on Graphics Processing Units (GPGPU) • 32 compute nodes are paired with 64 GPUs in a 2:1 ratio (two GPUs per compute node) • this is configurable and may vary • compute nodes are Intel Xeon X5650, 2.67 GHz, 12 cores, 48 GB memory nodes • GPUs are Nvidia Tesla (M2070), each with 448 compute cores • Specify gpu resource on bsub

  19. Infiniband Connections • Connection comes in single (SDR), double (DDR), and quad data rates (QDR). • Killdevil is QDR. • Single data rate is 2.5 Gbit/s in each direction per link. • Links can be aggregated - 1x, 4x, 12x. • Killdevil is 4x. • Links use 8B/10B encoding (10 bits carry 8 bits of data), so the useful data transmission rate is four-fifths the raw rate. Thus a 1x link at single, double, or quad data rate carries 2, 4, or 8 Gbit/s respectively. • Data rate for Killdevil is 32 Gb/s or 4 GB/s (4x QDR).
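
The arithmetic on this slide can be checked with a short shell sketch. The function name ib_data_rate and its two arguments are hypothetical, chosen only for illustration; the 2.5 Gbit/s base rate and the 8B/10B factor are the figures quoted on the slide.

```shell
# Hypothetical helper: useful Infiniband data rate in Gbit/s.
# $1 = data-rate multiplier (1 = SDR, 2 = DDR, 4 = QDR)
# $2 = link width (1, 4, or 12)
ib_data_rate() {
    # Raw rate is 2.5 Gbit/s per 1x SDR link; use tenths (25) to stay in integers.
    # 8B/10B encoding leaves 8/10 of the raw rate as useful data.
    echo $(( 25 * $1 * $2 * 8 / 10 / 10 ))
}

ib_data_rate 4 4    # 4x QDR, as on Killdevil: prints 32
```

Calling ib_data_rate 1 1 and ib_data_rate 2 1 gives 2 and 4, matching the per-link SDR and DDR figures above.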

  20. Login to Killdevil/Kure • Use ssh to connect: • ssh killdevil.unc.edu • ssh kure.unc.edu • SSH Secure Shell with Windows • see http://shareware.unc.edu/software.html • For use with X-Windows Display: • ssh -X killdevil.unc.edu or ssh -X kure.unc.edu • ssh -Y killdevil.unc.edu or ssh -Y kure.unc.edu • Off-campus users (i.e. domains outside of unc.edu) must use a VPN connection

  21. File Spaces

  22. Killdevil File Spaces • Home directories • /nas02/home/<a>/<b>/<onyen> • a = first letter of onyen, b = second letter of onyen • hard limit of 15 GB • Scratch Space • NOT backed up • purged regularly (21 days or less) • run jobs with large output in these spaces • /netscr – 15 TB (tuned for small files) • /lustre – 126 TB (tuned for large files) • Mass Storage • ~/ms
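
The home directory layout above can be expressed as a small shell sketch. The function name home_dir_for and the onyen "jdoe" are made up for illustration; only the /nas02/home/<a>/<b>/<onyen> pattern comes from the slide.

```shell
# Hypothetical helper: build the home directory path for an onyen.
home_dir_for() {
    onyen=$1
    a=$(printf '%s' "$onyen" | cut -c1)   # first letter of onyen
    b=$(printf '%s' "$onyen" | cut -c2)   # second letter of onyen
    printf '/nas02/home/%s/%s/%s\n' "$a" "$b" "$onyen"
}

home_dir_for jdoe   # prints /nas02/home/j/d/jdoe
```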

  23. Kure File Spaces • Home directories • /nas02/home/<a>/<b>/<onyen> • a = first letter of onyen, b = second letter of onyen • hard limit of 15 GB • Scratch Space • NOT backed up • purged regularly (21 days or less) • run jobs with large output in these spaces • /netscr – 15 TB (tuned for small files) • /largefs – 24 TB (tuned for large files) • Mass Storage • ~/ms

  24. File System Notes • Note that the same home directory is mounted on Killdevil and Kure • Check your home file space usage with the quota command • quota -s (this uses more readable units) • Lustre file space in Killdevil is attached via Infiniband and may be faster • Best practice for jobs with large output is to run them in scratch space, tar and compress results, and store them in mass storage.
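
A minimal sketch of the "tar and compress results" step, assuming a job has left its output in one directory. The function name archive_results is hypothetical, and temporary directories stand in for scratch space and ~/ms so the example can run anywhere.

```shell
# Hypothetical helper: bundle a results directory into one compressed tar file.
# $1 = directory holding job output (in scratch space)
# $2 = destination directory (for example ~/ms)
archive_results() {
    src=$1
    dest=$2
    name=$(basename "$src")
    # -C switches to the parent directory so the archive holds relative paths.
    tar -czf "$dest/$name.tar.gz" -C "$(dirname "$src")" "$name"
}

# Demonstration with temporary directories standing in for scratch and ~/ms.
scratch=$(mktemp -d)
archive=$(mktemp -d)
mkdir "$scratch/run1"
echo "sample output" > "$scratch/run1/result.txt"
archive_results "$scratch/run1" "$archive"
ls "$archive"
```

Storing one compressed file rather than many small ones also suits tape-backed storage, where per-file overhead is high.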

  25. Mass Storage • long term archival storage • access via ~/ms • looks like ordinary disk file system – data is actually stored on tape • “limitless” capacity • data is backed up • For storage only, not a work directory (i.e. don’t run jobs from here) • if you have many small files, use tar or zip to create a single file for better performance • Sign up for this service on onyen.unc.edu “To infinity … and beyond” - Buzz Lightyear

  26. User Environment and Applications, Compiling Code Modules

  27. Modules • The user environment is managed by modules. They provide a convenient way to access software applications • Modules modify the user environment by modifying and adding environment variables such as PATH or LD_LIBRARY_PATH • Typically you set these once and leave them • Note there are two module settings, one for your current environment and one to take effect on your next login (e.g. batch jobs running on compute nodes)

  28. Common Module Commands • module avail • module avail apps • module help • Change Current Shell: • module list • module add • module rm • Login version: • module initlist • module initadd • module initrm • For more on modules, see http://help.unc.edu/CCM3_006660

  29. Compiling on Killdevil/Kure Serial Programming • Suites for C, C++, Fortran90, Fortran77, etc • Intel Compilers • icc, icpc, ifort • GNU • gcc, g++, gfortran • Portland Group (PGI) • pgcc, pgCC, pgf90, pgf77 • Generally speaking the Intel or PGI compilers will give slightly better performance

  30. Parallel Jobs with MPI • There are three implementations of the MPI standard installed on both systems: • mvapich • mvapich2 • openmpi • Platform MPI may be added as a module soon on Killdevil • Performance is similar for all three, and all three run on the IB fabric. Mvapich is the default. Openmpi and mvapich2 have more of the MPI-2 features implemented.

  31. Compiling MPI programs • Use the MPI wrappers to compile your program • mpicc, mpiCC, mpif90, mpif77 • the wrappers will find the appropriate include files and libraries and then invoke the actual compiler • for example, mpicc will invoke either gcc, pgcc, or icc depending upon which module you have loaded

  32. Compiling on Killdevil/Kure Parallel Programming • MPI (see previous page) • OpenMP • Compiler flag: • -openmp for Intel • -fopenmp for GNU • -mp for PGI • Must set OMP_NUM_THREADS in submission script
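
The compiler-to-flag mapping above can be captured in a short shell sketch. The function name openmp_flag and the file names my_code.c and myexe are placeholders; the flags are the ones listed on the slide (newer Intel compilers may expect a different spelling, so check the compiler's own documentation).

```shell
# Hypothetical helper: print the OpenMP flag listed above for a given compiler.
openmp_flag() {
    case $1 in
        icc|icpc|ifort)        echo "-openmp"  ;;  # Intel
        gcc|g++|gfortran)      echo "-fopenmp" ;;  # GNU
        pgcc|pgCC|pgf90|pgf77) echo "-mp"      ;;  # PGI
        *) echo "unknown compiler: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) a compile line, so no compiler needs to be installed.
echo "gcc $(openmp_flag gcc) -o myexe my_code.c"
```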

  33. Debugging - Totalview • If you are debugging code there is a powerful commercial debugger, totalview • See http://help.unc.edu/CCM3_021717 • parallel and serial code • Fortran/C/C++ • GUI for source level control • too many features to list!

  34. Job Scheduling and Management LSF

  35. What does a Job Scheduler and batch system do? Manage Resources • allocate user tasks to resource • monitor tasks • process control • manage input and output • report status, availability, etc • enforce usage policies

  36. Job Scheduling Systems • Allocates compute nodes to job submissions based on user priority, requested resources, execution time, etc. • Many types of schedulers • Load Sharing Facility (LSF) – Used by Killdevil/Kure • IBM LoadLeveler • Portable Batch System (PBS) • Sun Grid Engine (SGE)

  37. LSF • All Research Computing clusters use LSF to do job scheduling and management • LSF (Load Sharing Facility) is a (licensed) product from Platform Computing • Fairly distribute compute nodes among users • enforce usage policies for established queues • most common queues: int, now, week, month • RC uses Fair Share scheduling, not first come, first served (FCFS) • LSF commands typically start with the letter b (as in batch), e.g. bsub, bqueues, bjobs, bhosts, … • see man pages for much more info!

  38. Simplified view of LSF • user logged in to login node submits job: bsub -n 64 -a mvapich -q week mpirun myjob • job routed to queue (jobs queued: job_J, job_F, myjob, job_7) • job dispatched to run on available host which satisfies job requirements

  39. Running Programs on Killdevil • Upon ssh to Killdevil/Kure, you are on the Login node. • Programs SHOULD NOT be run on Login node. • Submit programs to one of the many, many compute nodes. • Submit jobs using Load Sharing Facility (LSF) via the bsub command.

  40. Common batch commands • bsub – submit jobs • bqueues – view info on defined queues • bqueues -l week • bkill – stop/cancel submitted job • bjobs – view submitted jobs • bjobs -u all • bhist – job history • bhist -l <jobID>

  41. Common batch commands • bhosts – status and resources of hosts (nodes) • bpeek – display output of running job • Use man pages to get much more info! • man bjobs

  42. Submitting Jobs: bsub Command • Run large jobs out of scratch space; smaller jobs can run out of your home space • bsub [-bsub_opts] executable [-exec_opts] • Common bsub options: • -o <filename> (e.g. -o out.%J) • -q <queue name> (e.g. -q week) • -R "resource specification" (e.g. -R "span[ptile=8]") • -n <number of processes> (used for parallel, MPI jobs) • -a <application specific esub> (e.g. -a mvapich, used on MPI jobs)

  43. Two methods to submit jobs: • bsub example: submit the executable job, myexe, to the week queue and redirect output to the file out.<jobID> (default is to mail output) • Method 1: Command Line • bsub -q week -o out.%J myexe • Method 2: Create a file (details to follow) called, for example, myexe.bsub, and then submit that file. Note the redirect symbol, < • bsub < myexe.bsub

  44. Method 2 cont. • The file you submit contains all the bsub options you want, so for this example myexe.bsub will look like this:
       #BSUB -q week
       #BSUB -o out.%J
       myexe
     • This is actually a shell script, so the top line could be the usual #!/bin/csh, etc., and you can run any commands you would like. • If this doesn’t mean anything to you, then never mind :)
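
As a sketch of Method 2, the batch file can be written out with a here-document. The #!/bin/sh first line and the use of a temporary directory are assumptions made for this illustration; the #BSUB lines and the name myexe.bsub follow the slide's example.

```shell
# Write the example batch file into a temporary directory.
job_dir=$(mktemp -d)
cat > "$job_dir/myexe.bsub" <<'EOF'
#!/bin/sh
#BSUB -q week
#BSUB -o out.%J
myexe
EOF

cat "$job_dir/myexe.bsub"
# On the cluster, the file would then be submitted with:
#   bsub < myexe.bsub
```

Because the here-document delimiter is quoted, the shell copies the text literally and leaves %J for LSF to replace with the job ID.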

  45. Parallel Job example • Batch Command Line Method • bsub -q week -o out.%J -n 64 -a mvapich mpirun myParallelExe • Batch File Method • bsub < myexe.bsub • where myexe.bsub will look like this:
       #BSUB -q week
       #BSUB -o out.%J
       #BSUB -a mvapich
       #BSUB -n 64
       mpirun myParallelExe
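
When the queue or process count changes often, the batch file can be generated rather than edited by hand. This is a sketch only: the function name write_mpi_job and its argument order are hypothetical, while the #BSUB options mirror the slide.

```shell
# Hypothetical helper: write an LSF batch file for an MPI job.
# $1 = batch file to create, $2 = queue, $3 = number of processes, $4 = executable
write_mpi_job() {
    cat > "$1" <<EOF
#BSUB -q $2
#BSUB -o out.%J
#BSUB -a mvapich
#BSUB -n $3
mpirun $4
EOF
}

job_file=$(mktemp)
write_mpi_job "$job_file" week 64 myParallelExe
cat "$job_file"
# On the cluster, submit it with: bsub < "$job_file"
```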

  46. Some Killdevil Queues • Most users have a run limit of 1024 job slots unless they have been granted extra slots. • Queues are always subject to change. Use the bqueues command to find the current status.

  47. Some Kure Queues • Most users have a limit of 64 job slots unless they have been granted extra slots. • Queues are always subject to change. Use the bqueues command to find the current status.

  48. Common Error 1 • If job immediately dies, check err.%J file • err.%J file has error: • Can't read MPIRUN_HOST • Problem: MPI environment settings were not correctly applied on compute node • Solution: Include mpirun in bsub command

  49. Common Error 2 • Job immediately dies after submission • err.%J file is blank • Problem: ssh passwords and keys were not correctly set up at initial login to Killdevil • Solution: • cd ~/.ssh/ • mv id_rsa id_rsa-orig • mv id_rsa.pub id_rsa.pub-orig • Log out of Killdevil • Log in to Killdevil and accept all defaults
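
The key-renaming steps in this fix can be wrapped in a small function that skips files that are not there. The name set_aside_ssh_keys is hypothetical, and the demonstration runs against a temporary directory rather than the real ~/.ssh.

```shell
# Hypothetical helper: set aside an existing SSH key pair in the given directory.
# $1 = SSH directory (normally ~/.ssh)
set_aside_ssh_keys() {
    ssh_dir=$1
    for key in id_rsa id_rsa.pub; do
        # Only move files that exist, so the function is safe to re-run.
        if [ -f "$ssh_dir/$key" ]; then
            mv "$ssh_dir/$key" "$ssh_dir/$key-orig"
        fi
    done
}

# Demonstration on a temporary directory with empty stand-in key files.
demo_dir=$(mktemp -d)
touch "$demo_dir/id_rsa" "$demo_dir/id_rsa.pub"
set_aside_ssh_keys "$demo_dir"
ls "$demo_dir"
```

After running the real thing with ~/.ssh as the argument, logging out and back in lets the cluster regenerate the keys, as the slide describes.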

  50. Interactive Jobs • To run long shell scripts on Kure, use the int queue • bsub -q int -Ip /bin/bash • This bsub command provides a prompt on a compute node • Can run program or shell script interactively from compute node • on Killdevil use hour or day as needed • bsub -q hour -Ip /bin/bash
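
The per-cluster queue choice for interactive sessions can be sketched as a helper that prints the matching bsub line. The function name interactive_cmd and the default of hour on Killdevil are assumptions for illustration; the queue names are those given on the slide.

```shell
# Hypothetical helper: print the bsub line for an interactive shell.
# $1 = cluster (kure or killdevil)
# $2 = Killdevil queue, hour or day (optional, defaults to hour)
interactive_cmd() {
    case $1 in
        kure)      queue=int ;;
        killdevil) queue=${2:-hour} ;;
        *) echo "unknown cluster: $1" >&2; return 1 ;;
    esac
    echo "bsub -q $queue -Ip /bin/bash"
}

interactive_cmd kure            # prints: bsub -q int -Ip /bin/bash
interactive_cmd killdevil day   # prints: bsub -q day -Ip /bin/bash
```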
