Using the Batch System at NERSC

Mark Durst, NERSC/USG. ERSUG Training, Argonne, IL, 28 April 1999.

Presentation Transcript


  1. Using the Batch System at NERSC
     Mark Durst, NERSC/USG
     ERSUG Training, Argonne, IL, 28 April 1999

  2. Outline
     • Quick example
     • How batch processing works
     • Batch and pipe queues
     • How to submit jobs
     • Monitoring jobs
     • Reminders and Pointers

  3.
        #!/bin/csh
        #
        # file: simple1
        #
        #QSUB -q serial
        #QSUB -J y          # keep job log

        set myname=`whoami`
        set now=`date`
        set mylocn=`pwd`

        echo ""
        echo "Hello $myname, this is your shell script $0,"
        echo "running at $now."
        echo ""
        echo "Your current directory is $mylocn, which should"
        echo "be the same as $HOME."
        echo ""
        echo "I'm going to sleep now."
        echo ""
        sleep 90
        exit

  4.
        % cqsub simple1
        Task id t51847 inserted into database nqedb.
        % cqstatl t51847
        -----------------------------
        NQE 3.3.0.9 Database Task Summary
        -----------------------------
        IDENTIFIER            NAME    SYSTEM-OWNER     OWNER    LOCATION            ST
        --------------------  ------- ---------------- -------- ------------------- ----
        t51847                simple1 scheduler.main   mjdurst  NQE Database        NNew
        % cqstatl t51847
        -----------------------------
        NQE 3.3.0.9 Database Task Summary
        -----------------------------
        IDENTIFIER            NAME    SYSTEM-OWNER     OWNER    LOCATION            ST
        --------------------  ------- ---------------- -------- ------------------- ----
        t51847                simple1 scheduler.main   mjdurst  NQE Database        NPend
        % cqstatl t51847
        -----------------------------
        NQE 3.3.0.9 Database Task Summary
        -----------------------------
        IDENTIFIER            NAME    SYSTEM-OWNER     OWNER    LOCATION            ST
        --------------------  ------- ---------------- -------- ------------------- ----
        t51847                simple1 lws.mcurie       mjdurst  NQE Database        NSche
        % cqstatl t51847
        -----------------------------
        NQE 3.3.0.9 Database Task Summary
        -----------------------------
        IDENTIFIER            NAME    SYSTEM-OWNER     OWNER    LOCATION            ST
        --------------------  ------- ---------------- -------- ------------------- ----
        t51847 (49939.mcurie) simple1 lws.mcurie       mjdurst  nqs@mcurie          NSubm

  5.
        % qstat 49939
        ---------------------------------
        NQS 3.3.0.9 BATCH REQUEST SUMMARY
        ---------------------------------
        IDENTIFIER    NAME    USER     LOCATION/QUEUE        JID  PRTY REQMEM REQTIM ST
        ------------- ------- -------- --------------------- ---- ---- ------ ------ ---
        49939.mcurie  simple1 mjdurst  serial_short@mcurie   3753   25    364   1800 R03
        % qstat 49939
        nqs-100 qstat: CAUTION
          Request <49939>: not found.
        % cqstatl t51847
        -----------------------------
        NQE 3.3.0.9 Database Task Summary
        -----------------------------
        IDENTIFIER            NAME    SYSTEM-OWNER     OWNER    LOCATION            ST
        --------------------  ------- ---------------- -------- ------------------- ----
        t51847 (49939.mcurie) simple1 monitor.main     mjdurst  NQE Database        NComp
        % ls -l
        total 12
        -rwxrw-r--   1 mjdurst  mpccc   365 Jan 15 10:47 simple1*
        -rw-r--r--   1 mjdurst  mpccc     0 Jan 15 10:50 simple1.e51847
        -rw-r--r--   1 mjdurst  mpccc  1285 Jan 15 10:50 simple1.l51847
        -rw-r--r--   1 mjdurst  mpccc  2638 Jan 15 10:50 simple1.o51847

  6.
        % cat simple1.l51847
        01/15 10:48:13 Submitting to queue <serial> by <mjdurst(12113)>
        01/15 10:48:13 Command line options: <-e /u1/mjdurst/tests/bat.simple/simple1.e51847 -J y -j /u1/mjdurst/tests/bat.simple/simple1.l51847 -lM 28mw 28mw -lT 1800 1800 -mu mjdurst@mcurie -o /u1/mjdurst/tests/bat.simple/simple1.o51847 -r simple1 -x -q serial>.
        01/15 10:48:13 Script file options: <-q serial -J y # keep job log>.
        01/15 10:48:15 Arrived in <serial@mcurie> from <mcurie>.
        01/15 10:48:15 Request-id is <49939.mcurie>, Request name=<simple1>.
        01/15 10:48:15 NQE Task ID is <nqedb.t51847>.
        01/15 10:48:15 Origin uid=<12113>, Target username=<mjdurst>.
        01/15 10:48:15 Account/Project name=<mpccc>, Account/Project ID=<105>.
        01/15 10:48:15 Submission security level=<0>, compartments=<0>.
        01/15 10:48:17 Account/Project name=<mpccc>, Account/Project ID=<105>.
        01/15 10:48:17 Arrived in <serial_short@mcurie> from <serial@mcurie>.
        01/15 10:48:20 Submission security level=<0>, compartments=<0>.
        01/15 10:48:20 Execution security level=<0>, compartments=<0>.
        01/15 10:48:23 Started, pid=<36967>, jid=<3753>, shell=</bin/csh>, umask=<18>.
        01/15 10:48:23 Running in queue <serial_short>.
        01/15 10:50:02 Finished.
        01/15 10:50:02 Returning stderr output file.
        01/15 10:50:03 Returning stdout output file.
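
The timestamps in a job log like the one above show how long a job waited and how long it ran. The following is a minimal Bourne-shell sketch of that arithmetic; the function names `hms_to_sec` and `elapsed` are made up for illustration, and the sketch assumes both times fall on the same day.

```shell
#!/bin/sh
# hms_to_sec HH:MM:SS
# Convert a clock time to seconds since midnight. awk is used so that
# fields with a leading zero (such as "08") are read as decimal numbers.
hms_to_sec() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# elapsed START END
# Print the number of seconds between two clock times on the same day.
elapsed() {
    echo $(( $(hms_to_sec "$2") - $(hms_to_sec "$1") ))
}

# Times copied from the job log: submitted 10:48:13, started 10:48:23,
# finished 10:50:02.
echo "waited `elapsed 10:48:13 10:48:23` seconds"   # prints: waited 10 seconds
echo "ran    `elapsed 10:48:23 10:50:02` seconds"   # prints: ran    99 seconds
```

The 99 seconds of run time are consistent with the `sleep 90` in the script plus startup and shutdown overhead.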

  7.
        % cat simple1.o51847
        mcurie.nersc.gov, a Cray T3E-900 running UNICOS/mk 2.0.3.32
        ------------------------------Contact Information------------------------------
        NERSC Web    http://www.nersc.gov/
        ESnet Web    http://www.es.net/
        ESCHER Web   http://www.nersc.gov/hardware/servers/vis-server.html
        <snip>
        CFS CONVERSION
        CFS to HPSS conversion was successfully completed on January 7, 1999.
        Users can access all of their CFS files on the new HPSS system, "archive".
        The cfs command on the NERSC Crays now points to the new HPSS interface, hsi.
        For more info on using hsi reference this URL:
        http://www.nersc.gov/hardware/storage/hsi.ch1.html.
        If your HPSS password fails or you don't have an HPSS account, contact the
        Account Support group at 1-800-66NERSC, option 2, or (510) 486-8612
        ------------------------------------------------------------------------------
        Your current working directory is /u/mpccc/mjdurst.

        Hello mjdurst, this is your shell script /usr/spool/nqe/spool/scripts/++BBI+++++0+++,
        running at Fri Jan 15 10:48:31 PST 1999.

        Your current directory is /u1/mjdurst, which should
        be the same as /u/mpccc/mjdurst.

        I'm going to sleep now.

        logout

  8. Why Batch Processing?
     • Batch queues are necessary:
       • On systems with many jobs
       • When scheduling is difficult
       • To assure greater throughput
     • Interactive jobs are limited
       • J90: 10 hrs.
       • T3E: < 64 PEs, < 30 minutes parallel (1 hr serial)
     • Some machines/processors batch-only
       • J90: all batch machines
       • T3E: many APP PEs (at night, almost all)

  9. The Batch Process
     • User creates shell script myscript
     • Submits to NQE with cqsub myscript
       • Returns NQE task id (e.g., t4913)
     • NQE forwards to NQS
       • J90: selects a machine (J90 wait time here)
     • NQS runs the job
       • Assign NQS job id (e.g., 6859.mcurie)
       • Select a batch queue
       • Place the job there (T3E wait time here)
       • Run it when appropriate
     • NQS/NQE returns job logs at completion

  10. Pipe Queues
      • Groups of batch queues
      • Direct to a pipe with #QSUB -q serial
        • Default is production
      • To see them: qstat -p
      • T3E:
        • serial, debug, production, long
      • J90:
        • production
        • batchk (for evening, weekend killeen queues)
        • batch{b,f,s,c,j} (not recommended)

  11. Preparing for Batch Submission
      • Write your shell script
        • C shell or Bourne/Korn shell
        • Starts in user’s home directory
        • Debug interactively (if possible)
      • Decide on needed resources
        • J90: CPU time, memory
        • T3E: amount of parallel, serial time; number of PEs
      • Select other #QSUB options
      • Check for appropriate queue and submit

  12. Essential options to cqsub (#QSUB directives)
      • J90:
        • -lM <mem>
        • -lT <time>
      • T3E:
        • -l mpp_p <num>
        • -l mpp_t <par_time>
        • -lT <ser_time>
        • don’t use -lM

  13. Other cqsub options
      • -J y: save job log (recommended)
        • -j <file>: save it in file
      • -mb: send mail when job starts (-me: ends)
      • -a <time>: hold job until after time
      • -o <file>: put standard output in file
        • default name: <batfile>.o<id>
      • -eo: combine standard error and output
        • makes output look like terminal record
      • -x: exports user’s environment to job
      • -s <shell>: specify shell
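
The options on the last two slides can be combined into one script header. The following is a hedged sketch of a J90 job in Bourne shell, not a NERSC-supplied template: the file name `j90job`, the run directory `$HOME/myrun` and the executable `my_program` are placeholders, and the limit values are simply those that appear in the job log on slide 6 (`28mw`, `1800` seconds). Check `man qsub` on the target machine for the accepted value formats.

```shell
#!/bin/sh
#
# file: j90job   (hypothetical name)
#
# Pipe queue; production is the default, shown here for clarity.
#QSUB -q production
# Memory limit (value copied from the slide 6 job log).
#QSUB -lM 28mw
# CPU time limit in seconds (value copied from the slide 6 job log).
#QSUB -lT 1800
# Keep the job log and merge standard error into standard output.
#QSUB -J y
#QSUB -eo

# Batch jobs start in the home directory, so move to the run directory
# first. RUNDIR is an assumed location; change it to suit.
RUNDIR=${RUNDIR:-${HOME:-.}/myrun}
if [ -d "$RUNDIR" ]; then
    cd "$RUNDIR"
fi

echo "Job running on `uname -n` in `pwd`"

# my_program stands in for your own executable.
if [ -x ./my_program ]; then
    ./my_program
else
    echo "my_program not found in `pwd`"
fi
```

Submitting it follows the pattern on slide 4: `cqsub j90job`.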

  14. Job Submission
      • cqsub <file>
      • Can give options at submission time
        • Override file options
        • Less dependable
      • If no file name, expects commands from terminal
        • Useful behavior in automated script generation & submission
      • Response: Task id t16839 inserted into database nqedb.
        • Task id useful for tracking with cqstatl.
      • Don’t break (Ctrl-C) out of cqsub!
        • Instead, allow to finish, then use cqdel
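
Because cqsub reads the job from standard input when no file name is given, a script can generate jobs and pipe them straight in. The sketch below shows one way to do that in Bourne shell. The function `make_job`, the case names and the time limit of 600 seconds are invented for illustration, and the generated job body is only a stand-in for real work.

```shell
#!/bin/sh
# make_job NAME SECONDS
# Print a small batch script on standard output. Only directives that
# appear on the slides are used (-q, -s, -lT, -J).
make_job() {
    name=$1
    seconds=$2
    cat <<EOF
#QSUB -q serial
#QSUB -s /bin/sh
#QSUB -lT $seconds
#QSUB -J y
echo "case $name starting"
sleep 5
echo "case $name done"
EOF
}

# Submit one job per case. The guard keeps the sketch harmless on a
# machine where cqsub is not installed.
if command -v cqsub >/dev/null 2>&1; then
    for c in case1 case2; do
        make_job "$c" 600 | cqsub
    done
fi
```

Running `make_job case1 600` on its own prints the generated script, which is a convenient way to inspect it before submitting anything.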

  15. Monitoring Jobs
      • cqstatl <taskid>
        • cqstatl -a | grep <username> (if no <taskid>)
      • ST column (“status”) indicates progress
        • NNew, NPend, NSche: still in NQE
        • NSubm: submitted to NQS
        • NComp: done
        • NTerm: killed
        • NFail: job failed (user or system error)
      • IDENTIFIER column holds NQS job id (once submitted)
      • cqstatl -f <taskid>: details for your job
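
The status codes and the IDENTIFIER column are easy to handle in a script. Here is a minimal Bourne-shell sketch, assuming cqstatl summary lines shaped like the ones on slide 4; the function names `nqe_status` and `nqs_id` are made up for illustration.

```shell
#!/bin/sh
# nqe_status CODE
# Translate an NQE status code (ST column of cqstatl) into the meaning
# given on the slide.
nqe_status() {
    case "$1" in
        NNew|NPend|NSche) echo "still in NQE" ;;
        NSubm)            echo "submitted to NQS" ;;
        NComp)            echo "done" ;;
        NTerm)            echo "killed" ;;
        NFail)            echo "failed" ;;
        *)                echo "unknown" ;;
    esac
}

# nqs_id LINE
# Print the NQS job id that appears in parentheses in a cqstatl summary
# line. Prints nothing if the task has not reached NQS yet.
nqs_id() {
    echo "$1" | sed -n 's/^[^(]*(\([^)]*\)).*$/\1/p'
}

# Summary line copied from slide 4.
line='t51847 (49939.mcurie) simple1 lws.mcurie mjdurst nqs@mcurie NSubm'
nqs_id "$line"        # prints: 49939.mcurie
nqe_status NSubm      # prints: submitted to NQS
```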

  16. Monitoring Jobs (cont’d)
      • T3E: qstat <jobid> once your job reaches NQS
        • cqstatl -d nqs = qstat
        • qstat -au <username> (if no <jobid>)
      • J90: qstat -h <hostname> <jobid>
        • Find hostname from NQS id (from cqstatl)
        • e.g., 2861.seymour
      • ST column (“status”) now indicates
        • RNN: Running (with NN processes)
        • Qxy: waiting in the queue (xy encodes reason)
        • man qstat to decode
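
On the J90s the host name has to be taken from the NQS id before qstat can be called. The following is a small Bourne-shell sketch of that step and of a first-letter reading of the ST column. The function names are invented, and the sketch only prints the qstat command rather than running it.

```shell
#!/bin/sh
# nqs_host ID: host part of an NQS id such as 2861.seymour
nqs_host() {
    echo "${1#*.}"
}

# nqs_seq ID: numeric part of an NQS id
nqs_seq() {
    echo "${1%%.*}"
}

# j90_qstat_cmd ID: print the qstat command line for a J90 job
j90_qstat_cmd() {
    echo "qstat -h `nqs_host "$1"` `nqs_seq "$1"`"
}

# nqs_state CODE: coarse meaning of the qstat ST column. Only the R and Q
# families from the slide are decoded; anything else is left to man qstat.
nqs_state() {
    case "$1" in
        R*) echo "running" ;;
        Q*) echo "queued" ;;
        *)  echo "other (see man qstat)" ;;
    esac
}

j90_qstat_cmd 2861.seymour   # prints: qstat -h seymour 2861
nqs_state R03                # prints: running
```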

  17.
        % cqstatl -a
        -----------------------------
        NQE 3.3.0.9 Database Task Summary
        -----------------------------
        IDENTIFIER            NAME    SYSTEM-OWNER     OWNER    LOCATION            ST
        --------------------  ------- ---------------- -------- ------------------- ----
        t48217 (46356.mcurie) PCM     lws.mcurie       alewife  nqs@mcurie          NSubm
        t48713 (46848.mcurie) third   lws.mcurie       u6670    nqs@mcurie          NSubm
        t49200 (47518.mcurie) int566A lws.mcurie       u61176   nqs@mcurie          NSubm
        t49245 (47368.mcurie) xqcd_ho lws.mcurie       snm      nqs@mcurie          NSubm
        t50349 (48480.mcurie) int650  lws.mcurie       u61176   nqs@mcurie          NSubm
        t50881 (49338.mcurie) lte34-0 lws.mcurie       lungfish nqs@mcurie          NSubm
        <snip>
        t51870                case17c scheduler.main   salmon   NQE Database        NTerm
        t51871                case1c9 scheduler.main   salmon   NQE Database        NFail
        t51872                case16c scheduler.main   salmon   NQE Database        NPend
        t51873 (49967.mcurie) q_lsms  lws.mcurie       marlin   nqs@mcurie          NSubm
        t51875                case11c scheduler.main   salmon   NQE Database        NPend
        t51877 (49970.mcurie) G08     lws.mcurie       u66870   nqs@mcurie          NSubm
        t51878 (49971.mcurie) qHsig.3 lws.mcurie       bass     nqs@mcurie          NSubm
        t51881 (49975.mcurie) Jobge_b lws.mcurie       carp     nqs@mcurie          NSubm
        t51884 (49979.mcurie) job16.a lws.mcurie       adt      nqs@mcurie          NSubm
        t51885 (49980.mcurie) run_dyn lws.mcurie       flounder nqs@mcurie          NSubm
        t51886 (49981.mcurie) jupiter lws.mcurie       grouper  nqs@mcurie          NSubm
        t51887 (49983.mcurie) JobCZ.b lws.mcurie       tarpon   nqs@mcurie          NComp

      (output greatly abridged)

  18.
        % qstat -a
        ---------------------------------
        NQS 3.3.0.9 BATCH REQUEST SUMMARY
        ---------------------------------
        IDENTIFIER    NAME     USER     LOCATION/QUEUE        JID  PRTY REQMEM REQTIM ST
        ------------- -------  -------- --------------------- ---- ---- ------ ------ ---
        49979.mcurie  job16.ag adt      pe32@mcurie           4164   25    255   1520 R03
        49936.mcurie  akr520   u6677    pe32@mcurie           3732   25    323   1800 R03
        49964.mcurie  case14c9 salmon   pe32@mcurie           3944   25    255   1795 R03
        49967.mcurie  q_lsms   marlin   pe32@mcurie                 999  28672   1800 Cge
        49983.mcurie  JobCZ.bb tarpon   pe32@mcurie                 317  28672   1800 Qge
        49984.mcurie  bitgc11  u62098   pe32@mcurie                 244  28672   1800 Qge
        49985.mcurie  bitgc11  u62098   pe32@mcurie                 242  28672   1800 Qge
        49362.mcurie  Job_a2   carp     pe128@mcurie          5308   25    323   1800 R03
        49335.mcurie  script.2 sturgeon pe256@mcurie                999  28672   1800 Qqs
        49033.mcurie  uo2_3h2o dorado   gc128@mcurie                ---  28672   7200 Hop
        49255.mcurie  run010_A bluegill long128@mcurie        4617   25    255   1800 R03
        49276.mcurie  sg3D10   aku      long128@mcurie              999   4096   1800 Qce
        49277.mcurie  sg3D10   aku      long128@mcurie              999   4096   1800 Qqu
        49867.mcurie  run_t4   flounder long128@mcurie               70  28672   1800 Cgg
        no pipe queue entries

      (output greatly abridged)

  19.
        % qstat -f pe32
        ------------------------------------
        NQS 3.3.0.9 BATCH QUEUE: pe32@mcurie      Status: ENABLED/RUNNING
        ------------------------------------
          Priority: 15
        <ENTRIES>
          Total: 17
          Running: 5    Queued: 12    Waiting: 0
          Holding: 0    Arriving: 0   Exiting: 0
        <RUN LIMITS>
          Queue: 13     User: 2       Group: 20
        <COMPLEX MEMBERSHIP>
          regular
        <LOCAL SCHEDULER EXTENSIONS>
          Miser Queue: unspecified    Scheduling Window: 0:0.0
        <RESOURCE USAGE>
                                     LIMIT        ALLOCATED
          Memory Size                unlimited    143360kw
          Quick File Space           0b           0kw
          MPP Processor Elements     416          60
        <REQUEST LIMITS>
                                     PER-PROCESS          PER-REQUEST
          type a Tape Drives         unspecified (0)
          type b Tape Drives         unspecified (0)
          type c Tape Drives         unspecified (0)
          type d Tape Drives         unspecified (0)

      (cont’d)

  20.
          type e Tape Drives         unspecified (0)
          type f Tape Drives         unspecified (0)
          type g Tape Drives         unspecified (0)
          type h Tape Drives         unspecified (0)
          Core File Size             unspecified (256mw)
          Data Size                  unspecified (256mw)
          Permanent File Space       20gb                 25gb
          Memory Size                28mw                 29mw
          Nice Increment             5
          Quick File Space           unspecified (0b)     0b
          Stack Size                 unspecified (256mw)
          CPU Time Limit             3600sec              7200sec
          Temporary File Space       unspecified (0b)     unspecified (0b)
          Working Set Limit          unspecified (256mw)
          MPP Processor Elements     32
          MPP Time Limit             15000sec             15000sec
          Shared Memory Limit        unspecified (0mw)
          Shared Memory Segments     unspecified (0)
          MPP Memory Size            unspecified (256mw)  unlimited
        <ACCESS>
          Route: Pipe Only    Users: Unrestricted
        <CUMULATIVE TIME>
          System Time: 3563114615067464.00 secs
          User Time: 281421545294442428.00 secs

      (qstat -f output, cont’d from previous slide)

  21. Troubleshooting
      • No task id returned
        • Typically means NQE down
        • message like “Can’t connect”
      • Job doesn’t make it to NQS: try cqstatl <taskid>
        • NFail usually indicates submission error
        • Nabort could be a system problem
        • No listing if many days old (NQE database is purged frequently)
      • Stuck in NPend status
        • J90: Many jobs ahead of you?
        • T3E: over pipe queue limit?

  22. Troubleshooting (cont’d)
      • Stuck in NSubm: use qstat
        • Q: normal on T3E, rare on J90
        • T3E:
          • Hop can be allocation problem
          • C (“checkpointed”) may be daily shuffling
          • May need both pslist and qstat -m to sort it all out
      • Job crashes
        • Read job log, stdout, stderr
        • ...limit exceeded: ran out of time (or memory, or…)
      • Job vanishes
        • Did machine(s) crash? If not, collect info and contact Consultants

  23. Pointers
      • Batch job is like a login session
        • Starts in your home directory
        • Uses your startup files
        • But doesn’t inherit environment (unless you use -x)
      • Environment variable ENVIRONMENT
        • Not set in interactive work, set to BATCH in batch jobs
        • Can exclude parts of startup files
      • /usr/tmp faster than home directory
        • $TMPDIR vanishes (avoids littering)
        • Just one quota for $TMPDIR, rest of /usr/tmp/
        • Can’t monitor batch J90 temp file systems
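
The ENVIRONMENT variable lets a startup file skip interactive-only setup when it is read by a batch job. The sketch below shows the idea for a Bourne-shell startup file such as `.profile`; the helper name `is_batch` and the prompt setting are illustrative assumptions, and csh users would put the equivalent test in their own startup files.

```shell
#!/bin/sh
# is_batch: succeed only when running inside a batch job, where the
# slide says ENVIRONMENT is set to BATCH. The ${VAR:-} form keeps the
# test safe when the variable is unset, as it is in interactive work.
is_batch() {
    [ "${ENVIRONMENT:-}" = "BATCH" ]
}

if is_batch; then
    # Batch job: skip terminal setup, prompts and anything that expects
    # a user at the keyboard.
    :
else
    # Interactive session: example of a setting that only makes sense
    # at a terminal.
    PS1='$ '
fi
```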

  24. Pointers (cont’d)
      • Don’t submit blindly
        • Debug executables, scripts first
        • Don’t trust inherited shell scripts
        • Spend time with man pages
      • J90: large memory jobs should/must multitask
      • T3E: reduce serial time in parallel jobs
        • “Stage” HPSS retrievals (dmget)
        • Submit follow-on serial jobs within your job
