Running jobs on SDSC Resources Krishna Muriki Oct 16, 2006 kmuriki@sdsc.edu SDSC User Services
Path directions • DataStar system overview • Batch job environment • Simple job compilation • Job queues/scripts • Job submission • Access to HPSS resources • Access to IA64 cluster, job management
Batch/Interactive computing • Batch job environment • Job Manager – LoadLeveler (tool from IBM) • Job Scheduler – Catalina (SDSC in-house tool) • Job Monitoring – various commands • Batch & interactive use are on different nodes • DataStar login nodes • dslogin.sdsc.edu • dspoe.sdsc.edu • dsdirect.sdsc.edu
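A minimal LoadLeveler job script for a system like DataStar might look like the sketch below. The class name, node/task counts, and executable name are illustrative placeholders, not values from this talk; check the queue limits for your account before using them.

```shell
#!/bin/bash
# Sketch of a LoadLeveler batch script (directives start with "# @").
# Values here are assumptions; adjust class, nodes, and tasks to your queue.
# @ job_type         = parallel
# @ class            = normal
# @ node             = 2
# @ tasks_per_node   = 8
# @ wall_clock_limit = 00:30:00
# @ output           = myjob.$(jobid).out
# @ error            = myjob.$(jobid).err
# @ queue

# poe launches the MPI executable across the allocated tasks
poe ./my_mpi_program
```

The `# @ queue` directive marks the end of the keyword section; everything after it is the shell script that runs on the compute nodes.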
Queues & nodes • Start with dspoe (interactive queues) • Do production runs from dslogin (normal & normal32 queues) • Use the express queues from dspoe when you need a job to run right away • Use dsdirect for special needs
Now let's do it! • Example files are located here: • /gpfs/projects/workshop/running_jobs • Copy the whole directory • Use the Makefile to compile the source code • Edit the parameters in the job submission scripts • Communicate with the job manager in its own language
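The copy-and-compile steps above might look like the following shell session. The directory path comes from the slide; the job-script filename is a hypothetical stand-in for whatever script ships in the examples directory.

```shell
# Copy the workshop examples into your home area
cp -r /gpfs/projects/workshop/running_jobs ~/running_jobs
cd ~/running_jobs

# Build the example source with the provided Makefile
make

# Edit the job script's parameters before submitting
# ("run.ll" is a hypothetical name; use the script in the directory)
vi run.ll
```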
Job Manager language • Ask it to show the queue: llq • Ask it to submit your job to the queue: llsubmit • Ask it to cancel your job in the queue: llcancel • Special (more useful) commands from SDSC's in-house tool, Catalina (please bear with me – I'm slow): • 'showq' to look at the status of the queue • 'show_bf' to look at the backfill window opportunities
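A few typical invocations of these commands are sketched below; the script name and job id are made-up examples, and the Catalina commands take no arguments in this basic form.

```shell
llq                    # list all jobs in the LoadLeveler queue
llq -u $USER           # list only your own jobs
llsubmit run.ll        # submit a job script (hypothetical filename)
llcancel ds100.1234.0  # cancel a job by its job id (example id)

showq                  # Catalina: overall queue status
show_bf                # Catalina: current backfill windows
```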
Access to HPSS - 1 • What is HPSS: the High Performance Storage System (HPSS) is the centralized, long-term data storage system at SDSC • Currently stores more than 3 PB of data (as of June 2006) • Total system capacity of 7.2 PB • Data added at an average rate of 100 TB per month (between Aug '05 and Feb '06)
Access to HPSS - 2 • First thing – set up your authentication: • run the 'get_hpss_keytab' script • Learn the HPSS language to talk to it: • hsi • htar
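Basic hsi and htar usage might look like this; the file and directory names are illustrative. hsi uses a `local : remote` argument pair, while htar follows tar-like flags.

```shell
# Store a local file into HPSS, then retrieve it
hsi put bigfile.dat : project/bigfile.dat
hsi get bigfile.dat : project/bigfile.dat

# Bundle a whole directory into an HPSS archive with htar
htar -cvf results.tar results/
# ...and extract it again later
htar -xvf results.tar
```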
Batch/Interactive computing on IA64 • Batch job environment • Job Manager – PBS (open-source tool) • Job Scheduler – Catalina (SDSC in-house tool) • Job Monitoring – various commands & 'Clumon' • Batch & interactive use are on different nodes • IA64 login nodes • tg-login1.sdsc.edu (alias to tg-login.sdsc.edu) • tg-login2.sdsc.edu • tg-c127.sdsc.edu, tg-c128.sdsc.edu, • tg-c129.sdsc.edu & tg-c130.sdsc.edu
Queues & Nodes • Total of around 260 nodes • With 2 processors each • All in a single batch queue – 'dque' • That's sufficient – now let's do it! • Example files in • /gpfs/projects/workshop/running_jobs • PBS commands – qstat, qsub, qdel
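A PBS batch script for this cluster might be sketched as follows; the output filenames and executable are assumed, and the resource line mirrors the interactive example on the next slide (4 nodes, 2 cpus each).

```shell
#!/bin/bash
# Sketch of a PBS batch script; adjust names and limits to your job.
#PBS -q dque
#PBS -l nodes=4:ppn=2
#PBS -l walltime=00:30:00
#PBS -o myjob.out
#PBS -e myjob.err
#PBS -V

# PBS starts the job in $HOME; change to the submission directory
cd $PBS_O_WORKDIR

# 4 nodes x 2 cpus = 8 MPI tasks; $PBS_NODEFILE lists the assigned nodes
mpirun -np 8 -machinefile $PBS_NODEFILE ./parallel-test
```

Submit with `qsub script.pbs`, watch it with `qstat`, and remove it with `qdel <jobid>`.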
Running Interactive • Interactive use is via PBS: qsub -I -V -l walltime=00:30:00 -l nodes=4:ppn=2 • This requests 4 nodes for interactive use (using 2 cpus/node) for a maximum wall-clock time of 30 minutes. Once the scheduler can honor the request, PBS responds with "ready" and gives the node names. • Once nodes are assigned, the user can run any interactive command. For example, to run an MPI program, parallel-test, on the 4 nodes (8 cpus): mpirun -np 8 -machinefile $PBS_NODEFILE parallel-test
References • See all web links at • http://www.sdsc.edu/user_services