Getting acquainted with PDC. Nils Smeds <smeds@pdc.kth.se>. Using PDC resources. Acquiring information http://www.pdc.kth.se Guided Tours http://www.pdc.kth.se/support/ AFS Strindberg IBM-SP Helpdesk FAQs Contact information pdc-staff@pdc.kth.se 08-790 7800. http://www.pdc.kth.se/doc.
Getting acquainted with PDC
Nils Smeds <smeds@pdc.kth.se>
Using PDC resources • Acquiring information http://www.pdc.kth.se • Guided Tours http://www.pdc.kth.se/support/ • AFS • Strindberg IBM-SP • Helpdesk • FAQs • Contact information • pdc-staff@pdc.kth.se • 08-790 7800
Your environment as a user
• File systems
  • AFS: home directories
  • GPFS: parallel file system (IBM SP)
  • HSM: hierarchical storage management
  • /scratch: scratch file systems
• Modules: handle $PATH, $MANPATH
    module add sp2 local
    module show local
• E-mail: when you leave
  • Create a publicly readable $HOME/.forward
    http://www.pdc.kth.se/support/misc-tour.html#EMAIL
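The effect of `module add` on the search paths can be pictured with a small shell sketch. It only mimics the outcome and is not how the Modules package is implemented; the helper name `prepend_path` and the directories under `/pdc/example` are made up for illustration.

```shell
#!/bin/bash
# Illustration only: mimics the *effect* of "module add" on a search path.
# prepend_path VAR DIR puts DIR first in the colon-separated list $VAR,
# unless DIR is already in the list.
prepend_path() {
    local var=$1 dir=$2
    local current=${!var}
    case ":$current:" in
        *":$dir:"*) ;;                                   # already listed
        *) printf -v "$var" '%s' "$dir${current:+:$current}"
           export "$var" ;;
    esac
}

# Hypothetical directories; a real module file defines its own.
prepend_path PATH    /pdc/example/bin
prepend_path MANPATH /pdc/example/man
echo "PATH now starts with: ${PATH%%:*}"
```

Running `module show local` on the system itself shows what a module really changes.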
Kerberos commands
• kauth: proves your identity
    > ./kauth -n nissej@NADA.KTH.SE -l 60
• klist: lists your Kerberos tickets
    > ./klist
    Ticket file: /tmp/tkt58016
    Principal: nissej@NADA.KTH.SE
    Issued           Expires          Principal
    Mar 9 12:18:59   Mar 9 13:18:59   krbtgt.NADA.KTH.SE
    Mar 9 12:19:25   Mar 9 13:18:59   rcmd.r11n07.pdc.kth.se
• kdestroy: removes your ticket file; klist then reports
    > ./klist
    Ticket file: /tmp/tkt58016
    klist: No ticket file (tf_util)
• kpasswd: changes your password
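A script can check for a ticket before it starts anything that depends on Kerberos. The sketch below assumes the "No ticket file" wording shown in the klist output above; the function name `has_ticket` is invented here, and it reads the klist output on standard input so that it can be tried without Kerberos installed.

```shell
#!/bin/bash
# Sketch: decide from klist output (on stdin) whether a ticket file exists.
# Assumes klist prints "No ticket file" when there is none, as on the slide.
# Note: empty input also counts as "ticket present".
has_ticket() {
    ! grep -q 'No ticket file'
}

# Typical use with the travel kit installed (not run here):
#   ./klist 2>&1 | has_ticket || ./kauth -n nissej@NADA.KTH.SE -l 60
sample='Ticket file: /tmp/tkt58016
klist: No ticket file (tf_util)'
if printf '%s\n' "$sample" | has_ticket; then
    echo "ticket present"
else
    echo "no ticket: run kauth first"
fi
```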
Commands that rely on Kerberos
• Getting a shell
    > ./rxtelnet -l username strindberg.pdc.kth.se
    > ./telnet -l username strindberg.pdc.kth.se
• Transferring files
    > ./ftp strindberg.pdc.kth.se
    Connected to r11n07-f.pdc.kth.se
    220 r11n07.pdc.kth.se FTP server ready.
    Name (strindberg.pdc.kth.se:smeds): <RET>
    S:232- //PDC//
    S:232- //PDC// Welcome to Strindberg, an IBM SP ...
    S:232 User smeds logged in.
    ftp> kauth
    Password for smeds@NADA.KTH.SE: mypassword
    S:200 Tickets will be destroyed on exit
    ftp> binary
    ftp> put filename.dat
    ftp> get otherfile.dat
    ftp> quit
AFS commands
http://www.pdc.kth.se/support/afs-tour.html
• tokens: lists your AFS tokens
    smeds> kauth
    smeds@NADA.KTH.SE's Password: mypassword
    smeds> unlog
    smeds> tokens
    Tokens held by the Cache Manager:
    --End of list--
    smeds> afslog
    smeds> tokens
    Tokens held by the Cache Manager:
    (AFS ID 22557) tokens for afs@nada.kth.se [Expires Aug 19 03:38]
    (AFS ID 22557) tokens for afs@pdc.kth.se [Expires Aug 19 03:38]
    --End of list--
    smeds>
• kauth and kdestroy automatically run afslog and unlog, respectively
More AFS
• fs: directory access management
    smeds> fs setacl directoryname username rl
    smeds> fs listacl directoryname
    smeds> fs setacl directoryname username none
    smeds> fs setacl directoryname system:anyuser rl
    smeds> fs help
    smeds> fs setacl -h
• pts: ACL group management
    smeds> pts mem username
    smeds> pts creategroup username:bs106
    smeds> pts adduser mybuddy mygroup
    smeds> pts examine mygroup
    smeds> pts adduser -h
• Putting it all together
    smeds> fs setacl MyProject smeds:buddies rl
    smeds> fs setacl MyProject smeds:REALbuddies rlidwk
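The pts and fs steps can be strung together for one project directory. The following sketch only prints the commands so they can be reviewed before anything is changed; the function name `share_cmds` and the member names `mike` and `anna` are illustrative assumptions.

```shell
#!/bin/bash
# Sketch: print (not execute) the pts/fs commands that give a group of
# colleagues read access to a project directory.
# Usage: share_cmds DIRECTORY OWNER GROUP MEMBER...
share_cmds() {
    local dir=$1 owner=$2 group=$3
    shift 3
    echo "pts creategroup $owner:$group"
    local member
    for member in "$@"; do
        echo "pts adduser $member $owner:$group"
    done
    echo "fs setacl $dir $owner:$group rl"
    echo "fs listacl $dir"
}

# Hypothetical members; check the output, then run the lines by hand on a
# machine where you hold AFS tokens.
share_cmds MyProject smeds buddies mike anna
```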
HSM usage
• Use tar to pack many files into one file, which can then be saved in HSM
    smeds> module add hsm
    smeds> hsmls -l
    smeds> tar cvf /scratch/MyAnalysis.tar Results-980812/Run1/
    smeds> hsmcopyto /scratch/MyAnalysis.tar Res-980812-1.tar
• HSM location is kallsup:/hsm/home/u/username/..., see output from hsmmyhome
• You may use kerberized rcp to move files to and from this location.
• Online help is available:
    smeds> hsmls -h
    smeds> hsmcopyfrom -h
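Since a file in HSM is slow to fetch back, it can be worth checking the tar file before copying it there. The sketch below packs a directory, verifies that the archive can be listed, and only then prints the hsmcopyto line to run; `pack_for_hsm` is a made-up helper name and the demonstration uses throw-away data in a temporary directory instead of /scratch.

```shell
#!/bin/bash
# Sketch: pack a results directory into one tar file, check that the
# archive is readable, and only then print the hsmcopyto command to run.
# pack_for_hsm RESULTS_DIR TAR_FILE HSM_NAME
pack_for_hsm() {
    local src=$1 tarfile=$2 hsmname=$3
    tar cf "$tarfile" "$src" || return 1
    tar tf "$tarfile" > /dev/null || return 1      # archive must list cleanly
    echo "hsmcopyto $tarfile $hsmname"
}

# Demonstration on throw-away data.
demo=$(mktemp -d)
mkdir -p "$demo/Results-980812/Run1"
echo "42" > "$demo/Results-980812/Run1/out.dat"
(cd "$demo" && pack_for_hsm Results-980812/Run1 MyAnalysis.tar Res-980812-1.tar)
rm -rf "$demo"
```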
Strindberg usage • http://www.pdc.kth.se/cgi-bin/strindberg-usage.pl
Node types
• http://www.pdc.kth.se/compresc/hardware/
• Batch nodes (T)
  • 160 MHz (640 MFlop/s), 256 MB RAM, 2 GB /scratch
• Batch nodes with more memory (W, Z)
  • 160 MHz (640 MFlop/s), 512/1024 MB RAM, 2 GB /scratch
• 4-way SMP nodes (M)
  • 4 × 332 MHz (4 × 664 MFlop/s), 512 MB RAM, 4 GB /scratch
• 8-way SMP nodes (N, H)
  • 8 × 222 MHz (8 × 888 MFlop/s), 4/16 GB RAM
• Serial nodes (G, S)
  • One 135 MHz wide node with 2 GB RAM, some 67 MHz nodes
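The peak figures follow from clock rate times floating-point results per cycle times number of CPUs. The sketch below reads the SMP entries as 4 × 332 MHz and 8 × 222 MHz, and the per-cycle factors are inferred from the slide's own numbers (640/160 = 4, 664/332 = 2, 888/222 = 4) rather than taken from vendor documentation; `peak_mflops` is an illustrative name.

```shell
#!/bin/bash
# Peak rate = clock (MHz) x floating-point results per cycle x CPUs.
# The per-cycle factors are inferred from the slide, not from IBM manuals.
peak_mflops() {
    local mhz=$1 flops_per_cycle=$2 cpus=$3
    echo $(( mhz * flops_per_cycle * cpus ))
}

echo "T node: $(peak_mflops 160 4 1) MFlop/s"   # 640
echo "M node: $(peak_mflops 332 2 4) MFlop/s"   # 2656 for the whole node
echo "N node: $(peak_mflops 222 4 8) MFlop/s"   # 7104 for the whole node
```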
Login node(s)
  • The node of the SP that you are connected to after
      ./rxtelnet -l name strindberg.pdc.kth.se
      ./rxtelnet -l name august.pdc.kth.se
      ./rxtelnet -l name nf01r01.pdc.kth.se
• Interactive nodes
  • Nodes that are shared among several users. Used for e.g. debugging and compiling.
      spattach -i -p#
  • These nodes must be used with IP communication:
      export MP_EUILIB=ip
• Dedicated (or batch) nodes
  • Nodes used for production codes and/or longer pre/post-jobs
      spsubmit -p# -t time -c CAC scriptfile
      spattach -p# -t time -c CAC
A full interactive example
    rxtelnet strindberg.pdc.kth.se    (New window)
    klist
    kauth
    cd workdir
    mpcc -g -o myprog myprog.c
    spattach -i -p5    (wait)
    ./myprog
    ./myprog -procs 3
    ./myprog -procs 3 -stdoutmode ordered -labelio yes
Interacting with the EASY scheduler
    smeds> spsubmit -h
    spsubmit [-h][-inWvCb][-c cac][-I#][-s#][[-p# -t#][-j#][-M] file[args]]
    -h: help
    -p processors: number of processors. (Example: -p2W)
    -t minutes: number of minutes (Wall-clock)
    -j Job Type: available job types mpi, task, pvm3...
    -c CAC: optional, submit for accounting group cac.
    -I InitialDir: optional, default current working directory.
    -b: optional, hold job until all jobs completed
    -i: optional, use IP instead of UserSpace.
    -v: optional, verbose.
    -C: optional, commit before submit.
    -s Filename: optional, save EASY generated script
    [...]
    program: executable or script.
    args: optional arguments to program.
    User smeds can specify: staff free ta.smeds
spsubmit examples
• Submitting an MPI program
    smeds> spsubmit -p 4T -t 30 -j mpi ./mympiprog
• Saving the generated script for later re-use
    smeds> spsubmit -p 4T -t 30 -j mpi -s myscript.esy ./mympiprog
• To have a mix of nodes and start on a Z-node
    smeds> spsubmit -p 1Z8T -t 30 -j mpi ./mympiprog "arg1 'arg2 here'"
• Redirecting STDOUT for your program
    smeds> spsubmit -p 4T -t 30 -j mpi ./mympiprog "> job.out"
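The node specification given to -p is a sequence of count and node-type pairs, as in 4T or 1Z8T above. A small helper can assemble it from separate arguments; `node_spec` is an invented name, and the sketch prints the spsubmit line instead of submitting it.

```shell
#!/bin/bash
# Sketch: build the argument of -p from "count type" pairs,
# e.g. node_spec 1 Z 8 T gives 1Z8T.
node_spec() {
    local spec=""
    while [ $# -ge 2 ]; do
        spec="$spec$1$2"
        shift 2
    done
    echo "$spec"
}

# Print the command line instead of submitting, so it can be checked first.
echo "spsubmit -p $(node_spec 1 Z 8 T) -t 30 -j mpi ./mympiprog"
```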
The batch script file
#!/bin/bash
#------ Customizable part ------
# (Use submitting directory as working directory)
cd $SP_INITIALDIR
OUT=MyProgram.out
#------ End customizable part ------
#------ Generic part ------
PROGRAM="MyProgram" ; PROGRAMDIR="$HOME/Public/MyProgramDir"
export MP_HOSTFILE=$SP_HOSTFILE
export MP_PROCS=$SP_PROCS
export MP_EUILIB=us ; export MP_EUIDEVICE=css0
export MP_INFOLEVEL=0
export MP_CSS_INTERRUPT="yes"
export TMPDIR=/scratch
echo "Executing $PROGRAM in directory `pwd` at `date`"
poe ${PROGRAMDIR}/${PROGRAM} > $OUT
echo "Program finished `date`"
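The script depends on variables that the scheduler sets (SP_INITIALDIR, SP_HOSTFILE, SP_PROCS). A guard placed before the poe line can stop the job early with a readable message when one of them is missing, for instance when the script is started by hand. This is a sketch under that assumption; `check_env` is not part of EASY.

```shell
#!/bin/bash
# Sketch: report every unset or empty variable among the given names.
# check_env NAME... returns non-zero if at least one is missing.
check_env() {
    local name missing=0
    for name in "$@"; do
        if [ -z "${!name}" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return $missing
}

# Inside a job script this would come before the poe line.
if check_env SP_INITIALDIR SP_HOSTFILE SP_PROCS; then
    echo "scheduler environment looks complete"
else
    echo "not started by the scheduler? see the messages above"
fi
```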
http://www.pdc.kth.se/info/qwatch/
• A snapshot of the different queues at PDC
• Updated at regular intervals
• Generated from the same information you get using the command spq
    smeds> spq -a
    smeds> spq -r
    smeds> spq -u smeds
Scheduling limits
    smeds> spq -h
    Usage: spq [-h] [-l] [-L] [-r] [-q] … ...
    smeds> spq -l
    NICKNAME SATURATE CAC NJOB Wall Total
    - - weekend - r1149 1 16h
    - - weekend saturate tf109 3 169h40
    - - night saturate gw11 4 48h40
    - - day - gw11 1 3h30
    . . .
    smeds> spq -L
    INTERVAL NICKNAME MAXNJOB MAXWALLTIME
    [15h,60h] weekend - - - 30h
    [4h,15h] night - - - 16h
    [1h,4h] day - - - 16h
    . . .
    [0m01s,2h] Nshort - - 4 -
The concept of CACs
• Computer cycle "accounts"
    smeds> cac members smeds
    CAC groups smeds is a member of:
    ta.smeds staff summer-2000 free
    smeds> cac -h
    smeds> spjobsummary -c summer-2000
    usr    jid     req     npe  treq  tstart  r-cpu  ucpu
    smeds  ######  1G2Z2T  5    0h30  yyhhmm  2h30   1h49
    mike   ######  4T      4    0h15  yyhhmm  1h     0h56
    . . .
    smeds> cac -h
    smeds> spjobsummary -u smeds -f 200003 -l
    smeds> spjobsummary -h
    smeds> spsummary -h
Compilers (IBM SP)
• cc, mpcc
  • IBM C compiler. mpcc adds the flags needed to compile MPI parallel programs: include file search path, tagging the binary as parallel, etc.
• xlC, mpCC
  • IBM C++ compiler. Not fully ANSI compliant.
• xlf, mpxlf, xlf90, mpxlf90, f90
  • Fortran, Fortran 90/95 compilers
• Reentrant code generation (thread safety)
  • xlc_r, xlf_r, mpxlf90_r …
• OpenMP directives are currently available only in Fortran
Code optimization
• -O2, -O3
  • Code restructuring and inlining. Level 3 may reorder arithmetic operations.
• -qhot
  • Higher-order transformations of the generated code. Uses cache size information. Occasionally slows code down.
• -qipa, -O4, -O5
  • Interprocedural analysis. Mainly code inlining across file boundaries.
    -O4 => -O3 -qhot -qipa -qtune=arch -qcache=arch
• -qsmp=omp, -qsmp=auto, -qreport=smplist
  • All of the above. Long compile time. Needs thorough checking of results
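One way to do the "thorough checking of results" is to keep the output of a conservatively compiled build as a reference and compare the optimized build against it with a tolerance, since reordered arithmetic can change the last digits. The sketch below assumes output files with one number per line; `same_results` and the file names in the comments are placeholders.

```shell
#!/bin/bash
# Sketch: compare two files of numbers, one value per line. Values agree
# if they differ by at most TOLERANCE * max(|reference|, 1).
# same_results REFERENCE CANDIDATE [TOLERANCE]
same_results() {
    local ref=$1 new=$2 tol=${3:-1e-10}
    paste "$ref" "$new" | awk -v tol="$tol" '
        BEGIN   { bad = 0 }
        NF != 2 { bad = 1; exit }                 # different line counts
        {
            d = $1 - $2; if (d < 0) d = -d
            s = ($1 < 0) ? -$1 : $1
            if (d > tol * (s > 1 ? s : 1)) { bad = 1; exit }
        }
        END { exit bad }'
}

# Possible use (program and file names are placeholders):
#   xlf90 -O2 -o prog_O2 prog.f && ./prog_O2 > ref.out
#   xlf90 -O4 -o prog_O4 prog.f && ./prog_O4 > new.out
#   same_results ref.out new.out 1e-8 || echo "results differ"
```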
The lab session
• The object of the exercise is to get familiar with the PDC environment through hands-on experience
• The lab session has three parts
  • Install a Kerberos travel kit
    • The workstation is a Sun Solaris 2.6 workstation
    • Install a travel kit and verify that you can use it to log in to the SP
  • Experiment with file systems and storage media
    • Try AFS, tokens and ACLs
    • Use the HSM data migration system
  • Run a Fortran 90 program on the IBM SP2
    • Serial and parallel, interactive and batch
• Play, experiment, think and ask!
Topics that cannot be covered in this talk
• Compiler options
  • Optimization options, linker options, file name convention options
• Programming tools
  • Tracing, sampling, debugging, F90 conversion
  • See http://www.pdc.kth.se/compresc/software
  • Totalview, Foresys, Vampir, Dimemas
• Running parallel programs on other computers
  • Running MPICH in the NADA computer lab rooms
Totalview and "How to trick the OS"
• Have the program read from the keyboard as early in the program flow as possible after MPI_Init()
• Start the process and attach to the running poe process
    module add totalview
    ./myprog &                  (Start your program)
    totalview -no_stop_all &    (Or start totalview in another window)
• "Show all unattached processes"
• Attach to the poe process; the debugger locates all MPI processes
• Select one of the MPI processes (not the poe process)
• Set breakpoints later in the program flow if you want
• In one of the MPI process windows, say "Go group <G>"
• Give the program the input it is waiting for
Running MPI programs on SUNs (locally)
MPICH 1.2.0: Argonne National Laboratory reference MPI implementation
    module add workshop/5.0
    module add mpich/1.2
    mpicc -o myprog-sun myprog.c
    mpirun -np 4 -machinefile LOCAL ./myprog-sun
The machinefile is reused up to the number of processes requested by -np
Further information on MPI at KTH: http://www.nada.kth.se/datorer/unix/
Machinefile LOCAL:
    red01.nada.kth.se
    red01.nada.kth.se
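"Reused" can be read as cycling through the machinefile: with N lines, process i lands on line i mod N. The sketch below prints that mapping for a given -np. It is a simplified model, and the real mpirun may differ in detail, for example in how it counts the local host; `placement` is an invented helper.

```shell
#!/bin/bash
# Sketch of a cyclic reading of "the machinefile is reused": process i is
# placed on line (i mod N) of an N-line machinefile.
# placement NP MACHINEFILE
placement() {
    local np=$1 file=$2
    awk -v np="$np" '
        NF { host[n++] = $1 }                     # skip empty lines
        END {
            if (n == 0) exit 1
            for (i = 0; i < np; i++) print i, host[i % n]
        }' "$file"
}

# Demonstration with the two-line LOCAL file from the slide.
machinefile=$(mktemp)
printf 'red01.nada.kth.se\nred01.nada.kth.se\n' > "$machinefile"
placement 4 "$machinefile"
rm -f "$machinefile"
```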
Running MPI programs on SUNs (remote)
Running on several hosts
You may need to set up AFS tokens on the remote hosts
    kauth -h red02.nada.kth.se -l 30
    kauth -h red03.nada.kth.se -l 30
    kauth -h red04.nada.kth.se -l 30
    kauth    (For your local rights)
    mpirun -np 4 -machinefile RED ./myprog-sun
    mpirun -np 6 -nolocal -machinefile RED ./myprog-sun
The remote processes are started by a Kerberos rsh to the remote host. Modern Kerberos rsh implementations call the command afslog. The remote ticket must be in place in advance.
Machinefile RED:
    red02.nada.kth.se
    red03.nada.kth.se
    red04.nada.kth.se
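The kauth lines can be derived from the machinefile so that the two never drift apart. The following sketch prints one `kauth -h` line per distinct host followed by the plain kauth for local rights; it prints rather than runs them because each kauth asks for a password. `kauth_cmds` is a made-up name, and the default lifetime of 30 is copied from the slide.

```shell
#!/bin/bash
# Sketch: print one "kauth -h HOST -l LIFETIME" line per distinct host in
# a machinefile, followed by a plain kauth for the local rights.
# kauth_cmds MACHINEFILE [LIFETIME]
kauth_cmds() {
    local file=$1 lifetime=${2:-30}
    awk -v l="$lifetime" 'NF && !seen[$1]++ { print "kauth -h " $1 " -l " l }' "$file"
    echo "kauth"
}

red=$(mktemp)
printf 'red02.nada.kth.se\nred03.nada.kth.se\nred04.nada.kth.se\n' > "$red"
kauth_cmds "$red"        # review the lines, then run them by hand
rm -f "$red"
```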