
Cray XT3 Programming Introduction


Presentation Transcript


  1. Cray XT3 Programming Introduction

John Levesque, Cray, Inc, levesque@cray.com
Lawrence Livermore National Laboratory XT3 Workshop, Sept 26-28

  2. XT3 Programming Environment

• Compiler, basic libraries and communication libraries
  • PGI C/C++/Fortran compiler 6.0.1
  • MPI-2 message passing library
  • Cray SHMEM library
• Execution environment ("yod" launcher and batch scheduler)
• Other talks will address the following...
  • Math libraries
    • ACML 2.5 - BLAS, LAPACK, FFT and other numerical routines
  • Cray SciLib
    • ScaLAPACK, BLACS, SuperLU subroutine libraries
  • Debugger: Etnus TotalView 6.5.0
  • Performance tools: CrayPat and Apprentice²

  3. Modules

Package options (compiler, library, tool, etc.) are arbitrated with the module tool. A standard set of modules is provided as defaults.

% module avail
PrgEnv/1.0         glib/2.4.2         xt-boot/1.0        xt-mpt/1.0
acml/2.5           gnet/2.0.5         xt-catamount/1.15  xt-os/1.0
acml-gnu/2.5       pgi/5.2.4          xt-crms/1.0        xt-pbs/1.4
acml-mp/2.5        pgi32/5.2.4        xt-libc/1.0        xt-pe/1.0
gcc/3.2.3          pkg-config/0.15.0  xt-libsci/1.0      xt-service/1.0
gcc-catamount/3.3  totalview/6.5.0    xt-lustre-ss/1.0   yath/1.0

"% module show xxx" will print out the environment variables set by a specific module. These can be useful in user-modified Makefiles.
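
A few other module subcommands are handy day to day; a minimal sketch, with acml/2.5 from the listing above standing in for any package:

% module list                  (modules currently loaded)
% module load acml/2.5         (add a package to the environment)
% module unload acml/2.5       (remove it again)
% module show acml/2.5         (print the environment variables it sets)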

  4. Hello World

% cat >hello.f
      include 'mpif.h'
      call MPI_Init( ie )
      call MPI_Comm_rank( MPI_COMM_WORLD, me, ie )
      print *, 'Hello world, from ', me
      call MPI_Finalize( ie )
      end
% ftn -o hello --target=catamount hello.f
% yod -sz 2 hello
Hello world, from 0
Hello world, from 1
%

No flags are needed for the include paths or libraries of MPICH, SciLib, ACML or PAPI. Try "-dryrun" if you're curious what is hidden under the covers. The compiler driver scripts are part of the PrgEnv module, and are named "cc", "CC", "f77" and "ftn". The synonyms "mpicc", "mpicxx", "mpif77" and "mpif90" may also be used, but are not recommended.
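
The same flow works for C through the "cc" driver script; here is a minimal sketch (the hello.c file name is illustrative):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int me;
    MPI_Init(&argc, &argv);                 /* the cc driver supplies MPI paths */
    MPI_Comm_rank(MPI_COMM_WORLD, &me);     /* rank of this process */
    printf("Hello world, from %d\n", me);
    MPI_Finalize();
    return 0;
}

% cc -o hello --target=catamount hello.c
% yod -sz 2 hello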

  5. Compiling

• It is by far easiest to use the wrapper scripts "cc", "ftn", etc. when compiling or linking a Catamount executable. These default to parameters suitable for a Catamount target, and they tell you so:
  /opt/xt-pe/1.0/bin/snos64/ftn: INFO: catamount target is being used
  You can eliminate this warning with --target=catamount.
• If you do intend to build compute-only (no MPI) objects, you can carefully use the base names "pgicc", "pgiCC", etc. Do not use gcc or g++ directly unless you intend to build a SIO/Linux object or executable.
• Existing packages tend to have user-modifiable Makefiles or clever "autoconf" configure scripts. The latter sometimes works well, if the writer allowed for the cross-compilation case we have with the XT3.
• Hint: some configure scripts handle a specification like this correctly: "--target=x86_64_Linux". A typical compile-and-link sequence with the wrappers is sketched below.
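
A minimal separate-compile-and-link sketch using the wrapper scripts (the solver.f and io_layer.c file names are illustrative):

% ftn -c --target=catamount solver.f
% cc  -c --target=catamount io_layer.c
% ftn -o myapp --target=catamount solver.o io_layer.o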

  6. Running Applications

• The "yod" command launches executables across a set of Catamount compute nodes. In the interactive case, one simply specifies the node count:
  yod -sz <nodes> executable [arguments]
• It is possible to use different executables for subsets of nodes. One would make a text file ("textfile") with lines like:
  yod -sz 10 executable-1
  yod -sz 30 executable-2
  yod -sz 20 executable-3
• One would then launch the 60-node run with:
  yod -F textfile
• With initial releases it is not possible to build Linux (SIO node) MPI sets, or mixes of SIO/Catamount nodes.

  7. PBS scripts

Here is an example script:

#!/bin/csh
#PBS -l size=64
#PBS -j oe
#PBS -N myjob
cd ${PBS_O_WORKDIR}
yod -sz $PBS_NNODES executable arguments

To see the status of running, held or pending batch jobs, use the "qstat -a" command. See the qstat man page for details of other options.

Job ID  Username  Queue  Jobname  SessID  Req'd Time  Nodes  S  Elap Time
------  --------  -----  -------  ------  ----------  -----  -  ---------
2983    cat       workq  STDIN    15951   536:53      10     R  47:25
Total compute nodes allocated: 10

For other options specific to the XT3, see the pbs_resources_xt3(7B) man page.
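
Submission is then the usual qsub flow; a minimal sketch (the script name and the exact form of the returned job identifier are illustrative):

% qsub myjob.pbs
2983.sdb
% qstat -a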

  8. Catamount Programming Considerations

• Catamount is a light-weight kernel (QK, or "quintessential kernel"), so not all features and complications of Linux are present:
  • No threading (pthreads or OpenMP).
  • No TCP/IP facilities (pipes, sockets or IP messages).
  • No process creation: no fork() or exec(), and no shell-execution system() calls.
  • No dynamic (shared) libraries.
  • The /proc file-system tree is not present on Catamount nodes, so no shortcuts like /proc/meminfo, /proc/cpuinfo, etc.
  • No IPC calls (no shared memory segments, limited signal handling).
• Catamount I/O is special: in the simple case, I/O is forwarded as RPC requests to the launching yod process.
  • If the Lustre parallel file system is not available, all file I/O is forwarded across the HPN.
  • This is an obvious scalability problem; using large buffer sizes is helpful: setvbuf(), sketched below. There is also a way to increase buffer sizes in PGI Fortran: setvbuf_f().
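
A minimal sketch of the setvbuf() approach in C, assuming a 4 MB buffer is appropriate (the file name and buffer size are illustrative):

#include <stdio.h>

#define IOBUF_SIZE (4*1024*1024)          /* illustrative 4 MB buffer */

int main(void)
{
    static char iobuf[IOBUF_SIZE];        /* static: avoids a huge stack frame */
    FILE *fp = fopen("results.dat", "w");
    if (fp == NULL) { perror("fopen"); return 1; }
    /* setvbuf() must be called after fopen() and before any other I/O on fp;
       with a large fully-buffered stream, many small writes coalesce into
       a few large forwarded requests. */
    if (setvbuf(fp, iobuf, _IOFBF, sizeof iobuf) != 0)
        fprintf(stderr, "setvbuf failed\n");
    /* ... many small fprintf()/fwrite() calls ... */
    fclose(fp);
    return 0;
}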

  9. Catamount Timers

• There is no real distinction made between "elapsed time" and "cpu time" under Catamount.
• The getrusage() call does not return meaningful "system" and "user" timers.
• The commonly used timers "clock", "etime" and "times" cause dummy stubs to be loaded, along with a warning message.
• Instead, use:
  • Fortran: the cpu_time() intrinsic returns processor seconds.
  • Portably, MPI_Wtime() is a useful elapsed-time clock (see the sketch below).
  • gettimeofday() works, but has the usual Unix/Linux low accuracy (100 Hz rate).
  • dclock() is another C/Fortran cpu-time clock, with a nominal resolution of 100 nanoseconds. However, it is based on the processor clock, with the CPU frequency uncalibrated; it may be off by as much as +/-50 microseconds per second.
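
A minimal elapsed-time sketch using MPI_Wtime() (the work being timed is a placeholder):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    double t0, t1;
    MPI_Init(&argc, &argv);
    t0 = MPI_Wtime();                     /* wall-clock stamp, in seconds */
    /* ... work to be timed goes here ... */
    t1 = MPI_Wtime();
    printf("elapsed %.6f s (tick = %g s)\n", t1 - t0, MPI_Wtick());
    MPI_Finalize();
    return 0;
}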

  10. For further information, see the following documents:

• Cray XT3 System Overview
• Glossary of Cray XT3 Terms
• PGI User's Guide
• PGF77 Workstation Reference Manual
• PGI Tools Guide
• Cray MPICH2 man pages (read intro_mpi(1) first)
• SHMEM man pages (read intro_shmem(1) first)
• AMD Core Math Library User's Guide
• ACML man pages
• Cray XT3 LibSci man pages
• Portals routines (see Appendix C, page 59)
• SuperLU User's Guide
• PBS Pro 5.3 Quick Start Guide, PBS-3BQ01
• PBS Pro 5.3 User Guide, PBS-3BU01
• PBS Pro 5.3 External Reference Specification, PBS-3BE01
• PAPI User Guide
• PAPI Programmer's Reference
• PAPI Software Specification
• PAPI man pages
