The Jacquard Programming Environment
Mike Stewart
NUG User Training, 10/3/05
Outline
• Compiling and Linking.
• Optimization.
• Libraries.
• Debugging.
• Porting from Seaborg and other systems.
Pathscale Compilers
• Default compilers: Pathscale Fortran 90, C, and C++.
• Module "path" is loaded by default and points to the current default version of the Pathscale compilers (currently 2.2.1).
• Other versions available: module avail path.
• Extensive vendor documentation available on-line at http://pathscale.com/docs.html.
• Commercial product: well supported and optimized.
Compiling Code
• Compiler invocations:
  • No MPI: pathf90, pathcc, pathCC.
  • MPI: mpif90, mpicc, mpicxx.
• The MPI compiler wrappers use the currently loaded compiler version.
• The MPI and non-MPI compiler invocations take the same options and arguments.
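• For example, a serial and an MPI Fortran 90 build might look like this (file and program names are illustrative):

    pathf90 -o serial_prog serial_prog.f90    # serial build with the Pathscale Fortran 90 compiler
    mpif90  -o mpi_prog mpi_prog.f90          # MPI build; same options, uses the loaded Pathscale compiler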
Compiler Optimization Options
• 4 numeric levels, -On, where n ranges from 0 (no optimization) to 3.
• Default level: -O2 (unlike IBM).
• -g without a -O option changes the default to -O0.
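• For instance, with a hypothetical source file prog.f90:

    pathf90 -g prog.f90        # -g alone: optimization drops to -O0
    pathf90 -g -O2 prog.f90    # debug symbols together with the default -O2 level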
-O1 Optimization
• Minimal impact on compilation time compared to an -O0 compile.
• Only optimizations applied within straight-line code (basic blocks), such as instruction scheduling.
-O2 Optimization
• Default when no optimization arguments are given.
• Optimizations that always increase performance.
• Can significantly increase compilation time.
• -O2 optimization examples:
  • Loop nest optimization.
  • Global optimization within a function scope.
  • 2 passes of instruction scheduling.
  • Dead code elimination.
  • Global register allocation.
-O3 Optimization
• More extensive optimizations that may in some cases slow down performance.
• Optimizes loop nests rather than just inner loops, i.e. inverts loop indices, etc.
• "Safe" optimizations: produces answers identical to those produced by -O0.
• -O3 is the NERSC recommendation, based on experience with benchmarks.
-Ofast Optimization
• Equivalent to -O3 -ipa -fno-math-errno -OPT:roundoff=2:Olimit=0:div_split=ON:alias=typed.
• -ipa: interprocedural analysis.
  • Optimizes across function boundaries.
  • Must be specified at both compile and link time.
• Aggressive "unsafe" optimizations:
  • Changes the order of evaluation.
  • Deviates from the IEEE 754 standard to obtain better performance.
• There are some known problems with this optimization level in the current release, 2.2.1.
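• Because -ipa works across files, -Ofast has to appear on the link line as well as on every compile line, for example (file names are illustrative):

    mpif90 -Ofast -c part1.f90
    mpif90 -Ofast -c part2.f90
    mpif90 -Ofast -o prog part1.o part2.o    # -Ofast (and hence -ipa) again at link time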
SuperLU MPI Benchmark
• Based on the SuperLU general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations.
• Mostly C with some Fortran 90 routines.
• Run on 64 processors/32 nodes.
• Uses BLAS routines from ACML.
ACML Library
• AMD Core Math Library: a set of numerical routines tuned specifically for AMD64 platform processors.
  • BLAS
  • LAPACK
  • FFT
• To use with the Pathscale compilers:
  • module load acml (built with the Pathscale compilers)
  • Compile and link with $ACML
• To use with gcc:
  • module load acml_gcc (built with the GNU compilers)
  • Compile and link with $ACML
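• A typical build against ACML might look like the following sketch (program name is hypothetical; $ACML is set by the acml module):

    module load acml
    pathf90 -O3 -o solver solver.f90 $ACML    # $ACML supplies the ACML link options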
Matrix Multiply Optimization Example
• 3 ways to multiply 2 dense matrices:
  • Directly in Fortran with nested loops.
  • The matmul F90 intrinsic.
  • dgemm from ACML.
• Example: two 1000 by 1000 double precision matrices.
• Order of indices: ijk means
    do i=1,n
      do j=1,n
        do k=1,n
  (the complete loop nest is sketched below).
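• A minimal sketch of the three approaches (illustrative code, not taken from the benchmark itself; the dgemm call uses the standard BLAS interface and assumes the program is linked against ACML, e.g. with $ACML):

    program mm_example
      implicit none
      integer, parameter :: n = 1000
      double precision, dimension(n,n) :: a, b, c
      integer :: i, j, k

      call random_number(a)
      call random_number(b)

      ! 1. Direct nested loops, ijk index order as on the slide.
      c = 0.0d0
      do i = 1, n
         do j = 1, n
            do k = 1, n
               c(i,j) = c(i,j) + a(i,k)*b(k,j)
            end do
         end do
      end do

      ! 2. The Fortran 90 matmul intrinsic.
      c = matmul(a, b)

      ! 3. dgemm from ACML (standard BLAS interface): C = 1.0*A*B + 0.0*C.
      call dgemm('N', 'N', n, n, n, 1.0d0, a, n, b, n, 0.0d0, c, n)

      print *, 'c(1,1) =', c(1,1)
    end program mm_example

• Each variant computes the same product; timing them separately shows the relative performance of the three approaches.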
Debugging
• The Etnus Totalview debugger has been installed on the system.
• Still in testing mode, but it should be available to users soon.
Porting codes
• Jacquard is a Linux system, so GNU tools like gmake are the defaults.
• The Pathscale compilers are good, but new, so please report any evident compiler bugs to NERSC consulting.