SGI Altix ICE Architecture (Rev. 1.2a)
Kevin Nolte, SGI, Professional Services
Altix ICE 8400
• Altix ICE 8400 Rack:
  • 42U rack (30" W x 40" D)
  • 4 blade enclosures, each up to 16 two-socket nodes
  • Single- or dual-plane 4x IB interconnect
  • Minimal switch topology scales to 1000s of nodes
• SGI® Altix® ICE Compute Blade:
  • Up to two 4-core sockets, 96 GB, 2 IB
Overview of IRU
• The basic building block is an 18U-high IRU that contains the following:
  • Sixteen IP93 compute blades
  • Four network extender blades
  • One or two CMC blades
  • Six 2837-watt 12V power supplies and two 2837-watt 48V power supplies
• IRU = Individual Rack Unit
• CMC = Chassis Management Controller
Terminology: Socket/Processor
• socket = processor = node
SGI Altix ICE Application Environment Primer (Rev. 1.2a)
Ken Taylor, SGI, Professional Services
Agenda
• Application porting
• Code optimization
• Programming environment and libraries
• Pinning for OpenMP and MPI
• SGI-provided software tools
Application Porting
• Intel Xeon X5690 (x86_64)
• 64-bit compiler and lib64
• -g -traceback -fpe0 (sets -ftz)
• Data Representation: Little ENDIAN
  • -convert big_endian|ibm
  • env F_UFMTENDIAN=big
  • env FORT_CONVERTn=big_endian
  • OPEN (UNIT=n, CONVERT=…) (see the sketch below)
  • Conversion performance impact
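For illustration, a minimal sketch of per-unit conversion through the OPEN statement; the unit number, file name, and record layout are hypothetical, not from the deck:

  program read_big_endian
    implicit none
    integer :: i, j
    real    :: a, b
    ! CONVERT='BIG_ENDIAN' byte-swaps this unformatted unit only, with the
    ! same effect as -convert big_endian or F_UFMTENDIAN for that one unit.
    open (unit=11, file='legacy_big_endian.dat', form='unformatted', &
          convert='BIG_ENDIAN', status='old', action='read')
    read (11) i, a, b, j
    close (11)
  end program read_big_endian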
Application Porting
• Basic I/O Architecture Considerations
  • No local disk drive (NFS and Lustre FS)
  • /tmp is a 150 MB tmpfs
  • Torque standard out and err in /var/spool (2 GB)
Application Porting
• Fortran I/O
  • Fortran record length: 4-byte unit
  • -assume byterecl
  • Fortran standard portable RECL specification using the INQUIRE statement (see the sketch below):
    INQUIRE (IOLENGTH=iol) I, A, B, J
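A minimal sketch of that pattern for a direct-access unformatted file; the unit number, file name, and values are hypothetical:

  program portable_recl
    implicit none
    integer :: i, j, iol
    real    :: a, b
    ! IOLENGTH returns the record length in the same units the compiler
    ! expects for RECL=, so the OPEN below works with or without -assume byterecl.
    inquire (iolength=iol) i, a, b, j
    open (unit=12, file='records.dat', form='unformatted', access='direct', &
          recl=iol, status='replace', action='readwrite')
    i = 1; j = 2; a = 3.0; b = 4.0
    write (12, rec=1) i, a, b, j
    close (12)
  end program portable_recl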
Code Optimization
• Compute
• I/O
• Communication
Code Optimization
• Key Parallel Programming Models
  • MPI-2.2 Standard
  • OpenMP 3.1 Standard
• New Parallel Programming Models
  • SGI UPC
  • Fortran 2008 Coarrays (Intel ifort 12.1); see the sketch below
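To illustrate the coarray model, a minimal sketch (not from the original deck) in which every image prints its index; it assumes a compiler with Fortran 2008 coarray support, such as ifort with its coarray option enabled:

  program coarray_hello
    implicit none
    integer :: me, n
    ! this_image() and num_images() are Fortran 2008 intrinsics; an image is
    ! the coarray analogue of an MPI rank, and each one runs a copy of the program.
    me = this_image()
    n  = num_images()
    write (*,*) 'Hello from image', me, 'of', n
  end program coarray_hello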
Code Optimization
• Code Vectorization
  • Intel SIMD
  • -xSSE4.2 (Westmere-EP processor)
  • -opt-report=3 (see the loop sketch below)
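A small sketch of a unit-stride loop the compiler can vectorize with SSE4.2; the routine name and build line are illustrative:

  ! Illustrative build line: ifort -O2 -xSSE4.2 -opt-report=3 -c saxpy.f90
  subroutine saxpy (n, a, x, y)
    implicit none
    integer, intent(in)    :: n
    real,    intent(in)    :: a, x(n)
    real,    intent(inout) :: y(n)
    integer :: i
    ! Unit stride and no loop-carried dependence: a straightforward SIMD
    ! candidate, which the optimization report will confirm.
    do i = 1, n
       y(i) = a * x(i) + y(i)
    end do
  end subroutine saxpy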
Code Optimization
• I/O
  • Well-formed I/O
  • Lustre File System
    • Big I/O striping: lfs setstripe
    • Lustre caching and direct I/O
    • MPI I/O Lustre accelerator (SGI, Intel, MVAPICH2)
  • NFS
    • Better for small, random I/O (e.g. code compilations)
  • Parallel I/O issues
    • Shared file
    • Read all versus read one then broadcast (see the sketch below)
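A minimal sketch of the read-one-then-broadcast pattern; the file name, unit number, and array size are hypothetical:

  program read_then_bcast
    use mpi
    implicit none
    integer, parameter :: nvals = 1000          ! hypothetical record size
    integer :: ierr, rank
    real    :: vals(nvals)
    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    ! Only rank 0 touches the file system; the other ranks receive the data
    ! over the interconnect instead of issuing N identical small reads.
    if (rank == 0) then
       open (unit=13, file='input.dat', form='unformatted', status='old', action='read')
       read (13) vals
       close (13)
    end if
    call MPI_Bcast(vals, nvals, MPI_REAL, 0, MPI_COMM_WORLD, ierr)
    call MPI_Finalize(ierr)
  end program read_then_bcast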
Code Optimization
• I/O
  • Intel Fortran I/O Library
    • FORT_BUFFERED, FORT_BLOCKSIZE, FORT_BUFFERCOUNT
    • Disable for small, random I/O
    • Fortran 2003 ASYNCHRONOUS='YES' (see the sketch below)
  • Linux
    • Linux Pagecache Scaling (cached too large)
    • Direct I/O
    • st_blksize (stat command)
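A minimal sketch of Fortran 2003 asynchronous output with an explicit WAIT; the unit, file name, and buffer size are hypothetical:

  program async_write
    implicit none
    real :: buf(1000000)
    buf = 1.0
    open (unit=14, file='big.dat', form='unformatted', asynchronous='yes', &
          action='write', status='replace')
    ! The WRITE may return before the transfer completes, so computation that
    ! does not touch buf can overlap the I/O; WAIT blocks until it finishes.
    write (14, asynchronous='yes') buf
    wait (14)
    close (14)
  end program async_write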
Code Optimization
• Communication
  • SGI MPT
    • MPI_BUFS_PER_PROC
    • MPI_STATS
    • MPInside 3.5.4
    • MPI_BUFFER_MAX (single-copy)
    • MPI_IB_RAILS 2
    • MPI_COLL_
    • MPI_FASTSTART
  • IB Failover
    • MPI_IB_RAILS 2|1+
Code Optimization
• Communication
  • SGI MPT: Always Set (see the launch example below)
    • MPI_VERBOSE
    • MPI_DISPLAY_SETTINGS
    • MPI_DSM_VERBOSE
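For illustration, a csh-style launch in the deck's "%" prompt convention; the values and rank count are illustrative (these MPT variables take effect simply by being set):

  % setenv MPI_VERBOSE 1
  % setenv MPI_DISPLAY_SETTINGS 1
  % setenv MPI_DSM_VERBOSE 1
  % mpirun -np 32 ./a.out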
Programming Environment and Libraries
• Module environment
  • Csh: source /usr/share/Modules/init/csh
  • Bash: . /usr/share/Modules/init/bash
  • (module purge)
  • module load modules   # RHEL error
  • module avail (prefix)
  • module load mpt-2.06
  • module load intel-fc intel-cc intel-mkl
Programming Environment and Libraries
• SGI Libraries
  • SGI MPI 1.4
  • SGI MPT 2.05
  • SGI perfboost, perfcatcher, test
  • SGI omplace
  • SGI MPInside
  • SGI PerfSuite
  • SGI FFIO
• Upcoming MPT 2.06: IB fail-over fixes and others
SGI-Provided Software Tools
• SGI Tools
  • SGI perfboost, perfcatcher, test
  • SGI omplace
  • SGI MPInside
  • SGI PerfSuite
  • SGI FFIO
• NUMA Tools
  • cpumap, dplace, dlook
  • Linux /sys/devices/system
Pinning for OpenMP and MPI: SGI MPT
• Placement Control for Mix of MPI and OpenMP
  • MPI_OPENMP_INTEROP
  • Preferred SGI MPT method: mpirun -np ranks omplace [OPTIONS] program args
  • [OPTIONS]
    • -b basecpu: base CPU to begin allocating threads [default 0]; relative to the current cpuset
    • -c cpulist: defines the effective cpulist
    • -nt threads: defines the number of threads per MPI process [defaults to 1 or OMP_NUM_THREADS]
    • -vv: shows the created dplace placement file
  • Distribute evenly between processors and LLC
  • Check topology
Pinning for OpenMP and MPI: SGI MPT

% mpirun -np 2 omplace -nt 4 -vv ./testmpiomp.x
omplace information:
  MPI type is SGI MPI, 4 threads, thread model is intel
  placement file /tmp/omplace.file.13498:
    fork skip=0 exact cpu=0-23:4
    thread oncpu=0 cpu=1-3 noplace=1 exact
    thread oncpu=4 cpu=5-7 noplace=1 exact
    thread oncpu=8 cpu=9-11 noplace=1 exact
    thread oncpu=12 cpu=13-15 noplace=1 exact
    thread oncpu=16 cpu=17-19 noplace=1 exact
    thread oncpu=20 cpu=21-23 noplace=1 exact
MPI: dplace use detected, MPI_DSM_... environment variables ignored
rank 0 name cam
rank 1 name cam
rank 0 np 2 nt 4 thread 0 i 1 cpu 0
rank 0 np 2 nt 4 thread 3 i 4 cpu 3
rank 0 np 2 nt 4 thread 1 i 2 cpu 1
rank 0 np 2 nt 4 thread 2 i 3 cpu 2
rank 1 np 2 nt 4 thread 0 i 1 cpu 4
rank 1 np 2 nt 4 thread 2 i 3 cpu 6
rank 1 np 2 nt 4 thread 3 i 4 cpu 7
rank 1 np 2 nt 4 thread 1 i 2 cpu 5
Pinning for OpenMP and MPI: Intel MPI
• Placement Control for Mix of MPI and OpenMP
  • Intel MPI and Intel OpenMP abstract specifications

% mpirun -genv I_MPI_PIN_DOMAIN=cache -np 2 ./testmpiomp-impi.x
rank 0 name cam
rank 1 name cam
rank 0 np 2 nt 4 thread 0 i 1 cpu 17
rank 0 np 2 nt 4 thread 3 i 4 cpu 16
rank 0 np 2 nt 4 thread 1 i 2 cpu 14
rank 0 np 2 nt 4 thread 2 i 3 cpu 15
rank 1 np 2 nt 4 thread 0 i 1 cpu 23
rank 1 np 2 nt 4 thread 2 i 3 cpu 21
rank 1 np 2 nt 4 thread 1 i 2 cpu 20
rank 1 np 2 nt 4 thread 3 i 4 cpu 22

• Add KMP_AFFINITY for Intel OpenMP thread placements and pinning (see the example below)
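For illustration, the same launch with an explicit Intel OpenMP affinity setting; the verbose,compact value is illustrative and not from the original deck:

  % mpirun -genv I_MPI_PIN_DOMAIN=cache -genv KMP_AFFINITY=verbose,compact -np 2 ./testmpiomp-impi.x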