Workshop 10 Distributed ANSYS Running the Distributed Version of ANSYS
Distributed ANSYS • Topics • Setup Distributed ANSYS • Run Distributed ANSYS • launcher90 • Command line • Troubleshooting Tips
Distributed ANSYS – Installation • Distributed ANSYS is installed by default • Full ANSYS MUST be installed on all machines • MPICH is also installed by default on Linux machines • If you use native MPI rather than MPICH, the MPI software must be installed separately.
Distributed ANSYS - Setup • Required setup files: • hosts90.ans • .rhosts • machines.LINUX
Distributed ANSYS - Setup • hosts90.ans • Created using the ANS_ADMIN utility • Choose Configuration Options > Configure Cluster
Distributed ANSYS - Setup • Configure Machine type, Maximum number of Jobs, and Directory to use.
Distributed ANSYS – Setup • ANSYS searches for hosts90.ans in the following directories: • Current working directory • User’s home directory • /ansys_inc/v90/ansys/apdl directory
Sample hosts90.ans file:
    pghosiris.ansys.com linia64 0 4 0 0 /temp MPI 1 1
    pghisis.ansys.com linia64 0 2 0 0 /temp MPI 1 1
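The search described above can be sketched as a small shell function. This is only an illustration: find_hosts90 is a hypothetical helper, not part of ANSYS, and the "first match wins" behavior in the listed order is an assumption.

```shell
# find_hosts90 [DIR...]: hypothetical helper, not an ANSYS tool.
# Prints the first hosts90.ans found, checking directories in the order the
# slide lists them. "First match wins" is an assumption.
find_hosts90() {
    # Optional arguments override the default search directories.
    if [ "$#" -eq 0 ]; then
        set -- "$PWD" "$HOME" /ansys_inc/v90/ansys/apdl
    fi
    for dir in "$@"; do
        if [ -f "$dir/hosts90.ans" ]; then
            printf '%s\n' "$dir/hosts90.ans"
            return 0
        fi
    done
    echo "hosts90.ans not found" >&2
    return 1
}
```

Calling find_hosts90 with no arguments reports which copy of the file would be picked up from the current directory, which helps when an old hosts90.ans in the home directory hides a newer one.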
Distributed ANSYS – Setup • .rhosts • Contains the hostnames of all machines being used and the user’s login id • Used by rsh to communicate with other machines • Must exist in the user’s home directory • Permissions of the .rhosts file must be set to 600 (chmod 600 .rhosts)
Sample .rhosts file:
    pghosiris.ansys.com glk
    pghisis.ansys.com glk
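A minimal sketch of creating such a file with the required permissions follows. The helper write_rhosts is hypothetical; the example call is commented out and targets a scratch file so that an existing ~/.rhosts is not overwritten.

```shell
# write_rhosts FILE USER HOST...: hypothetical helper, not an ANSYS tool.
# Writes one "host user" line per host and restricts the file to mode 600.
write_rhosts() {
    file=$1
    user=$2
    shift 2
    : > "$file"                      # create or truncate the file
    for host in "$@"; do
        printf '%s %s\n' "$host" "$user" >> "$file"
    done
    chmod 600 "$file"                # owner read/write only
}

# Example (scratch file, not the real ~/.rhosts):
# write_rhosts ./rhosts.sample glk pghosiris.ansys.com pghisis.ansys.com
```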
Distributed ANSYS - Setup • machines.LINUX • Contains the hostnames of the machines to be used • Used by MPICH as a list of machines • Use “uname -n” to get the node name of a machine • Located in: /ansys_inc/v90/ansys/mpich/linia##/share • One line for every processor to be used on each machine
Sample machines.LINUX file:
    pghosiris.ansys.com
    pghosiris.ansys.com
    pghosiris.ansys.com
    pghosiris.ansys.com
    pghisis.ansys.com
    pghisis.ansys.com
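Because the file needs one line per processor, it can be generated from host:count pairs. The following is a sketch under that assumption; gen_machines is a hypothetical helper, and the output would still have to be saved as machines.LINUX by hand.

```shell
# gen_machines HOST:N...: hypothetical helper, not an ANSYS or MPICH tool.
# Prints each host name N times, one line per processor.
gen_machines() {
    for spec in "$@"; do
        host=${spec%%:*}     # text before the first colon
        count=${spec##*:}    # text after the last colon
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '%s\n' "$host"
            i=$((i + 1))
        done
    done
}

# Reproduces the layout of the sample file above (4 + 2 processors).
gen_machines pghosiris.ansys.com:4 pghisis.ansys.com:2
```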
Distributed ANSYS – Setting the Environment • Running Distributed ANSYS with MPICH: ANSYS90_DIR and the dynamic load library path (e.g., LD_LIBRARY_PATH) must be set by the appropriate shell startup script. Use the following scripts (supplied with ANSYS) to configure the distributed environment correctly for MPICH. • For csh or tcsh shells, add the following line to your .cshrc, .tcshrc, or equivalent shell startup file: • source /ansys_inc/v90/ansys/bin/confdismpich90.csh • For sh or bash shells, add the following line to your .login, .profile, or equivalent shell startup file: • . /ansys_inc/v90/ansys/bin/confdismpich90.sh • If using MPICH, the following directory must also be added to the user’s path: /ansys_inc/v90/ansys/MPICH/linia##/bin/ • (where ## is 32 or 64)
Distributed ANSYS – Setting the Environment • As a test, rsh into all machines in the cluster (including the master) and verify that ANSYS90_DIR and LD_LIBRARY_PATH are set correctly. • For example: • rsh pghosiris env | grep ANSYS90_DIR • The output should read: • ANSYS90_DIR=/ansys_inc/v90/ansys • and • rsh pghosiris env | grep LD_LIBRARY • The output should read: • LD_LIBRARY_PATH=/ansys_inc/v90/ansys/lib/<platform>:/ansys_inc/v90/ansys/syslib/<platform>:/ansys_inc/v90/commonfiles/tcl/lib/<platform>
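The same check can be scripted so that it is easy to repeat on every node. This is a sketch only: check_ansys_env is a hypothetical helper that reads env output on standard input, and the short host name pghisis in the commented loop is an assumption.

```shell
# check_ansys_env: hypothetical helper, not an ANSYS tool.
# Reads `env` output on stdin. Succeeds only if ANSYS90_DIR has the expected
# value and LD_LIBRARY_PATH mentions the ANSYS lib directory.
check_ansys_env() {
    env_text=$(cat)
    printf '%s\n' "$env_text" |
        grep -qx 'ANSYS90_DIR=/ansys_inc/v90/ansys' || return 1
    printf '%s\n' "$env_text" |
        grep -q '^LD_LIBRARY_PATH=.*/ansys_inc/v90/ansys/lib/' || return 1
}

# Intended use on a configured cluster (not run here; host names assumed):
#   for host in pghosiris pghisis; do
#       rsh "$host" env | check_ansys_env || echo "$host: environment not set"
#   done
```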
Distributed ANSYS – Running (launcher90) • Start the ANSYS 9.0 Launcher • (launcher90) • Parallel Performance License is needed to run Distributed ANSYS
Distributed ANSYS - Running (launcher90) • New “Solver Setup” tab. • Select: • “Run Distributed ANSYS” • All other configuration fields become available. • Run Distributed ANSYS will not be accessible if a valid Parallel Performance for ANSYS license is not available or it has not been chosen on the Launch tab.
Distributed ANSYS - Running (launcher90) • MPI type (MPI, MPICH, MPICH_SHMEM) • MPI: native MPI for each UNIX platform, or MPI/Pro (HP, IBM, SGI, Sun) • MPICH: the MPICH included on the ANSYS installation media (Linux and Windows platforms) • MPICH_SHMEM: shared-memory version of MPICH, used for shared-memory Linux machines.
Distributed ANSYS - Running (launcher90) • Use local machine only • You can specify the Number of Processors to be used.
Distributed ANSYS - Running (launcher90 – Multiple Hosts) • Use Multiple Hosts • “Available hosts” is the list of machines retrieved from the hosts90.ans file. • Hosts can be added to “Selected Hosts” from “Available hosts”, or a new host can be added by selecting the “New Host…” button.
Distributed ANSYS - Running (launcher90 – Multiple Hosts) • “New Host…” button • Opens a window that allows the user to specify a host and its number of processors • “Edit…” opens a window similar to the one “New Host…” opens • The number of processors can be increased or decreased
Distributed ANSYS - Running (launcher90 – Multiple Hosts) • After all options are set, pressing Run will start the Distributed version of ANSYS 9.0
Distributed ANSYS – Running (Command Line Local) • You can also start Distributed ANSYS via the command line using the following procedures. • Local Host. • If you are running Distributed ANSYS locally (i.e., running across multiple processors on a single local machine), you need to specify the number of processors: • For native MPI or MPI/Pro: • ansys90 -pp -dis -np n • For MPICH: • ansys90 -pp -mpi mpich -dis -np n • where n is the number of processors • For example, if you run a job in batch mode on a local host using four processors and MPI, with an input file named input1 and an output file named output1, the launch command would be: • ansys90 -pp -dis -np 4 -b -i input1 -o output1
Distributed ANSYS – Running (Command Line Multiple Hosts) • Multiple Hosts. • If you are running Distributed ANSYS across multiple hosts, you need to specify the number of processors on each machine: • For native MPI or MPI/Pro: • ansys90 -pp -dis -machines machine1:np:machine2:np:machine3:np • For MPICH: • ansys90 -pp -mpi mpich -dis -machines machine1:np:machine2:np:machine3:np • where machine1 (or 2 or 3) is the name of the machine and np is the number of processors you want to use on the corresponding machine. • For example, if you run a job in batch mode using two machines (one with four processors and one with two processors) and MPI, with an input file named input1 and an output file named output1, the launch command would be: • ansys90 -pp -dis -b -machines machine1:4:machine2:2 -i input1 -o output1
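The value passed to -machines is simply the host:np pairs joined by colons, so it can be assembled from a list. The sketch below uses a hypothetical helper, machines_arg, and only prints the resulting command instead of running it.

```shell
# machines_arg HOST:NP...: hypothetical helper, not an ANSYS tool.
# Joins host:np pairs with colons to form the value that follows -machines.
machines_arg() {
    arg=""
    for spec in "$@"; do
        arg="${arg:+$arg:}$spec"     # add a ":" separator after the first pair
    done
    printf '%s\n' "$arg"
}

# Print (rather than run) the batch launch command from the example above.
echo "ansys90 -pp -dis -b -machines $(machines_arg machine1:4 machine2:2) -i input1 -o output1"
```

For an MPICH run, the printed command would additionally need -mpi mpich, as shown in the slide.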
Troubleshooting Tips • Most errors occur due to improper environment setup. • You can test the MPI implementation by running the provided scripts: • mpitest90 – for native MPI or MPI/Pro installations • mpitestmpich90 – for MPICH installations • Both scripts are located under: /ansys_inc/v90/ansys/bin
Output from a successful test:
    latency = 19.5312 microseconds
    Bytes      Bandwidth(MB/s)
    -------    -----------------
    8          0.409600
    1024       47.662545
    4096       161.319385
    16384      310.689185
    65536      532.610032
    262144     607.320036
    1048576    406.720388
    4194304    340.330214
    MPI Test has completed successfully!
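When the test is run on several machines, a small filter can confirm the success message and pull out the peak bandwidth. This is a sketch, assuming the test scripts write output in the format shown above to standard output; mpi_test_summary is a hypothetical helper.

```shell
# mpi_test_summary: hypothetical helper, not an ANSYS tool.
# Reads MPI test output on stdin, prints the row with the highest bandwidth,
# and succeeds only if the success message is present.
mpi_test_summary() {
    awk '
        # Table rows have exactly two numeric fields: message size, bandwidth.
        NF == 2 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9.]+$/ {
            if ($2 + 0 > best) { best = $2 + 0; best_text = $2; size = $1 }
        }
        /MPI Test has completed successfully/ { ok = 1 }
        END {
            if (size != "") printf "peak %s MB/s at %s bytes\n", best_text, size
            exit (ok ? 0 : 1)
        }
    '
}

# Intended use (not run here): mpitestmpich90 | mpi_test_summary
```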
Troubleshooting Tips • Verify that the correct MPICH is being used: • # which mpirun • The above command should return the location of the MPICH provided by ANSYS: /ansys_inc/v90/ansys/MPICH/linia##/bin/ • Again, check the environment to verify that the LD_LIBRARY_PATH and ANSYS90_DIR variables are being set. • rsh pghosiris env | grep ANSYS90_DIR • The output should read: • ANSYS90_DIR=/ansys_inc/v90/ansys • and • rsh pghosiris env | grep LD_LIBRARY • The output should read: • LD_LIBRARY_PATH=/ansys_inc/v90/ansys/lib/<platform>:/ansys_inc/v90/ansys/syslib/<platform>:/ansys_inc/v90/commonfiles/tcl/lib/<platform>
Troubleshooting Tips • Some errors encountered due to incorrect environment setup:
    p0_26702: p4_error: interrupt SIGSEGV:11
    p2_12443: p4_error: : 14
    p4_error: latest msg from perror: Broken pipe
    p0_12371: (0.621094) net_send: could not write to fd=5, errno = 32
    p0_12371: p4_error: net_send write: -1
    3 - MPI_RECV : Message truncated
    [3] Aborting program !