2.4 Parallel Performance Enhancements
• In this section, we will discuss the following topics:
  A. New add-on product Parallel Performance for ANSYS
  B. Distributed Domain Solver (DDS)
  C. Algebraic Multigrid Solver (AMG)
Training Manual 001419 15 Aug 2000 2.4-1
Parallel Performance Enhancements
Overview
• Driven by user requirements for higher accuracy and fidelity in solutions
  • e.g. mesh refinement and adaptive meshing
• Desire to solve whole assemblies instead of analyzing individual components
  • e.g. assembly contact problems
Parallel Performance Enhancements
A. Parallel Performance for ANSYS
• A new add-on product for shared memory and distributed memory environments
• Offers powerful new solvers enabling quick, accurate solutions to large models using multiple processors
  • Algebraic MultiGrid (AMG) solver
    • Solves static / transient nonlinear analyses using multiple processors (up to 8) on a single system (shared memory parallel)
  • Distributed Domain Solver (DDS)
    • Solves large static / transient nonlinear analyses over multiple systems (distributed memory parallel), multiple processors on a single machine (shared memory parallel), or any combination
Parallel Performance Enhancements
B. DDS
What is DDS?
• Breaks large problems (up to 10 million DOFs) into smaller domains (1000 to 10000 DOFs) automatically
• Compatibility among domains is obtained by solving for interface variables (Lagrange multipliers)
Parallel Performance Enhancements … DDS
...What is DDS?
• Transfers the subdomains to slave machines and factorizes them there using a direct solver
• The master machine retrieves and assembles the subdomain solutions, solves for the interface variables using an iterative solver, and computes results for the entire model
[Figure: speed-up ratio versus number of CPUs]
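The split between slave-side subdomain solves and a master-side interface solve can be illustrated on a toy problem. The sketch below is not the ANSYS implementation; it assumes a hypothetical chain of four unit springs fixed at both ends, cut at the middle node into two subdomains, with a single Lagrange multiplier enforcing that the two copies of the cut node move together. With only one multiplier the interface equation is a scalar, so the iterative interface solver mentioned on the slide reduces to one division here. All names (`solve_2x2`, `solve_by_domains`) are illustrative.

```python
def solve_2x2(K, f):
    """Directly solve the 2x2 system K u = f (stand-in for a subdomain direct solver)."""
    (a, b), (c, d) = K
    det = a * d - b * c
    return [(d * f[0] - b * f[1]) / det, (a * f[1] - c * f[0]) / det]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def solve_by_domains():
    # Hypothetical model: nodes 0..4 joined by unit springs, nodes 0 and 4 fixed.
    # Node 2 is duplicated into 2a (subdomain A) and 2b (subdomain B).
    K_a = [[2.0, -1.0], [-1.0, 1.0]]  # unknowns: u1, u2a
    K_b = [[1.0, -1.0], [-1.0, 2.0]]  # unknowns: u2b, u3
    f_a = [1.0, 0.0]                  # load applied at node 1
    f_b = [0.0, 2.0]                  # load applied at node 3
    # Signed Boolean rows expressing the compatibility constraint u2a - u2b = 0.
    B_a = [0.0, 1.0]
    B_b = [-1.0, 0.0]

    # "Slave" work: independent direct solves on each subdomain.
    ua_load = solve_2x2(K_a, f_a)
    ub_load = solve_2x2(K_b, f_b)
    ua_unit = solve_2x2(K_a, B_a)  # response to a unit interface force
    ub_unit = solve_2x2(K_b, B_b)

    # "Master" work: interface equation F * lam = d for the multiplier.
    F = dot(B_a, ua_unit) + dot(B_b, ub_unit)
    d = dot(B_a, ua_load) + dot(B_b, ub_load)
    lam = d / F

    # Recover subdomain displacements with the interface force applied.
    u_a = [ul - lam * uu for ul, uu in zip(ua_load, ua_unit)]
    u_b = [ul - lam * uu for ul, uu in zip(ub_load, ub_unit)]
    return lam, u_a, u_b


lam, u_a, u_b = solve_by_domains()
```

The two copies of the cut node end up with the same displacement, and the result matches a direct solve of the undivided three-unknown chain.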
Parallel Performance Enhancements … DDS
Why DDS?
• Highly scalable
  • More processors / less elapsed time
• Example: a 3.5 million-DOF SOLID92 model
  • 2020 subdomains on an SGI Origin 2000, 12 GB memory
  • Speed-up = 21.0
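As a quick arithmetic check, the example's subdomain count can be compared with the 1000 to 10000 DOFs-per-domain range quoted earlier. The helper names below are hypothetical, and the calculation assumes subdomains of roughly equal size, which the slides do not state.

```python
def average_subdomain_dofs(total_dofs, n_subdomains):
    """Average DOFs per subdomain, assuming an even split (an assumption)."""
    return total_dofs / n_subdomains


def subdomain_count_range(total_dofs, min_size=1_000, max_size=10_000):
    """Fewest and most subdomains that keep each one within the quoted size range."""
    fewest = -(-total_dofs // max_size)  # ceiling division
    most = total_dofs // min_size
    return fewest, most


avg = average_subdomain_dofs(3_500_000, 2020)
fewest, most = subdomain_count_range(3_500_000)
print(f"average DOFs per subdomain: {avg:.0f}")
print(f"subdomain count within range: {fewest} to {most}")
```

For the 3.5 million-DOF model, 2020 subdomains gives an average of roughly 1700 DOFs each, toward the small end of the quoted range.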
Parallel Performance Enhancements … DDS
Memory / Disk requirements
• 2 to 4 times more memory than PCG; however, this is not a problem on a distributed memory architecture
• Memory required is the sum of the memory used on the master and on each slave machine
• In general, the master machine needs a large amount of memory
Parallel Performance Enhancements … DDS - Under the Hood
• DDS has 2 components:
  • Domain decomposer
    • Embedded in ANSYS
    • Divides the domain into n subdomains
    • Creates scratch.dds, file.dds, and file.erot
    • Issues the ‘mpirun’ command and launches the appropriate ansdds.e57 executable
  • ANSDDS.E57
    • A stand-alone, MPI-enabled executable
    • Computes the solution for a subdomain on the slave processor
    • Writes out a file called scratch.u, which is later retrieved by the master to calculate element results
Parallel Performance Enhancements … DDS
System requirements
• Network must be homogeneous (same operating system)
• Message Passing Interface (MPI) is used to communicate
• Master (where the job is submitted)
  • “Parallel Performance for ANSYS” add-on required
  • ANSYS 5.7 must be installed (including ansdds.e57)
  • MPI must be installed
  • 256 MB RAM / 10 GB disk required
• Slave
  • MPI must be installed on all slave machines
  • ansdds.e57 executable must be installed
Parallel Performance Enhancements … DDS
How to use DDS
• Specify the “Parallel Performance for ANSYS” add-on when starting ANSYS
  • ansys57 -pp
• Choose the DDS solver
  • EQSLV,DOMAIN
• Specify information about slave processors
  • DDSOPT command*
*DDSOPT command covered in Systems Training
Parallel Performance Enhancements … DDS
How to use DDS (cont'd)
• Solve
• Postprocessing
  • You get a results file as usual
  • /PNUM,DOMAIN,ON will display domains by colors / numbers
Parallel Performance Enhancements … DDS Solver
Are there any modeling restrictions for using DDS?
• Structural static / transient only (linear or nonlinear)
• Symmetric matrices
• “h” elements only
• No coupling / constraint equations
• No inertia relief
Parallel Performance Enhancements
C. AMG Solver
What is the AMG solver?
• A preconditioned conjugate gradient solver similar to the PCG solver
• The preconditioner used in the AMG solver is derived using the Algebraic MultiGrid technique
• MultiGrid techniques derive a preconditioner that is very close to [K]⁻¹ by working on a coarser mesh of the supplied FE model
• Algebraic MultiGrid methods work on a coarsened version of the full [K] matrix instead of the mesh (that is, they are mesh independent)
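The structure shared by PCG and AMG, a conjugate gradient loop with a pluggable preconditioner, can be sketched as follows. This is a minimal illustration, not ANSYS code: the preconditioner shown is a simple Jacobi (diagonal) scaling used as a stand-in, since building a real algebraic multigrid hierarchy is beyond a short example. The function names and the small test matrix are assumptions; the default tolerance of 1e-8 follows the value quoted for EQSLV,AMG later in this section.

```python
import math


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def matvec(A, x):
    return [dot(row, x) for row in A]


def pcg(A, b, apply_preconditioner, tol=1e-8, max_iter=1000):
    """Preconditioned conjugate gradient for a symmetric positive definite A.

    apply_preconditioner(r) should return an approximation of inverse(A) * r.
    An AMG solver would supply a multigrid cycle here; this sketch does not.
    """
    x = [0.0] * len(b)
    r = list(b)  # residual for the zero initial guess
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    z = apply_preconditioner(r)
    p = list(z)
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, iteration
        z = apply_preconditioner(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Hypothetical 3-DOF stiffness matrix and load vector.
K = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
f = [1.0, 0.0, 2.0]


def jacobi(r):
    # Stand-in preconditioner: divide by the diagonal of K.
    return [ri / K[i][i] for i, ri in enumerate(r)]


u, iterations = pcg(K, f, jacobi)
```

The closer the preconditioner is to the true inverse of [K], the fewer iterations the loop needs, which is the motivation for the multigrid-derived preconditioner described above.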
Parallel Performance Enhancements … AMG Solver
Why do we need the AMG solver?
• Sensitivity to ill-conditioning
  • Much less sensitive to ill-conditioned problems than PCG
  • Reaches a solution in fewer iterations than PCG for ill-conditioned problems
  • Expected to perform as well as PCG for well-conditioned problems
• Scalability
  • Speed-up of up to 5 times on 8 processors
  • Scales much better than PCG
  • Used in shared memory parallel (single machine with multiple processors) only
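To put the scalability figure in context, a speed-up of 5 on 8 processors can be converted into a parallel efficiency and an implied serial fraction. The sketch below uses the standard definitions (efficiency is speed-up divided by processor count; the serial fraction is the Karp-Flatt estimate). These derived numbers are an interpretation of the slide's figure, not values stated in the training material.

```python
def parallel_efficiency(speedup, n_processors):
    """Fraction of ideal linear speed-up actually achieved."""
    return speedup / n_processors


def serial_fraction(speedup, n_processors):
    """Karp-Flatt estimate of the serial fraction implied by a measured speed-up."""
    return (1.0 / speedup - 1.0 / n_processors) / (1.0 - 1.0 / n_processors)


efficiency = parallel_efficiency(5.0, 8)
serial = serial_fraction(5.0, 8)
print(f"efficiency: {efficiency:.3f}, implied serial fraction: {serial:.3f}")
```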
Parallel Performance Enhancements … AMG Solver
Scalability
Parallel Performance Enhancements … AMG Solver
How to use the AMG solver
• Specify the “Parallel Performance for ANSYS” add-on when starting ANSYS
  • ansys57 -pp
• Specify the number of processors:
  • /CONFIG,NPROC,N
  • or config57.ans
  • or use the macro SETNPROC
• Choose the AMG solver
  • EQSLV,AMG,Toler
  • Tolerance defaults to 1e-8, as for PCG
• Solve
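The steps above can be collected into a small helper that assembles the solver-related command lines for an input file. This is a hedged sketch: `amg_solver_commands` is a hypothetical helper, the model definition is omitted entirely, and the /SOLU and SOLVE lines are standard ANSYS commands assumed here to stand for the slide's "Solve" step. Placing /CONFIG first is also an assumption about ordering. The job itself would still be started with `ansys57 -pp` as described above.

```python
def amg_solver_commands(n_processors, tolerance=1e-8):
    """Return the solver-related command lines from the steps above.

    Only /CONFIG,NPROC and EQSLV,AMG come from the slide; /SOLU and SOLVE
    are assumed to represent the "Solve" step. Model definition is omitted.
    """
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    return [
        f"/CONFIG,NPROC,{n_processors}",  # number of processors
        "/SOLU",                          # enter the solution processor (assumed)
        f"EQSLV,AMG,{tolerance:.1E}",     # choose AMG with a convergence tolerance
        "SOLVE",                          # run the solution (assumed)
    ]


lines = amg_solver_commands(4)
print("\n".join(lines))
```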
Parallel Performance Enhancements … AMG Solver
• When to use the AMG solver
  • Structural static and transient analyses
  • Nonlinear analyses
  • Large aspect ratio elements, reduced integration elements
  • Models combining shells / solids / beams
  • Shared memory parallel machines
• When not to use the AMG solver
  • Non-structural problems (it works but is less efficient)
  • Models made only of SHELL63 elements (these do not seem to be as CPU efficient as with PCG)
Parallel Performance Enhancements … AMG Solver
Memory / Disk requirements
• 1.3 to 2 times more memory than the PCG solver
• Rule of thumb is 130 MB per 100,000 DOFs for SOLID92 elements
• Memory required is also a function of the number of processors used (overhead)
• Files created during an AMG solution are very similar to those of PCG and about the same size
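The rule of thumb translates directly into a rough memory estimate. The helper below is a hypothetical convenience, and the 1,000,000-DOF model is an invented example; the estimate ignores the per-processor overhead mentioned above, for which the slides give no figure.

```python
def estimate_amg_memory_mb(n_dofs, mb_per_100k_dofs=130.0):
    """Rough AMG memory estimate for SOLID92 models from the rule of thumb.

    Does not include the additional overhead that grows with processor count.
    """
    return n_dofs / 100_000 * mb_per_100k_dofs


memory_mb = estimate_amg_memory_mb(1_000_000)
print(f"estimated memory: {memory_mb:.0f} MB")
```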