Progress in Unstructured Mesh Techniques Dimitri J. Mavriplis Department of Mechanical Engineering University of Wyoming and Scientific Simulations Laramie, WY
Overview • NSU3D Unstructured Multigrid Navier-Stokes Solver • 2nd-order finite-volume discretization • Fast steady-state solutions (~100M pts in 15 minutes on the NASA Columbia Supercomputer) • Extension to Design Optimization • Extension to Aeroelasticity • Enabling techniques: Accuracy and Efficiency • High-Order Discontinuous Galerkin Methods (Longer term) • High-accuracy discretizations through increased p order • Fast combined h-p multigrid solver • Steady-State (2-D and 3-D Euler) • Unsteady Time-Implicit (2-D Euler)
NSU3D Discretization • Vertex based unstructured meshes • Finite volume / finite element • Arbitrary Elements • Single edge-based data structure • Central Difference with matrix dissipation • Roe solver with MUSCL reconstruction
NSU3D Spatial Discretization • Mixed Element Meshes • Tetrahedra, Prisms, Pyramids, Hexahedra • Control Volume Based on Median Duals • Fluxes based on edges • Single edge-based data-structure represents all element types
Mixed-Element Discretizations • Edge-based data structure • Building block for all element types • Reduces memory requirements • Minimizes indirect addressing / gather-scatter • Graph of grid = discretization stencil • Implications for solvers and partitioners (see the sketch below)
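A minimal sketch of the edge-based idea above (hypothetical names, plain NumPy; not the actual NSU3D kernel): one loop over the edge list accumulates fluxes into both adjacent control volumes, regardless of which element types generated the edges.

```python
import numpy as np

def edge_based_residual(edges, normals, u, flux):
    """Accumulate a flux residual using only an edge list.

    edges   : (nedges, 2) int array of vertex pairs (the graph of the grid)
    normals : (nedges, 3) median-dual face normals, absorbing all element types
    u       : (nvertices, nvars) solution stored at the vertices
    flux    : callable (uL, uR, n) -> numerical flux across the dual face
    """
    res = np.zeros_like(u)
    for (i, j), n in zip(edges, normals):
        f = flux(u[i], u[j], n)   # one flux per edge, element type already absorbed
        res[i] += f               # flux leaves control volume i ...
        res[j] -= f               # ... and enters control volume j
    return res
```

Because the edge list is exactly the graph of the grid, the same structure defines the discretization stencil and is what the solvers and partitioners operate on.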
NSU3D Convergence Acceleration Methods for Steady-State (and Unsteady) Problems • Multigrid Methods • Fully automated agglomeration techniques • Provide convergence rates independent of grid size (usually < 500 MG cycles) • Implicit Line Solver • Used on each MG level • Reduces stiffness due to grid anisotropy in the boundary layer • No wall functions
Multigrid Methods • High-frequency (local) error rapidly reduced by explicit methods • Low-frequency (global) error converges slowly • On coarser grids, low-frequency error appears as high-frequency error and is again rapidly reduced
Coarse Level Construction • Agglomeration multigrid solvers for unstructured meshes • Coarse-level meshes constructed by agglomerating fine-grid cells/equations (a greedy sketch follows below)
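A greedy sketch of the agglomeration step, assuming a simple vertex-adjacency map (the production algorithm is fully automated and more careful about seed selection and coarse-cell shape):

```python
def agglomerate(adjacency):
    """Greedy agglomeration: each unassigned vertex seeds a coarse cell
    and absorbs its still-unassigned neighbors.

    adjacency : dict mapping vertex -> list of neighbor vertices
    returns   : dict mapping fine vertex -> coarse cell id
    """
    coarse, ncoarse = {}, 0
    for v in adjacency:
        if v in coarse:
            continue
        coarse[v] = ncoarse              # seed a new coarse control volume
        for w in adjacency[v]:
            if w not in coarse:
                coarse[w] = ncoarse      # absorb unassigned neighbors
        ncoarse += 1
    return coarse
```

Applying the same procedure recursively to the coarse graph yields the full multigrid hierarchy.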
Anisotropy-Induced Stiffness • Convergence rates for RANS (viscous) problems much slower than for inviscid flows • Mainly due to grid stretching • Thin boundary-layer and wake regions • Mixed-element (prism-tet) grids • Use directional solver to relieve stiffness • Line solver in anisotropic regions
Directional Solver for Navier-Stokes Problems • Line solvers for anisotropic problems • Lines constructed in the mesh using a weighted-graph algorithm • Strong connections assigned large graph weight • (Block) tridiagonal line solver, similar to structured-grid solvers (see the sketch below)
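The solve along each line is a (block) tridiagonal system, just as on structured grids. A scalar Thomas-algorithm sketch of that step (the actual solver uses the block-tridiagonal form):

```python
import numpy as np

def line_solve(a, b, c, d):
    """Thomas algorithm for the tridiagonal system along one implicit line
    (scalar version; the solver itself uses the block form).

    a, b, c : sub-, main-, and super-diagonals (a[0] and c[-1] unused)
    d       : right-hand side along the line
    """
    n = len(d)
    b, d = np.array(b, dtype=float), np.array(d, dtype=float)
    for i in range(1, n):             # forward elimination sweep
        m = a[i] / b[i - 1]
        b[i] -= m * c[i - 1]
        d[i] -= m * d[i - 1]
    x = np.empty(n)
    x[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):    # back-substitution sweep
        x[i] = (d[i] - c[i] * x[i + 1]) / b[i]
    return x
```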
Multigrid Line Solver Convergence • DLR-F4 wing-body, Mach = 0.75, incidence = 1°, Re = 3M • Baseline mesh: 1.65M pts
Parallelization through Domain Decomposition • Intersected edges resolved by ghost vertices • Generates communication between original and ghost vertex • Handled using MPI and/or OpenMP (Hybrid implementation) • Local reordering within partition for cache-locality • Multigrid levels partitioned independently • Match levels using greedy algorithm • Optimize intra-grid communication vs inter-grid communication
Partitioning • (Block) tridiagonal line solver inherently sequential • Contract graph along implicit lines • Weight edges and vertices • Partition contracted graph • Decontract graph • Guarantees lines are never broken • Possible small increase in imbalance/cut edges (sketch below)
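A sketch of the contraction step under the assumption that lines arrive as vertex lists; partitioning the contracted, weighted graph (e.g. with METIS) and then decontracting guarantees that no line straddles a partition boundary.

```python
def contract_lines(nvertices, edges, lines):
    """Contract the grid graph along implicit lines prior to partitioning.

    edges : iterable of (i, j) vertex pairs
    lines : list of vertex lists, one per implicit line
    returns (vertex weights, contracted edge list, fine-to-coarse map)
    """
    fine2coarse = {}
    for cid, line in enumerate(lines):
        for v in line:
            fine2coarse[v] = cid         # whole line -> one contracted vertex
    nc = len(lines)
    for v in range(nvertices):           # vertices not on any line stay solo
        if v not in fine2coarse:
            fine2coarse[v], nc = nc, nc + 1
    weights = [0] * nc                   # vertex weight = work it represents
    for v in range(nvertices):
        weights[fine2coarse[v]] += 1
    cedges = {tuple(sorted((fine2coarse[i], fine2coarse[j])))
              for i, j in edges if fine2coarse[i] != fine2coarse[j]}
    return weights, sorted(cedges), fine2coarse
```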
NASA Columbia Supercluster • 20 SGI Altix nodes • 512 Itanium2 cpus each • 1 TByte memory each • 1.5 GHz / 1.6 GHz • Total: 10,240 cpus • 3 interconnects • SGI NUMAlink (shared memory within a node) • InfiniBand (across nodes) • 10Gig Ethernet (file I/O) • Subsystems: • 8 nodes: double-density Altix 3700BX2 • 4 nodes: NUMAlink4 interconnect between nodes • BX2 nodes, 1.6 GHz cpus
NSU3D TEST CASE • Wing-Body Configuration • 72 million grid points • Transonic Flow • Mach=0.75, Incidence = 0 degrees, Reynolds number=3,000,000
NSU3D Scalability • 72M pt grid • Assume perfect speedup on 128 cpus • Good scalability up to 2008 cpus using NUMAlink • Superlinear! • Multigrid slowdown due to coarse-grid communication • ~3 TFlops on 2008 cpus
Single Grid Performance up to 4016 cpus • 1 OpenMP thread per MPI process possible for InfiniBand on 2008 cpus (8 hosts) • 2 OpenMP threads required for InfiniBand on 4016 cpus (8 hosts) • Good scalability up to 4016 cpus • 5.2 TFlops at 4016 cpus • First real-world application on Columbia using > 2048 cpus
Unstructured NS Solver / NASA Columbia Supercomputer • ~100M pt solutions in 15 minutes • 10⁹ pt solutions can become routine • Ease other bottlenecks (I/O for 10⁹ pts = 400 GB) • High-resolution MDO • High-resolution aeroelasticity
Enabling Techniques • Design Optimization • Robust mesh deformation (linear elasticity) • Discrete adjoint for flow equations • Discrete adjoint for mesh motion equations • Mesh sensitivities (Park and Nielsen) • Line-implicit agglomeration multigrid solver • Flow, flow adjoint, mesh motion, mesh adjoint • Duality-preserving formulation • Adjoint discretization requires almost no additional memory over the first-order Jacobian used for the implicit solver • Modular (subroutine) construction for adjoint and mesh sensitivities: dR/dx = (∂R/∂x_edge)(∂x_edge/∂x) (see the sketch below) • Similar convergence rates for tangent and adjoint problems
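The modular chain-rule bullet can be sketched abstractly: the adjoint applies the transposed factors of dR/dx in reverse order, which is why each piece can live in its own subroutine. Dense matrices below are purely illustrative; in practice these would be matrix-free edge loops.

```python
import numpy as np

def mesh_gradient_adjoint(dR_dedge, dedge_dx, lam):
    """Adjoint accumulation of mesh sensitivities, mirroring
    dR/dx = (dR/d edge) (d edge/dx) with transposed factors.

    dR_dedge : (nres, nedge)  residual sensitivity to edge coefficients
    dedge_dx : (nedge, nx)    edge-coefficient sensitivity to node coordinates
    lam      : (nres,)        flow adjoint solution
    """
    w = dR_dedge.T @ lam      # edge-level adjoint (transposed subroutine 1)
    return dedge_dx.T @ w     # objective gradient w.r.t. mesh (transposed 2)
```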
Enabling Techniques • Aeroelasticity • Robust mesh deformation (linear elasticity) • Line-implicit agglomeration multigrid solver • Flow (implicit time step), mesh motion • Linear multigrid formulation • High-order temporal discretization • Backward difference (up to 3rd order) • Implicit Runge-Kutta (up to 4th order) • Formulation of Geometric Conservation Law for high-order time stepping • Necessary for non-linear stability
Mesh Motion • Developed for MDO and Aeroelasticity Problems • Emphasis on Robustness • Spring Analogy • Truss Analogy, Beam Analogy • Linear Elasticity: Variable Modulus • Emphasis on Efficiency • Edge based formulation • Gauss Seidel Line Solver with Agglomeration Multigrid • Fully integrated into flow solver
Formulation • Mesh motion strategies • Tension spring analogy: reduces to a Laplace equation; obeys a maximum principle but is incapable of reproducing solid-body rotation • Truss analogy (Farhat et al., 1998)
Formulation • Linear elasticity equations • Prescription of E very important • Reproduces solid-body translation/rotation for stiff-E regions • Prescribe large E in critical regions • Relegates deformation to less critical regions of the mesh (one possible recipe is sketched below)
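One possible recipe for the variable modulus, stated here as an assumption rather than the exact formula used in the solver: make E inversely proportional to cell volume, so the smallest cells (boundary layer, near-body) become nearly rigid and deformation is pushed into the large far-field cells.

```python
import numpy as np

def prescribe_modulus(cell_volumes, E_max=1.0e4, E_min=1.0):
    """Hypothetical variable-modulus prescription: E ~ 1/volume,
    scaled so the smallest cell gets E_max and large cells tend to E_min."""
    E = E_min + (E_max - E_min) * cell_volumes.min() / cell_volumes
    return np.clip(E, E_min, E_max)
```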
Results and Discussion • Mesh motion strategies for 2D viscous mesh (figure panels: truss, spring, linear elasticity with variable E, linear elasticity with constant E)
IMPORTANT DIFFERENCES (FUN3D) • Navier (elasticity) equations for displacement • Derived assuming constant E • Variations only in Poisson ratio
Method of Solution • Linear elasticity equations can be difficult to solve • Apply same techniques as for the flow solver • Linear agglomeration multigrid (LMG) method • Line-implicit solver • Using same line/AMG structures (cycle sketched below)
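The shape of the LMG cycle, sketched with a Jacobi smoother for brevity (the actual method smooths with line-implicit sweeps on agglomerated levels):

```python
import numpy as np

def v_cycle(levels, b, x, nu=2):
    """One linear multigrid V-cycle.

    levels : list of (A, P) pairs per level, P prolongating from the
             next-coarser level; P is None on the coarsest level.
    """
    A, P = levels[0]
    if P is None:
        return np.linalg.solve(A, b)        # exact solve on coarsest level
    D = np.diag(A)
    for _ in range(nu):                     # pre-smoothing (Jacobi stand-in)
        x = x + (b - A @ x) / D
    r = P.T @ (b - A @ x)                   # restrict residual (agglomeration sum)
    e = v_cycle(levels[1:], r, np.zeros(P.shape[1]), nu)
    x = x + P @ e                           # prolongate and apply correction
    for _ in range(nu):                     # post-smoothing
        x = x + (b - A @ x) / D
    return x
```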
Method of Solution • Agglomeration multigrid
Method of Solution • Line-implicit solver (strong coupling along lines)
Results and Discussion • Line solver + MG4, first 10 iterations (figure: iterations 0 through 10) • Viscous mesh, linear elasticity with variable E
3D Dynamic Meshes (NS mesh) DLR wing-body configuration, 473,025 vertices
Results and Discussion • Convergence rates for different iterative methods (figures: 2D viscous mesh and 3D viscous mesh, linear elasticity)
Unsteady Flow Solver Formulation • Flow governing equations in Arbitrary Lagrangian-Eulerian (ALE) form: d/dt ∫_{V(t)} u dV + ∮_{∂V(t)} [F(u) − ẋ u]·n dS = 0 • After discretization (in space): d(Vu)/dt + R(u, x, ẋ) = 0
Unsteady Flow Solver Formulation • Flow governing equations in Arbitrary Lagrangian-Eulerian (ALE) form • GCL: maintain uniform flow exactly as a discrete solution • Discretely: dV/dt = ∮_{∂V(t)} ẋ·n dS (volume change balances face-swept volume)
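A minimal discrete check of that statement for a single moving control volume (hypothetical signature): the volume change over a step must equal the total face-swept volume, otherwise a uniform state would not remain an exact discrete solution.

```python
def gcl_residual(vol_new, vol_old, dt, face_speeds, face_areas):
    """Discrete GCL defect for one control volume; zero when the GCL holds.

    face_speeds : normal grid velocities (x_t . n) at each face
    face_areas  : matching face areas
    """
    swept = sum(s * a for s, a in zip(face_speeds, face_areas))
    return (vol_new - vol_old) / dt - swept
```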
Implicit Runge-Kutta Schemes • Dahlquist barrier: no A-stable (and L-stable) linear multistep method above 2nd order (BDF2) • BDF3 often works… but is not unconditionally stable • For higher order: implicit Runge-Kutta schemes • Backward difference (BDF2, BDF3) and implicit Runge-Kutta (up to 4th order in time) previously compared for unsteady flows with static grids • For moving grids, must obey the Geometric Conservation Law (GCL) • 2nd- and 3rd-order BDF-GCL relatively straightforward • How to construct high-order Runge-Kutta GCL schemes? (a generic DIRK step is sketched below)
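For reference, a generic two-stage, L-stable SDIRK step (a textbook scheme, not necessarily the IRK variant used in this work): each stage reduces to a single backward-Euler-like implicit solve, which is what makes diagonally implicit RK practical for flow solvers.

```python
import numpy as np
from scipy.optimize import fsolve

GAMMA = 1.0 - 1.0 / np.sqrt(2.0)                   # L-stable SDIRK2 coefficient
A = np.array([[GAMMA, 0.0], [1.0 - GAMMA, GAMMA]])
B = np.array([1.0 - GAMMA, GAMMA])

def sdirk2_step(f, u, t, dt):
    """Advance du/dt = f(t, u) one step; u is a NumPy array."""
    k = []
    for i in range(2):
        known = u + dt * sum(A[i, j] * k[j] for j in range(i))
        ti = t + dt * A[i].sum()                   # stage time c_i
        g = lambda ki, kn=known, ti=ti: ki - f(ti, kn + dt * GAMMA * ki)
        k.append(fsolve(g, f(t, u)))               # one implicit solve per stage
    return u + dt * (B[0] * k[0] + B[1] * k[1])

# e.g. one step of du/dt = -u from u(0) = 1:
# sdirk2_step(lambda t, u: -u, np.array([1.0]), 0.0, 0.1)
```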
New Approach to GCL • Use APPROXIMATE face velocities evaluated at the RK quadrature points to respect the GCL while still maintaining design accuracy • i.e. for low-order schemes: ẋ = (x^{n+1} − x^n)/Δt • For high-order RK: solve the system given by the discrete GCL (DGCL) at each time step
2D Example • Periodic pitching NACA0012 (motion exaggerated) • Mach = 0.755, AoA = 0.016° ± 2.51° • RK accurate with large time steps
2D Pitching Airfoil • Error measured as RMS difference in all flow variables between the solution integrated from t = 0 to t = 54 and a reference solution at t = 54 • Reference solution: RK64 with 256 time steps/period • Slopes of accuracy curves (see fit sketch below): • BDF2: 1.9 • RK64: 3.5
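The reported slopes are measured orders of accuracy; given errors at a few step sizes they follow from a log-log fit:

```python
import numpy as np

def observed_order(errors, dts):
    """Least-squares slope of log(error) vs log(dt), i.e. the measured
    temporal order (compare BDF2 ~ 1.9 and RK64 ~ 3.5 above)."""
    return np.polyfit(np.log(np.asarray(dts)), np.log(np.asarray(errors)), 1)[0]
```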
3D Example: Twisting ONERA M6 Wing • Mach = 0.755, AoA = 0.016° ± 2.51° • Reduced frequency = 0.1628
3D Validation (RK64 + GCL) • Twisting ONERA M6 wing • Same error measure as in 2D (reference solution: 128 steps/period) • IRK64 enables very large time steps • Slopes of error curves: BDF2 = 2.0, RK64 = 3.3
AGARD Wing Aeroelastic Test Case • Modal analysis (figures: 1st, 2nd, 3rd, and 4th structural modes)
AGARD Wing Aeroelastic Test Case • First 4 structural modes • Coarse Euler simulation • 45,000 points, 250K cells • Linear elasticity mesh motion • Multigrid solver • 2nd-order BDF time stepping • Multigrid solver • Flow/structure solved fully coupled at each implicit time step • 2 hours on 1 cpu per analysis run
Flutter Boundary Prediction (figures: flutter boundary; generalized displacements)
Current and Future Work • Investigate benefits of Implicit Runge-Kutta • 4th order temporal accuracy • Investigate optimal time-step size and convergence criteria • Develop automated temporal-error control scheme • Viscous simulations, Finer Meshes • 5M pt Unsteady Navier-Stokes solutions : • 2-4 hours on 128 cpus of Columbia • Adjoint for unsteady problems • Time domain • Frequency domain
Higher-Order Methods • Simple asymptotic arguments indicate benefit of higher-order discretizations • Most beneficial for: • High accuracy requirements • Smooth functions
Motivation • Higher-order methods successes • Acoustics • Large Eddy Simulation (structured grids) • Other areas • High-order methods not demonstrated in: • Aerodynamics, Hydrodynamics • Unstructured mesh LES • Industrial CFD • Cost effectiveness not demonstrated: • Cost of discretization • Efficient solution of complex discrete equations
Motivation • Discretizations well developed • Spectral methods, spectral elements • Streamline Upwind Petrov-Galerkin (SUPG) • Discontinuous Galerkin • Most implementations employ explicit or semi-implicit time stepping • e.g. multi-stage Runge-Kutta • Need efficient solvers for: • Steady-state problems • Time-implicit problems
Multigrid Solver for Euler Equations • Develop efficient solvers (O(N)) for steady-state and time-implicit high-order spatial discretizations • Discontinuous Galerkin • Well suited for hyperbolic problems • Compact-element-based stencil • Use of Riemann solver at inter-element boundaries • Reduces to 1st order finite-volume at p=0 • Natural extension of FV unstructured mesh techniques • Closely related to spectral element methods
Discontinuous Galerkin (DG) • Weak form per element E: ∫_E φ ∂u/∂t dV − ∫_E ∇φ·F(u) dV + ∮_{∂E} φ F*(u⁻, u⁺)·n dS = 0 • Mass matrix (element-based) from the first term • Spatial (convective, or stiffness) matrix (element-based) from the second • Element-boundary (edge) matrix from the Riemann flux F* in the third
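As a concrete one-dimensional instance of these operators (an illustration, not the 3-D implementation), the sketch below assembles the diagonal mass matrix and the convective (stiffness) matrix for a Legendre modal basis on the reference element; the element-boundary (edge) matrix would add the Riemann-flux coupling between neighboring elements.

```python
import numpy as np
from numpy.polynomial import legendre

def dg_element_matrices(p):
    """Mass and convective matrices for a 1-D DG element of order p
    with Legendre modes on [-1, 1]."""
    # Legendre orthogonality makes the mass matrix diagonal
    M = np.diag([2.0 / (2 * k + 1) for k in range(p + 1)])
    # Stiffness: S[k, l] = integral of phi_k'(x) * phi_l(x) over [-1, 1]
    x, w = legendre.leggauss(p + 2)                 # quadrature exact here
    phi  = np.array([legendre.Legendre.basis(k)(x) for k in range(p + 1)])
    dphi = np.array([legendre.Legendre.basis(k).deriv()(x) for k in range(p + 1)])
    S = (dphi * w) @ phi.T
    return M, S
```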