COMPUTATIONAL ELEMENTS FOR VERY LARGE-SCALE, HIGH-FIDELITY AERODYNAMIC ANALYSIS AND DESIGN
November 20, 2006
Chongam Kim (김종암), Aerodynamic Simulation & Design Lab.
School of Mechanical and Aerospace Engineering, Seoul National University
Contents
• Introduction
• Aerodynamic Solvers for High Performance Computing
- Characteristics of International Standard Codes
• Essential Elements for Teraflops CFD
- High-Fidelity Numerical Methods for Flow Analysis and Design
- Parallel Efficiency Enhancement
- Geometric Representation for Complex Geometry
• Some Examples
• Conclusion
Introduction - Bio & Astrophysics
• [Simulation of supernovae] ORNL (Oak Ridge National Laboratory): researchers using an ORNL supercomputer found that the organized flow beneath the shock wave in a previous two-dimensional model of a stellar explosion persists in three dimensions.
• [Molecules in motion, 10.4 teraflops] SDSC (San Diego Supercomputer Center): understanding how molecules naturally behave inside cells, and predicting how the molecules might react to the presence of prospective drugs.
• [Computationally predicting protein structures] ORNL: a protein structure predicted at ORNL (left) and the actual structure, determined experimentally (right).
• [Blood-flow patterns at an instant during the systolic cycle] CITI (Computer and Information Technology Institute)
• [2-D Rayleigh-Taylor instability] FLASH Center / Pittsburgh Supercomputing Center
Introduction - Weather Forecasting
• [Global atmospheric circulation] DKRZ (Deutsches Klimarechenzentrum GmbH), the German High Performance Computing Centre for Climate and Earth System Research: animation of one month of "simulated weather" with a global atmosphere model.
• [Typhoon ETAU in 2003] Earth Simulator Center: result of a non-hydrostatic, ultrahigh-resolution coupled atmosphere-ocean model; 26.58 Tflops was obtained by a global atmospheric circulation code.
• [Global ocean circulation] DKRZ: 3-D particles/streamlines colored by temperature visualize important features of the annual mean ocean circulation.
• [Twin typhoons over the Philippine Sea] Earth Simulator Center
Introduction - Aerospace & Other Related Fields
• [Numerical simulation of the hydro-aerodynamic effects around the Shosholoza boat, with the aim of obtaining an optimal design] Scientific Supercomputing Center, Karlsruhe University
• [Full SSLV configuration] NASA Columbia supercomputer
• [Aerodynamic simulation around a SAUBER PETRONAS C23] SAUBER PETRONAS, Switzerland
• [Bio-agent blast dispersion simulations] DTRA (Defense Threat Reduction Agency)
Introduction - System Architecture
• Primary factors in computing speed
- CPU clock speed
- Number of instructions per clock cycle
• CPU clock speed is expressed in Hz: cycles per second
• 1 Tflops = one trillion floating-point operations per second
• Examples
- Pentium Xeon 2.4 GHz: 2.4 GHz × 2 (Hyper-Threading) = 4.8 Gflops
- IA-64 (Itanium) 1.4 GHz: 1.4 × 2 (Hyper-Threading) × 2 (instructions) = 5.6 Gflops
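In formula form, the peak-rate arithmetic used in the examples above is the usual simplified model (sustained application performance is considerably lower than this peak):

```latex
\text{Peak rate [Flops]} \;=\; f_{\text{clock}} \times N_{\text{instructions/cycle}} \times N_{\text{threads}},
\qquad \text{e.g. } 1.4\,\text{GHz} \times 2 \times 2 = 5.6\,\text{Gflops}.
```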
Computing Power Nowadays
• Top500 list (June 2006)
- Fastest machine: BlueGene/L by IBM (at DOE/NNSA/LLNL)
- Ranked #500: 2.026 Tflops
- The era of teraflops computing has already arrived!
• BlueGene/L
- 100,000+ processors
- Performance: 280.6 teraflops
Introduction - Application Characteristics
• Aerospace engineering
- Memory usage is higher than hard-disk usage
- Requires high-speed CPUs and high-speed I/O
- Sensitive to network speed
• Mechanical engineering
- Explicit problems: CPU performance and network speed are important
- Implicit problems: require high-speed I/O and large memory storage
• Physical science
- Monte Carlo: high dependence on network performance
• Chemical science
- Molecular dynamics: CPU performance and network speed are important; low dependence on memory size and on I/O capacity and speed
- Quantum dynamics: CPU performance, network speed, and large memory storage are important
• Life science
- Protein folding: CPU speed and memory size matter somewhat
• Astronomy
- Computing performance is sensitive to CPU speed and network speed (enormous influence in pre-processing and post-processing)
Specialized High-Performance Baseline Codes
• Standard flow solvers at NASA (USA)
- Full potential: CAPTSD
- Block structured: CFL3D, TLNS3D-MB, PAB3D, GASP, LAURA, VULCAN
- Overset structured: OVERFLOW
- Unstructured: FUN3D, USM3D, 3D3U
• Other flow solvers
- MIRANDA: high-order hydrodynamics code for computing instabilities and turbulent mixing; developed at LLNL (Lawrence Livermore National Laboratory)
- AVBP: compressible flow solver running on unstructured and hybrid grids; developed by CERFACS, France
Aerodynamic Solvers for High Performance Computing (USA)
• General features of OVERFLOW
- Right-hand side options: central differencing with Jameson 4/2 dissipation; Roe upwinding
- Left-hand side options: Pulliam-Chaussee diagonalized scheme; LU-SGS scheme
- Low-Mach-number preconditioning
- First-order implicit time advance
- Convergence acceleration options: time-accurate mode or local time-step scaling; grid sequencing; multigrid
• Performance test
- Block-structured overset grid with 126 million grid points in total, 2,000 time steps
- Weak scaling: about 123,000 mesh points per processor
- Efficiency: about 70% with 1,024 processors (compared to 64 processors)
Aerodynamic Solvers for High Performance Computing (USA)
• General features of CFL3D
- 2-D or 3-D grid topologies
- Inviscid, laminar, and/or turbulent flows
- Steady or unsteady (including moving-grid) flows
- Spatial discretization: van Leer's FVS, Roe's FDS
- Time integration: implicit approximate factorization, dual time stepping
- High-order interpolation & limiting: TVD MUSCL
- Multiple-block options: 1-to-1 blocking, patching, overlapping, embedding
- Convergence acceleration options: multigrid, mesh sequencing
- Turbulence model options: Baldwin-Lomax; Baldwin-Lomax with Degani-Schiff modification; Baldwin-Barth; Spalart-Allmaras (including DES option); Wilcox k-omega; Menter's k-omega SST; Abid k-epsilon; Explicit Algebraic Stress Model (EASM); k-enstrophy
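For context on the "TVD MUSCL" entry above, here is a minimal 1-D sketch of MUSCL reconstruction with a minmod limiter; this illustrates the general technique only, not CFL3D's actual interpolation and limiting routines:

```python
import numpy as np

def minmod(a, b):
    """Minmod limiter: picks the smaller of two slopes, zero at extrema (TVD)."""
    return np.where(a * b > 0.0, np.where(np.abs(a) < np.abs(b), a, b), 0.0)

def muscl_interface_states(u):
    """Second-order MUSCL reconstruction of left/right states at cell faces.

    u: cell-average values, shape (n,). Returns (uL, uR) at the n-1
    interior faces between cells i and i+1; boundary cells stay first order.
    """
    du = np.diff(u)                        # u[i+1] - u[i]
    slope = np.zeros_like(u)
    slope[1:-1] = minmod(du[:-1], du[1:])  # limited slope per interior cell
    uL = u[:-1] + 0.5 * slope[:-1]         # extrapolate from the left cell
    uR = u[1:] - 0.5 * slope[1:]           # extrapolate from the right cell
    return uL, uR
```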
Aerodynamic Solvers for High Performance Computing (USA)
• PETSc-FUN3D (NASA)
• Code features
- FUN3D code attached to the PETSc framework
- Tetrahedral, vertex-centered unstructured code
- Spatial discretization with the Roe scheme
- Galerkin discretization for the viscous terms
- Time integration: pseudo-transient Newton-Krylov-Schwarz, with block incomplete factorization on each subdomain of the Schwarz preconditioner
- Used for design optimization of airplanes, automobiles, and submarines with irregular meshes
• Performance test
- Unstructured mesh with 2.7 million vertices, 18 million edges
- Weak scaling
- Performance: nearly scalable with O(1000) processors
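Pseudo-transient continuation typically grows the pseudo-time step as the nonlinear residual falls, for example by switched evolution/relaxation (SER). A common form is shown below; this is a standard textbook rule stated for orientation, not necessarily the exact update used in PETSc-FUN3D:

```latex
\Delta t^{\,n+1} \;=\; \Delta t^{\,n}\,
\frac{\lVert R(u^{\,n-1}) \rVert}{\lVert R(u^{\,n}) \rVert},
```

so that as the residual norm approaches zero, the iteration approaches a pure Newton method.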
Aerodynamic Solvers for High Performance Computing (USA)
• MIRANDA (LLNL)
• Code features
- High-order hydrodynamics code for computing instabilities and turbulent mixing
- Conducts direct numerical simulation and large-eddy simulation
- FFTs and band-diagonal matrix solvers for spectrally accurate derivatives
- Used to study Rayleigh-Taylor (R-T) and Richtmyer-Meshkov (R-M) instabilities
• Performance test
- Weak-scaling parallel efficiency of nearly 100% with 128K processors
- Strong scaling shows good efficiency with 64K processors (compared to performance with 8K processors)
- All-to-all communication gives good performance
[Figures: turbulent flow mixing of two fluids (LES of R-T instability); efficiency with strong scaling]
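A minimal sketch of the FFT-based, spectrally accurate differentiation that such codes rely on, for the periodic 1-D case (illustrative only, not MIRANDA's actual routines):

```python
import numpy as np

def spectral_derivative(f, L):
    """Spectrally accurate d/dx of periodic samples f on a domain of length L.

    Differentiation becomes multiplication by i*k in Fourier space.
    """
    n = f.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=L / n)   # angular wavenumbers
    return np.real(np.fft.ifft(1j * k * np.fft.fft(f)))

# Example: the derivative of sin(x) on [0, 2*pi) is cos(x) to machine precision.
x = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
err = np.max(np.abs(spectral_derivative(np.sin(x), 2.0 * np.pi) - np.cos(x)))
```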
Aerodynamic Solvers for High Performance Computing (Europe)
• AVBP (CERFACS)
• Code features
- Parallel CFD code for the laminar and turbulent compressible Navier-Stokes equations on unstructured and hybrid grids
- Unsteady reacting flow analysis based on the LES approach
- Built upon a modular software library including integrated parallel domain partitioning and data reordering tools, a message-passing handler, supporting routines for dynamic memory allocation, and routines for parallel I/O and iterative methods
• Performance
- Nearly 100% parallel efficiency with 4K processors (on BlueGene/L), strong-scaling case
- Code may run in the range of O(1000)s of processors
Aerodynamic Solvers for High Performance Computing
• Efficiency of various applications including CFD
- From BlueGene/L reports
- Both weak-scaling and strong-scaling parallelism
※ Weak scaling: same domain size in each processor
※ Strong scaling: same domain size in total
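The two efficiency measures referenced in the footnotes are conventionally defined as follows (standard definitions, stated here for completeness):

```latex
E_{\text{strong}}(p) \;=\; \frac{T_1}{\,p\,T_p\,},
\qquad
E_{\text{weak}}(p) \;=\; \frac{T_1}{T_p},
```

where $T_p$ is the wall-clock time on $p$ processors; for weak scaling the total problem size grows proportionally with $p$.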
Essential Elements for Teraflops CFD - High-Fidelity Numerical Method
• Numerical flux scheme: accurate shock capturing
• Higher-order interpolation: resolving complex flow structures & vortices; enhanced accuracy of aerodynamic coefficients
• Convergence acceleration & adaptive grid techniques: reduced computational cost
[Figure: flow analysis over a helicopter full-body configuration, a very large-scale problem; N-S simulation around a helicopter fuselage with actuator disks, U.C. Davis Center for CFD]
Essential Elements for Teraflops CFD - High-Fidelity Numerical Method
• RoeM scheme
- Roe's FDS: sharp capturing of shock discontinuities; unstable in expansion regions (defect); carbuncle phenomena (defect)
- RoeM: Roe with E-fix, plus damping & feeding-rate control using a Mach-number-based function
- Shock stability (no carbuncle)
- Total enthalpy conservation
- Stability in expansion regions
- Exact capturing of contact discontinuities
- Accuracy comparable to Roe's FDS
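To make the "Roe with E-fix" starting point concrete, here is a minimal scalar (Burgers-equation) analogue of a Roe-type flux with Harten's entropy fix, which regularizes the vanishing wave speed in expansions. This is an illustrative sketch only; RoeM itself operates on the full Euler system with a Mach-number-based control function:

```python
import numpy as np

def roe_flux_burgers(uL, uR, delta=0.1):
    """Roe-type numerical flux for Burgers' equation, f(u) = u^2/2.

    The Roe-averaged wave speed is a = (uL + uR)/2. Harten's entropy
    fix replaces |a| near zero to avoid unphysical expansion shocks.
    """
    f = lambda u: 0.5 * u * u
    a = 0.5 * (uL + uR)                  # Roe-averaged wave speed
    abs_a = np.abs(a)
    # Entropy fix: smooth parabolic floor on |a| for |a| < delta.
    abs_a = np.where(abs_a < delta,
                     (a * a + delta * delta) / (2.0 * delta), abs_a)
    return 0.5 * (f(uL) + f(uR)) - 0.5 * abs_a * (uR - uL)
```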
Essential Elements for Teraflops CFD - High-Fidelity Numerical Method
• AUSMPW+ scheme
- AUSM+: splits the flux into a convective term and a pressure term; a hybrid form of FDS and FVS; oscillations near a wall or across a strong shock (defect)
- AUSMPW+: pressure wiggles cured by introducing weighting functions based on pressure
- Eliminates expansion shocks
- Eliminates oscillations and overshoots
- Reduced grid dependency
- Improved convergence behavior
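The convective/pressure splitting that the AUSM family builds on can be sketched for the 1-D Euler equations as below. This shows the basic AUSM-type splitting with van Leer Mach polynomials, not the AUSMPW+ pressure weighting itself:

```python
import numpy as np

GAMMA = 1.4

def split_mach(M, sign):
    """Van Leer split Mach numbers M^+ (sign=+1) / M^- (sign=-1)."""
    if abs(M) >= 1.0:
        return 0.5 * (M + sign * abs(M))
    return sign * 0.25 * (M + sign) ** 2

def split_pressure(M, sign):
    """Split pressure weights P^+ / P^- (polynomial subsonic form)."""
    if abs(M) >= 1.0:
        return 0.5 * (1.0 + sign * np.sign(M))
    return 0.25 * (M + sign) ** 2 * (2.0 - sign * M)

def ausm_flux(WL, WR):
    """Basic AUSM interface flux from left/right states W = (rho, u, p)."""
    def props(W):
        rho, u, p = W
        a = np.sqrt(GAMMA * p / rho)               # speed of sound
        H = a * a / (GAMMA - 1.0) + 0.5 * u * u    # total enthalpy
        return rho, u, p, a, H

    rhoL, uL, pL, aL, HL = props(WL)
    rhoR, uR, pR, aR, HR = props(WR)
    # Interface Mach number from split contributions of both sides.
    m = split_mach(uL / aL, +1) + split_mach(uR / aR, -1)
    # Convective part: upwinded according to the sign of m.
    phi = (np.array([rhoL, rhoL * uL, rhoL * HL]) * aL if m >= 0.0
           else np.array([rhoR, rhoR * uR, rhoR * HR]) * aR)
    # Pressure part acts only on the momentum equation.
    p_half = split_pressure(uL / aL, +1) * pL + split_pressure(uR / aR, -1) * pR
    return m * phi + np.array([0.0, p_half, 0.0])
```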
Essential Elements for Teraflops CFD - High-Fidelity Numerical Method
• M-AUSMPW+ scheme
- Proposes a criterion for accurate calculation of cell-interface fluxes
- Modified pressure splitting function
- Much more effective in computations of multi-dimensional flows
- Achieves completely monotonic characteristics
- Improved convergence characteristics
Essential Elements for Teraflops CFD - High-Fidelity Numerical Method
• Higher-order interpolation & oscillation control scheme: MLP
- TVD and ENO approaches are based on 1-D flow physics
- MLP (Multi-dimensional Limiting Process): higher-order interpolation with effective oscillation control in multiple dimensions
[Figures: MLP5 + M-AUSMPW+ on a 350 × 175 × 175 grid; feature 1: profile of separated vortex; feature 2: profile of swirls near the corner; feature 3: interacting profile of separated vortex & swirls; cutting planes at x = 0.842, y = 0.078 (the center of the primary separated vortex), and x = 0.8725]
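The multi-dimensional limiting condition can be stated compactly: the reconstructed value at each vertex must stay within the range of the neighboring cell averages (our paraphrase of the published MLP criterion):

```latex
\min_{j \in N(v)} \bar{q}_j \;\le\; q_v \;\le\; \max_{j \in N(v)} \bar{q}_j ,
```

where $q_v$ is the interpolated value at vertex $v$ and $N(v)$ is the set of cells sharing that vertex.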
Essential Elements for Teraflops CFD - High-Fidelity Numerical Method
• Multigrid: issues in hypersonic flows
- Non-linearity in shock regions causes robustness problems in prolongation
- Chemical reactions: time step restricted due to stiffness
• Solutions to these problems
- Modified implicit residual smoothing
- Damped prolongation & implicit treatment of the source term
• Test problem: nonequilibrium viscous flow, M∞ = 10, 60 km altitude
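For reference, a minimal two-grid correction cycle on a 1-D Poisson problem, showing the smooth-restrict-correct-prolong pattern that the more elaborate hypersonic multigrid above builds on. The "damped prolongation" item corresponds to scaling the coarse-grid correction before adding it (a sketch under simplified assumptions, not the flow-solver implementation):

```python
import numpy as np

def jacobi_smooth(u, f, h, iters=3, omega=2.0 / 3.0):
    """Damped Jacobi smoothing for -u'' = f with zero boundary values."""
    for _ in range(iters):
        u[1:-1] += omega * (0.5 * (u[:-2] + u[2:] + h * h * f[1:-1]) - u[1:-1])
    return u

def two_grid_cycle(u, f, h, damping=1.0):
    """One two-grid cycle on a grid with 2^k + 1 points.

    `damping` < 1 mimics a damped prolongation of the correction.
    """
    u = jacobi_smooth(u, f, h)                          # pre-smoothing
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] + (u[:-2] - 2.0 * u[1:-1] + u[2:]) / (h * h)  # residual
    rc = r[::2].copy()                                  # restrict (injection)
    ec = jacobi_smooth(np.zeros_like(rc), rc, 2.0 * h, iters=50)  # coarse solve
    e = np.interp(np.arange(u.size), np.arange(u.size)[::2], ec)  # prolong
    u += damping * e                                    # (damped) correction
    return jacobi_smooth(u, f, h)                       # post-smoothing
```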
Essential Elements for Teraflops CFD - Parallel Efficiency Enhancement
• Requirements for systems
- CPU: fewer, more powerful processors; better for efficiency, resource management, and fault prevention, but with higher power consumption and heat emission
- Memory: faster access & efficient management; the most important factor for CFD applications
- Network: multiple interconnection networks; separate channels for inter-processor communication and global communication (e.g., IBM BlueGene/L has 5 different communication types)
- I/O: unpredictable broken data; storage servers become overloaded during data writing, and broken ASCII data are sometimes observed
Essential Elements for Teraflops CFD - Parallel Efficiency Enhancement
• Requirements for software/programming
- Memory size: array ranges differ among processors; computing domains can differ in range even with the same number of mesh points (e.g., an 80×40 domain vs. a 40×80 domain); conventionally, a maximal static allocation such as Dimension X(80,80), Y(80,80), ... was used on every processor
- Remedy: variables stored in global memory (shared-memory systems); dynamic memory allocation in Fortran 90 (distributed-memory systems); a sketch follows below
- I/O: writing conducted in each processor; conventional programs gathered the entire data set onto one processor, requiring a large-size array allocation
- Etc.: optimized compiler options, highly functional debuggers, minimization of serial processing
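The slide's remedy is Fortran 90 dynamic allocation; the same per-rank pattern is sketched here in Python with mpi4py for illustration, reusing the slide's 80×40 / 40×80 example as a hypothetical decomposition:

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank allocates exactly its own subdomain size instead of the
# worst-case static Dimension X(80,80), Y(80,80) on every processor.
# Hypothetical decomposition: even ranks hold 80x40, odd ranks 40x80.
ni, nj = (80, 40) if rank % 2 == 0 else (40, 80)
X = np.empty((ni, nj))   # allocated per rank, not globally maximized
Y = np.empty((ni, nj))
```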
Essential Elements for Teraflops CFD - Parallel Efficiency Enhancement
• Requirements for algorithms
- Scalability enhancement
- Reduced global communication: global communication alongside inter-processor communication leads to synchronization problems; residual gathering and aerodynamic coefficient computation routines should be improved (see the sketch below)
- Dynamic load balancing: processor allocation for faster inter-processor communication; dynamic load balancing for changes in processor performance during computation
- Fault tolerance
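A sketch of the residual-gathering pattern the "reduced global communication" item refers to: compute the norm contribution locally and issue a single scalar collective, rather than funneling full arrays to one rank each iteration (mpi4py used for illustration):

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD

def global_residual_norm(local_res):
    """L2 norm of the residual over all ranks with one scalar Allreduce.

    Reducing one float per rank is far cheaper than gathering the whole
    residual field onto a single processor every iteration.
    """
    local_sq = float(np.dot(local_res, local_res))
    total_sq = comm.allreduce(local_sq, op=MPI.SUM)
    return np.sqrt(total_sq)
```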
Essential Elements for Teraflops CFD - Geometric Representation
• Multiple-body problems, complicated geometry
- Block topology is complicated for structured systems
- Grid generation is time-consuming
- Manual preprocessing is impossible
• Approaches
- Multiblock: preprocessor for partitioning & automatic detection of block topology
- Overset: preprocessor for automatic block connectivity; postprocessor; overset mesh generator
- Unstructured: automatic grid generator & grid adaptation method
Essential Elements for Teraflops CFD - Geometric Representation
• Multi-block system
- Modulation of the preprocessing code
- Evaluation of metrics and minimum wall distance, and their exchange
- Automatic detection of block topology
[Figure: flow analysis of a combustion chamber (N-S, 600,000 points, ASDL)]
Essential Elements for Teraflops CFD - Geometric Representation
• Overset mesh system
- Pre-processing to automatically find hole, fringe, and donor cells arising from complicated block connectivity (overlap optimization for PEGASUS)
- Post-processing for the evaluation of aerodynamic coefficients (zipper grid)
[Figure: overlapping meshes A, B, and C]
Essential Elements for Teraflops CFD - Geometric Representation
• Unstructured system
- Automatic grid generation code (Mavriplis et al., NASA Langley)
- Grid adaptation methods: subdivision method; adjoint-based adaptation method
Some Examples
• Multi-block system
- Parametric study in various flight conditions for aerospace engineering
- Parametric study of a missile with a side nozzle (N-S, M = 1.75)
[Figures: streamlines and iso-velocity surfaces (side nozzle, N-S, M = 1.0); jet off/on at AOA 0, 10, and 20]
Some Examples
• Multi-block system
- Flow analysis & design of turbulent intake flow using a multiblock system
[Figures: Mach contours; static pressure contours; total pressure contours in the duct section & streamlines]
Some Examples
• Design optimization based on large-scale computation
- Turbulent duct design with a multi-block mesh system
[Figures: baseline model vs. designed model]
Some Examples
• Overset mesh system
[Figures: manually assigned block connectivity vs. overlap-optimized block connectivity]
Some Examples
• Overset mesh system
[Figures: sectional comparisons at Y/SPAN = 18.5%, 23.8%, 33.1%, 40.9%, 51.2%, 63.6%, and 84.4%]
Some Examples
• Design optimization based on large-scale computation
- Redesign of the DLR-F4 wing/body configuration with an overset mesh system
[Figures: baseline vs. designed]
Some Examples
• Launch vehicle analysis with load balancing
- Parallel computation on the Grid
- 32 processors at Seoul National University & KISTI
- 3.5 million mesh points
Conclusion
• Current status
- Many disciplines are already conducting teraflops computing
- Teraflops computing in the CFD field has not yet been activated
• Issues and requirements
- High-fidelity numerical schemes for the description of complex flowfields
- Domain decomposition methods and parallel algorithms for efficiency enhancement and fault tolerance
- Automatic pre- & post-processing techniques in geometric representation to resolve complicated multiple-body problems
• Target CFD application areas
- Unsteady aerodynamics with massive flow separation
- MDO and fluid-structure interaction
- Multi-body aerodynamics with relative motion
- Multi-scale flow computation