Parallel Implementation of Impact Simulation
Seung Hoon Paik*, Ji-Joong Moon*, and Seung Jo Kim**
* School of Mechanical and Aerospace Engineering, Seoul National University, Seoul, Korea
** School of Mechanical and Aerospace Engineering, Flight Vehicle Research Center, Seoul National University, Seoul, Korea
Supercomputing Korea 2006, November 21, 2006
Outline
• Introduction
• Lagrangian Scheme
  - FE Calculation Parallelization
  - Contact Parallelization
  - Verification & Performance Evaluation: Taylor Impact Test, Oblique Impact of Metal Sphere
• Eulerian Scheme
  - Two-step Strategy
  - Remap Module
  - Eulerian Scheme Parallelization
  - Verification: 2-D/3-D Square Bar Impact
  - Parallel Performance Evaluation: 3-D Square Bar Impact
Introduction
• Impact problems include complex phenomena, so various numerical schemes are required [Scheffler, 2000].
• Lagrangian scheme: mesh velocity = material velocity; instability due to large distortion.
• Eulerian scheme: fixed mesh; ambiguous material interface.
• Applications of the Eulerian scheme in aerospace engineering: bird strike on a composite plate, impact on a fuel-filled wing [Hassen, 2006]; codes such as LS-DYNA [Anderson, 1999] and CTH.
Introduction
• Impact problem
  - Many time steps, complex contact, and nonlinear material behavior; a large-scale FE model of the whole structure requires long computing times → parallel computing.
  - Includes complex phenomena → various numerical schemes are required.
• Objectives
  - Development of an impact code based on the Eulerian and Lagrangian schemes.
  - Implementation of an efficient parallel algorithm and achievement of good performance.
5/27 IPSAP (Internet Parallel Structural Analysis Program)
• IPSAP (http://ipsap.snu.ac.kr)
  - IPSAP/Standard: Solver Engine (Linear Equation Solver, Eigenvalue Solver, Non-linear Solver under development) and FEM Modules
  - IPSAP/Explicit: Lagrangian Scheme and Eulerian Scheme
6/27 Linux Cluster Supercomputer: Pegasus System
• Racks of 20 nodes with multi-trunking (4 GB uplink): Nortel 380-24T (Gigabit) & Intel 24T (Fast)
• Gigabit Ethernet backbone: Nortel 5510-48T; Fast Ethernet: Intel 24T
• Each rack (20 nodes) has local Gigabit and local Fast Ethernet networks
• NFS & gatekeeper node connected to the external network
7/27 IPSAP/Explicit - Lagrangian Scheme
• FE Calculation Parallelization
• Contact Parallelization
• Verification & Performance Evaluation
  - Taylor Impact Test
  - Oblique Impact of Metal Sphere
8/27 Lagrangian Scheme: IPSAP/Explicit (Internet Parallel Structural Analysis Program)
• Explicit time integration with automatic time step control (see the sketch below)
• Material models: elastic, orthotropic, elastoplastic, Johnson-Cook
• EOS (equation of state): polynomial model, JWL, Grüneisen
• FE models: 8-node hexahedron and 4-node BLT shell, with 1-point integration and hourglass control
• Objective stress update: Jaumann rate stress update
• Artificial bulk viscosity
• Contact treatment: bucket-sorting contact search, master-slave algorithm, penalty method, single-surface (self) contact
• Element erosion and automatic update of the exterior contact surface
• MPP parallelization
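The first bullet names explicit time integration. As a minimal sketch only, assuming a generic central-difference update with a lumped (diagonal) mass matrix, one cycle could look like the following; the function and variable names are illustrative, not IPSAP/Explicit internals.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of one explicit central-difference cycle with a lumped mass
// matrix (one entry per degree of freedom); names are illustrative only.
void explicit_cycle(std::vector<double>& u,            // displacements
                    std::vector<double>& v,            // velocities at the half step
                    std::vector<double>& a,            // accelerations
                    const std::vector<double>& m,      // lumped masses
                    const std::vector<double>& f_ext,  // external forces
                    const std::vector<double>& f_int,  // internal forces from the element loop
                    double dt)                         // stable time step (chosen automatically)
{
    for (std::size_t i = 0; i < u.size(); ++i) {
        a[i]  = (f_ext[i] - f_int[i]) / m[i];  // M a^n = f_ext^n - f_int^n, M diagonal
        v[i] += dt * a[i];                     // v^{n+1/2} = v^{n-1/2} + dt a^n
        u[i] += dt * v[i];                     // u^{n+1}   = u^n       + dt v^{n+1/2}
    }
}
```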
9/27 FE Calculation Parallelization
[Figure: numbered interface nodes (1-9) shared between two subdomains, with the array structure of the send and receive buffers]
1. Each processor computes its own domain independently.
2. Interface values are swapped and added (see the sketch below).
• Each processor (or domain) knows
  - the list of processors that share a common interface, and
  - the list of nodes in each shared interface.
• These lists are built at the initialization stage and are not changed throughout the computation (static communication pattern).
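A minimal sketch of the static interface exchange described above ("swapped and added"): pack the interface values per neighbouring processor, exchange them with non-blocking MPI calls, then accumulate what was received. The `Interface` struct, the single value per node, and the tag choice are assumptions for illustration, not the actual IPSAP data structures.

```cpp
#include <cstddef>
#include <mpi.h>
#include <vector>

// One neighbouring processor and the fixed list of shared interface nodes.
struct Interface {
    int rank;                  // neighbouring processor
    std::vector<int> nodes;    // local ids of shared nodes (built once at initialization)
};

// Swap-and-add of a nodal vector f (one value per node here, for brevity).
void swap_and_add(std::vector<double>& f,
                  const std::vector<Interface>& neighbors,
                  MPI_Comm comm)
{
    std::vector<std::vector<double>> sendbuf(neighbors.size()), recvbuf(neighbors.size());
    std::vector<MPI_Request> reqs;

    for (std::size_t k = 0; k < neighbors.size(); ++k) {
        const Interface& nb = neighbors[k];
        sendbuf[k].resize(nb.nodes.size());
        recvbuf[k].resize(nb.nodes.size());
        for (std::size_t i = 0; i < nb.nodes.size(); ++i)
            sendbuf[k][i] = f[nb.nodes[i]];                       // 1. pack interface values

        reqs.emplace_back();
        MPI_Isend(sendbuf[k].data(), static_cast<int>(sendbuf[k].size()),
                  MPI_DOUBLE, nb.rank, 0, comm, &reqs.back());
        reqs.emplace_back();
        MPI_Irecv(recvbuf[k].data(), static_cast<int>(recvbuf[k].size()),
                  MPI_DOUBLE, nb.rank, 0, comm, &reqs.back());
    }
    MPI_Waitall(static_cast<int>(reqs.size()), reqs.data(), MPI_STATUSES_IGNORE);

    for (std::size_t k = 0; k < neighbors.size(); ++k)            // 2. add received contributions
        for (std::size_t i = 0; i < neighbors[k].nodes.size(); ++i)
            f[neighbors[k].nodes[i]] += recvbuf[k][i];
}
```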
10/27 Contact Parallelization (Computers and Structures, 2006)
• Contact segments are partitioned in the same way as the FE decomposition, and the data of contact nodes that fall inside the extended spatial region occupied by the segments are sent/received using unstructured communication (see the sketch below).
• Generalized so that it applies to both two-body contact and single-surface contact.
• The finite element internal force vector and the contact force vector of the master nodes are communicated simultaneously.
• The send/receive data structures are kept consistent, and the amount of transmitted data is minimized even under unstructured communication.
• Large-scale parallel performance results are presented on an ordinary Linux cluster, not on a vendor machine or a specially optimized OS.
• Contact load balancing
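The first bullet decides, per processor, which contact nodes must be sent to which remote processor based on the spatial region covered by that processor's segments. A minimal sketch of that test, assuming one axis-aligned "extended bounding box" per remote rank; the box construction and data layout are hypothetical, not the published algorithm's exact form.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Extended axis-aligned bounding box of the contact segments owned by one remote rank.
struct Box { std::array<double, 3> lo, hi; };

// For each remote rank, collect the local node ids whose coordinates fall inside that
// rank's extended segment box; these are the nodes that must be communicated.
std::vector<std::vector<int>>
nodes_to_send(const std::vector<std::array<double, 3>>& x,   // local contact-node coordinates
              const std::vector<Box>& segment_box)           // one extended box per remote rank
{
    std::vector<std::vector<int>> send_list(segment_box.size());
    for (std::size_t n = 0; n < x.size(); ++n)
        for (std::size_t r = 0; r < segment_box.size(); ++r) {
            bool inside = true;
            for (int d = 0; d < 3; ++d)
                inside = inside && x[n][d] >= segment_box[r].lo[d]
                                && x[n][d] <= segment_box[r].hi[d];
            if (inside) send_list[r].push_back(static_cast<int>(n));  // node n goes to rank r
        }
    return send_list;
}
```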
11/27 Verification: Taylor Bar Impact Test
• Analysis conditions
  - Material model: elastic-plastic with linear hardening
  - Termination time: 80 μs
  - Constraints: sliding condition on the bottom surface
  - Number of nodes: 1,369; number of elements: 972
• Results
  [Figures: material constants & geometric configuration; initial & deformed configurations]
12/27 Verification: Oblique Impact of Metal Sphere
• Comparison with experiment (Finnegan SA, Dimaranan LG, Heimdahl OER, 1993)
• Model configuration: impact angle = 60°
• Comparison of experiment vs. IPSAP/Explicit
  [Figures: deformed shapes at (a) 610 m/s and (b) 910 m/s]
13/27 Parallel Performance Evaluation: Taylor Impact Test
• Domain decomposition: graph partitioning scheme (METIS)
• Fixed-size speed-up (10 million DOF)
  - 1 CPU/node: speed-up of 122 at 128 CPUs
  - 2 CPUs/node: speed-up of 105 at 128 CPUs and 151 at 256 CPUs
• Scaled speed-up (55,296 elements per CPU; 7 million elements at 128 CPUs)
  - Speed-up of 128 at 128 CPUs (1 CPU/node)
[Figures: fixed-size speed-up and scaled speed-up curves]
14/27 Parallel Performance Evaluation: Oblique Impact of Metal Sphere
• Parallel performance of the finite element and contact-treatment computations
  - Efficiency drops because of the contact-treatment computation.
• Parallel performance of the contact treatment
  - Load balancing for the contact treatment (Contact L/B): its share grows as the number of CPUs increases; because the contact region is not spread over the whole model but occurs locally at the impact site, the contact computation is imbalanced.
  - Contact treatment itself (Contact Force): only a small fraction of the total computation time.
  - Communication of contact data (Contact Comm.)
15/27 Eulerian Scheme
• Two-step Strategy
• Remap Module
• Eulerian Scheme Parallelization
• Verification: 2-D/3-D Square Bar Impact
• Parallel Performance Evaluation: 3-D Square Bar Impact
16/27 Two-step Strategy
• Eulerian equations in conservation form
• Eulerian codes / two-step codes: CELL, JOY, HULL, PISCES, CSQ, CTH, MESA, KRAKEN
• Two-step strategy (operator split): an Eulerian step and a remap step/part/module, solved sequentially (see the note below)
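A hedged note on the operator split named above. The slide does not spell out the equations, so the generic form below is an assumption: for a conserved quantity q transported with material velocity u, one cycle is split into a source step and a pure-advection remap step.

```latex
% Generic conserved quantity q (e.g. mass, momentum, energy density), assumed form:
\frac{\partial q}{\partial t} + \nabla \cdot (q\,\mathbf{u}) = S(q)
% Two-step (operator-split) solution of one cycle, solved sequentially:
%   Step 1 (Lagrangian-type step):  \frac{Dq}{Dt} = S(q)
%   Step 2 (remap step):            \frac{\partial q}{\partial t} + \nabla \cdot (q\,\mathbf{u}) = 0
% Step 2 maps the Step-1 result back onto the fixed Eulerian mesh.
```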
17/27 Eulerian Scheme - Remap Module
• Compute material flux
  - Compute the volume flux
  - Compute the material flux using an interface tracking algorithm
• Material-centered advection (see the 1-D sketch below)
  - Advect density, stress, strain, and energy
• Vertex-centered advection
  - Advect momentum and kinetic energy
  - Compute the nodal velocity
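To make the material-centered advection step concrete, here is a minimal 1-D donor-cell (first-order upwind) remap sketch. The 1-D setting, the flux sign convention, and all names are assumptions for illustration; they do not reproduce the interface-tracking flux computation used in the code.

```cpp
#include <cstddef>
#include <vector>

// Remap one cell-centered quantity q (e.g. density) from the deformed Lagrangian cells
// back onto the fixed Eulerian mesh, using donor-cell (upwind) face fluxes.
// vol_flux has size q.size()+1; vol_flux[f] is the volume transported across face f,
// positive when material moves in the +x direction. Consistency of the volumes and
// fluxes is assumed.
void remap_cell_quantity(std::vector<double>& q,
                         const std::vector<double>& vol_lagr,   // cell volumes after the Lagrangian step
                         const std::vector<double>& vol_fixed,  // cell volumes of the fixed mesh
                         const std::vector<double>& vol_flux)   // face volume fluxes
{
    const std::size_t n = q.size();
    std::vector<double> amount(n);                 // total amount of q in each Lagrangian cell
    for (std::size_t i = 0; i < n; ++i)
        amount[i] = q[i] * vol_lagr[i];

    for (std::size_t f = 1; f < n; ++f) {          // interior faces only
        const double q_upwind = (vol_flux[f] > 0.0) ? q[f - 1] : q[f];  // donor cell
        const double dq = q_upwind * vol_flux[f];
        amount[f - 1] -= dq;
        amount[f]     += dq;
    }

    for (std::size_t i = 0; i < n; ++i)            // redistribute onto the fixed mesh
        q[i] = amount[i] / vol_fixed[i];
}
```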
18/27 Two-step Eulerian Code Structure: IPSAP/Explicit
• Structure of the program
  - Serial Lagrangian
  - Serial Eulerian
  - Parallel Lagrangian
  - Parallel Eulerian
19/27 Eulerian Scheme Parallelization
[Figure: 1-D cell stencil [i-2] [i-1] [i] [i+1] [i+2] across the boundary between domains I and J, showing the ghost-cell data that are exchanged]
*ST: stress tensor, EPS: effective plastic strain, ABE: artificial bulk viscosity, IE: internal energy (see the packing sketch below)
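As a rough illustration of the ghost-cell exchange implied by the stencil figure, the sketch below packs the listed per-cell fields (ST, EPS, ABE, IE) into one contiguous buffer so that a neighbour receives a single message; the buffer layout, field storage, and names are hypothetical, not the code's actual structures.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Fields exchanged per ghost cell: 6 stress components + EPS + ABE + IE.
constexpr int kFieldsPerCell = 6 + 1 + 1 + 1;

void pack_ghost_cells(const std::vector<int>& ghost_ids,            // cells next to the domain boundary
                      const std::vector<std::array<double, 6>>& st, // stress tensor per cell
                      const std::vector<double>& eps,               // effective plastic strain
                      const std::vector<double>& abe,               // artificial bulk viscosity
                      const std::vector<double>& ie,                // internal energy
                      std::vector<double>& buf)                     // send buffer, one message per neighbour
{
    buf.resize(ghost_ids.size() * kFieldsPerCell);
    std::size_t k = 0;
    for (int c : ghost_ids) {
        for (int j = 0; j < 6; ++j) buf[k++] = st[c][j];
        buf[k++] = eps[c];
        buf[k++] = abe[c];
        buf[k++] = ie[c];
    }
}
```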
20/27 Verification: Square Bar Impact 2D
• Model configuration
  - Geometry: 32 × 1 × 10 mm; cell length 1 mm (320 cells in total)
  - Constraints: sliding BC on the exterior surface of the model
  - Impact velocity: 200 m/s
  - Termination time: 80 μs
• Results
  - Note: the contour plot (VOF = 0.5) ≠ the material interface
  [Figures: deformed configurations at 0, 20, 40, and 80 μs (Lagrangian: left, Eulerian: right)]
21/27 Verification: Square Bar Impact 3D
• Model configuration
  - Geometry: 32 × 10 × 10 mm; cell length 1 mm (3,200 cells in total)
  - Constraints: sliding BC on the exterior surface of the model
  - Impact velocity: 200 m/s
  - Termination time: 80 μs
• Results
  [Figures: IPSAP/Explicit deformed configurations at 0, 10, 20, and 40 μs; LS-DYNA comparison at 40 and 80 μs (Lagrangian: left, Eulerian: right)]
22/27 Parallel Performance Evaluation: 3-D Square Bar Impact
• Model configuration
  - 1024 × 20 × 20 cells (409,600 elements)
  - Simulated time: 10 μs (1,500 cycles)
  - Domains are decomposed along the impact direction
• IPSAP/Explicit vs. LS-DYNA
  - IPSAP/Explicit shows two to three times smaller elapsed time than LS-DYNA.
  - LS-DYNA settings: HIS (half-index-shift) advection algorithm, multi-material (MM)
  - * Metric: clock time per zone cycle = total elapsed time / (total elements × Nsteps) (worked example below)
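A worked example of the zone-cycle metric just defined, using the element and cycle counts from this slide and a purely hypothetical elapsed time; the real timings are in the slide's comparison figure, which is not reproduced here.

```latex
% T_elapsed = 614 s is assumed only for illustration; it is not a measured value.
t_{\text{zone cycle}} = \frac{T_{\text{elapsed}}}{N_{\text{element}} \times N_{\text{steps}}}
                      = \frac{614\ \mathrm{s}}{409{,}600 \times 1{,}500}
                      \approx 1.0\ \mu\mathrm{s}\ \text{per element per cycle}
```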
23/27 Parallel Performance Evaluation
• Elapsed time of each sub-function
  - The elapsed time of the remap part, including its communication time, takes about 90% of the total elapsed time.
  - The communication time of the remap part is 30 to 40 times larger than that of the Lagrangian part.
• Speed-up of the internal force calculation and the remap
  - The parallel efficiency of the remap part is better than that of the internal force calculation, because the internal force calculation of void cells is skipped in the program.
24/27 Summary & Future Work
• Summary
  - A newly developed Lagrangian/Eulerian code has been described and its parallelization procedure has been presented.
  - The parallel performance was compared with a commercial code and remains very efficient as the number of CPUs increases.
  - The remap part is identified as the most influential part for both serial and parallel performance, since it takes over 90% of the total elapsed time.
  - This is the first parallel two-step Eulerian code developed in Korea.
• Future work
  - Multi-material capability
  - Second-order accuracy
  - Lagrangian-Eulerian interface
25/27 Thank You