Parallelization of 2D Lid-Driven Cavity Flow
Asif Salahuddin, Ahmad Sharif, Jens Kehne
Objectives
Our objectives
• Numerical simulation of fluid dynamics, using the Lattice-Boltzmann method
• Parallelize the code using MPI
• Study speedup and scalability
• Allow large problem sizes to run in reasonable time
• Allow them to run at all, for that matter (memory requirements)
Concept
The Lattice-Boltzmann method
• The Lattice-Boltzmann equation
• Velocity directions
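For reference, the Lattice-Boltzmann equation in its common BGK form (assuming that is the variant used here) reads

f_i(\mathbf{x} + \mathbf{e}_i \Delta t,\ t + \Delta t) = f_i(\mathbf{x}, t) - \frac{1}{\tau}\left[ f_i(\mathbf{x}, t) - f_i^{\mathrm{eq}}(\mathbf{x}, t) \right]

where f_i is the particle distribution along discrete velocity e_i, \tau the relaxation time, and f_i^{\mathrm{eq}} the local equilibrium distribution. For a 2D simulation the velocity directions are typically the D2Q9 set: eight links to the neighboring nodes plus a rest particle.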
Top-down vs. bottom-up
• Top-down: partial differential equations (Navier-Stokes) → discretization → difference equations (conserved quantities?)
• Bottom-up: discrete model (LGCA or LBM) → multi-scale analysis → partial differential equations (Navier-Stokes)
Fluid nodes
• The entire problem is represented as a grid of fluid nodes
• Fluid nodes hold velocities towards all neighbors
• New grid state computed for discrete time steps
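As a concrete illustration, such a grid of fluid nodes could be laid out in C roughly as below; the struct and field names are assumptions for this sketch (D2Q9 velocity set assumed), not taken from the actual code:

#define Q 9                       /* D2Q9: 8 links to neighbors + 1 rest particle */

typedef struct {
    double f[Q];                  /* distribution value for each velocity direction */
} Node;

typedef struct {
    int   nx, ny;                 /* grid dimensions */
    Node *cells;                  /* flat storage, cells[y * nx + x] */
} Grid;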
Wall bounceback
• The fluid domain is surrounded by walls
• On each timestep, the direction of links hitting a wall is reversed
• Walls may be moving; a moving wall changes the momentum of the fluid close to it
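A sketch of how the bounce-back rule could look for a single wall-adjacent node, assuming a common D2Q9 direction ordering and the standard moving-wall momentum correction 2 w_i rho (e_i · u_wall) / c_s^2 with c_s^2 = 1/3; the array names, and the loop over all eight links rather than only those that actually point into a wall, are simplifications for illustration:

/* Opposite direction for each D2Q9 link (index 0 is the rest particle),
 * assuming the ordering e1..e8 = E, N, W, S, NE, NW, SW, SE. */
static const int OPP[9] = { 0, 3, 4, 1, 2, 7, 8, 5, 6 };

/* Reverse the links of one wall-adjacent node; for a moving wall (the lid)
 * the reflected value also picks up the momentum correction term. */
void bounce_back(double f[9], const double w[9],
                 const double ex[9], const double ey[9],
                 double rho, double ux_wall, double uy_wall)
{
    double fin[9];
    for (int i = 0; i < 9; i++) fin[i] = f[i];

    for (int i = 1; i < 9; i++) {
        f[OPP[i]] = fin[i]
                  - 6.0 * w[i] * rho * (ex[i] * ux_wall + ey[i] * uy_wall);
    }
}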
Implementation
Domain decomposition
• Each processor processes part of the grid
• Ghost nodes: represent the border nodes of the neighbors
• Border nodes: updated using data from the neighbors
• Inner nodes: we can update these alone
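A minimal sketch of how the ghost-node exchange could be done with MPI, restricted to the left/right neighbors for brevity; the buffer and parameter names are assumptions for illustration, not the project's actual routines:

#include <mpi.h>

/* Send our border columns to the left/right neighbors and receive their
 * border columns into our ghost columns.  'count' is the number of doubles
 * per column (ny border nodes * 9 distributions).  At a domain edge the
 * neighbor rank can simply be MPI_PROC_NULL. */
void exchange_ghosts(const double *send_left, double *recv_left,
                     const double *send_right, double *recv_right,
                     int count, int left, int right, MPI_Comm comm)
{
    MPI_Sendrecv(send_left,  count, MPI_DOUBLE, left,  0,
                 recv_right, count, MPI_DOUBLE, right, 0,
                 comm, MPI_STATUS_IGNORE);
    MPI_Sendrecv(send_right, count, MPI_DOUBLE, right, 1,
                 recv_left,  count, MPI_DOUBLE, left,  1,
                 comm, MPI_STATUS_IGNORE);
}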
Automatic decomposition
• Factorize and merge
  • Factorize the x and y dimensions and #procs
  • Divide x and y by the prime factors of #procs
• Goal: keep each processor's grid as square as possible
  • Best ratio between inner and border nodes
  • Minimizes communication
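One plausible way to implement this factorize-and-divide rule, sketched in C under the assumption that the prime factors of #procs are handed out, largest first, to whichever dimension currently has the longer local blocks (function and variable names are illustrative):

/* Split nprocs into a px * py processor grid for an nx * ny domain,
 * trying to keep each local block as square as possible. */
void decompose(int nprocs, int nx, int ny, int *px, int *py)
{
    int factors[32], nf = 0;

    /* Prime factorization of nprocs, smallest factors first. */
    for (int f = 2; nprocs > 1; ) {
        if (nprocs % f == 0) { factors[nf++] = f; nprocs /= f; }
        else                 { f++; }
    }

    *px = *py = 1;
    for (int i = nf - 1; i >= 0; i--) {      /* hand out largest factors first */
        if (nx / *px >= ny / *py) *px *= factors[i];
        else                      *py *= factors[i];
    }
}

For the demo that follows (30 x 20 grid, 6 processors) this yields a 3 x 2 processor grid of 10 x 10 blocks.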
Automatic decomposition: demo
• #CPUs: 6 = 2 * 3
• X-axis: 30 = 2 * 3 * 5
• Y-axis: 20 = 2 * 2 * 5
• [Diagram: the 30 x 20 grid split among the 6 CPUs into 10 x 10 blocks]
Optimizations
• Overlapping wall bounceback and communication
  • About 5% speedup
• Overlapping inner-node computation with communication
  • Massive slowdown!
  • Probably due to cache effects
• Making use of the regular communication pattern
  • Slower (we have no idea why!)
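For reference, overlapping the inner-node computation with communication presumably follows the usual nonblocking pattern sketched below; Grid, the helper routines, and the request count are placeholders for this sketch, not the project's actual code:

#include <mpi.h>

typedef struct Grid Grid;                        /* domain data, as sketched earlier */

void post_ghost_exchange(Grid *g, MPI_Request req[4]);  /* MPI_Irecv/MPI_Isend pairs */
void update_inner_nodes(Grid *g);                /* needs no ghost data              */
void update_border_nodes(Grid *g);               /* needs freshly received ghosts    */

/* One timestep with communication hidden behind the inner-node update. */
void timestep_overlapped(Grid *g)
{
    MPI_Request req[4];

    post_ghost_exchange(g, req);                 /* start nonblocking halo exchange  */
    update_inner_nodes(g);                       /* compute while messages are in flight */
    MPI_Waitall(4, req, MPI_STATUSES_IGNORE);    /* ghost nodes are now up to date   */
    update_border_nodes(g);
}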
Experimental results
Experimental setup
• Lonestar Linux cluster @ University of Texas
• Part of the TeraGrid project
• 1300 compute nodes
• 2 Intel Xeon 2.66 GHz dual-core CPUs per node
• 42.6 GFLOPS/node
• 8 GB RAM/node
• Linux kernel 2.6, 64-bit
• InfiniBand interconnect, fat-tree topology
Actual speedup
Relation to expected speedup
Questions