Parallelization of 2D Lid-Driven Cavity Flow
Asif Salahuddin, Ahmad Sharif, Jens Kehne
Objectives
Our objectives
• Numerical simulation of fluid dynamics, using the Lattice-Boltzmann method
• Parallelize the code using MPI
• Study speedup and scalability
• Allow large problem sizes to run in reasonable time
• Allow them to run at all, for that matter (memory requirements)
Concept
The Lattice-Boltzmann method
• The Lattice-Boltzmann equation
• Velocity directions
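For reference, the Lattice-Boltzmann equation in its common BGK form (assuming that is the variant used here) reads

f_i(\mathbf{x} + \mathbf{e}_i \Delta t,\ t + \Delta t) = f_i(\mathbf{x}, t) - \frac{1}{\tau}\left[ f_i(\mathbf{x}, t) - f_i^{\mathrm{eq}}(\mathbf{x}, t) \right]

where f_i is the particle distribution along discrete velocity e_i, \tau the relaxation time, and f_i^{\mathrm{eq}} the local equilibrium distribution. For a 2D simulation the velocity directions are typically the D2Q9 set: eight links to the neighboring nodes plus a rest particle.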
Top-down vs. bottom-up
• Top-down: partial differential equations (Navier-Stokes) → discretization → difference equations (conserved quantities?)
• Bottom-up: discrete model (LGCA or LBM) → multi-scale analysis → partial differential equations (Navier-Stokes)
Fluid nodes
• The entire problem is represented as a grid of fluid nodes
• Fluid nodes hold velocities towards all neighbors
• New grid state computed for discrete time steps
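As a concrete illustration, such a grid of fluid nodes could be laid out in C roughly as below; the struct and field names are assumptions for this sketch (D2Q9 velocity set assumed), not taken from the actual code:

#define Q 9                       /* D2Q9: 8 links to neighbors + 1 rest particle */

typedef struct {
    double f[Q];                  /* distribution value for each velocity direction */
} Node;

typedef struct {
    int   nx, ny;                 /* grid dimensions */
    Node *cells;                  /* flat storage, cells[y * nx + x] */
} Grid;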
Wall bounceback
• The fluid domain is surrounded by walls
• On each timestep, the direction of links hitting a wall is reversed
• Walls may be moving; a moving wall changes the momentum of the fluid close to it
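A sketch of how the bounce-back rule could look for a single wall-adjacent node, assuming a common D2Q9 direction ordering and the standard moving-wall momentum correction 2 w_i rho (e_i · u_wall) / c_s^2 with c_s^2 = 1/3; the array names, and the loop over all eight links rather than only those that actually point into a wall, are simplifications for illustration:

/* Opposite direction for each D2Q9 link (index 0 is the rest particle),
 * assuming the ordering e1..e8 = E, N, W, S, NE, NW, SW, SE. */
static const int OPP[9] = { 0, 3, 4, 1, 2, 7, 8, 5, 6 };

/* Reverse the links of one wall-adjacent node; for a moving wall (the lid)
 * the reflected value also picks up the momentum correction term. */
void bounce_back(double f[9], const double w[9],
                 const double ex[9], const double ey[9],
                 double rho, double ux_wall, double uy_wall)
{
    double fin[9];
    for (int i = 0; i < 9; i++) fin[i] = f[i];

    for (int i = 1; i < 9; i++) {
        f[OPP[i]] = fin[i]
                  - 6.0 * w[i] * rho * (ex[i] * ux_wall + ey[i] * uy_wall);
    }
}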
Implementation
Domain decomposition
• Each processor processes part of the grid
• Ghost nodes: represent the border nodes of the neighbors
• Border nodes: updated using data from the neighbors
• Inner nodes: we can update these alone
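A minimal sketch of how the ghost-node exchange could be done with MPI, restricted to the left/right neighbors for brevity; the buffer and parameter names are assumptions for illustration, not the project's actual routines:

#include <mpi.h>

/* Send our border columns to the left/right neighbors and receive their
 * border columns into our ghost columns.  'count' is the number of doubles
 * per column (ny border nodes * 9 distributions).  At a domain edge the
 * neighbor rank can simply be MPI_PROC_NULL. */
void exchange_ghosts(const double *send_left, double *recv_left,
                     const double *send_right, double *recv_right,
                     int count, int left, int right, MPI_Comm comm)
{
    MPI_Sendrecv(send_left,  count, MPI_DOUBLE, left,  0,
                 recv_right, count, MPI_DOUBLE, right, 0,
                 comm, MPI_STATUS_IGNORE);
    MPI_Sendrecv(send_right, count, MPI_DOUBLE, right, 1,
                 recv_left,  count, MPI_DOUBLE, left,  1,
                 comm, MPI_STATUS_IGNORE);
}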
Automatic decomposition
• Factorize and merge
  • Factorize the x and y dimensions and #procs
  • Divide x and y by the prime factors of #procs
• Goal: keep each processor's grid as square as possible
  • Best ratio between inner and border nodes
  • Minimizes communication
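One plausible way to implement this factorize-and-divide rule, sketched in C under the assumption that the prime factors of #procs are handed out, largest first, to whichever dimension currently has the longer local blocks (function and variable names are illustrative):

/* Split nprocs into a px * py processor grid for an nx * ny domain,
 * trying to keep each local block as square as possible. */
void decompose(int nprocs, int nx, int ny, int *px, int *py)
{
    int factors[32], nf = 0;

    /* Prime factorization of nprocs, smallest factors first. */
    for (int f = 2; nprocs > 1; ) {
        if (nprocs % f == 0) { factors[nf++] = f; nprocs /= f; }
        else                 { f++; }
    }

    *px = *py = 1;
    for (int i = nf - 1; i >= 0; i--) {      /* hand out largest factors first */
        if (nx / *px >= ny / *py) *px *= factors[i];
        else                      *py *= factors[i];
    }
}

For the demo that follows (30 x 20 grid, 6 processors) this yields a 3 x 2 processor grid of 10 x 10 blocks.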
Automatic decomposition: demo
• #CPUs: 6 = 2 * 3
• X-axis: 30 = 2 * 3 * 5
• Y-axis: 20 = 2 * 2 * 5
• [Diagram: the 30 x 20 grid split among the 6 CPUs into 10 x 10 blocks]
Optimizations
• Overlapping wall bounceback and communication
  • About 5% speedup
• Overlapping inner-node computation with communication
  • Massive slowdown!
  • Probably due to cache effects
• Making use of the regular communication pattern
  • Slower (we have no idea why!)
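For reference, overlapping the inner-node computation with communication presumably follows the usual nonblocking pattern sketched below; Grid, the helper routines, and the request count are placeholders for this sketch, not the project's actual code:

#include <mpi.h>

typedef struct Grid Grid;                        /* domain data, as sketched earlier */

void post_ghost_exchange(Grid *g, MPI_Request req[4]);  /* MPI_Irecv/MPI_Isend pairs */
void update_inner_nodes(Grid *g);                /* needs no ghost data              */
void update_border_nodes(Grid *g);               /* needs freshly received ghosts    */

/* One timestep with communication hidden behind the inner-node update. */
void timestep_overlapped(Grid *g)
{
    MPI_Request req[4];

    post_ghost_exchange(g, req);                 /* start nonblocking halo exchange  */
    update_inner_nodes(g);                       /* compute while messages are in flight */
    MPI_Waitall(4, req, MPI_STATUSES_IGNORE);    /* ghost nodes are now up to date   */
    update_border_nodes(g);
}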
Experimental results
Experimental setup
• Lonestar Linux cluster @ University of Texas
• Part of the TeraGrid project
• 1300 compute nodes
• 2 Intel Xeon 2.66 GHz dual-core CPUs per node
• 42.6 GFLOPS/node
• 8 GB RAM/node
• Linux kernel 2.6, 64-bit
• InfiniBand interconnect, fat-tree topology
Actual speedup
Relation to expected speedup
Questions