Parallelization of urbanSTREAM CRTI-02-0093RD Project Review Meeting Canadian Meteorological Centre August 22-23, 2006
urbanGRID (F90, dynamic memory allocation) prepares the input for the parallel flow solver urbanSTREAM-P:
• grid_generator: grid generation from an ArcView shape file (*.shp, *.dbf)
• grid_partition: partition a single grid; achieve maximum load balancing
• boundary_interpolate: interpolate BCs from a meteorological data file (urbanGEM / LAM)
Domain Decomposition
Halo data are passed between different CPUs using the message-passing library MPI.
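Since the slides note that the code is written in F90 with dynamic memory allocation, here is a minimal sketch (variable names are assumptions, not taken from urbanSTREAM) of how each CPU might allocate only its own block of the grid, padded with the two layers of halo cells mentioned later:

  program allocate_block
    implicit none
    integer, parameter :: nhalo = 2          ! two layers of halo data per side
    integer :: ni_loc, nj_loc, nk_loc        ! local block size (interior cells)
    real, allocatable :: u(:,:,:)            ! one flow variable on this block

    ni_loc = 40; nj_loc = 40; nk_loc = 30    ! example block size

    ! Interior indices run 1..ni_loc etc.; halo cells occupy the padded range.
    allocate(u(1-nhalo:ni_loc+nhalo, 1-nhalo:nj_loc+nhalo, 1-nhalo:nk_loc+nhalo))
    u = 0.0
    deallocate(u)
  end program allocate_block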
Block layout: I increases to the east (E), J to the north (N), K in the vertical. With I_max = 2, J_max = 2, K_max = 2, the number of CPUs is NBLOC = I_max × J_max × K_max = 2 × 2 × 2 = 8.
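As an illustration of this decomposition, a small sketch that maps each MPI rank 0..NBLOC-1 back to its block indices (I, J, K); the I-fastest ordering of blocks is an assumption, not something stated on the slides:

  program rank_to_block
    implicit none
    integer, parameter :: i_max = 2, j_max = 2, k_max = 2
    integer :: nbloc, rank, ib, jb, kb

    nbloc = i_max * j_max * k_max            ! # of CPUs = 2 * 2 * 2 = 8
    do rank = 0, nbloc - 1
       ib = mod(rank, i_max) + 1             ! block index in I (east)
       jb = mod(rank / i_max, j_max) + 1     ! block index in J (north)
       kb = rank / (i_max * j_max) + 1       ! block index in K (vertical)
       print '(a,i2,a,3i3)', 'rank ', rank, ' -> block (I,J,K) =', ib, jb, kb
    end do
  end program rank_to_block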
Sample input.par for serial urbanSTREAM
grid_urbanSTREAM.dat  ! grid file name to read in
boundary_Z.dat        ! BC file name to read in
0        ! IAVE, time-averaging done earlier: (0) NO, (1) YES
300      ! ITMAX, maximum # of time steps to run in the current simulation
30.0     ! DELT (s), time step for integration
1000     ! MAXIT, maximum # of outer iterations per time step
0.001    ! SORMAX, normalized residual norm below which the outer iteration is declared converged
0        ! IREAD, restart from a previously saved flow state: (0) NO, (1) YES
100.0    ! normalized drag coefficient, Cd*A*H [-]
7.27E-5  ! angular velocity of Earth's rotation [1/s]
35.468   ! latitude in degrees for the Coriolis force term
1.E+30   ! boundary-layer height [m], required for the limited-length-scale K-E model; set to 1.E+30 to recover the standard K-E model
Extra information required by urbanSTREAM-P (provided by urbanGRID): I_MAX, J_MAX, K_MAX; NX, NY, NZ; and the block connectivity matrix.
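For illustration only, a sketch of reading the first few entries of input.par in F90; the variable names and the token extraction for the two file names are assumptions, not the actual urbanSTREAM reader:

  program read_input_par
    implicit none
    character(len=256) :: line
    character(len=128) :: grid_file, bc_file
    integer :: iave, itmax, maxit, iread
    real :: delt, sormax

    open(unit=10, file='input.par', status='old', action='read')

    read(10, '(a)') line                       ! "grid_urbanSTREAM.dat ! grid file ..."
    grid_file = line(1:index(line, ' ') - 1)   ! keep the first token only

    read(10, '(a)') line                       ! "boundary_Z.dat ! BC file ..."
    bc_file = line(1:index(line, ' ') - 1)

    read(10, *) iave      ! time-averaging done earlier: 0/1
    read(10, *) itmax     ! max # of time steps
    read(10, *) delt      ! time step (s)
    read(10, *) maxit     ! max # of outer iterations per time step
    read(10, *) sormax    ! convergence tolerance on the residual norm
    read(10, *) iread     ! restart flag: 0/1
    close(10)

    print *, 'grid file: ', trim(grid_file), '   BC file: ', trim(bc_file)
  end program read_input_par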
Block Connectivity Matrix: for each block (here Block #1 is in question), the matrix lists its neighbours in the (I, J) plane, indexed E (1), W (2), N (3), S (4).
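A sketch of how such a connectivity table might be stored; the array layout, the 2x2 example arrangement of blocks, and the use of MPI_PROC_NULL to mark sides on the outer domain boundary are all assumptions:

  program connectivity_demo
    use mpi
    implicit none
    integer, parameter :: nbloc = 4
    integer :: neighbour(4, nbloc)      ! (direction E/W/N/S, block)

    ! 2x2 arrangement of blocks (block numbers), rank = block number - 1:
    !   3 4
    !   1 2
    ! Stored values are the MPI ranks of the E(1), W(2), N(3), S(4) neighbours.
    neighbour(:, 1) = (/ 1, MPI_PROC_NULL, 2, MPI_PROC_NULL /)   ! block 1: E = block 2, N = block 3
    neighbour(:, 2) = (/ MPI_PROC_NULL, 0, 3, MPI_PROC_NULL /)   ! block 2: W = block 1, N = block 4
    neighbour(:, 3) = (/ 3, MPI_PROC_NULL, MPI_PROC_NULL, 0 /)   ! block 3: E = block 4, S = block 1
    neighbour(:, 4) = (/ MPI_PROC_NULL, 2, MPI_PROC_NULL, 1 /)   ! block 4: W = block 3, S = block 2

    print *, 'neighbours of block 1 (E,W,N,S):', neighbour(:, 1)
  end program connectivity_demo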
Halo Data (3D view): blocks 1-4; each block obtains its boundary conditions from the halo data of the neighbouring block.
Message Passing Interface (MPI) Library: Block 1 sends its two layers of halo data with MPI_SEND; the neighbouring block receives an identical copy with MPI_RECV.
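A minimal, deadlock-free sketch of exchanging two halo layers between neighbouring blocks along the I direction. It uses MPI_SENDRECV rather than the separate MPI_SEND/MPI_RECV calls shown on the slide, and all variable names are assumptions; this is not the actual OVEL_Q routine:

  program halo_exchange
    use mpi
    implicit none
    integer, parameter :: nhalo = 2, n = 10
    integer :: rank, nproc, east, west, ierr
    integer :: status(MPI_STATUS_SIZE)
    real :: u(1-nhalo:n+nhalo)               ! interior 1..n plus halo layers

    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
    call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)

    ! 1-D chain of blocks: the first/last blocks have no west/east neighbour.
    east = rank + 1; if (east > nproc - 1) east = MPI_PROC_NULL
    west = rank - 1; if (west < 0)         west = MPI_PROC_NULL

    u = real(rank)                           ! dummy interior data

    ! Send my two easternmost interior layers to the east neighbour and
    ! receive its two layers into my west halo, then the reverse direction.
    call MPI_SENDRECV(u(n-nhalo+1), nhalo, MPI_REAL, east, 1, &
                      u(1-nhalo),   nhalo, MPI_REAL, west, 1, &
                      MPI_COMM_WORLD, status, ierr)
    call MPI_SENDRECV(u(1),         nhalo, MPI_REAL, west, 2, &
                      u(n+1),       nhalo, MPI_REAL, east, 2, &
                      MPI_COMM_WORLD, status, ierr)

    call MPI_FINALIZE(ierr)
  end program halo_exchange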
Test case from the 8th ERCOFTAC workshop on turbulence modelling, decomposed with (I_max, J_max, K_max) = (1, 2, 2); side view and top view of the grid.
Parallel Efficiency: measured for two loadings, 83,248 CVs per CPU and 5,203 CVs per CPU; at the smaller per-CPU load, communication overhead becomes dominant.
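The slide does not state how parallel efficiency is defined; assuming the usual definition,

  E(p) = \frac{T_1}{p \, T_p}

where T_1 is the single-CPU wall-clock time and T_p the wall-clock time on p CPUs. Efficiency drops once the block assigned to each CPU becomes so small (here about 5,203 CVs) that halo communication accounts for a significant share of T_p.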
Structure of urbanSTREAM-P
• CALL MPI_INIT(ierr)
• CALL MPI_COMM_RANK(MPI_COMM_WORLD, node, ierr)
• urbanSTREAM (serial solver applied to this CPU's block)
• CALL OVEL_Q (halo-data exchange via MPI_SEND / MPI_RECV)
• CALL MPI_FINALIZE(ierr)
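The call sequence above can be read as the following driver skeleton. This is a sketch only: the time-stepping loop and the subroutine bodies are assumptions, with solve_block standing in for the per-block serial solver and ovel_q for the halo-data exchange:

  program urbanstream_p_skeleton
    use mpi
    implicit none
    integer :: node, ierr, itime
    integer, parameter :: itmax = 300      ! ITMAX from input.par

    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, node, ierr)   ! node = this CPU's block

    do itime = 1, itmax
       call solve_block(node)   ! serial urbanSTREAM solution on this block
       call ovel_q(node)        ! exchange halo data (MPI_SEND / MPI_RECV)
    end do

    call MPI_FINALIZE(ierr)

  contains

    subroutine solve_block(node)
      integer, intent(in) :: node
      ! placeholder: per-block flow solution would go here
    end subroutine solve_block

    subroutine ovel_q(node)
      integer, intent(in) :: node
      ! placeholder: exchange of the two halo layers would go here
    end subroutine ovel_q

  end program urbanstream_p_skeleton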