Dr. 么石磊, Asia Pacific Technical Network (APTN), stone@beijing.sgi.com
[Diagram: one data domain decomposed into N sub-domains, with N threads each working on its own sub-domain of the data]
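As a minimal sketch of that decomposition (the routine name block_range is illustrative, not from the original slides), the global index range 1..n can be split into nprocs contiguous blocks exactly as the MPI example further on does:

      subroutine block_range(n,nprocs,myid,lo,hi)
C     Return the contiguous block [lo,hi] of the global range 1..n
C     owned by thread/rank myid (0 <= myid < nprocs).
      integer n,nprocs,myid,lo,hi
      lo = 1 + myid*n/nprocs
      hi = (myid+1)*n/nprocs
      end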
[Diagram: automatic parallelisation with APO (the -apo compiler option) contrasted with explicit C$OMP directives]
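For illustration only (the loop and array names below are not from the slides, and the -apo/-mp switches are assumed to be the SGI MIPSpro ones): APO can parallelise an independent loop like this automatically, or the same intent can be stated by hand with a fixed-form C$OMP directive:

      subroutine add_arrays(a,b,c,n)
      integer n,i
      real*8 a(n),b(n),c(n)
C     Iterations are independent, so APO can parallelise this loop
C     on its own; the directive below states the same thing explicitly.
C$OMP PARALLEL DO PRIVATE(i) SHARED(a,b,c,n)
      do i = 1, n
         a(i) = b(i) + c(i)
      end do
C$OMP END PARALLEL DO
      end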
Example for Parallelisation

      program sum_squares
      integer I,n
      real*8 sum
      print *,'enter n:'
      read *,n
      sum=0
      do I=1,n
         sum = sum + dble(I)**2
      end do
      print *, 'sum of first',n,' squares =',sum
      end
Rewriting the Code for Parallelisation

Message Passing Parallel Programming (added distributed-memory message-passing code):

      program sum_squares
      include 'mpif.h'
      integer I,n,myid,numprocs,ierr
      real*8 my_sum,sum
      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD,myid,ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD,numprocs,ierr)
      if(myid.eq.0) then
         print *,'enter n:'
         read *,n
      endif
      call MPI_BCAST(n,1,MPI_INTEGER,0,
     &               MPI_COMM_WORLD,ierr)
      my_sum = 0
      do I=1+myid*n/numprocs,(myid+1)*n/numprocs
         my_sum = my_sum + dble(I)**2
      end do
      call MPI_REDUCE(my_sum,sum,1,MPI_DOUBLE_PRECISION,
     &                MPI_SUM,0,MPI_COMM_WORLD,ierr)
      if(myid.eq.0) then
         print *, 'sum of first',n,' squares =',sum
      endif
      call MPI_FINALIZE(ierr)
      end

Shared Memory Parallel Programming (added shared-memory code, a single directive):

      program sum_squares
      integer I,n
      real*8 sum
      print *,'enter n:'
      read *,n
      sum=0
!$OMP PARALLEL DO reduction(+:sum)
      do I=1,n
         sum = sum + dble(I)**2
      end do
      print *, 'sum of first',n,' squares =',sum
      end
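As a rough usage note (assuming SGI's MIPSpro compiler and MPT message-passing toolkit of the period; other systems will differ): the shared-memory version is built with the compiler's multiprocessing switch (e.g. f90 -mp) and its thread count is set with OMP_NUM_THREADS, while the message-passing version is linked against the MPI library (e.g. -lmpi) and launched with mpirun -np <number of processes>.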
Parallelisation Strategy
• Port the code if necessary; make sure it gives the correct answer (a simple check is sketched below).
• Profile the code, and work only on the loops and routines that use a large percentage of the CPU time (think about coarse-grained parallelism).
• Use APO and be done (!?), or
• Hand-tune for loop-level parallelisation.
• To further improve performance, look for coarse-grained parallelism opportunities.
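For this example, the correctness check in the first bullet is easy because the sum of the first n squares has the closed form n(n+1)(2n+1)/6. A minimal sketch of such a check (the subroutine name is illustrative):

      subroutine check_sum(n,sum)
C     Compare the computed result with the closed form
C     n*(n+1)*(2*n+1)/6; for n=100 the exact answer is 338350.
      integer n
      real*8 sum,exact
      exact = dble(n)*dble(n+1)*dble(2*n+1)/6.0d0
      if (abs(sum-exact) .gt. 1.0d-6*exact) then
         print *, 'sum differs from exact value', exact
      endif
      end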