Grouping Data in MPI
Grouping Data • Messages are expensive in terms of performance: every send/receive pays a fixed start-up (latency) cost regardless of how much data it carries • Grouping data into fewer, larger messages can therefore improve the performance of your program
Grouping Data • The count • Derived Types • MPI_Type_vector • MPI_Type_contiguous • MPI_Type_indexed • MPI_Type_struct • MPI_Pack/MPI_Unpack
The Count – groupcount.c

    int bigvector[9] = {10,20,30,40,50,60,70,80,90};
    int localvector[9];
    ...
    int main(int argc, char* argv[]) {
        ...
        for (i = 1; i < p; i++) {
            if (myrank == 0) {
                /* rank 0 sends all 9 contiguous ints in a single message */
                MPI_Send(bigvector, 9, MPI_INT, i, tagid, MPI_COMM_WORLD);
            }
        }
        if (myrank != 0) {
            MPI_Recv(localvector, 9, MPI_INT, 0, tagid, MPI_COMM_WORLD, &status);
        }
        ...
The count with multi-dimensional arrays – groupcount2.c

    int bigmatrix[3][3] = {{10,20,30},{40,50,60},{70,80,90}};
    int localmatrix[3][3];
    ...
    for (i = 1; i < p; i++) {
        if (myrank == 0) {
            /* the 3x3 array is contiguous, so count = 9 sends the whole matrix */
            MPI_Send(bigmatrix, 9, MPI_INT, i, tagid, MPI_COMM_WORLD);
        }
    }
    if (myrank != 0) {
        MPI_Recv(localmatrix, 9, MPI_INT, 0, tagid, MPI_COMM_WORLD, &status);
        printf("Data in process %d -", myrank);
        for (i = 0; i < 3; i++) {
            printf(" ROW_%d", i);
            for (ii = 0; ii < 3; ii++)
                printf(" %d,", localmatrix[i][ii]);
        }
        printf("\n");
    }
The count – but what about… groupcount3.c

    int matrix[3][3] = {{10,20,30},{40,50,60},{70,80,90}};
    int localvector[9];
    ...
    for (i = 1; i < p; i++) {
        if (myrank == 0) {
            MPI_Send(matrix, 9, MPI_INT, i, tagid, MPI_COMM_WORLD);
        }
    }
    if (myrank != 0) {
        /* the contiguous 3x3 matrix arrives perfectly well in a flat 9-element vector */
        MPI_Recv(localvector, 9, MPI_INT, 0, tagid, MPI_COMM_WORLD, &status);
        printf("Data in process %d -", myrank);
        for (ii = 0; ii < 9; ii++)
            printf(" %d,", localvector[ii]);
        printf("\n");
    }
The count – sending a row of a matrix

    if (myrank == 0) {
        MPI_Send(&(A[2][0]), 10, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    } else {
        MPI_Recv(&(A[2][0]), 10, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, &status);
    }
The count – but what about sending a column? • In C, using the count parameter this way to send a column does not work – why? • C stores matrices in row-major order, so the elements of a column are not contiguous in memory • Fortran is the reverse: column-major order • … so to send a column in C we need more than the count parameter; the naive alternative is sketched below, and the real solution (derived types) follows
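Without a derived type, the only way to move a column using basic datatypes is one message per element. The sketch below is illustrative only – the matrix A, the column index, and the ranks are assumptions, not taken from the course code:

    /* Naive column transfer: the entries of one column of a row-major C
       matrix are 10 floats apart in memory, so each needs its own message. */
    float A[10][10];
    int i, col = 2;                        /* hypothetical column index */
    if (myrank == 0) {
        for (i = 0; i < 10; i++)
            MPI_Send(&A[i][col], 1, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
    } else {
        for (i = 0; i < 10; i++)
            MPI_Recv(&A[i][col], 1, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, &status);
    }
    /* Ten messages instead of one – exactly the overhead that a derived
       type such as MPI_Type_vector avoids. */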
MPI Derived Types • A derived type is a list of ordered (basic type, displacement) pairs, e.g. {(MPI_FLOAT, d0), (MPI_FLOAT, d1), (MPI_INT, d2)}, where each displacement gives that element's offset from the start of the buffer • Two steps: define (build) the type, then commit it • int MPI_Type_commit(MPI_Datatype* new_mpi_type)
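Every constructor on the following slides plugs into the same define/commit/use pattern. A minimal sketch, assuming a generic buffer, destination, and tag, and including MPI_Type_free, which the slides themselves do not show:

    MPI_Datatype new_mpi_t;
    /* 1. Define: build the type with one of the constructors
          (MPI_Type_vector, MPI_Type_contiguous, MPI_Type_indexed, MPI_Type_struct). */
    MPI_Type_contiguous(4, MPI_INT, &new_mpi_t);
    /* 2. Commit: the type must be committed before it can be used in communication. */
    MPI_Type_commit(&new_mpi_t);
    /* 3. Use it exactly like a built-in datatype. */
    MPI_Send(buffer, 1, new_mpi_t, dest, tag, MPI_COMM_WORLD);
    /* 4. Release it when it is no longer needed. */
    MPI_Type_free(&new_mpi_t);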
MPI_Type_vector

    int MPI_Type_vector(
        int           count,
        int           block_length,
        int           stride,
        MPI_Datatype  element_type,
        MPI_Datatype* new_mpi_t)

count = number of blocks in the new type, block_length = number of entries in each block, stride = distance, in units of element_type, between the starts of successive blocks, element_type = type of the component entries, new_mpi_t = the new MPI datatype
MPI_Type_vector

    int A[3][3];
    MPI_Datatype mpi_column_type;
    /* 3 blocks of 1 int, stride 3: one column of a 3x3 row-major matrix */
    MPI_Type_vector(3, 1, 3, MPI_INT, &mpi_column_type);
    MPI_Type_commit(&mpi_column_type);
    if (myrank == 0)
        MPI_Send(&(A[0][2]), 1, mpi_column_type, 1, 0, MPI_COMM_WORLD);
    else
        MPI_Recv(&(A[0][2]), 1, mpi_column_type, 0, 0, MPI_COMM_WORLD, &status);
MPI_Type_contiguous

    int MPI_Type_contiguous(
        int           count,
        MPI_Datatype  old_type,
        MPI_Datatype* new_mpi_t)

The new MPI type new_mpi_t consists of count contiguous elements of the type old_type.
MPI_Type_contiguous

    MPI_Type_contiguous(4, MPI_INT, &mpi_4ints);
    MPI_Type_commit(&mpi_4ints);
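Once committed, mpi_4ints behaves like any basic datatype. A hedged usage sketch – the array contents and the ranks are illustrative, not from the original slides:

    int data[4] = {1, 2, 3, 4};
    MPI_Datatype mpi_4ints;
    MPI_Type_contiguous(4, MPI_INT, &mpi_4ints);
    MPI_Type_commit(&mpi_4ints);
    if (myrank == 0)
        /* one element of mpi_4ints = four contiguous MPI_INTs,
           equivalent here to MPI_Send(data, 4, MPI_INT, ...) */
        MPI_Send(data, 1, mpi_4ints, 1, 0, MPI_COMM_WORLD);
    else if (myrank == 1)
        MPI_Recv(data, 1, mpi_4ints, 0, 0, MPI_COMM_WORLD, &status);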
MPI_Type_indexed

    int MPI_Type_indexed(
        int           count,
        int           block_lengths[],
        int           displacements[],
        MPI_Datatype  old_type,
        MPI_Datatype* new_mpi_t)

Defines count blocks of old_type; block i contains block_lengths[i] entries and starts displacements[i] units of old_type from the beginning of the buffer (the first displacement is usually 0).
MPI_Type_indexed

    float A[5][5];                      /* main matrix                  */
    float T[5][5];                      /* upper triangle of the matrix */
    int displace[5], block_lengths[5];
    MPI_Datatype index_mpi_t;
    block_lengths[0] = 5; block_lengths[1] = 4; block_lengths[2] = 3;
    block_lengths[3] = 2; block_lengths[4] = 1;
    displace[0] = 0; displace[1] = 6; displace[2] = 12;
    displace[3] = 18; displace[4] = 24;
    MPI_Type_indexed(5, block_lengths, displace, MPI_FLOAT, &index_mpi_t);
    MPI_Type_commit(&index_mpi_t);
    if (myrank == 0)
        MPI_Send(A, 1, index_mpi_t, 1, 0, MPI_COMM_WORLD);
    else
        MPI_Recv(T, 1, index_mpi_t, 0, 0, MPI_COMM_WORLD, &status);
MPI_Type_indexed

    /* the same upper triangle, generalized to an n x n matrix */
    float A[n][n];
    float T[n][n];
    int displacements[n], block_lengths[n];
    MPI_Datatype index_mpi_t;
    for (i = 0; i < n; i++) {
        block_lengths[i] = n - i;
        displacements[i] = (n + 1) * i;
    }
    MPI_Type_indexed(n, block_lengths, displacements, MPI_FLOAT, &index_mpi_t);
    MPI_Type_commit(&index_mpi_t);
    if (myrank == 0)
        MPI_Send(A, 1, index_mpi_t, 1, 0, MPI_COMM_WORLD);
    else
        MPI_Recv(T, 1, index_mpi_t, 0, 0, MPI_COMM_WORLD, &status);
MPI_Type_struct

    int MPI_Type_struct(
        int           count,            /* number of blocks                      */
        int           block_lengths[],  /* number of entries in each block       */
        MPI_Aint      displacements[],  /* byte displacement of each block       */
        MPI_Datatype  typelist[],       /* MPI type of the entries in each block */
        MPI_Datatype* new_mpi_type)     /* the new MPI datatype                  */
MPI_Type_struct

    float* a_ptr; float* b_ptr; int* n_ptr;
    MPI_Datatype mpi_tmp_type;
    int block_lengths[3];
    MPI_Aint displacements[3];
    MPI_Datatype typelist[3];
    MPI_Aint address, start_address;
MPI_Type_struct

    block_lengths[0] = block_lengths[1] = block_lengths[2] = 1;
    typelist[0] = MPI_FLOAT;
    typelist[1] = MPI_FLOAT;
    typelist[2] = MPI_INT;
MPI_Type_struct

    displacements[0] = 0;
    MPI_Address(a_ptr, &start_address);
    MPI_Address(b_ptr, &address);
    displacements[1] = address - start_address;
    MPI_Address(n_ptr, &address);
    displacements[2] = address - start_address;
MPI_Type_struct

    MPI_Type_struct(3, block_lengths, displacements, typelist, &mpi_tmp_type);
    MPI_Type_commit(&mpi_tmp_type);
    MPI_Bcast(a_ptr, 1, mpi_tmp_type, 0, MPI_COMM_WORLD);
MPI_Type_struct – suppose instead that b is an array of 10 floats:

    float* a_ptr; float b[10]; int* n_ptr;
    ...
    block_lengths[0] = block_lengths[2] = 1;
    block_lengths[1] = 10;                 /* b now contributes 10 floats */
    ... displacements computed as before ...
    MPI_Type_struct(3, block_lengths, displacements, typelist, &mpi_tmp_type);
    MPI_Type_commit(&mpi_tmp_type);
MPI_Pack

    int MPI_Pack(
        void*         pack_data,    /* data to be packed                              */
        int           in_count,     /* number of items of type datatype to pack       */
        MPI_Datatype  datatype,
        void*         buffer,       /* pack buffer                                    */
        int           buffer_size,  /* size of buffer, in bytes                       */
        int*          position,     /* in/out: offset in buffer; advanced by the call */
        MPI_Comm      comm)
MPI_Unpack

    int MPI_Unpack(
        void*         buffer,         /* pack buffer to read from                       */
        int           buffer_size,    /* size of buffer, in bytes                       */
        int*          position,       /* in/out: offset in buffer; advanced by the call */
        void*         unpacked_data,  /* where the unpacked data goes                   */
        int           count,
        MPI_Datatype  datatype,
        MPI_Comm      comm)
MPI_Pack/MPI_Unpack

    float a, b;
    int n, myrank, position;
    char buffer[100];
    ...
    if (myrank == 0) {
        position = 0;
        MPI_Pack(&a, 1, MPI_FLOAT, buffer, 100, &position, MPI_COMM_WORLD);
        MPI_Pack(&b, 1, MPI_FLOAT, buffer, 100, &position, MPI_COMM_WORLD);
        MPI_Pack(&n, 1, MPI_INT, buffer, 100, &position, MPI_COMM_WORLD);
        /* position is advanced by each call */
        MPI_Bcast(buffer, 100, MPI_PACKED, 0, MPI_COMM_WORLD);
    } else {
MPI_Pack/MPI_Unpack

        MPI_Bcast(buffer, 100, MPI_PACKED, 0, MPI_COMM_WORLD);
        position = 0;
        MPI_Unpack(buffer, 100, &position, &a, 1, MPI_FLOAT, MPI_COMM_WORLD);
        MPI_Unpack(buffer, 100, &position, &b, 1, MPI_FLOAT, MPI_COMM_WORLD);
        MPI_Unpack(buffer, 100, &position, &n, 1, MPI_INT, MPI_COMM_WORLD);
    }
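The 100-byte buffer above is just a generous guess. A small sketch, not in the original slides, of how the required size could be computed with MPI_Pack_size (the variable names are illustrative):

    int size_float, size_int, needed;
    /* upper bounds on the packed size of each contribution */
    MPI_Pack_size(1, MPI_FLOAT, MPI_COMM_WORLD, &size_float);
    MPI_Pack_size(1, MPI_INT,   MPI_COMM_WORLD, &size_int);
    needed = 2 * size_float + size_int;   /* a, b, and n together */
    /* 'needed' could replace the hard-coded 100 when sizing the buffer */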
What to do? • Contiguous data, same datatype – use the count parameter with a basic datatype • Efficient, no extra overhead • Examples: a row of a matrix, several consecutive rows • Non-contiguous but evenly spaced data, same datatype – MPI_Type_vector • Example: a column of a matrix
What to do? • Non-contiguous, unevenly spaced data, same datatype – MPI_Type_indexed • Example: selected subsections of a matrix, such as the upper triangle • Non-contiguous, unevenly spaced, mixed datatypes – MPI_Type_struct • Most general, most complex to set up • Example: a collection of related data, i.e. a C struct • MPI_Pack/MPI_Unpack – pack mixed data into a contiguous buffer by hand, handy for one-off or variable-length messages