Chapter 6 Parallel Sorting Algorithm • Sorting • Parallel Sorting • Bubble Sort • Odd-Even (Transposition) Sort • Parallel Odd-Even Transposition Sort • Related Functions
Sorting • Arranges the elements of a list into a certain order • Makes data easier to access • Speeds up other operations such as searching • Many sorting algorithms exist, with different time and space complexities
Parallel Sorting
Design methodology
• Based on an existing sequential sort algorithm
  –Try to utilize all available resources
  –Possible to turn a poor sequential algorithm into a reasonable parallel algorithm (from O(n^2) to O(n))
• Completely new approach
  –New algorithm from scratch
  –Harder to develop
  –Sometimes yields a better solution
Potential speedup
• O(n log n) is optimal for any sequential sorting algorithm that does not use special properties of the numbers
• Optimal parallel time complexity with n processors: O((n log n)/n) = O(log n)
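The two complexity claims on this slide can be written out as a short derivation. This is only a sketch under idealized assumptions that the slide does not state explicitly: p = n processors, perfectly balanced work, and no communication overhead.

```latex
\[
T_{\text{par}}^{\text{bubble}}(n) = \frac{O(n^{2})}{p} = \frac{O(n^{2})}{n} = O(n),
\qquad
T_{\text{par}}^{\text{opt}}(n) = \frac{O(n \log n)}{p} = \frac{O(n \log n)}{n} = O(\log n)
\quad (p = n).
\]
```

The first expression is the "poor sequential algorithm" case (an O(n^2) sort spread over n processors); the second divides the best comparison-based sequential bound by the same processor count.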
Bubble Sort • One of the straightforward sorting methods –Cycles through the list –Compares consecutive elements and swaps them if necessary –Stops when no out-of-order pairs remain • Slow and inefficient • Average performance is O(n^2) Example: 6 5 3 1 8 7 2 4
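Bubble Sort
    /* Runs n full passes over the list; this version has no early exit. */
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n - 1; j++) {
            /* Swap neighbors that are out of order. */
            if (array[j] > array[j+1]) {
                int temp = array[j+1];
                array[j+1] = array[j];
                array[j] = temp;
            }
        }
    }
Example: 6 5 3 1 8 7 2 4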
Odd-Even (Transposition) Sort • Variation of bubble sort • Operates in two alternating phases, an even phase and an odd phase • Even phase: even-indexed items compare and exchange with their right neighbor • Odd phase: odd-indexed items compare and exchange with their right neighbor
Odd-Even (Transposition) Sort
    /* swap() exchanges the two elements (for example std::swap in C++). */
    for (int i = 0; i < n; i++) {
        if (i % 2 == 1) {                 // odd phase: pairs (1,2), (3,4), ...
            for (int j = 2; j < n; j += 2) {
                if (a[j] < a[j-1])
                    swap(a[j-1], a[j]);
            }
        } else {                          // even phase: pairs (0,1), (2,3), ...
            for (int j = 1; j < n; j += 2) {
                if (a[j] < a[j-1])
                    swap(a[j-1], a[j]);
            }
        }
    }
Odd-Even (Transposition) Sort Sorting n = 8 elements, using the odd-even transposition sort algorithm. 6 5 3 1 8 7 2 4
Parallel Odd-Even Transposition Sort • Operates in two alternating phases, an even phase and an odd phase • Even phase: even-numbered processes exchange numbers with their right neighbor • Odd phase: odd-numbered processes exchange numbers with their right neighbor
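Parallel Odd-Even Transposition Sort
    MPI_Comm_rank(MPI_COMM_WORLD, &mypid);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    /* compare_and_exchange_min/max are helper routines that are not defined on this slide. */
    for (int i = 0; i < nprocs; i++) {
        if (i % 2 == 1) {                          // odd phase
            if (mypid % 2 == 1) {
                if (mypid + 1 < nprocs)            // last rank has no right neighbor
                    compare_and_exchange_min(mypid + 1);
            } else {
                if (mypid > 0)                     // rank 0 has no left neighbor
                    compare_and_exchange_max(mypid - 1);
            }
        } else {                                   // even phase
            if (mypid % 2 == 0) {
                if (mypid + 1 < nprocs)            // last rank has no right neighbor
                    compare_and_exchange_min(mypid + 1);
            } else {
                compare_and_exchange_max(mypid - 1);
            }
        }
    }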
MPI_Scatter • MPI_Scatter is a collective routine that is very similar to MPI_Bcast • A root processor sends data to all processors in a communicator • MPI_Bcast sends the same piece of data to all processors • MPI_Scatter sends chunks of an array to different processors
MPI_Scatter • MPI_Bcast takes a single element at the root processor and copies it to all other processors • MPI_Scatter takes an array of elements and distributes the elements in the order of the processor rank
MPI_Scatter
• Its prototype
    MPI_Scatter(void* send_data, int send_count, MPI_Datatype send_datatype,
                void* recv_data, int recv_count, MPI_Datatype recv_datatype,
                int root, MPI_Comm communicator)
• send_data: an array of data on the root processor
• send_count and send_datatype: how many elements of an MPI datatype will be sent to each processor
• recv_data: a buffer that can hold recv_count elements
• root: the rank of the root processor
• communicator: the communicator whose processors take part in the call
MPI_Gather • The inverse of MPI_Scatter • Takes elements from many processors and gathers them to one single processor • The elements are ordered by the rank of the processors from which they were received • Used in parallel sorting and searching
MPI_Gather
• Its prototype
    MPI_Gather(void* send_data, int send_count, MPI_Datatype send_datatype,
               void* recv_data, int recv_count, MPI_Datatype recv_datatype,
               int root, MPI_Comm communicator)
• Only the root processor needs to have a valid receive buffer
• All other calling processors can pass NULL for recv_data
• recv_count is the count of elements received per processor, not the sum of the counts from all processors
Example 1
    #include "mpi.h"
    #include <stdio.h>

    /* Run with exactly 4 processes: sendbuf holds 4 chunks of 4 ints. */
    int main(int argc, char **argv)
    {
        int size, rank;
        int recvbuf[4];
        int sendbuf[16] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* The root (rank 0) sends 4 ints to each processor, in rank order. */
        MPI_Scatter(sendbuf, 4, MPI_INT, recvbuf, 4, MPI_INT, 0, MPI_COMM_WORLD);

        printf("Processor %d gets elements: %d %d %d %d\n", rank,
               recvbuf[0], recvbuf[1], recvbuf[2], recvbuf[3]);

        MPI_Finalize();
        return 0;
    }
Example 1 output (the order of the lines may vary between runs)
    Processor 0 gets elements: 1 2 3 4
    Processor 1 gets elements: 5 6 7 8
    Processor 3 gets elements: 13 14 15 16
    Processor 2 gets elements: 9 10 11 12
Example 2
    #include "mpi.h"
    #include <stdio.h>

    /* Run with exactly 4 processes: recvbuf holds 4 chunks of 4 ints. */
    int main(int argc, char **argv)
    {
        int size, rank;
        int sendbuf[4];
        int recvbuf[16];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each processor fills its own chunk: rank r holds 4r+1 .. 4r+4. */
        int i;
        for (i = 0; i < 4; i++) {
            sendbuf[i] = 4*rank + i + 1;
        }

        /* The root (rank 0) collects 4 ints from each processor, in rank order. */
        MPI_Gather(sendbuf, 4, MPI_INT, recvbuf, 4, MPI_INT, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            int j;
            for (j = 0; j < 16; j++) {
                printf("The %d th element is %d\n", j, recvbuf[j]);
            }
        }

        MPI_Finalize();
        return 0;
    }
Example 2 output
    The 0 th element is 1
    The 1 th element is 2
    The 2 th element is 3
    The 3 th element is 4
    The 4 th element is 5
    The 5 th element is 6
    The 6 th element is 7
    The 7 th element is 8
    The 8 th element is 9
    The 9 th element is 10
    The 10 th element is 11
    The 11 th element is 12
    The 12 th element is 13
    The 13 th element is 14
    The 14 th element is 15
    The 15 th element is 16