
A few words on parallel sorting

This article provides an overview of parallel sorting algorithms such as merge sort, quicksort, radix sort, and PSRS (Parallel Sorting by Regular Sampling). It discusses their time complexity, parallel implementation, and compares their performance.


Presentation Transcript


1. A few words on parallel sorting
   David Gregg
   Department of Computer Science
   University of Dublin, Trinity College

2. Sorting
   • Sorting is one of the most studied problems in computer science
   • Lots of different algorithms
   • Huge amount of work on understanding the time complexity of algorithms
   • Also lots of work on parallel sorting
   • We’re just going to scratch the surface of the problem

3. Sorting
   • Several well-known sorting algorithms lend themselves very well to parallel implementations:
     • Merge sort
     • Quicksort
     • Radix sort
   • There are also many dedicated parallel sorting algorithms
   • We’ll look at PSRS

4. Merge Sort
   • Treats all elements of the array as sorted sub-arrays of length 1
   • Pairs of adjacent sorted sub-arrays are merged
   • Merging is repeated until there is only a single sorted sub-array consisting of the entire array
   • O(N log N) sequential time complexity
   • O(N) with infinite parallel processors

5. Merge Sort
   void mergesort(int a[], int low, int high)
   {
     int mid;
     if (low < high) {
       // find the midpoint in the current partition
       mid = (low + high) / 2;
       // recursive call on each sub-partition
       mergesort(a, low, mid);
       mergesort(a, mid+1, high);
       // the actual work gets done on the way
       // back up from the recursion
       merge(a, low, high, mid);
     }
   }
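The merge() routine called above is not shown in the slides; a minimal sketch, assuming an out-of-place merge into a temporary buffer and the same argument order as the call on the slide, might look like:

```c
#include <string.h>

/* Merge the two sorted sub-arrays a[low..mid] and a[mid+1..high]
   into sorted order.  Uses a temporary buffer on the stack, so it
   assumes the range is small enough to fit there. */
void merge(int a[], int low, int high, int mid)
{
    int tmp[high - low + 1];
    int i = low, j = mid + 1, k = 0;
    while (i <= mid && j <= high)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)  tmp[k++] = a[i++];   /* drain the left half  */
    while (j <= high) tmp[k++] = a[j++];   /* drain the right half */
    memcpy(&a[low], tmp, k * sizeof(int));
}
```

Note that `a[i] <= a[j]` (rather than `<`) keeps the merge stable: equal elements from the left half stay ahead of equal elements from the right.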

6. Merge Sort
   void mergesort(int a[], int low, int high)
   {
     int mid;
     if (low < high) {
       mid = (low + high) / 2;
       #pragma omp task firstprivate(a, low, mid)
       mergesort(a, low, mid);
       #pragma omp task firstprivate(a, mid, high)
       mergesort(a, mid+1, high);
       // both child tasks must finish before we can merge
       #pragma omp taskwait
       merge(a, low, high, mid);
     }
   }
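Two practical details are easy to miss with the task version: tasks only run in parallel inside a parallel region entered once by a single thread, and spawning a task per tiny sub-array costs more than it gains. A self-contained sketch putting these together; the merge() body, the cutoff value, and the parallel_mergesort driver are illustrative assumptions, not part of the slides:

```c
#include <string.h>

/* out-of-place merge of a[low..mid] and a[mid+1..high], as before */
static void merge(int a[], int low, int high, int mid)
{
    int tmp[high - low + 1];
    int i = low, j = mid + 1, k = 0;
    while (i <= mid && j <= high)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)  tmp[k++] = a[i++];
    while (j <= high) tmp[k++] = a[j++];
    memcpy(&a[low], tmp, k * sizeof(int));
}

void mergesort(int a[], int low, int high)
{
    if (low < high) {
        int mid = (low + high) / 2;
        if (high - low < 1000) {
            /* below a cutoff (1000 is an illustrative value) the
               task overhead outweighs the parallelism: recurse serially */
            mergesort(a, low, mid);
            mergesort(a, mid + 1, high);
        } else {
            #pragma omp task firstprivate(a, low, mid)
            mergesort(a, low, mid);
            #pragma omp task firstprivate(a, mid, high)
            mergesort(a, mid + 1, high);
            #pragma omp taskwait   /* both halves sorted before merging */
        }
        merge(a, low, high, mid);
    }
}

/* Tasks only execute in parallel inside a parallel region; one
   thread makes the top-level call, the team executes the tasks. */
void parallel_mergesort(int a[], int n)
{
    #pragma omp parallel
    #pragma omp single
    mergesort(a, 0, n - 1);
}
```

Compile with -fopenmp (GCC/Clang); without it the pragmas are ignored and the code simply runs serially.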

7. Quicksort
   • Best-known and most popular sorting algorithm
   • Partitions the array into two subsections based on a “pivot”
   • The pivot is placed in its final sorted place, and never moves again
   • Quicksort is applied recursively to the two subsections

8. Quicksort
   void quicksort(int list[], int m, int n)
   {
     int key, i, j, k;
     if (m < n) {
       k = choose_pivot(m, n);
       swap(&list[m], &list[k]);
       key = list[m];
       i = m+1;
       j = n;
       while (i <= j) {
         while ((i <= n) && (list[i] <= key)) i++;
         while ((j >= m) && (list[j] > key)) j--;
         if (i < j) swap(&list[i], &list[j]);
       }
       swap(&list[m], &list[j]);
       quicksort(list, m, j-1);
       quicksort(list, j+1, n);
     }
   }

9. Quicksort
   • A common complaint about parallelizing quicksort is that the partition step cannot be parallelized
   • If you need very large amounts of parallelism to get a good speedup, this is a real problem
   • For fairly small numbers of cores, it’s not a major problem
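To make this concrete, here is one possible task-based quicksort in OpenMP: the partition step runs serially on one thread, and only the two recursive calls are spawned as tasks. The cutoff value and the use of the first element as pivot (the slide's choose_pivot step is omitted for brevity) are illustrative assumptions:

```c
static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Partition list[m..n] around key = list[m], as on slide 8, and
   return the pivot's final index.  This part stays serial. */
static int partition(int list[], int m, int n)
{
    int key = list[m];
    int i = m + 1, j = n;
    while (i <= j) {
        while (i <= n && list[i] <= key) i++;
        while (j >= m && list[j] > key)  j--;
        if (i < j) swap(&list[i], &list[j]);
    }
    swap(&list[m], &list[j]);
    return j;
}

void quicksort_task(int list[], int m, int n)
{
    if (m < n) {
        int j = partition(list, m, n);   /* serial */
        if (n - m < 1000) {
            /* illustrative cutoff: small ranges recurse serially */
            quicksort_task(list, m, j - 1);
            quicksort_task(list, j + 1, n);
        } else {
            /* only the recursive calls run as parallel tasks */
            #pragma omp task firstprivate(list, m, j)
            quicksort_task(list, m, j - 1);
            #pragma omp task firstprivate(list, j, n)
            quicksort_task(list, j + 1, n);
        }
    }
}
```

Unlike mergesort, no taskwait is needed before returning: the two halves are independent once the pivot is in place. As the slide says, the serial partition at the top caps the speedup, which matters mainly at high core counts.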

10. Radix Sort
   • Radix sort sorts by dividing the keys into buckets based on one or more bytes of the key
   • Most significant digit (MSD) vs. least significant digit (LSD)
   • MSD:
     • Divide the array into buckets numbered 0..255, based on the value of the first byte of the key
     • Recursively sort each bucket into sub-buckets, based on the value of subsequent bytes in the key
   • MSD radix sort is a recursive divide-and-conquer algorithm
   • It divides the array into smaller and smaller partitions, so locality tends to be good
   • O(N) steps are needed for each level of partitioning
   • O(b) levels are needed, where b is the number of bytes in the key
   • So the total time complexity is O(bN)
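The MSD scheme above can be sketched as follows for unsigned 32-bit keys. The function name and the use of a counting pass plus a temporary buffer are illustrative choices, not from the slides:

```c
#include <stdlib.h>
#include <string.h>

/* MSD radix sort on unsigned 32-bit keys, one byte per level.
   'shift' selects the current byte: 24 = most significant. */
void msd_radix(unsigned a[], int n, int shift)
{
    if (n < 2 || shift < 0) return;

    /* count how many keys fall into each of the 256 buckets;
       count[b+1] accumulates the size of bucket b */
    int count[257] = {0};
    for (int i = 0; i < n; i++)
        count[((a[i] >> shift) & 0xff) + 1]++;

    /* prefix sums turn the counts into bucket start offsets */
    for (int b = 1; b <= 256; b++)
        count[b] += count[b - 1];
    int start[257];
    memcpy(start, count, sizeof(start));

    /* scatter the keys into their buckets via a temporary buffer */
    unsigned *tmp = malloc(n * sizeof(unsigned));
    for (int i = 0; i < n; i++)
        tmp[count[(a[i] >> shift) & 0xff]++] = a[i];
    memcpy(a, tmp, n * sizeof(unsigned));
    free(tmp);

    /* recursively sort each bucket on the next byte */
    for (int b = 0; b < 256; b++)
        msd_radix(a + start[b], start[b + 1] - start[b], shift - 8);
}
```

The recursive calls on the 256 buckets are independent, which is exactly what makes MSD radix sort easy to parallelize (e.g. one OpenMP task per non-trivial bucket).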

11. Radix Sort
   • Least significant digit (LSD):
     • Divide the array into buckets numbered 0..255, based on the value of the last byte of the key
     • But do not recursively sort each bucket
     • Instead, divide the full array into separate buckets based on the second-last byte of the key, and so on
   • LSD radix sort looks like it should not work
   • But as long as the partition is stable (i.e. it maintains the original order of equal keys), LSD radix sort actually works
   • So the total time complexity is O(bN)
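A minimal sketch of LSD radix sort on unsigned 32-bit keys, assuming a stable counting sort per byte (the names and buffer handling are illustrative):

```c
#include <stdlib.h>
#include <string.h>

/* LSD radix sort on unsigned 32-bit keys.  Each pass is a stable
   counting sort on one byte, least significant byte first; the
   stability of each pass is what makes the final result sorted. */
void lsd_radix(unsigned a[], int n)
{
    unsigned *tmp = malloc(n * sizeof(unsigned));
    for (int shift = 0; shift < 32; shift += 8) {
        /* histogram of the current byte */
        int count[257] = {0};
        for (int i = 0; i < n; i++)
            count[((a[i] >> shift) & 0xff) + 1]++;
        /* prefix sums: count[b] = start offset of bucket b */
        for (int b = 1; b <= 256; b++)
            count[b] += count[b - 1];
        /* stable placement: equal bytes keep their relative order */
        for (int i = 0; i < n; i++)
            tmp[count[(a[i] >> shift) & 0xff]++] = a[i];
        memcpy(a, tmp, n * sizeof(unsigned));
    }
    free(tmp);
}
```

After pass k, the array is sorted on the low k bytes; because each pass is stable, ties on the current byte preserve that order, so after the final pass the whole key is sorted.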

12. Radix Sort
   • Radix sort sorts by dividing the keys into buckets based on one or more bytes of the key
   • Choices:
     • Most significant digit vs. least significant: MSD gives the best locality
     • In-place versus out-of-place: in-place usually gives the best locality
   • Radix sorting is usually faster than comparison-based algorithms:
     • Where it is applicable
     • For large N

13. Parallel sorting by regular sampling
   • A fast parallel algorithm for sorting
   • Consists of a number of steps, and the large ones can be executed in parallel
   • Assume there are four threads in the following discussion

14. Parallel sorting by regular sampling
   • Step 1: divide the array into 4 parts and sort each part in parallel
   [diagram: four sub-arrays, each labelled “sorted”]

15. Parallel sorting by regular sampling
   • Step 2: take one or more samples from each of the four partitions of the array
   • Sort these sample items
   [diagram: samples taken from each sorted part, then gathered into a sorted list of samples]

16. Parallel sorting by regular sampling
   • Step 3: partition each of the four sorted parts, using the samples as pivots
   • This is a searching problem, not a full partition problem

17. Parallel sorting by regular sampling
   • Step 4: now do a parallel merge of each of the partitions from each of the four parts
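The four PSRS steps can be sketched end-to-end as follows. This is a serial simulation of the data movement, with illustrative choices throughout (four parts, regular sample positions, pivots taken at regular positions in the sorted sample list); in a real implementation the per-part sorts of step 1, the searches of step 3, and the merges of step 4 each run on a separate thread:

```c
#include <stdlib.h>
#include <string.h>

#define P 4   /* number of "threads"; here the steps run serially */

static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* index of the first element in sorted a[0..n) greater than key */
static int upper_bound(const int a[], int n, int key)
{
    int lo = 0, hi = n;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (a[mid] <= key) lo = mid + 1; else hi = mid;
    }
    return lo;
}

void psrs(int a[], int n)
{
    int start[P + 1], samples[P * P], pivots[P - 1];
    for (int t = 0; t <= P; t++) start[t] = t * n / P;

    /* Step 1: sort each part (one part per thread in real PSRS) */
    for (int t = 0; t < P; t++)
        qsort(a + start[t], start[t + 1] - start[t], sizeof(int), cmp_int);

    /* Step 2: P regular samples per part; sort them; P-1 pivots */
    for (int t = 0; t < P; t++)
        for (int s = 0; s < P; s++) {
            int len = start[t + 1] - start[t];
            samples[t * P + s] = a[start[t] + s * len / P];
        }
    qsort(samples, P * P, sizeof(int), cmp_int);
    for (int i = 0; i < P - 1; i++) pivots[i] = samples[(i + 1) * P];

    /* Step 3: binary-search each sorted part for the pivot
       boundaries -- a searching problem, not a full partition */
    int seg[P][P + 1];
    for (int t = 0; t < P; t++) {
        seg[t][0] = 0;
        seg[t][P] = start[t + 1] - start[t];
        for (int i = 0; i < P - 1; i++)
            seg[t][i + 1] = upper_bound(a + start[t], seg[t][P], pivots[i]);
    }

    /* Step 4: merge segment c of every part (one merge per thread) */
    int *out = malloc(n * sizeof(int));
    int k = 0;
    for (int c = 0; c < P; c++) {
        int pos[P];
        for (int t = 0; t < P; t++) pos[t] = seg[t][c];
        for (;;) {
            int best = -1;   /* part holding the smallest head */
            for (int t = 0; t < P; t++)
                if (pos[t] < seg[t][c + 1] &&
                    (best < 0 || a[start[t] + pos[t]] < a[start[best] + pos[best]]))
                    best = t;
            if (best < 0) break;
            out[k++] = a[start[best] + pos[best]++];
        }
    }
    memcpy(a, out, n * sizeof(int));
    free(out);
}
```

The pivots split the key range into P classes that every part agrees on, so each of the P merges in step 4 touches a disjoint slice of the output and they can all proceed independently.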
