CSS342: Sorting Algorithms
Professor: Munehiro Fukuda
Why Do We Desperately Need Efficient Sorting Algorithms?
• Data must be sorted before we run the following programs:
  • Search algorithms such as binary search and interpolation search
  • Many computational geometry/graphics algorithms, such as the convex hull
• We always or frequently need to sort the following data:
  • Dictionary
  • White/yellow pages
  • Student grades
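As a small illustration of the first bullet, here is a minimal sketch, assuming integer data and the standard library's std::sort and std::binary_search. The function name containsAfterSorting is hypothetical and is not part of the lecture code.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: reports whether key occurs in data.
// std::binary_search requires a sorted range, so a local copy is sorted first.
bool containsAfterSorting( std::vector<int> data, int key ) {
    std::sort( data.begin( ), data.end( ) );
    return std::binary_search( data.begin( ), data.end( ), key );
}
```

Sorting costs O(n log n) once. After that, each lookup costs O(log n) instead of the O(n) of a linear scan, which pays off when many searches follow.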
Topics
Day 1: Lecture
• Selection Sort: worst/average O(n^2)
• Bubble Sort: worst/average O(n^2)
• Insertion Sort: worst/average O(n^2)
• Shell Sort: worst O(n^2), average O(n^(3/2))
• Merge Sort: worst/average O(n log n)
• Quick Sort: worst O(n^2), average O(n log n)
• Radix Sort: worst/average O(n)
Day 2: Lab Work
• Partial Quick Sort
Homework Assignment
• Non-recursive Semi-In-Place Merge Sort
O(n^2) sorting: Selection Sort
Array indices run from 0 to size-1.
• Initial array: 29 10 14 37 13
  Scan items 0 to size-1, locate the largest item, and swap it with the rightmost item.
• After 1st swap: 29 10 14 13 37
  Scan items 0 to size-2, locate the 2nd largest item, and swap it with the 2nd rightmost item.
• After 2nd swap: 13 10 14 29 37
  Scan items 0 to size-3, locate the 3rd largest item, and swap it with the 3rd rightmost item.
• After 3rd swap: 13 10 14 29 37
  Scan items 0 to size-4, locate the 4th largest item, and swap it with the 4th rightmost item.
• After 4th swap: 10 13 14 29 37
  Scan items 0 to size-5 (a single item), locate the 5th largest item, and swap it with the 5th rightmost item; nothing moves.
O(n^2) sorting: Selection Sort
template <class Object>
void selectionSort( vector<Object> & a ) {
    for ( int last = a.size( ) - 1; last >= 1; --last ) {
        int indexSoFar = 0;   // Index of the largest item found so far.
                              // Assume the 0th item is the largest at first.
        for ( int i = 1; i <= last; ++i ) {
            if ( a[i] > a[indexSoFar] )
                indexSoFar = i;
        }
        // indexSoFar points to the largest item at this point
        swap( a[indexSoFar], a[last] );
    }
}
Figure: in each pass, indexSoFar scans items 0 to last, and the item it ends on is swapped into position last. last starts at a.size( ) - 1 and moves left by one per pass.
O(n^2) sorting: Efficiency of Selection Sort
State            Array            Comparisons   Swaps
Initial array    29 10 14 37 13   N-1 (=4)      1
After 1st swap   29 10 14 13 37   N-2 (=3)      1
After 2nd swap   13 10 14 29 37   N-3 (=2)      1
After 3rd swap   13 10 14 29 37   N-4 (=1)      1
After 4th swap   10 13 14 29 37
Total                             O(n(n-1)/2)   O(n-1)
Overall: O(n^2)
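The totals above can be checked empirically. The following is a minimal sketch that instruments the lecture's selection sort with counters; the names SortCounts and countingSelectionSort are hypothetical additions for illustration.

```cpp
#include <utility>
#include <vector>

// Hypothetical record of the work done by one sort.
struct SortCounts {
    long comparisons = 0;
    long swaps = 0;
};

// Selection sort as in the lecture, plus counters (an illustrative variant).
template <class Object>
SortCounts countingSelectionSort( std::vector<Object> &a ) {
    SortCounts counts;
    for ( int last = static_cast<int>( a.size( ) ) - 1; last >= 1; --last ) {
        int indexSoFar = 0;                 // index of the largest item so far
        for ( int i = 1; i <= last; ++i ) {
            ++counts.comparisons;           // one comparison per scanned item
            if ( a[i] > a[indexSoFar] )
                indexSoFar = i;
        }
        std::swap( a[indexSoFar], a[last] );
        ++counts.swaps;                     // one swap per pass, even a self-swap
    }
    return counts;
}
```

For the five-item example array this should report 4 + 3 + 2 + 1 = 10 comparisons and 4 swaps, matching n(n-1)/2 and n-1.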
O(n^2) sorting: Bubble Sort
Pass 1:
  29 10 14 37 13
  10 29 14 37 13
  10 14 29 37 13
  10 14 29 37 13
  10 14 29 13 37
Pass 2:
  10 14 29 13 37
  10 14 29 13 37
  10 14 29 13 37
  10 14 13 29 37
Pass 3:
  10 14 13 29 37
  10 14 13 29 37
  10 13 14 29 37
Pass 4:
  10 13 14 29 37
  10 13 14 29 37
O(n^2) sorting: Bubble Sort
#include <iostream>
#include <vector>
#include <string>
#include <utility>   // swap
using namespace std;

template <class Object>
void bubbleSort( vector<Object> & a ) {
    bool swapOccurred = true;   // true when swaps occur
    for ( int pass = 1; ( pass < a.size( ) ) && swapOccurred; ++pass ) {
        swapOccurred = false;   // no swaps have occurred at the beginning of a pass
        for ( int i = 0; i < a.size( ) - pass; ++i ) {
            // a bubble (i) goes from 0 to size - pass - 1
            if ( a[i] > a[i + 1] ) {
                swap( a[i], a[i + 1] );
                swapOccurred = true;   // a swap has occurred
            }
        }
    }
}
Figure: items at index and nextIndex (for example 15 and 10) are swapped when they are out of order.
O(n^2) sorting: Efficiency of Bubble Sort
Pass 1: 29 10 14 37 13 -> 10 29 14 37 13 -> 10 14 29 37 13 -> 10 14 29 37 13 -> 10 14 29 13 37
Pass 2: 10 14 29 13 37 -> 10 14 29 13 37 -> 10 14 29 13 37 -> 10 14 13 29 37
Pass    Comparisons   Swaps
1       N-1           N-1
2       N-2           N-2
…       …             …
N-1     1             1
Total   O(n^2)        O(n^2)
Overall: O(n^2)
O(n^2) sorting: Insertion Sort
Sorted | Unsorted          Action (the copied item is held in unsortedTop)
29 | 10 14 37 13           Copy 10
29 29 | 14 37 13           Shift 29
10 29 | 14 37 13           Insert 10; copy 14
10 29 29 | 37 13           Shift 29
10 14 29 | 37 13           Insert 14; copy 37
10 14 29 37 | 13           Shift nothing
10 14 29 37 | 13           Copy 13
10 14 14 29 37 |           Shift 37, 29 and 14
10 13 14 29 37 |           Insert 13
O(n^2) sorting: Insertion Sort
template <class Object>
void SortedList<Object>::insertionSort( ) {
    for ( int unsorted = 1; unsorted < array.size( ); ++unsorted ) {
        // Assume the 0th item is sorted. Unsorted items start from the 1st item.
        Object unsortedTop = array[unsorted];   // Copy the top of the unsorted group
        int i;
        for ( i = unsorted - 1; ( i >= 0 ) && ( array[i] > unsortedTop ); --i )
            // Upon a successful comparison, shift array[i] to the right
            array[i + 1] = array[i];
        // Insert the unsorted top into the sorted group
        array[i + 1] = unsortedTop;
    }
}
Figure: with 2 3 8 10 14 29 37 sorted and 13 11 25 20 unsorted, unsortedTop = 13 is copied out and compared from right to left; 37, 29 and 14 each shift from loc to loc+1, and 13 is inserted after 10.
O(n^2) sorting: Efficiency of Insertion Sort
Worst-case counts per insertion step:
Before -> after                        Comparisons   Insertions   Shifts
29 10 14 37 13 -> 10 29 14 37 13       1             1            1
10 29 14 37 13 -> 10 14 29 37 13       2             1            2
10 14 29 37 13 -> 10 14 29 37 13       3             1            3
10 14 29 37 13 -> 10 13 14 29 37       N-1 (=4)      1            N-1 (=4)
Total                                  O(n^2)        O(n)         O(n^2)
Overall: O(n^2) + O(n) + O(n^2) = O(n^2)
O(n^(3/2)) sorting: ShellSort
• The idea is to perform an insertion sort among items that are gap positions apart.
• This reduces the large amount of data movement.
Initial array (indices 0 to 16):
  81 94 11 96 12 35 17 95 28 58 41 75 15 85 87 38 20
gap = 17/2 = 8 (the size is initially divided by 2); after sorting items 8 apart:
  20 58 11 75 12 35 17 38 28 94 41 96 15 85 87 95 81
gap = 8/2.2 = 3 (the divisor 2.2 is chosen for practical performance); after sorting items 3 apart:
  15 12 11 17 38 28 20 41 35 75 58 87 94 81 96 95 85
gap = 3/2.2 = 1; after an ordinary insertion sort:
  11 12 15 17 20 28 35 38 41 58 75 81 85 87 94 95 96
O(n^(3/2)) sorting: ShellSort
template <class Comparable>
void shellsort( vector<Comparable> &a ) {
    for ( int gap = a.size( ) / 2; gap > 0;
          gap = ( gap == 2 ) ? 1 : int( gap / 2.2 ) ) {
        for ( int i = gap; i < a.size( ); i++ ) {
            Comparable tmp = a[i];   // copy the item to insert
            int j = i;
            for ( ; j >= gap && tmp < a[j - gap]; j -= gap )
                a[j] = a[j - gap];   // shift a larger item gap positions to the right
            a[j] = tmp;              // insert the copied item
        }
    }
}
Example: assume i = a.size( ) - 1 = 16 and gap = 17/2 = 8 on the array
  81 94 11 96 12 35 17 95 28 58 41 75 15 85 87 38 20 (indices 0 to 16).
• tmp = a[16] = 20.
• Shift a[16-8] if it is larger than tmp.
• Shift a[16-8*2] if it is larger than tmp.
• Store tmp where the shifting stopped.
O(n^(3/2)) sorting: Efficiency of ShellSort
• Performance
  • Worst case: O(N^2)
  • Average case:
    • O(N^(3/2)) when dividing the gap by 2
    • O(N^(5/4)) or O(N^(7/6)) when dividing the gap by 2.2
• Proof:
  • A long-standing open problem
O(n log n) sorting: Sorting Algorithms
• O(n^2): Selection Sort, Bubble Sort, Insertion Sort, Shell Sort (Shell's average case depends on the increment.)
• O(n log n): Merge Sort, Quick Sort
  • Use a recursive solution
  • Take advantage of a tree's log(n) characteristics
O(n log n) sorting: Mergesort (with an auxiliary temporary array)
Assuming that we already have two sorted arrays, how can we merge them into one sorted array?
Sorted array 1: 1 4 8 13 14 20 25
Sorted array 2: 2 3 5 7 11 23
Merged array:   1 2 3 4 5 7 8 11 13 14 20 23 25
O(n log n) sorting: Mergesort (with an auxiliary temporary array)
template <class Comparable>
void merge( vector<Comparable> &a, int first, int mid, int last ) {
    vector<Comparable> tempArray( a.size( ) );
    int first1 = first;    int last1 = mid;    // left sorted half
    int first2 = mid + 1;  int last2 = last;   // right sorted half
    int index = first1;
    // Copy the smaller front item until one half is exhausted
    for ( ; ( first1 <= last1 ) && ( first2 <= last2 ); ++index ) {
        if ( a[first1] < a[first2] ) {
            tempArray[index] = a[first1];
            ++first1;
        }
        else {
            tempArray[index] = a[first2];
            ++first2;
        }
    }
    // Copy whatever remains in either half
    for ( ; first1 <= last1; ++first1, ++index )
        tempArray[index] = a[first1];
    for ( ; first2 <= last2; ++first2, ++index )
        tempArray[index] = a[first2];
    // Copy the merged range back into the original array
    for ( index = first; index <= last; ++index )
        a[index] = tempArray[index];
}
Figure: theArray[first..mid] and theArray[mid+1..last] are each sorted. first1/last1 and first2/last2 bound the two halves, and index walks tempArray as the smaller front item (<) or the other one (>=) is copied.
O(n log n) sorting: Mergesort (from bottom to top: conquer)
Single items:   38 | 16 | 27 | 39 | 12 | 17 | 24 | 5
After merging:  16 38 | 27 39 | 12 17 | 5 24
After merging:  16 27 38 39 | 5 12 17 24
After merging:  5 12 16 17 24 27 38 39
Now, how can we separate the array into single items?
O(n log n) sorting: Mergesort (from top to bottom: divide)
While first < last, split the range [first, last] at mid = (first + last)/2.
theArray:     38 16 27 39 12 17 24 5
After split:  38 16 27 39 | 12 17 24 5
After split:  38 16 | 27 39 | 12 17 | 24 5
After split:  38 | 16 | 27 | 39 | 12 | 17 | 24 | 5
O(n log n) sorting: Mergesort (final view)
Divide at mid = (first + last)/2:
  38 16 27 39 12 17 24 5
  38 16 27 39 | 12 17 24 5
  38 16 | 27 39 | 12 17 | 24 5
  38 | 16 | 27 | 39 | 12 | 17 | 24 | 5
Merge:
  16 38 | 27 39 | 12 17 | 5 24
  16 27 38 39 | 5 12 17 24
  5 12 16 17 24 27 38 39
template <class Comparable>
void mergesort( vector<Comparable> &a, int first, int last ) {
    if ( first < last ) {
        int mid = ( first + last ) / 2;
        mergesort( a, first, mid );      // sort the left half
        mergesort( a, mid + 1, last );   // sort the right half
        merge( a, first, mid, last );    // merge the two sorted halves
    }
}
O(n log n) sorting: Mergesort (Efficiency Analysis)
Level 0:  38 | 16 | 27 | 39 | 12 | 17 | 24 | 5
Level 1:  16 38 | 27 39 | 12 17 | 5 24
Level 2:  16 27 38 39 | 5 12 17 24
Level 3:  5 12 16 17 24 27 38 39
• At level x, # nodes in each pair = 2^x
• At level x, # major operations = n/2^x * (3 * 2^x - 1) = O(3n)
• # levels = log n, where n = # array elements (if n is a power of 2)
• # levels = log n + 1 (with log n rounded down) if n is not a power of 2
• # operations = O(3n) * (log n + 1) = O(3n log n) = O(n log n)
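The per-level count can be spelled out as a short derivation. This is a sketch under one assumption the slide does not state: that the 3 * 2^x - 1 term counts at most 2^x - 1 comparisons, 2^x copies into the temporary array, and 2^x copies back, which is a common textbook accounting for merge.

```latex
\begin{align*}
\text{cost of one merge producing } 2^x \text{ items}
  &\le (2^x - 1) + 2^x + 2^x = 3 \cdot 2^x - 1 \\
\text{cost of level } x
  &\le \frac{n}{2^x}\,\bigl(3 \cdot 2^x - 1\bigr) = 3n - \frac{n}{2^x} < 3n \\
\text{total cost}
  &< 3n\,(\log_2 n + 1) = O(n \log n)
\end{align*}
```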
O(n log n) sorting: Quicksort (A partition about a pivot)
Items:             81 31 75 57 43 0 13 26 92 65
Select a pivot:    65
Partition:         smaller items 0 43 26 13 57 31 | pivot 65 | larger items 75 81 92
Sorted smaller items: 0 13 26 31 43 57
O(n log n) sorting: Quicksort (Code overview)
template <class Comparable>
void quicksort( vector<Comparable> &a, int first, int last ) {
    int pivotIndex;   // after partition, pivotIndex points to the pivot
    if ( first < last ) {
        partition( a, first, last, pivotIndex );
        quicksort( a, first, pivotIndex - 1 );   // sort S1
        quicksort( a, pivotIndex + 1, last );    // sort S2
    }
}
O(n log n) sorting: Quicksort (Partitioning Algorithm)
Initial state:
  p (at first) | unknown (?) from firstUnknown to last
  S1 and S2 are empty, so lastS1 = first.
General state:
  p (at first) | S1 (< p), ending at lastS1 | S2 (>= p) | unknown (?) from firstUnknown to last
Repeatedly move each element in the unknown region to S1 or S2 until the unknown region is empty.
O(n log n) sorting: Quicksort (Moving a new unknown item into S1)
Before:  p | S1 (< p) | S2 (>= p) | new item (< p) at firstUnknown | unknown (?)
Swap the new item with the first item of S2, then advance lastS1 and firstUnknown.
After:   p | S1 (< p), now ending with the new item | S2 (>= p) | unknown (?)
O(n log n) sorting: Quicksort (Moving a new unknown item into S2)
Before:  p | S1 (< p) | S2 (>= p) | new item (>= p) at firstUnknown | unknown (?)
No swap is needed; just advance firstUnknown.
After:   p | S1 (< p) | S2 (>= p), now ending with the new item | unknown (?)
O(n log n) sorting: Quicksort (Partitioning Code)
template <class Comparable>
void partition( vector<Comparable> &a, int first, int last, int &pivotIndex ) {
    // Choose a pivot and place it in a[first]
    choosePivot( a, first, last );
    Comparable pivot = a[first];
    int lastS1 = first;                // S1 is empty at first
    int firstUnknown = first + 1;      // everything after the pivot is unknown
    for ( ; firstUnknown <= last; ++firstUnknown )
        if ( a[firstUnknown] < pivot ) {
            // the item belongs in S1: grow S1 and swap the item into it
            ++lastS1;
            swap( a[firstUnknown], a[lastS1] );
        }
        // else the item from unknown belongs in S2
    // Place the pivot between S1 and S2
    swap( a[first], a[lastS1] );
    pivotIndex = lastS1;
}
Figure: the unknown region shrinks from the left as each new item either is swapped into S1 or stays in S2; finally the pivot p is swapped with the last item of S1.
O(n log n) sorting: Quicksort (Example)
Original array (pivot 27):                               27 28 12 39 26 16
firstUnknown = 1 (points to 28); S1 is empty.
28 belongs in S2:                                        27 28 12 39 26 16
12 belongs in S1, so swap 28 and 12:                     27 12 28 39 26 16
39 belongs in S2:                                        27 12 28 39 26 16
26 belongs in S1, so swap 28 and 26:                     27 12 26 39 28 16
16 belongs in S1, so swap 39 and 16:                     27 12 26 16 28 39
S1 and S2 are determined.
Place the pivot between S1 and S2:                       16 12 26 27 28 39
O(n log n) sorting: Quicksort (Efficiency Analysis)
• Worst case: If the pivot is the smallest item in the array segment, S1 will remain empty.
  • S2 decreases in size by only 1 at each recursive call.
  • Level 1 requires n-1 comparisons.
  • Level 2 requires n-2 comparisons.
  • Thus, (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2 = O(n^2)
  • Then, how can we select the best pivot?
• Average case (idealized): S1 and S2 contain about the same number of items.
  • log n or log n + 1 levels of recursion occur.
  • Level k requires at most n-k comparisons.
  • Thus, at most (n-1) * (log n + 1) = O(n log n)
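One common heuristic for the pivot question is median-of-three. The lecture code calls choosePivot( ) but does not show its body, so the following is only a sketch of one possible choice, assuming the same contract as the partition code (the chosen pivot ends up in a[first]).

```cpp
#include <utility>
#include <vector>

// Assumed contract (from the partition code's comment): leave the pivot in a[first].
// Median-of-three is a swapped-in heuristic, not necessarily the course's choice.
template <class Comparable>
void choosePivot( std::vector<Comparable> &a, int first, int last ) {
    int mid = first + ( last - first ) / 2;
    // Order the three candidates so that a[first] <= a[mid] <= a[last].
    if ( a[mid] < a[first] )  std::swap( a[mid], a[first] );
    if ( a[last] < a[first] ) std::swap( a[last], a[first] );
    if ( a[last] < a[mid] )   std::swap( a[last], a[mid] );
    // The median of the three now sits at mid; move it to the front.
    std::swap( a[first], a[mid] );
}
```

With this choice an already sorted segment no longer hits the worst case, because the middle item becomes the pivot. Adversarial inputs that force O(n^2) still exist.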
O(n log n) sorting: Mergesort versus Quicksort
• Then, why do we need Quicksort?
  • Reasons:
    • Mergesort requires item-copying operations from the array a to the temp array and vice versa.
    • Quicksort's worst-case situation is not typical.
• Then, why do we need Mergesort?
  • Reason:
    • If you sort a linked list, no item-copying operations are necessary.
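The linked-list point can be sketched with std::list, whose splice( ) and merge( ) members relink nodes instead of copying items. The function name listMergesort is hypothetical; this illustrates the idea and is not code from the course.

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Merge sort on a linked list: items stay in their nodes; only links change.
template <class Comparable>
void listMergesort( std::list<Comparable> &items ) {
    if ( items.size( ) < 2 )
        return;
    // Detach the second half of the nodes into its own list (no item copies).
    std::list<Comparable> secondHalf;
    auto middle = std::next( items.begin( ),
                             static_cast<std::ptrdiff_t>( items.size( ) / 2 ) );
    secondHalf.splice( secondHalf.begin( ), items, middle, items.end( ) );
    listMergesort( items );
    listMergesort( secondHalf );
    // merge( ) relinks the nodes of secondHalf into items in sorted order.
    items.merge( secondHalf );
}
```

In practice std::list::sort( ) already provides an O(n log n) sort, so this version is mainly useful for seeing where the copying disappears.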
O(n) sorting: Radix Sort (Algorithm Overview)
(Figure: worked example of grouping and combining by digit; not recoverable from the slide text.)
O(n) sorting: Radix Sort (Efficiency Analysis)
• Each grouping step requires n shuffles.
• The number of grouping and combining steps is the number of digits.
  • In the previous example it is 4.
• Thus, for k-digit numbers, the performance is:
  • k * n = O(n), where k is independent of n
• Disadvantages:
  • Need to compare items digit by digit, in the same digit order, rather than as whole items.
  • Need to accommodate 10 groups for numbers
  • Need to accommodate 27 groups for strings (alphabet + blank)
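The grouping and combining steps can be sketched as follows. This is a minimal illustration assuming non-negative integers in decimal, processed from the least significant digit; the name radixSort and the digits parameter are hypothetical.

```cpp
#include <vector>

// Sorts non-negative integers that have at most `digits` decimal digits.
void radixSort( std::vector<int> &a, int digits ) {
    long long divisor = 1;   // selects the current digit: 1s, 10s, 100s, ...
    for ( int d = 0; d < digits; ++d ) {
        // Grouping: distribute the n items into 10 groups by the current digit.
        std::vector<std::vector<int>> groups( 10 );
        for ( int item : a )
            groups[( item / divisor ) % 10].push_back( item );
        // Combining: concatenate the groups, keeping the order within each group.
        int index = 0;
        for ( const auto &group : groups )
            for ( int item : group )
                a[index++] = item;
        divisor *= 10;
    }
}
```

Each pass touches every item once, so k passes cost k * n. The 10 groups are the extra storage the slide lists as a disadvantage.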
A Comparison of Sorting Algorithms
                  Worst case   Average case
Selection sort    n^2          n^2
Bubble sort       n^2          n^2
Insertion sort    n^2          n^2
Shell sort        n^2          n^(3/2) or n^(5/4), depending on the increment
Mergesort         n log n      n log n
Quicksort         n^2          n log n
Radix sort        n            n
Treesort          n^2          n log n     (studied in CSS343)
Heapsort          n log n      n log n     (studied in CSS343)
Question: do we really need to always use mergesort or quicksort?
Lab Work
• Partial Quicksort
  • Find the top k items
  • Find the bottom k items
  • Find the median
• Key Idea:
  • Focus on only one of partition[first, pivot -1] or partition[pivot, last], whichever fits the requirement: top k, bottom k, or middle.
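The key idea can be sketched as a selection routine that keeps partitioning only the side that contains index k. This is one possible shape under stated assumptions, not the lab's expected solution: the names partitionAroundFirst and partialQuicksort are hypothetical, and the pivot is simply a[first].

```cpp
#include <utility>
#include <vector>

// Same scheme as the lecture's partition( ), with a[first] as the pivot.
// Returns the pivot's final index.
template <class Comparable>
int partitionAroundFirst( std::vector<Comparable> &a, int first, int last ) {
    Comparable pivot = a[first];
    int lastS1 = first;
    for ( int firstUnknown = first + 1; firstUnknown <= last; ++firstUnknown )
        if ( a[firstUnknown] < pivot ) {
            ++lastS1;
            std::swap( a[firstUnknown], a[lastS1] );
        }
    std::swap( a[first], a[lastS1] );
    return lastS1;
}

// Rearranges a[first..last] so that a[k] holds the item a full sort would
// put there, with no larger item before it and no smaller item after it.
template <class Comparable>
void partialQuicksort( std::vector<Comparable> &a, int first, int last, int k ) {
    while ( first < last ) {
        int pivotIndex = partitionAroundFirst( a, first, last );
        if ( k == pivotIndex )
            return;
        if ( k < pivotIndex )
            last = pivotIndex - 1;    // continue only in the left partition
        else
            first = pivotIndex + 1;   // continue only in the right partition
    }
}
```

Calling it with k = size/2 places a median at a[k]. Afterwards the items before index k are the bottom k and the items from k onward are the top size - k, in no particular order within each part.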
Programming Assignment
• In-Place Sorting
  • Sort data items only in the original array. Example: Quick Sort
  • Impractical for Merge Sort
• Non-Recursive, Semi-In-Place Merge Sort
  • Using a loop rather than recursion.
  • Using only one additional temporary array.
  • Moving data from the original to the temporary array or vice versa at each stage:
    orig: 8 5 4 1 7 2 6 3
    temp: 5 8 1 4 2 7 3 6
    orig: 1 4 5 8 2 3 6 7
    temp: 1 2 3 4 5 6 7 8
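The staged data movement in the example can be sketched with a bottom-up loop that alternates between the original and the temporary array. This is one possible shape under stated assumptions, not the assignment's reference solution; the name bottomUpMergesort is hypothetical.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

template <class Comparable>
void bottomUpMergesort( std::vector<Comparable> &a ) {
    const int n = static_cast<int>( a.size( ) );
    std::vector<Comparable> temp( a );      // the one additional array
    std::vector<Comparable> *from = &a;     // source of the current stage
    std::vector<Comparable> *to = &temp;    // destination of the current stage
    for ( int width = 1; width < n; width *= 2 ) {
        // Merge neighboring sorted runs of length width from *from into *to.
        for ( int first = 0; first < n; first += 2 * width ) {
            int mid = std::min( first + width, n );
            int last = std::min( first + 2 * width, n );
            int left = first, right = mid, index = first;
            while ( left < mid && right < last )
                (*to)[index++] = ( (*from)[right] < (*from)[left] )
                                     ? (*from)[right++] : (*from)[left++];
            while ( left < mid )   (*to)[index++] = (*from)[left++];
            while ( right < last ) (*to)[index++] = (*from)[right++];
        }
        std::swap( from, to );              // the next stage goes the other way
    }
    if ( from != &a )
        a = *from;                          // the sorted data ended up in temp
}
```

On the eight-item example the stages produce temp, then orig, then temp, as in the trace above, followed by one final copy back into the original array.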