Class 15 - Recursive sorting methods
• Processing arrays by recursion
• Divide-and-conquer algorithm
• Quicksort

Processing arrays by recursion
• Follow the usual principle: to define f(A), assume f(B) can be calculated for any array B smaller than A.
• But instead of actually passing in a smaller array, pass A together with arguments indicating which part of A to process.
• Examples:

    void f (double[] A, int i, int j) {
      // process A[i] ... A[j]
      ... recursive call f(A, i+1, j) ...
    }

Processing arrays by recursion (cont.)

    void f (double[] A, int i, int j) {
      // process A[i] ... A[j]
      ... recursive call f(A, i, j-1) ...
    }

    void f (double[] A, int i) {
      // process A[i] ... A[A.length-1]
      ... recursive call f(A, i+1) ...
    }

    void f (double[] A, int i) {
      // process A[0] ... A[i]
      ... recursive call f(A, i-1) ...
    }

Processing arrays by recursion (cont.)
• Base cases

    void f (double[] A, int i, int j) {
      // process A[i] ... A[j]
      // i == j: 1-element subarray
      // i > j:  0-element subarray
      ...
    }

    void f (double[] A, int i) {
      // process A[i] ... A[A.length-1]
      // i == A.length-1: 1-element subarray
      // i == A.length:   0-element subarray
      ...
    }
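
As a concrete instance of this pattern, here is a minimal sketch that sums A[i] ... A[j] by processing A[i] and recursing on A[i+1] ... A[j]. The class name RangeSum and the method name sum are illustrative and do not come from the slides.

```java
// Illustrative sketch only; RangeSum and sum are hypothetical names.
final class RangeSum {
    // Returns A[i] + ... + A[j]. An empty range (i > j) sums to 0.
    static double sum(double[] A, int i, int j) {
        if (i > j) {
            return 0.0;                  // base case: 0-element subarray
        }
        return A[i] + sum(A, i + 1, j);  // process A[i], recurse on A[i+1..j]
    }

    public static void main(String[] args) {
        double[] A = {17, 9, 22, 5};
        System.out.println(sum(A, 0, A.length - 1)); // sum of the whole array
        System.out.println(sum(A, 1, 2));            // sum of A[1] and A[2]
    }
}
```

Note that the array itself is never copied: only the index arguments change from call to call.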

Divide-and-conquer methods
• Greater efficiency can sometimes be obtained by dividing the array in half and operating on each half recursively.

    void f (double[] A, int i, int j) {
      // process subarray A[i] ... A[j]
      ... process A[i]...A[j] in linear time ...
      f(A, i, (i+j)/2);
      f(A, (i+j)/2+1, j);
      ... process A[i]...A[j] in linear time ...
    }
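
Divide-and-conquer methods
Pre- and post-processing on n elements takes time cn, so each level of the recursion costs cn in total:

    level 0:  cn                                  total: cn
    level 1:  c(n/2)  c(n/2)                      total: cn
    level 2:  c(n/4)  c(n/4)  c(n/4)  c(n/4)      total: cn
    . . .

Total time: cn * # of levels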
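
The same count can be written as a recurrence. This is a sketch under two assumptions that the slide does not state: n is a power of 2, and a 1-element subarray costs c.

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + cn, \qquad T(1) = c \\
     &= \underbrace{cn + cn + \cdots + cn}_{\log_2 n + 1 \text{ levels}} \\
     &= cn\,(\log_2 n + 1)
\end{aligned}
```

Under these assumptions the number of levels is log_2 n + 1, so the total time grows in proportion to n log n.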

Divide-and-conquer methods for sorting
• Consider subarrays of the form A[i]...A[j]. Assuming we can sort arbitrary subarrays (in particular, A[i]...A[(i+j)/2] and A[(i+j)/2+1]...A[j]), how can we define a divide-and-conquer sorting method?
• Two “obvious” methods:

Divide-and-conquer methods for sorting (cont.)
• “Post-process”: To sort A[i]...A[j]:
  • Sort A[i]...A[(i+j)/2]
  • Sort A[(i+j)/2+1]...A[j]
  • “Merge” sorted halves

    start:   17  9 22  5  4 16  8 12
    sort:     5  9 17 22  4 16  8 12
    sort:     5  9 17 22  4  8 12 16
    merge:    4  5  8  9 12 16 17 22
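
The “post-process” method can be sketched as follows. This is one possible way to write the merge step, using a temporary array; the names MergeSortSketch, mergeSort, and merge are illustrative and do not come from the slides.

```java
// Illustrative sketch only; all names here are hypothetical.
final class MergeSortSketch {
    // Sort A[i]...A[j]: sort each half recursively, then merge the halves.
    static void mergeSort(double[] A, int i, int j) {
        if (i >= j) {
            return;                      // 0 or 1 element: already sorted
        }
        int midpt = (i + j) / 2;
        mergeSort(A, i, midpt);
        mergeSort(A, midpt + 1, j);
        merge(A, i, midpt, j);           // linear-time post-processing
    }

    // Merge the sorted runs A[i]...A[midpt] and A[midpt+1]...A[j].
    static void merge(double[] A, int i, int midpt, int j) {
        double[] tmp = new double[j - i + 1];
        int left = i, right = midpt + 1, k = 0;
        while (left <= midpt && right <= j) {
            // Copy the smaller front element of the two runs.
            tmp[k++] = (A[left] <= A[right]) ? A[left++] : A[right++];
        }
        while (left <= midpt) tmp[k++] = A[left++];   // leftover left run
        while (right <= j) tmp[k++] = A[right++];     // leftover right run
        System.arraycopy(tmp, 0, A, i, tmp.length);
    }

    public static void main(String[] args) {
        double[] A = {17, 9, 22, 5, 4, 16, 8, 12};
        mergeSort(A, 0, A.length - 1);
        System.out.println(java.util.Arrays.toString(A));
    }
}
```

The temporary array keeps the merge simple at the cost of extra memory proportional to the subarray size.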

Divide-and-conquer methods for sorting (cont.)
• “Pre-process”: To sort A[i]...A[j]:
  • Move smaller elements into the left half and larger elements into the right half (“partition”)
  • Sort A[i]...A[(i+j)/2]
  • Sort A[(i+j)/2+1]...A[j]

    start:       17  9 22  5  4 16  8 12
    partition:    4  9  8  5 17 16 22 12
    sort:         4  5  8  9 17 16 22 12
    sort:         4  5  8  9 12 16 17 22

Divide-and-conquer methods for sorting (cont.)
• Method 1 is called merge sort
  • But we need to write the merge step
• Method 2 is called quicksort
  • But we need to write the partition step

Quicksort
“Pre-process”: To sort A[i]...A[j]:
• Move smaller elements into the left half and larger elements into the right half (“partition”)
• Sort A[i]...A[(i+j)/2]
• Sort A[(i+j)/2+1]...A[j]

    start:       17  9 22  5  4 16  8 12
    partition:    4  9  8  5 17 16 22 12
    sort:         4  5  8  9 17 16 22 12
    sort:         4  5  8  9 12 16 17 22

Quicksort (cont.)

    void quickSort (double[] A, int i, int j) {
      // sort A[i]...A[j]
      if (base case)
        ... handle base case ...;
      else {
        partition(A, i, j);
        int midpt = (i+j)/2;
        quickSort(A, i, midpt);
        quickSort(A, midpt+1, j);
      }
    }

(N.B. This is the right idea, but it doesn’t work in this form. We’ll fix it in a little while.)

Quicksort (cont.)
• The only question is how to partition.
• We can divide the problem into two subproblems:
  • locating the “median” element in A[i..j], say A[m]. (This value is called the pivot.)
  • moving elements smaller than A[m] to the left and larger elements to the right.

    void partition (double[] A, int i, int j) {
      int pivotLoc = locOfMedian(A, i, j);
      swap(A, i, pivotLoc);
      partition1(A, i+1, j, A[i]);
    }

partition1, first version
• partition1(double[] A, int i, int j, double pivot): shuffle the elements of A[i..j] so that all elements less than pivot appear to the left of all elements greater than pivot.
• If pivot is the median of the elements of A[i..j], then this will split the subarray exactly in half, so that the “middle” will be (i+j)/2.

partition1, first version (cont.)
• The idea is simple: suppose we can partition A[i+1..j] or A[i..j-1] recursively. Two cases:
  • A[i] < pivot: Partition A[i+1..j] recursively.
  • A[i] > pivot: Swap A[i] and A[j], then partition A[i..j-1] recursively.

partition1, first version (cont.)

    void partition1 (double[] A, int i, int j, double pivot) {
      // Shuffle elements of A[i..j] so that elements
      // < pivot appear to the left of elements > pivot
      if (A[i]<pivot) …
      … write this!
    }
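
One possible way to complete this exercise, following the two cases described for the first version, is sketched below. It is not the slides' own solution. It assumes a base case of zero or one remaining element and sends elements equal to the pivot to the right, neither of which the slides specify. The class name PartitionFirstVersion and the swap helper are illustrative.

```java
// Illustrative sketch only; not the slides' own answer to the exercise.
final class PartitionFirstVersion {
    // Shuffle A[i..j] so that elements < pivot end up to the left of
    // elements >= pivot (assumption: ties go to the right).
    static void partition1(double[] A, int i, int j, double pivot) {
        if (i >= j) {
            return;                          // assumed base case: 0 or 1 element
        }
        if (A[i] < pivot) {
            partition1(A, i + 1, j, pivot);  // A[i] is already on the left
        } else {
            swap(A, i, j);                   // move A[i] to the right end
            partition1(A, i, j - 1, pivot);
        }
    }

    // Hypothetical helper: exchange A[x] and A[y].
    static void swap(double[] A, int x, int y) {
        double t = A[x];
        A[x] = A[y];
        A[y] = t;
    }

    public static void main(String[] args) {
        double[] A = {17, 9, 22, 5, 4, 16, 8, 12};
        partition1(A, 0, A.length - 1, 10.0);
        System.out.println(java.util.Arrays.toString(A));
    }
}
```

Each call shrinks the unprocessed range by one element, so the partition takes linear time.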

Finding the median
• Next problem: find the median of A[i..j].
• It can easily be done in quadratic time, but how can it be done in linear time?
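
For reference, the quadratic-time approach might look like the sketch below: for each candidate, count how many elements are smaller and how many are equal, and return the candidate whose rank is in the middle. The method name locOfMedian is taken from the partition slide, but this body is an assumption, as is the choice of the lower median when the subarray has even length.

```java
// Illustrative sketch only; the body of locOfMedian is an assumption.
final class QuadraticMedian {
    // Returns an index m in [i, j] such that A[m] is a median of A[i..j].
    // Runs in quadratic time: one linear scan per candidate element.
    static int locOfMedian(double[] A, int i, int j) {
        int targetRank = (j - i) / 2;   // 0-based rank of the lower median
        for (int k = i; k <= j; k++) {
            int less = 0, equal = 0;
            for (int t = i; t <= j; t++) {
                if (A[t] < A[k]) less++;
                else if (A[t] == A[k]) equal++;
            }
            // A[k] occupies sorted positions less .. less+equal-1.
            if (less <= targetRank && targetRank < less + equal) {
                return k;
            }
        }
        throw new IllegalArgumentException("empty range or NaN values");
    }

    public static void main(String[] args) {
        double[] A = {17, 9, 22, 5, 4, 16, 8, 12};
        int m = locOfMedian(A, 0, A.length - 1);
        System.out.println("median index " + m + ", value " + A[m]);
    }
}
```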

Guessing the median
• Unfortunately, it can’t be done simply: linear-time median-finding algorithms exist, but they are complicated and too slow in practice to pay off here. Second-best solution: guess the median and partition around the guess, hoping for the best.
• Major point: Since we can’t be sure our guess will be correct, we don’t know where the “middle” will end up. Therefore, partition1 needs to return an integer giving the location of the dividing line between small and large values.

Quicksort again
• Without the ability to divide the array exactly in half, we need to change the structure of quicksort somewhat:

    void quickSort (double[] A, int i, int j) {
      int m;
      if (i < j) {
        m = partition(A, i, j);
        quickSort(A, i, m-1);
        quickSort(A, m+1, j);
      }
    }

Quicksort again (cont.)
• The difference is that partition has to tell quicksort where it ended up splitting the array:

    int partition (double[] A, int i, int j) {
      // Guess median of A[i] ... A[j],
      // and move other elements so that
      // A[i] ... A[m-1] are all at most A[m] and
      // A[m+1] ... A[j] are all at least A[m]
      // m is returned to the caller
      swap(A, i, guessMedianLocation(A, i, j));
      int m = partition1(A, i+1, j, A[i]);
      swap(A, i, m);
      return m;
    }

partition1
• partition1 must return the “middle”.

    int partition1 (double[] A, int i, int j, double pivot) {
      // Shuffle elements of A[i..j] so that elements
      // < pivot appear to the left of elements > pivot
      if (base case) ...
      if (A[i] <= pivot)
        // A[i] in correct half
        return partition1(A, i+1, j, pivot);
      else if (A[j] > pivot)
        // A[j] in correct half
        return partition1(A, i, j-1, pivot);
      else {
        // A[i] and A[j] in wrong half
        swap(A, i, j);
        return partition1(A, i, j-1, pivot);
      }
    }

partition1 (cont.)
• We need to handle the base cases. There is one key point to remember here: the index returned by partition1 must hold a value no greater than the pivot (or be the pivot’s own slot), because that value will be swapped back into A[i]. The base case needs to guarantee this:

    // Base case for partition1:
    if (j == i) {
      if (A[i] < pivot)
        return i;
      else
        return i-1;
    }

Guessing the median
• Making a good guess concerning the median is very important.
• Here is a simple approach:

    int guessMedianLocation (double[] A, int i, int j) {
      return (i+j)/2;
    }
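
Putting the pieces from the preceding slides together gives the runnable sketch below. The method bodies follow the slides. The enclosing class QuickSortSketch, the swap helper body, and the main method are assumptions added so that the code compiles on its own.

```java
// Assembled from the slides; class name, swap body, and main are assumptions.
final class QuickSortSketch {
    // Sort A[i]...A[j].
    static void quickSort(double[] A, int i, int j) {
        if (i < j) {
            int m = partition(A, i, j);
            quickSort(A, i, m - 1);
            quickSort(A, m + 1, j);
        }
    }

    // Partition A[i..j] around a guessed median; return the pivot's final index.
    static int partition(double[] A, int i, int j) {
        swap(A, i, guessMedianLocation(A, i, j));  // park the pivot in A[i]
        int m = partition1(A, i + 1, j, A[i]);
        swap(A, i, m);                             // move the pivot to the boundary
        return m;
    }

    // Shuffle A[i..j] around pivot; return the last index of the "small" side.
    static int partition1(double[] A, int i, int j, double pivot) {
        if (j == i) {                              // base case: one element left
            return (A[i] < pivot) ? i : i - 1;
        }
        if (A[i] <= pivot) {                       // A[i] in correct half
            return partition1(A, i + 1, j, pivot);
        } else if (A[j] > pivot) {                 // A[j] in correct half
            return partition1(A, i, j - 1, pivot);
        } else {                                   // both in wrong half
            swap(A, i, j);
            return partition1(A, i, j - 1, pivot);
        }
    }

    static int guessMedianLocation(double[] A, int i, int j) {
        return (i + j) / 2;
    }

    // Exchange A[x] and A[y].
    static void swap(double[] A, int x, int y) {
        double t = A[x];
        A[x] = A[y];
        A[y] = t;
    }

    public static void main(String[] args) {
        double[] A = {17, 9, 22, 5, 4, 16, 8, 12};
        quickSort(A, 0, A.length - 1);
        System.out.println(java.util.Arrays.toString(A));
    }
}
```

Because partition1 recurses once per element, its call depth grows linearly with the subarray size, so very large arrays may exhaust the Java call stack. An iterative partition loop would avoid that, at the cost of departing from the recursive style used here.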
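
Final words on quicksort
• Quicksort runs in time proportional to n log n, but only if each partition splits the subarray (roughly) in half.
• Worst-case performance for quicksort is quadratic, for example when every guessed pivot turns out to be the smallest or largest element of its subarray. With a reasonable guess, though, it is difficult to find examples where quicksort is inefficient.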