Chapter 7 Sorting
• Sorting is a very useful and frequently used operation
• It requires fast algorithms
• Simple algorithms sort in O(N^2)
• More complicated algorithms sort in O(N log N)
• Any general-purpose sorting algorithm requires Ω(N log N) comparisons
2110211 Intro. to Data Structures Chapter 7 Sorting Veera Muangsin, Dept. of Computer Engineering, Chulalongkorn University
Sorting
What is covered in this chapter:
• Sorting an array of integers
• Comparison-based sorting: the main operations are compare and swap
• The entire sort is assumed to fit in main memory
Sorting Algorithms
• Insertion Sort
• Shellsort
• Heapsort
• Mergesort
• Quicksort
• Bucket Sort
Insertion Sort
• Sorts N elements in N-1 passes (pass 1 to N-1)
• In pass p
    • insertion sort ensures that the elements in positions 0 through p are in sorted order
    • the elements in positions 0 through p-1 are already in sorted order
    • the element in position p is moved left until its correct place is found
Insertion Sort
• The element in position p is saved in tmp
• All larger elements prior to position p are moved one spot to the right
• Then tmp is placed in the correct spot
public static void insertionSort( Comparable [ ] a )
{
    int j;
    for( int p = 1; p < a.length; p++ )
    {
        Comparable tmp = a[ p ];   // element to insert in this pass
        for( j = p; j > 0 && tmp.compareTo( a[ j - 1 ] ) < 0; j-- )
            a[ j ] = a[ j - 1 ];   // shift the larger element one spot right
        a[ j ] = tmp;              // place tmp in its correct spot
    }
}
Analysis
• Insertion sort has nested loops, and each loop can take N iterations, so insertion sort is O(N^2)
• The inner loop test is executed at most p+1 times for each value of p
• Summing over p = 1 to N-1, the inner loop test is executed at most 2 + 3 + 4 + ... + N = Θ(N^2) times
• Input in reverse order achieves this bound
Analysis
• If the input is presorted, the running time is O(N) because the test in the inner for loop always fails immediately
• The average case is Θ(N^2)
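The best and worst cases above can be checked by counting comparisons. The following is a minimal sketch, not part of the original slides: it works on an int array instead of Comparable, and the names InsertionSortCount and countComparisons are hypothetical, chosen only for illustration.

```java
final class InsertionSortCount {
    // Sorts a in place and returns the number of element comparisons made.
    static long countComparisons(int[] a) {
        long comparisons = 0;
        for (int p = 1; p < a.length; p++) {
            int tmp = a[p];
            int j = p;
            while (j > 0) {
                comparisons++;                 // one comparison of tmp with a[j - 1]
                if (tmp >= a[j - 1]) break;    // correct place found
                a[j] = a[j - 1];               // shift the larger element right
                j--;
            }
            a[j] = tmp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7, 8};
        int[] reversed = {8, 7, 6, 5, 4, 3, 2, 1};
        // Presorted input: one comparison per pass, N - 1 in total.
        System.out.println("presorted: " + countComparisons(sorted));
        // Reversed input: p comparisons in pass p, N(N-1)/2 in total.
        System.out.println("reversed:  " + countComparisons(reversed));
    }
}
```

With N = 8 the presorted input needs N - 1 = 7 comparisons, while the reversed input needs 1 + 2 + ... + 7 = 28, which is the quadratic growth described above.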
Shellsort
• Sorts N elements in t passes
• Each pass k has an associated value h_k
• The t passes use a sequence h_1, h_2, ..., h_t (called the increment sequence)
• The first pass uses h_t and the last pass uses h_1
• h_t > ... > h_2 > h_1 and h_1 = 1
• In each pass, all elements spaced h_k apart are sorted
Shellsort
• A sequence that is sorted using h_k is said to be h_k-sorted
• An h_k-sorted sequence that is then h_{k-1}-sorted remains h_k-sorted
• An h_k-sort performs an insertion sort on h_k independent subarrays
Shellsort
• Any increment sequence works, as long as h_1 = 1
• Some choices are better than others
• A popular (but poor) increment sequence is 1, 2, 4, 8, ..., N/2, that is, h_t = ⌊N/2⌋ and h_k = ⌊h_{k+1}/2⌋
Shellsort vs. Insertion Sort
• The last pass of Shellsort performs an insertion sort on the whole array (an h_1-sort)
• Shellsort is still better than insertion sort, because its insertion sorts run on arrays that earlier passes have partially sorted
public static void shellsort( Comparable [ ] a )
{
    int j;
    for( int gap = a.length / 2; gap > 0; gap /= 2 )   // increments N/2, N/4, ..., 1
        for( int i = gap; i < a.length; i++ )
        {
            Comparable tmp = a[ i ];
            // insertion sort among the elements spaced gap apart
            for( j = i; j >= gap && tmp.compareTo( a[ j - gap ] ) < 0; j -= gap )
                a[ j ] = a[ j - gap ];
            a[ j ] = tmp;
        }
}
A Bad Case of Shellsort
• N is a power of 2
• All the increments are even, except the last increment, which is 1
• The N/2 largest numbers are in the even positions and the N/2 smallest numbers are in the odd positions (positions counted from 1)
A Bad Case of Shellsort
• No sorting is performed until the last pass
• The i-th smallest number (i ≤ N/2) is in position 2i-1
• Restoring the i-th element to its correct place requires moving it i-1 spaces
• Restoring the N/2 smallest numbers requires sum_{i=1}^{N/2} (i-1) = Ω(N^2) work
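The bad case can be reproduced directly. The sketch below is illustrative and not from the slides: it uses an int array, builds the input described above for N = 16, and counts element moves separately for the last pass. The names ShellsortBadCase, badInput, and countMoves are hypothetical.

```java
final class ShellsortBadCase {
    // Runs Shellsort with increments N/2, N/4, ..., 1 on a, sorting it in place.
    // Returns {moves before the last pass, moves in the last pass},
    // where a move is one execution of a[j] = a[j - gap].
    static long[] countMoves(int[] a) {
        long early = 0, last = 0;
        for (int gap = a.length / 2; gap > 0; gap /= 2) {
            for (int i = gap; i < a.length; i++) {
                int tmp = a[i];
                int j = i;
                for (; j >= gap && tmp < a[j - gap]; j -= gap) {
                    a[j] = a[j - gap];
                    if (gap == 1) last++; else early++;
                }
                a[j] = tmp;
            }
        }
        return new long[] {early, last};
    }

    // Builds the bad input for n a power of 2. Counting positions from 1,
    // the n/2 smallest values sit in odd positions and the n/2 largest in even ones.
    static int[] badInput(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n / 2; i++) {
            a[2 * i] = i + 1;               // small values 1 .. n/2
            a[2 * i + 1] = n / 2 + i + 1;   // large values n/2+1 .. n
        }
        return a;
    }

    public static void main(String[] args) {
        long[] moves = countMoves(badInput(16));
        System.out.println("moves before last pass: " + moves[0]);
        System.out.println("moves in last pass:     " + moves[1]);
    }
}
```

For N = 16 the passes with increments 8, 4, and 2 move nothing, and the final pass performs 0 + 1 + ... + 7 = 28 moves, matching the sum of (i-1) over the N/2 smallest numbers.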
Worst-Case Analysis
• A pass with increment h_k consists of h_k insertion sorts of about N/h_k elements
• Since insertion sort is quadratic, the total cost of a pass is O(h_k (N/h_k)^2) = O(N^2/h_k)
• Summing over all passes gives O(sum_{k=1}^{t} N^2/h_k) = O(N^2 sum_{k=1}^{t} 1/h_k) = O(N^2), because for the increments 1, 2, 4, ... the sum of 1/h_k is less than 2
Hibbard's Increments
• Increment sequence 1, 3, 7, 15, ..., 2^k - 1
• h_{k+1} = 2 h_k + 1
• Consecutive increments have no common factors
• The worst-case running time of Shellsort using Hibbard's increments is Θ(N^{3/2})
Analysis of Hibbard's Increments
• The input of an h_k-sort is already h_{k+1}-sorted and h_{k+2}-sorted (e.g. the input of the 3-sort is already 7-sorted and 15-sorted)
• Let i be the distance between two elements. If i is expressible as a linear combination of h_{k+1} and h_{k+2} with nonnegative integer coefficients, then a[p-i] ≤ a[p]
• For example, 52 = 1*7 + 3*15, so a[100] ≤ a[152] because a[100] ≤ a[107] ≤ a[122] ≤ a[137] ≤ a[152]
Analysis of Hibbard's Increments
• All integers ≥ (h_{k+1} - 1)(h_{k+2} - 1) = 8 h_k^2 + 4 h_k can be expressed as a linear combination of h_{k+1} and h_{k+2}
• Proof sketch (using h_{k+2} = 2 h_{k+1} + 1):
    i = x*h_{k+1} + y*h_{k+2}
    i+1 = x*h_{k+1} + y*(2*h_{k+1} + 1) + 1
    i+1 = x*h_{k+1} + y*(2*h_{k+1} + 1) - 2*h_{k+1} + (2*h_{k+1} + 1)
    i+1 = (x-2)*h_{k+1} + (y+1)*h_{k+2}
Analysis of Hibbard's Increments
• So a[p-i] ≤ a[p] if i ≥ 8 h_k^2 + 4 h_k
• In each pass, a[p] is never moved more than 8 h_k^2 + 4 h_k positions to the left
• Each step of the innermost for loop moves the element h_k positions, so the loop is executed at most (8 h_k^2 + 4 h_k)/h_k = 8 h_k + 4 = O(h_k) times for each position. So each pass has O(N h_k) running time
Analysis of Hibbard's Increments
• For h_k > N^{1/2}, use the bound O(N^2/h_k)
• For h_k ≤ N^{1/2}, use the bound O(N h_k)
• About half of the increments satisfy h_k < N^{1/2}
• The total time is O(sum_{k=1}^{t/2} N h_k + sum_{k=t/2+1}^{t} N^2/h_k) = O(N h_{t/2}) + O(N^2/h_{t/2}) = O(N^{3/2}), because both sums are geometric and h_{t/2} = Θ(N^{1/2})
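Hibbard's increments can be used in the Shellsort loop with only the gap sequence changed. The following is a sketch under stated assumptions, not the slides' code: it sorts an int array, starts from the largest increment of the form 2^k - 1 below the array length, and steps down with h_k = (h_{k+1} - 1)/2. The class name HibbardShellsort is hypothetical.

```java
import java.util.Arrays;

final class HibbardShellsort {
    // Shellsort on an int array using Hibbard's increments ..., 15, 7, 3, 1.
    static void sort(int[] a) {
        // Largest increment 2^k - 1 that is smaller than a.length (at least 1).
        int gap = 1;
        while (2 * gap + 1 < a.length) gap = 2 * gap + 1;
        // Step down through the sequence; (1 - 1) / 2 = 0 ends the loop.
        for (; gap >= 1; gap = (gap - 1) / 2) {
            for (int i = gap; i < a.length; i++) {
                int tmp = a[i];
                int j = i;
                for (; j >= gap && tmp < a[j - gap]; j -= gap)
                    a[j] = a[j - gap];
                a[j] = tmp;
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {81, 94, 11, 96, 12, 35, 17, 95, 28, 58, 41, 75, 15};
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```

For the 13-element array in main, the passes use the increments 7, 3, and 1.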
Sedgewick's Increments
• Sedgewick's increments are {1, 5, 19, 41, 109, ...}, whose terms have the form 9*4^i - 9*2^i + 1 or 4^i - 3*2^i + 1
• O(N^{4/3}) worst-case time and O(N^{7/6}) average time (the average-case bound is conjectured)
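The two formulas interleave to produce the sequence. The sketch below generates the increments below a given limit; it is illustrative only, and the index ranges (i ≥ 0 for the first formula, i ≥ 2 for the second, since smaller i gives a negative value) are an assumption, as the slide does not state them. The names SedgewickIncrements and below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SedgewickIncrements {
    // Returns, in increasing order, every increment smaller than limit of the form
    // 9*4^i - 9*2^i + 1 (i >= 0) or 4^i - 3*2^i + 1 (i >= 2).
    static List<Integer> below(int limit) {
        List<Integer> incs = new ArrayList<>();
        for (int i = 0; ; i++) {
            long h = 9 * (1L << (2 * i)) - 9 * (1L << i) + 1;
            if (h >= limit) break;
            incs.add((int) h);
        }
        for (int i = 2; ; i++) {
            long h = (1L << (2 * i)) - 3 * (1L << i) + 1;
            if (h >= limit) break;
            incs.add((int) h);
        }
        Collections.sort(incs);   // merge the two families into one increasing sequence
        return incs;
    }

    public static void main(String[] args) {
        System.out.println(below(1000));   // prints [1, 5, 19, 41, 109, 209, 505, 929]
    }
}
```

A Shellsort using these increments would walk the returned list from its largest entry down to 1.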
Heap Sort
• Build a binary heap of N elements and then perform N deleteMin operations
• Building a heap takes O(N) time and N deleteMin operations take O(N log N) time
• The total running time is O(N log N)
Heap Sort
• The sorted elements, which are taken out of the heap, can be placed in another array
• To avoid using an extra array for the result, put each element taken out of the heap into the array cell just vacated at the end of the heap
• To get the result in increasing order, use a max-heap instead
[Figure: the heap drawn as a tree and as an array]
Initial max-heap: 97 53 59 26 41 58 31
After one deleteMax: 59 53 58 26 41 31 | 97
[Figure: the heap drawn as a tree and as an array]
After two deleteMax: 58 53 31 26 41 | 59 97
After three deleteMax: 53 41 31 26 | 58 59 97
public static void heapsort( Comparable [ ] a )
{
    for( int i = a.length / 2; i >= 0; i-- )   // build the max-heap
        percDown( a, i, a.length );
    for( int i = a.length - 1; i > 0; i-- )
    {
        swapReferences( a, 0, i );             // deleteMax: move the maximum to the end
        percDown( a, 0, i );                   // restore heap order in a[ 0..i-1 ]
    }
}

private static int leftChild( int i )
{
    return 2 * i + 1;   // array begins at index 0
}

public static final void swapReferences( Object [ ] a, int index1, int index2 )
{
    Object tmp = a[ index1 ];
    a[ index1 ] = a[ index2 ];
    a[ index2 ] = tmp;
}
private static void percDown( Comparable [ ] a, int i, int n )
{
    int child;
    Comparable tmp;
    for( tmp = a[ i ]; leftChild( i ) < n; i = child )
    {
        child = leftChild( i );
        // pick the larger of the two children, if a right child exists
        if( child != n - 1 && a[ child ].compareTo( a[ child + 1 ] ) < 0 )
            child++;
        if( tmp.compareTo( a[ child ] ) < 0 )
            a[ i ] = a[ child ];   // move the larger child up
        else
            break;
    }
    a[ i ] = tmp;
}
Analysis
• Building the heap uses at most 2N comparisons
• The deleteMax operations together use at most 2N log N - O(N) comparisons
• So heapsort uses at most 2N log N - O(N) comparisons
Analysis
• The worst case and the average case are only slightly different
• The average number of comparisons is 2N log N - O(N log log N)
Mergesort
• The fundamental operation is merging two sorted lists
• Because the lists are sorted, this can be done in one pass through the input, if the output is put in a third list
Mergesort
• The merge step takes two input arrays A and B, an output array C, and three counters, Actr, Bctr, and Cctr
• The smaller of A[Actr] and B[Bctr] is copied to the next entry in C, and the appropriate counters are advanced
• When one input is exhausted, the remaining items of the other are copied to C
Mergesort
[Figure: merging A = 1 13 24 26 and B = 2 15 27 38 into C, with the positions of Actr, Bctr, and Cctr marked at each step]
C = (empty)
C = 1
C = 1 2
C = 1 2 13
C = 1 2 13 15
C = 1 2 13 15 24
C = 1 2 13 15 24 26 (A is exhausted)
C = 1 2 13 15 24 26 27 38 (the rest of B is copied)
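The three-counter merge described above can be sketched as follows. This is an illustration on int arrays, not the slides' in-place routine; the class name MergeSketch is hypothetical, and the counters are named after Actr, Bctr, and Cctr.

```java
import java.util.Arrays;

final class MergeSketch {
    // Merges two sorted arrays a and b into a new sorted array c.
    static int[] merge(int[] a, int[] b) {
        int[] c = new int[a.length + b.length];
        int actr = 0, bctr = 0, cctr = 0;
        // Copy the smaller front element and advance the matching counters.
        while (actr < a.length && bctr < b.length) {
            if (a[actr] <= b[bctr]) c[cctr++] = a[actr++];
            else c[cctr++] = b[bctr++];
        }
        // One input is exhausted; copy whatever remains of the other.
        while (actr < a.length) c[cctr++] = a[actr++];
        while (bctr < b.length) c[cctr++] = b[bctr++];
        return c;
    }

    public static void main(String[] args) {
        int[] c = merge(new int[] {1, 13, 24, 26}, new int[] {2, 15, 27, 38});
        System.out.println(Arrays.toString(c));   // prints [1, 2, 13, 15, 24, 26, 27, 38]
    }
}
```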
Mergesort
• If N > 1, recursively mergesort the first half and the second half, then merge the two sorted halves
• If N = 1, there is only one element to sort: this is the base case
Mergesort: Divide and Conquer
[Figure: recursion tree, dividing on the way down and merging on the way up]
24 13 26 1 2 27 38 15
24 13 26 1 | 2 27 38 15
24 13 | 26 1 | 2 27 | 38 15
24 | 13 | 26 | 1 | 2 | 27 | 38 | 15
13 24 | 1 26 | 2 27 | 15 38
1 13 24 26 | 2 15 27 38
1 2 13 15 24 26 27 38
Analysis
• Merging two sorted lists with N elements in total is linear, because at most N-1 comparisons are made
• For N = 1, the time to mergesort is constant
• Otherwise, the time to mergesort N numbers is the time to do two recursive mergesorts of size N/2, plus the linear time to merge: T(1) = 1, T(N) = 2T(N/2) + N
• Solving the recurrence gives T(N) = N log N + N = O(N log N)
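The closed form follows from the recurrence by telescoping. The derivation below is a sketch that assumes N is a power of 2 and takes T(1) = 1 with a merge cost of exactly N; both are simplifying assumptions.

```latex
\begin{align*}
T(1) &= 1, \qquad T(N) = 2\,T(N/2) + N \\
\frac{T(N)}{N} &= \frac{T(N/2)}{N/2} + 1 \qquad \text{(divide by } N\text{)} \\
\frac{T(N/2)}{N/2} &= \frac{T(N/4)}{N/4} + 1 \\
&\;\;\vdots \\
\frac{T(2)}{2} &= \frac{T(1)}{1} + 1 \\
\frac{T(N)}{N} &= \frac{T(1)}{1} + \log_2 N \qquad \text{(add the } \log_2 N \text{ equations; the middle terms cancel)} \\
T(N) &= N \log_2 N + N
\end{align*}
```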
public static void mergeSort( Comparable [ ] a )
{
    Comparable [ ] tmpArray = new Comparable[ a.length ];   // one temporary array shared by all merges
    mergeSort( a, tmpArray, 0, a.length - 1 );
}

private static void mergeSort( Comparable [ ] a, Comparable [ ] tmpArray, int left, int right )
{
    if( left < right )
    {
        int center = ( left + right ) / 2;
        mergeSort( a, tmpArray, left, center );          // sort the first half
        mergeSort( a, tmpArray, center + 1, right );     // sort the second half
        merge( a, tmpArray, left, center + 1, right );   // merge the two sorted halves
    }
}
private static void merge( Comparable [ ] a, Comparable [ ] tmpArray, int leftPos, int rightPos, int rightEnd )
{
    int leftEnd = rightPos - 1;
    int tmpPos = leftPos;
    int numElements = rightEnd - leftPos + 1;

    while( leftPos <= leftEnd && rightPos <= rightEnd )
        if( a[ leftPos ].compareTo( a[ rightPos ] ) <= 0 )
            tmpArray[ tmpPos++ ] = a[ leftPos++ ];
        else
            tmpArray[ tmpPos++ ] = a[ rightPos++ ];

    while( leftPos <= leftEnd )      // Copy rest of first half
        tmpArray[ tmpPos++ ] = a[ leftPos++ ];

    while( rightPos <= rightEnd )    // Copy rest of right half
        tmpArray[ tmpPos++ ] = a[ rightPos++ ];

    for( int i = 0; i < numElements; i++, rightEnd-- )   // Copy tmpArray back
        a[ rightEnd ] = tmpArray[ rightEnd ];
}
Quicksort
Divide-and-conquer recursive algorithm
1. If the number of elements in S is 0 or 1, then return
2. Pick any element v in S. This is called the pivot
3. Partition the remaining elements in S (S - {v}) into two disjoint groups, S1 and S2. S1 contains elements ≤ v, S2 contains elements ≥ v
4. Return {quicksort(S1), v, quicksort(S2)}
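The four steps translate almost literally into code if extra lists are allowed. The sketch below is for illustration and is not an efficient in-place version: it picks the middle element as the pivot and sends elements equal to the pivot to S2, both arbitrary choices that the steps permit. The class name SimpleQuicksort is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SimpleQuicksort {
    // Returns a new sorted list; the input list is not modified.
    static List<Integer> quicksort(List<Integer> s) {
        if (s.size() <= 1) return new ArrayList<>(s);   // step 1
        int pivotIndex = s.size() / 2;                  // step 2: any element may be picked
        int v = s.get(pivotIndex);
        List<Integer> s1 = new ArrayList<>();
        List<Integer> s2 = new ArrayList<>();
        for (int k = 0; k < s.size(); k++) {            // step 3: partition S - {v}
            if (k == pivotIndex) continue;
            int x = s.get(k);
            if (x < v) s1.add(x); else s2.add(x);
        }
        List<Integer> result = quicksort(s1);           // step 4
        result.add(v);
        result.addAll(quicksort(s2));
        return result;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(13, 81, 92, 43, 65, 31, 57, 26, 75, 0);
        System.out.println(quicksort(input));
    }
}
```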
[Figure: quicksort example]
Input: 13 81 92 43 65 31 57 26 75 0
Select pivot: 65
Partition: S1 = {13, 0, 26, 43, 57, 31}, pivot 65, S2 = {92, 75, 81}
Quicksort S1: 0 13 26 31 43 57; quicksort S2: 75 81 92
Result: 0 13 26 31 43 57 65 75 81 92
Quicksort vs. Mergesort
• In quicksort, the subproblems need not be of equal size
• Quicksort is faster because the partitioning step can be performed in place and very efficiently
Picking the Pivot
• Use the first element as the pivot
    • Bad choice
    • If the input is presorted or in reverse order, the partitioning is poor because all the elements go into S1 or all go into S2
    • If the input is presorted, quicksort takes quadratic time to do nothing useful
• Use the larger of the first two distinct elements
    • Also bad
Picking the Pivot
• Choose the pivot randomly
    • Generally safe
    • Generating random numbers is expensive
    • Does not reduce the average running time
• Median-of-three partitioning
    • The best choice would be the median of the array
    • A good estimate is the median of the left, right, and center elements
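Median-of-three selection can be sketched as a small helper. This is an illustrative version on an int array that only returns the index of the median of the left, center, and right elements, without rearranging them; the names MedianOfThree and medianIndex are hypothetical.

```java
final class MedianOfThree {
    // Returns the index (left, center, or right) that holds the median of the three values.
    static int medianIndex(int[] a, int left, int right) {
        int center = left + (right - left) / 2;
        int x = a[left], y = a[center], z = a[right];
        if ((x <= y && y <= z) || (z <= y && y <= x)) return center;
        if ((y <= x && x <= z) || (z <= x && x <= y)) return left;
        return right;
    }

    public static void main(String[] args) {
        int[] a = {8, 1, 4, 9, 0, 3, 5, 2, 7, 6};
        int m = medianIndex(a, 0, a.length - 1);
        // left = 8, center = 0, right = 6, so the median is 6 at index 9.
        System.out.println("pivot " + a[m] + " at index " + m);
    }
}
```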
Partitioning Strategy
1. Swap the pivot with the last element
2. i starts at the first element and j starts at the next-to-last element
3. Move i right, skipping over elements smaller than the pivot. Move j left, skipping over elements larger than the pivot. Both i and j stop when they encounter an element equal to the pivot
4. When i and j have stopped, if i is to the left of j, swap their elements
5. Repeat 3 and 4 until i and j cross
6. Swap the pivot with i's element
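The six steps above can be sketched as a single method. This is a minimal illustration on an int array, with the pivot position passed in by the caller; the bounds checks on i and j are a defensive addition, and the names PartitionSketch and partition are hypothetical.

```java
import java.util.Arrays;

final class PartitionSketch {
    // Partitions a[left..right] around a[pivotIndex]; returns the pivot's final index.
    static int partition(int[] a, int left, int right, int pivotIndex) {
        swap(a, pivotIndex, right);                    // step 1: pivot to the last position
        int pivot = a[right];
        int i = left, j = right - 1;                   // step 2
        while (true) {
            while (i < right && a[i] < pivot) i++;     // step 3: i skips smaller elements
            while (j > left && a[j] > pivot) j--;      //         j skips larger elements
            if (i >= j) break;                         // step 5: i and j have crossed
            swap(a, i, j);                             // step 4
            i++;
            j--;
        }
        swap(a, i, right);                             // step 6: pivot to its final place
        return i;
    }

    private static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    public static void main(String[] args) {
        int[] a = {8, 1, 4, 9, 0, 3, 5, 2, 7, 6};
        int p = partition(a, 0, a.length - 1, a.length - 1);
        System.out.println(Arrays.toString(a) + ", pivot index " + p);
    }
}
```

On the array 8 1 4 9 0 3 5 2 7 6 with pivot 6, the method swaps 8 with 2, then 9 with 5, and finally places the pivot at index 6, leaving 2 1 4 5 0 3 6 8 7 9.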
Partitioning
[Figure: positions of i and j during partitioning; the pivot 6 is in the last position]
8 1 4 9 0 3 5 2 7 6   (start: i at 8, j at 7)
8 1 4 9 0 3 5 2 7 6   (i stops at 8, j stops at 2)
2 1 4 9 0 3 5 8 7 6   (after swapping 8 and 2)
2 1 4 9 0 3 5 8 7 6   (i stops at 9, j stops at 5)