Learn about different sorting algorithms and strategies, including N^2 sorts, Quicksort, MergeSort, and their performance characteristics. Understand how many passes are needed, work per pass, and how to improve sorting efficiency.
Description Given a linear collection of items x1, x2, x3, …, xn, arrange them so that they (or some key field in them) are in ascending order x1 <= x2 <= x3 <= … <= xn or in descending order x1 >= x2 >= x3 >= … >= xn
Some Issues • internal vs. external sorting • array vs. linked list data structure • N^2 vs. N log2 N performance • worst case vs. average case big-O • additional memory requirements • stable vs. unstable sorts
N^2 sorting strategy • start with unsorted array of length N • do a series of “passes” through the array (outer loop) • during each pass do comparisons and maybe exchanges of items (inner loop) • result of a pass is to reduce size of the unsorted part of the array by 1 and increase size of the sorted part of the array by 1 • number of passes is N-1 and work/pass is O(N) so big-Oh is O(N^2)
N^2 sorting algorithms • Selection Sort • each pass selects largest/smallest remaining element and moves it to its correct location • Insertion Sort • each pass inserts 1 element into an already sorted array • Exchange (Bubble) Sort • each pass does a series of exchanges so that 1 element moves to where it belongs
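The pass structure described above can be sketched in C++ roughly as follows. This is a minimal illustration, assuming `int` items in a `std::vector`; the function names `selectionSort` and `bubbleSort` are chosen here and do not come from the slides.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort: each pass finds the smallest remaining element and
// swaps it into the first position of the unsorted part.
void selectionSort(std::vector<int>& a) {
    for (std::size_t pass = 0; pass + 1 < a.size(); ++pass) {
        std::size_t smallest = pass;
        for (std::size_t i = pass + 1; i < a.size(); ++i) {
            if (a[i] < a[smallest]) smallest = i;
        }
        if (smallest != pass) std::swap(a[pass], a[smallest]);
    }
}

// Bubble (exchange) sort: each pass swaps adjacent out-of-order pairs, so
// the largest remaining element ends up at the end of the unsorted part.
void bubbleSort(std::vector<int>& a) {
    for (std::size_t end = a.size(); end > 1; --end) {
        bool swapped = false;
        for (std::size_t i = 0; i + 1 < end; ++i) {
            if (a[i] > a[i + 1]) {
                std::swap(a[i], a[i + 1]);
                swapped = true;
            }
        }
        if (!swapped) break;  // a pass with no exchanges means the array is sorted
    }
}
```

The early exit in `bubbleSort` is what gives the N-1 comparison best case on input that is already sorted; `selectionSort` always does the full set of comparisons but at most one exchange per pass.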
N^2 sorts compared
Algorithm            #comp        #exchanges   Big O
selection            (N^2-N)/2    N-1          N^2
bubble     worst     (N^2-N)/2    (N^2-N)/2    N^2
           average   (N^2-N)/2    (N^2-N)/4    N^2
           best      N-1          0            N
insertion  worst     (N^2-N)/2    (N^2-N)/2    N^2
           average   (N^2-N)/4    (N^2-N)/4    N^2
           best      N-1          N-1          N
insertion sort
start order   67 33 25 21 94 49
after pass 1  33 67 25 21 94 49
after pass 2  25 33 67 21 94 49
after pass 3  21 25 33 67 94 49
after pass 4  21 25 33 67 94 49
after pass 5  21 25 33 49 67 94
How many passes are needed to sort N items? Work (comparisons/exchanges) per pass?
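The trace above can be reproduced with a short C++ sketch. The helper `insertionSortTrace` is a hypothetical name introduced here; it sorts the vector and also records a snapshot after every pass, which is only useful for teaching and would be dropped in real code.

```cpp
#include <cstddef>
#include <vector>

// Sorts `a` in ascending order and returns a copy of the array as it
// looked after each pass (snapshots are for tracing only).
std::vector<std::vector<int>> insertionSortTrace(std::vector<int>& a) {
    std::vector<std::vector<int>> snapshots;
    for (std::size_t pass = 1; pass < a.size(); ++pass) {
        int value = a[pass];      // element to insert into the sorted prefix
        std::size_t j = pass;
        while (j > 0 && a[j - 1] > value) {
            a[j] = a[j - 1];      // shift larger elements one slot right
            --j;
        }
        a[j] = value;
        snapshots.push_back(a);
    }
    return snapshots;
}
```

For N items the outer loop runs N-1 times, and the inner loop shifts at most `pass` elements, which matches the O(N) work per pass counted earlier.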
how to sort faster? • N^2 algorithms do N-1 passes and O(N) work/pass --> O(N^2) • reduce passes to log2 N • mergesort • quicksort • reduce work/pass to log2 N • heapsort • don’t do any comparisons at all • radix sort
QuickSort • fastest general purpose sort • usually written recursively • many variations • usually is O(N log2 N) • performance is data-dependent • worst case is O(N^2)
Quicksort • Choose some element called a pivot • Perform a sequence of exchanges so that • All elements that are less than this pivot are to its left and • All elements that are greater than the pivot are to its right. • Divides the (sub)list into two smaller sub lists, • Each of which may then be sorted independently in the same way.
Quicksort
If the list has 0 or 1 elements, return.   // the list is sorted
Else do:
    Pick an element (near middle) as the pivot.
    Split remaining elements into two disjoint groups:
        SmallerThanPivot = {all elements < pivot}
        LargerThanPivot = {all elements > pivot}
    Return the list rearranged as:
        Quicksort(SmallerThanPivot), pivot, Quicksort(LargerThanPivot).
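QuickSort
void QuickSort (ElementType x[ ], int first, int last)
{
    if (first < last)
    {
        // Split partitions x[first..last] and returns the pivot's final position
        int pivot = Split (x, first, last);
        QuickSort (x, first, pivot - 1);   // sort the elements left of the pivot
        QuickSort (x, pivot + 1, last);    // sort the elements right of the pivot
    }
}
Initial call: QuickSort (A, 0, N-1);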
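The slides call `Split` without showing it. One possible version is sketched below, assuming `ElementType` is `int` and using the middle element as the pivot, as the pseudocode suggests. The partitioning scheme here (park the pivot at the end, then sweep once) is one common choice and is not necessarily the one the original course used.

```cpp
#include <utility>

using ElementType = int;  // assumption: the slides leave ElementType unspecified

// Partitions x[first..last] around the middle element and returns the
// index where the pivot ends up. Elements smaller than the pivot land to
// its left; the rest land to its right.
int Split(ElementType x[], int first, int last) {
    int mid = first + (last - first) / 2;
    std::swap(x[mid], x[last]);           // park the pivot at the end
    ElementType pivot = x[last];
    int boundary = first;                 // x[first..boundary-1] < pivot
    for (int i = first; i < last; ++i) {
        if (x[i] < pivot) {
            std::swap(x[i], x[boundary]);
            ++boundary;
        }
    }
    std::swap(x[boundary], x[last]);      // drop the pivot between the groups
    return boundary;
}

void QuickSort(ElementType x[], int first, int last) {
    if (first < last) {
        int pivot = Split(x, first, last);
        QuickSort(x, first, pivot - 1);
        QuickSort(x, pivot + 1, last);
    }
}
```

Each call to `Split` touches every element of its sublist once, so the work at one level of recursion is O(N) in total.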
QuickSort strategy
initial order: 45 23 13 68 54 17 70 24
after pass 1:  17 23 13 24 45 54 70 68
after pass 2:  13 17 23 24 45 54 70 68
after pass 3:  13 17 23 24 45 54 68 70
How many passes? Work/pass?
Quicksort Performance • O(n log2 n) is the average case computing time • achieved when the pivot splits the list into sublists of approximately the same size • O(n^2) worst case • occurs when Split() repeatedly produces one empty sublist • for example, a list already in order or in reverse order when the first or last element is used as the pivot
Assume Even Partitioning
N
N/2 N/2
N/4 N/4 N/4 N/4
big-Oh is N log2 N
Worst Case
N
N-1
N-2
N-3
O(N^2)
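The two pictures correspond to two recurrences. As a sketch, assume partitioning a sublist of size N costs about cN for some constant c (the symbols T and c are introduced here, not in the slides), and that N is a power of 2 in the even case:

```latex
\begin{align*}
\text{Even partitioning:}\quad
T(N) &= 2\,T(N/2) + cN \\
     &= 4\,T(N/4) + 2cN \\
     &= \dots \\
     &= N\,T(1) + cN\log_2 N = O(N \log_2 N) \\[1ex]
\text{Worst case:}\quad
T(N) &= T(N-1) + c(N-1) \\
     &= c\bigl((N-1) + (N-2) + \dots + 1\bigr) \\
     &= c\,\frac{N^2 - N}{2} = O(N^2)
\end{align*}
```

In the even case there are log2 N levels with about cN work on each; in the worst case there are N-1 levels whose sizes shrink by only 1 each time.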
Quicksort Improvements • pivot selection methods • first element • middle element • median of 3 • random choice • insertion sort for partitions of size < 20 • leave small partitions unsorted and do one insertion sort of whole array at the end • sort smaller partition first • STL sort algorithm uses improved Quicksort
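As an illustration of the median-of-3 idea listed above, the sketch below picks the median of the first, middle, and last elements of a sublist. The name `medianOfThree` and the choice to return an index are assumptions made for this example.

```cpp
// Returns the index (first, middle, or last) that holds the median of
// the three sampled values in x[first..last].
int medianOfThree(const int x[], int first, int last) {
    int mid = first + (last - first) / 2;
    int a = x[first], b = x[mid], c = x[last];
    if ((a <= b && b <= c) || (c <= b && b <= a)) return mid;
    if ((b <= a && a <= c) || (c <= a && a <= b)) return first;
    return last;
}
```

On a list that is already sorted or reversed this returns the middle index, so the split is even rather than producing an empty sublist.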
MergeSort • based on algorithm to merge 2 sorted lists into 1 sorted list • merging can be done in O(N) time • N is number of items in the 2 sorted lists • usually written recursively • "work" is done during the unwinding • requires O(N) temporary memory • what do other sorting algorithms need? • average and worst case big O is N log2 N • used for sorting large external files
Merge Algorithm
1. Open File1 and File2 for input, File3 for output
2. Read first element x from File1 and first element y from File2
3. While neither eof File1 nor eof File2
     If x < y then
       a. Write x to File3
       b. Read a new x value from File1
     Otherwise
       a. Write y to File3
       b. Read a new y from File2
   End while
4. If eof File1 encountered, copy rest of File2 into File3. If eof File2 encountered, copy rest of File1 into File3
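The same merge can be sketched for in-memory data. The version below assumes two sorted `std::vector<int>` inputs instead of files; `mergeSorted` is a name chosen for this example. It uses `<=` rather than `<` so that equal items from the first list come out first, which keeps the merge stable.

```cpp
#include <cstddef>
#include <vector>

// Merges two ascending lists into one ascending list in O(N) time,
// where N is the total number of items.
std::vector<int> mergeSorted(const std::vector<int>& left,
                             const std::vector<int>& right) {
    std::vector<int> out;
    out.reserve(left.size() + right.size());
    std::size_t i = 0, j = 0;
    while (i < left.size() && j < right.size()) {
        if (left[i] <= right[j]) out.push_back(left[i++]);
        else out.push_back(right[j++]);
    }
    while (i < left.size()) out.push_back(left[i++]);    // copy rest of left
    while (j < right.size()) out.push_back(right[j++]);  // copy rest of right
    return out;
}
```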
MergeSort strategy
initial order:  45 23 13 68 54 17 70 24
pass 1 (split): 45 23 13 68 | 54 17 70 24
pass 2 (split): 45 23 | 13 68 | 54 17 | 70 24
pass 3 (split): 45 | 23 | 13 | 68 | 54 | 17 | 70 | 24
merge:          23 45 | 13 68 | 17 54 | 24 70
merge:          13 23 45 68 | 17 24 54 70
merge:          13 17 23 24 45 54 68 70
How many passes? Work/pass?
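A recursive version of this strategy might look like the sketch below. It assumes `int` items, sorts the half-open range `[lo, hi)`, and relies on the standard `std::merge` and `std::copy` algorithms; the function name `mergeSort` and the range convention are choices made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sorts a[lo, hi) in ascending order. The splitting happens on the way
// down the recursion; the merging "work" happens while it unwinds.
void mergeSort(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return;                 // 0 or 1 items: already sorted
    std::size_t mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);
    mergeSort(a, mid, hi);

    auto first = a.begin() + static_cast<std::ptrdiff_t>(lo);
    auto middle = a.begin() + static_cast<std::ptrdiff_t>(mid);
    auto last = a.begin() + static_cast<std::ptrdiff_t>(hi);
    std::vector<int> tmp(hi - lo);           // the O(N) temporary memory
    std::merge(first, middle, middle, last, tmp.begin());
    std::copy(tmp.begin(), tmp.end(), first);
}
```

A call such as `mergeSort(v, 0, v.size())` sorts the whole vector. There are about log2 N levels of splitting, and the merges on each level touch all N items once.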
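Shellsort (D.L.Shell)
void shellsort (int[] a, int n)
{
    int i, j, k, h, v;
    // gap sequence, largest first; the final gap of 1 is a plain insertion sort
    int[] cols = {1391376, 463792, 198768, 86961, 33936, 13776, 4592,
                  1968, 861, 336, 112, 48, 21, 7, 3, 1};
    for (k = 0; k < 16; k++)
    {
        h = cols[k];
        // insertion sort over the elements that are h apart
        for (i = h; i < n; i++)
        {
            v = a[i];
            j = i;
            while (j >= h && a[j-h] > v)
            {
                a[j] = a[j-h];
                j = j - h;
            }
            a[j] = v;
        }
    }
}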
Implementing a Heap -1 • Note the placement of the nodes in the array • In a heap, every lower-value key has a parent node with a higher-value key • The array shown is NOT a heap (yet)
Implementing a Heap -2 • In an array implementation, children of the ith node are at myArray[2*i] and myArray[2*i+1] • Parent of the ith node is at myArray[i/2]
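These index formulas only work if the root sits at index 1. The sketch below checks the heap property using them, assuming a `std::vector<int>` whose slot 0 is unused padding; the function name `isMaxHeap` and the padding convention are assumptions for this example.

```cpp
#include <cstddef>
#include <vector>

// Returns true if myArray[1..n] is a max-heap, i.e. no node holds a
// larger key than its parent at myArray[i/2]. Slot 0 is ignored.
bool isMaxHeap(const std::vector<int>& myArray) {
    if (myArray.size() < 2) return true;      // no nodes at all
    std::size_t n = myArray.size() - 1;
    for (std::size_t i = 2; i <= n; ++i) {
        if (myArray[i / 2] < myArray[i]) return false;
    }
    return true;
}
```

For example, `{0, 6, 3, 5, 9, 2, 10}` (padding followed by 6 3 5 9 2 10) is not a heap because 9 sits below 3, while `{0, 10, 9, 6, 3, 2, 5}` is.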
heapSort • 2 phases – done sequentially • reorganize the array into a heap • takes O(N log2 N) • do N-1 passes • each pass moves the largest/smallest item to where it belongs • selection of largest item takes O(log2 N) work • takes O(N log2 N)
phase 1 of heapSort • work node by node, down the tree • leaf nodes are already heaps • for every non-leaf node, move it down a path of children until heap property is satisfied • possible because each subtree will already be a heap • step one can be done in N log2 N time
making a heap
A [6 3 5 9 2 10] (not a heap)
  [6 3 10 9 2 5]
  [6 9 10 3 2 5]
A [10 9 6 3 2 5] Now it is a heap
Heapsort Algorithm
1. Consider x as a complete binary tree, use heapify to convert this tree to a heap
2. for i = n down to 2:
   a. Interchange x[1] and x[i] (puts largest element at end)
   b. Apply percolate_down to convert the binary tree corresponding to the sublist x[1] .. x[i-1] back into a heap
Heapsort • Algorithm for converting a complete binary tree to a heap – called "heapify"
For r = n/2 down to 1:
    Apply percolate_down to the subtree in myArray[r], … myArray[n]
End for
• Puts largest element at root
Heapsort • Now swap element 1 (root of tree) with last element • This puts largest element in correct location • Use percolate down on remaining sublist • Converts from semi-heap to heap
Heapsort • Again swap root with rightmost leaf • Continue this process with shrinking sublist
Percolate Down Algorithm
Initialize: c = 2 * r   // r = root of subtree
While r <= n/2 do following
    If c < n and myArray[c] < myArray[c + 1]
        Increment c by 1
    If myArray[r] < myArray[c]
        Swap myArray[r] and myArray[c]
        r = c
        c = 2 * c
    else
        Terminate the loop
End while
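Putting percolate down, heapify, and the swap loop together gives a sketch like the one below. It assumes `int` keys stored 1-based in a `std::vector` with slot 0 unused, to match the `2*i` index formulas; the camel-case names are chosen for this example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Moves x[r] down a path of larger children until the subtree rooted at
// r (within x[1..n]) satisfies the max-heap property.
void percolateDown(std::vector<int>& x, std::size_t r, std::size_t n) {
    std::size_t c = 2 * r;                       // left child of r
    while (c <= n) {
        if (c < n && x[c] < x[c + 1]) ++c;       // pick the larger child
        if (x[r] < x[c]) {
            std::swap(x[r], x[c]);
            r = c;
            c = 2 * r;
        } else {
            break;                               // heap property holds
        }
    }
}

// Phase 1: turn x[1..n] into a heap, working from the last non-leaf node.
void heapify(std::vector<int>& x, std::size_t n) {
    for (std::size_t r = n / 2; r >= 1; --r) percolateDown(x, r, n);
}

// Phase 2: repeatedly move the largest item to the end of the sublist.
void heapSort(std::vector<int>& x) {
    if (x.size() < 3) return;                    // 0 or 1 real items
    std::size_t n = x.size() - 1;
    heapify(x, n);
    for (std::size_t i = n; i >= 2; --i) {
        std::swap(x[1], x[i]);
        percolateDown(x, 1, i - 1);
    }
}
```

Running `heapify` on 6 3 5 9 2 10 (with a padding slot in front) produces 10 9 6 3 2 5, and `heapSort` then leaves the items in ascending order.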
Priority Queue • Implementation possibilities • As a list (array, vector, linked list) • As an ordered list • Best is to use a heap • Basic operations have O(log2 n) time • STL priority queue adapter uses heap • Note operations in table of Fig. 13.2 in text, page 766
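A brief sketch of the STL adapter in use: `std::priority_queue<int>` keeps the largest item on top by default. The helper name `drainInPriorityOrder` is made up for this example.

```cpp
#include <queue>
#include <vector>

// Pushes every item into a priority queue, then removes them one by one.
// Items come out largest first.
std::vector<int> drainInPriorityOrder(const std::vector<int>& items) {
    std::priority_queue<int> pq;        // max-heap by default
    for (int v : items) pq.push(v);     // O(log2 n) per push
    std::vector<int> out;
    while (!pq.empty()) {
        out.push_back(pq.top());        // largest remaining item
        pq.pop();                       // O(log2 n) per pop
    }
    return out;
}
```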
Radix Sort • Based on examining digits in some base-b numeric representation of items (or keys) • Least significant digit radix sort • Processes digits from right to left • Used in early punched-card sorting machines • Create groupings of items with same value in specified digit • Collect in order and create grouping with next significant digit
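A least-significant-digit radix sort can be sketched as below. It assumes non-negative keys stored as `unsigned` and groups them into one bucket per digit value; the name `radixSort` and the default base of 10 are choices made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// LSD radix sort: for each digit position, from right to left, group the
// items by that digit and collect the groups in order. No key comparisons
// are made between items.
void radixSort(std::vector<unsigned>& a, unsigned base = 10) {
    if (a.empty()) return;
    unsigned maxVal = *std::max_element(a.begin(), a.end());
    for (unsigned long long place = 1; maxVal / place > 0; place *= base) {
        std::vector<std::vector<unsigned>> buckets(base);
        for (unsigned v : a) {
            buckets[(v / place) % base].push_back(v);   // group by digit
        }
        std::size_t k = 0;
        for (const auto& bucket : buckets) {            // collect in order
            for (unsigned v : bucket) a[k++] = v;
        }
    }
}
```

Each grouping step keeps items with the same digit in their previous relative order, which is what lets the result of earlier (less significant) passes survive the later ones.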