Sorting • Problem: Given a sequence of elements, find a permutation such that the resulting sequence is sorted in some order. • We have already seen: • Insertion sort • average/worst-case running time O(n^2) • best-case running time O(n) • good for partially sorted or small arrays • Heap sort • average/worst-case running time O(n lg n)
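As a reminder of why insertion sort is linear on input that is already sorted, here is a minimal sketch in C. The function name `insertion_sort` and the `size_t` length parameter are illustrative choices, not taken from the slides.

```c
#include <stddef.h>

/* Sorts a[0..n-1] in ascending order by growing a sorted prefix. */
void insertion_sort(int a[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        /* Shift larger elements of the sorted prefix one slot to the right.
         * On sorted input this loop exits at once, giving O(n) overall. */
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}
```

The inner loop runs once per inversion in the input, which is why the method works well on partially sorted arrays.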
MergeSort • A divide & conquer algorithm • Main idea: • break the list in two • sort each half recursively • base case: a single-element array is already sorted • merge the two halves
Sorting: MergeSort
void mergesort(int array[], int low, int high)
{
    int mid;
    if (low < high) {
        mid = low + (high - low) / 2;   /* same value as (low+high)/2, but cannot overflow */
        mergesort(array, low, mid);
        mergesort(array, mid + 1, high);
        merge(array, low, mid, mid + 1, high);
    }
}
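The slide calls `merge` without showing it. One possible sketch follows, assuming the five arguments mean (array, start of left run, end of left run, start of right run, end of right run) with inclusive bounds. That reading of the signature is an assumption, as is the use of a heap-allocated temporary array.

```c
#include <stdlib.h>
#include <string.h>

/* Merges the sorted runs array[lo1..hi1] and array[lo2..hi2] (inclusive,
 * adjacent: lo2 == hi1 + 1) so that array[lo1..hi2] ends up sorted. */
void merge(int array[], int lo1, int hi1, int lo2, int hi2)
{
    int n = hi2 - lo1 + 1;
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return;                 /* sketch only: real code should report the failure */

    int i = lo1, j = lo2, k = 0;
    while (i <= hi1 && j <= hi2) {
        /* On ties take from the left run, which keeps the merge stable. */
        if (array[j] < array[i])
            tmp[k++] = array[j++];
        else
            tmp[k++] = array[i++];
    }
    while (i <= hi1) tmp[k++] = array[i++];   /* leftovers of the left run */
    while (j <= hi2) tmp[k++] = array[j++];   /* leftovers of the right run */

    memcpy(array + lo1, tmp, (size_t)n * sizeof *tmp);
    free(tmp);
}
```

Each call touches every element of the two runs once, which is the "time to merge a total of n items" term in the running-time analysis.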
Sorting: MergeSort • Running time: • Time to mergesort n items = twice the time to mergesort n/2 items + the time to merge a total of n items • T(n) = 2T(n/2) + n = ... = O(n lg n)
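The steps hidden behind the "..." can be filled in by unrolling the recurrence. The derivation below assumes that n is a power of two and that T(1) is a constant; neither assumption is stated on the slide.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + n \\
     &= 4\,T(n/4) + 2n \\
     &= 8\,T(n/8) + 3n \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n/2^{k}) + k\,n .
\end{align*}
% The unrolling stops when n/2^k = 1, i.e. k = \lg n:
\[
T(n) = n\,T(1) + n \lg n = O(n \lg n).
\]
```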
Quicksort • A divide & conquer algorithm • Basic idea: • Pick an element (called the pivot) • Partition the array into two subsequences: elements smaller than the pivot and elements larger than or equal to it • Sort each subsequence recursively
Quicksort • Partitioning an array • Goal: move all elements smaller than or equal to the pivot to the left of the array and all elements larger than or equal to the pivot to the right of the array. In the end, the right part of the array will contain elements that are larger than or equal to those in the left part.
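The goal stated above matches a Hoare-style partition, in which elements equal to the pivot may land on either side. The sketch below is one way to write it in C; choosing the middle element as the pivot and the names `partition` and `quicksort` are assumptions made for illustration, not the slides' own code.

```c
/* Rearranges a[low..high] and returns an index j (low <= j < high) such that
 * every element of a[low..j] is <= every element of a[j+1..high]. */
int partition(int a[], int low, int high)
{
    int pivot = a[low + (high - low) / 2];
    int i = low - 1;
    int j = high + 1;
    for (;;) {
        do { i++; } while (a[i] < pivot);   /* find an element that belongs on the right */
        do { j--; } while (a[j] > pivot);   /* find an element that belongs on the left */
        if (i >= j)
            return j;
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}

/* Sorts a[low..high] (inclusive bounds) in ascending order. */
void quicksort(int a[], int low, int high)
{
    if (low < high) {
        int p = partition(a, low, high);
        quicksort(a, low, p);        /* the pivot is not fixed in place, so index p stays in the left part */
        quicksort(a, p + 1, high);
    }
}
```

Note that with this scheme the left recursive call covers `low..p`, not `low..p-1`, because the returned index is only a boundary between the two parts and not the pivot's final position.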
Quicksort • Running time: depends on selection of pivot • Best case (array partitioned in half every time) • time to quicksort n elements = time to partition array + twice the time to quicksort n/2 elements • T(n) = 2T(n/2) + n = ... = O(n lg n)
Quicksort • Running time: depends on selection of pivot • Average case • Still O(n lg n) but with a slightly larger constant • Worst case: • Partitioning n elements results in two subsequences of lengths 1 and n-1 • T(n) = T(n-1) + n = ... = O(n^2)
Quicksort • Selecting the pivot • The first element • The middle element • The median of three elements • Random element • Idea: Randomized quicksort • Before sorting the array, randomly permute the elements to make the running time independent of the input ordering • The permutation takes O(n) extra time, but the expected running time of randomized quicksort is then O(n lg n) for every input; the O(n^2) worst case remains possible but is very unlikely
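The random permutation step can be sketched with a Fisher-Yates shuffle. The function name `shuffle` is hypothetical, and the use of `rand()` is a simplification: its modulo bias and limited range are acceptable for illustration but not for serious use.

```c
#include <stdlib.h>

/* Fisher-Yates shuffle: randomly permutes a[0..n-1] in O(n) time.
 * The caller is expected to seed the generator once, e.g. with srand(). */
void shuffle(int a[], int n)
{
    for (int i = n - 1; i > 0; i--) {
        int j = rand() % (i + 1);          /* random index in 0..i */
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

Calling this once on the whole array before quicksort is the "randomly permute first" variant from the slide; picking a random pivot inside each partition step is the other common way to get the same effect.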
Related problem: Selection • Problem: • Given an unsorted sequence of elements, find the k-th smallest • Idea: • Use a procedure similar to quicksort, but sort a subsequence only if you need to.
Related problem: Selection • Algorithm: Quickselect(A, low, high, k) • select the pivot • partition the array into two subsequences around the pivot. Let i be the index where the pivot ends up, between the two pieces. • if i == k, return the pivot • if i > k, return Quickselect(A, low, i-1, k) • if i < k, return Quickselect(A, i+1, high, k)
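The algorithm above translates fairly directly into C. This sketch assumes that k is a 0-based index into the whole array (so k = 0 asks for the smallest element) and uses a Lomuto-style partition with the last element as the pivot so that the pivot lands at a known index i. Both choices are assumptions; the slide does not fix them.

```c
/* Lomuto partition: uses a[high] as the pivot, moves smaller elements to
 * its left, and returns the pivot's final index. */
int partition(int a[], int low, int high)
{
    int pivot = a[high];
    int i = low;
    for (int j = low; j < high; j++) {
        if (a[j] < pivot) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
            i++;
        }
    }
    int t = a[i]; a[i] = a[high]; a[high] = t;
    return i;
}

/* Returns the element that would be at index k if a[low..high] were sorted.
 * Requires low <= k <= high. Reorders the array as a side effect. */
int quickselect(int a[], int low, int high, int k)
{
    if (low >= high)
        return a[low];                 /* one element left: it must be the answer */
    int i = partition(a, low, high);
    if (i == k)
        return a[i];
    if (i > k)
        return quickselect(a, low, i - 1, k);
    return quickselect(a, i + 1, high, k);
}
```

Only one side of each partition is ever visited, which is what separates selection from a full sort.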
Sorting: Lower bounds • Comparison sort: A family of algorithms that use comparisons to determine the sorted order of a collection • Examples: Mergesort, quicksort, insertion sort • Decision tree = a tree that represents the comparisons performed by a sorting algorithm on input of given size
Sorting: Lower bounds • Decision tree for 3 elements (each comparison branches on ≤ or >):
a ? b
  ≤: b ? c
       ≤: (a, b, c)
       >: a ? c
            ≤: (a, c, b)
            >: (c, a, b)
  >: a ? c
       ≤: (b, a, c)
       >: b ? c
            ≤: (b, c, a)
            >: (c, b, a)
Sorting: Lower bounds • Each internal node contains a comparison • Each leaf contains a permutation • Algorithm execution = a path from the root to a leaf • Worst-case number of comparisons = height of decision tree • Idea: If we find a lower bound on the height of the decision tree, we’ll have a lower bound on the running time of any comparison algorithm
Sorting: Lower bounds • A decision tree that sorts n elements must have at least n! leaves, one per permutation, so its height is at least lg(n!), which is Ω(n lg n) • This means that algorithms such as heapsort and mergesort, whose running times are O(n lg n), are asymptotically optimal among comparison sorts. • However, in some cases, if we have additional information about the input, we may be able to sort in linear time.
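The bound on the height can be derived in a few lines. The sketch below writes h for the height of the decision tree and uses only the fact that a binary tree of height h has at most 2^h leaves.

```latex
% Every one of the n! permutations must appear as a leaf:
\[
n! \le 2^{h} \quad\Longrightarrow\quad h \ge \lg(n!) .
\]
% Keep only the largest n/2 terms of the sum and bound each from below:
\[
\lg(n!) = \sum_{i=1}^{n} \lg i
        \;\ge\; \sum_{i=\lceil n/2 \rceil}^{n} \lg i
        \;\ge\; \frac{n}{2} \lg \frac{n}{2}
        \;=\; \Omega(n \lg n).
\]
```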
Extra! Extra! Counting sort • Restriction: each of the n elements is an integer in the range 1 through k • Idea: • For each element x, count the number of elements less than x. • Then, we will be able to place x directly into its correct position in the sorted array. • The algorithm uses two temporary arrays: • array B of size n to hold the sorted output • array C of size k for temporary storage
Extra! Extra! Counting sort • Running time: O(n+k)
// A: input array of size n, values in range 1..k
for i = 1 to k
    C[i] = 0
for i = 1 to n
    C[A[i]] = C[A[i]] + 1
for i = 2 to k
    C[i] = C[i] + C[i-1]
for i = n downto 1
    B[C[A[i]]] = A[i]
    C[A[i]] = C[A[i]] - 1
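The pseudocode uses 1-based arrays. A possible C version with 0-based arrays is sketched below; the function name `counting_sort`, the return code, and the choice to leave `c[0]` unused are assumptions made to keep the code close to the pseudocode.

```c
#include <stdlib.h>

/* Sorts a[0..n-1], whose values must lie in 1..k, into b[0..n-1].
 * Returns 0 on success and -1 if the count array cannot be allocated. */
int counting_sort(const int a[], int b[], int n, int k)
{
    int *c = calloc((size_t)k + 1, sizeof *c);   /* c[1..k] start at 0; c[0] is unused */
    if (c == NULL)
        return -1;

    for (int i = 0; i < n; i++)
        c[a[i]]++;                    /* c[v] = number of elements equal to v */
    for (int v = 2; v <= k; v++)
        c[v] += c[v - 1];             /* c[v] = number of elements <= v */
    for (int i = n - 1; i >= 0; i--) {
        b[c[a[i]] - 1] = a[i];        /* the -1 turns a 1-based rank into a 0-based index */
        c[a[i]]--;
    }

    free(c);
    return 0;
}
```

The first and third loops run n times and the second runs k times, which gives the O(n+k) total.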
Stable sorting algorithms • A sorting algorithm is stable if it does not change the relative order of items with the same value. • This is important when satellite data is carried around with the element being sorted. • Example: ATM transactions where the key is the account number. Initially the transactions are ordered by time, but if we sort them by account number, the relative order of each account's transactions must stay the same. • Counting sort is stable (but would it still be so if the last loop were increasing instead of decreasing?) • Quicksort is not stable
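To make the ATM example concrete, here is a sketch of counting sort applied to records that carry satellite data. The `struct txn` type, its fields, and the function name `sort_by_account` are all hypothetical; account numbers are assumed to lie in 1..k.

```c
#include <stdlib.h>

/* Hypothetical transaction record: the key plus satellite data. */
struct txn {
    int account;   /* sort key, assumed to be in 1..k */
    int time;      /* satellite data; the input is ordered by this field */
};

/* Stable counting sort of in[0..n-1] by account number into out[0..n-1].
 * Returns 0 on success and -1 if allocation fails. */
int sort_by_account(const struct txn in[], struct txn out[], int n, int k)
{
    int *c = calloc((size_t)k + 1, sizeof *c);
    if (c == NULL)
        return -1;

    for (int i = 0; i < n; i++)
        c[in[i].account]++;
    for (int v = 2; v <= k; v++)
        c[v] += c[v - 1];
    /* Walking the input backwards puts the last record of each account in
     * the last slot reserved for that account, so time order is preserved. */
    for (int i = n - 1; i >= 0; i--)
        out[--c[in[i].account]] = in[i];

    free(c);
    return 0;
}
```

If the final loop ran forwards with the same decrementing counters, the first record of each account would take that account's last slot, so records with equal keys would come out in reverse order and the sort would no longer be stable.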
Space requirements • Mergesort needs additional space of Θ(n) for the temporary array. • Counting sort needs additional space of Θ(n+k) for the temporary arrays. • All other algorithms that we saw sort the arrays in-place and need only constant additional space for temporary variables (not counting the recursion stack that quicksort uses).