Problem • From a given set of n distinct integers in the range 0 to n (so exactly one is missing), find the missing integer using O(n) queries of the form: “what is bit[j] in A[i]?” • Note - the input contains n·log n bits in total, so we are not allowed to read all of it!
Solution • Ask all n integers what their last bit is, and check whether 0 or 1 occurs less often than it should among the numbers 0 to n. That is the last bit of the missing integer! • How can we determine the second-to-last bit?
Solution • Ask only the (roughly n/2) numbers whose last bit matches the missing integer’s last bit, and compare their second-to-last bits against the pattern of the numbers from 0 to n that end with this bit. • By recursing on the remaining candidates, we get the answer in T(n) = T(n/2) + n = O(n), by the Master Theorem
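This scheme can be sketched in Java. The class name and the `bit` helper, which simulates the “what is bit[j] in A[i]?” oracle, are my own; the algorithm is the halving recursion described above.

```java
import java.util.ArrayList;
import java.util.List;

public class MissingInteger {
    // Simulates the only allowed query: "what is bit j of A[i]?"
    static int bit(int[] a, int i, int j) {
        return (a[i] >> j) & 1;
    }

    // a holds n distinct integers from {0, ..., n} with one missing.
    // Total queries: n + n/2 + n/4 + ... = O(n).
    public static int findMissing(int[] a) {
        List<Integer> cand = new ArrayList<>();
        for (int i = 0; i < a.length; i++) cand.add(i);
        int missing = 0;
        for (int j = 0; !cand.isEmpty(); j++) {
            List<Integer> zeros = new ArrayList<>();
            List<Integer> ones = new ArrayList<>();
            for (int i : cand) {
                if (bit(a, i, j) == 0) zeros.add(i); else ones.add(i);
            }
            // Among the full candidate set (one element larger than cand,
            // since it also contains the missing number), bit j alternates
            // 0,1,0,1,..., so it should contain ceil(size/2) zeros.
            int size = cand.size() + 1;
            if (zeros.size() < (size + 1) / 2) {
                cand = zeros;          // a number with bit j = 0 is missing
            } else {
                missing |= 1 << j;     // a number with bit j = 1 is missing
                cand = ones;
            }
        }
        return missing;
    }
}
```

Each round queries one bit of every remaining candidate, and the candidate set roughly halves, which gives the O(n) query bound.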
Why is sorting so important? • Most of the interesting concepts in the course can be taught in the context of sorting, such as: • Divide and conquer • Randomized algorithms • Lower bounds
Why is sorting so important? • One of the reasons sorting is so important is that once items are sorted, many other problems become simple to solve.
Searching • Binary search runs on a sorted set in O(logn) time • Searching for an element in an unsorted set takes linear time • This is probably the most important application of sorting
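A minimal sketch of sort-then-search using the Java standard library (the class and method names are my own):

```java
import java.util.Arrays;

public class SearchDemo {
    // Sort once in O(n log n); afterwards every membership
    // query costs only O(log n) via binary search.
    public static boolean contains(int[] a, int key) {
        int[] sorted = a.clone();
        Arrays.sort(sorted);
        return Arrays.binarySearch(sorted, key) >= 0;
    }
}
```

`Arrays.binarySearch` returns a non-negative index when the key is present, and a negative insertion-point code otherwise.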
Element Uniqueness • Given a set of numbers we want to check if all numbers are unique. • Sort the elements and linearly scan all adjacent pairs.
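A sketch of the sort-and-scan test (names are my own):

```java
import java.util.Arrays;

public class Uniqueness {
    // After sorting, any duplicate must occupy adjacent positions,
    // so one linear scan over adjacent pairs decides uniqueness.
    public static boolean allUnique(int[] a) {
        int[] s = a.clone();
        Arrays.sort(s);
        for (int i = 1; i < s.length; i++) {
            if (s[i] == s[i - 1]) return false;
        }
        return true;
    }
}
```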
Closest pairs • Given n numbers, find the pair which are closest to each other. • After sorting the elements, the closest pairs will be next to each other, so a linear scan will do. • Related problems….
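For numbers on a line, the sort-and-scan idea can be sketched as follows (names are my own):

```java
import java.util.Arrays;

public class ClosestPair {
    // After sorting, the closest pair is adjacent, so a single
    // linear scan over adjacent gaps finds the minimum distance.
    public static int minGap(int[] a) {
        int[] s = a.clone();
        Arrays.sort(s);
        int best = Integer.MAX_VALUE;
        for (int i = 1; i < s.length; i++) {
            best = Math.min(best, s[i] - s[i - 1]);
        }
        return best;
    }
}
```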
Frequency distribution • Which element appears the largest number of times in a set. • After sorting, a linear scan will do.
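A sketch of the run-counting scan (names are my own):

```java
import java.util.Arrays;

public class Mode {
    // After sorting, equal elements are contiguous, so counting the
    // length of each run in one scan finds the most frequent element.
    public static int mostFrequent(int[] a) {
        int[] s = a.clone();
        Arrays.sort(s);
        int best = s[0], bestCount = 0, run = 0;
        for (int i = 0; i < s.length; i++) {
            run = (i > 0 && s[i] == s[i - 1]) ? run + 1 : 1;
            if (run > bestCount) {
                bestCount = run;
                best = s[i];
            }
        }
        return best;
    }
}
```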
Median and Order statistics • What is the median of a set of numbers? What is the k-th smallest element? • After sorting the elements in ascending order, the k-th smallest element sits at index k (index k-1 in a zero-based array) and can be found in constant time
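A sketch of both queries (names are my own; note the zero-based indexing):

```java
import java.util.Arrays;

public class OrderStatistics {
    // k-th smallest element for k = 1..n: sort ascending,
    // then the answer sits at zero-based index k - 1.
    public static int kthSmallest(int[] a, int k) {
        int[] s = a.clone();
        Arrays.sort(s);
        return s[k - 1];
    }

    // Median as an order statistic (lower median for even n).
    public static int median(int[] a) {
        return kthSmallest(a, (a.length + 1) / 2);
    }
}
```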
Convex hulls • Given n points in two dimensions, find the smallest area polygon which contains them all.
Huffman Codes • If you are trying to minimize the size of a text file, you want to assign codes of different lengths to different characters, according to how frequently each character appears in the text.
Quicksort • Although mergesort runs in O(nlogn) time, it is not convenient to use with arrays, since it requires extra space. • In practice, Quicksort is often the fastest sorting algorithm, and it uses partitioning as its main idea.
Quicksort • Partitioning places all the elements less than the pivot in the left part of the array, and all elements greater than the pivot in the right part of the array. • The pivot fits in the slot between them.
Partition • Example – use 10 as a pivot • Note that the pivot element ends up in the correct place in the total order!
Partition • First we must select a pivot element • Once we have selected a pivot element, we can partition the array in one linear scan, by maintaining three sections of the array: • All elements smaller than the pivot • All elements greater than the pivot • All unexplored elements
Example: pivot element is 10 (left of | : smaller; between | | : unexplored; right of | : greater; the pivot is kept at the far end until the final swap)

| 17 12 6 19 23 8 5 | 10
| 5 12 6 19 23 8 | 17
5 | 12 6 19 23 8 | 17
5 | 8 6 19 23 | 12 17
5 8 | 6 19 23 | 12 17
5 8 6 | 19 23 | 12 17
5 8 6 | 19 | 23 12 17
5 8 6 || 19 23 12 17
5 8 6 10 23 12 17 19
Quicksort • Partition does at most n swaps and takes linear time. • The pivot element ends up in the position it occupies in the final sorted order. • After partitioning, no element ever needs to cross to the other side of the pivot. • Thus we can sort the elements to the left of the pivot and to the right of the pivot independently, and recursively!
QuickSort

QuickSort(A, p, r)
  if (p < r) then
    q ← Partition(A, p, r)
    QuickSort(A, p, q)
    QuickSort(A, q+1, r)

Initial call: QuickSort(A, 1, length[A])
QuickSort

public void sort(Comparable[] values) {
    sort(values, 0, values.length - 1);
}

private void sort(Comparable[] values, int from, int to) {
    if (from < to) {
        int pivot = partition(values, from, to);
        sort(values, from, pivot);
        sort(values, pivot + 1, to);
    }
}
private int partition(Comparable[] values, int from, int to) {
    Comparable pivot = values[from];
    int j = to + 1;
    int i = from - 1;
    while (true) {
        // Note the strict comparisons: each scan stops on elements equal
        // to the pivot, which keeps both indices inside the array bounds.
        do { j--; } while (values[j].compareTo(pivot) > 0);
        do { i++; } while (values[i].compareTo(pivot) < 0);
        if (i < j) {
            Comparable temp = values[i];
            values[i] = values[j];
            values[j] = temp;
        } else {
            return j;
        }
    }
}
Partition • The partition method returns the index separating the array, but it also has a side effect: it rearranges the elements of the array around the pivot
Partition – version 2

public int partition(int[] values, int from, int to) {
    int pivot = values[from];
    int leftWall = from;
    for (int i = from + 1; i <= to; i++) {
        if (values[i] < pivot) {
            leftWall++;
            int temp = values[i];
            values[i] = values[leftWall];
            values[leftWall] = temp;
        }
    }
    int temp = values[from];
    values[from] = values[leftWall];
    values[leftWall] = temp;
    return leftWall;
}

Note: unlike version 1, this version places the pivot at its final index and returns that index, so the matching recursive calls are sort(values, from, pivot - 1) and sort(values, pivot + 1, to).
Partition(A[], left, right)
 1. pivot ← left
 2. temp ← right
 3. while temp ≠ pivot
 4.   if A[min(pivot, temp)] > A[max(pivot, temp)]
 5.     swap(A[pivot], A[temp])
 6.     swap(pivot, temp)
 7.   if temp > pivot
 8.     temp--
 9.   else
10.    temp++
11. return pivot
Time Analysis • The running time of quicksort depends on how evenly partition divides the array. • The chosen pivot element determines how evenly partition divides the array. • If partition yields equal-size subarrays, quicksort can be as good as merge sort • If partition fails to divide the array evenly, quicksort may be asymptotically as bad as insertion sort.
Best case • Since each element ultimately ends up in the correct position, the algorithm correctly sorts. But how long does it take? • The best case for divide-and-conquer algorithms comes when we split the input as evenly as possible. Thus in the best case, each subproblem is of size n/2.
Best case • The partition step on each subproblem is linear in its size, so the total partitioning effort on each level of the recursion is O(n). • It takes log(n) levels of perfect partitions to reach subproblems of a single element, so the total effort is O(nlogn)
Worst case • If the pivot is the biggest or smallest element in the array, the subproblems have sizes 0 and n-1, so instead of log(n) levels of recursion we end up with O(n) levels, and a total sorting time of O(n^2)
Worst case • The worst case input for quick sort depends on the way we choose the pivot element. • If we choose the first or last element as the pivot, the worst case is when the elements are already sorted!!
Worst case • Having the worst case occur on a sorted array is bad, since sorted input is an expected case in many applications. (Insertion sort deals with sorted arrays in linear time.) • To eliminate this problem, pick a better pivot: • Use a random element of the array as the pivot. • Take the median of three elements (first, last, middle) as the pivot. • A worst case still exists, but since it no longer corresponds to a natural ordering, it is much less likely to occur.
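Median-of-three pivot selection can be sketched as follows (the class and method names are my own):

```java
public class PivotChoice {
    // Returns the index of the median of a[lo], a[mid], a[hi].
    // Using this as the pivot guards against already-sorted input
    // hitting the worst case, since the median of three can never be
    // the overall minimum or maximum of the three probes.
    public static int medianOfThreeIndex(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        int x = a[lo], y = a[mid], z = a[hi];
        if ((x <= y && y <= z) || (z <= y && y <= x)) return mid;
        if ((y <= x && x <= z) || (z <= x && x <= y)) return lo;
        return hi;
    }
}
```

On a sorted array the middle element is chosen, which is a perfect split.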
Randomization • Quicksort is good on average, but bad on certain worst-case instances. • Suppose you pick the pivot element at random. Then no adversary can construct a worst-case input, because every input has the same probability of producing good pivots! • By either picking a random pivot or scrambling the permutation before sorting it, we can say: ``With high probability, randomized quicksort runs in O(nlogn) time.''
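A randomized-pivot quicksort can be sketched by combining a random swap with a leftWall-style partition like version 2 above (the class name is my own):

```java
import java.util.Random;

public class RandomizedQuicksort {
    private static final Random RAND = new Random();

    public static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int from, int to) {
        if (from >= to) return;
        // Move a uniformly random element to the front to serve as the
        // pivot, so no fixed input is a worst-case instance.
        swap(a, from, from + RAND.nextInt(to - from + 1));
        int p = partition(a, from, to);
        sort(a, from, p - 1);   // pivot at index p is already in place
        sort(a, p + 1, to);
    }

    private static int partition(int[] a, int from, int to) {
        int pivot = a[from], wall = from;
        for (int i = from + 1; i <= to; i++) {
            if (a[i] < pivot) swap(a, ++wall, i);
        }
        swap(a, from, wall);    // place pivot at its final position
        return wall;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```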
Time Analysis • Worst case: T(n) = T(n-1) + O(n) = O(n^2) • Best case: T(n) = 2T(n/2) + O(n) = O(nlogn) • Average case: O(nlogn), since a random pivot produces a reasonably balanced split with constant probability