
Median/Order Statistics Algorithms


dakota

Presentation Transcript


  1. Median/Order Statistics Algorithms • Minimum and Maximum • Selection in expected linear time • Selection in worst-case linear time

  2. Minimum and Maximum • How many comparisons are sufficient to find minimum/maximum? • How many comparisons are sufficient to find both minimum AND maximum? • Show n + log n - 2 comparisons are sufficient to find second minimum (and minimum)
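As a rough illustration of the questions on this slide, the Python sketch below counts comparisons explicitly. `min_and_max` handles items in pairs, which takes about 3n/2 comparisons. `second_smallest` runs a knockout tournament and then searches only the items that lost directly to the winner. The function names and the returned comparison counters are assumptions made for this example and do not come from the slides.

```python
def min_and_max(items):
    """Return (minimum, maximum, comparisons), processing items in pairs."""
    n = len(items)
    if n == 0:
        raise ValueError("need at least one item")
    comparisons = 0
    if n % 2 == 1:
        lo = hi = items[0]          # odd length: first item seeds both
        start = 1
    else:
        comparisons += 1            # even length: first pair seeds both
        lo, hi = (items[0], items[1]) if items[0] <= items[1] else (items[1], items[0])
        start = 2
    for i in range(start, n, 2):
        a, b = items[i], items[i + 1]
        comparisons += 3            # a vs b, smaller vs lo, larger vs hi
        small, large = (a, b) if a <= b else (b, a)
        if small < lo:
            lo = small
        if large > hi:
            hi = large
    return lo, hi, comparisons


def second_smallest(items):
    """Return (minimum, second minimum, comparisons) via a knockout tournament."""
    if len(items) < 2:
        raise ValueError("need at least two items")
    comparisons = 0
    # Each entrant carries the list of values it has beaten so far.
    current = [(x, []) for x in items]
    while len(current) > 1:
        following = []
        for i in range(0, len(current) - 1, 2):
            (a, a_beaten), (b, b_beaten) = current[i], current[i + 1]
            comparisons += 1
            if a <= b:
                a_beaten.append(b)
                following.append((a, a_beaten))
            else:
                b_beaten.append(a)
                following.append((b, b_beaten))
        if len(current) % 2 == 1:
            following.append(current[-1])   # unpaired entrant gets a bye
        current = following
    champion, beaten = current[0]
    # The second minimum must have lost directly to the champion.
    second = beaten[0]
    for x in beaten[1:]:
        comparisons += 1
        if x < second:
            second = x
    return champion, second, comparisons
```

The tournament uses n - 1 comparisons to find the minimum, and the winner has beaten at most ⌈log n⌉ items, so finding the smallest of those adds at most ⌈log n⌉ - 1 more.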

  3. Selection (Median) Problem • How quickly can we find the median (or in general the kth smallest element) of an unsorted list of numbers? • Two approaches • Quicksort partition algorithm: expected Θ(n) time but Ω(n²) time in the worst case • Deterministic: Θ(n) time in the worst case

  4. Quicksort Approach • int Select(int A[], k, low, high) • Choose a pivot item • Determine rank of pivot element in current partition • Compare all items to this pivot element • If pivot is kth item, return pivot • Else update low and high and recurse on partition that contains kth item
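The steps on this slide can be sketched in Python as follows. This is a minimal version under stated assumptions: the pivot is chosen uniformly at random, `k` is a 1-indexed rank (kth smallest), and the recursion is written as a loop that narrows `low` and `high`. The name `quickselect` and the `rng` parameter are hypothetical and chosen for this example only.

```python
import random


def quickselect(a, k, rng=random):
    """Return the k-th smallest (1-indexed) item of list a; a is reordered in place."""
    if not 1 <= k <= len(a):
        raise ValueError("k must be between 1 and len(a)")
    low, high = 0, len(a) - 1
    while True:
        # Choose a pivot at random and move it to the end of the active range.
        p = rng.randint(low, high)
        a[p], a[high] = a[high], a[p]
        pivot = a[high]
        # Compare every other item in the range to the pivot (Lomuto partition).
        store = low
        for i in range(low, high):
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[high] = a[high], a[store]
        rank = store + 1            # 1-indexed rank of the pivot in the whole list
        if rank == k:
            return pivot
        if k < rank:
            high = store - 1        # k-th item lies in the left partition
        else:
            low = store + 1         # k-th item lies in the right partition
```

For example, `quickselect([17, 12, 6, 23, 19, 8, 5, 10], 5)` returns 12, the 5th smallest item.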

  5. Example, k=5
       array                   low  high  rank
       17 12 6 23 19 8 5 10     1    8
       6 8 5 10 17 12 23 19     5    8     4
       17 12 19 23              5    6     7
       12 17                               found: 5

  6. Probabilistic Analysis • Assume each of n! permutations is equally likely • Modify earlier indicator variable analysis of quicksort to handle this k-selection problem • What is probability ith smallest item is compared to jth smallest item? • If k is contained in (i..j)? • If k ≤ i? • If k ≥ j?

  7. Cases where (i..j) do not contain k • Case k ≥ j: $\sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \frac{2}{k-i+1} = \sum_{i=1}^{k-1} \frac{2(k-i)}{k-i+1} = \sum_{i=1}^{k-1} \frac{2i}{i+1}$ [replace k-i with i] $= 2 \sum_{i=1}^{k-1} \frac{i}{i+1} \le 2(k-1)$ • Case k ≤ i: $\sum_{j=k+1}^{n} \sum_{i=k}^{j-1} \frac{2}{j-k+1} = \sum_{j=k+1}^{n} \frac{2(j-k)}{j-k+1} = \sum_{j=1}^{n-k} \frac{2j}{j+1}$ [replace j-k with j and change bounds] $= 2 \sum_{j=1}^{n-k} \frac{j}{j+1} \le 2(n-k)$ • Total for both cases is ≤ 2(k-1) + 2(n-k) = 2n-2

  8. Case where (i..j) contains k • At most 1 interval of size 3 contains k • i=k-1, j=k+1 • At most 2 intervals of size 4 contain k • i=k-1, j=k+2 and i=k-2, j=k+1 • In general, at most q-2 intervals of size q contain k • Thus we get $\sum_{q=3}^{n} (q-2) \cdot \frac{2}{q} \le \sum_{q=3}^{n} 2 = 2(n-2)$ • Summing together all cases we see the expected number of comparisons is less than 4n
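The bound can be checked numerically by summing the per-pair probabilities directly. The sketch below assumes, as the three cases on these slides do, that the ith and jth smallest items are compared with probability 2/(max(j,k) - min(i,k) + 1). The function name `expected_comparisons` is hypothetical.

```python
def expected_comparisons(n, k):
    """Sum, over all pairs i < j, the probability that ranks i and j are compared
    when selecting the k-th smallest of n items."""
    total = 0.0
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            # Covers all three cases: k >= j, k <= i, and i < k < j.
            span = max(j, k) - min(i, k) + 1
            total += 2.0 / span
    return total


if __name__ == "__main__":
    for n in (10, 100, 1000):
        k = (n + 1) // 2
        print(n, k, expected_comparisons(n, k), 4 * n)
```

Every value printed in the third column should stay below the 4n shown in the fourth column, matching the bound derived above.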

  9. Best case, Worst-case • Best case running time? • What happens in the worst-case? • Pivot element chosen is always what? • This leads to comparing all possible pairs • This leads to Θ(n²) comparisons

  10. Deterministic O(n) approach • Need to guarantee a good pivot element while doing O(n) work to find the pivot element • int Select(int A[], k, low, high) • Choosing pivot element • Divide into groups of 5 • For each group of 5, find that group’s median • Use median of the medians as pivot element • Determine rank of pivot element • Compare some remaining items directly to median • Update low and high and recurse on partition that contains kth item (or return kth item if it is pivot)
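One possible rendering of this outline in Python is sketched below. It is a simplified version under stated assumptions: it copies the input, sorts each group of 5 to find its median, and compares every item to the pivot instead of skipping the items whose position is already known. The name `select` and the 1-indexed rank `k` are choices made for this example.

```python
def select(a, k):
    """Return the k-th smallest (1-indexed) item of a, using median of medians."""
    if not 1 <= k <= len(a):
        raise ValueError("k must be between 1 and len(a)")
    items = list(a)
    while True:
        if len(items) <= 5:
            return sorted(items)[k - 1]          # small case: sort directly
        # Divide into groups of 5 and take each group's median.
        groups = [sorted(items[i:i + 5]) for i in range(0, len(items), 5)]
        medians = [g[(len(g) - 1) // 2] for g in groups]
        # Use the median of the medians as the pivot (recursive call on n/5 items).
        pivot = select(medians, (len(medians) + 1) // 2)
        # Determine the rank of the pivot by comparing items to it.
        smaller = [x for x in items if x < pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k <= len(smaller):
            items = smaller                      # k-th item is left of the pivot
        elif k <= len(smaller) + equal_count:
            return pivot                         # the pivot is the k-th item
        else:
            k -= len(smaller) + equal_count      # k-th item is right of the pivot
            items = [x for x in items if x > pivot]
```

For example, `select([17, 12, 6, 23, 19, 8, 5, 10], 5)` returns 12.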

  11. Guarantees on the pivot element • Median of medians is guaranteed to be smaller than all the red colored items • Why? • How many red items are there? • Likewise, median of medians is guaranteed to be larger than the blue colored items • Thus median of medians is in the range: • What elements do we need to compare to pivot to determine its rank? • How many of these are there?

  12. Analysis of number of comparisons • int Select(int A[], k, low, high) • Choosing pivot element • For each group of 5, find that group’s median: c1 n/5 comparisons (c1 for median of 5) • Find the median of the medians: recurse on problem of size n/5 • Compare remaining items directly to median: c2 n comparisons • Recurse on correct partition: recurse on problem of size at most 7n/10 • T(n) =

  13. Solving recurrence relation • T(n) = T(7n/10) + T(n/5) + O(n) • Key observation: 7/10 + 1/5 = 9/10 < 1 • Prove T(n) ≤ cn for some constant c by induction on n • T(n) ≤ 7cn/10 + cn/5 + dn • = 9cn/10 + dn • Need 9cn/10 + dn ≤ cn • Thus cn/10 ≥ dn ⇒ c ≥ 10d
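The induction result can be sanity-checked numerically. The sketch below evaluates the recurrence with several assumptions that are not in the slides: subproblem sizes are rounded down, the O(n) term is taken to be exactly `D * n` with `D = 1`, and T(0) = 0. It then compares T(n) against the bound cn with c = 10d.

```python
from functools import lru_cache

D = 1  # assumed constant in the O(n) term (d in the slide)


@lru_cache(maxsize=None)
def T(n):
    """Evaluate T(n) = T(7n/10) + T(n/5) + D*n with floors and T(0) = 0."""
    if n == 0:
        return 0
    return T(7 * n // 10) + T(n // 5) + D * n


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, T(n), 10 * D * n)   # T(n) should not exceed 10 * D * n
```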
