Medians and Order Statistics • i-th order statistic: i-th smallest element • n elements: the median is the i-th order statistic with • n odd: i=(n+1)/2 • n even: i=n/2 or i=n/2+1 • Assume distinct numbers. • Input: A, n, 1<=i<=n • Output: element x of A larger than exactly i-1 other elements of A.
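The definitions above can be checked against a small reference sketch in Python. It simply sorts the input, so it is only a baseline for testing the faster algorithms, not one of them; the names `order_statistic` and `median_ranks` are illustrative and do not come from the slides.

```python
def order_statistic(a, i):
    """Return the i-th smallest element of a (1-based i), by sorting a copy."""
    if not 1 <= i <= len(a):
        raise ValueError("i must satisfy 1 <= i <= n")
    return sorted(a)[i - 1]


def median_ranks(n):
    """Return the rank(s) of the median among n elements, as a tuple."""
    if n % 2 == 1:
        return ((n + 1) // 2,)      # n odd: a single median
    return (n // 2, n // 2 + 1)     # n even: lower and upper median
```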
Solutions • O(n log n) time based on … • O(n) time average. • O(n) time worst case.
Minimum and Maximum • How many comparisons? • n-1 are sufficient: • Examine each element and keep track of the smallest one. • n-1 are also necessary for a comparison-based algorithm: • Each element must be compared • Each must lose once (except the winner). • What about simultaneous min and max?
Min & Max • Can do with 2n-2 comparisons (find min and max independently). • Can do better: • Form pairs of elements • Compare the elements in each pair • For pair (ai, ai+1), assume ai < ai+1, then • Compare (min,ai), (ai+1,max) • 3 comparisons for each pair, about 3n/2 in total.
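The pairing idea can be sketched in Python as follows. The function name `min_max` and the explicit comparison counter are assumptions added for illustration; the counter is there only to make the "3 comparisons per pair" claim visible.

```python
def min_max(a):
    """Return (minimum, maximum, comparisons) for a non-empty sequence."""
    n = len(a)
    if n == 0:
        raise ValueError("empty input")
    comparisons = 0
    if n % 2 == 1:
        # Odd length: the first element starts as both min and max.
        lo = hi = a[0]
        start = 1
    else:
        # Even length: one comparison initializes min and max from the first pair.
        comparisons += 1
        lo, hi = (a[0], a[1]) if a[0] < a[1] else (a[1], a[0])
        start = 2
    for k in range(start, n, 2):
        x, y = a[k], a[k + 1]
        comparisons += 1            # compare the two elements of the pair
        if y < x:
            x, y = y, x
        comparisons += 2            # smaller vs. current min, larger vs. current max
        if x < lo:
            lo = x
        if y > hi:
            hi = y
    return lo, hi, comparisons
```

For even n this sketch uses 3n/2 - 2 comparisons, and for odd n it uses 3(n-1)/2, compared with 2n-2 for two independent scans.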
Average Time Median Selection • Divide-and-Conquer (prune-and-search). • Randomized: behavior determined by output of random number generator. • Based on QuickSort: • Partition input array recursively, but • Work only on one side!
Randomized Selection
QuickSort(A,p,r)
  If p < r then
    q = partition(A,p,r)
    QuickSort(A,p,q)
    QuickSort(A,q+1,r)
First call: QuickSort(A,1,n).
After partition(A,p,r): A[i] < A[j] for p ≤ i ≤ q < j ≤ r.
RandSelect(A,p,r,i)
  If p == r then return A[p]
  q = RandPartition(A,p,r)
  k = q-p+1   /* size of A[p..q] */
  If i ≤ k then return RandSelect(A,p,q,i)
  Else return RandSelect(A,q+1,r,i-k)
First call: RandSelect(A,1,n,i). Returns the i-th smallest element in A[p..r].
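A minimal Python sketch of RandSelect follows. It assumes a Hoare-style partition around a randomly chosen pivot, which is one way to obtain the split A[p..q] | A[q+1..r] used in the pseudocode; the slides do not spell out RandPartition, so that part is an assumption. Indices are 0-based here, while the rank i stays 1-based.

```python
import random


def rand_partition(a, p, r):
    """Hoare-style partition of a[p..r] around a random pivot (assumed variant).

    Returns q with p <= q < r such that every element of a[p..q]
    is <= every element of a[q+1..r].
    """
    s = random.randint(p, r)
    a[p], a[s] = a[s], a[p]         # move the random pivot to the front
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def rand_select(a, p, r, i):
    """Return the i-th smallest element (1-based) of a[p..r]; reorders a."""
    if p == r:
        return a[p]
    q = rand_partition(a, p, r)
    k = q - p + 1                   # size of the low side a[p..q]
    if i <= k:
        return rand_select(a, p, q, i)
    return rand_select(a, q + 1, r, i - k)
```

Unlike QuickSort, only one of the two sides is searched after each partition, which is what brings the average running time down to O(n).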
Selection (cont.) • RandPartition (see 8.3, 8.4 textbook) gives partition with low side: • 1 element with probability 2/n • j elements with probability 1/n, for j=2,3,…,n-1. • Assume i-th element always on larger side:
$$T(n) \le \frac{1}{n}\Big(T(\max(1,n-1)) + \sum_{k=1}^{n-1} T(\max(k,n-k))\Big) + O(n)$$
$$\le \frac{1}{n}\Big(T(n-1) + 2\sum_{k=\lceil n/2\rceil}^{n-1} T(k)\Big) + O(n)$$
$$= \frac{2}{n}\sum_{k=\lceil n/2\rceil}^{n-1} T(k) + O(n),$$
since $T(n-1)=O(n^2)$ in the worst case, so the term $T(n-1)/n$ is absorbed into $O(n)$. • Then T(n)=O(n) (proof by substitution).
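The substitution step can be worked out as follows. This is a sketch under two assumptions that are not in the slides: the O(n) term is written as an for some constant a, and the inductive hypothesis is T(k) ≤ ck for all k < n. The block uses the `align*` environment from amsmath.

```latex
\begin{align*}
T(n) &\le \frac{2}{n}\sum_{k=\lceil n/2\rceil}^{n-1} ck + an \\
     &= \frac{2c}{n}\left(\frac{(n-1)n}{2}
        - \frac{(\lceil n/2\rceil - 1)\lceil n/2\rceil}{2}\right) + an \\
     &\le c(n-1) - \frac{c}{n}\left(\frac{n}{2}-1\right)\frac{n}{2} + an \\
     &= c\left(\frac{3n}{4} - \frac{1}{2}\right) + an \\
     &\le cn \quad \text{whenever } c\left(\frac{n}{4}+\frac{1}{2}\right) \ge an,
        \text{ e.g. } c \ge 4a.
\end{align*}
```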
Worst Case Linear Time Selection • O(n) worst case algorithm. • Works in similar way: recursively partition input array • Idea: guarantee good split • E.g., in QuickSort assume at each recursion level have T(n)=T(9n/10)+T(n/10)+O(n). • Then, T(n)=O(n log n). • Use deterministic partitioning: • Compute the element to partition around.
Algorithm Select: steps to find the i-th smallest element • Step 1: Divide the n elements into ⌊n/5⌋ groups of 5 elements, plus at most one group with the remaining (n mod 5) elements. • Step 2: Find the median of each group: • Insertion sort: O(1) time per group (at most 5 elements). • Take the middle element (the larger one if there are two medians). • Step 3: Use Select recursively to find the median x of the medians.
Algorithm Select (cont.) • Step 4: Partition the input array around the median-of-medians x. Let k be the number of elements on the low side, n-k on the high side. • a1,a2,…,ak | ak+1,ak+2,…,an • ai < aj, for 1 ≤ i ≤ k, k+1 ≤ j ≤ n. • Step 5: Use Select recursively to: • Find the i-th smallest element on the low side, if i ≤ k • Find the (i-k)-th smallest on the high side, if i > k.
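The steps of Select can be sketched in Python as below. This is an illustration, not the slides' exact procedure: it builds new lists instead of partitioning in place, it uses a three-way split (strictly smaller, x itself, strictly larger) so that x is returned directly when it has rank i, and it takes the lower median of the medians. As in the slides, the input is assumed to contain distinct numbers.

```python
def select(a, i):
    """Return the i-th smallest (1-based) element of a list of distinct numbers."""
    if not 1 <= i <= len(a):
        raise ValueError("i must satisfy 1 <= i <= n")
    if len(a) <= 5:
        return sorted(a)[i - 1]     # small input: sort directly
    # Steps 1-2: split into groups of at most 5 and take each group's median
    # (index len(g) // 2 picks the larger one when a group has two medians).
    groups = [sorted(a[j:j + 5]) for j in range(0, len(a), 5)]
    medians = [g[len(g) // 2] for g in groups]
    # Step 3: recursively find the median x of the medians.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around x (three-way variant, see the note above).
    low = [v for v in a if v < x]
    high = [v for v in a if v > x]
    k = len(low) + 1                # rank of x in a
    # Step 5: recurse on the side that contains the answer.
    if i == k:
        return x
    if i < k:
        return select(low, i)
    return select(high, i - k)
```

Because x is guaranteed to have a constant fraction of the elements on each side, neither recursive call in Step 5 can receive almost the whole input, which is what the analysis below relies on.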
Analysis • Find a lower bound on the number of elements greater than x. • At least half of the medians found in Step 2 are greater than or equal to x. Then, • At least half of the ⌈n/5⌉ groups contribute 3 elements that are greater than x, except: • the last group (if it has fewer than 5 elements); • x's own group. • Discard those two groups: • Number of elements greater than x is ≥ 3(⌈⌈n/5⌉/2⌉-2) ≥ 3n/10-6. • Similarly, the number of elements smaller than x is ≥ 3n/10-6. • Then, in the worst case, Select is called recursively in Step 5 on at most 7n/10+6 elements (upper bound).
Analysis (cont.) • Steps 1, 2 and 4: O(n) time. • Step 3: T(⌈n/5⌉) • Step 5: at most T(7n/10+6) • 7n/10+6 < n for n > 20. • T(n) ≤ T(⌈n/5⌉)+T(7n/10+6)+O(n), n > n1. • Use substitution to solve: • Assume T(n) ≤ cn, for n > n1; find n1 and c.
Analysis (cont.) • T(n) ≤ c⌈n/5⌉ + c(7n/10+6) + O(n) ≤ cn/5 + c + 7cn/10 + 6c + O(n) = 9cn/10 + 7c + O(n) • Want T(n) ≤ cn: • Pick c such that c(n/10-7) ≥ c1n, where c1 is the constant from O(n) above. With n1 = 80, any c ≥ 80c1 works, since 10n/(n-70) < 80 for n > 80.
Questions • Why not groups of 7 elements? • Why not groups of 3 elements? • T(n)=O(?)