Chapter 8: Overview. Comparison sorts are algorithms that sort sequences by comparing the values of elements. Their worst-case runtime (the number of comparisons required to sort n numbers) has a tight lower bound of Ω(n lg n). Beating n lg n requires information about the input in addition to the values of the elements to be sorted. Counting, radix, and bucket sort are examples where we have this additional information. Every sort must read all n elements, so their runtimes are Ω(n); under the right assumptions about the input they achieve linear time.
Lower bound on comparison sorts: Assuming the elements are distinct makes no difference in proving a lower bound on runtime. For distinct elements, all the information that can be gained from comparing elements is contained in the question "Is aj > ak?" This reduces comparison sorts to binary decision trees.
Decision tree to sort 3 distinct elements: Each internal node (labeled j : k) denotes a comparison of elements aj and ak. One outgoing edge denotes aj < ak, the other denotes aj > ak. Each leaf is a permutation of the 3 objects that describes how the input array is sorted by the path that connects the root to that leaf.
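The decision tree for three elements can be written out as nested comparisons. The sketch below is one possible tree, not necessarily the one drawn on the slide; the function name `sort3` and the comparison counter are illustrative assumptions.

```python
from itertools import permutations

def sort3(a):
    """Sort three distinct values by walking a fixed decision tree.

    Returns (sorted_list, number_of_comparisons_used).
    Each `if` is one internal node j:k of the tree; each `return` is a leaf.
    """
    a1, a2, a3 = a
    if a1 < a2:                          # node 1:2
        if a2 < a3:                      # node 2:3
            return [a1, a2, a3], 2
        if a1 < a3:                      # node 1:3
            return [a1, a3, a2], 3
        return [a3, a1, a2], 3
    else:
        if a1 < a3:                      # node 1:3
            return [a2, a1, a3], 2
        if a2 < a3:                      # node 2:3
            return [a2, a3, a1], 3
        return [a3, a2, a1], 3

# Every one of the 3! = 6 input orders reaches its own leaf.
results = [sort3(p) for p in permutations([1, 2, 3])]
worst = max(count for _, count in results)   # height of this tree
```

Here `worst` is 3: the tree has six leaves, so its height cannot be 2, since a binary tree of height 2 has at most four leaves.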
Binary decision tree to sort n distinct elements: Any valid sorting algorithm must be correct for all possible permutations of its input. A necessary condition for correct comparison sorting is that every permutation of the n objects must appear in the decision tree as a leaf that is reachable from the root. The length of the longest path from the root to a leaf represents the worst-case number of comparisons for a comparison sort. • Worst-case running time of a comparison sort = height of its decision tree.
Minimum height of a decision tree to sort n elements: Let h be the height of the decision tree of an n-element comparison sort. A complete binary tree of height h has 2^h leaves. The binary decision tree that sorts n elements need not be complete, so it has at most 2^h leaves, and it must have at least n! of them. Therefore 2^h ≥ n!. Solving for h gives h ≥ lg(n!). We have shown that lg(n!) = Θ(n lg n). Therefore h = Ω(n lg n).
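To get a feel for the bound, the sketch below computes the smallest height h whose 2^h leaves can hold all n! permutations, that is, ceil(lg(n!)), and prints it next to n lg n. The function name is an illustrative assumption.

```python
import math

def min_worst_case_comparisons(n):
    """Smallest integer h with 2**h >= n!, i.e. ceil(lg(n!)).

    Uses exact integer arithmetic: for f >= 1, (f - 1).bit_length()
    equals ceil(log2(f)).
    """
    f = math.factorial(n)
    return (f - 1).bit_length()

for n in (3, 4, 5, 10, 100):
    h = min_worst_case_comparisons(n)
    print(n, h, round(n * math.log2(n), 1))
```

For n = 3 this gives h = 3, matching the height of a decision tree on three elements; for n = 10 it gives h = 22.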
Counting sort: Counting sort requires that (1) the elements to be sorted are integers and (2) we know the range of the integers in the input array (i.e., no element has a value < 0 or > k). • (1) and (2) allow us to use the elements to be sorted as array indices. For every element x, count the number of elements that are less than x; this count gives the position of x in the sorted output. • (1) and (2) allow us to do this without comparing elements against one another.
Runtime of Counting-sort: The initialization for loop (lines 1 & 2) requires Θ(k). The 1st pass on C[i] (for loop in lines 3 & 4) requires Θ(n). The 2nd pass on C[i] (for loop in lines 6 & 7) requires Θ(k). Building the output (for loop in lines 9, 10 & 11) requires Θ(n). Overall T(n) = Θ(k + n), which is Θ(n) when k = O(n), in particular for any fixed k.
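The four passes above can be sketched as follows. This is a minimal version that assumes integer keys in 0..k; the function name, the range check, and the 0-based indexing are choices made here, so the loops correspond to the passes listed above but not line-for-line to the textbook pseudocode.

```python
def counting_sort(a, k):
    """Stable counting sort of integers in the range 0..k."""
    if any(x < 0 or x > k for x in a):
        raise ValueError("all elements must lie in 0..k")

    count = [0] * (k + 1)            # initialization: Theta(k)

    for x in a:                      # 1st pass: count[v] = number of elements equal to v
        count[x] += 1                # Theta(n)

    for v in range(1, k + 1):        # 2nd pass: count[v] = number of elements <= v
        count[v] += count[v - 1]     # Theta(k)

    out = [None] * len(a)
    for x in reversed(a):            # build output: Theta(n)
        count[x] -= 1                # scanning right to left keeps equal keys in order
        out[count[x]] = x
    return out

print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))  # [0, 0, 2, 2, 3, 3, 3, 5]
```

No two elements are ever compared with each other; the keys are used only as indices into `count`.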
Bucket sort: Beats the Ω(n lg n) bound on comparison sorts by using the distribution of input values. Example: sort elements uniformly distributed on [0,1). Define bins or buckets such that all 0.0 ≤ x < 0.1 go in bin 0, all 0.1 ≤ x < 0.2 go in bin 1, etc. Because the input is uniformly distributed on [0,1), we expect only a small number of elements to fall into any one bin. A small number of elements can be sorted efficiently by Insertion-sort. Concatenate the sorted bins to get the sorted output.
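A minimal sketch of these steps follows. It assumes inputs in [0,1) and uses n equal-width buckets for n elements (the ten bins of width 0.1 in the example correspond to n = 10); the function names are illustrative.

```python
def insertion_sort(b):
    """Sort list b in place; efficient when b is short."""
    for i in range(1, len(b)):
        key = b[i]
        j = i - 1
        while j >= 0 and b[j] > key:
            b[j + 1] = b[j]
            j -= 1
        b[j + 1] = key

def bucket_sort(a):
    """Sort values assumed to lie in [0, 1) using len(a) equal-width buckets."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        # Bucket i holds values with i/n <= x < (i+1)/n.
        # min() guards against floating-point rounding when x is just below 1.
        buckets[min(int(n * x), n - 1)].append(x)
    out = []
    for b in buckets:                # sort each bin, then concatenate in bin order
        insertion_sort(b)
        out.extend(b)
    return out

data = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
print(bucket_sort(data))
```

With these ten values, bin 1 receives 0.17 and 0.12 and bin 2 receives 0.26, 0.21 and 0.23, so each insertion sort works on only a few elements.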
Runtime of Bucket-sort is a random variable: Different permutations of the same input produce the same bin sizes ni, so the bound below is the same for all of them. Different instances of inputs uniformly distributed on [0,1) will have different bin structures and different runtimes. T(n) = Θ(n) + Σ_{i=0}^{n-1} O(ni^2) (bin the input, then sort the bins). E[T(n)] = Θ(n) + Σ_{i=0}^{n-1} O(E[ni^2]) (linearity of expectation). By a somewhat involved use of indicator random variables, E[ni^2] = 2 - 1/n (text, pp. 202-203). Therefore E[T(n)] = Θ(n) + n · O(2 - 1/n) = Θ(n).
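The value 2 - 1/n can be checked exactly for small n. The sketch below assumes n bins and that each of the n elements lands in a given bin independently with probability 1/n, so the bin size ni follows a Binomial(n, 1/n) distribution; it sums k^2 · P(ni = k) with exact rational arithmetic. The function name is an illustrative assumption, and this is a numerical check, not the indicator-variable proof from the text.

```python
from fractions import Fraction
from math import comb

def expected_square_bin_count(n):
    """Exact E[ni^2] for ni ~ Binomial(n, 1/n), as a Fraction."""
    p = Fraction(1, n)
    return sum(
        k * k * comb(n, k) * p**k * (1 - p) ** (n - k)   # k^2 * P(ni = k)
        for k in range(n + 1)
    )

for n in (1, 2, 5, 10):
    print(n, expected_square_bin_count(n), 2 - Fraction(1, n))
```

The two printed values agree for each n, consistent with the variance-plus-squared-mean identity for a binomial: n · (1/n) · (1 - 1/n) + 1 = 2 - 1/n.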