Summary of claims

Summary of claims. Sorting algorithms that compare adjacent elements have average-case time $\Omega(n^2)$. Sorting algorithms that compare pairs of elements have worst-case time $\Omega(n \log n)$. BST sort and quicksort each have average-case time complexity $\Theta(n \log n)$.


Presentation Transcript

  1. Summary of claims • Sorting algorithms that compare adjacent elements have average-case time $\Omega(n^2)$ • Sorting algorithms that compare pairs of elements have worst-case time $\Omega(n \log n)$ • BST sort and quicksort each have average-case time complexity $\Theta(n \log n)$ • In each case we assume (unrealistically?): • that each permutation of the sorted sequence is equally likely to appear as input

  2. Average-case analysis • Average-case analysis of algorithms requires a precise notion of average-case behavior. • Finding such a notion can be hard. • Often the simplest notion is to assume that each possible input is equally likely.

  3. Average-case analysis for sorting • For sorting, the input may be considered to be a permutation of the sorted sequence. • This is why we assume that each input permutation is equally likely • and thus has probability 1/n!. • This may be a bad assumption in practice • in particular, sorted input often occurs with probability greater than 1/n!

  4. The equal-likelihood assumption • If it’s important, we can force the equal-likelihood assumption to be true • The sorting algorithm can first apply a pseudorandom permutation to its input • this can be done in O(n) time • so this preprocessing step won’t affect the time complexity of the sorting algorithm
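
One way to realize the O(n) preprocessing step is a Fisher-Yates shuffle. The slides do not name a specific method, so this is a minimal sketch under that assumption; the function names `shuffle_in_place` and `randomized_sort` are hypothetical.

```python
import random


def shuffle_in_place(items, rng=random):
    """Fisher-Yates shuffle: one pass, one swap per position, so O(n) time."""
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        items[i], items[j] = items[j], items[i]
    return items


def randomized_sort(items, sort=sorted):
    """Shuffle a copy of the input, then hand it to any sorting routine."""
    return sort(shuffle_in_place(list(items)))
```

Because the shuffle costs only linear time, it is dominated by any sort that takes at least n log n steps. The output is only as uniform as the underlying pseudorandom generator allows.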

  5. Inefficiency of swapping adjacent elements • Some sorting algorithms work by swapping adjacent elements • e.g., insertion sort and bubble sort • We need a way to prove our claim: that these algorithms must have average-case time complexity $\Omega(n^2)$ • under the equal-likelihood assumption

  6. Inverse permutations • One slick idea: for any input permutation $p$, there's an inverse permutation $p^{-1}$, here meaning $p$ written in reverse order. • By assumption, $p$ and $p^{-1}$ have the same probability. • It's enough to show that the time averaged over $p$ and $p^{-1}$ is $\Omega(n^2)$ • The key concept is that of an inversion: a pair of elements that is out of order.

  7. Inversions • Any pair of elements must be inverted in exactly one of $\{p, p^{-1}\}$, where $p^{-1}$ is $p$ in reverse order • So $p$ and $p^{-1}$ must together contain $n(n-1)/2$ inversions, one for each pair of elements. • so the average number over $\{p, p^{-1}\}$ is $n(n-1)/4$ • Since $p$ is arbitrary, the average number of inversions overall is $n(n-1)/4$, or $\Theta(n^2)$. • Swapping adjacent elements fixes at most 1 inversion, so $\Omega(n^2)$ swaps are required on average.
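
The counting argument can be checked by brute force for small n. The sketch below is illustrative only and its helper names are hypothetical. It counts inversions, pairs each permutation with its reversal, and confirms that insertion sort performs exactly one adjacent swap per inversion.

```python
from itertools import permutations


def count_inversions(seq):
    """Number of pairs (i, j) with i < j and seq[i] > seq[j]."""
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])


def insertion_sort_swaps(seq):
    """Adjacent swaps performed by insertion sort on a copy of seq."""
    a = list(seq)
    swaps = 0
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            swaps += 1
            j -= 1
    return swaps


def average_inversions(n):
    """Average inversion count over all n! permutations of range(n)."""
    perms = list(permutations(range(n)))
    return sum(count_inversions(p) for p in perms) / len(perms)
```

For distinct elements, `count_inversions(p) + count_inversions(p[::-1])` equals n(n-1)/2 for every permutation `p`, which is the pairing step of the argument.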

  8. BST sort in the average case • Recall our claim: that BST sort takes time $\Theta(n \log n)$ in the average case • if we make the equal-likelihood assumption • It's enough to show that the total sum $D(n)$ of the distances from the root to the nodes is $\Theta(n \log n)$ • since this sum measures the total time for all insertions • and traversal takes time $\Theta(n)$ • this sum is often called the internal path length

  9. Internal path lengths • Consider BSTs of size $n$ whose left subtree (LST) has size $i$ • The average internal path length of such BSTs is $D(i) + D(n-i-1) + n-1$ • since the distance of each of the $n-1$ nonroot nodes to the root is 1 greater than its distance to its subtree's root • So the average internal path length of BSTs of size $n$ is the average of these values • as $i$ ranges from 0 through $n-1$ • all values of $i$ are equally likely by the equal-likelihood assumption

  10. Bounding the average internal path length • But the average value of $D(i) + D(n-i-1) + n-1$ is $(2/n)\left[\sum_{i=0}^{n-1} D(i)\right] + n-1$ • since $i$ and $n-i-1$ range over the same values • To show: this is $\le cn \log n$ for some $c$ • We can use the corresponding inequality for $i<n$ as an induction hypothesis • So it's enough to show that $(2/n)\left[\sum c\,i \log i\right] + n-1 \le cn \log n$
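
The recurrence can also be evaluated numerically to see how the averages compare with $cn \log n$. This is a sketch that assumes $D(0) = 0$ and uses the natural logarithm; the function name is hypothetical.

```python
import math


def average_internal_path_lengths(n_max):
    """Tabulate D(n) = (2/n) * (D(0) + ... + D(n-1)) + (n - 1), with D(0) = 0."""
    d = [0.0] * (n_max + 1)
    prefix = 0.0  # running sum D(0) + ... + D(n-1)
    for n in range(1, n_max + 1):
        d[n] = 2.0 * prefix / n + (n - 1)
        prefix += d[n]
    return d


if __name__ == "__main__":
    d = average_internal_path_lengths(1000)
    for n in (10, 100, 1000):
        # Compare the exact average with the bound c * n * ln(n) for c = 2.
        print(n, round(d[n], 2), round(2 * n * math.log(n), 2))
```

Keeping a running prefix sum makes the whole table cost O(n_max) rather than O(n_max^2).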

  11. Bounding $(2/n)\left[\sum c\,i \log i\right] + n - 1$ by $cn \log n$ • The integral test bounds $\sum_{i=1}^{n-1} i \log i$ above by $\int_1^n x \log x \, dx$ (with $\log$ the natural logarithm) • The indefinite integral is $(x^2/2)\log x - x^2/4$ • using integration by parts • So $(2/n)\left[\sum c\,i \log i\right] + n - 1 \le (2c/n)\left[(n^2 \log n)/2 - n^2/4 + 1/4\right] + n - 1 = cn \log n - cn/2 + c/(2n) + n - 1$, which is at most $cn \log n$ for $c \ge 2$ and $n > c/2$
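
The integration-by-parts step can be written out as follows. This is a worked sketch that assumes $\log$ is the natural logarithm and takes $u = \ln x$, $dv = x\,dx$.

```latex
\begin{align*}
\int x \ln x \, dx
  &= \frac{x^2}{2}\ln x - \int \frac{x^2}{2}\cdot\frac{1}{x}\,dx
   = \frac{x^2}{2}\ln x - \frac{x^2}{4} + C, \\
\int_1^n x \ln x \, dx
  &= \frac{n^2}{2}\ln n - \frac{n^2}{4} + \frac{1}{4}.
\end{align*}
```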

  12. Quicksort in the average case • For the average-case analysis of quicksort: • under the equal-likelihood assumption • Let $T(n)$ be the average-case time for quicksort on input of size $n$ • Then $T(n) \le (1/n)\left[\sum T(k) + \sum T(k)\right] + cn$ • $T(0) = 0$, so each sum runs from $k=1$ to $k=n-1$ • the factor $1/n$ appears since a randomly chosen pivot element is equally likely to be anywhere in the output

  13. To show: $T(n) \le dn \log n$ for some $d$ • We have $T(n) \le (2/n) \sum_{k=1}^{n-1} T(k) + cn$ • by combining like terms • By induction, $T(n) \le (2d/n) \sum_{k=1}^{n-1} k \log k + cn$ • for some $d$ that we may choose • The sum is at most $n^2[(\log n)/2 - 1/4] + 1/4$ • by the same integral test as for BSTs • So $T(n) \le dn \log n - (d/2)n + d/(2n) + cn$ • And so $T(n) \le dn \log n$, QED • for $d \gg c$ and large $n$ (e.g., for $d \ge 3c$ and $n^2 > 3$)
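
For small n the recurrence can be compared against a direct count. The sketch below makes several assumptions that are not in the slides: the pivot is the first element, partitioning preserves relative order, and cost is measured in comparisons, so the per-call cost is n - 1 (which is at most cn for c = 1). All function names are hypothetical.

```python
from itertools import permutations
import math


def quicksort_comparisons(seq):
    """Comparisons made by quicksort using the first element as pivot."""
    if len(seq) <= 1:
        return 0
    pivot, rest = seq[0], seq[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    # Each non-pivot element is charged one comparison against the pivot.
    return len(rest) + quicksort_comparisons(smaller) + quicksort_comparisons(larger)


def measured_average(n):
    """Average comparison count over all n! permutations of range(n)."""
    perms = list(permutations(range(n)))
    return sum(quicksort_comparisons(list(p)) for p in perms) / len(perms)


def recurrence_average(n_max):
    """Tabulate T(n) = (2/n) * (T(0) + ... + T(n-1)) + (n - 1), with T(0) = 0."""
    t = [0.0] * (n_max + 1)
    prefix = 0.0
    for n in range(1, n_max + 1):
        t[n] = 2.0 * prefix / n + (n - 1)
        prefix += t[n]
    return t
```

Under these assumptions the measured averages agree with the recurrence, and the tabulated values stay below $3n \ln n$, consistent with choosing $d \ge 3c$.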

  14. Sorting by comparing pairs of elements • Finally, consider an arbitrary sorting algorithm that works by comparing pairs of elements • In $k$ comparisons, such an algorithm can distinguish at most $2^k$ input permutations • But there are $n!$ input permutations. • So $\Omega(\log n!)$ comparisons are required in the worst case

  15. A lower bound for sorting by comparing pairs of elements • But $\log n!$ is just $\sum \log k$ • as $k$ ranges from 1 to $n$ • And $\sum \log k$ is bounded below by $\int \log x \, dx$ • as $x$ ranges from 1 to $n$, by the integral test • The indefinite integral is $x \log x - x$ • So $\log n! \ge n \log n - n + 1$ • And the number of comparisons required by comparison-based sorts is $\Omega(n \log n)$
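
The bound can be made concrete by computing the smallest $k$ with $2^k \ge n!$ and comparing it with the integral estimate. This is a sketch with hypothetical function names; the estimate is converted to base 2, since each comparison yields one bit.

```python
import math


def min_comparisons(n):
    """Smallest k with 2**k >= n!, computed exactly with integers."""
    # For an integer x >= 1, ceil(log2(x)) equals (x - 1).bit_length().
    return (math.factorial(n) - 1).bit_length()


def integral_lower_bound(n):
    """(n ln n - n + 1) / ln 2: the integral-test bound, expressed in bits."""
    return (n * math.log(n) - n + 1) / math.log(2)


if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        print(n, min_comparisons(n), round(integral_lower_bound(n), 1))
```

Using integer arithmetic for the exact value avoids floating-point rounding when $n!$ is large.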

  16. Guessing and verifying a solution for the mergesort recurrence • Suppose we guess that the mergesort recurrence $T(n) = 2T(n/2) + cn$ has the $O(n \log n)$ solution suggested by merge trees • We can verify our guess by induction • We'd need to show $2T(n/2) + cn < dn \log n$ • by induction we may assume • $2T(n/2) + cn \le 2\,(d(n/2)\log(n/2)) + cn$ • but on the right we have $dn(-1 + \log n) + cn$ (taking $\log$ to base 2) • $= -dn + dn \log n + cn$ • $< dn \log n$ if $d > c$
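
The guess can be sanity-checked by unrolling the recurrence for powers of two. This is a sketch that assumes the base case $T(1) = 0$, which the slides do not state; the function name is hypothetical.

```python
import math


def mergesort_cost(n, c=1.0):
    """T(n) = 2 T(n/2) + c n with assumed base case T(1) = 0; n a power of two."""
    if n == 1:
        return 0.0
    return 2 * mergesort_cost(n // 2, c) + c * n


if __name__ == "__main__":
    for k in range(1, 11):
        n = 2 ** k
        # Under the assumed base case, T(n) equals c * n * log2(n) exactly.
        print(n, mergesort_cost(n), n * math.log2(n))
```

With this base case the recurrence solves to exactly $cn \log_2 n$, so the bound $dn \log_2 n$ holds for every $d \ge c$.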
