
Summary of claims

Summary of claims. Sorting algorithms that compare adjacent elements have average-case time Ω(n²). Sorting algorithms that compare pairs of elements have worst-case time Ω(n log n). BST sort and quicksort each have average-case time complexity Θ(n log n).

jovan

Presentation Transcript


  1. Summary of claims • Sorting algorithms that compare adjacent elements have average-case time Ω(n²) • Sorting algorithms that compare pairs of elements have worst-case time Ω(n log n) • BST sort and quicksort each have average-case time complexity Θ(n log n) • For the average-case claims we assume that each permutation of the sorted sequence is equally likely to appear as input • This is often an unrealistic assumption.

  2. Average-case analysis • Average-case analysis of algorithms requires a precise notion of average-case behavior. • Finding such a notion can be hard. • Often the simplest notion is to assume that each possible input is equally likely.

  3. Average-case analysis for sorting • For sorting, the input may be considered to be a permutation of the sorted sequence. • So the simplest assumption here is that each input permutation is equally likely • and thus has probability 1/n!. • This may be a bad assumption in practice • In particular, sorted input often occurs with probability greater than 1/n!

  4. The equal-likelihood assumption • Yet it’s common to assume equal likelihood in average-case analysis of sorting. • This assumption can be made true if necessary. • the sorting algorithm can first apply a pseudorandom permutation to its input • this can be done in O(n) time
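One way to apply a pseudorandom permutation in O(n) time is the Fisher-Yates shuffle. The slides do not name a particular method, so this is a minimal sketch under that assumption; the function name `shuffle_in_place` and the seeded generator are illustrative only.

```python
import random


def shuffle_in_place(items, rng=None):
    """Fisher-Yates shuffle: permute `items` uniformly at random in O(n) time."""
    if rng is None:
        rng = random.Random()
    # Walk from the last position down; each position receives an element
    # chosen uniformly from those not yet fixed (indices 0..i).
    for i in range(len(items) - 1, 0, -1):
        j = rng.randrange(i + 1)
        items[i], items[j] = items[j], items[i]
    return items


data = list(range(10))
# A seeded generator (an assumption made here for reproducibility).
shuffled = shuffle_in_place(data[:], random.Random(0))
```

A sorting routine could call this once on its input before sorting, so that the equal-likelihood assumption holds regardless of how the input was ordered.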

  5. Inefficiency of swapping adjacent elements • Some sorting algorithms work by swapping adjacent elements, e.g., insertion sort and bubble sort • Recall our claim: these algorithms must have average-case time complexity Ω(n²). • with the equal-likelihood assumption

  6. Reversed permutations • One slick idea: for any input permutation p, consider its reversal p^R (p read back to front). • By assumption, p and p^R have the same probability. • It’s enough to show that the average number of swaps needed for p and p^R is Ω(n²) • The key concept is that of an inversion: a pair of elements that is out of order.

  7. Inversions • Any pair of elements must be inverted in exactly one of {p, p^R} • So p and p^R must together contain as many inversions as there are pairs of elements. • this number is n(n-1)/2 • The average number of inversions over p and p^R (and thus over all permutations) is thus n(n-1)/4 • which is Θ(n²). • Swapping adjacent elements fixes at most 1 inversion per swap, so Ω(n²) swaps are required on average.
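The counting argument can be checked by brute force for small n. This sketch pairs each permutation with its reversal (the sequence read back to front), the pairing under which each pair of elements is inverted in exactly one of the two. The helper names are hypothetical, and insertion sort stands in for any adjacent-swap algorithm.

```python
from fractions import Fraction
from itertools import permutations


def count_inversions(seq):
    """Number of pairs (i, j) with i < j and seq[i] > seq[j]."""
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])


def insertion_sort_swaps(seq):
    """Sort a copy of seq by adjacent swaps and return how many swaps were made."""
    a = list(seq)
    swaps = 0
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]  # each swap removes exactly one inversion
            swaps += 1
            j -= 1
    return swaps


def average_inversions(n):
    """Exact average inversion count over all n! permutations of 0..n-1."""
    perms = list(permutations(range(n)))
    return Fraction(sum(count_inversions(p) for p in perms), len(perms))
```

For n = 5, `average_inversions(5)` equals 5·4/4 = 5, and for every permutation the swap count of insertion sort matches its inversion count.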

  8. BST sort in the average case • Recall our claim: BST sort takes time Θ(n log n) in the average case • if we make the equal-likelihood assumption • It’s enough to show that the sum of the depths of all nodes (their distances from the root) is Θ(n log n) • since this sum measures the total time for all insertions • and traversal takes time Θ(n) • this sum is often called the internal path length
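To make the link between insertion cost and internal path length concrete, here is a minimal sketch of BST sort that also reports the sum of node depths. It assumes an unbalanced tree with duplicates sent to the right subtree; the `Node` class and `bst_sort` function are illustrative names, not taken from the slides.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_sort(items):
    """Return (sorted list, internal path length of the tree that was built)."""
    root = None
    path_length = 0
    for key in items:
        if root is None:
            root = Node(key)  # the root has depth 0
            continue
        node, depth = root, 0
        while True:
            depth += 1  # the new key will sit one level below `node`
            if key < node.key:
                if node.left is None:
                    node.left = Node(key)
                    break
                node = node.left
            else:
                if node.right is None:
                    node.right = Node(key)
                    break
                node = node.right
        path_length += depth  # insertion cost is proportional to the depth reached

    # Iterative in-order traversal: visits every node once, in sorted order.
    out, stack, node = [], [], root
    while stack or node is not None:
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        out.append(node.key)
        node = node.right
    return out, path_length
```

Already-sorted input such as `[1, 2, 3, 4]` produces a chain with path length 0 + 1 + 2 + 3 = 6, while `[2, 1, 3]` produces a balanced tree with path length 2.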

  9. Internal path lengths • Let D(k) be the average internal path length of BSTs of size k • Consider BSTs of size n whose left subtree (LST) has size i • The average internal path length of such BSTs is D(i) + D(n-i-1) + n-1 • since each of the n-1 non-root nodes is one level deeper than it is in its subtree • So the average internal path length of BSTs of size n is the average of these values • as i ranges from 0 through n-1 • using our equal-likelihood assumption to ensure that all values of i are equally likely
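The recurrence can be compared against a direct enumeration for small n. In this sketch `path_length` computes the internal path length of the tree built from a sequence without constructing the tree, and `average_by_recurrence` evaluates D with exact fractions. Both names are hypothetical, and D(0) = 0 is assumed as the base case.

```python
from fractions import Fraction
from itertools import permutations


def path_length(seq):
    """Internal path length of the BST built by inserting seq in order (distinct keys)."""
    if len(seq) <= 1:
        return 0
    root = seq[0]
    left = [x for x in seq[1:] if x < root]
    right = [x for x in seq[1:] if x > root]
    # Every non-root node is one level deeper than in its own subtree.
    return path_length(left) + path_length(right) + len(seq) - 1


def average_by_enumeration(n):
    """Exact average internal path length over all n! insertion orders."""
    perms = list(permutations(range(n)))
    return Fraction(sum(path_length(p) for p in perms), len(perms))


def average_by_recurrence(n_max):
    """D(0) = 0; D(n) = (1/n) * sum over i of [D(i) + D(n-i-1)] + n - 1."""
    D = [Fraction(0)]
    for n in range(1, n_max + 1):
        total = sum(D[i] + D[n - i - 1] for i in range(n))
        D.append(total / n + n - 1)
    return D
```

For example, both approaches give D(3) = 8/3: of the six insertion orders, four build a chain (path length 3) and two build a balanced tree (path length 2).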

  10. Bounding the average internal path length • But the average value of D(i) + D(n-i-1) + n-1 is (2/n)[Σ D(i)] + n-1 • since the arguments to D range over the same values, 0 through n-1, in both terms • To show that this is at most cn log n for some c, we can use the corresponding inequality for i<n as an induction hypothesis • So it’s enough to show that (2/n)[Σ ci log i] + n-1 ≤ cn log n • where the sum runs over 1 ≤ i ≤ n-1, since D(0) = 0

  11. Bounding (2/n)[Σ ci log i] + n-1 by cn log n • The integral test bounds Σ i log i above by the integral from 1 to n of x log x dx • The indefinite integral is (x²/2) log x - x²/4 • using integration by parts, with log the natural logarithm • So (2/n)[Σ ci log i] + n-1 ≤ (2c/n)[(n² log n)/2 - n²/4 + 1/4] + n-1 = cn log n - cn/2 + c/(2n) + n - 1 • which is at most cn log n for c ≥ 2 and n > c/2
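Both inequalities on this slide can be spot-checked numerically. The sketch below assumes natural logarithms, takes c = 2, and evaluates D from the recurrence with D(0) = 0; the function name `check_bounds` and the small floating-point tolerance are choices made for this example.

```python
import math


def check_bounds(n_max, c=2.0, tol=1e-9):
    """Check, for 1 <= n <= n_max, that
    (a) sum_{i=1}^{n-1} i ln i <= (n^2 ln n)/2 - n^2/4 + 1/4, and
    (b) D(n) <= c * n * ln n,
    where D(n) = (2/n) * sum_{i<n} D(i) + n - 1 and D(0) = 0."""
    prefix_d = 0.0   # D(0) + ... + D(n-1)
    sum_ilogi = 0.0  # sum of i ln i for 1 <= i <= n-1
    for n in range(1, n_max + 1):
        integral = n * n * math.log(n) / 2 - n * n / 4 + 0.25
        if sum_ilogi > integral + tol:
            return False
        d_n = 2 * prefix_d / n + n - 1
        if d_n > c * n * math.log(n) + tol:
            return False
        prefix_d += d_n
        sum_ilogi += n * math.log(n)
    return True


ok = check_bounds(2000)
```

A passing check is only evidence for the tested range; the induction on the slide is what establishes the bound for all n.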

  12. Quicksort in the average case • For the average-case analysis of quicksort: • under the equal-likelihood assumption • Let T(n) be the average-case time for quicksort on input of size n • Then T(n) ≤ (1/n) [Σ T(k) + Σ T(k)] + cn • where the two sums, each over k from 0 to n-1, account for the sublists on either side of the pivot • since a randomly chosen pivot element is equally likely to be anywhere in the output • T(0) = 0, so each sum effectively runs from k=1 to k=n-1

  13. To show: T(n) ≤ dn log n for some d • We have T(n) ≤ (2/n) Σ T(k) + cn • by combining like terms • By induction, T(n) ≤ (2d/n) Σ (k log k) + cn • for some d that we may choose • The sum is at most n²[(log n)/2 - 1/4] + 1/4 • by the same integral test as for BSTs • So T(n) ≤ dn log n - (d/2)n + d/(2n) + cn • And so T(n) ≤ dn log n, QED • for d sufficiently larger than c (e.g., for d ≥ 3c and n² > 3)
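As a sanity check on the constants, the recurrence can be evaluated directly. This sketch assumes the recurrence holds with equality, sets c = 1 and d = 3c, and uses natural logarithms; `quicksort_recurrence` is an illustrative name.

```python
import math


def quicksort_recurrence(n_max, c=1.0):
    """T(0) = 0; T(n) = (2/n) * sum_{k=1}^{n-1} T(k) + c*n, taken with equality."""
    T = [0.0]
    prefix = 0.0  # T(0) + ... + T(n-1)
    for n in range(1, n_max + 1):
        t_n = 2 * prefix / n + c * n
        T.append(t_n)
        prefix += t_n
    return T


T = quicksort_recurrence(5000, c=1.0)
d = 3.0
# The check starts at n = 2 because n log n vanishes at n = 1.
holds = all(T[n] <= d * n * math.log(n) for n in range(2, len(T)))
```

With these values the bound T(n) ≤ dn log n holds for every n from 2 through 5000, consistent with the condition n² > 3 on the slide.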

  14. Sorting by comparing pairs of elements • Finally, consider an arbitrary sorting algorithm that works by comparing pairs of elements • In k comparisons, such an algorithm can distinguish at most 2^k input permutations • But there are n! input permutations. • So Ω(log n!) comparisons are required

  15. A lower bound for sorting by comparing pairs of elements • But log n! is just Σ log k • as k ranges from 1 to n • And Σ log k is bounded below by ∫ log x dx • as x ranges from 1 to n, by the integral test • The indefinite integral is x log x - x • So log n! ≥ n log n - n + 1 • And the number of comparisons required by comparison-based sorts is Ω(n log n)
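The lower bound is easy to tabulate. In this sketch `min_comparisons` returns the smallest k with 2^k ≥ n!, and `integral_lower_bound` is the value n log n - n + 1 from the slide, with log taken as the natural logarithm; both function names are hypothetical.

```python
import math


def min_comparisons(n):
    """Smallest k with 2**k >= n!, i.e. the ceiling of log2(n!)."""
    return (math.factorial(n) - 1).bit_length()


def integral_lower_bound(n):
    """n ln n - n + 1, the integral of ln x from 1 to n."""
    return n * math.log(n) - n + 1


rows = [(n, min_comparisons(n), integral_lower_bound(n)) for n in (2, 4, 8, 16, 32)]
for n, k, bound in rows:
    print(f"n={n:2d}  ceil(log2 n!)={k:3d}  n ln n - n + 1 = {bound:.2f}")
```

Since log2 x ≥ ln x for x ≥ 1, the integral bound in natural logarithms is also a valid lower bound on the number of binary comparisons.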
