
15-211 Fundamental Structures of Computer Science

15-211 Fundamental Structures of Computer Science. Sorting – Part II. February 25, 2003. Ananda Guna. Announcements: Work on Homework #4, due Monday, March 17, 11:59pm. You should have started by now! Quiz #2 is Tuesday, Feb. 25; study the Huffman and LZW algorithms.


Presentation Transcript


  1. 15-211 Fundamental Structures of Computer Science Sorting – Part II February 25, 2003 Ananda Guna

  2. Announcements • Work on Homework #4 • Due on Monday, March 17, 11:59pm • You should have started by now! • Quiz #2 • is Tuesday, Feb. 25 • Study Huffman and LZW algorithms • Midterm is Tuesday, March 4th • Review for the midterm is on Thursday

  3. Master Theorem THEOREM: The recurrence T(n) = aT(n/b) + cn, T(1) = c, where a, b, and c are all constants, solves to: • T(n) = Θ(n) if a < b • T(n) = Θ(n log n) if a = b • T(n) = Θ(n^(log_b a)) if a > b
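
The three cases can be checked numerically. The sketch below is only an illustration, assuming c = 1 and n restricted to powers of b; the helper names `make_recurrence` and `predicted` are made up for this example and are not from the lecture.

```python
from functools import lru_cache
import math

def make_recurrence(a, b, c=1):
    """Return T with T(n) = a*T(n/b) + c*n and T(1) = c, for n a power of b."""
    @lru_cache(maxsize=None)
    def T(n):
        if n <= 1:
            return c
        return a * T(n // b) + c * n
    return T

def predicted(a, b, n):
    """Growth rate named by the theorem, ignoring constant factors."""
    if a < b:
        return n
    if a == b:
        return n * math.log(n, b)
    return n ** math.log(a, b)

for a, b in [(1, 2), (2, 2), (4, 2)]:
    T = make_recurrence(a, b)
    # If the theorem's case is the right one, T(n) / predicted(n)
    # should settle near a constant as n grows.
    ratios = [T(b ** k) / predicted(a, b, b ** k) for k in (5, 10, 15)]
    print(a, b, [round(r, 3) for r in ratios])
```

A ratio that levels off (rather than growing or shrinking toward zero) is what a Θ bound predicts; this is a sanity check, not a proof.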

  4. Recurrences • Divide-and-conquer algorithms often lead to recurrences of the following form: • T(n) = a*T(n/b) + cn • T(1) = c (Here a, b, and c are constants > 0.) • For merge sort: a = b = 2 • What if a = b = 3 for merge sort? How will that affect c?
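
To see where a = b = 2 comes from, here is a minimal merge sort sketch (not the course's reference code): two recursive calls on halves account for 2T(n/2), and the linear-time merge accounts for cn. The function names are chosen for this example.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # "<=" takes from the left list on ties, which keeps the sort stable.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged

def merge_sort(items):
    """Return a new sorted list: a = 2 subproblems, each of size n/b with b = 2."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])     # T(n/2)
    right = merge_sort(items[mid:])    # T(n/2)
    return merge(left, right)          # cn

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))
```

A three-way variant would make three recursive calls on thirds and merge three lists; since a = b still holds, the recurrence has the same form, and only the per-element cost of the merge step changes.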

  5. Solving General Recurrences • We can solve this by the repeated substitution method • T(n) = aT(n/b) + cn • = a(aT(n/b^2) + cn/b) + cn • = a(a(aT(n/b^3) + cn/b^2) + cn/b) + cn • = … • = a^(k+1) T(n/b^(k+1)) + cn[(a/b)^k + … + (a/b)^2 + (a/b) + 1] • We will solve this in class.
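
One way to finish the substitution is sketched below. It assumes b > 1 and n = b^(k+1), so that the recursion bottoms out at T(1) = c; the in-class derivation may have been organized differently.

```latex
% Assumption: n = b^{k+1}, hence T(n / b^{k+1}) = T(1) = c and k + 1 = \log_b n.
\[
T(n) = c\,a^{k+1} + cn \sum_{i=0}^{k} \left(\frac{a}{b}\right)^{i},
\qquad a^{k+1} = a^{\log_b n} = n^{\log_b a}.
\]
% Case a < b: the geometric sum is bounded by a constant and n^{\log_b a} < n.
\[
a < b:\quad 1 \le \sum_{i=0}^{k} \left(\frac{a}{b}\right)^{i} < \frac{b}{b-a}
\;\Longrightarrow\; T(n) = \Theta(n).
\]
% Case a = b: every term of the sum equals 1 and a^{k+1} = n.
\[
a = b:\quad \sum_{i=0}^{k} 1 = k + 1 = \log_b n
\;\Longrightarrow\; T(n) = cn + cn\log_b n = \Theta(n \log n).
\]
% Case a > b: the sum is dominated by its largest term.
\[
a > b:\quad \sum_{i=0}^{k} \left(\frac{a}{b}\right)^{i}
= \frac{(a/b)^{k+1} - 1}{(a/b) - 1} = \Theta\!\left(\frac{a^{k+1}}{n}\right)
\;\Longrightarrow\; T(n) = \Theta\!\left(a^{k+1}\right) = \Theta\!\left(n^{\log_b a}\right).
\]
```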

  6. Sorting Recap • Selection sort: always O(n^2) • Insertion sort: the total time is O(n + # inversions). This is O(n^2) in the worst case and average case, but might be smaller if the file is almost sorted. • Bubble sort: between insertion and selection sort in terms of running time. • These are all easy to code but O(n^2) on average • We can do better
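
The O(n + # inversions) claim for insertion sort can be observed directly: each element shift removes exactly one inversion. The sketch below is illustrative only; `count_inversions` is a brute-force helper written for this example.

```python
def insertion_sort(items):
    """Sort a copy of items; return (sorted_list, number_of_shifts)."""
    a = list(items)
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements one slot right; each shift fixes one inversion.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = key
    return a, shifts

def count_inversions(items):
    """Brute-force count of pairs (i, j) with i < j and items[i] > items[j]."""
    n = len(items)
    return sum(1 for i in range(n) for j in range(i + 1, n) if items[i] > items[j])

data = [3, 1, 2, 5, 4]
sorted_data, shifts = insertion_sort(data)
print(sorted_data, shifts, count_inversions(data))
```

On an almost-sorted input the shift count stays small, which is why insertion sort does well there; on a reversed input it reaches n(n-1)/2.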

  7. Sorting Recap ctd. • Better algorithms • Heapsort: O(n log n) worst-case. Can be done in place in an array. • Mergesort: O(n log n) worst-case. Simple divide-and-conquer: split into left and right halves, recursively sort both halves, and then merge the results. The running time is described by the recurrence T(n) = 2T(n/2) + cn, which solves to O(n log n). • Quicksort: O(n^2) worst-case but O(n log n) average-case. • If you always pick the leftmost element as pivot, then this is a lot like inserting into a binary search tree • Cost is like the sum of the depths of the nodes • How can we avoid the worst case for any data set? Hint: Randomize • Why is quicksort better than mergesort? Hint: faster inner loop
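
The "Randomize" hint can be sketched as follows: choosing the pivot uniformly at random means no fixed input (such as an already sorted list) reliably triggers the O(n^2) behavior. This is one common formulation using the Lomuto partition scheme, not necessarily the version used in the course.

```python
import random

def partition(a, lo, hi):
    """Partition a[lo..hi] around a random pivot; return the pivot's final index."""
    r = random.randint(lo, hi)        # inclusive on both ends
    a[r], a[hi] = a[hi], a[r]         # park the pivot at the end
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]         # move the pivot between the two parts
    return i

def quicksort(a, lo=0, hi=None):
    """Sort the list a in place."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    p = partition(a, lo, hi)
    quicksort(a, lo, p - 1)
    quicksort(a, p + 1, hi)

data = list(range(20))                # sorted input: bad for a leftmost pivot
quicksort(data)
print(data == list(range(20)))
```

Randomization makes the expected running time O(n log n) for every input; the worst case is still O(n^2), but it now depends on unlucky random choices rather than on the data.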

  8. Lower Bound for the Sorting Problem

  9. How fast can we sort? • We have seen several sorting algorithms with O(N log N) running time. • Can we do better than N log N? • In fact, Ω(N log N) is a general lower bound for comparison-based sorting algorithms. • A proof appears in Weiss. • Informally we can argue as follows…

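  10. Decision tree for sorting [Figure: decision tree for sorting three elements a, b, c. The root holds all six orderings (a<b<c, a<c<b, b<a<c, b<c<a, c<a<b, c<b<a); each internal node asks one comparison (a<b, a<c, or b<c) and splits the remaining orderings between its two children; each leaf holds a single ordering.] • A decision tree for sorting N elements has N! leaves. • So the tree has height at least log(N!). • log(N!) = Θ(N log N).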

  11. Our lower bound argument • We make the following observations: • For any two different permutations P1, P2 of the input, the algorithm must at some point make a comparison that causes it to do different things on the two permutations (otherwise, they wouldn't both end up sorted) • Each comparison has only two outcomes (it's a YES/NO question) • So there must be some permutation that causes the algorithm to ask at least log(n!) questions • So we have argued that comparison-based sorting is both O(n log n) (mergesort and heapsort achieve this) and Ω(n log n) • So this is Θ(n log n).
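
To get a feel for the bound, the sketch below computes ceil(log2(n!)), the minimum worst-case number of comparisons the decision-tree argument allows, and prints it next to n log2 n. The function name `min_comparisons` is made up for this example.

```python
import math

def min_comparisons(n):
    """ceil(log2(n!)): a binary tree with n! leaves has at least this height."""
    # For an integer x >= 1, (x - 1).bit_length() equals ceil(log2(x)) exactly,
    # which avoids floating-point rounding on very large factorials.
    return (math.factorial(n) - 1).bit_length()

for n in (3, 10, 100, 1000):
    print(n, min_comparisons(n), round(n * math.log2(n)))
```

For n = 3 the bound is 3 comparisons, matching the height of the six-leaf decision tree for a, b, c. For larger n the two printed columns grow at the same rate, which is the content of log(n!) = Θ(n log n).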

  12. Summary on sorting bound • If we are restricted to comparisons on pairs of elements, then the general lower bound for sorting is Ω(N log N). • A decision tree is a representation of the possible comparisons required to solve a problem.

  13. Bucket Sort

  14. Non-comparison-based sorting • If we can use more than just comparisons of pairs of elements, we can sometimes sort more quickly. • A simple example is bucket sort. • In bucket sort, we require the additional knowledge that all elements are non-negative integers less than a specified maximum value.

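  15. Bucket sort [Figure: bucket sort example on small integer keys (the values 1, 2, and 3 appear); the original layout is not recoverable from the transcript.]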

  16. Implementing Bucket Sort • Assume all values are in the range 0..k for some small k. • Make an array of k+1 linked lists, one per possible value • Insert each item into array[item.value()] • Make one pass over the array and collect all items • This is an O(N + k) algorithm
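
The steps above can be sketched as follows. This version assumes keys in 0..k inclusive (hence k+1 buckets) and uses Python lists in place of linked lists; the `key` parameter is an addition for this example so that records, not just bare integers, can be sorted.

```python
def bucket_sort(items, k, key=lambda x: x):
    """Sort items whose integer keys lie in 0..k (inclusive) in O(N + k) time."""
    buckets = [[] for _ in range(k + 1)]
    for item in items:
        # Appending keeps equal-key items in their input order (stability).
        buckets[key(item)].append(item)
    result = []
    for bucket in buckets:            # one pass over the buckets, smallest key first
        result.extend(bucket)
    return result

print(bucket_sort([1, 3, 3, 1, 2], k=3))

# Records with equal keys keep their original relative order.
records = [(1, "a"), (0, "b"), (1, "c"), (0, "d")]
print(bucket_sort(records, k=1, key=lambda r: r[0]))
```

Filling the buckets costs O(N) and collecting them costs O(N + k), which is where the O(N + k) total comes from.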

  17. Bucket sort characteristics • Runs in O(N + k) time, which is O(N) when k is at most proportional to N. • Easy to implement each bucket as a linked list. • Is stable: • If two elements (A, B) are equal with respect to sorting, and they appear in the input in order (A, B), then they remain in the same order in the output.

  18. Radix Sort • If your integers are in a larger range, then do bucket sort on each digit • Start by sorting on the low-order digit using a STABLE bucket sort. • Then do the next-lowest digit, and so on • If the items are b digits long (or b bytes long for strings), then the time to sort N items is O(Nb).

  19. Radix sort Example • A sorting algorithm that goes beyond comparison: radix sort. • Each sorting step must be stable.
     Input (2 0 5 1 7 3 4 6):        010 000 101 001 111 011 100 110
     After pass 1 (low-order bit):   010 000 100 110 101 001 111 011
     After pass 2 (middle bit):      000 100 101 001 010 110 111 011
     After pass 3 (high-order bit):  000 001 010 011 100 101 110 111
     Output:                         0 1 2 3 4 5 6 7

  20. Radix sort characteristics • Each sorting step can be performed via bucket sort, and is thus O(N). • If the numbers are all b bits long, then there are b sorting steps. • Hence, radix sort is O(bN). • Also, radix sort can be implemented in-place (just like quicksort).

  21. Not just for binary numbers • Radix sort can be used for decimal numbers and alphanumeric strings.
     Input:                            032 224 016 015 031 169 123 252
     After pass 1 (ones digit):        031 032 252 123 224 015 016 169
     After pass 2 (tens digit):        015 016 123 224 031 032 252 169
     After pass 3 (hundreds digit):    015 016 031 032 123 169 224 252
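
A least-significant-digit radix sort along the lines described in the slides can be sketched like this. It assumes non-negative integers and is not an in-place version; the `base` parameter is an addition for this example so that the same code covers both binary and decimal digits.

```python
def radix_sort(numbers, base=10):
    """LSD radix sort for non-negative integers: one stable bucket pass per digit."""
    result = list(numbers)
    if not result:
        return result
    largest = max(result)
    place = 1                                   # 1, base, base^2, ...
    while place <= largest:
        buckets = [[] for _ in range(base)]
        for x in result:
            digit = (x // place) % base
            buckets[digit].append(x)            # append keeps each pass stable
        result = [x for bucket in buckets for x in bucket]
        place *= base
    return result

print(radix_sort([32, 224, 16, 15, 31, 169, 123, 252]))    # decimal digits
print(radix_sort([2, 0, 5, 1, 7, 3, 4, 6], base=2))        # binary digits
```

Each pass is a bucket sort on one digit, so with b digits the total work is O(bN) for a fixed base. Stability of each pass is what preserves the ordering established by the lower-order digits.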

  22. Thursday and Next Week • We will do a review for midterm on Thursday • Midterm test is Tuesday March 4th • We will post some old exams on Bb. • We will have online office hours next week • Work on HW4 – Ask questions early
