
Lecture 26: Bucket Sort & Radix Sort


Presentation Transcript


  1. CSC 213 – Large Scale Programming • Lecture 26: Bucket Sort & Radix Sort

  2. Today’s Goals • Review discussion of merge sort and quick sort • How do they work & why divide-and-conquer? • Are they the fastest possible sorts? • Another way to sort data presented • How can we sort data with a single simple value? • What are the limits on using buckets to sort our data? • If we want more buckets, can we expand these limits? • How does radix sort work? How long does it need?

  3. Quick Sort v. Merge Sort
     Quick Sort: • Divide data around pivot • Want pivot to be near middle • All comparisons occur here • Conquer with recursion • Does not need extra space • Merge usually done already • Data already sorted!
     Merge Sort: • Divide data blindly in half • Always gets even split • No comparisons performed! • Conquer with recursion • Needs* to use other arrays • Merge combines solutions • Compares from (sorted) halves

  4. Complexity of Sorting • A comparison sort's decision tree has n! external nodes, so the binary tree’s height is at least log2(n!), which is Ω(n log n)
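The bound on slide 4 follows from a short calculation. This is a sketch, assuming a binary decision tree of height h whose n! external nodes each correspond to one possible ordering of the input (the symbol h is introduced here for illustration):

```latex
\begin{align*}
2^{h} &\ge n! \\
h &\ge \log_2 (n!) = \sum_{i=1}^{n} \log_2 i
   \ge \sum_{i=\lceil n/2 \rceil}^{n} \log_2 i
   \ge \frac{n}{2}\,\log_2 \frac{n}{2} \\
h &= \Omega(n \log n)
\end{align*}
```

The last sum has at least n/2 terms, each at least log2(n/2), which gives the final inequality.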

  5. Bucket-Sort • Buckets, B, is an array of Sequences • Sorts Collection, C, in two phases: • Remove each element v from C & add to B[v] • Move elements from each bucket back to C


  7. Bucket-Sort Algorithm
     Algorithm bucketSort(Sequence<Integer> C)
       B = new Sequence[10]     // & instantiate each Sequence
       // Phase 1
       for each element v in C
         B[v].addLast(v)        // Assumes each number in C between 0 & 9
       endfor
       // Phase 2
       loc = 0
       for each Sequence b in B
         for each element v in b
           C.set(loc, v)
           loc += 1
         endfor
       endfor
       return C
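The two phases of the pseudocode can be sketched in Python. This is a minimal illustration rather than the course's own code: the function name `bucket_sort`, the `num_buckets` parameter, and the range check are assumptions, and it returns a new list instead of writing back into C.

```python
def bucket_sort(values, num_buckets=10):
    """Sort integers in range(num_buckets) without comparing them to each other."""
    buckets = [[] for _ in range(num_buckets)]  # one empty bucket per legal value

    # Phase 1: drop each value into the bucket it indexes.
    for v in values:
        if not 0 <= v < num_buckets:
            raise ValueError(f"{v} is not a legal bucket index")
        buckets[v].append(v)

    # Phase 2: read the buckets back in index order.
    result = []
    for bucket in buckets:
        result.extend(bucket)
    return result


print(bucket_sort([7, 1, 3, 7, 0, 9, 3]))  # [0, 1, 3, 3, 7, 7, 9]
```

Each element is handled once in each phase, so the work is proportional to the number of elements plus the number of buckets.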


  11. Bucket Sort Properties • For this to work, values must be legal indices • Non-negative integer indices needed to access arrays • Sorting occurs without comparing objects • Bucket-sort is a stable sort • Stable sorts preserve the relative ordering of objects with the same value • (Bubble-sort & Merge-sort are other stable sorts)
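Stability is easiest to see when each element carries extra data alongside the value used as the bucket index. The sketch below assumes a hypothetical helper `bucket_sort_by_key` that takes a function returning the bucket index; the records and their labels are made up for illustration.

```python
def bucket_sort_by_key(items, key, num_buckets):
    """Bucket-sort items by key(item), which must be an int in range(num_buckets)."""
    buckets = [[] for _ in range(num_buckets)]
    for item in items:
        buckets[key(item)].append(item)  # appended in arrival order
    return [item for bucket in buckets for item in bucket]


# Records with equal numbers keep their original relative order.
records = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]
by_number = bucket_sort_by_key(records, key=lambda r: r[0], num_buckets=3)
print(by_number)  # [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```

Because elements are appended to a bucket and later read out in the same order, "b" stays ahead of "d" and "a" stays ahead of "c".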

  12. Bucket Sort Extensions • Use Comparator for Bucket-sort • Get index for v using compare(v, null) • Comparator for booleans could return • 0 when v is false • 1 when v is true • Comparator for US states could return • Annual per capita consumption of Jello • Consumption of Jello overall, in cubic feet • State’s ranking by population

  13. Bucket Sort Extensions • Of the options for US states, the state’s ranking by population is the one that works as a bucket index


  15. Bucket Sort Extensions • Extended Bucket-sort works with many types • Limited set of data needed for this to work • Need way to enumerate values of the set ("enumerate" is a subtle hint)
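One way to picture the extension: any type works as long as some function maps each value to a small non-negative integer. The sketch below is an assumption-laden illustration, not the lecture's Comparator approach; `bucket_sort_enumerated`, its `index_of` parameter, and the `Size` enumeration are all hypothetical names.

```python
from enum import IntEnum


class Size(IntEnum):  # hypothetical enumerated type with three values
    SMALL = 0
    MEDIUM = 1
    LARGE = 2


def bucket_sort_enumerated(items, index_of, num_values):
    """Bucket-sort items whose values can be enumerated as 0 .. num_values - 1."""
    buckets = [[] for _ in range(num_values)]
    for item in items:
        buckets[index_of(item)].append(item)
    return [item for bucket in buckets for item in bucket]


# Booleans: False maps to 0, True maps to 1.
flags = bucket_sort_enumerated([True, False, True, False], index_of=int, num_values=2)
print(flags)  # [False, False, True, True]

# Enumerated values: each member's integer value is its bucket index.
sizes = bucket_sort_enumerated(
    [Size.LARGE, Size.SMALL, Size.MEDIUM, Size.SMALL], index_of=int, num_values=len(Size)
)
```

A quantity such as per capita consumption would not fit this scheme directly, since it is not a small set of integers that can be listed in advance.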

  16. d-Tuples • Combination of d values such as (k1, k2, …, kd) • ki is the ith dimension of the tuple • A point (x, y, z) is a 3-tuple • x is 1st dimension’s value • Value of 2nd dimension is y • z is 3rd dimension’s value

  17. Lexicographic Order • Assume a & b are both d-tuples • a = (a1, a2, …, ad) • b = (b1, b2, …, bd) • Can say a < b if and only if • a1 < b1 OR • a1 = b1 && (a2, …, ad) < (b2, …, bd) • Order these 2-tuples using previous definition: (3, 4) (7, 8) (3, 2) (1, 4) (4, 8)

  18. Lexicographic Order • Ordering the 2-tuples (3, 4) (7, 8) (3, 2) (1, 4) (4, 8) by this definition gives: (1, 4) (3, 2) (3, 4) (4, 8) (7, 8)
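The recursive definition translates almost line for line into code. This sketch assumes both tuples have the same length d; the function name `lex_less` is made up for illustration. Python's built-in tuple comparison is also lexicographic, so it is used here as a cross-check.

```python
def lex_less(a, b):
    """Return True when d-tuple a comes before d-tuple b in lexicographic order."""
    if not a:               # no dimensions left: the tuples are equal, so not less
        return False
    if a[0] != b[0]:        # first dimension decides whenever it differs
        return a[0] < b[0]
    return lex_less(a[1:], b[1:])  # tie on first dimension: compare the rest


tuples = [(3, 4), (7, 8), (3, 2), (1, 4), (4, 8)]
print(sorted(tuples))  # [(1, 4), (3, 2), (3, 4), (4, 8), (7, 8)]
print(all(lex_less(a, b) == (a < b) for a in tuples for b in tuples))  # True
```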

  19. Radix-Sort • Very fast sort for data expressed as d-tuple • Cheats to win: it does not compare elements, so it can beat the lower bound for comparison-based sorting • Sort performed using d calls to bucket sort • Sorts from least to most important dimension of tuple • Luckily lots of data are d-tuples • String is d-tuple of char: “L E T T E R S”, “L I N G E R S”

  20. Radix-Sort • Luckily lots of data are d-tuples • Digits of an int can also be used for sorting: 1 0 0 1 3 7 2 9 1 0 0 9 2 2 1 0
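Treating a string as a d-tuple of characters, radix sort runs one stable bucket pass per character position, starting from the last. The sketch below assumes equal-length words made only of the letters A to Z; the function name `radix_sort_words` and the third word in the example are additions for illustration.

```python
def radix_sort_words(words, length):
    """Radix-sort equal-length strings of letters 'A'..'Z', last position first."""
    for pos in range(length - 1, -1, -1):   # least important dimension first
        buckets = [[] for _ in range(26)]   # one bucket per letter
        for w in words:
            buckets[ord(w[pos]) - ord("A")].append(w)
        words = [w for bucket in buckets for w in bucket]  # stable read-back
    return words


print(radix_sort_words(["LINGERS", "LETTERS", "LANTERN"], 7))
# ['LANTERN', 'LETTERS', 'LINGERS']
```

Because each pass is stable, the order established by later positions survives whenever earlier positions tie.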

  21. Radix-Sort For Integers • Represent int as a d-tuple of digits: 62₁₀ = 111110₂ and 04₁₀ = 000100₂ • Sorting by decimal digits needs 10 buckets • Sorting by bits needs 2 buckets • O(d∙n) time needed to run Radix-sort • d is length of longest element in input • In most cases value of d is constant (d = 31 for int) • Radix sort takes O(n) time when d is treated as a constant
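The digit tuples on this slide can be computed with repeated division. A small sketch, assuming non-negative values that fit in d digits; the function name `to_digits` is hypothetical.

```python
def to_digits(value, base, d):
    """Return value as a d-tuple of digits in the given base, most significant first."""
    digits = []
    for _ in range(d):
        digits.append(value % base)  # peel off the least significant digit
        value //= base
    return tuple(reversed(digits))


print(to_digits(62, 2, 6))   # (1, 1, 1, 1, 1, 0)
print(to_digits(4, 2, 6))    # (0, 0, 0, 1, 0, 0)
print(to_digits(62, 10, 2))  # (6, 2)
```

The base sets the number of buckets per pass (2 for bits, 10 for decimal digits), and d sets the number of passes.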

  22. Radix-Sort In Action • List of 4-bit integers sorted using Radix-sort 1001 0010 1101 0001 1110

  23-26. Radix-Sort In Action • List of 4-bit integers sorted using Radix-sort, one bucket-sort pass per bit, least significant bit first:

      Input   After bit 0   After bit 1   After bit 2   After bit 3
      1001    0010          1001          1001          0001
      0010    1110          1101          0001          0010
      1101    1001          0001          0010          1001
      0001    1101          0010          1101          1101
      1110    0001          1110          1110          1110
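The passes shown on these slides can be reproduced with a two-bucket pass per bit. This is a sketch under the assumption that each pass is a stable split into "bit is 0" followed by "bit is 1"; the helper name `bucket_sort_by_bit` and the `trace` list are made up for illustration.

```python
def bucket_sort_by_bit(values, bit):
    """One stable pass: values whose given bit is 0 come first, then those with 1."""
    zeros = [v for v in values if (v >> bit) & 1 == 0]
    ones = [v for v in values if (v >> bit) & 1 == 1]
    return zeros + ones


values = [0b1001, 0b0010, 0b1101, 0b0001, 0b1110]
trace = [values]
for bit in range(4):              # bit 0 is the least significant
    values = bucket_sort_by_bit(values, bit)
    trace.append(values)

for step in trace:
    print(" ".join(format(v, "04b") for v in step))
# 1001 0010 1101 0001 1110
# 0010 1110 1001 1101 0001
# 1001 1101 0001 0010 1110
# 1001 0001 0010 1101 1110
# 0001 0010 1001 1101 1110
```

Each printed row is the whole list after one more pass, so the last row is the sorted result.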

  27. Radix-Sort
     Algorithm radixSort(Sequence<Integer> C)
       // Works from least to most significant value
       for bit = 0 to 30
         C = bucketSort(C, bit)   // Sort C using the specified bit
       endfor
       return C
     • What is big-Oh complexity for Radix-Sort? • Call in loop uses each element twice • Loop repeats once per digit to complete sort

  28. Radix-Sort • What is big-Oh complexity for Radix-Sort? • Call in loop uses each element twice: O(n) • Loop repeats once per digit to complete sort: × O(1) • Total: O(n)

  29. Radix-Sort • What is big-Oh complexity for Radix-Sort? • Call in loop uses each element twice: O(n) • Loop repeats once per digit to complete sort: × O(1), or is it O(log n) times (?) • Total: O(n) if the number of digits is constant, O(n log n) if it grows as log n
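The radixSort pseudocode can be sketched as a runnable function. This is a minimal illustration, assuming non-negative integers below 2**31; the name `radix_sort` and the `bits` parameter are additions, and the two-bucket pass is written inline rather than as a separate bucketSort(C, bit) call.

```python
def radix_sort(values, bits=31):
    """Radix-sort non-negative ints below 2**bits, least significant bit first."""
    for bit in range(bits):                # bit = 0 to 30 when bits == 31
        buckets = ([], [])                 # two buckets: the bit is 0 or 1
        for v in values:                   # each element goes into a bucket...
            buckets[(v >> bit) & 1].append(v)
        values = buckets[0] + buckets[1]   # ...and comes back out: O(n) per pass
    return values


print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

With `bits` fixed, the loop runs a constant number of times and the total is O(n). The question mark on the last slide has a basis, though: n distinct non-negative values need at least log2(n) bits to tell apart, so when the keys are distinct the number of passes cannot stay below log2(n) as n grows.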

  30. For Next Lecture • Start thinking about test cases for program #2 • Wed. is next deadline when these must be submitted • Spend time on this: tests & design save coding time • Tuesday deadline for weekly assignment • For Wednesday, review index files, Set & sorts • Quiz will be like others this term with mix of problems
