This article discusses the importance of sorting algorithms, their relevance in problem-solving, and provides explanations, implementations, and analysis of various popular sorting algorithms. Learn more about bubble sort, insertion sort, selection sort, shell sort, heap sort, merge sort, quick sort, counting sort, and radix sort.
Sorting HKOI Training Team (Advanced) 2006-01-21
What is sorting? • Given: A list of n elements: A1,A2,…,An • Re-arrange the elements to make them follow a particular order, e.g. • Ascending Order: A1 ≤ A2 ≤ … ≤ An • Descending Order: A1 ≥ A2 ≥ … ≥ An • We will talk about sorting in ascending order only
Why is sorting needed? • Some algorithms work only when the data is sorted • e.g. binary search • Better presentation of data • Often required by problem setters, to reduce workload in judging
Why learn Sorting Algorithms? • C++ STL already provides a sort() function • Unfortunately, there is no such implementation for Pascal • This is a minor point, though
Why learn Sorting Algorithms? • Most importantly, OI problems do not directly ask for sorting, but their solutions may be closely linked with sorting algorithms • In most cases, C++ STL sort() is useless: you still need to write your own “sort” • So… it is important to understand the idea behind each algorithm, and also their strengths and weaknesses
Some Sorting Algorithms… • Bubble Sort • Insertion Sort • Selection Sort • Shell Sort • Heap Sort • Merge Sort • Quick Sort • Counting Sort • Radix Sort How many of them do you know?
Bubble, Insertion, Selection… • Simple, in terms of • Idea, and • Implementation • Unfortunately, they are inefficient • O(n^2): not good if n is large • The algorithms taught today are far more efficient than these
Shell Sort • Named after its inventor, Donald Shell • Observation: Insertion Sort is very efficient when • n is small • the list is almost sorted
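The observation about Insertion Sort can be checked with a minimal sketch that counts how many element shifts the sort performs. The function name `insertion_sort_shifts` and the two sample lists are illustrative assumptions, not part of the original slides.

```python
def insertion_sort_shifts(a):
    """Sort a copy of a with Insertion Sort and return (sorted_list, shifts)."""
    a = list(a)
    shifts = 0
    for i in range(1, len(a)):
        value = a[i]
        j = i
        # Shift larger elements one position to the right
        while j > 0 and a[j - 1] > value:
            a[j] = a[j - 1]
            j -= 1
            shifts += 1
        a[j] = value
    return a, shifts


print(insertion_sort_shifts([1, 2, 3, 5, 4]))  # ([1, 2, 3, 4, 5], 1)
print(insertion_sort_shifts([5, 4, 3, 2, 1]))  # ([1, 2, 3, 4, 5], 10)
```

The almost-sorted list needs a single shift, while the reversed list of the same length needs ten.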
Shell Sort • Example list: 2 1 4 7 4 8 3 6 4 7 • Divide the list into k non-contiguous segments • Elements in each segment are k positions apart • In the beginning, choose a large k so that each segment contains only a few elements (e.g. k=n/2) • Sort each segment with Insertion Sort
Shell Sort • Definition: A list is said to be “k-sorted” when A[i] ≤ A[i+k] for 1 ≤ i ≤ n-k • Now the list is 5-sorted: 2 1 4 4 4 8 3 6 7 7
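The definition of “k-sorted” translates directly into a small check. This is only a sketch: the helper name `is_k_sorted` is an assumption, and it uses 0-based indices rather than the 1-based indices of the slide.

```python
def is_k_sorted(a, k):
    """Return True if a[i] <= a[i + k] for every valid 0-based index i."""
    return all(a[i] <= a[i + k] for i in range(len(a) - k))


five_sorted = [2, 1, 4, 4, 4, 8, 3, 6, 7, 7]
print(is_k_sorted(five_sorted, 5))  # True
print(is_k_sorted(five_sorted, 1))  # False: the list is not fully sorted yet
```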
Shell Sort • After each pass, reduce k (e.g. by half) • Although the number of elements in each segment increases, the segments are usually mostly sorted • Sort each segment with Insertion Sort again • [Animation: the elements of each segment are inserted into place one at a time]
Shell Sort • Finally, k is reduced to 1 • The list looks mostly sorted • Perform a 1-sort, i.e. the ordinary Insertion Sort • Result: 1 2 3 4 4 4 6 7 7 8
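Putting the passes together, Shell Sort can be sketched as a gapped Insertion Sort that starts with k = n/2 and halves k after every pass. The function name `shell_sort` is an assumption, and this is one straightforward way to write it rather than the version used in the training session.

```python
def shell_sort(a):
    """Sort list a in place with Shell Sort, halving the gap k each pass."""
    n = len(a)
    k = n // 2
    while k >= 1:
        # Gapped insertion sort: after this loop the list is k-sorted
        for i in range(k, n):
            value = a[i]
            j = i
            while j >= k and a[j - k] > value:
                a[j] = a[j - k]
                j -= k
            a[j] = value
        k //= 2  # the last pass (k = 1) is an ordinary Insertion Sort
    return a


print(shell_sort([2, 1, 4, 7, 4, 8, 3, 6, 4, 7]))
# [1, 2, 3, 4, 4, 4, 6, 7, 7, 8]
```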
Shell Sort – Worse than Ins. Sort? • In Shell Sort, we still have to perform an Insertion Sort at the end • A lot of operations are done before the final Insertion Sort • Isn’t it worse than Insertion Sort?
Shell Sort – Worse than Ins. Sort? • The final Insertion Sort is more efficient than before, because the list is already mostly sorted • All sorting operations before the final one are done efficiently • k-sorts compare far-apart elements • Elements “move” faster, reducing the amount of movement and comparison
Shell Sort – Increment Sequence • In our example, k starts with n/2 and halves in each pass until it reaches 1, i.e. {n/2, n/4, n/8, …, 1} • This is called the “Shell sequence” • In a good increment sequence, adjacent increments should be relatively prime to each other • Hibbard’s Sequence: {2^m - 1, 2^(m-1) - 1, …, 7, 3, 1}
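The two increment sequences can be generated as follows. This is a sketch: the function names are assumptions, and so is the choice to keep only Hibbard increments smaller than n.

```python
from math import gcd


def shell_increments(n):
    """Shell sequence: n/2, n/4, ..., 1 (integer division)."""
    seq = []
    k = n // 2
    while k >= 1:
        seq.append(k)
        k //= 2
    return seq


def hibbard_increments(n):
    """Increments of the form 2^m - 1 that are smaller than n, largest first."""
    seq = []
    m = 1
    while (1 << m) - 1 < n:
        seq.append((1 << m) - 1)
        m += 1
    return seq[::-1]


print(shell_increments(16))     # [8, 4, 2, 1]
print(hibbard_increments(100))  # [63, 31, 15, 7, 3, 1]

# Adjacent Hibbard increments share no common factor
h = hibbard_increments(100)
print(all(gcd(x, y) == 1 for x, y in zip(h, h[1:])))  # True
```

For n = 16 the Shell sequence consists of powers of two only, so elements at odd and even positions are never compared until the final pass.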
Shell Sort – Analysis • Average Complexity: O(n^1.5) • Worst case of Shell Sort with the Shell sequence: O(n^2) • When will it happen?
Heap Sort • In Selection Sort, we scan the entire list to search for the maximum, which takes O(n) time • Is there a better way to get the maximum? • With the help of a heap, we may reduce the time to extract the maximum to O(lg n)
Heap Sort – Build Heap • Step 1: Create a heap from the list 2 8 5 7 1 4 • Resulting heap: 8 7 5 2 1 4
Heap Sort • Step 2: Pick the maximum, restore the heap property
Heap Sort • Repeat step 2 until the heap is empty • Heap | sorted part after each step: • 7 4 5 2 1 | 8 • 5 4 1 2 | 7 8 • 4 2 1 | 5 7 8 • 2 1 | 4 5 7 8 • 1 | 2 4 5 7 8
Heap Sort – Analysis • Complexity: O(n lg n) • Not a stable sort • Difficult to implement
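The two steps of Heap Sort can be sketched with an array-based max-heap, where the children of index i sit at 2i+1 and 2i+2. The names `sift_down` and `heap_sort` are assumptions, and this is one common in-place formulation rather than the session's own code.

```python
def sift_down(a, root, size):
    """Restore the max-heap property for the subtree rooted at index root."""
    while True:
        child = 2 * root + 1
        if child >= size:
            return
        # Pick the larger of the two children
        if child + 1 < size and a[child + 1] > a[child]:
            child += 1
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child


def heap_sort(a):
    """Sort list a in place in ascending order."""
    n = len(a)
    # Step 1: build the heap, starting from the last non-leaf node
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    # Step 2: move the maximum to the end, shrink the heap, restore it
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
    return a


print(heap_sort([2, 8, 5, 7, 1, 4]))  # [1, 2, 4, 5, 7, 8]
```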
Merging • Given two sorted lists, merge them to form a new sorted list • A naïve approach: Append the second list to the first list, then sort the result • Slow, takes O(n lg n) time • Is there a better way?
Merging • We make use of a property of sorted lists: The first element is always the minimum • What does that imply? • An additional array is needed to store the temporary merged list • Repeatedly pick the smallest of the un-inserted numbers and append it to the merged list
Merging • List A: 1 3 7 9 • List B: 2 3 6 • Temp: (initially empty)
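Merging List A and List B into the temporary list can be sketched as below. Since the first un-inserted element of each sorted list is its minimum, only those two candidates need to be compared at each step. The function name `merge` is an assumption.

```python
def merge(a, b):
    """Merge two sorted lists into a new sorted list in O(len(a) + len(b))."""
    merged = []
    i = j = 0
    while i < len(a) and j < len(b):
        # Using <= takes from a first on ties, which keeps the merge stable
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # One list is exhausted; the rest of the other is already sorted
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


print(merge([1, 3, 7, 9], [2, 3, 6]))  # [1, 2, 3, 3, 6, 7, 9]
```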
Merge Sort • Merge sort follows the divide-and-conquer approach • Divide: Divide the n-element sequence into two (n/2)-element subsequences • Conquer: Sort the two subsequences recursively • Combine: Merge the two sorted subsequences to produce the answer
Merge Sort 1. Divide the list into two: 2 8 5 | 7 1 4 2. Call Merge Sort recursively to sort the two subsequences: 2 5 8 | 1 4 7
Merge Sort 3. Merge the two sorted subsequences into a temporary array: 1 2 4 5 7 8 4. Move the elements back to the list
Merge Sort – Analysis • Complexity: O(n lg n) • Stable Sort • What is a stable sort? • Not an “In-place” sort • i.e. Additional memory required • Easy to implement, no knowledge of other data structures needed
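A minimal sketch of Merge Sort following the divide, conquer and combine steps. For brevity it returns a new list instead of copying the temporary array back into the original, which is a simplification of the procedure described above; the function name `merge_sort` is an assumption.

```python
def merge_sort(a):
    """Return a new sorted list containing the elements of a."""
    if len(a) <= 1:
        return list(a)
    # Divide
    mid = len(a) // 2
    # Conquer
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    # Combine: merge the two sorted halves into a temporary list
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= keeps equal keys in original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([2, 8, 5, 7, 1, 4]))  # [1, 2, 4, 5, 7, 8]
```

The `merged` list is the additional memory that makes this version not in-place.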
Stable Sort • What is a stable sort? Which one of the following? • The name of a sorting algorithm • A sorting algorithm that has stable performance over all distributions of elements, i.e. Best ≈ Average ≈ Worst • A sorting algorithm that preserves the original order of duplicated keys (this is the correct definition)
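Stable Sort • Two elements a and b have equal keys (3a and 3b) • Original list: a comes before b • Stable sort: a still comes before b • Unstable sort: b may end up before a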
Stable Sort • Which sorting algorithms are stable? • Bubble Sort • Selection Sort • Insertion Sort • Shell Sort • Heap Sort • Merge Sort
Stable Sort • In our previous example, what is the difference between 3a and 3b? • When will stable sort be more useful? • Sorting records • Multiple keys
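The usefulness of stability for records and multiple keys can be illustrated with Python's built-in `sorted()`, which is documented to be stable. The records below are hypothetical examples, not data from the slides.

```python
# Records are (key, label); 3a and 3b share the same key
records = [(3, "a"), (1, "x"), (3, "b"), (2, "y")]
by_key = sorted(records, key=lambda r: r[0])
print(by_key)  # [(1, 'x'), (2, 'y'), (3, 'a'), (3, 'b')]

# Multiple keys: sort by the secondary key first, then by the primary key.
# Because the second sort is stable, names stay ordered within each class.
students = [("Chan", 2), ("Lee", 1), ("Au", 2), ("Ho", 1)]
by_name = sorted(students, key=lambda s: s[0])
by_class_then_name = sorted(by_name, key=lambda s: s[1])
print(by_class_then_name)
# [('Ho', 1), ('Lee', 1), ('Au', 2), ('Chan', 2)]
```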
Quick Sort • Quick Sort also uses the Divide-and-Conquer approach • Divide: Divide the list into two by partitioning • Conquer: Sort the two lists by calling Quick Sort recursively • Combine: Combine the two sorted lists
Quick Sort – Partitioning • Given: A list and a “pivot” (usually an element in the list) • Re-arrange the elements so that • Elements on the left-hand side of the pivot are less than the pivot, and • Elements on the right-hand side of the pivot are greater than or equal to the pivot • Resulting layout: [ < Pivot | Pivot | ≥ Pivot ]
Quick Sort – Partitioning • e.g. Take the first element as pivot • Scan with two indices: lo from the left and hi from the right • Swap all pairs of elements that meet the following criteria: • The left one is greater than or equal to the pivot • The right one is smaller than the pivot • Swap the pivot with A[hi] • Example list (pivot = 4): 4 6 7 0 9 3 9 4
Quick Sort • After partitioning: 0 3 | 4 | 7 9 6 9 4 • Apply Quick Sort on both lists: 0 3 | 4 | 4 6 7 9 9
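The partitioning scheme with the first element as pivot can be sketched as follows. The function names and the exact loop structure are assumptions; other partitioning schemes exist and would produce a different intermediate arrangement.

```python
def partition(a, left, right):
    """Partition a[left..right] around pivot a[left]; return the pivot's index."""
    pivot = a[left]
    lo = left + 1
    hi = right
    while True:
        # Move lo right to the next element >= pivot
        while lo <= hi and a[lo] < pivot:
            lo += 1
        # Move hi left to the next element < pivot
        while lo <= hi and a[hi] >= pivot:
            hi -= 1
        if lo > hi:
            break
        a[lo], a[hi] = a[hi], a[lo]
    # hi now points at the last element smaller than the pivot (or at left)
    a[left], a[hi] = a[hi], a[left]
    return hi


def quick_sort(a, left=0, right=None):
    """Sort list a in place."""
    if right is None:
        right = len(a) - 1
    if left < right:
        p = partition(a, left, right)
        quick_sort(a, left, p - 1)
        quick_sort(a, p + 1, right)
    return a


data = [4, 6, 7, 0, 9, 3, 9, 4]
print(partition(data, 0, len(data) - 1), data)  # 2 [0, 3, 4, 7, 9, 6, 9, 4]
print(quick_sort([4, 6, 7, 0, 9, 3, 9, 4]))     # [0, 3, 4, 4, 6, 7, 9, 9]
```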
Quick Sort – Analysis • Complexity • Best: O(n lg n) • Worst: O(n^2) • Average: O(n lg n) • When will the worst case happen? • How to avoid the worst case? • In-Place Sort • Not a stable sort
Counting Sort • Consider the following list of numbers 5, 4, 2, 1, 4, 3, 4, 2, 5, 1, 4, 5, 3, 2, 3, 5, 5 • Range of numbers = [1,5] • We may count the occurrences of each number • Number: 1 2 3 4 5 • Count: 2 3 3 4 5
Counting Sort (1) • With the frequency table, we can reconstruct the list in ascending order 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5
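This first version of Counting Sort can be sketched as below, assuming the range [low, high] of the numbers is known in advance. The function name `counting_sort` and its parameters are assumptions.

```python
def counting_sort(values, low, high):
    """Sort integers in the range [low, high] using a frequency table."""
    counts = [0] * (high - low + 1)
    for v in values:
        counts[v - low] += 1
    # Reconstruct the list: emit each number as many times as it occurred
    result = []
    for offset, count in enumerate(counts):
        result.extend([low + offset] * count)
    return result


data = [5, 4, 2, 1, 4, 3, 4, 2, 5, 1, 4, 5, 3, 2, 3, 5, 5]
print(counting_sort(data, 1, 5))
# [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5]
```

Note that the output is rebuilt from the counts alone, so nothing attached to the original elements survives.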
Counting Sort (1) • Can we sort records with this counting sort? • Is this sort stable?
Counting Sort (2) • An alternative way: use a cumulative frequency table and a temporary array • Given the following “records” • [Figure: the records with their frequency table and cumulative frequency table]
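The cumulative-frequency version can be sketched as follows. The records used here are hypothetical (key, label) pairs, and the function name and parameters are assumptions. Walking the input backwards is one common way to keep records with equal keys in their original order.

```python
def counting_sort_records(records, key, low, high):
    """Stable counting sort of records whose integer keys lie in [low, high]."""
    counts = [0] * (high - low + 1)
    for r in records:
        counts[key(r) - low] += 1
    # Turn counts into cumulative counts:
    # counts[i] = number of records with key <= low + i
    total = 0
    for i in range(len(counts)):
        total += counts[i]
        counts[i] = total
    # Place records into the temporary array, last record first,
    # so that equal keys keep their original relative order
    output = [None] * len(records)
    for r in reversed(records):
        idx = key(r) - low
        counts[idx] -= 1
        output[counts[idx]] = r
    return output


records = [(3, "a"), (1, "b"), (3, "c"), (2, "d"), (1, "e")]
print(counting_sort_records(records, key=lambda r: r[0], low=1, high=3))
# [(1, 'b'), (1, 'e'), (2, 'd'), (3, 'a'), (3, 'c')]
```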