Sorting HKOI Training Team (Advanced) 2006-01-21
What is sorting? • Given: A list of n elements: A_1, A_2, …, A_n • Re-arrange the elements to make them follow a particular order, e.g. • Ascending Order: A_1 ≤ A_2 ≤ … ≤ A_n • Descending Order: A_1 ≥ A_2 ≥ … ≥ A_n • We will talk about sorting in ascending order only
Why is sorting needed? • Some algorithms work only when the data is sorted • e.g. binary search • Better presentation of data • Often required by problem setters, to reduce workload in judging
Why learn Sorting Algorithms? • C++ STL already provides a sort() function • Unfortunately, there is no such implementation for Pascal • This is a minor point, though
Why learn Sorting Algorithms? • Most importantly, OI problems do not directly ask for sorting, but their solutions may be closely linked with sorting algorithms • In most such cases, calling C++ STL sort() alone will not solve the problem; you still need to write your own “sort” • So… it is important to understand the idea behind each algorithm, and also their strengths and weaknesses
Some Sorting Algorithms… • Bubble Sort • Insertion Sort • Selection Sort • Shell Sort • Heap Sort • Merge Sort • Quick Sort • Counting Sort • Radix Sort How many of them do you know?
Bubble, Insertion, Selection… • Simple, in terms of • Idea, and • Implementation • Unfortunately, they are inefficient • O(n^2) – not good if n is large • The algorithms taught today are far more efficient than these
Shell Sort • Named after its inventor, Donald Shell • Observation: Insertion Sort is very efficient when • n is small • the list is almost sorted
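The observation about Insertion Sort can be checked with a small experiment. The following is a minimal sketch, not taken from the slides: it counts how many element shifts Insertion Sort performs, and the two test lists are made up for illustration.

```python
def insertion_sort(a):
    """Sort list a in place (ascending) and return the number of element shifts."""
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift every element larger than key one slot to the right.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = key
    return shifts


# Illustrative inputs (assumptions, not from the slides).
nearly_sorted = [1, 2, 4, 3, 5, 6, 8, 7]
reversed_list = [8, 7, 6, 5, 4, 3, 2, 1]
print(insertion_sort(nearly_sorted))   # only a few shifts
print(insertion_sort(reversed_list))   # one shift per inverted pair
```

The number of shifts equals the number of out-of-order pairs in the input, which is why an almost sorted list is handled so cheaply.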
Shell Sort • Divide the list into k non-contiguous segments • Elements in each segment are k elements apart • In the beginning, choose a large k so that all segments contain a few elements (e.g. k=n/2) • Sort each segment with Insertion Sort [Figure: example list 2 1 4 7 4 8 3 6 4 7, with k=5]
Shell Sort • Definition: A list is said to be “k-sorted” when A[i] ≤ A[i+k] for 1 ≤ i ≤ n-k • Now the list is 5-sorted [Figure: the 5-sorted list 2 1 4 4 4 8 3 6 7 7]
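The definition of “k-sorted” translates directly into a check. This is a small sketch under one assumption: it uses 0-based indices, whereas the slide counts from 1. The function name `is_k_sorted` is made up for illustration.

```python
def is_k_sorted(a, k):
    """Return True when a[i] <= a[i + k] for every valid index i (0-based)."""
    return all(a[i] <= a[i + k] for i in range(len(a) - k))


five_sorted = [2, 1, 4, 4, 4, 8, 3, 6, 7, 7]
print(is_k_sorted(five_sorted, 5))  # every pair five apart is in order
print(is_k_sorted(five_sorted, 1))  # 1-sorted would mean fully sorted
```

A list that is 1-sorted is simply sorted, which is why the last pass of Shell Sort always uses k=1.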
Shell Sort • After each pass, reduce k (e.g. by half) • Although the number of elements in each segment increases, the segments are usually mostly sorted • Sort each segment with Insertion Sort again [Figure: 2-sorting the list 2 1 4 7 4 8 3 6 4 7, one insertion at a time; intermediate states shown: 2 1 3 7 4 8 4 6 4 7, then 2 1 3 6 4 7 4 8 4 7]
Shell Sort • Finally, k is reduced to 1 • The list looks mostly sorted • Perform 1-sort, i.e. the ordinary Insertion Sort [Figure: the 2-sorted list 2 1 3 6 4 7 4 7 4 8 becomes 1 2 3 4 4 4 6 7 7 8]
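The passes described above can be sketched as follows. This is one common way to write Shell Sort, assuming the increments n/2, n/4, …, 1 used in the example; it handles all k segments in a single sweep per pass rather than sorting them one after another, which gives the same result.

```python
def shell_sort(a):
    """Sort list a in place (ascending) with increments n/2, n/4, ..., 1."""
    n = len(a)
    k = n // 2
    while k >= 1:
        # Insertion Sort on elements that are k apart.
        for i in range(k, n):
            key = a[i]
            j = i - k
            while j >= 0 and a[j] > key:
                a[j + k] = a[j]
                j -= k
            a[j + k] = key
        k //= 2  # reduce the increment; the final pass runs with k = 1
    return a


print(shell_sort([2, 1, 4, 7, 4, 8, 3, 6, 4, 7]))
```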
Shell Sort – Worse than Ins. Sort? • In Shell Sort, we still have to perform an Insertion Sort at the end • A lot of operations are done before the final Insertion Sort • Isn’t it worse than Insertion Sort?
Shell Sort – Worse than Ins. Sort? • The final Insertion Sort is more efficient than before • All sorting operations before the final one are done efficiently • k-sorts compare far-apart elements • Elements “move” faster, reducing the amount of movement and comparison
Shell Sort – Increment Sequence • In our example, k starts at n/2 and is halved in each pass until it reaches 1, i.e. {n/2, n/4, n/8, …, 1} • This is called the “Shell sequence” • In a good increment sequence, all numbers should be relatively prime to each other • Hibbard’s Sequence: {2^m − 1, 2^(m−1) − 1, …, 7, 3, 1}
Shell Sort – Analysis • Average Complexity: O(n^1.5) • Worst case of Shell Sort with the Shell sequence: O(n^2) • When will it happen?
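As an illustration of a different increment sequence, the sketch below generates Hibbard’s increments for a list of length n and runs the same k-apart Insertion Sort with them. The helper names `hibbard_gaps` and `shell_sort_with_gaps` are made up here, and the choice to use only increments smaller than n is an assumption.

```python
def hibbard_gaps(n):
    """Return Hibbard's increments 2^m - 1 that are smaller than n, largest first."""
    gaps = []
    m = 1
    while (1 << m) - 1 < n:
        gaps.append((1 << m) - 1)
        m += 1
    return gaps[::-1]


def shell_sort_with_gaps(a, gaps):
    """Sort list a in place using the given increments; the last one should be 1."""
    for k in gaps:
        for i in range(k, len(a)):
            key = a[i]
            j = i - k
            while j >= 0 and a[j] > key:
                a[j + k] = a[j]
                j -= k
            a[j + k] = key
    return a


data = [2, 1, 4, 7, 4, 8, 3, 6, 4, 7]
print(hibbard_gaps(len(data)))
print(shell_sort_with_gaps(data, hibbard_gaps(len(data))))
```

Consecutive Hibbard increments are relatively prime, unlike consecutive increments in the Shell sequence, where each one divides the previous one whenever n is a power of 2.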
Heap Sort • In Selection Sort, we scan the entire list to search for the maximum, which takes O(n) time • Is there a better way to get the maximum? • With the help of a heap, we may reduce the searching time to O(lg n)
Heap Sort – Build Heap • Create a heap from the list [Figure: the list 2 8 5 7 1 4 shown as an array and as a binary tree]
Heap Sort • Pick the maximum, then restore the heap property [Figure: max-heap 8 7 5 2 1 4]
Heap Sort • Repeat step 2 until the heap is empty [Figure: array after each extraction, written as heap part | sorted part: 7 4 5 2 1 | 8, then 5 4 1 2 | 7 8, then 4 2 1 | 5 7 8, then 2 1 | 4 5 7 8, then 1 | 2 4 5 7 8]
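The steps above can be sketched as an in-place Heap Sort on an array-based max-heap. This is a minimal sketch, assuming the usual layout where the children of index i sit at 2i+1 and 2i+2; the helper name `sift_down` is chosen here for illustration.

```python
def sift_down(a, root, size):
    """Restore the max-heap property for the subtree rooted at root within a[:size]."""
    while True:
        child = 2 * root + 1
        if child >= size:
            return
        # Pick the larger of the two children.
        if child + 1 < size and a[child + 1] > a[child]:
            child += 1
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child


def heap_sort(a):
    """Sort list a in place (ascending)."""
    n = len(a)
    # Step 1: build a max-heap from the unsorted list.
    for root in range(n // 2 - 1, -1, -1):
        sift_down(a, root, n)
    # Step 2, repeated: move the maximum to the end, then restore the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
    return a


print(heap_sort([2, 8, 5, 7, 1, 4]))
```

Each extraction costs O(lg n) because `sift_down` follows a single path from the root towards a leaf.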
Heap Sort – Analysis • Complexity: O(n lg n) • Not a stable sort • Difficult to implement
Merging • Given two sorted lists, merge them to form a new sorted list • A naïve approach: Append the second list to the first list, then sort the result • Slow, takes O(n lg n) time • Is there a better way?
Merging • We make use of a property of sorted lists: The first element is always the minimum • What does that imply? • An additional array is needed to store the temporary merged list • Pick the smallest number from the un-inserted numbers and append it to the merged list
Merging • List A: 1 3 7 9 • List B: 2 3 6 • Temp: (empty)
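Merging the two example lists can be sketched as below. This is a minimal sketch: the smallest un-inserted number is always at the front of one of the two lists, so one comparison per step is enough and the whole merge takes linear time.

```python
def merge(a, b):
    """Merge two sorted lists into a new sorted list."""
    merged = []
    i = j = 0
    while i < len(a) and j < len(b):
        # On ties the element from a goes first.
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # At most one of the two lists still has elements left.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


print(merge([1, 3, 7, 9], [2, 3, 6]))
```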
Merge Sort • Merge sort follows the divide-and-conquer approach • Divide: Divide the n-element sequence into two (n/2)-element subsequences • Conquer: Sort the two subsequences recursively • Combine: Merge the two sorted subsequences to produce the answer
Merge Sort 1. Divide the list into two [Figure: 2 8 5 | 7 1 4] 2. Call Merge Sort recursively to sort the two subsequences [Figure: 2 5 8 | 1 4 7]
Merge Sort 3. Merge the lists (to a temporary array) [Figure: sorted halves 2 5 8 and 1 4 7] 4. Move the elements back to the list
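Steps 1 to 4 can be sketched as a recursive function. This sketch returns a new sorted list instead of copying the merged elements back into the original array, which is a simplification of step 4; the extra memory is still needed either way.

```python
def merge_sort(a):
    """Return a new list with the elements of a in ascending order."""
    if len(a) <= 1:
        return a[:]
    mid = len(a) // 2
    left = merge_sort(a[:mid])    # sort the first half recursively
    right = merge_sort(a[mid:])   # sort the second half recursively
    # Merge the two sorted halves into a temporary list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= keeps equal keys in their original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([2, 8, 5, 7, 1, 4]))
```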
Merge Sort – Analysis • Complexity: O(n lg n) • Stable Sort • What is a stable sort? • Not an “In-place” sort • i.e. Additional memory required • Easy to implement, no knowledge of other data structures needed
Stable Sort • What is a stable sort? • (a) The name of a sorting algorithm • (b) A sorting algorithm that has stable performance over all distributions of elements, i.e. Best ≈ Average ≈ Worst • (c) A sorting algorithm that preserves the original order of duplicated keys • Answer: (c)
Stable Sort [Figure: a list containing two equal keys labelled a and b, with a before b. Original list: a b • Stable sort: a b • Unstable sort: b a]
Stable Sort • Which sorting algorithms is/are stable? • Bubble Sort • Selection Sort • Insertion Sort • Shell Sort • Heap Sort • Merge Sort
Stable Sort • In our previous example, what is the difference between 3a and 3b? • When will stable sort be more useful? • Sorting records • Multiple keys
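Sorting records on multiple keys is where stability pays off. The sketch below relies on Python’s built-in `sorted()`, which is documented to be stable; the records (names and scores) are made up for illustration.

```python
# Hypothetical records: (name, score).
records = [("Carol", 3), ("Alice", 2), ("Bob", 3), ("Dave", 1)]

# Sorting by score alone: Carol and Bob share a key, so a stable sort
# keeps Carol before Bob, exactly as in the input.
by_score = sorted(records, key=lambda r: r[1])
print(by_score)

# Multiple keys: sort by the secondary key (name) first, then by the
# primary key (score). Stability preserves the name order within each score.
by_name = sorted(records, key=lambda r: r[0])
by_score_then_name = sorted(by_name, key=lambda r: r[1])
print(by_score_then_name)
```

With an unstable sort the second pass could scramble the name order among records that have equal scores.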
Quick Sort • Quick Sort also uses the Divide-and-Conquer approach • Divide: Divide the list into two by partitioning • Conquer: Sort the two lists by calling Quick Sort recursively • Combine: Combine the two sorted lists
Quick Sort – Partitioning • Given: A list and a “pivot” (usually an element in the list) • Re-arrange the elements so that • Elements on the left-hand side of the “pivot” are less than the pivot, and • Elements on the right-hand side of the “pivot” are greater than or equal to the pivot [Figure: < Pivot | Pivot | ≥ Pivot]
Quick Sort – Partitioning • e.g. Take the first element as pivot • Swap all pairs of elements that meet the following criteria: • The left one is greater than or equal to the pivot • The right one is smaller than the pivot • Swap the pivot with A[hi] [Figure: the list 4 6 7 0 9 3 9 4 with pivot 4; lo scans from the left asking “≥ pivot?”, hi scans from the right asking “< pivot?”]
Quick Sort • After partitioning: [Figure: 0 3 | 4 | 7 9 6 9 4, pivot in the middle] • Apply Quick Sort on both lists [Figure: final result 0 3 4 4 6 7 9 9]
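The partitioning scheme and the recursion can be sketched as follows. This is one possible reading of the slides, assuming the first element is the pivot and that lo and hi are scanning indices; the variable names `i` and `j` stand in for lo and hi.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around pivot a[lo]; return the pivot's final index."""
    pivot = a[lo]
    i, j = lo + 1, hi
    while True:
        # Scan from the left for an element >= pivot.
        while i <= j and a[i] < pivot:
            i += 1
        # Scan from the right for an element < pivot.
        while i <= j and a[j] >= pivot:
            j -= 1
        if i > j:
            break
        a[i], a[j] = a[j], a[i]
    # j now marks the last element smaller than the pivot (or lo itself).
    a[lo], a[j] = a[j], a[lo]
    return j


def quick_sort(a, lo=0, hi=None):
    """Sort list a in place (ascending)."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition(a, lo, hi)
        quick_sort(a, lo, p - 1)
        quick_sort(a, p + 1, hi)
    return a


example = [4, 6, 7, 0, 9, 3, 9, 4]
print(partition(example, 0, len(example) - 1), example)
print(quick_sort([4, 6, 7, 0, 9, 3, 9, 4]))
```

No explicit combine step is needed: once both sides are sorted in place, the whole range is sorted.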
Quick Sort – Analysis • Complexity • Best: O(n lg n) • Worst: O(n^2) • Average: O(n lg n) • When will the worst case happen? • How to avoid the worst case? • In-Place Sort • Not a stable sort
Counting Sort • Consider the following list of numbers 5, 4, 2, 1, 4, 3, 4, 2, 5, 1, 4, 5, 3, 2, 3, 5, 5 • Range of numbers = [1,5] • We may count the occurrence of each number [Frequency table: 1 occurs 2 times, 2 occurs 3 times, 3 occurs 3 times, 4 occurs 4 times, 5 occurs 5 times]
Counting Sort (1) • With the frequency table, we can reconstruct the list in ascending order 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5
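The first version of Counting Sort can be sketched as below: build the frequency table, then write each number out as many times as it occurs. The function names and the explicit `lo`/`hi` range parameters are choices made for this sketch.

```python
def count_frequencies(values, lo, hi):
    """Return a list where entry v - lo is the number of occurrences of v."""
    counts = [0] * (hi - lo + 1)
    for v in values:
        counts[v - lo] += 1
    return counts


def counting_sort(values, lo, hi):
    """Return the values in ascending order, rebuilt from the frequency table."""
    result = []
    for offset, count in enumerate(count_frequencies(values, lo, hi)):
        result.extend([lo + offset] * count)
    return result


numbers = [5, 4, 2, 1, 4, 3, 4, 2, 5, 1, 4, 5, 3, 2, 3, 5, 5]
print(count_frequencies(numbers, 1, 5))
print(counting_sort(numbers, 1, 5))
```

Because the output is rebuilt from the counts alone, this version cannot carry any data attached to the keys.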
Counting Sort (1) • Can we sort records with this counting sort? • Is this sort stable?
Counting Sort (2) • An alternative way: use a cumulative frequency table and a temporary array • Given the following “records” [Figure: the records with their frequency table and cumulative frequency table; values surviving from the figure: 1 3 4 2 6]
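The cumulative-frequency version can be sketched as follows. The records used here are made up, since the slide’s own records are not recoverable; each is a (key, label) pair. Scanning the input from the back while filling the temporary array keeps records with equal keys in their original order.

```python
def counting_sort_records(records, key, lo, hi):
    """Return the records sorted by integer key in [lo, hi], preserving ties' order."""
    counts = [0] * (hi - lo + 1)
    for r in records:
        counts[key(r) - lo] += 1
    # Cumulative table: counts[k] = number of records with key <= lo + k.
    for k in range(1, len(counts)):
        counts[k] += counts[k - 1]
    output = [None] * len(records)  # the temporary array
    # Walk backwards so the last record with a given key lands in the last slot.
    for r in reversed(records):
        idx = key(r) - lo
        counts[idx] -= 1
        output[counts[idx]] = r
    return output


# Hypothetical records: (key, label).
sample = [(3, "a"), (1, "x"), (3, "b"), (2, "y"), (1, "z")]
print(counting_sort_records(sample, key=lambda r: r[0], lo=1, hi=3))
```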