Explore the importance and performance aspects of sorting algorithms, including Selection Sort, Bubble Sort, Merge Sort, and more. Understand criteria for algorithm selection and their impact on data efficiency.
CS214 – Data Structures Lecture 04&5: Quadratic Sorting Slides by Dr-Mohamed El-Ramly, PhD Dr-Basheer Youssef
Agenda • Why Sorting? • Quadratic Sorting algorithms: Selection Sort, Insertion Sort, Bubble Sort • Theoretical Boundaries of Sorting Problem • STL support for sorting • Sub-quadratic Sorting: Merge Sort, Shell Sort, Quick Sort, Heap Sort
Chapter Objectives • To learn how to use the standard sorting algorithms in STL • To learn how to implement the following sorting algorithms: selection sort, bubble sort, insertion sort, shell sort, merge sort, heap sort and quick sort. • To understand the difference in performance of these algorithms, and which to use for small, medium and large sets of data and for what types of data. Chapter 9: Sorting
Why Sorting? • The efficiency of data handling can be substantially increased if the data is sorted. • Imagine an unsorted phone directory or a dictionary! • Often, it is required to sort data before processing. • First, we need to decide on the sorting criterion. • Second, we need to decide on the sorting algorithm.
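As a minimal sketch of these two decisions, the example below expresses the sorting criterion as a comparison function and leaves the choice of algorithm to the standard library's std::sort. The Contact record and the names byName and sortByName are hypothetical, chosen only for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record type for one phone directory entry.
struct Contact {
    std::string name;
    std::string phone;
};

// Sorting criterion: order contacts alphabetically by name.
bool byName(const Contact& a, const Contact& b) {
    return a.name < b.name;
}

// Sorting algorithm: delegate to std::sort from <algorithm>.
void sortByName(std::vector<Contact>& directory) {
    std::sort(directory.begin(), directory.end(), byName);
}
```

Changing the criterion (for example, ordering by phone number) only requires a different comparison function; the algorithm stays the same.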
Which Sorting Algorithms? • To compare algorithm efficiency, we need a machine-independent criterion for evaluating their performance. • Two factors determine an algorithm's performance: • The number of comparisons • The number of movements • We use Big-O notation for this purpose.
What Affects Performance? • Initial Order of the Data • Does the algorithm recognize already ordered data? • How much time is spent on ordering already ordered data? • Algorithm’s intelligence • Best / Worst / Average cases • Comparisons / Movements: simple or complex? • Do we compare int values or arrays? • Do we move int values or structures?
The family of sorting methods • Main sorting themes: address-based sorting and comparison-based sorting • Address-based sorting: Count Sort • Comparison-based sorting: • Transposition sorting: BubbleSort • Insert and keep sorted: Insertion sort • Divide and conquer: MergeSort, QuickSort • Diminishing increment sorting: ShellSort
Insertion Sort • Based on the technique used by card players to arrange a hand of cards • Player keeps the cards that have been picked up so far in sorted order • When the player picks up a new card, he makes room for the new card and then inserts it in its proper place
Insertion Sort Algorithm • For each array element from the second to the last (starting at nextPos = 1) • Insert the element at nextPos where it belongs in the sorted part of the array, increasing the length of the sorted subarray by 1
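One step of insertion sort (figure) • Before: sorted part 3 4 7 12 14 14 20 21 33 38, next to be inserted: 10, unsorted rest 55 9 23 28 16 • 10 is copied to temp; the elements less than 10 (3 4 7) stay in place and the larger sorted elements (12 14 14 20 21 33 38) each shift one position to the right • After: sorted part 3 4 7 10 12 14 14 20 21 33 38, unsorted rest 55 9 23 28 16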
Pseudocode for Insertion Sort
insertionSort (data[], n)
    for (i = 1; i < n; i++)
        move all elements data[j] (j < i) greater than data[i] by one position;
        place data[i] in its proper position;
C++ Code for Insertion Sort
template <class T>
void insertionSort (T data[], int n) {
    for (int i = 1, j; i < n; i++) {
        T tmp = data[i];                              // element to insert
        for (j = i; j > 0 && tmp < data[j-1]; j--)
            data[j] = data[j-1];                      // shift larger element one position right
        data[j] = tmp;                                // place it in the gap that opened up
    }
}
Analysis of insertion sort • Insertion sort moves elements only when it is really needed. • If the array is already sorted, moves are very few. • On the other hand, shifting can be time consuming.
Best Case • Happens when the data is already sorted • Outer loop: n – 1 iterations • T tmp = data[i]; 1 move × (n – 1) times • data[j] = tmp; 1 move × (n – 1) times • tmp < data[j-1]: 1 comparison × (n – 1) times • Moves: 2 (n – 1) • Comparisons: n – 1 • O(n)
Worst Case • Happens when the data is in reverse order • Outer loop: n – 1 iterations • T tmp = data[i]; 1 move × (n – 1) times • data[j] = tmp; 1 move × (n – 1) times • Inner loop executes 1, 2, …, n – 1 times • For each inner-loop iteration, there is 1 comparison • For each inner-loop iteration, there is 1 move
Worst Case • Comparisons: 1 + 2 + … + (n – 1) = n (n – 1) / 2, which is O(n²) • Moves: (1 + 2 + … + (n – 1)) + 2 (n – 1) = n (n – 1) / 2 + 2 (n – 1) = n²/2 + 3n/2 – 2, which is O(n²)
Average Case • This is an approximate analysis, simpler than the one in the book • Assume the data is randomly ordered • Outer loop: n – 1 iterations • T tmp = data[i]; 1 move × (n – 1) times • data[j] = tmp; 1 move × (n – 1) times • Inner loop executes at most 1, 2, …, n – 1 times • For outer iteration i, the inner loop makes on average i/2 comparisons • For outer iteration i, the inner loop makes on average i/2 moves
Average Case • Comparisons: (1 + 2 + … + (n – 1)) / 2 = n (n – 1) / 4, which is O(n²) • Moves: (1 + 2 + … + (n – 1)) / 2 + 2 (n – 1) = n (n – 1) / 4 + 2 (n – 1) = n²/4 + 7n/4 – 2, which is O(n²)
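The best-case and worst-case counts for insertion sort can be checked empirically. The sketch below is an instrumented variant of the insertion sort from these slides; the Counts struct and the name insertionSortCounted are hypothetical additions for illustration, and the inner loop is restructured slightly so that each evaluation of tmp < data[j-1] is counted exactly once.

```cpp
// Hypothetical helper type: tallies of the two cost measures used in the analysis.
struct Counts {
    long comparisons = 0;
    long moves = 0;
};

// Insertion sort instrumented to count element comparisons and element moves.
template <class T>
Counts insertionSortCounted(T data[], int n) {
    Counts c;
    for (int i = 1; i < n; i++) {
        T tmp = data[i];                 // 1 move per outer iteration
        ++c.moves;
        int j = i;
        while (j > 0) {
            ++c.comparisons;             // one evaluation of tmp < data[j-1]
            if (!(tmp < data[j - 1]))
                break;                   // tmp already belongs at position j
            data[j] = data[j - 1];       // shift one position to the right
            ++c.moves;
            --j;
        }
        data[j] = tmp;                   // 1 move per outer iteration
        ++c.moves;
    }
    return c;
}
```

For n = 5, an already sorted array should give 4 comparisons and 8 moves, and a reversed array should give 10 comparisons and 18 moves, in line with n – 1, 2 (n – 1), n (n – 1) / 2 and n²/2 + 3n/2 – 2.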
Analysis of Insertion Sort • Maximum number of comparisons is O(n²) • In the best case, number of comparisons is O(n) • The number of shifts performed during an insertion is one less than the number of comparisons or, when the new value is the smallest so far, the same as the number of comparisons • A shift in an insertion sort requires the movement of only one item, whereas in a bubble or selection sort an exchange involves a temporary item and requires the movement of three items
Selection Sort • Selection sort is a relatively easy-to-understand algorithm • Sorts an array by making several passes through the array, selecting the next smallest item in the array each time and placing it where it belongs in the array • It attempts to localize exchanges by putting the item directly in its final place. • Efficiency is O(n²)
Selection Sort • Given an array of length n, • Search elements 0 through n-1 and select the smallest • Swap it with the element in location 0 • Search elements 1 through n-1 and select the smallest • Swap it with the element in location 1 • Search elements 2 through n-1 and select the smallest • Swap it with the element in location 2 • Search elements 3 through n-1 and select the smallest • Swap it with the element in location 3 • Continue in this fashion until there’s nothing left to search
Example and analysis of Selection Sort • Example (array after each pass): 7 2 8 5 4 → 2 7 8 5 4 → 2 4 8 5 7 → 2 4 5 8 7 → 2 4 5 7 8 • The Selection Sort might swap an array element with itself; this is harmless, and not worth checking for • Analysis: • The outer loop executes n-1 times • The inner loop executes about n/2 times on average (from n to 2 times) • Work done in each inner-loop iteration is constant (one comparison and possibly an index update); the swap happens once per outer iteration • Time required is roughly (n-1)*(n/2) • You should recognize this as O(n²)
Pseudocode for Selection Sort
selectionSort (data[], n)
    for (i = 0; i < n-1; i++)
        select smallest element from data[i], …, data[n-1];
        swap it with data[i];
C++ Code of Selection Sort
#include <utility>                                    // std::swap
template <class T>
void selectionSort(T data[], int n) {
    for (int i = 0, j, least; i < n-1; i++) {
        for (j = i+1, least = i; j < n; j++)          // find the smallest of data[i..n-1]
            if (data[j] < data[least])
                least = j;
        std::swap (data[least], data[i]);             // put it in its final place
    }
}
Analysis of Selection Sort
template <class T>
void selectionSort(T data[], int n) {
    for (int i = 0, j, least; i < n-1; i++) {         // outer loop: n - 1 iterations
        for (j = i+1, least = i; j < n; j++)          // inner loop: n-1, n-2, …, 1 iterations
            if (data[j] < data[least])                // 1 comparison per inner iteration
                least = j;
        std::swap (data[least], data[i]);             // 1 swap per outer iteration
    }
}
Best / Worst / Average Case • It does not recognize order of data • Comparisons: 1 + 2 + … + (n – 1) = n (n – 1) / 2, which is O(n²) • Swaps: n – 1, which is O(n) • Moves: 3 (n – 1), which is O(n)
Analysis of Selection Sort • What conclusion can we make about Selection Sort? • How intelligent is Selection Sort? • What do you notice about the number of swaps? • Modify the algorithm to further reduce the number of swaps.
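One possible answer to the last question, offered as a sketch rather than the intended solution: skip the swap whenever the smallest remaining element is already in position i. The function name selectionSortFewerSwaps and its returned swap count are hypothetical additions for illustration.

```cpp
#include <utility>  // std::swap

// Selection sort that skips self-swaps and reports how many swaps it made.
template <class T>
int selectionSortFewerSwaps(T data[], int n) {
    int swaps = 0;
    for (int i = 0; i < n - 1; i++) {
        int least = i;
        for (int j = i + 1; j < n; j++)   // find the smallest of data[i..n-1]
            if (data[j] < data[least])
                least = j;
        if (least != i) {                 // only swap when something actually moves
            std::swap(data[least], data[i]);
            ++swaps;
        }
    }
    return swaps;
}
```

The number of comparisons between elements is unchanged at n (n – 1) / 2; the saving is at most n – 1 swaps, paid for with one extra index test per pass, so it mainly helps when elements are expensive to move.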
Bubble Sort • Compares adjacent array elements and exchanges their values if they are out of order • Smaller values bubble up to the top of the array and larger values sink to the bottom
Example of Bubble Sort (figure): successive compare-and-exchange steps on a five-element array holding the values 2, 4, 5, 7, 8, continuing until the array is sorted (done)
Pseudocode for Bottom-up Bubble Sort
bubbleSort (data[], n)
    for (i = 0; i < n-1; i++)
        for (j = n - 1; j > i; --j)
            swap items j and j - 1 if out of order;
C++ Code of Bottom-up Bubble Sort
#include <utility>                                    // std::swap
template <class T>
void bubbleSort(T data[], int n) {
    for (int i = 0; i < n - 1; i++) {
        for (int j = n - 1; j > i; --j)               // bubble the smallest item down to position i
            if (data[j] < data[j-1])
                std::swap (data[j], data[j-1]);
    }
}
Analysis of Bubble Sort
template <class T>
void bubbleSort(T data[], int n) {
    for (int i = 0; i < n - 1; i++) {                 // outer loop: n - 1 iterations
        for (int j = n - 1; j > i; --j)               // inner loop: n-1, …, 1 iterations
            if (data[j] < data[j-1])                  // 1 comparison per inner iteration
                std::swap (data[j], data[j-1]);       // at most 1 swap per inner iteration
    }
}
Worst Case • Reverse ordered data • Comparisons: 1 + 2 + … + (n – 1) = n (n – 1) / 2, which is O(n²) • Swaps: 1 + 2 + … + (n – 1) = n (n – 1) / 2, which is O(n²) • Moves: 3 n (n – 1) / 2, which is O(n²)
Best Case • When array is ordered • Comparisons: 1 + 2 + … + (n – 1) = n (n – 1) / 2, which is O(n²). Can it be improved to O(n)? • Swaps: none, O(1) • Moves: none, O(1)
Average Case • Comparisons: 1 + 2 + … + (n – 1) = n (n – 1) / 2, which is O(n²) • Swaps: 1/2 + 2/2 + … + (n – 1)/2 = n (n – 1) / 4, which is O(n²) • Moves: 3 n (n – 1) / 4, which is O(n²)
Analysis of Bubble Sort • Provides excellent performance in some cases and very poor performance in other cases • Main problem: it moves data up, one step at a time • Works best when array is nearly sorted to begin with • Worst case number of comparisons is O(n²) • Worst case number of exchanges is O(n²) • Best case occurs when the array is already sorted • O(n) comparisons, but only if the algorithm stops after a pass that makes no exchanges; the basic version always makes n (n – 1) / 2 comparisons • O(1) exchanges
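The O(n) best case depends on stopping as soon as a pass makes no exchanges. The sketch below adds such an early exit to the bottom-up bubble sort and counts the work done; the names BubbleCounts, bubbleSortEarlyExit and the swapped flag are hypothetical additions for illustration.

```cpp
#include <utility>  // std::swap

// Hypothetical helper type: comparisons and swaps performed by one sort.
struct BubbleCounts {
    long comparisons = 0;
    long swaps = 0;
};

// Bottom-up bubble sort that stops after the first pass with no swaps.
template <class T>
BubbleCounts bubbleSortEarlyExit(T data[], int n) {
    BubbleCounts c;
    bool swapped = true;                       // forces the first pass to run
    for (int i = 0; i < n - 1 && swapped; i++) {
        swapped = false;
        for (int j = n - 1; j > i; --j) {
            ++c.comparisons;
            if (data[j] < data[j - 1]) {
                std::swap(data[j], data[j - 1]);
                ++c.swaps;
                swapped = true;                // array was not yet in order
            }
        }
    }
    return c;
}
```

On an already sorted array of n = 5 items this makes a single pass of 4 comparisons and no swaps; on a reversed array it still needs 10 comparisons and 10 swaps, so the worst case is unchanged.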
Comparison of Quadratic Sorts • None of the algorithms are particularly good for large arrays
Comparison of Quadratic Sorts • Bubble Sort, Selection Sort, and Insertion Sort are all O(n²) • As we will see later, we can do much better than this with somewhat more complicated sorting algorithms • Within O(n²), • Bubble Sort is very slow, and should probably never be used for anything • Selection Sort is intermediate in speed • Insertion Sort is usually the fastest of the three; in fact, for small arrays (say, 10 or 15 elements), insertion sort is faster than more complicated sorting algorithms • Selection Sort and Insertion Sort are “good enough” for small arrays
Theoretical Analysis of Sorting Problem • Quadratic algorithms are not efficient. • Can we obtain better efficiency? • We need a lower bound, i.e., the best performance any sorting algorithm can achieve. • We focus on comparisons, not moves, because in ideal cases you can avoid any moves (e.g., bubble sort on sorted data) • What is the best estimate of the number of item comparisons if the array is randomly ordered?
Theoretical Analysis of Sorting Problem • Every sorting algorithm can be represented by a decision tree. • An n-element array can be ordered in n! ways, so the tree needs at least n! leaves • The tree may have more than n! leaves because of failure branches or repeated orderings
Theoretical Analysis of Sorting Problem • An i-level complete decision tree has 2^(i-1) leaves, 2^(i-1) - 1 non-terminals, and 2^i - 1 total nodes. • So, all non-complete trees with the same number of levels i have fewer nodes, i.e., k + m ≤ 2^i - 1 (m leaves, k non-terminals), with k ≤ 2^(i-1) - 1 and m ≤ 2^(i-1) • n! is the number of possible orderings of an n-element array • n! ≤ m ≤ 2^(i-1) • log (n!) + 1 ≤ i, with logarithms taken base 2 (We assume i is the lowest level, corresponding to the maximum number of comparisons needed and the longest path in the tree)
Theoretical Analysis of Sorting Problem • i ≥ log (n!) + 1 • The longest path in a decision tree with at least n! leaves therefore has length at least log (n!) • In other words, O(log (n!)) is the best to expect in the worst case for the number of comparisons. • Since log (n!) is O(n log n), this bound is O(n log n). • The same applies to the average case.
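To get a feel for how far the quadratic sorts are from this bound, the sketch below computes the smallest whole number of comparisons that is at least log (n!), next to the n (n – 1) / 2 comparisons of the quadratic sorts. It assumes base-2 logarithms, as is natural for a binary decision tree; the function names comparisonLowerBound and quadraticComparisons are hypothetical.

```cpp
#include <cmath>

// ceil(log2(n!)): a lower bound on worst-case comparisons for comparison sorting.
// log2(n!) is accumulated as log2(2) + log2(3) + ... + log2(n) to avoid overflow.
long comparisonLowerBound(int n) {
    double bits = 0.0;
    for (int k = 2; k <= n; k++)
        bits += std::log2(static_cast<double>(k));
    return static_cast<long>(std::ceil(bits));
}

// Worst-case comparison count n(n-1)/2 of the quadratic sorts.
long quadraticComparisons(int n) {
    return static_cast<long>(n) * (n - 1) / 2;
}
```

For n = 10 the bound is 22 comparisons against 45 for the quadratic sorts, and the gap widens quickly as n grows.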