CSC 336 – Algorithms and Data Structures
Sorting and Searching
Dr. Paige H. Meeker
Computer Science, Presbyterian College, Clinton, SC
Sorting Methods to Discuss
• Insertion Sort
• Quicksort
• Heapsort
• Mergesort
• Shellsort
• Radix Sort
• Bucket Sort
Insertion Sort
• Insertion sort is a simple sorting algorithm that is well suited to sorting small data sets or to inserting new elements into an already sorted sequence.
• Worst case is O(n²), so it is not the best method in most cases.
Insertion Sort
How does it work?
• The idea behind this sort is that, at each step, the first i elements of the list are already sorted, and the remaining elements must be placed into their proper positions.
• For a sequence a₀, a₁, …, aₙ, at the start a₀ is the only sorted element. Each remaining element aᵢ is moved into its proper position by comparing it to aᵢ₋₁, aᵢ₋₂, and so on, until an element aⱼ with aⱼ ≤ aᵢ is found; aᵢ is inserted just after aⱼ. If no such aⱼ is found, aᵢ is inserted at the beginning of the list. After inserting aᵢ, the length of the sorted part of the list increases by 1, and we start again. (A code sketch follows the example.)
Insertion Sort Example
5 7 0 3 4 2 6 1   // 5 stays; 7-1 unsorted
5 7 0 3 4 2 6 1   // 7 stays; 5 7 sorted, 0-1 not
0 5 7 3 4 2 6 1   // 0 moves 2 places
0 3 5 7 4 2 6 1   // 3 moves 2 places
0 3 4 5 7 2 6 1   // 4 moves 2 places
0 2 3 4 5 7 6 1   // 2 moves 4 places
0 2 3 4 5 6 7 1   // 6 moves 1 place
0 1 2 3 4 5 6 7   // 1 moves 6 places
17 moves in total
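A minimal Python sketch of the algorithm (illustrative, not from the original slides), run on the example sequence above:

```python
def insertion_sort(a):
    """Sort the list a in place by growing a sorted prefix one element at a time."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements of the sorted prefix one slot to the right
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key  # insert key just after the first element <= key
    return a

print(insertion_sort([5, 7, 0, 3, 4, 2, 6, 1]))  # [0, 1, 2, 3, 4, 5, 6, 7]
```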
Insertion Sort
• When is the worst case going to occur?
Insertion Sort
• When is the worst case going to occur? When, in each step, the proper position for the inserted element is found at the beginning of the already sorted sequence, i.e., the sequence was already sorted in descending order.
Quicksort
• One of the fastest sorting algorithms in practice; average time is O(n log n). However, in its worst case it degenerates to O(n²).
Quicksort
How does it work?
• Works recursively, using a divide-and-conquer strategy. (A code sketch follows the example.)
• The sequence to be sorted is partitioned into two parts such that all elements in the first part b are ≤ all elements in the second part c. The two parts are then sorted separately by recursive application of the same procedure and recombined into one sorted sequence.
• The first step is to choose a comparison element x (the pivot); all elements < x go into the first partition and those > x into the second.
Quicksort Example
[figure from Wikipedia]
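As a rough illustration (an out-of-place sketch, not the only way to partition), here is a short Python version that picks the middle element as the comparison element x and builds the two parts b and c:

```python
def quicksort(a):
    """Return a sorted copy of a (out of place, for clarity)."""
    if len(a) <= 1:
        return a
    x = a[len(a) // 2]                 # comparison element (pivot)
    b = [e for e in a if e < x]        # first part: elements < x
    equal = [e for e in a if e == x]   # elements equal to the pivot
    c = [e for e in a if e > x]        # second part: elements > x
    return quicksort(b) + equal + quicksort(c)

print(quicksort([3, 6, 1, 8, 2, 5]))  # [1, 2, 3, 5, 6, 8]
```

An in-place version partitions within the array instead of allocating new lists, but the recursive structure is the same.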
Quicksort
Best case:
• Each recursive step produces a partitioning with two parts of equal length.
Worst case:
• The partitioning is unbalanced, particularly when one element ends up in one part and all other elements in the second part (e.g., when the pivot is the smallest or largest element).
Heapsort
• The central data structure of the algorithm is a heap.
• If the sequence to be sorted is arranged in a max-heap, the greatest element of the heap can be retrieved immediately from the root; the remaining elements are rearranged in O(log n) time per removal.
• O(n log n) overall.
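A short Python sketch of the idea using the standard heapq module. One difference from the slide: heapq provides a min-heap rather than a max-heap, so the smallest element sits at the root and the elements come out in ascending order:

```python
import heapq

def heapsort(a):
    """Build a heap in O(n), then pop n times at O(log n) each: O(n log n) total."""
    heap = list(a)
    heapq.heapify(heap)            # arrange the sequence as a (min-)heap
    return [heapq.heappop(heap) for _ in range(len(heap))]

print(heapsort([5, 7, 0, 3, 4, 2, 6, 1]))  # [0, 1, 2, 3, 4, 5, 6, 7]
```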
Mergesort
• Produces a sorted sequence by sorting each half of the sequence recursively and then merging the two halves together.
• Uses a recursive, divide-and-conquer strategy like quicksort.
• O(n log n)
Mergesort
How does it work?
• The sequence to be sorted is divided into two halves.
• Each half is sorted independently (recursively).
• The two sorted halves are merged into one sorted sequence. (A code sketch follows the example.)
Mergesort Example
[figure from Wikipedia]
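A minimal Python sketch of the three steps above (divide, sort recursively, merge); the merged list is the O(n) temporary storage mentioned on the next slide:

```python
def mergesort(a):
    """Sort by recursively sorting each half and merging the results."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = mergesort(a[:mid]), mergesort(a[mid:])
    merged, i, j = [], 0, 0
    # Merge: repeatedly take the smaller front element of the two halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]  # append whichever half remains

print(mergesort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```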
Mergesort
• Drawback: needs O(n) extra space for a temporary array that holds the data between merge steps.
Shellsort
• Fast, easy to understand, easy to implement.
• Running time depends on the gap sequence used: Shell's original gaps give O(n²) in the worst case, while better sequences bring it down to about O(n log² n).
Shellsort
How does it work?
• Arrange the data sequence in a two-dimensional array.
• Sort the columns of the array.
• This partially sorts the data. Repeat the process with a narrower array (a smaller number of columns); the last step uses an array of just one column. (A code sketch follows the example.)
• The idea is that the number of sorting operations per step is limited because the previous steps have already pre-sorted the sequence.
Shellsort Example
[array figures omitted]
First pass, 7 columns: elements 8 and 9 are already at the end of the sequence, but 2 is also there. Let's do this again…
Second pass, 3 columns: now the sequence is almost completely sorted; only the 6, 8, and 9 have to move to their correct positions.
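In code, the "columns" are simply elements a fixed gap apart, so each pass is an insertion sort over gap-separated elements. A minimal Python sketch using Shell's original gap sequence n/2, n/4, …, 1 (one common choice among many):

```python
def shellsort(a):
    """Insertion-sort the gap-separated 'columns', shrinking the gap to 1."""
    gap = len(a) // 2
    while gap > 0:
        for i in range(gap, len(a)):
            key = a[i]
            j = i
            # Insertion sort within the column containing index i
            while j >= gap and a[j - gap] > key:
                a[j] = a[j - gap]
                j -= gap
            a[j] = key
        gap //= 2  # a narrower array means a smaller gap
    return a

print(shellsort([3, 7, 9, 0, 5, 1, 6, 8, 4, 2]))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```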
Radix Sort (aka Postal Sort)
• A fast sorting algorithm for items that are identified by unique keys, such as integers or fixed-length strings.
• O(nk), where n is the number of elements and k is the average length of a key (its number of digits).
Radix Sort
How does it work?
• Take the least significant digit of each key.
• Sort the list of elements on that digit, keeping the relative order of elements with the same digit (i.e., the per-digit sort must be stable).
• Repeat with each successively more significant digit. (A code sketch follows the example.)
Radix Sort Example
Sorting: 170 45 75 90 2 24 802 66
• Sort by least significant digit (one's place), giving: 170 90 2 802 24 45 75 66
• Sort by the next digit (ten's place), giving: 2 802 24 45 66 170 75 90
• Sort by the most significant digit (hundred's place), giving: 2 24 45 66 75 90 170 802
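A compact Python sketch of this least-significant-digit approach for non-negative integers; each pass is stable because the buckets preserve arrival order:

```python
def radix_sort(a, base=10):
    """Sort non-negative integers digit by digit, least significant first."""
    digit = 1
    while a and digit <= max(a):
        buckets = [[] for _ in range(base)]
        for x in a:
            buckets[(x // digit) % base].append(x)  # stable: keeps arrival order
        a = [x for bucket in buckets for x in bucket]
        digit *= base
    return a

print(radix_sort([170, 45, 75, 90, 2, 24, 802, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```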
Radix Sort
Why does it work?
• Each pass places every item into a bucket by a single digit, without comparing it to other items; after one stable pass per digit (k passes in total), every key is in its correct position.
Bucket Sort (aka Bin Sort)
• Partitions the array into a finite number of "buckets" and then sorts each bucket.
• Expected O(n) time, assuming the input is uniformly distributed across the buckets.
Bucket Sort
How does it work?
• Set up an array of empty "buckets".
• Go over the original array, putting each object in its bucket.
• Sort each non-empty bucket.
• Put the elements from the non-empty buckets back into the original array. (A code sketch follows the example.)
Bucket Sort Example
[figure from Wikipedia]
Elements are distributed among the bins and then sorted within each bin using one of the other sorts we have discussed.
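A minimal Python sketch for values assumed to lie in [0, 1), with each value mapped to one of num_buckets evenly sized bins (the value range and bucket count here are illustrative assumptions, not from the slides):

```python
def bucket_sort(a, num_buckets=10):
    """Distribute values in [0, 1) into buckets, sort each, and concatenate."""
    buckets = [[] for _ in range(num_buckets)]
    for x in a:
        buckets[int(x * num_buckets)].append(x)  # map value to its bucket
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))            # any per-bucket sort works
    return result

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]))
```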
Searching
Now that we can order the data, let's see how we can find what we need within it…
Binary Search
• Idea: cut the search space in half with a single comparison against the middle element.
• O(log n)
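A standard Python sketch over a sorted list; each comparison against the middle element discards half of the remaining search space:

```python
def binary_search(a, target):
    """Return the index of target in the sorted list a, or -1 if absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2       # middle of the current search space
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            lo = mid + 1           # discard the lower half
        else:
            hi = mid - 1           # discard the upper half
    return -1

print(binary_search([0, 1, 2, 3, 4, 5, 6, 7], 5))  # 5
```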
Interpolation Search
• In a binary search, the search space is always divided in two to guarantee logarithmic time; however, when we search for "Albert" in the phone book, we don't start in the middle – we start toward the front and work from there. That is the idea of an interpolation search.
Interpolation Search
• Instead of cutting the search space by a fixed half, we cut it at the position where the target seems most likely to be.
• This position is determined by interpolation between the values at the ends of the search space.
• O(log log n) on average, for uniformly distributed keys.
• In practice, however, this is often not a significant improvement over binary search, and it is more difficult to program.
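A Python sketch for numeric keys: instead of probing the middle, it probes where the target would fall if the keys were evenly spread between a[lo] and a[hi] (the linear-interpolation formula below is the usual textbook choice, stated here as an assumption rather than taken from the slides):

```python
def interpolation_search(a, target):
    """Search a sorted list of numbers, probing by linear interpolation."""
    lo, hi = 0, len(a) - 1
    while lo <= hi and a[lo] <= target <= a[hi]:
        if a[hi] == a[lo]:                       # all keys equal: avoid /0
            mid = lo
        else:
            # Probe proportionally to where target sits between a[lo] and a[hi]
            mid = lo + (target - a[lo]) * (hi - lo) // (a[hi] - a[lo])
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

print(interpolation_search([10, 20, 30, 40, 50, 60], 40))  # 3
```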