Searching and Sorting Week 11
Searching and Sorting • Fundamental problems in computer science and programming • Sorting done to make searching easier • Multiple different algorithms to solve the same problem • How do we know which algorithm is "better"? • Look at searching first • Examples will use arrays of integers to illustrate algorithms
SEARCHING ARRAYS In many applications it is necessary to be able to locate a particular data item in a collection of data items. For example, we might need to locate an item in an inventory by part number, a particular employee by Social Security number or name, or a particular book by ISBN, title, or author. Many algorithms have been developed for searching for a particular data item in a collection. In this chapter we will discuss two search algorithms useful for locating a particular item in an array.
The Sequential Search Algorithm The simplest search algorithm is the sequential search. A sequential search locates an item by comparing each item in the collection with the value being searched for. The search ends when the value is found or when the collection is exhausted.
The Sequential Search Algorithm

/**
 * The sequentialSearch method searches an array for a value.
 * @param array The array to search.
 * @param value The value to search for.
 * @return The subscript of the value if found in the array, otherwise -1.
 */
public static int sequentialSearch(int[] array, int value)
{
    int index, element;
    boolean found;      // Flag indicating search results

    index = 0;
    element = -1;
    found = false;
    while (!found && index < array.length)
    {
        if (array[index] == value)
        {
            found = true;
            element = index;
        }
        index++;
    }
    return element;
}
Testing Strategy for a Search Algorithm When testing a search algorithm, try to search for the first value in the list, the last value in the list, some value in the middle, and some value that is not in the list.
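The four cases above can be exercised with a small driver. This is a minimal sketch: the class name SequentialSearchCheck and the sample data are made up for illustration, and the search method is a compact restatement of the sequentialSearch logic from the slides.

```java
class SequentialSearchCheck {
    // Same idea as the slides' sequentialSearch, written with an early return.
    static int sequentialSearch(int[] array, int value) {
        for (int index = 0; index < array.length; index++) {
            if (array[index] == value) {
                return index;
            }
        }
        return -1; // value is not in the array
    }

    public static void main(String[] args) {
        int[] data = {42, 7, 19, 3, 88}; // illustrative data only
        System.out.println(sequentialSearch(data, 42)); // first value: prints 0
        System.out.println(sequentialSearch(data, 88)); // last value: prints 4
        System.out.println(sequentialSearch(data, 19)); // a middle value: prints 2
        System.out.println(sequentialSearch(data, 5));  // not in the array: prints -1
    }
}
```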
The Sequential Search Algorithm How would you modify the sequentialSearch method if you wanted to search an array of strings? Try this on a computer.
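One possible answer, offered as a sketch rather than the only solution: change the parameter types to String and compare with equals instead of ==, since == on strings compares references, not characters. The class name StringSearch is hypothetical, and the sketch assumes the array contains no null elements.

```java
class StringSearch {
    static int sequentialSearch(String[] array, String value) {
        for (int index = 0; index < array.length; index++) {
            // equals compares the characters; == would only compare object references.
            if (array[index].equals(value)) {
                return index;
            }
        }
        return -1; // value is not in the array
    }

    public static void main(String[] args) {
        String[] names = {"Ada", "Grace", "Alan"}; // illustrative data only
        System.out.println(sequentialSearch(names, "Grace")); // prints 1
        System.out.println(sequentialSearch(names, "Linus")); // prints -1
    }
}
```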
Performance of the Sequential Search Algorithm In the worst case, the data item is not in the array or is the last element in the array, and the sequential search must check every element to determine this. If the array has 250 elements, 250 elements must be checked. We say that the sequential search has a worst case performance of order n, where n is the number of array elements. On average, if every element is equally likely to be the one searched for, the sequential search makes about n/2 comparisons, because over all the searches we examine about half the array. This is still proportional to n. As the proportion of unsuccessful searches increases, the average number of comparisons also increases. Although the sequential search is easy to understand and code, it is a poor choice for large arrays that are searched often.
The Binary Search Algorithm The binary search algorithm is a more efficient search algorithm, but it works only on a sorted collection of data items. In the binary search, the item to be located is compared with the middle element of the sorted collection. If they match, the search ends. Otherwise, we determine whether the item belongs before or after the middle element of the array and continue the search in the proper half of the array using the binary search algorithm.
The Binary Search Algorithm Suppose we are searching for 101 in the array shown below:

subscript:  [0]  [1]  [2]  [3]  [4]  [5]  [6]
value:       13   17   19   27   84   97  101

Step 1: first = 0, last = 6, middle = 3. array[middle] is 27, which is less than 101, so first becomes 4.
Step 2: first = 4, last = 6, middle = 5. array[middle] is 97, which is less than 101, so first becomes 6.
Step 3: first = 6, last = 6, middle = 6. array[middle] is 101, so found is true.
The Binary Search Algorithm Suppose we are searching for 15 in the array shown below:
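Assuming the search runs on the same seven-element array as the previous example (an assumption; the slide's own array may differ), a tracing sketch shows how the range shrinks until first passes last. The class name BinarySearchTrace and the printed format are illustrative only.

```java
class BinarySearchTrace {
    // Binary search on an ascending array that prints each probe it makes.
    static int tracedBinarySearch(int[] array, int value) {
        int first = 0;
        int last = array.length - 1;
        while (first <= last) {
            int middle = (first + last) / 2; // integer division
            System.out.println("first=" + first + " last=" + last
                    + " middle=" + middle + " array[middle]=" + array[middle]);
            if (array[middle] == value) {
                return middle;
            } else if (array[middle] > value) {
                last = middle - 1;  // continue in the lower half
            } else {
                first = middle + 1; // continue in the upper half
            }
        }
        return -1; // range is empty: value is not in the array
    }

    public static void main(String[] args) {
        int[] data = {13, 17, 19, 27, 84, 97, 101};
        // Probes subscripts 3, 1, and 0, then returns -1.
        System.out.println(tracedBinarySearch(data, 15));
    }
}
```

With this array, 15 falls between 13 and 17, so the search ends with first = 1 and last = 0 and reports -1.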
The Binary Search Algorithm A binary search on an array that is in ascending order:

Set first to 0.
Set last to the last subscript in the array.
Set position to -1.
Set found to false.
While found is not true and first is less than or equal to last
    Set middle to the subscript half-way between array[first] and array[last].
    If array[middle] equals the desired value
        Set found to true.
        Set position to middle.
    Else If array[middle] is greater than the desired value
        Set last to middle - 1.
    Else
        Set first to middle + 1.
    End If.
End While.
Return position.
The Binary Search Algorithm

public static int binarySearch(int[] array, int value)
{
    int first, last, middle;
    int position;       // Position of search value
    boolean found;      // Flag

    first = 0;
    last = array.length - 1;
    position = -1;
    found = false;
    while (!found && first <= last)
    {
        middle = (first + last) / 2;        // Note: integer division
        if (array[middle] == value)         // If value is found at midpoint...
        {
            found = true;
            position = middle;
        }
        else if (array[middle] > value)     // else if value is in lower half...
            last = middle - 1;
        else                                // else if value is in upper half...
            first = middle + 1;
    }
    return position;    // Return the position of the item, or -1 if it was not found.
}
The Binary Search Algorithm How would you modify the method binarySearch if you wanted to search an array of floats that was sorted in descending order? Try this on a computer.
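One way to approach this exercise, as a sketch and not the only answer: change the element and value types to float, and flip the comparison that decides which half to keep, because in descending order the larger values sit to the left of the middle. The class name DescendingBinarySearch is hypothetical. The sketch uses exact == on floats, which is fine for values stored and searched unchanged but can miss values produced by arithmetic.

```java
class DescendingBinarySearch {
    static int binarySearch(float[] array, float value) {
        int first = 0;
        int last = array.length - 1;
        while (first <= last) {
            int middle = (first + last) / 2; // integer division
            if (array[middle] == value) {
                return middle;
            } else if (array[middle] < value) {
                // Descending order: anything larger than array[middle] is to its left.
                last = middle - 1;
            } else {
                first = middle + 1;
            }
        }
        return -1; // value is not in the array
    }

    public static void main(String[] args) {
        float[] data = {9.5f, 7.25f, 3.0f, 1.5f, -2.0f}; // illustrative data only
        System.out.println(binarySearch(data, 3.0f)); // prints 2
        System.out.println(binarySearch(data, 4.0f)); // prints -1
    }
}
```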
The Performance of the Binary Search Algorithm Every time the binary search makes a comparison with the middle element and fails to find the desired item, half the array is eliminated. In the worst case, the number of middle elements that must be examined is k, where k is the smallest integer such that 2^k is greater than the number of elements in the array, n. For example, an array of 250 elements needs at most 8 such steps, because 2^8 = 256. If there are n elements in the array, binary search has a worst case performance of order log2 n.
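The halving argument can be checked empirically. This is a minimal sketch under one assumption: it simulates a search for a value larger than every element, which always keeps the larger remaining half and so gives the worst case. The class and method names are made up for illustration.

```java
class BinarySearchCost {
    // Counts how many middle elements a binary search examines on an array of
    // n elements when the target is greater than every element.
    static int worstCaseProbes(int n) {
        int first = 0;
        int last = n - 1;
        int probes = 0;
        while (first <= last) {
            int middle = (first + last) / 2;
            probes++;
            first = middle + 1; // the imagined target is always to the right
        }
        return probes;
    }

    public static void main(String[] args) {
        // Prints 3, 4, 8, and 20 probes respectively.
        for (int n : new int[] {7, 8, 250, 1_000_000}) {
            System.out.println(n + " elements: " + worstCaseProbes(n) + " probes");
        }
    }
}
```

A million elements need only 20 probes, compared with up to a million comparisons for the sequential search.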
SORTING Quite often it is necessary to arrange or sort items in a particular order.
SORTING Many sorting algorithms exist that can be used to sort a collection of data items. In this section, we will study the selection sort algorithm and its implementation in a method that sorts the contents of an array of integers.
SORTING A collection of data items may be sorted in ascending or descending order. The values in an array are sorted in ascending order if the value of the element at subscript i is greater than or equal to the value stored at subscript i - 1, for all subscripts i. The values in an array are sorted in descending order if the value of the element at subscript i is less than or equal to the value stored at subscript i - 1, for all subscripts i.
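These two definitions translate directly into checks that compare each element with the one before it. The sketch below uses hypothetical names (SortedCheck, isAscending, isDescending); such helpers are also handy for testing the sorting methods that follow.

```java
class SortedCheck {
    // True if every element is >= the element before it.
    static boolean isAscending(int[] array) {
        for (int i = 1; i < array.length; i++) {
            if (array[i] < array[i - 1]) {
                return false;
            }
        }
        return true;
    }

    // True if every element is <= the element before it.
    static boolean isDescending(int[] array) {
        for (int i = 1; i < array.length; i++) {
            if (array[i] > array[i - 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAscending(new int[] {1, 2, 2, 5}));  // prints true
        System.out.println(isDescending(new int[] {1, 2, 2, 5})); // prints false
    }
}
```

Note that an array whose elements are all equal satisfies both definitions.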
Selection Sort To sort an array in ascending order using selection sort, the array is scanned looking for the smallest value. This value is exchanged with the value at subscript 0. Then the rest of the array is scanned looking for the second smallest value, and this value is exchanged with the value at subscript 1. This process continues until all the elements in the array are in the proper order.
Selection Sort The text provides the following pseudocode for an ascending order selection sort on an array:

For startScan is each subscript in the array from 0 through the next-to-last subscript
    Set minIndex variable to startScan.
    Set minValue variable to array[startScan].
    For index is each subscript in the array from (startScan + 1) through the last subscript
        If array[index] is less than minValue
            Set minValue to array[index].
            Set minIndex to index.
        End If.
        Increment index.
    End For.
    Set array[minIndex] to array[startScan].
    Set array[startScan] to minValue.
End For.
Selection Sort The following is the source code for a method called selectionSort that can be used to sort an integer array in ascending order:

public static void selectionSort(int[] array)
{
    int startScan, index, minIndex, minValue;

    for (startScan = 0; startScan < (array.length - 1); startScan++)
    {
        // Assume the first unsorted element is the smallest so far.
        minIndex = startScan;
        minValue = array[startScan];
        for (index = startScan + 1; index < array.length; index++)
        {
            if (array[index] < minValue)
            {
                minValue = array[index];
                minIndex = index;
            }
        }
        // Exchange the smallest remaining value with the value at startScan.
        array[minIndex] = array[startScan];
        array[startScan] = minValue;
    }
}
Selection Sort What would need to be changed in the selectionSort method if we wanted to sort an array of doubles instead of an array of integers? Make these change(s) and test the result using the computer.
Selection Sort What would need to be changed in the selectionSort method if we wanted to sort an array of strings instead of an array of integers? Make these change(s) and test the result using the computer.
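A possible answer, given as a sketch: besides changing the types to String, the < comparison has to go, because Java does not define < for strings. String's compareTo method returns a negative number when the receiver sorts before its argument. The class name StringSelectionSort is hypothetical, and the sketch assumes no null elements.

```java
class StringSelectionSort {
    static void selectionSort(String[] array) {
        for (int startScan = 0; startScan < array.length - 1; startScan++) {
            int minIndex = startScan;
            String minValue = array[startScan];
            for (int index = startScan + 1; index < array.length; index++) {
                // Negative result: array[index] sorts before minValue.
                if (array[index].compareTo(minValue) < 0) {
                    minValue = array[index];
                    minIndex = index;
                }
            }
            // Exchange the smallest remaining string with the one at startScan.
            array[minIndex] = array[startScan];
            array[startScan] = minValue;
        }
    }

    public static void main(String[] args) {
        String[] fruit = {"pear", "apple", "fig", "banana"}; // illustrative data only
        selectionSort(fruit);
        System.out.println(String.join(" ", fruit)); // prints: apple banana fig pear
    }
}
```

compareTo orders strings by character code, so uppercase letters sort before lowercase ones; compareToIgnoreCase is an alternative when that matters.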
Selection Sort What would need to be changed in the selectionSort method if we wanted to sort the array in descending order instead of ascending order? Make these change(s) and test the result using the computer.
Testing Strategy for a Sorting Algorithm When testing a sorting algorithm, try to sort a list that is in the reverse order of a sorted list, a list that is already sorted, a list that is somewhat out of order, and a list with duplicate elements.
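A simple way to apply this strategy is to compare a sort's output with the library's Arrays.sort on each kind of input. The sketch below is illustrative: the class name SortCheck, the helper sortsCorrectly, and the sample arrays are assumptions, and selectionSort restates the method from the slides.

```java
import java.util.Arrays;

class SortCheck {
    // The selection sort from the slides, with locally declared variables.
    static void selectionSort(int[] array) {
        for (int startScan = 0; startScan < array.length - 1; startScan++) {
            int minIndex = startScan;
            int minValue = array[startScan];
            for (int index = startScan + 1; index < array.length; index++) {
                if (array[index] < minValue) {
                    minValue = array[index];
                    minIndex = index;
                }
            }
            array[minIndex] = array[startScan];
            array[startScan] = minValue;
        }
    }

    // True if selectionSort gives the same result as the library sort.
    static boolean sortsCorrectly(int[] input) {
        int[] expected = input.clone();
        Arrays.sort(expected);
        int[] actual = input.clone();
        selectionSort(actual);
        return Arrays.equals(expected, actual);
    }

    public static void main(String[] args) {
        System.out.println(sortsCorrectly(new int[] {9, 7, 5, 3, 1})); // reverse order
        System.out.println(sortsCorrectly(new int[] {1, 3, 5, 7, 9})); // already sorted
        System.out.println(sortsCorrectly(new int[] {1, 5, 3, 7, 9})); // somewhat out of order
        System.out.println(sortsCorrectly(new int[] {4, 2, 4, 1, 2})); // duplicates
    }
}
```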
Stable Sorting • A property of sorts • If a sort guarantees the relative order of equal items stays the same then it is a stable sort • [7₁, 6, 7₂, 5, 1, 2, 7₃, -5] • subscripts added for clarity • [-5, 1, 2, 5, 6, 7₁, 7₂, 7₃] • result of stable sort • Real world example: • sort a table in Wikipedia by one criterion, then another • sort by country, then by major wins
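The two-pass table example can be sketched with Java's Arrays.sort for object arrays, which the Java documentation guarantees to be stable. Everything else here is hypothetical: the Player class, the method name, and the made-up rows. After sorting by country and then by wins, players with equal wins remain in country order.

```java
import java.util.Arrays;
import java.util.Comparator;

class StableSortDemo {
    // Hypothetical table row; the data used below is invented for illustration.
    static class Player {
        final String name;
        final String country;
        final int wins;

        Player(String name, String country, int wins) {
            this.name = name;
            this.country = country;
            this.wins = wins;
        }
    }

    // Sorts a copy by country, then by wins (most first), and returns the names.
    static String[] namesAfterTwoSorts(Player[] players) {
        Player[] copy = players.clone();
        Arrays.sort(copy, Comparator.comparing((Player p) -> p.country));
        // Stable: ties on wins keep the country order from the first pass.
        Arrays.sort(copy, Comparator.comparingInt((Player p) -> p.wins).reversed());
        String[] names = new String[copy.length];
        for (int i = 0; i < copy.length; i++) {
            names[i] = copy[i].name;
        }
        return names;
    }

    public static void main(String[] args) {
        Player[] players = {
            new Player("A", "Spain", 2), new Player("B", "Australia", 1),
            new Player("C", "Spain", 1), new Player("D", "Australia", 2)
        };
        System.out.println(String.join(" ", namesAfterTwoSorts(players))); // prints: D A B C
    }
}
```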
Insertion Sort • Another of the O(N^2) sorts • The first item is sorted • Compare the second item to the first • if smaller swap • Third item, compare to item next to it • need to swap • after swap compare again • And so forth…
Insertion Sort Code

public void insertionSort(int[] list)
{
    int temp, j;
    for (int i = 1; i < list.length; i++)
    {
        temp = list[i];     // the item being inserted into the sorted part
        j = i;
        while (j > 0 && temp < list[j - 1])
        {
            // swap elements
            list[j] = list[j - 1];
            list[j - 1] = temp;
            j--;
        }
    }
}
Comparing Algorithms • Which algorithm do you think will be faster given random data, selection sort or insertion sort? • Why?
Sub Quadratic Sorting Algorithms Sub Quadratic means having a Big O better than O(N^2)
Shell Sort • Created by Donald Shell in 1959 • Wanted to stop moving data small distances (in the case of insertion sort and bubble sort) and stop making swaps that are not helpful (in the case of selection sort) • Start with sub arrays created by looking at data that is far apart and then reduce the gap size
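The idea of sorting far-apart elements first and then shrinking the gap can be sketched as a gapped insertion sort. The slides do not specify a gap sequence, so this sketch assumes the simple choice of halving the gap each pass; other sequences are common. The class name ShellSortSketch is hypothetical.

```java
class ShellSortSketch {
    static void shellSort(int[] list) {
        // Assumed gap sequence: length/2, length/4, ..., 1.
        for (int gap = list.length / 2; gap > 0; gap /= 2) {
            // Insertion sort on elements that are gap positions apart.
            for (int i = gap; i < list.length; i++) {
                int temp = list[i];
                int j = i;
                while (j >= gap && list[j - gap] > temp) {
                    list[j] = list[j - gap]; // shift the larger element gap places right
                    j -= gap;
                }
                list[j] = temp;
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {39, 23, 17, 90, 33, 72, 46}; // illustrative data only
        shellSort(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [17, 23, 33, 39, 46, 72, 90]
    }
}
```

The final pass with a gap of 1 is an ordinary insertion sort, but by then most elements are already close to their final positions.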
Quick Sort • Invented by C.A.R. (Tony) Hoare • A divide and conquer approach that uses recursion • If the list has 0 or 1 elements it is sorted • otherwise, pick any element p in the list. This is called the pivot value • Partition the list minus the pivot into two sub lists according to values less than or greater than the pivot. (equal values go to either) • return the quicksort of the first list followed by the quicksort of the second list
Quick Sort in Action

39 23 17 90 33 72 46 79 11 52 64 5 71
Pick middle element as pivot: 46
Partition list: 23 17 5 33 39 11 | 46 | 79 72 52 64 90 71
Quicksort the less than list. Pick middle element as pivot: 33
23 17 5 11 | 33 | 39
Quicksort the less than list, pivot now 5
{} | 5 | 23 17 11
Quicksort the less than list: base case
Quicksort the greater than list. Pick middle element as pivot: 17
and so on…
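The recursive description can be sketched directly with lists. This is a teaching sketch, not a production quicksort: it builds new sublists instead of partitioning in place, takes the middle element as the pivot, and sends values equal to the pivot to the second sublist. The class name QuickSortSketch is hypothetical, and the order of elements inside each sublist may differ from the trace above.

```java
import java.util.ArrayList;
import java.util.List;

class QuickSortSketch {
    static List<Integer> quickSort(List<Integer> list) {
        if (list.size() <= 1) {
            return new ArrayList<>(list); // 0 or 1 elements: already sorted
        }
        int pivotIndex = list.size() / 2; // middle element as the pivot
        int pivot = list.get(pivotIndex);
        List<Integer> less = new ArrayList<>();
        List<Integer> greaterOrEqual = new ArrayList<>();
        for (int i = 0; i < list.size(); i++) {
            if (i == pivotIndex) {
                continue; // partition the list minus the pivot
            }
            int value = list.get(i);
            if (value < pivot) {
                less.add(value);
            } else {
                greaterOrEqual.add(value);
            }
        }
        // Sorted first list, then the pivot, then the sorted second list.
        List<Integer> result = quickSort(less);
        result.add(pivot);
        result.addAll(quickSort(greaterOrEqual));
        return result;
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(39, 23, 17, 90, 33, 72, 46, 79, 11, 52, 64, 5, 71);
        System.out.println(quickSort(data));
    }
}
```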
Merge Sort Don Knuth cites John von Neumann as the creator of this algorithm • If a list has 1 element or 0 elements it is sorted • If a list has 2 or more elements, split it into 2 separate lists • Perform this algorithm on each of those smaller lists • Take the 2 sorted lists and merge them together
Merge Sort When implementing, one temporary array is used instead of multiple temporary arrays. Why?
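A sketch of merge sort with a single temporary array follows. One likely reason for the design, offered as a hedged answer to the question: allocating the scratch array once avoids creating and discarding new arrays at every level of recursion. The class and helper names are hypothetical.

```java
class MergeSortSketch {
    static void mergeSort(int[] list) {
        int[] temp = new int[list.length]; // the one scratch array, allocated once
        sort(list, temp, 0, list.length - 1);
    }

    private static void sort(int[] list, int[] temp, int low, int high) {
        if (low >= high) {
            return; // 0 or 1 elements: already sorted
        }
        int mid = low + (high - low) / 2;
        sort(list, temp, low, mid);       // sort the left half
        sort(list, temp, mid + 1, high);  // sort the right half
        merge(list, temp, low, mid, high);
    }

    // Merges the sorted ranges [low, mid] and [mid + 1, high] through temp.
    private static void merge(int[] list, int[] temp, int low, int mid, int high) {
        int left = low;
        int right = mid + 1;
        int out = low;
        while (left <= mid && right <= high) {
            // <= takes from the left half on ties, which keeps the sort stable.
            temp[out++] = (list[left] <= list[right]) ? list[left++] : list[right++];
        }
        while (left <= mid) {
            temp[out++] = list[left++];
        }
        while (right <= high) {
            temp[out++] = list[right++];
        }
        for (int i = low; i <= high; i++) {
            list[i] = temp[i]; // copy the merged range back
        }
    }

    public static void main(String[] args) {
        int[] data = {39, 23, 17, 90, 33, 72, 46}; // illustrative data only
        mergeSort(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [17, 23, 33, 39, 46, 72, 90]
    }
}
```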
Final Comments • Language libraries often have sorting algorithms in them • Java Arrays and Collections classes • C++ Standard Template Library • Python sort and sorted functions • Hybrid sorts • when size of unsorted list or portion of array is small use insertion sort, otherwise use O(N log N) sort like Quicksort or Mergesort • Many other sorting algorithms exist.
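For the Java case, a short sketch of the library calls mentioned above: Arrays.sort for arrays and Collections.sort for lists. The class name LibrarySortDemo and the helper method names are made up; both helpers sort a copy so the caller's data is left unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LibrarySortDemo {
    // Returns a sorted copy of an int array using Arrays.sort.
    static int[] sortedCopy(int[] values) {
        int[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Returns a sorted copy of a list of strings using Collections.sort.
    static List<String> sortedCopy(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedCopy(new int[] {3, 1, 2}))); // prints [1, 2, 3]
        System.out.println(sortedCopy(Arrays.asList("pear", "apple", "fig"))); // prints [apple, fig, pear]
    }
}
```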