Searching and sorting

Searching and sorting
• Reasons for not using the most efficient algorithm include:
  • The more efficient algorithm is much more complicated, so it is harder to maintain and debug.
  • The more efficient algorithm requires another part of the system to prepare the data, and preparing this data takes a long time.
• For example, binary search requires sorted data.
  • As we shall see, sorting algorithms are relatively complicated.
  • And they are slow: the best algorithms are O(n log n).
  • Making searching faster by going from O(n) to O(log n) is not worth it if the price is having to run an O(n log n) sort first.
• However, society has organized itself so that sorting does not have to be done too often.
  • For example, banks only sort their data once per day.
Computer Science I - Martin Hardwick
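To make the O(n) versus O(log n) trade-off concrete, here is a minimal sketch that counts how many list elements each kind of search examines. The function names linear_search and binary_search_index, and the comparisons output parameter, are assumptions made for this illustration and do not come from the slides.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Linear search: works on any list, but may examine all n elements.
// Returns the index of target, or -1 if it is not present.
int linear_search(const std::vector<int>& data, int target, int& comparisons)
{
    comparisons = 0;
    for (int i = 0; i < static_cast<int>(data.size()); i++) {
        comparisons++;
        if (data[i] == target) {
            return i;
        }
    }
    return -1;
}

// Binary search: requires data sorted in ascending order and
// examines roughly log2(n) elements.
// Returns the index of target, or -1 if it is not present.
int binary_search_index(const std::vector<int>& data, int target, int& comparisons)
{
    // Debug-only precondition check. It is itself O(n), so it is meant
    // to be compiled out (NDEBUG) in a real program.
    assert(std::is_sorted(data.begin(), data.end()));

    comparisons = 0;
    int low = 0;
    int high = static_cast<int>(data.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   // middle of the remaining range
        comparisons++;
        if (data[mid] == target) {
            return mid;
        } else if (data[mid] < target) {
            low = mid + 1;                  // discard the lower half
        } else {
            high = mid - 1;                 // discard the upper half
        }
    }
    return -1;
}
```

On a sorted list of 1024 numbers, looking for the last element makes the linear search examine all 1024 elements, while the binary search examines at most 11.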

Creating a Sorted List

void banking::Insert( /* in */ acct item )
// Inserts into a list of bank accounts
// Data is vector <acct>
{
    // index of the last element before the list grows
    int index = static_cast<int>(data.size()) - 1;
    data.push_back (item);   // make room for one more element
    // shift larger elements one slot toward the end
    while (index >= 0 && item.get_num() < data[index].get_num())
    {
        data[index+1] = data[index];
        index--;
    }
    data[index+1] = item;    // Insert item
}

• As we have seen, using the Insert() operation for the SortedList class repeatedly allows us to create a sorted list.
  • But is this efficient?
• In the worst case, each call to Insert() moves the entire list one slot.
  • first call moves 0 items
  • second call moves 1 item
  • third call moves 2 items
  • . . .
  • nth call moves n-1 items
• For a list of length n, this requires:
  0 + 1 + 2 + … + (n-1) = n(n-1)/2 moves
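The worst-case count can be checked with a small experiment. The sketch below assumes plain int values instead of the acct class, and the names insert_sorted and worst_case_moves are hypothetical helpers introduced only for this illustration. It uses the same shifting scheme as Insert() and feeds it values in descending order, which forces every call to shift the whole list.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Inserts item into an ascending vector using the same shifting scheme
// as Insert(). Returns the number of elements that had to be moved.
long insert_sorted(std::vector<int>& data, int item)
{
    long moves = 0;
    int index = static_cast<int>(data.size()) - 1;
    data.push_back(item);
    while (index >= 0 && item < data[index]) {
        data[index + 1] = data[index];
        index--;
        moves++;
    }
    data[index + 1] = item;
    return moves;
}

// Builds a sorted list from the values n, n-1, ..., 1. Each new value is
// smaller than everything already stored, so this is the worst case.
// Returns the total number of moves.
long worst_case_moves(int n)
{
    std::vector<int> data;
    long total = 0;
    for (int value = n; value >= 1; value--) {
        total += insert_sorted(data, value);
    }
    assert(std::is_sorted(data.begin(), data.end()));
    return total;
}
```

For n = 10 the total is 45 moves and for n = 100 it is 4950, matching n(n-1)/2.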

Discussion
• Repeated use of the Insert() operation sorts a list using O(n^2) moves and O(n^2) comparisons.
  • Each move is preceded by a comparison between the variable item and the list element to be moved.
• We can do better than repeated use of the Insert() operation, however.
• There is a simple sorting algorithm that uses O(n^2) comparisons but only O(n) moves.
  • It is called Selection Sort.

Selection Sort (1)

#include <iostream>
#include <fstream>
#include <vector>
using namespace std;

void selection(vector <int> &list)
//PURPOSE: sort list into ascending
//    order using selection sort
//POSTCONDITIONS: reorders list
//    into ascending order
{
    // size of list
    int length = list.size();
    // position of min number
    int minLoc;
    // top of sorted part of list
    int top;

• Given a list of numbers, we want to sort the list into ascending order.
• We will do the sorting in a function that can be moved to any program that needs to sort a list.

Selection Sort Algorithm
• At any point during the algorithm, the first part of the list is sorted and the last part is unsorted.
  • Initially the first part is empty and the last part is the entire list.
• Search the unsorted part of the list for the smallest number.
• Swap this smallest number to the beginning of the unsorted part of the list.
• Repeat this process until the list is sorted.
[Figure: animation of selection sort on a list of numbers, marking the sorted and unsorted parts and the variables top, k, minLoc, and temp]

Selection Sort (2)

    // swap numbers in list
    int temp;
    // loop variable
    int k;

    // loop to sort list
    for (top=0; top<=length-2; top++)
    {
        // find smallest number in unsorted
        // part of the list
        minLoc = top;
        for (k=top+1; k<=length-1; k++)
        {
            if (list[k] < list[minLoc])
            {
                minLoc = k;
            }
        }

• In a list of length n, once n-1 numbers are in their final positions, the list is sorted.
  • By default, the remaining number must be in the correct position.
• Start by assuming that the first number in the unsorted part of the list is the smallest remaining number.
  • Search the rest of the unsorted list for something smaller.
  • Remember where the smallest number is.

Selection Sort (3)

        // swap smallest number
        // into its final position in
        // the sorted list
        temp = list[top];
        list[top] = list[minLoc];
        list[minLoc] = temp;
    }
}

• The program swaps two elements of the list:
  • the smallest remaining number and the first number in the unsorted part of the list.
• Swapping requires an extra variable (named temp in this case) to temporarily hold the value of one of the variables being swapped.
• The sorted part of the list is now one element longer.
  • The unsorted part is one element shorter.
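Because the function is spread over three slides, it may help to see the pieces in one place. The following is a sketch of the same algorithm as a single self-contained function. The name selection_sort and the use of std::swap in place of the temp variable are choices made for this illustration, not part of the slides.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Sorts list into ascending order using selection sort.
void selection_sort(std::vector<int>& list)
{
    int length = static_cast<int>(list.size());

    // Everything before index top is already in its final position.
    for (int top = 0; top <= length - 2; top++) {
        // Find the smallest number in the unsorted part of the list.
        int minLoc = top;
        for (int k = top + 1; k <= length - 1; k++) {
            if (list[k] < list[minLoc]) {
                minLoc = k;
            }
        }
        // Move it to the front of the unsorted part.
        std::swap(list[top], list[minLoc]);
    }

    // Debug-only check of the postcondition.
    assert(std::is_sorted(list.begin(), list.end()));
}
```

Calling selection_sort on a vector holding 20, 10, 19, 9 leaves it holding 9, 10, 19, 20. An empty or one-element vector is left unchanged because the outer loop never runs.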

Selection Sort (3)

int main ()
//PURPOSE: test selection function
//PRECONDITIONS: file list.txt in
//    project folder
//POSTCONDITIONS: rets 0 if success
{
    vector <int> list;
    int k;              // loop variable
    int tmp;            // tmp variable
    ifstream listfile;  // file with list

    // read the numbers into the list
    // and display them on the screen
    listfile.open("list.txt");
    if (!listfile)
    {
        cerr << "Cannot open list.txt" << endl;
        return 1;
    }
    cout << "Original List:" << endl;
    // stop as soon as a read fails, so that the
    // last number is not stored or skipped by mistake
    while (listfile >> tmp)
    {
        cout << tmp << "\t";
        list.push_back (tmp);
    }

• Read the list of numbers from a file named list.txt.
• Display the list before sorting.

Selection Sort (4)

    // sort the list
    selection(list);

    // display the sorted list
    cout << endl << endl << "Sorted List:" << endl;
    for (k=0; k<static_cast<int>(list.size()); k++)
    {
        cout << list[k] << "\t";
    }
    cout << endl << endl;
    return 0;
}

• Function selection() sorts the list.
  • Remember that the vector must be passed by reference; otherwise the function would sort a copy and the caller's list would be unchanged.
• After sorting, the list is displayed again so that the new sorted order can be seen.

Program Results
[Figure: program output showing the original list and the sorted list]

Performance Of Selection Sort
• The selection sort function has nested for loops as follows:

    for (top=0; top<=length-2; top++)
    {
        . . .
        for (k=top+1; k<=length-1; k++)
        {
            comparison of list elements
        }
    }

• The outer loop is executed n-1 times, where n is the length of the list.
• The inner loop executes n-1, n-2, n-3, . . ., 2, 1 times on successive iterations of the outer loop, respectively.
  • This is an average of about n/2 times for large n.
• Thus, the comparison inside the inner loop is made about (n-1) x (n/2) = (n^2 - n)/2 times.
• Therefore, selection sort is an O(n^2) algorithm because of the number of comparisons that it does.
• However, it does O(n) moves, making it more efficient than repeated calls to the Insert() operation for the SortedList class.
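These counts can be confirmed by instrumenting the loops. The sketch below is an illustration only: the SortCounts struct and the function counted_selection_sort are hypothetical names, and the counters are added purely for measurement. As in the slides, the swap is done on every pass of the outer loop, even when the smallest number is already in place.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Totals collected while sorting.
struct SortCounts {
    long comparisons;  // executions of the if inside the inner loop
    long swaps;        // swaps done by the outer loop
};

// Selection sort that also reports how much work it did.
SortCounts counted_selection_sort(std::vector<int>& list)
{
    SortCounts counts = {0, 0};
    int length = static_cast<int>(list.size());

    for (int top = 0; top <= length - 2; top++) {
        int minLoc = top;
        for (int k = top + 1; k <= length - 1; k++) {
            counts.comparisons++;
            if (list[k] < list[minLoc]) {
                minLoc = k;
            }
        }
        std::swap(list[top], list[minLoc]);
        counts.swaps++;
    }

    assert(std::is_sorted(list.begin(), list.end()));
    return counts;
}
```

For a list of 10 numbers this reports 45 comparisons, which is (10^2 - 10)/2, and 9 swaps. For 100 numbers it reports 4950 comparisons and 99 swaps. The counts depend only on the length of the list, not on the order of its contents.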

What Does This Mean?
• The most efficient comparison-based sorting algorithms can be shown to be O(n log2 n), where n is the size of the list to sort.
• As can be seen in the graph, O(n log2 n) is much better than O(n^2) when n gets large.
[Figure: graph comparing the growth of n^2, n log2 n, and n]
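In place of the graph, the gap can be shown with a few numbers. This is a small sketch with hypothetical helper names (n_squared, n_log2_n, print_growth_table). It only evaluates the two growth formulas and does not time any real sort.

```cpp
#include <cassert>
#include <cmath>
#include <iostream>

// Number of steps for a quadratic algorithm on n items.
double n_squared(double n)
{
    return n * n;
}

// Number of steps for an n log2 n algorithm on n items.
double n_log2_n(double n)
{
    assert(n > 0);  // log2 is only defined for positive n
    return n * std::log2(n);
}

// Prints both values for list sizes 16, 256, 4096, 65536 and 1048576.
void print_growth_table()
{
    std::cout << "n\tn log2 n\tn^2" << std::endl;
    for (double n = 16; n <= 1048576; n *= 16) {
        std::cout << n << "\t" << n_log2_n(n) << "\t" << n_squared(n) << std::endl;
    }
}
```

For n = 1024, n^2 is 1,048,576 while n log2 n is 10,240, about a hundred times smaller, and the ratio keeps growing as n increases.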

More efficient sorting
• The best sorting algorithms are O(n log n).
• These algorithms are most naturally written using a programming technique called recursion.
• A recursive function is one that calls itself.
• A recursive function must have at least two parts:
  • A part that solves a simple case of the problem without recursion.
  • A part that makes the problem simpler and then uses recursion.
• Here is a simple recursive function:

    int factorial (int val)
    {
        if (val <= 1)   // simple part (also stops 0 and negative values)
            return 1;
        else            // simplification part
            return (val * factorial (val-1));
    }

• More on recursion next time.
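As a preview of how the two parts of a recursive function show up in a sort, here is a minimal sketch of merge sort, one well-known O(n log n) algorithm. It is not taken from the slides. The name merge_sort and the half-open range [low, high) are choices made for this illustration, and the merging step is delegated to the standard library function std::inplace_merge rather than written out by hand.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sorts the elements list[low], ..., list[high-1] into ascending order.
void merge_sort(std::vector<int>& list, int low, int high)
{
    assert(0 <= low && low <= high && high <= static_cast<int>(list.size()));

    // Simple part: a range of 0 or 1 elements is already sorted.
    if (high - low <= 1) {
        return;
    }

    // Simplification part: sort each half with a recursive call,
    // then merge the two sorted halves into one sorted range.
    int mid = low + (high - low) / 2;
    merge_sort(list, low, mid);
    merge_sort(list, mid, high);
    std::inplace_merge(list.begin() + low, list.begin() + mid, list.begin() + high);
}
```

To sort a whole vector v, call merge_sort(v, 0, v.size()) with the size converted to int. The range is halved about log2 n times and each level of halving merges n elements in total, which is where the n log n comes from, assuming the library can obtain a temporary buffer so that each merge takes linear time.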
