530 likes | 640 Views
C++ Programming: Program Design Including Data Structures, Second Edition. Chapter 19: Searching and Sorting Algorithms. Objectives. In this chapter you will: Learn various search algorithms Explore how to implement the sequential and binary search algorithms
E N D
C++ Programming: Program Design Including Data Structures, Second Edition Chapter 19: Searching and Sorting Algorithms
Objectives In this chapter you will: • Learn various search algorithms • Explore how to implement the sequential and binary search algorithms • Discover how the sequential and binary search algorithms perform • Become familiar with asymptotic notation C++ Programming: Program Design Including Data Structures, Second Edition
Objectives • Learn the various sorting algorithms • Explore how to implement the bubble, selection, insertion, quick, and merge sorting algorithms • Discover how the sorting algorithms discussed in this chapter perform C++ Programming: Program Design Including Data Structures, Second Edition
Search Algorithms • The most important operation on a list is the search algorithm • Using a search algorithm you can: • Determine whether a particular item is in the list • If the data are specially organized, find the location where a new item can be inserted • Find the location of an item to be deleted • The search algorithm’s performance (search time) is crucial C++ Programming: Program Design Including Data Structures, Second Edition
The Sequential Search • Always starts at the first element in the list • Continues until either the item is found in the list or the entire list is searched • If found, its index is returned • If not found, -1 is returned • To analyze a search algorithm, count the number of key comparisons C++ Programming: Program Design Including Data Structures, Second Edition
The Sequential Search (continued) • Best Case: search item is the first element in the list • The algorithm makes one comparison • Worst Case: search item is the last element in the list • The algorithm makes n comparisons, where n is the number of elements in the list C++ Programming: Program Design Including Data Structures, Second Edition
Sequential Search Algorithm • Best and worst cases are not likely to occur • To determine the average number of comparisons for the successful case of the sequential search algorithm: • Consider all possible cases • Find the number of comparisons for each case • Add the number of comparisons and divide by the number of cases C++ Programming: Program Design Including Data Structures, Second Edition
Sequential Search Algorithm (continued) • Assuming n elements in the list, the following expression gives the average number of comparisons 1+2+…+n n • It is known that 1+2+…+n = n(n+1) 2 C++ Programming: Program Design Including Data Structures, Second Edition
Sequential Search Algorithm • The average number of comparisons in the successful case is: 1+2+…+n = 1 n(n+1) = n+1 n n 2 2 • On average the sequential search algorithm searches half the list • Not good for large lists C++ Programming: Program Design Including Data Structures, Second Edition
Ordered Lists • A list is ordered if its elements are sequenced according to some criteria • Usually in ascending order • These operations are the same for ordered or unordered lists • If list is empty or full • Length of the list • Print the list • Clear the list C++ Programming: Program Design Including Data Structures, Second Edition
Binary Search • A sequential search is not good for large lists because, on average, the sequential search searches half the list • A binary search • Very fast • Can only be performed on ordered lists • Uses the “divide and conquer” technique to search the list C++ Programming: Program Design Including Data Structures, Second Edition
Binary Search (continued) • First the search item is compared with the middle element of the list • If the search item is less than the middle element of the list • Restrict the search to the upper half of the list • Otherwise, search the lower half of the list • Because we need to determine the middle element of the list frequently • Binary search algorithm is typically implemented for array-based lists C++ Programming: Program Design Including Data Structures, Second Edition
Binary Search (continued) • To determine the middle element of the list • Add the starting index to the ending index • Divide by 2 to calculate its index That is: mid = (first + last) / 2 • If the item is found, its location is returned • If it is not in the list, -1 is returned • Two comparisons each time through the loop C++ Programming: Program Design Including Data Structures, Second Edition
Performance of Binary Search • Every iteration cuts the search list in half • Assume n = 1000 • Because 1000 210, the while loop has at most 11 iterations to determine if x is in L • Each iteration makes two comparisons • The binary search makes at most 22 comparisons to determine whether x is in L • By comparison sequential search makes 500 comparisons on average C++ Programming: Program Design Including Data Structures, Second Edition
Performance of Binary Search (continued) • Suppose L is of size 1,000,000 • 1,000,000 220 , therefore the loop has approximately 21 iterations to determine whether an element is in L • Each iteration makes 2 comparisons • Therefore to determine whether an element is in L, a binary search makes at most 42 comparisons C++ Programming: Program Design Including Data Structures, Second Edition
Performance of Binary Search (continued) • By contrast, on average the sequential search makes 500,000 comparisons to determine whether an element is in L • If L is a sorted list of size n, to determine whether or not an element is in L, a binary search makes at most 2*log2n + 2 comparisons C++ Programming: Program Design Including Data Structures, Second Edition
Searches • Unsuccessful Search • In the case of an unsuccessful search of a list of length n, the number of comparisons made by the binary search is approximately: 2*log2 (n + 1) comparisons • Successful Search • In the case of a successful search it can be shown that for a list of length n, on average, a binary search makes: 2(n + 1) log2 (n + 1) – 3 n C++ Programming: Program Design Including Data Structures, Second Edition
Insertion into an Ordered List • To insert an item in an ordered list: • Use an algorithm similar to a binary search to find the place where the item is to be inserted • If item is already in this list Output an appropriate message Else Use insertAt to insert the item in the list C++ Programming: Program Design Including Data Structures, Second Edition
Asymptotic Notations • For a list of size n, assume the function f(n) gives the number of comparisons done by the search algorithm • If we know how the function f(n) grows as the size of the problem grows, we can determine the efficiency of the algorithm C++ Programming: Program Design Including Data Structures, Second Edition
Asymptotic Notations (continued) • This table shows how certain functions grow as the problem size (n) grows C++ Programming: Program Design Including Data Structures, Second Edition
Asymptotic Notations (continued) • If the number of basic operations is a function of f(n) = n2, then the number of operations is quadrupled when n is doubled • If the number of basic operations is a function of f(n) = 2n, then the number of operations is squared when n is doubled • If the number of operations is a function of f(n) = log2 n, then the change in the number of basic operations is insignificant C++ Programming: Program Design Including Data Structures, Second Edition
Asymptotic Notations (continued) • We say that f(n) is Big-O of g(n) written f (n)=O( g(n)), if there exists positive constants c and n0 such that • f(n) ≤ cg(n) for all n ≥ n0 C++ Programming: Program Design Including Data Structures, Second Edition
Asymptotic Notations (continued) C++ Programming: Program Design Including Data Structures, Second Edition
Asymptotic Notations (continued) • Algorithm analysis of the sequential and binary search algorithm C++ Programming: Program Design Including Data Structures, Second Edition
Lower Bound on Comparison-Based Search Algorithms • Theorem: Let L be a list of size n > 1 • Suppose that the elements of L are sorted • If SRH(n) is the minimum number of comparisons needed, in the worst case, by using a comparison-based algorithm to recognize whether an element x is in L, then SRH(n)ƒ≥ƒlog2(n + 1) • Corollary: The binary search algorithm is the optimal worst-case algorithm for solving search problems by the comparison method C++ Programming: Program Design Including Data Structures, Second Edition
Selection Sort: Array-Based Lists • Selection sort algorithm: sorts a list by selecting the smallest element in the list and then moving this element to the top of the list • The first time we locate the smallest item in the entire list • The second time we locate the smallest item in the rest of the list starting, etc. C++ Programming: Program Design Including Data Structures, Second Edition
Bubble Sort • Suppose list[0]...list[n - 1] is a list of n elements, indexed 0 to n – 1 • The bubble sort algorithm works as follows: • During n - 1 iterations, the successive elements, list[index] and list[index+1] are compared • If list[index] is greater than list[index + 1], then the elements list[index] and list[index + 1] are swapped (interchanged) • The bubble sort algorithm is of O(n2) C++ Programming: Program Design Including Data Structures, Second Edition
Bubble Sort (continued) C++ Programming: Program Design Including Data Structures, Second Edition
Bubble Sort (continued) C++ Programming: Program Design Including Data Structures, Second Edition
Bubble Sort (continued) C++ Programming: Program Design Including Data Structures, Second Edition
Bubble Sort (continued) C++ Programming: Program Design Including Data Structures, Second Edition
Selection Sort: Array-Based Lists • For a list of length n > 0, selection sort makes 1/2n2 – 1/2n key comparisons and 3(n-1) item assignments • Selection sorts the list by finding the smallest (or largest) element in the list, and moving it to the beginning (end) of the list C++ Programming: Program Design Including Data Structures, Second Edition
Insertion Sort: Array-Based Lists • Insertion sort algorithm sorts the list by moving each element to its proper place • For a list of length n > 0, on average, the insertion sort makes 1/4n2 +O(n) key comparisons and 1/4n2 + O(n) item assignments C++ Programming: Program Design Including Data Structures, Second Edition
Insertion Sort: Linked List-Based Lists • The insertion sort algorithm can also be applied to linked lists • In a linked list, traversal is in only one direction starting at the first node C++ Programming: Program Design Including Data Structures, Second Edition
Analysis: Insertion Sort • The average number of comparisons and the average number of item assignments in an insertion sort algorithm are: 1 n2 + O(n) = O(n2) 4 • Average Case Behavior for a list of length n C++ Programming: Program Design Including Data Structures, Second Edition
Lower Bound on Comparison-Based Sort Algorithms • Execution of a comparison-based algorithm using a graph is called a comparison tree • Each comparison is a circle, called a node; the rectangle, called a leaf, represents the final ordering of the nodes • The top node is the root node • The straight line that connects the two nodes is called a branch • A sequence of branches from a node, x, to another node, y, is called a path from x to y C++ Programming: Program Design Including Data Structures, Second Edition
Lower Bound on Comparison-Based Sort Algorithms (continued) • Theorem: Let L be a list of n distinct elements • Any sorting algorithm that sorts L by comparisons of keys only, in its worst case, makes at least O(n*log2n) key comparisons C++ Programming: Program Design Including Data Structures, Second Edition
Quick Sort: Array-Based Lists • The quick sort algorithm uses the divide-and-conquer technique to sort a list • The list is partitioned into two sublists, and the two sublists are then sorted and combined into one sorted list C++ Programming: Program Design Including Data Structures, Second Edition
Quick Sort: Array-Based Lists (continued) • The general algorithm is: if(list size is greater than 1) { a. Partition the list into two sublists, say lowerSublist and upperSublist. b. Quick sort lowerSublist. c. Quick sort upperSublist d. Combine the sorted lowerSublist and sorted upperSublist } C++ Programming: Program Design Including Data Structures, Second Edition
Analysis: Quick Sort • Analysis of Quick Sort Algorithm for List of length n C++ Programming: Program Design Including Data Structures, Second Edition
Merge Sort: Linked List-Based Lists • Merge sort uses divide-and-conquer • Partition the list into two sublists, and then combine the sorted sublists into one sorted list • For example, consider the list: List: 35 28 18 45 62 48 30 38 • Merge sort partitions this list into two sublists as follows first sublist: 35 28 18 45 second sublist: 62 48 30 38 C++ Programming: Program Design Including Data Structures, Second Edition
Divide • Because the data are stored in a linked list, we do not know the length of the list • To find the middle of the list we traverse the list with two pointers, say middle and current C++ Programming: Program Design Including Data Structures, Second Edition
Merge • Once the sublists are sorted, merge the sorted sublists • Sublists are merged by comparing the elements of the sublists and adjusting the pointers of the nodes with the smaller info C++ Programming: Program Design Including Data Structures, Second Edition
Analysis: Merge Sort • L is a list of n elements, where n > 0 • A(n) is the number of key comparisons in the average case • W(n) is the number of key comparisons in the worst case to sort L • A(n) = n*log2n - 1.26 = O(n*log2n) • W(n) = n*log2n - (n-1) = O(n*log2n) C++ Programming: Program Design Including Data Structures, Second Edition