Computer Science 112
Fundamentals of Programming II
Finding Faster Algorithms

Example: Exponentiation

Recursive definition:
    b^n = 1, when n = 0
    b^n = b * b^(n-1) otherwise

    def ourPow(base, expo):
        if expo == 0:
            return 1
        else:
            return base * ourPow(base, expo - 1)

What is the best case performance? Worst case? Average case?
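
One way to explore the performance question is to count the recursive calls. The sketch below is an illustration only; the name our_pow_counted and the idea of returning a call count alongside the result are assumptions, not part of the course code.

```python
def our_pow_counted(base, expo):
    """Return (base ** expo, number of calls made), using the linear definition."""
    if expo == 0:
        return 1, 1                      # base case: one call, no multiplication
    result, calls = our_pow_counted(base, expo - 1)
    return base * result, calls + 1      # one more call for this level


for n in (0, 1, 10, 100):
    value, calls = our_pow_counted(2, n)
    print(n, calls)                      # calls is always n + 1
```

The count is n + 1 for every exponent n, so best, worst, and average case all grow linearly with the exponent.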

Faster Exponentiation

Recursive definition:
    b^n = 1, when n = 0
    b^n = b * b^(n-1), when n is odd
    b^n = (b^(n/2))^2, when n is even

    def fastPow(base, expo):
        if expo == 0:
            return 1
        elif expo % 2 == 1:
            return base * fastPow(base, expo - 1)
        else:
            result = fastPow(base, expo // 2)
            return result * result

What is the best case performance? Worst case? Average case?
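
Counting calls again suggests how much faster this definition is. This is a sketch under the same assumptions as before: fast_pow_counted is a hypothetical name, and the call counter is added purely for illustration.

```python
def fast_pow_counted(base, expo):
    """Return (base ** expo, number of calls made), using the halving definition."""
    if expo == 0:
        return 1, 1
    if expo % 2 == 1:
        # Odd exponent: peel off one factor, leaving an even exponent
        result, calls = fast_pow_counted(base, expo - 1)
        return base * result, calls + 1
    # Even exponent: compute the half power once and square it
    half, calls = fast_pow_counted(base, expo // 2)
    return half * half, calls + 1


for n in (10, 1023, 1024):
    value, calls = fast_pow_counted(2, n)
    print(n, calls)
```

An odd step is always followed by an even (halving) step, so the number of calls stays within roughly 2 * log2(n), compared with n + 1 for the linear version.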

The Fibonacci Series

    1 1 2 3 5 8 13 . . .

    fib(n) = 1, when n = 1 or n = 2
    fib(n) = fib(n - 1) + fib(n - 2) otherwise

    def fib(n):
        if n == 1 or n == 2:
            return 1
        else:
            return fib(n - 1) + fib(n - 2)

Tracing fib(5) with a Call Tree

    fib(5)
        fib(4)
            fib(3)
                fib(2) -> 1
                fib(1) -> 1
            fib(2) -> 1
        fib(3)
            fib(2) -> 1
            fib(1) -> 1

Work Done – Function Calls

    fib(5)
        fib(4)
            fib(3)
                fib(2) -> 1
                fib(1) -> 1
            fib(2) -> 1
        fib(3)
            fib(2) -> 1
            fib(1) -> 1

Somewhere between 1^n and 2^n
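
The growth in function calls can be measured directly. The sketch below assumes a hypothetical helper, fib_counted, that returns the call count together with the Fibonacci value; it is not part of the course code.

```python
def fib_counted(n):
    """Return (fib(n), number of calls made) for the plain recursive definition."""
    if n == 1 or n == 2:
        return 1, 1
    a, calls_a = fib_counted(n - 1)
    b, calls_b = fib_counted(n - 2)
    return a + b, calls_a + calls_b + 1   # +1 for this call


for n in (5, 10, 20):
    value, calls = fib_counted(n)
    print(n, value, calls)
```

For n = 5 this reports 9 calls, matching the nodes in the call tree. The count works out to 2 * fib(n) - 1, so it grows as fast as the Fibonacci numbers themselves.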

Memoization

    def fib(n):
        if n == 1 or n == 2:
            return 1
        else:
            return fib(n - 1) + fib(n - 2)

Intermediate values returned by the function can be memoized, or saved in a cache, for subsequent access
Then they don't have to be recomputed!

Memoization

    def fib(n):
        cache = dict()

        def fastFib(n):
            if n == 1 or n == 2:
                return 1
            elif n in cache:
                return cache[n]
            else:
                value = fastFib(n - 1) + fastFib(n - 2)
                cache[n] = value
                return value

        return fastFib(n)

The cache is a dictionary whose keys are the arguments of fib and whose values are the values of fib at those keys
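
To see the effect of the cache, the same structure can be instrumented with a call counter. This is a sketch only: fib_memo_counted and the calls variable are assumptions added for illustration.

```python
def fib_memo_counted(n):
    """Return (fib(n), number of calls to the inner function) with memoization."""
    cache = dict()
    calls = 0

    def fast_fib(k):
        nonlocal calls
        calls += 1
        if k == 1 or k == 2:
            return 1
        elif k in cache:
            return cache[k]               # already computed: no further recursion
        else:
            value = fast_fib(k - 1) + fast_fib(k - 2)
            cache[k] = value
            return value

    return fast_fib(n), calls


for n in (5, 20, 30):
    value, calls = fib_memo_counted(n)
    print(n, value, calls)
```

For n >= 2 the count is 2n - 3 calls (7 for n = 5, 37 for n = 20), so the work becomes linear in n rather than exponential.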

Improving on n^2 Sorting

• Selection sort uses a linear method within a linear method, so it's an O(n^2) method
• Find a way of using a linear method with a method that's better than linear
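
The "linear method within a linear method" shape is visible as two nested loops. The sketch below is one common formulation of selection sort, with a comparison counter added; the function name and the counter are assumptions for illustration.

```python
def selection_sort_counted(lyst):
    """Sort lyst in place with selection sort; return the number of comparisons."""
    comparisons = 0
    n = len(lyst)
    for i in range(n - 1):                # outer linear pass
        min_index = i
        for j in range(i + 1, n):         # inner linear search for the minimum
            comparisons += 1
            if lyst[j] < lyst[min_index]:
                min_index = j
        lyst[i], lyst[min_index] = lyst[min_index], lyst[i]
    return comparisons


data = [89, 56, 63, 72, 34, 95, 41]
print(selection_sort_counted(data), data)
```

The count is n(n - 1)/2 regardless of the input order, which is where the O(n^2) behavior comes from.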

A Hint from Binary Search

• Binary search is better than linear, because we divide the problem size by 2 on each step
• Find a way of dividing the size of the sorting problem by 2 on each step, even though each step will itself be linear
• This should produce an O(n log n) algorithm
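
The arithmetic behind the hint can be checked with a few lines. This sketch assumes a hypothetical helper, halving_steps, that counts how many times a problem of size n can be halved before it reaches size 1.

```python
def halving_steps(n):
    """Return how many times n can be divided by 2 (integer division) before reaching 1."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


for n in (8, 1000, 1024):
    steps = halving_steps(n)
    # n * steps approximates the work if each of the `steps` levels costs about n
    print(n, steps, n * steps, n * n)
```

For n = 1024 there are 10 halvings, so linear work at each level gives about 10,240 steps, against roughly a million for an n^2 method.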

Quick Sort

• Select a pivot element (say, the element at the midpoint)
• Shift all of the smaller values to the left of the pivot, and all of the larger values to the right of the pivot (the linear part)
• Sort the values to the left and to the right of the pivot (ideally, done log n times)

Trace of Quick Sort

[Figure: a list of seven items (34, 41, 56, 63, 72, 89, 95) at positions 0 to 6, with the pivot marked]

Step 1: select the pivot (at the midpoint)
Step 2: shift the data

Trace of Quick Sort

[Figure: the same seven-item list at positions 0 to 6, with the pivot marked]

Step 3: sort to the left of the pivot
Step 4: sort to the right of the pivot

Design of Quick Sort: First Cut

    quickSort(lyst, left, right)
        if left < right
            pivotPosition = partition(lyst, left, right)
            quickSort(lyst, left, pivotPosition - 1)
            quickSort(lyst, pivotPosition + 1, right)

Design of Quick Sort: First Cut

    quickSort(lyst, left, right)
        if left < right
            pivotPosition = partition(lyst, left, right)
            quickSort(lyst, left, pivotPosition - 1)
            quickSort(lyst, pivotPosition + 1, right)

    partition(lyst, left, right)
        pivotValue = lyst[(left + right) // 2]
        shift smaller values to left of pivotValue
        shift larger values to right of pivotValue
        return pivotPosition

• This version selects the midpoint element as the pivot
• The position of the pivot might change during the shifting of data

Implementation of Partition

    # Assumes a helper swap(lyst, i, j), defined elsewhere,
    # that exchanges the items at positions i and j
    def partition(lyst, left, right):
        # Find the pivot and exchange it with the last item
        middle = (left + right) // 2
        pivot = lyst[middle]
        lyst[middle] = lyst[right]
        lyst[right] = pivot
        # Set boundary point to first position
        boundary = left
        # Move items less than pivot to the left
        for index in range(left, right):
            if lyst[index] < pivot:
                swap(lyst, index, boundary)
                boundary += 1
        # Exchange the pivot item and the boundary item
        swap(lyst, right, boundary)
        return boundary

The number of comparisons required to shift values in each sublist is about equal to the size of the sublist (one comparison per item other than the pivot).
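
A single call to partition can be traced on a small list. In this sketch the swap helper is an assumed definition (the slide calls it without showing it), and the order of the sample list is chosen here for illustration.

```python
def swap(lyst, i, j):
    """Exchange the items at positions i and j (assumed helper)."""
    lyst[i], lyst[j] = lyst[j], lyst[i]


def partition(lyst, left, right):
    # Find the pivot and exchange it with the last item
    middle = (left + right) // 2
    pivot = lyst[middle]
    lyst[middle] = lyst[right]
    lyst[right] = pivot
    # Everything left of boundary is known to be less than the pivot
    boundary = left
    for index in range(left, right):
        if lyst[index] < pivot:
            swap(lyst, index, boundary)
            boundary += 1
    # Put the pivot between the smaller and the larger items
    swap(lyst, right, boundary)
    return boundary


data = [89, 56, 63, 72, 34, 95, 41]     # illustrative order, pivot is 72
position = partition(data, 0, len(data) - 1)
print(position, data)                   # 4 [56, 63, 41, 34, 72, 95, 89]
```

After the call, every item left of position 4 is less than 72 and every item to its right is not, although neither side is sorted yet.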

    # Assumes a helper swap(lyst, i, j), defined elsewhere,
    # that exchanges the items at positions i and j
    def quickSort(lyst):

        def recurse(left, right):
            if left < right:
                pivotPosition = partition(lyst, left, right)
                recurse(left, pivotPosition - 1)
                recurse(pivotPosition + 1, right)

        def partition(lyst, left, right):
            # Find the pivot and exchange it with the last item
            middle = (left + right) // 2
            pivot = lyst[middle]
            lyst[middle] = lyst[right]
            lyst[right] = pivot
            # Set boundary point to first position
            boundary = left
            # Move items less than pivot to the left
            for index in range(left, right):
                if lyst[index] < pivot:
                    swap(lyst, index, boundary)
                    boundary += 1
            # Exchange the pivot item and the boundary item
            swap(lyst, right, boundary)
            return boundary

        recurse(0, len(lyst) - 1)
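
A self-contained version that can be run and checked against Python's built-in sorted is sketched below. The swap helper is an assumed definition, and the random test data is an illustration only.

```python
import random


def swap(lyst, i, j):
    """Exchange the items at positions i and j (assumed helper)."""
    lyst[i], lyst[j] = lyst[j], lyst[i]


def partition(lyst, left, right):
    """Partition lyst[left..right] around its midpoint item; return the pivot's final position."""
    middle = (left + right) // 2
    pivot = lyst[middle]
    lyst[middle] = lyst[right]
    lyst[right] = pivot
    boundary = left
    for index in range(left, right):
        if lyst[index] < pivot:
            swap(lyst, index, boundary)
            boundary += 1
    swap(lyst, right, boundary)
    return boundary


def quickSort(lyst):
    """Sort lyst in place."""
    def recurse(left, right):
        if left < right:
            pivotPosition = partition(lyst, left, right)
            recurse(left, pivotPosition - 1)
            recurse(pivotPosition + 1, right)

    recurse(0, len(lyst) - 1)


data = [random.randint(0, 99) for _ in range(20)]
expected = sorted(data)
quickSort(data)
print(data == expected)
```

Comparing against sorted on random input, including lists with duplicates, is a quick sanity check that the partition logic handles items equal to the pivot.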

Complexity Analysis

The number of comparisons in the top-level call is n
The sum of the comparisons in the two recursive calls is also n
The sum of the comparisons in the four recursive calls beneath these is also n, etc.
Thus, the total number of comparisons equals n * the number of times the list must be subdivided

How Many Times Must the Array Be Subdivided?

• It depends on the data and on the choice of the pivot element
• Ideally, when the pivot is the median on each call, the list is subdivided log2(n) times
• Best-case behavior is O(n log n)

Call Tree For a Best Case

    34 41 56 63 72 89 95
        34 41 56
            34
            56
        72 89 95
            72
            95

We select the midpoint element as the pivot.
The median element happens to be at the midpoint on each call.
But the list was already sorted!
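
The best case can be measured by counting comparisons on an already sorted list with the midpoint pivot. This is a sketch: quick_sort_counted and its counter are assumptions, and the partition logic is written inline with tuple swaps instead of a swap helper.

```python
def quick_sort_counted(lyst):
    """Sort lyst in place with a midpoint pivot; return the number of comparisons."""
    comparisons = 0

    def partition(left, right):
        nonlocal comparisons
        middle = (left + right) // 2
        pivot = lyst[middle]
        # Move the pivot to the end of the sublist
        lyst[middle], lyst[right] = lyst[right], pivot
        boundary = left
        for index in range(left, right):
            comparisons += 1
            if lyst[index] < pivot:
                lyst[index], lyst[boundary] = lyst[boundary], lyst[index]
                boundary += 1
        lyst[right], lyst[boundary] = lyst[boundary], lyst[right]
        return boundary

    def recurse(left, right):
        if left < right:
            position = partition(left, right)
            recurse(left, position - 1)
            recurse(position + 1, right)

    recurse(0, len(lyst) - 1)
    return comparisons


print(quick_sort_counted([34, 41, 56, 63, 72, 89, 95]))   # 10
print(quick_sort_counted(list(range(1023))))              # 8194
```

For 1023 sorted items the count is 8194, in line with n * log2(n) (about 10,230) and far below n^2 (about a million).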

Worst Case

• What if the value at the midpoint is near the largest value on each call?
• Or near the smallest value on each call?
• Then there will be approximately n subdivisions, and quick sort will degenerate to O(n^2)

Call Tree For a Worst Case

    34 41 56 63 72 89 95
        41 56 63 72 89 95
            56 63 72 89 95
                63 72 89 95
                    72 89 95
                        89 95
                            95

We select the first element as the pivot.
The smallest element happens to be the first one on each call.
n subdivisions!
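
The degenerate case can be reproduced by switching to a first-element pivot and sorting an already sorted list. The sketch below assumes a hypothetical variant, quick_sort_first_pivot_counted, that differs from the midpoint version only in how the pivot is chosen.

```python
def quick_sort_first_pivot_counted(lyst):
    """Sort lyst in place using the first item as pivot; return the number of comparisons."""
    comparisons = 0

    def partition(left, right):
        nonlocal comparisons
        pivot = lyst[left]                       # first item instead of midpoint
        lyst[left], lyst[right] = lyst[right], pivot
        boundary = left
        for index in range(left, right):
            comparisons += 1
            if lyst[index] < pivot:
                lyst[index], lyst[boundary] = lyst[boundary], lyst[index]
                boundary += 1
        lyst[right], lyst[boundary] = lyst[boundary], lyst[right]
        return boundary

    def recurse(left, right):
        if left < right:
            position = partition(left, right)
            recurse(left, position - 1)
            recurse(position + 1, right)

    recurse(0, len(lyst) - 1)
    return comparisons


print(quick_sort_first_pivot_counted([34, 41, 56, 63, 72, 89, 95]))   # 21
print(quick_sort_first_pivot_counted(list(range(200))))               # 19900
```

Each call removes only the pivot, so the counts are n(n - 1)/2, the same as selection sort. The recursion also goes about n levels deep, which is why the example stays well under Python's default recursion limit.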

Other Methods of Selecting the Pivot Element

• Pick a random element
• Pick the median of the first three elements
• Pick the median of the first, middle, and last elements
• Pick the median element - not!! This is an O(n) algorithm
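
Two of these strategies can be sketched as functions that return the index of the chosen pivot. The names random_pivot_index and median_of_three_index are assumptions for illustration; a partition function would swap the item at the returned index to the end of the sublist before shifting data.

```python
import random


def random_pivot_index(left, right):
    """Pick a random position in left..right (inclusive)."""
    return random.randint(left, right)


def median_of_three_index(lyst, left, right):
    """Return the index of the median of the first, middle, and last items."""
    middle = (left + right) // 2
    # Pair each candidate value with its index, then take the middle pair
    candidates = sorted([(lyst[left], left),
                         (lyst[middle], middle),
                         (lyst[right], right)])
    return candidates[1][1]


data = [34, 41, 56, 63, 72, 89, 95]
print(median_of_three_index(data, 0, len(data) - 1))   # 3
print(median_of_three_index([95, 34, 41], 0, 2))       # 2
```

Both choices cost constant time per call, unlike finding the true median, and both make the sorted-input worst case unlikely.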

For Friday

Working with the Array Data Structure
Chapter 4
