1 / 36

Analysis of Algorithms

Analysis of Algorithms. Time and Space. to analyze an algorithm means: developing a formula for predicting how fast an algorithm is, based on the size of the input time complexity and/or

snoon
Download Presentation

Analysis of Algorithms

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Analysis of Algorithms

  2. Time and Space • to analyze an algorithm means: • developing a formula for predicting how fast an algorithm is, based on the size of the input • time complexity and/or • developing a formula for predicting how much memory an algorithm requires, based on the size of the input • space complexity • usually time is our biggest concern • most algorithms require a fixed amount of space These slides are based on Dr. David Matuszek’s lectures

  3. What Does “Size of the Input” Mean? • if we are searching an array, the “size” of the input could be the size of the array • if we are merging two arrays, the “size” could be the sum of the two array sizes • if we are computing the nth Fibonacci number, or the nth factorial, the “size” is n • we choose the “size” to be a parameter that determines the actual time (or space) required • it is usually obvious what this parameter is • sometimes we need two or more parameters

  4. Characteristic Operations • in computing time complexity, one good approach is to count characteristic operations • what a “characteristic operation” is depends on the particular problem • if searching, it might be comparing two values • if sorting an array, it might be: • comparing two values • swapping the contents of two array locations • both of the above • sometimes we just look at how many times the innermost loop is executed

  5. Exact Values • it is sometimes possible, in assembly language, to compute exact time and space requirements • we know exactly how many bytes and how many cycles each machine instruction takes • for a problem with a known sequence of steps (factorial, Fibonacci), we can determine how many instructions of each type are required • however, often the exact sequence of steps cannot be known in advance • the steps required to sort an array depend on the actual numbers in the array (which we do not know in advance)

  6. Higher-Level Languages • in a higher-level language (such as Java), we do not know how long each operation takes • which is faster,x < 10orx <= 9? • we don’t know exactly what the compiler does with this • the compiler almost certainly optimizes the test anyway (replacing the slower version with the faster one) • in a higher-level language we cannot do an exact analysis • our timing analyses will use majoroversimplifications • nevertheless, we can get some very useful results

  7. Average, Best, and Worst Cases • usually we would like to find the average time to perform an algorithm • however, • sometimes the “average” isn’t well defined • example: Sorting an “average” array • time typically depends on how out of order the array is • how out of order is the “average” unsorted array? • sometimes finding the average is too difficult • often we have to be satisfied with finding the worst (longest) time required • sometimes this is even what we want (say, for time-critical operations) • the best (fastest) case is seldom of interest

  8. Constant Time • constant time means there is some constant k such that this operation always takes k nanoseconds • a Java statement takes constant time if: • it does not include a loop • it does not include calling a method whose time is unknown or is not a constant • if a statement involves a choice (if or switch) among operations, each of which takes constant time, we consider the statement to take constant time • this is consistent with worst-case analysis

  9. Linear Time • We may not be able to predict to the nanosecond how long a Java program will take, but do know some things about timing: • for (i = 0, j = 1; i < n; i++) { j = j * i; } • This loop takes time k*n + c, for some constants k and c • k : How long it takes to go through the loop once (the time forj = j * i, plus loop overhead) • n : The number of times through the loop (we can use this as the “size” of the problem) • c: The time it takes to initialize the loop • the total time k*n + c is linear in n

  10. Constant Time vs. Linear Time • constant time is (usually) better than linear time • suppose we have two algorithms to solve a task: • algorithm A takes 5000 time units • algorithm B takes 100*n time units • which is better? • clearly, algorithm B is better if our problem size is small, that is, ifn < 50 • algorithm A is better for larger problems, withn > 50 • So B is better on small problems that are quick anyway • but A is better for large problems, where it matters more • we usually care most about very large problems • but not always!

  11. The Array subset Problem • Suppose you have two sets, represented as unsorted arrays: • int[] sub = { 7, 1, 3, 2, 5 };int[] super = { 8, 4, 7, 1, 2, 3, 9 }; • and you want to test whether every element of the first set (sub) also occurs in the second set (super): • System.out.println(subset(sub, super)); • (the answer in this case should be false, because subcontains the integer5, and superdoes not) • we are going to write method subset and compute its time complexity (how fast it is) • let’s start with a helper function, member, to test whether one number is in an array

  12. member • static booleanmember(int x, int[] a) { int n = a.length; for (int i = 0; i < n; i++) { if (x == a[i]) return true; } return false;} • if x is not in a, the loop executes n times, wheren =a.length • this is the worst case • if xis in a, the loop executes n/2 times on average • either way, linear time is required: k*n+c

  13. subset static boolean subset(int[] sub, int[] super) { int m = sub.length; for (int i = 0; i < m; i++) if (!member(sub[i], super) return false; return true;} • the loop (and the call to member) will execute: • m =sub.lengthtimes, if subis a subset of super • this is the worst case, and the one we’re most interested in • fewer thansub.lengthtimes (but we don’t know how many) • we would need to figure this out in order to compute average time complexity • the worst case is a linear number of times through the loop • but the loop body doesn’t take constant time, since it calls member, which takes linear time

  14. Analysis of Array subset Algorithm • We have seen that the loop in subset executesm = sub.lengthtimes (in the worst case) • also, the loop in subset calls member, which executes in time linear in n = super.length • hence, the execution time of the array subset method is m*n, along with assorted constants • we go through the loop in subsetm times, calling membereach time • we go through the loop in memberntimes • If mand nare similar, this is roughly quadratic, i.e., n2

  15. What About the Constants? • an added constant,f(n)+ c, becomes less and less important asngets larger • a constant multiplier, k*f(n), does not get less important, but... • improvingkgives a linear speedup (cutting k in half cuts the time required in half) • improvingkis usually accomplished by careful code optimization, not by better algorithms • we aren’t that concerned with only linear speedups! • bottom line – forget the constants!

  16. Simplifying the Formulae • throwing out the constants is one of two things we do in analysis of algorithms • by throwing out constants, we simplify12n2 + 35 to just n2 • our timing formula is a polynomial, and may have terms of various orders (constant, linear, quadratic, cubic, etc.) • we usually discard all but the highest-order term • we simplify n2 + 3n + 5 to just n2

  17. Big O Notation • when we have a polynomial that describes the time requirements of an algorithm, we simplify it by: • throwing out all but the highest-order term • throwing out all the constants • if an algorithm takes 12n3+4n2+8n+35 time, we simplify this formula to just n3 • we say the algorithm requiresO(n3) time • we call this Big O notation • (more accurately, it’s Big , but we’ll leave that for Turpin’s class)

  18. Big O for subset Algorithm • recall that, if n is the size of the set, and m is the size of the (possible) subset: • we go through the loop in subsetm times, calling membereach time • we go through the loop in memberntimes • hence, the actual running time should be k*(m*n) + c, for some constants k and c • we say that subset takes O(m*n)time

  19. Can We Justify Big O Notation? • big O notation is a huge simplification; can we justify it? • it only makes sense for large problem sizes • for sufficiently large problem sizes, the highest-order term swamps all the rest! • considerR = x2 + 3x + 5 as x varies: x = 0 x2 = 0 3x = 0 5 = 5 R = 5 x = 10 x2 = 100 3x = 30 5 = 5 R = 135 x = 100 x2 = 10000 3x = 300 5 = 5 R = 10,305 x = 1000 x2 = 1000000 3x = 3000 5 = 5 R = 1,003,005 x = 10,000 x2 = 108 3x = 3*104 5 = 5 R = 100,030,005 x = 100,000 x2 = 1010 3x = 3*105 5 = 5 R = 10,000,300,005

  20. y = x2 + 3x + 5, for x=1..10

  21. y = x2 + 3x + 5, for x=1..20

  22. Common Time Complexities • O(1) constant time • O(log n) log time • O(n) linear time • O(n log n) log linear time • O(n2) quadratic time • O(n3) cubic time • O(2n) exponential time BETTER WORSE

  23. Simple Sorting Algorithms

  24. Bubble Sort • compare each element (except the last one) with its neighbor to the right • if they are out of order, swap them • this puts the largest element at the very end • the last element is now in the correct and final place • compare each element (except the last two) with its neighbor to the right • if they are out of order, swap them • this puts the second largest element next to last • the last two elements are now in their correct and final places • compare each element (except the last three) with its neighbor to the right • continue as above until you have no unsorted elements on the left

  25. 7 2 2 2 2 2 2 2 2 2 2 2 2 7 7 7 7 7 5 2 4 5 4 4 7 5 5 5 5 8 8 4 7 5 5 5 4 5 8 5 7 7 8 5 4 5 7 4 7 4 7 4 8 8 8 8 4 8 8 4 8 4 4 8 8 2 5 4 7 8 Example of Bubble Sort (done)

  26. Code for Bubble Sort public static void bubbleSort(int[] a) { int outer, inner; for (outer = a.length - 1; outer > 0; outer--) { // counting down for (inner = 0; inner < outer; inner++) { // bubbling up if (a[inner] > a[inner + 1]) { // if out of order... int temp = a[inner]; // ...then swap a[inner] = a[inner + 1]; a[inner + 1] = temp; } } } }

  27. Analysis of Bubble Sort for (outer = a.length - 1; outer > 0; outer--) { for (inner = 0; inner < outer; inner++) { if (a[inner] > a[inner + 1]) { // code for swap omitted} } } • letn = a.length =size of the array • the outer loop is executed n-1times (call itn, that’s close enough) • each time the outer loop is executed, the inner loop is executed • inner loop executesn-1times at first, linearly dropping to just once • on average, inner loop executes aboutn/2times for each execution of the outer loop • in the inner loop, the comparison is always done (constant time), the swap might be done (also constant time) • result is n * n/2 * k, that is,O(n2/2 + k) = O(n2)

  28. Loop Invariants • you run a loop in order to change things • oddly enough, what is usually most important in understanding a loop is finding an invariant: that is, a condition that doesn’t change • in bubble sort, we put the largest elements at the end, and once we put them there, we don’t move them again • the variable outer starts at the last index in the array and decreases to 0 • our invariant is: every element to the right of outer is in the correct place for all j > outer, if i < j, then a[i] <= a[j] • when this is combined with the loop exit test, outer == 0, we know that all elements of the array are in the correct place

  29. Selection Sort • given an array of length n, • search elements0 throughn-1 and select the smallest • swap it with the element in location 0 • search elements1 through n-1 and select the smallest • swap it with the element in location 1 • search elements 2 through n-1 and select the smallest • swap it with the element in location 2 • search elements 3 through n-1 and select the smallest • swap it with the element in location 3 • continue in this fashion until there’s nothing left to search

  30. 7 2 2 2 2 4 4 2 7 4 5 8 8 8 5 7 8 5 5 5 7 4 4 8 7 Example and Analysis of Selection Sort • selection sort might swap an array element with itself – this is harmless, and not worth checking for • analysis: • the outer loop executes n-1 times • the inner loop executes about n/2 times on average (from n to 2 times) • work done in the inner loop is constant (swap two array elements) • time required is roughly (n-1)*(n/2) • you should recognize this asO(n2)

  31. Code for Selection Sort public static void selectionSort(int[] a) { int outer, inner, min; for (outer = 0; outer < a.length - 1; outer++) { min = outer; for (inner = outer + 1; inner < a.length; inner++) { if (a[inner] < a[min]) { min = inner; } // Invariant: for all i, // if outer <= i <= inner, then a[min] <= a[i] } // a[min] is least among a[outer]..a[a.length - 1] int temp = a[outer]; a[outer] = a[min]; a[min] = temp; // Invariant: for all i <= outer, // if i< j then a[i] <= a[j]} }

  32. Invariants for Selection Sort • for the inner loop: • this loop searches through the array, incrementing inner from its initial value of outer+1 up to a.length-1 • as the loop proceeds,minis set to the index of the smallest number found so far • our invariant is: for all i such that outer <= i <= inner, a[min] <= a[i] • for the outer (enclosing) loop: • the loop counts up from outer = 0 • each time through the loop, the minimum remaining value is put in a[outer] • our invariant is:for all i <= outer, if i < j then a[i] <= a[j]

  33. Insertion Sort • the outer loop of insertion sort is:for (outer = 1; outer < a.length; outer++) {...} • the invariant is that all the elements to the left of outerare sorted with respect to one another for all i < outer, j < outer, if i < j then a[i] <= a[j] • this does not mean they are all in their final correct place; the remaining array elements may need to be inserted • when we increaseouter, a[outer-1] becomes to its left; we must keep the invariant true by inserting a[outer-1]into its proper place • this means: • finding the element’s proper place • making room for the inserted element (by shifting over other elements) • inserting the element

  34. sorted next to be inserted 3 4 7 12 14 33 14 20 21 38 10 55 9 23 28 16 temp less than 10 10 10 12 21 14 14 20 33 38 3 4 7 55 9 23 28 16 sorted One Step of Insertion Sort

  35. Analysis of Insertion Sort • we run once through the outer loop, inserting each of n elements; this is a factor of n • on average, there are n/2 elements already sorted • the inner loop looks at (and moves) half of these • this gives a second factor of n/4 • hence, the time required for an insertion sort of an array of nelements is proportional ton2/4 • discarding constants, we find that insertion sort isO(n2)

  36. Summary • bubble sort, selection sort, and insertion sort are all O(n2) • we can do much better than this with somewhat more complicated sorting algorithms (wait for Jeff’s class) • withinO(n2), • bubble sort is very slow, and should probably never be used for anything • selection sort is intermediate in speed • insertion sort is usually the fastest of the three – in fact, for small arrays (say, 10 or 15 elements), insertion sort is faster than more complicated sorting algorithms • selection sort and insertion sort are good enoughfor small arrays

More Related