1 / 58

Topic Number 8 Algorithm Analysis

Topic Number 8 Algorithm Analysis. " bit twiddling: 1. (pejorative) An exercise in tuning (see tune ) in which incredible amounts of time and effort go to produce little noticeable improvement, often with the result that the code becomes incomprehensible."

roch
Download Presentation

Topic Number 8 Algorithm Analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Topic Number 8Algorithm Analysis "bit twiddling: 1. (pejorative) An exercise in tuning (see tune) in which incredible amounts of time and effort go to produce little noticeable improvement, often with the result that the code becomes incomprehensible." - The Hackers Dictionary, version 4.4.7 Algorithm Analysis

  2. Is This Algorithm Fast? • Problem: given a problem, how fast does this code solve that problem? • Could try to measure the time it takes, but that is subject to lots of errors • multitasking operating system • speed of computer • language solution is written in Algorithm Analysis

  3. Attendance Question 1 • "My program finds all the primes between 2 and 1,000,000,000 in 1.37 seconds." • how good is this solution? • Good • Bad • It depends Algorithm Analysis

  4. Grading Algorithms • What we need is some way to grade algorithms and their representation via computer programs for efficiency • both time and space efficiency are concerns • are examples simply deal with time, not space • The grades used to characterize the algorithm and code should be independent of platform, language, and compiler • We will look at Java examples as opposed to pseudocode algorithms Algorithm Analysis

  5. Big O • The most common method and notation for discussing the execution time of algorithms is "Big O" • Big O is the asymptotic execution time of the algorithm • Big O is an upper bounds • It is a mathematical tool • Hide a lot of unimportant details by assigning a simple grade (function) to algorithms Algorithm Analysis

  6. Typical Big O Functions – "Grades" Algorithm Analysis

  7. Big O Functions • N is the size of the data set. • The functions do not include less dominant terms and do not include any coefficients. • 4N2 + 10N – 100 is not a valid F(N). • It would simply be O(N2) • It is possible to have two independent variables in the Big O function. • example O(M + log N) • M and N are sizes of two different, but interacting data sets Algorithm Analysis

  8. Actual vs. Big O Simplified Time foralgorithm to complete Actual Amount of data Algorithm Analysis

  9. Formal Definition of Big O • T(N) is O( F(N) ) if there are positive constants c and N0 such that T(N) < cF(N) when N > N0 • N is the size of the data set the algorithm works on • T(N) is a function that characterizes the actual running time of the algorithm • F(N) is a function that characterizes an upper bounds on T(N). It is a limit on the running time of the algorithm. (The typical Big functions table) • c and N0 are constants Algorithm Analysis

  10. What it Means • T(N) is the actual growth rate of the algorithm • can be equated to the number of executable statements in a program or chunk of code • F(N) is the function that bounds the growth rate • may be upper or lower bound • T(N) may not necessarily equal F(N) • constants and lesser terms ignored because it is a bounding function Algorithm Analysis

  11. Yuck • How do you apply the definition? • Hard to measure time without running programs and that is full of inaccuracies • Amount of time to complete should be directly proportional to the number of statements executed for a given amount of data • Count up statements in a program or method or algorithm as a function of the amount of data • This is one technique • Traditionally the amount of data is signified by the variable N Algorithm Analysis

  12. Counting Statements in Code • So what constitutes a statement? • Can’t I rewrite code and get a different answer, that is a different number of statements? • Yes, but the beauty of Big O is, in the end you get the same answer • remember, it is a simplification Algorithm Analysis

  13. Assumptions in For Counting Statements • Once found accessing the value of a primitive is constant time. This is one statement: x = y; //one statement • mathematical operations and comparisons in boolean expressions are all constant time. x = y * 5 + z % 3; // one statement • if statement constant time if test and maximum time for each alternative are constants if( iMySuit == DIAMONDS || iMySuit == HEARTS ) return RED; else return BLACK; // 2 statements (boolean expression + 1 return) Algorithm Analysis

  14. Counting Statements in LoopsAttendenance Question 2 • Counting statements in loops often requires a bit of informal mathematical induction • What is output by the following code?int total = 0;for(int i = 0; i < 2; i++) total += 5;System.out.println( total ); A. 2 B. 5 C. 10 D. 15 E. 20 Algorithm Analysis

  15. Attendances Question 3 • What is output by the following code?int total = 0;// assume limit is an int >= 0for(inti = 0; i < limit; i++) total += 5;System.out.println( total ); • 0 • limit • limit * 5 • limit * limit • limit5 Algorithm Analysis

  16. Counting Statements in Nested LoopsAttendance Question 4 • What is output by the following code?int total = 0;for(inti = 0; i < 2; i++) for(int j = 0; j < 2; j++) total += 5;System.out.println( total ); • 0 • 10 • 20 • 30 • 40 Algorithm Analysis

  17. Attendance Question 5 • What is output by the following code?int total = 0;// assume limit is an int >= 0for(inti = 0; i < limit; i++) for(int j = 0; j < limit; j++) total += 5;System.out.println( total ); • 5 • limit * limit • limit * limit * 5 • 0 • limit5 Algorithm Analysis

  18. Loops That Work on a Data Set • The number of executions of the loop depends on the length of the array, values. • How many many statements are executed by the above method • N = values.length. What is T(N)? F(N)? public int total(int[] values){ int result = 0; for(int i = 0; i < values.length; i++) result += values[i]; return result; } Algorithm Analysis

  19. Counting Up Statements • int result = 0; 1 time • int i = 0; 1 time • i < values.length; N + 1 times • i++ N times • result += values[i]; N times • return total; 1 time • T(N) = 3N + 4 • F(N) = N • Big O = O(N) Algorithm Analysis

  20. Showing O(N) is Correct • Recall the formal definition of Big O • T(N) is O( F(N) ) if there are positive constants c and N0 such that T(N) < cF(N) when N > N0 • In our case given T(N) = 3N + 4, prove the method is O(N). • F(N) is N • We need to choose constants c and N0 • how about c = 4, N0 = 5 ? Algorithm Analysis

  21. vertical axis: time for algorithm to complete. (approximate with number of executable statements) c * F(N), in this case, c = 4, c * F(N) = 4N T(N), actual function of time. In this case 3N + 4 F(N), approximate function of time. In this case N No = 5 horizontal axis: N, number of elements in data set Algorithm Analysis

  22. Attendance Question 6 • Which of the following is true? • Method total is O(N) • Method total is O(N2) • Method total is O(N!) • Method total is O(NN) • All of the above are true Algorithm Analysis

  23. Just Count Loops, Right? // assume mat is a 2d array of booleans // assume mat is square with N rows, // and N columns int numThings = 0; for(int r = row - 1; r <= row + 1; r++) for(int c = col - 1; c <= col + 1; c++) if( mat[r][c] ) numThings++; What is the order of the above code? O(1) B. O(N) C. O(N2) D. O(N3) E. O(N1/2) Algorithm Analysis

  24. It is Not Just Counting Loops // Second example from previous slide could be // rewritten as follows: int numThings = 0; if( mat[r-1][c-1] ) numThings++; if( mat[r-1][c] ) numThings++; if( mat[r-1][c+1] ) numThings++; if( mat[r][c-1] ) numThings++; if( mat[r][c] ) numThings++; if( mat[r][c+1] ) numThings++; if( mat[r+1][c-1] ) numThings++; if( mat[r+1][c] ) numThings++; if( mat[r+1][c+1] ) numThings++; Algorithm Analysis

  25. Sidetrack, the logarithm • Thanks to Dr. Math • 32 = 9 • likewise log3 9 = 2 • "The log to the base 3 of 9 is 2." • The way to think about log is: • "the log to the base x of y is the number you can raise x to to get y." • Say to yourself "The log is the exponent." (and say it over and over until you believe it.) • In CS we work with base 2 logs, a lot • log2 32 = ? log2 8 = ? log2 1024 = ? log10 1000 = ? Algorithm Analysis

  26. When Do Logarithms Occur • Algorithms have a logarithmic term when they use a divide and conquer technique • the data set keeps getting divided by 2 public intfoo(int n){ // pre n > 0int total = 0; while( n > 0 ) { n = n / 2; total++; } return total;} • What is the order of the above code? • O(1) B. O(logN) C. O(N) D. O(Nlog N) E. O(N2) Algorithm Analysis

  27. Dealing with other methods • What do I do about method calls? double sum = 0.0; for(int i = 0; i < n; i++) sum += Math.sqrt(i); • Long way • go to that method or constructor and count statements • Short way • substitute the simplified Big O function for that method. • if Math.sqrt is constant time, O(1), simply count sum += Math.sqrt(i); as one statement. Algorithm Analysis

  28. Dealing With Other Methods public intfoo(int[] list){ int total = 0; for(inti = 0; i < list.length; i++){ total += countDups(list[i], list); } return total; } // method countDups is O(N) where N is the // length of the array it is passed What is the Big O of foo? • O(1) B. O(N) C. O(NlogN) D. O(N2) E. O(N!) Algorithm Analysis

  29. Quantifiers on Big O • It is often useful to discuss different cases for an algorithm • Best Case: what is the best we can hope for? • least interesting • Average Case (a.k.a. expected running time): what usually happens with the algorithm? • Worst Case: what is the worst we can expect of the algorithm? • very interesting to compare this to the average case Algorithm Analysis

  30. Best, Average, Worst Case • To Determine the best, average, and worst case Big O we must make assumptions about the data set • Best case -> what are the properties of the data set that will lead to the fewest number of executable statements (steps in the algorithm) • Worst case -> what are the properties of the data set that will lead to the largest number of executable statements • Average case -> Usually this means assuming the data is randomly distributed • or if I ran the algorithm a large number of times with different sets of data what would the average amount of work be for those runs? Algorithm Analysis

  31. Another Example • T(N)? F(N)? Big O? Best case? Worst Case? Average Case? • If no other information, assume asking average case public double minimum(double[] values){ int n = values.length; double minValue = values[0]; for(int i = 1; i < n; i++) if(values[i] < minValue) minValue = values[i]; return minValue;} Algorithm Analysis

  32. Independent Loops // from the Matrix class public void scale(int factor){ for(int r = 0; r < numRows(); r++) for(int c = 0; c < numCols(); c++) iCells[r][c] *= factor; } Assume an numRows() = N and numCols() = N. In other words, a square Matrix. What is the T(N)? What is the Big O? O(1) B. O(N) C. O(NlogN) D. O(N2) E. O(N!) Algorithm Analysis

  33. Significant Improvement – Algorithm with Smaller Big O function • Problem: Given an array of ints replace any element equal to 0 with the maximum value to the right of that element. Given: [0, 9, 0, 8, 0, 0, 7, 1, -1, 0, 1, 0] Becomes: [9, 9, 8, 8, 7, 7, 7, 1, -1, 1, 1, 0] Algorithm Analysis

  34. Replace Zeros – Typical Solution public void replace0s(int[] data){ int max; for(int i = 0; i < data.length -1; i++){ if( data[i] == 0 ){ max = 0; for(int j = i+1; j<data.length;j++) max = Math.max(max, data[j]); data[i] = max; } } } Assume most values are zeros. Example of a dependent loops. Algorithm Analysis

  35. Replace Zeros – Alternate Solution public void replace0s(int[] data){ int max = Math.max(0, data[data.length – 1]); int start = data.length – 2; for(int i = start; i >= 0; i--){ if( data[i] == 0 ) data[i] = max; else max = Math.max(max, data[i]); } } Big O of this approach? O(1) B. O(N) C. O(NlogN) D. O(N2) E. O(N!) Algorithm Analysis

  36. A Caveat • What is the Big O of this statement in Java? int[] list = new int[n]; • O(1) B. O(N) C. O(NlogN) D. O(N2) E. O(N!) • Why? Algorithm Analysis

  37. Summing Executable Statements • If an algorithms execution time is N2 + N the it is said to have O(N2) execution time not O(N2 + N) • When adding algorithmic complexities the larger value dominates • formally a function f(N) dominates a function g(N) if there exists a constant value n0 such that for all values N > N0 it is the case that g(N) < f(N) Algorithm Analysis

  38. Example of Dominance • Look at an extreme example. Assume the actual number as a function of the amount of data is: N2/10000 + 2Nlog10 N+ 100000 • Is it plausible to say the N2 term dominates even though it is divided by 10000 and that the algorithm is O(N2)? • What if we separate the equation into (N2/10000) and (2N log10 N + 100000) and graph the results. Algorithm Analysis

  39. Summing Execution Times • For large values of N the N2 term dominates so the algorithm is O(N2) • When does it make sense to use a computer? red line is 2Nlog10 N + 100000 blue line is N2/10000 Algorithm Analysis

  40. Comparing Grades • Assume we have a problem • Algorithm A solves the problem correctly and is O(N2) • Algorithm B solves the same problem correctly and is O(N log2N ) • Which algorithm is faster? • One of the assumptions of Big O is that the data set is large. • The "grades" should be accurate tools if this is true Algorithm Analysis

  41. Running Times • Assume N = 100,000 and processor speed is 1,000,000,000 operations per second Algorithm Analysis

  42. Theory to Practice ORDykstra says: "Pictures are for the Weak." Times in Seconds. Red indicates predicated value. Algorithm Analysis

  43. Change between Data Points Value obtained by Timex / Timex-1 Algorithm Analysis

  44. Okay, Pictures Algorithm Analysis

  45. Put a Cap on Time Algorithm Analysis

  46. No O(N^2) Data Algorithm Analysis

  47. Just O(N) and O(NlogN) Algorithm Analysis

  48. Just O(N) Algorithm Analysis

  49. Reasoning about algorithms • We have an O(N) algorithm, • For 5,000 elements takes 3.2 seconds • For 10,000 elements takes 6.4 seconds • For 15,000 elements takes ….? • For 20,000 elements takes ….? • We have an O(N2) algorithm • For 5,000 elements takes 2.4 seconds • For 10,000 elements takes 9.6 seconds • For 15,000 elements takes …? • For 20,000 elements takes …? Algorithm Analysis

  50. A Useful Proportion • Since F(N) is characterizes the running time of an algorithm the following proportion should hold true: F(N0) / F(N1) ~= time0 / time1 • An algorithm that is O(N2) takes 3 seconds to run given 10,000 pieces of data. • How long do you expect it to take when there are 30,000 pieces of data? • common mistake • logarithms? Algorithm Analysis

More Related