1 / 31

Tradeoffs, intuition analysis, understanding big-Oh aka O-notation

Explore analysis methods in algorithm design, discussing trade-offs, performance vocabulary, and mathematical tools like Big-Oh notation. Compare and reason about algorithm efficiency for different implementations.

irvingh
Download Presentation

Tradeoffs, intuition analysis, understanding big-Oh aka O-notation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Tradeoffs, intuitionanalysis, understanding big-Ohaka O-notation Owen Astrachan ola@cs.duke.edu http://www.cs.duke.edu/~ola

  2. Analysis • Vocabulary to discuss performance and to reason about alternative algorithms and implementations • It’s faster! It’s more elegant! It’s safer! It’s cooler! • Use mathematics to analyze the algorithm, • Implementation is another matter • cache, compiler optimizations, OS, memory,…

  3. What do we need? • Need empirical tests and mathematical tools • Compare by running • 30 seconds vs. 3 seconds, • 5 hours vs. 2 minutes • Two weeks to implement code • We need a vocabulary to discuss tradeoffs

  4. Analyzing Algorithms • Three solutions to online problem sort1 • Sort strings, scan looking for runs • Insert into Set, count each unique string • Use map of (String,Integer) to process • We want to discuss trade-offs of these solutions • Ease to develop, debug, verify • Runtime efficiency • Vocabulary for discussion

  5. What is big-Oh about? • Intuition: avoid details when they don’t matter, and they don’t matter when input size (N) is big enough • For polynomials, use only leading term, ignore coefficients: linear, quadratic y = 3x y = 6x-2 y = 15x + 44 y = x2 y = x2-6x+9 y = 3x2+4x

  6. O-notation, family of functions • first family is O(n), the second is O(n2) • Intuition: family of curves, same shape • More formally: O(f(n)) is an upper-bound, when n is large enough the expression cf(n) is larger • Intuition: linear function: double input, double time, quadratic function: double input, quadruple the time

  7. Reasoning about algorithms • We have an O(n) algorithm, • For 5,000 elements takes 3.2 seconds • For 10,000 elements takes 6.4 seconds • For 15,000 elements takes ….? • We have an O(n2) algorithm • For 5,000 elements takes 2.4 seconds • For 10,000 elements takes 9.6 seconds • For 15,000 elements takes …?

  8. More on O-notation, big-Oh • Big-Oh hides/obscures some empirical analysis, but is good for general description of algorithm • Compare algorithms in the limit • 20N hours v. N2 microseconds: • which is better?

  9. cf(N) g(N) x = n More formal definition • O-notation is an upper-bound, this means that N is O(N), but it is also O(N2); we try to provide tight bounds. Formally: • A function g(N) is O(f(N)) if there exist constants c and n such that g(N) < cf(N) for all N > n

  10. Big-Oh calculations from code • Search for element in array: • What is complexity (using O-notation)? • If array doubles, what happens to time? for(int k=0; k < a.length; k++) { if (a[k].equals(target)) return true; } return false; • Best case? Average case? Worst case?

  11. Measures of complexity • Worst case • Good upper-bound on behavior • Never get worse than this • Average case • What does average mean? • Averaged over all inputs? Assuming uniformly distributed random data?

  12. Some helpful mathematics • 1 + 2 + 3 + 4 + … + N • N(N+1)/2 = N2/2 + N/2 is O(N2) • N + N + N + …. + N (total of N times) • N*N = N2 which is O(N2) • 1 + 2 + 4 + … + 2N • 2N+1 – 1 = 2 x 2N – 1 which is O(2N )

  13. 106 instructions/sec, runtimes

  14. Multiplying and adding big-Oh • Suppose we do a linear search then we do another one • What is the complexity? • If we do 100 linear searches? • If we do n searches on an array of size n?

  15. Multiplying and adding • Binary search followed by linear search? • What are big-Oh complexities? Sum? • 50 binary searches? N searches? • What is the number of elements in the list (1,2,2,3,3,3)? • What about (1,2,2, …, n,n,…,n)? • How can we reason about this?

  16. Reasoning about big-O • Given an n-list: (1,2,2,3,3,3, …n,n,…n) • If we remove all n’s, how many left? • If we remove all larger than n/2?

  17. Reasoning about big-O • Given an n-list: (1,2,2,3,3,3, …n,n,…n) • If we remove every other element? • If we remove all larger than n/1000? • Remove all larger than square root n?

  18. Money matters while (n != 0) { count++; n = n/2; } • Penny on a statement each time executed • What’s the cost?

  19. Money matters void stuff(int n){ for(int k=0; k < n; k++) System.out.println(k); } while (n != 0) { stuff(n); n = n/2; } • Is a penny always the right cost/statement? • What’s the cost?

  20. Find k-th largest • Given an array of values, find k-th largest • 0th largest is the smallest • How can I do this? • Do it the easy way… • Do it the fast way …

  21. Easy way, complexity? public int find(int[] list, int index) { Arrays.sort(list); return list[index]; }

  22. Fast way, complexity? public int find(int[] list, int index) { return findHelper(list,index, 0, list.length-1); }

  23. Fast way, complexity? private int findHelper(int[] list, int index, int first, int last) { int lastIndex = first; int pivot = list[first]; for(int k=first+1; k <= last; k++){ if (list[k] <= pivot){ lastIndex++; swap(list,lastIndex,k); } } swap(list,lastIndex,first);

  24. Continued… if (lastIndex == index) return list[lastIndex]; else if (index < lastIndex ) return findHelper(list,index, first,lastIndex-1); else return findHelper(list,index, lastIndex+1,last);

  25. Recurrences int length(ListNode list) { if (0 == list) return 0; else return 1+length(list.getNext()); } • What is complexity? justification? • T(n) = time of length for an n-node list T(n) = T(n-1) + 1 T(0) = O(1)

  26. Recognizing Recurrences • T must be explicitly identified • n is measure of size of input/parameter • T(n) time to run on an array of size n T(n) = T(n/2) + O(1) binary search O( log n ) T(n) = T(n-1) + O(1) sequential search O( n )

  27. Recurrences T(n) = 2T(n/2) + O(1) tree traversal O( n ) T(n) = 2T(n/2) + O(n) quicksort O( n log n) T(n) = T(n-1) + O(n) selection sort O( n2 )

  28. Compute bn, b is BigInteger • What is complexity using BigInteger.pow? • What method does it use? • How do we analyze it? • Can we do exponentiation ourselves? • Is there a reason to do so? • What techniques will we use?

  29. Correctness and Complexity? public BigInteger pow(BigInteger b, int expo) { if (expo == 0) { return BigInteger.ONE; } BigInteger half = pow(b,expo/2); half = half.multiply(half); if (expo % 2 == 0) return half; else return half.multiply(b); }

  30. Correctness and Complexity? public BigInteger pow(BigInteger b, int expo) { BigInteger result =BigInteger.ONE; while (expo != 0){ if (expo % 2 != 0){ result = result.multiply(b); } expo = expo/2; b = b.multiply(b); } return result; }

  31. Solve and Analyze • Given N words (e.g., from a file) • What are 20 most frequently occurring? • What are k most frequently occurring? • Proposals? • Tradeoffs in efficiency • Tradeoffs in implementation

More Related