Timing Analysis Reference: Tremblay and Cheston: Section 5.1
Quality of a project/method is determined by many factors:
• Simplicity
• Readability
• Verifiability
• Correctness: yields the correct results according to its specification
• Validity
  • Solves the original problem
• Robustness
  • Ability to handle errors
• Modifiability
• Reusability
• Portability
  • Ease in moving the project/method to another machine, system, language, etc.
• Integrity
  • Able to withstand unauthorized access
• Efficiency
  • Time
  • Space
In this segment, the focus is on measuring time efficiency. It is an important measure of quality, but certainly not the most important.
• Time efficiency
  • estimate the order of magnitude of the time requirements
  • usually expressed as a rate of growth
• The rate of growth is expressed as a function that depends on the size of the problem, the amount of data, and/or the size of the structures involved.
• Example

    public static int count(String[] a, String s) {
        int count = 0;
        int i = 0;
        while (i < a.length) {
            if (s.equals(a[i]))
                count = count + 1;
            i = i + 1;
        }
        return count;
    }
I. Statement count approach
• If all statements take about the same amount of time, then count the number of statements executed and use it as a measure of the amount of time required to execute the algorithm.
• Usually a reasonable approach, provided that there are no method calls.
For the count method:
• What determines the size of the problem?
• How many statements are executed?
Size of the problem for the count method: size of the array
• Let $n$ = the size of the array
• Let $T_{\text{count}}(n)$ = number of statements executed in running method count on an array of size $n$
$T_{\text{count}}(n) = 3n + q + 4$, where $q$ is the number of times "count = count + 1" is done
Best case: $q = 0$, so $T^{B}_{\text{count}}(n) = 3n + 4$
Worst case: $q = n$, so $T^{W}_{\text{count}}(n) = 4n + 4$
Average/expected case: ??
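The count 3n + q + 4 can be checked by instrumenting the method. The following is a minimal sketch, assuming every statement (including each evaluation of the loop test) costs one step; the class name `CountStatements` and the helper `statementsExecuted` are hypothetical names introduced for illustration.

```java
class CountStatements {
    // Hypothetical helper: runs the logic of the count method and returns
    // the number of statements executed, under the one-step-per-statement model.
    static long statementsExecuted(String[] a, String s) {
        long steps = 0;
        int count = 0; steps++;              // int count = 0
        int i = 0; steps++;                  // int i = 0
        while (true) {
            steps++;                         // loop test, also counted when it fails
            if (!(i < a.length)) break;
            steps++;                         // if (s.equals(a[i]))
            if (s.equals(a[i])) {
                count = count + 1; steps++;  // done q times in total
            }
            i = i + 1; steps++;              // i = i + 1
        }
        steps++;                             // return count
        return steps;
    }

    public static void main(String[] args) {
        String[] noMatch = {"a", "b", "c", "d"};
        String[] allMatch = {"x", "x", "x", "x"};
        // Best case, n = 4, q = 0: 3*4 + 0 + 4 = 16
        System.out.println(statementsExecuted(noMatch, "x"));
        // Worst case, n = 4, q = 4: 3*4 + 4 + 4 = 20
        System.out.println(statementsExecuted(allMatch, "x"));
    }
}
```

The two calls in `main` correspond to the best and worst cases for n = 4.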
Notation
Definition: $O(n) = \{\, \text{function } t \mid \exists\, c, n_0 > 0 \text{ such that } t(n) \le c \cdot n,\ \forall n \ge n_0 \,\}$
Intuitively: $O(n)$ is the set of functions which grow linearly or less than linearly
$O(n) = \{\, n,\ 5n,\ 100n + 1000,\ \log(n),\ 2n + 500\log(n),\ (\log(n))^2,\ 5,\ 1000000,\ \ldots \,\}$
Note that the constant on the fastest growing term doesn't matter, and terms with slower growth don't matter.
Note that $T_{\text{count}}(n) \in O(n)$
Alternate notation: $T_{\text{count}}(n) = O(n)$
Alternate definition: $t(n) \in O(n)$ if $\lim_{n \to \infty} t(n)/n < \infty$
Note that L'Hôpital's rule is often needed: if $f(n) \to \infty$ and $g(n) \to \infty$ as $n \to \infty$, then $\lim_{n \to \infty} f(n)/g(n) = \lim_{n \to \infty} f'(n)/g'(n)$
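As a worked illustration of the limit form of the definition (not taken from the slides), consider showing that log(n) is in O(n). Both the numerator and the denominator grow without bound, so L'Hôpital's rule applies; the natural logarithm is assumed here, which changes the result only by a constant factor.

```latex
\lim_{n \to \infty} \frac{\ln n}{n}
  = \lim_{n \to \infty} \frac{\frac{d}{dn} \ln n}{\frac{d}{dn}\, n}
  = \lim_{n \to \infty} \frac{1/n}{1}
  = 0 < \infty
  \quad\Longrightarrow\quad \ln n \in O(n)
```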
II. Active operation approach
• Determine an operation that is done as often as any other operation (within a constant factor) and is central to the algorithm. This operation is called an active operation.
• Count the number of times that this operation is done, and use it as a measure of the order of magnitude of the time requirements of the algorithm.
What makes a good active operation for the count method?
The active operation for the count method is s.equals(a[i])
• Let $K(n)$ = number of active operations done by the count method for an array of size $n$
It is easy to see that $K(n) = n$
Note that $K(n) \in O(n)$
The active operation approach is usually much easier and yields the same result. The key is the selection of the active operation. If it isn't clear which operation is the active one, count each candidate and add the counts together. Be sure to include the active operations inside any methods that are called.
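Summation notation: Reference Appendix C.1 of text
$\sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + \cdots + x_n$ (by definition)
Summation manipulation:
$\sum_{i=1}^{n} x_i = \sum_{i=1}^{k-1} x_i + x_k + x_{k+1} + \sum_{j=k+2}^{n} x_j$
$\sum_{i=1}^{n} (2x_i + 3) = \sum_{i=1}^{n} 2x_i + \sum_{i=1}^{n} 3 = 2\sum_{i=1}^{n} x_i + \sum_{i=1}^{n} 3 = 2\sum_{i=1}^{n} x_i + 3n$
Useful summation formulae:
$\sum_{i=1}^{n} i = \dfrac{n(n+1)}{2}$
$\sum_{i=1}^{n} i^2 = \dfrac{n(n+1)(2n+1)}{6}$
$\sum_{i=0}^{n} a^i = \dfrac{a^{n+1} - 1}{a - 1}$ for $a \ne 1$
$\sum_{i=1}^{n} i\,a^i = \dfrac{n\,a^{n+1}}{a-1} - \dfrac{a(a^n - 1)}{(a-1)^2}$ for $a \ne 1$
(each of these can be proved by mathematical induction; note the minus sign in the last formula, which can be checked with $n = 1$, $a = 2$: $4 - 2 = 2 = 1 \cdot 2^1$)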
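A quick numerical check of the closed forms can catch sign or index slips before they are used in an analysis. The sketch below is an illustration only: the class `SummationCheck` and its helpers are hypothetical names, and the last sum is checked in the form n·a^(n+1)/(a−1) − a(a^n − 1)/(a−1)^2, written over the common denominator (a−1)^2 so that integer arithmetic stays exact.

```java
class SummationCheck {
    // Integer power a^e by repeated multiplication (small e only).
    static long pow(long a, int e) {
        long r = 1;
        for (int k = 0; k < e; k++) r *= a;
        return r;
    }

    // Returns true if all four closed forms agree with the brute-force sums.
    // Requires a != 1; n and a are kept small to avoid overflow.
    static boolean formulasHold(int n, long a) {
        long sumI = 0, sumSq = 0, sumPow = 0, sumIPow = 0;
        for (int i = 1; i <= n; i++) {
            sumI += i;
            sumSq += (long) i * i;
            sumIPow += i * pow(a, i);
        }
        for (int i = 0; i <= n; i++) sumPow += pow(a, i);

        long nn = n;
        boolean linear = sumI == nn * (nn + 1) / 2;
        boolean squares = sumSq == nn * (nn + 1) * (2 * nn + 1) / 6;
        boolean geometric = sumPow == (pow(a, n + 1) - 1) / (a - 1);
        // n*a^(n+1)/(a-1) - a*(a^n - 1)/(a-1)^2 over the common denominator (a-1)^2
        long numerator = nn * pow(a, n + 1) * (a - 1) - a * (pow(a, n) - 1);
        boolean weighted = sumIPow == numerator / ((a - 1) * (a - 1));
        return linear && squares && geometric && weighted;
    }

    public static void main(String[] args) {
        boolean allOk = true;
        for (int n = 1; n <= 15; n++) {
            allOk = allOk && formulasHold(n, 2) && formulasHold(n, 3);
        }
        System.out.println("all formulas hold: " + allOk);
    }
}
```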
• Insertion sort algorithm
General algorithm: for each item to be sorted, insert it into its proper place relative to the previous items, i.e., move larger items one position further from the front.

    public static <T extends Comparable<T>> void insertionSort(T[] a) {
        int i = 1;
        while (i < a.length) {
            T temp = a[i];
            int j = i - 1;
            while (j >= 0 && temp.compareTo(a[j]) < 0) {
                a[j + 1] = a[j];
                j = j - 1;
            }
            a[j + 1] = temp;
            i = i + 1;
        }
    }
Timing analysis using statement count approach:
Let $T^{W}_{\text{insort}}(n)$ = worst case time for the insertionSort algorithm for an array of size $n$
$T^{W}_{\text{insort}}(n) = 1 + 1 + \sum_{i=1}^{n-1} \Bigl( 1 + 1 + 1 + 1 + \sum_{j=i-1}^{0\ (\text{decreasing})} (1 + 1 + 1) + 1 + 1 \Bigr)$
$= 2 + \sum_{i=1}^{n-1} \Bigl( 4 + \sum_{j=0}^{i-1} 3 + 2 \Bigr)$
$= 2 + \sum_{i=1}^{n-1} (6 + 3i)$
$= 2 + \sum_{i=1}^{n-1} 6 + \sum_{i=1}^{n-1} 3i$
$= 2 + 6(n-1) + 3\sum_{i=1}^{n-1} i$
$= 2 + 6(n-1) + 3(n-1)(n)/2$
$= 2 + 6n - 6 + \tfrac{3}{2}n^2 - \tfrac{3}{2}n$
$= 1.5n^2 + 4.5n - 4$
Note that $T^{W}_{\text{insort}}(n) \notin O(n)$
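The worst-case count 1.5n² + 4.5n − 4 can be reproduced by instrumenting the sort and running it on a reverse-sorted array, which forces every inner loop to run all the way to the front. This is a sketch under the same one-step-per-statement assumption used in the derivation; `InsertionSortStatements`, `sortAndCount`, and `reversed` are hypothetical names, and `Integer[]` stands in for the generic `T[]`.

```java
class InsertionSortStatements {
    // Sorts a in place and returns the number of statements executed,
    // counting every evaluation of a loop test as one statement.
    static long sortAndCount(Integer[] a) {
        long steps = 0;
        int i = 1; steps++;                          // int i = 1
        while (true) {
            steps++;                                 // outer loop test
            if (!(i < a.length)) break;
            Integer temp = a[i]; steps++;            // T temp = a[i]
            int j = i - 1; steps++;                  // int j = i - 1
            while (true) {
                steps++;                             // inner loop test
                if (!(j >= 0 && temp.compareTo(a[j]) < 0)) break;
                a[j + 1] = a[j]; steps++;            // shift larger item back
                j = j - 1; steps++;
            }
            a[j + 1] = temp; steps++;                // a[j+1] = temp
            i = i + 1; steps++;                      // i = i + 1
        }
        return steps;
    }

    // Builds the worst-case input n, n-1, ..., 1.
    static Integer[] reversed(int n) {
        Integer[] a = new Integer[n];
        for (int k = 0; k < n; k++) a[k] = n - k;
        return a;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 6; n++) {
            long measured = sortAndCount(reversed(n));
            long predicted = (3L * n * n + 9L * n - 8) / 2;  // 1.5n^2 + 4.5n - 4
            System.out.println(n + ": measured " + measured + ", predicted " + predicted);
        }
    }
}
```

For n = 4 both values are 38.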
Definition
For $f(n) : I^+ \to R^+$, where $I^+$ is the set of positive integers and $R^+$ is the set of positive real values:
$O(f(n)) = \{\, \text{function } t \mid \exists\, c, n_0 > 0 \text{ such that } t(n) \le c \cdot f(n),\ \forall n \ge n_0 \,\}$
Intuitively: $O(f(n))$ is the set of functions which grow no faster than $f(n)$.
If $t(n) \in O(f(n))$, then $t(n)$ is in the order of $f(n)$, or more simply, $t(n)$ is order $f(n)$.
Alternate definition: $t(n) \in O(f(n))$ if $\lim_{n \to \infty} t(n)/f(n) < \infty$
Examples:
$T^{W}_{\text{insort}}(n) = 1.5n^2 + 4.5n - 4 \in O(n^2)$
$50n + 100 \in O(n^2)$
$n \log(n) \in O(n^2)$
$(\log(n))^j \in O(n)$ for any fixed $j < \infty$
$n^5 \in O(2^n)$
$2^n \in O(n!)$
$n! \in O(n^n)$
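Definition
For $f(n) : I^+ \to R^+$, where $I^+$ is the set of positive integers and $R^+$ is the set of positive real values:
$\Theta(f(n)) = \{\, \text{function } t \mid \exists\, b, c, n_0 > 0 \text{ such that } b \cdot f(n) \le t(n) \le c \cdot f(n),\ \forall n \ge n_0 \,\}$
If $t(n) \in \Theta(f(n))$, then $t(n)$ is in the exact order of $f(n)$.
Alternate definition: $t(n) \in \Theta(f(n))$ if $\lim_{n \to \infty} t(n)/f(n) = c$, $0 < c < \infty$
Examples:
$T^{W}_{\text{count}}(n) = 4n + 4 \in \Theta(n)$
$T^{W}_{\text{insort}}(n) \in \Theta(n^2)$
$T^{W}_{\text{count}}(n) \notin \Theta(n^2)$
Often $=$ and $\ne$ are used instead of $\in$ and $\notin$.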
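The limit form of the definition can be applied directly to the two running examples. The short derivation below is an added illustration: the insertion sort worst case has a finite, nonzero limit against n², while the count worst case has limit zero against n², so it is in O(n²) but not in the exact order of n².

```latex
\lim_{n \to \infty} \frac{1.5n^2 + 4.5n - 4}{n^2}
  = \lim_{n \to \infty} \left( 1.5 + \frac{4.5}{n} - \frac{4}{n^2} \right)
  = 1.5, \qquad 0 < 1.5 < \infty

\lim_{n \to \infty} \frac{4n + 4}{n^2}
  = \lim_{n \to \infty} \left( \frac{4}{n} + \frac{4}{n^2} \right)
  = 0
```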
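Active operation analysis of insertion sort algorithm
Active operation: j >= 0 && temp.compareTo(a[j]) < 0
Worst case:
$T^{W}_{\text{insort}}(n) = \sum_{i=1}^{n-1} \Bigl( 1 + \sum_{j=i-1}^{0\ (\text{decreasing})} 1 \Bigr) = \sum_{i=1}^{n-1} (1 + i) = (n-1) + (n-1)(n)/2 = n^2/2 + n/2 - 1 \in \Theta(n^2)$
Best case:
$T^{B}_{\text{insort}}(n) = \sum_{i=1}^{n-1} 1 = n - 1 \in \Theta(n)$
Expected case, assuming an item is usually inserted about halfway back:
$T^{E}_{\text{insort}}(n) = \sum_{i=1}^{n-1} \Bigl( 1 + \tfrac{1}{2} \sum_{j=i-1}^{0\ (\text{decreasing})} 1 \Bigr) \approx n^2/4 + 3n/4 \in \Theta(n^2)$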
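The worst-case and best-case counts of the active operation can be confirmed by counting how often the inner loop condition is evaluated. This is an illustrative sketch: `ActiveOpCount` and `conditionEvaluations` are hypothetical names, and each evaluation of the whole condition counts as one active operation even when short-circuiting skips the call to `compareTo`.

```java
class ActiveOpCount {
    // Sorts a in place and returns how many times the inner loop condition
    // (j >= 0 && temp.compareTo(a[j]) < 0) was evaluated.
    static long conditionEvaluations(Integer[] a) {
        long evaluations = 0;
        for (int i = 1; i < a.length; i++) {
            Integer temp = a[i];
            int j = i - 1;
            while (true) {
                evaluations++;  // one active operation per evaluation
                if (!(j >= 0 && temp.compareTo(a[j]) < 0)) break;
                a[j + 1] = a[j];
                j = j - 1;
            }
            a[j + 1] = temp;
        }
        return evaluations;
    }

    public static void main(String[] args) {
        Integer[] worst = {4, 3, 2, 1};  // reverse sorted: n^2/2 + n/2 - 1 = 9
        Integer[] best = {1, 2, 3, 4};   // already sorted: n - 1 = 3
        System.out.println(conditionEvaluations(worst));
        System.out.println(conditionEvaluations(best));
    }
}
```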
Does rate of growth make a difference, or are computers so fast that any algorithm can be done quickly?
• Consider how large a problem can be solved in one minute using algorithms of different time complexity:
  • Linear algorithm: 1,000,000,000
  • Quadratic algorithm: 50,000
  • Factorial algorithm (n!): 11
• Some algorithms take huge amounts of time on even very small problems.
• Even the step from linear to quadratic changes the solvable problem size by several orders of magnitude.
Combining growth functions
• For sets A and B and element c, define the following:
  $A + B = \{\, a + b \mid a \in A \text{ and } b \in B \,\}$
  $A * B = \{\, a * b \mid a \in A \text{ and } b \in B \,\}$
  $c * B = \{\, c * b \mid b \in B \,\}$
• Using these operations, the following are implied:
  $O(f(n)) + O(g(n)) = O(f(n) + g(n))$
  $k * O(f(n)) = O(k * f(n)) = O(f(n))$ for $k$ a constant
  $O(f(n)) + O(g(n)) = O(2 * \max(f(n), g(n))) = O(\max(f(n), g(n)))$
More elaborate example:
Given int method k(int i) with time $T_k(m) = O(\log(m))$ for some m independent of i.
Given void method p with time $T_p(n) = O(n^2)$ for some n independent of m.
What is the time requirement for each of the following methods?

    public void r() {
        q();
        for (int i = 1; i < m; i = 2 * i)
            p();
        s();
    }

    public void q() {
        int c = 0;
        for (int i = 1; i < n + 1; i++)
            c = c + k(i);
    }

    public void s() {
        int x = 0;
        for (int i = 0; i < m + 1; i++)
            x = x + k(i) * k(2 * m - i);
        System.out.println(x);
    }
Analysis of method q:
• active operation: c = c + k(i)
• number of times that it is done: n
• cost of doing the active operation: O(log(m))
• total cost: n*O(log(m)) = O(n*log(m))
Analysis of method s:
• active operation: x = x + k(i)*k(2*m - i)
• number of times that it is done: m + 1
• cost of doing the active operation: 2*O(log(m))
• total cost: (m + 1)*2*O(log(m)) = O(m*log(m))
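The repetition counts in these two analyses can be checked by counting calls to k. The sketch below is only an illustration: `CallCounts` and `callsIn` are hypothetical names, `k` is a stub that records the call instead of doing O(log(m)) work, the values n = 5 and m = 8 are arbitrary, and the print statement in s is omitted.

```java
class CallCounts {
    static long kCalls = 0;
    static int n = 5;  // arbitrary illustrative sizes
    static int m = 8;

    // Stub for k: counts the call; the real method is assumed to cost O(log(m)).
    static int k(int i) {
        kCalls++;
        return 1;
    }

    static void q() {
        int c = 0;
        for (int i = 1; i < n + 1; i++)
            c = c + k(i);                  // active operation, done n times
    }

    static void s() {
        int x = 0;
        for (int i = 0; i < m + 1; i++)
            x = x + k(i) * k(2 * m - i);   // active operation, done m + 1 times, two calls each
    }

    // Runs the given method and returns how many times it called k.
    static long callsIn(Runnable method) {
        kCalls = 0;
        method.run();
        return kCalls;
    }

    public static void main(String[] args) {
        System.out.println(callsIn(CallCounts::q));  // n = 5 calls
        System.out.println(callsIn(CallCounts::s));  // 2 * (m + 1) = 18 calls
    }
}
```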
• $T_q(n, m) = n * O(\log(m)) = O(n \log(m))$
• $T_s(m) = (m + 1) * 2 * O(\log(m)) = O(m \log(m))$
Analysis of method r: three possible active operations, so analyze all three:
• active operation: q()
  • number of times that it is done: 1
  • cost of doing the active operation: O(n*log(m))
  • cost for q: O(n*log(m))
• active operation: s()
  • number of times that it is done: 1
  • cost of doing the active operation: O(m*log(m))
  • cost for s: O(m*log(m))
• active operation: p()
  • number of times that it is done: ??
  • cost of doing the active operation: $O(n^2)$
  • cost for p: ?? * $O(n^2)$
Total cost for r: cost for active op q + cost for active op s + cost for active op p
$T_r(n, m) = O(n \log(m)) + O(m \log(m)) + \text{??} * O(n^2)$
How many times is the loop in method r done?
Let $x$ = number of times that the loop in method r is done
Consider the values of i: $1, 2, 4, 8, 16, \ldots, 2^{x-1}, 2^x$
Consider when the loop is exited: when $i \ge m$, i.e., $2^x \ge m$
$2^{x-1} < m \le 2^x$
$x - 1 < \log_2(m) \le x$ (taking the log to the base 2 of each term)
$x$ is $\log_2(m)$ or the next integer value larger than it
Therefore $x = \lceil \log_2(m) \rceil$
• Notation:
  $\lceil p \rceil$ = smallest integer greater than or equal to $p$, called the ceiling function
  $\lfloor p \rfloor$ = largest integer less than or equal to $p$, called the floor function
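The iteration count can be confirmed empirically: the number of passes through the doubling loop should equal log2(m) rounded up to the next integer. The sketch below is an illustration with hypothetical names (`DoublingLoop`, `iterations`, `ceilLog2`); the ceiling is computed with integer bit operations rather than floating-point logarithms to avoid rounding problems at exact powers of two.

```java
class DoublingLoop {
    // Number of times the body of "for (i = 1; i < m; i = 2*i)" runs.
    static int iterations(int m) {
        int x = 0;
        for (long i = 1; i < m; i = 2 * i)  // long avoids overflow for large m
            x++;
        return x;
    }

    // Ceiling of log2(m) for m >= 1: the smallest x with 2^x >= m.
    static int ceilLog2(int m) {
        return 32 - Integer.numberOfLeadingZeros(m - 1);
    }

    public static void main(String[] args) {
        int[] samples = {1, 2, 3, 4, 5, 8, 9, 1000, 1024, 1025};
        for (int m : samples) {
            System.out.println("m=" + m + " iterations=" + iterations(m)
                    + " ceilLog2=" + ceilLog2(m));
        }
    }
}
```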
$T_r(n, m) = T_q(n, m) + \lceil \log_2(m) \rceil * T_p(n) + T_s(m)$
$= O(n \log(m)) + \lceil \log_2(m) \rceil * O(n^2) + O(m \log(m))$
$= O(n \log(m) + n^2 \log(m) + m \log(m))$
$= O(n^2 \log(m) + m \log(m))$
$= O(\max(n^2, m) \log(m))$
• Example

    public static int f(int n) {
        int result = 0;
        double p = n;
        while (p > 1) {
            p = p / 2;
            result = result + 1;
        }
        return result;
    }

What is the time complexity of the method? How many times is the loop done?
Let $r$ = number of times that the loop is done
Consider the values of p: $n, n/2, n/4, n/8, \ldots, n/2^{r-1}, n/2^r \le 1$
When the method finishes, $n/2^r \le 1 < n/2^{r-1}$
i.e., $n \le 2^r$ and $2^{r-1} < n$
and taking the log to the base 2 of each term, $r - 1 < \log_2(n) \le r$
Therefore, since $r$ is an integer, $r = \lceil \log_2(n) \rceil$
Thus, f is the function $\lceil \log_2(n) \rceil$ for $n \ge 1$, and $T_f(n) = \Theta(\log_2(n))$.
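The conclusion can be tested by comparing f(n) against log2(n) rounded up, computed independently. This is a sketch with hypothetical names (`HalvingLoop`, `ceilLog2`); it relies on the fact that halving a double that starts as an int value is exact, so no rounding error affects the loop test.

```java
class HalvingLoop {
    // The method from the example: counts how many halvings bring n down to at most 1.
    static int f(int n) {
        int result = 0;
        double p = n;
        while (p > 1) {
            p = p / 2;
            result = result + 1;
        }
        return result;
    }

    // Ceiling of log2(n) for n >= 1, computed with integer bit operations.
    static int ceilLog2(int n) {
        return 32 - Integer.numberOfLeadingZeros(n - 1);
    }

    public static void main(String[] args) {
        boolean agree = true;
        for (int n = 1; n <= 1_000_000; n++) {
            if (f(n) != ceilLog2(n)) agree = false;
        }
        System.out.println("f(n) equals ceiling of log2(n) for n up to 1,000,000: " + agree);
    }
}
```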