Algorithm Analysis CSCI 3333 Data Structures by Dr. Bun Yue, Professor of Computer Science yue@uhcl.edu http://sce.uhcl.edu/yue/ 2013
Acknowledgement • Mr. Charles Moen • Dr. Wei Ding • Ms. Krishani Abeysekera • Dr. Michael Goodrich
Algorithm • “A finite set of instructions that specify a sequence of operations to be carried out in order to solve a specific problem or class of problems.”
Example Algorithm • What do you think about this algorithm?
Some Properties of Algorithms • A deterministic sequential algorithm should satisfy the following properties: • Finiteness: the algorithm must complete after a finite number of instructions. • Absence of ambiguity: each step must be clearly defined, having only one interpretation. • Definition of sequence: each step must have a uniquely defined preceding and succeeding step. The first step (start step) and last step (halt step) must be clearly noted. • Input/output: there must be a specified number of input values and result values. • Feasibility: it must be possible to perform each instruction.
Specifying Algorithms • There is no universal formalism. Algorithms may be specified: • In English • In pseudocode • In a programming language (usually not desirable)
Specifying Algorithms • Must include: • Input: type and meaning • Output: type and meaning • Instruction sequences • May include: • Assumptions • Side effects • Pre-conditions and post-conditions
Specifying Algorithms • Like sub-programs, algorithms can be broken down into sub-algorithms to reduce complexity. • High level algorithms may use English more. • Low level algorithms may use pseudocode more.
Pseudocode • High-level description of an algorithm • More structured than English prose • Less detailed than a program • Preferred notation for describing algorithms • Hides program design issues
Example: ArrayMax

Algorithm arrayMax(A, n)
  Input: array A of n integers
  Output: maximum element value of A
  currentMax ← A[0]
  for i ← 1 to n − 1 do
    if A[i] > currentMax then
      currentMax ← A[i]
  return currentMax

• What do you think?
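The slides use pseudocode; as a concrete illustration, here is a minimal Python sketch of the arrayMax algorithm above (the function name and test values are ours, not from the slides):

```python
def array_max(A):
    """Return the maximum element of a non-empty list A,
    following the arrayMax pseudocode step by step."""
    current_max = A[0]              # currentMax <- A[0]
    for i in range(1, len(A)):      # for i <- 1 to n - 1 do
        if A[i] > current_max:      # if A[i] > currentMax then
            current_max = A[i]      #     currentMax <- A[i]
    return current_max

print(array_max([3, 1, 4, 1, 5, 9, 2, 6]))  # -> 9
```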
Example Algorithm: BogoSort Algorithm BogoSort(A); Do while (A is not sorted) randomize(A); End do; • What do you think?
Better Specification Algorithm BogoSort(A) Input: A: an array of comparable elements. Output: none Side Effect: A is sorted. while not sorted(A) randomize(A); end while;
Sorted(A)

Algorithm Sorted(A)
  Input: A: an array of comparable elements.
  Output: a boolean value indicating whether A is sorted (in non-descending order).
  for (i = 0; i < A.size − 1; i++)
    if A[i] > A[i+1] return false;
  end for;
  return true;
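The BogoSort and Sorted specifications above can be sketched in Python (a toy illustration only; as the next slide notes, BogoSort should never be run on anything but tiny arrays):

```python
import random

def is_sorted(A):
    # True if A is in non-descending order (the Sorted(A) algorithm)
    return all(A[i] <= A[i + 1] for i in range(len(A) - 1))

def bogosort(A):
    # Side effect: A is sorted. Keep shuffling until A happens to be sorted.
    while not is_sorted(A):
        random.shuffle(A)

A = [3, 1, 2]       # keep the input tiny: expected runtime is O(n * n!)
bogosort(A)
print(A)            # -> [1, 2, 3]
```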
BogoSort • Is correct! • Terrible performance!
Criteria for Judging Algorithms • Correctness • Accuracy (especially for numerical methods and approximation problems) • Performance • Simplicity • Ease of implementations
How Does BogoSort Fare? • Correctness: yes • Accuracy: yes (not really applicable) • Performance: terrible, even for small arrays • Simplicity: yes • Ease of implementation: medium
Summation Algorithm (naive, loop-based) • Correctness: OK • Performance: bad; proportional to N • Simplicity: not too bad • Ease of implementation: not too bad
Summation Algorithm • A better summation algorithm Algorithm Summation(N) Input: N: Natural Number Output: Summation of 1 to N. return N * (N+1) / 2; • Performance: constant time independent of N.
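The contrast between the two summation algorithms can be made concrete with a short Python sketch (function names are ours): a loop that is proportional to N versus the closed form, which takes constant time:

```python
def summation_loop(n):
    # O(n): add 1..n one term at a time
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def summation_formula(n):
    # O(1): closed form n(n+1)/2, independent of n
    return n * (n + 1) // 2

print(summation_loop(100), summation_formula(100))  # -> 5050 5050
```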
Performance of Algorithms • Performance of algorithms can be measured by runtime experiments, called benchmark analysis.
Experimental Studies • Write a program implementing the algorithm • Run the program with inputs of varying size and composition • Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time • Plot the results
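The slide describes the experiment in Java terms (System.currentTimeMillis()); the same idea can be sketched in Python using time.perf_counter (the sizes chosen here are illustrative):

```python
import time

def array_max(A):
    # the algorithm under test: linear scan for the maximum
    current_max = A[0]
    for x in A[1:]:
        if x > current_max:
            current_max = x
    return current_max

# Run the program with inputs of varying size and measure running time
for n in [1_000, 10_000, 100_000]:
    A = list(range(n))
    start = time.perf_counter()
    result = array_max(A)
    elapsed = time.perf_counter() - start
    print(f"n={n:>7}  max={result}  time={elapsed:.6f}s")
```

Plotting elapsed time against n would give the kind of growth-rate chart the slide asks for.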
Limitations of Experiments • It is necessary to implement the algorithm, which may: • be difficult, • introduce implementation related performance issues. • Results may not be indicative of the running time on other inputs not included in the experiment. • In order to compare two algorithms, the same hardware and software environments must be used.
Theoretical Analysis • Uses a high-level description of the algorithm instead of an implementation • Characterizes running time as a function of the input sizes (usually n) • Takes into account all possible inputs • Allows us to evaluate the speed of an algorithm independent of the hardware/software environment
Analysis Tools (Goodrich, 166) Asymptotic Analysis • Developed by computer scientists for analyzing the running time of algorithms • Describes the running time of an algorithm in terms of the size of the input – in other words, how well it "scales" as the input gets larger • The most common metric used is: • The upper bound of an algorithm • Called the Big-Oh of an algorithm, written O(g(n))
Analysis Tools (Goodrich web page) Algorithm Analysis • Use asymptotic analysis • Assume that we are using an idealized computer called the Random Access Machine (RAM) • CPU • Unlimited amount of memory • Accessing a memory cell takes one unit of time • Primitive operations take one unit of time • Perform the analysis on the pseudocode
Analysis Tools (Goodrich, 48, 166) Pseudocode • High-level description of an algorithm • Structured • Not as detailed as a program • Easier to understand (written for a human)

Algorithm arrayMax(A, n):
  Input: An array A storing n >= 1 integers.
  Output: The maximum element in A.
  currentMax ← A[0]
  for i ← 1 to n − 1 do
    if currentMax < A[i] then
      currentMax ← A[i]
  return currentMax
Analysis Tools (Goodrich, 48) Pseudocode Details Declaration Algorithm functionName(arg,…) Input:… Output:… Control flow • if…then…[else…] • while…do… • repeat…until… • for…do… Indentation instead of braces Return value return expression Expressions • Assignment (like in Java) • Equality testing (like in Java)
Analysis Tools (Goodrich, 48) Pseudocode Details (continued) Tips for writing an algorithm • Use the correct data structure • Use the correct ADT operations • Use object-oriented syntax • Indent clearly • Use a return statement at the end
Analysis Tools (Goodrich, 164–165) Primitive Operations • To do asymptotic analysis, we can start by counting the primitive operations in an algorithm and adding them up (need not be very accurate). • Primitive operations that we assume will take a constant amount of time: • Assigning a value to a variable • Calling a function • Performing an arithmetic operation • Comparing two numbers • Indexing into an array • Returning from a function • Following a pointer reference
Analysis Tools (Goodrich, 166) Counting Primitive Operations • Inspect the pseudocode to count the primitive operations as a function of the input size (n)

Algorithm arrayMax(A, n):                  Count
  currentMax ← A[0]                        array indexing + assignment: 2
  for i ← 1 to n − 1 do                    initializing i: 1; verifying i < n: n; incrementing the counter: 2(n − 1)
    if currentMax < A[i] then              array indexing + comparison: 2(n − 1)
      currentMax ← A[i]                    array indexing + assignment: 2(n − 1) (worst case)
  return currentMax                        returning: 1

Best case: 2 + 1 + n + 4(n − 1) + 1 = 5n
Worst case: 2 + 1 + n + 6(n − 1) + 1 = 7n − 2
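The operation counts above can be checked mechanically. This Python sketch (our own instrumentation, mirroring the slide's accounting) tallies primitive operations while running arrayMax and reproduces the 5n best-case and 7n − 2 worst-case totals:

```python
def array_max_count(A):
    """Run arrayMax while counting primitive operations
    using the same accounting as the slide."""
    n = len(A)
    ops = 0
    current_max = A[0]; ops += 2          # array indexing + assignment
    i = 1; ops += 1                       # initializing i
    while True:
        ops += 1                          # verifying i < n (n times total)
        if not i < n:
            break
        ops += 2                          # array indexing + comparison
        if A[i] > current_max:
            current_max = A[i]; ops += 2  # array indexing + assignment (worst case)
        i += 1; ops += 2                  # incrementing the counter (add + assign)
    ops += 1                              # returning
    return current_max, ops

# Worst case: strictly increasing input forces the assignment every iteration
_, ops = array_max_count(list(range(10)))
print(ops)   # -> 68, i.e. 7*10 - 2
# Best case: the maximum comes first, so the assignment never executes
_, ops = array_max_count([9, 0, 0, 0, 0, 0, 0, 0, 0, 0])
print(ops)   # -> 50, i.e. 5*10
```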
Analysis Tools (Goodrich, 165) Best, Worst, or Average Case • A program may run faster on some input data than on others • Best case, worst case, and average case are terms that characterize the input data • Best case – the data is distributed so that the algorithm runs fastest • Worst case – the data distribution causes the slowest running time • Average case – very difficult to calculate • We will concentrate on analyzing algorithms by identifying the running time for the worst case data
Analysis Tools (Goodrich, 166) Running Time of arrayMax • Worst case, arrayMax executes (7n − 2) primitive operations • Actual running time depends on the speed of the primitive operations—some of them are faster than others • Let t = the time taken by the slowest primitive operation • Let f(n) = the worst-case running time of arrayMax: f(n) = t(7n − 2)
Analysis Tools (Goodrich, 166) Growth Rate of arrayMax • Growth rate of arrayMax is linear • Changing the hardware alters the value of t, so arrayMax will run faster on a faster computer • The growth rate does not change: Slow PC: 10(7n − 2); Fast PC: 5(7n − 2); Fastest PC: 1(7n − 2)
Analysis Tools Tests of arrayMax • Does it really work that way? • Used the arrayMax algorithm from p. 166, Goodrich • Tested on two PCs • Yes, both results show a linear growth rate [Plots: running time vs. n on a Pentium III, 866 MHz and a Pentium 4, 2.4 GHz]
Big-Oh Notation • Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that f(n) ≤ c·g(n) for all n ≥ n0 • Example: 2n + 10 is O(n): 2n + 10 ≤ c·n ⟺ (c − 2)·n ≥ 10 ⟺ n ≥ 10/(c − 2). Pick c = 3 and n0 = 10.
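A quick numerical spot-check of the example's constants (this samples a finite range, so it illustrates the definition rather than proving it):

```python
# Claim: 2n + 10 <= c*n for all n >= n0, with c = 3 and n0 = 10
c, n0 = 3, 10
violations = [n for n in range(n0, 10_000) if 2*n + 10 > c*n]
print(violations)          # -> []  (no violations on the sampled range)

# Below n0 the inequality can fail, which is why n0 is needed:
print(2*9 + 10 > c*9)      # -> True  (28 > 27)
```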
Big-Oh Meaning • Big-O expresses an upper bound on the growth rate of a function, for sufficiently large values of n. • It shows the asymptotic behavior. • Example: 3n² + n lg n + 2000 = O(n²); also 3n² + n lg n + 2000 = O(n³)
Big-Oh Example • Example: the function n² is not O(n) • n² ≤ c·n would require n ≤ c for all n ≥ n0 • That inequality cannot be satisfied, since c must be a constant
Big-Oh Proof Example 50n² + 100n lg n + 1500 is O(n²) Proof: Need to find c and n0 such that 50n² + 100n lg n + 1500 ≤ c·n² for all n ≥ n0. For example, pick c = 151 and n0 = 40. You may complete the proof.
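Before completing the proof by hand, the suggested constants can be sanity-checked numerically (a finite sample, not a substitute for the proof):

```python
import math

# Claimed: 50n^2 + 100 n lg n + 1500 <= 151 n^2 for all n >= 40
c, n0 = 151, 40
for n in range(n0, 5_000):
    assert 50*n*n + 100*n*math.log2(n) + 1500 <= c*n*n
print("bound holds on the sampled range")
```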
Big-Oh and Growth Rate • The big-Oh notation gives an upper bound on the growth rate of a function • The statement “f(n) is O(g(n))” means that the growth rate of f(n) is no more than the growth rate of g(n) • We can use the big-Oh notation to rank functions according to their growth rate
Big-O Theorems For all the following theorems, assume that f(n) is a function of n and that K is an arbitrary constant. • Theorem 1: K is O(1) • Theorem 2: A polynomial is O(the term containing the highest power of n): f(n) = 7n⁴ + 3n² + 5n + 1000 is O(n⁴) • Theorem 3: K·f(n) is O(f(n)) [that is, constant coefficients can be dropped]: g(n) = 7n⁴ is O(n⁴) • Theorem 4: If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)). [transitivity]
Big-O Theorems • Theorem 5: Each of the following functions is strictly big-O of its successors (listed from smaller to larger): • K [constant] • logb(n) [assume log base 2 if no base is shown] • n • n·logb(n) • n² • n to higher powers • 2ⁿ • 3ⁿ • larger constants to the n-th power • n! [n factorial] • nⁿ • For example: f(n) = 3n·log(n) is O(n·log(n)) and O(n²) and O(2ⁿ)
Big-O Theorems • Theorem 6: In general, f(n) is big-O of the dominant term of f(n), where the "dominant" term may usually be determined from Theorem 5: f(n) = 7n² + 3n·log(n) + 5n + 1000 is O(n²); g(n) = 7n⁴ + 3ⁿ + 10⁶ is O(3ⁿ); h(n) = 7n(n + log(n)) is O(n²) • Theorem 7: For any base b, logb(n) is O(log(n)).
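Theorem 6 can be seen numerically: divide f(n) by its dominant term and the ratio settles toward the dominant coefficient as n grows. A small Python sketch (our own illustration):

```python
import math

def f(n):
    # f(n) = 7n^2 + 3 n log(n) + 5n + 1000, dominant term 7n^2
    return 7*n**2 + 3*n*math.log2(n) + 5*n + 1000

# The ratio f(n) / n^2 approaches the dominant coefficient 7 as n grows,
# so the lower-order terms become negligible
for n in [10, 100, 10_000]:
    print(n, f(n) / n**2)
```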
Asymptotic Algorithm Analysis • The asymptotic analysis of an algorithm determines the running time in big-Oh notation • To perform the asymptotic analysis: • We find the worst-case number of primitive operations executed as a function of the input size • We express this function with big-Oh notation • Example: • We determined that algorithm arrayMax executes at most 7n − 2 primitive operations • We say that algorithm arrayMax "runs in O(n) time" • Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations
Example: Computing Prefix Averages • We further illustrate asymptotic analysis with two algorithms for prefix averages • The i-th prefix average of an array X is the average of the first (i + 1) elements of X: A[i] = (X[0] + X[1] + … + X[i]) / (i + 1) • Computing the array A of prefix averages of another array X has applications to financial analysis
Prefix Averages (Quadratic) • The following algorithm computes prefix averages in quadratic time by applying the definition

Algorithm prefixAverages1(X, n)            #operations
  Input array X of n integers
  Output array A of prefix averages of X
  A ← new array of n integers              n
  for i ← 0 to n − 1 do                    n
    s ← X[0]                               n
    for j ← 1 to i do                      1 + 2 + … + (n − 1)
      s ← s + X[j]                         1 + 2 + … + (n − 1)
    A[i] ← s / (i + 1)                     n
  return A                                 1
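A Python sketch of prefixAverages1 (the pseudocode above rendered directly, with the inner loop recomputing each prefix sum from scratch):

```python
def prefix_averages1(X):
    # quadratic: for each i, re-sum X[0..i] from scratch
    n = len(X)
    A = [0.0] * n
    for i in range(n):
        s = X[0]
        for j in range(1, i + 1):
            s += X[j]
        A[i] = s / (i + 1)
    return A

print(prefix_averages1([1, 2, 3, 4]))  # -> [1.0, 1.5, 2.0, 2.5]
```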
Arithmetic Progression • The running time of prefixAverages1 is O(1 + 2 + … + n) • The sum of the first n integers is n(n + 1)/2 • There is a simple visual proof of this fact • Thus, algorithm prefixAverages1 runs in O(n²) time
Prefix Averages (Linear) • The following algorithm computes prefix averages in linear time by keeping a running sum

Algorithm prefixAverages2(X, n)            #operations
  Input array X of n integers
  Output array A of prefix averages of X
  A ← new array of n integers              n
  s ← 0                                    1
  for i ← 0 to n − 1 do                    n
    s ← s + X[i]                           n
    A[i] ← s / (i + 1)                     n
  return A                                 1

• Algorithm prefixAverages2 runs in O(n) time
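The linear version in Python, mirroring the pseudocode above; the running sum s turns the quadratic inner loop into a single O(1) update per element:

```python
def prefix_averages2(X):
    # linear: carry the running sum s across iterations
    n = len(X)
    A = [0.0] * n
    s = 0
    for i in range(n):
        s += X[i]            # extend the prefix sum in O(1)
        A[i] = s / (i + 1)
    return A

print(prefix_averages2([1, 2, 3, 4]))  # -> [1.0, 1.5, 2.0, 2.5]
```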
Math You Need to Review • Summations • Logarithms and exponents • Proof techniques • Basic probability

Properties of logarithms:
  logb(xy) = logb(x) + logb(y)
  logb(x/y) = logb(x) − logb(y)
  logb(x^a) = a·logb(x)
  logb(a) = logx(a) / logx(b)

Properties of exponentials:
  a^(b+c) = a^b · a^c
  a^(bc) = (a^b)^c
  a^b / a^c = a^(b−c)
  b = a^(loga(b))
  b^c = a^(c·loga(b))
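These identities are easy to verify numerically with Python's math.log (which takes an optional base argument); the sample values below are arbitrary:

```python
import math

x, y, a, b = 8.0, 32.0, 10.0, 2.0
log = math.log  # math.log(value, base) computes log_base(value)

assert math.isclose(log(x*y, b), log(x, b) + log(y, b))       # log_b(xy)
assert math.isclose(log(x/y, b), log(x, b) - log(y, b))       # log_b(x/y)
assert math.isclose(log(x**3, b), 3 * log(x, b))              # log_b(x^a)
assert math.isclose(log(a, b), log(a, x) / log(b, x))         # change of base
assert math.isclose(b, a ** log(b, a))                        # b = a^(log_a b)
assert math.isclose(b**3, a ** (3 * log(b, a)))               # b^c = a^(c log_a b)
print("all identities hold")
```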
Relatives of Big-Oh • big-Omega (lower bound) • f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0 • big-Theta (order) • f(n) is Θ(g(n)) if there are constants c′ > 0 and c″ > 0 and an integer constant n0 ≥ 1 such that c′·g(n) ≤ f(n) ≤ c″·g(n) for n ≥ n0
Intuition for Asymptotic Notation Big-Oh • f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n) big-Omega • f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n) big-Theta • f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
Relatives of Big-Oh Examples • 5n² is Ω(n²): f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0; let c = 5 and n0 = 1 • 5n² is also Ω(n): f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0; let c = 1 and n0 = 1 • 5n² is Θ(n²): f(n) is Θ(g(n)) if it is Ω(n²) and O(n²). We have already seen the former; for the latter, recall that f(n) is O(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≤ c·g(n) for n ≥ n0; let c = 5 and n0 = 1