Algorithms
• An algorithm is a well-defined computational procedure that takes a value or set of values, called the input, and produces a value or set of values, called the output
• By "well-defined procedure", we mean a sequence of precise computational steps, each of which can be realized on a computer
• The usual properties of an algorithm:
  • Input: receives input
  • Output: produces output
  • Precision: the steps are precisely stated (no ambiguity)
  • Finiteness: for each input, the algorithm terminates after the execution of finitely many steps
  • Generality: applies to a general set of values
  • Determinism: at each step, the result depends only on the inputs and the values produced by earlier steps
Computational Problems
• A computational problem consists of two specifications:
  • The set of valid inputs
  • The output desired, given as a relationship between input and output
• Example: The Sorting Problem
  • Input: a sequence of n numbers ⟨a1, a2, ..., an⟩
  • Output: a permutation (rearrangement) ⟨a1′, a2′, ..., an′⟩ of the input sequence satisfying a1′ ≤ a2′ ≤ ... ≤ an′
• An algorithm solves a given computational problem if, for every valid input, it produces an output that satisfies the condition of the problem
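The two conditions of the Sorting Problem can be turned into a checker that decides whether a proposed output actually solves the problem for a given input. A minimal Python sketch (the function name is ours, for illustration):

```python
from collections import Counter

def is_sorting_solution(inp, out):
    """Check the Sorting Problem's output condition:
    out must be a permutation of inp, and out must be nondecreasing."""
    is_permutation = Counter(out) == Counter(inp)
    is_nondecreasing = all(out[i] <= out[i + 1] for i in range(len(out) - 1))
    return is_permutation and is_nondecreasing

assert is_sorting_solution([3, 1, 2], [1, 2, 3])
assert not is_sorting_solution([3, 1, 2], [1, 2, 2])   # not a permutation
assert not is_sorting_solution([3, 1, 2], [2, 1, 3])   # not nondecreasing
```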
Algorithm Design and Analysis
• The two main tasks in the study of algorithms:
  • Design: devising an algorithm that correctly solves a given problem and proving its correctness
  • Analysis: determining the properties of a given algorithm
• Algorithm design is a creative process, more art than science
• A number of basic techniques have been discovered that are useful in algorithm design:
  • Searching techniques
  • Divide-and-conquer methods
  • Greedy techniques
  • Dynamic programming
Algorithm Analysis
• In analyzing algorithms, we ask the following questions:
  • Correctness: does the algorithm solve the given problem?
  • Termination: does the described procedure halt on all valid inputs?
  • Time analysis: how many instructions does the algorithm execute?
  • Space analysis: how much memory does the algorithm need to execute?
New Modes
• The procedures that define an operating system do not precisely meet the definition of algorithm given earlier
• An operating system is an example of an "online algorithm"
  • This is an algorithm that never terminates, but is designed to process a continuing sequence of inputs
• In a multi-user system, the operating system is also not deterministic
New Modes
• Another class of non-deterministic algorithms is the class of randomized algorithms
• A randomized algorithm makes random choices at certain points in its execution
  • These choices are based on the values produced by a random number generator
  • In practice, a pseudo-random number generator is used
• We will assume we have a function rand(i,j) that produces a random integer between i and j, inclusive
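The assumed rand(i,j) primitive maps directly onto Python's pseudo-random number generator; a minimal sketch (the helper names are ours):

```python
import random

def rand(i, j):
    """Return a pseudo-random integer between i and j, inclusive."""
    return random.randint(i, j)

# Typical use in a randomized algorithm: pick a random element
# (e.g., a pivot) so no fixed input can reliably trigger worst-case behavior.
def random_element(seq):
    return seq[rand(0, len(seq) - 1)]

assert 2 <= rand(2, 7) <= 7
assert random_element([3, 1, 4]) in [3, 1, 4]
```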
Pseudocode Notation
• The notation used for the pseudocode in the text by and large follows the usual constructs and notation from the languages you have been using, with these exceptions:
  • The body of an if-statement or a for-statement is indicated by indentation, not by enclosing braces ({ })
  • Parameter types are not declared, but are understood from input and output statements preceding the function
  • Semi-colons are not used to terminate statements
  • An object-oriented notation is used for many data objects
  • Assume a function println(…) for basic types
  • Thus a standard function to manipulate a data object s would be invoked like this: s.foo(i)
• Two for statements:
  • for var = init_val to limit_val
  • for var = init_val downto limit_val
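In Python (used for illustration here, not the text's pseudocode), the two for statements correspond to range loops; since range excludes its upper bound, the inclusive limit needs an adjustment:

```python
# for var = init_val to limit_val      (counts up, limit inclusive)
up = [var for var in range(2, 5 + 1)]

# for var = init_val downto limit_val  (counts down, limit inclusive)
down = [var for var in range(5, 2 - 1, -1)]

assert up == [2, 3, 4, 5]
assert down == [5, 4, 3, 2]
```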
Mathematical Background
• It is assumed you are familiar with proofs by induction
  • A review is given in Section 2.2 of the text
• Section 2.1 also provides a review of some other basic mathematical ideas (logs, etc.) that we will need
• You should use these sections as needed
Analysis of Algorithms
• Our two primary measures of efficiency are time of execution and memory space needed
• We concentrate primarily on speed
• It is almost impossible to give an exact time for execution of an algorithm on a specific input, due to varying machine speed, compiler efficiency, etc.
  • But all these factors affect the running time only by a constant factor
  • Thus if the algorithm with input I runs 5 times faster on system A than on system B, then the same will hold for any other input J
• This means that when discussing running times, we will ignore multiplicative constants
• It is also clear that most algorithms need more time on larger inputs than on smaller inputs
• Most often, we are concerned with the running time as a function of input size: T(n)
Analysis of Algorithms
• Even if we restrict attention to inputs of a given size n, the running times can vary from one input to another of that size
• There are three ways to view running time as a function of size only
• The worst-case running time of an algorithm is defined by
  Twc(n) = max { T(I) | I is a valid input of size n }
• Another, less interesting measure is the best-case running time:
  Tbc(n) = min { T(I) | I is a valid input of size n }
• A third important measure is the average-case running time
  • In order for average-case analysis to make sense, we must have a probability distribution pr on the set of all inputs of size n
  • Often we assume that all inputs are equally likely to occur; thus if there are M possible inputs of size n, each has probability 1/M
  • Then the average-case running time is defined to be
    Tav(n) = Σ_{size(I) = n} pr(I)·T(I)
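For small n, all three measures can be computed exhaustively. A sketch (Python, with our own instrumentation) that counts the comparisons insertion sort makes on every input of size n = 4, assuming all n! orderings are equally likely:

```python
from itertools import permutations

def insertion_sort_comparisons(seq):
    """Sort a copy of seq by insertion sort, returning the number
    of element comparisons performed (our cost measure T(I))."""
    a, count = list(seq), 0
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0:
            count += 1                  # one comparison of a[j] with key
            if a[j] <= key:
                break
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return count

n = 4
times = [insertion_sort_comparisons(p) for p in permutations(range(n))]
T_wc = max(times)                       # worst case:  n(n-1)/2 = 6
T_bc = min(times)                       # best case:   n-1      = 3
T_av = sum(times) / len(times)          # average, uniform pr(I) = 1/M
assert T_wc == n * (n - 1) // 2 and T_bc == n - 1
assert T_bc <= T_av <= T_wc
```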
Asymptotic Growth Rate
• Often, the efficiency of an algorithm is not apparent for small inputs
• The study of algorithm running time thus concerns itself with the asymptotic growth of the running time as a function of input size
  • In essence, we are interested in how the running time grows for sufficiently large input sizes
• The study uses some ideas from mathematics developed before algorithms were an area of interest
• We now define the notions of asymptotic upper bound, asymptotic lower bound and asymptotic tight bound
• Remember that in general we are not concerned with multiplicative constants
Definitions for Asymptotic Analysis
• We say that g(n) is an asymptotic upper bound for the function f(n), or "f(n) is O(g(n))", provided the following statement is satisfied:
  ∃ constants M and c > 0 such that ∀ n ≥ M: f(n) ≤ c·g(n)
• The statement "∀ n ≥ M" could be paraphrased "eventually", and the inequality paraphrased as "f is bounded above by a constant multiple of g" (very important!)
Definitions for Asymptotic Analysis
• We say that g(n) is an asymptotic lower bound for the function f(n), or "f(n) is Ω(g(n))", provided the following statement is satisfied:
  ∃ constants M and c > 0 such that ∀ n ≥ M: c·g(n) ≤ f(n)
• The statement "∀ n ≥ M" could be paraphrased "eventually", and the inequality paraphrased as "f is bounded below by a constant multiple of g" (very important!)
Definitions for Asymptotic Analysis
• We say that g(n) is a tight asymptotic bound for f(n), or "f(n) is Θ(g(n))", if f(n) is both O(g(n)) and Ω(g(n))
• Thus, g(n) is a tight asymptotic bound for f(n) if and only if
  ∃ constants M, c1 > 0 and c2 > 0 such that ∀ n ≥ M: c1·g(n) ≤ f(n) ≤ c2·g(n)
• Thus, except for a finite number of values, f(n) is bounded both below and above by constant multiples of g(n)
Example
Prove, from the definition of big-Oh, that f(n) = 3n² + 5n + 2 is O(n²).
By the definition, we must show that there is a positive constant c and an integer constant M such that 3n² + 5n + 2 ≤ cn² for all n ≥ M.
A little thought should lead to the choice of c = 10 and M = 1:
  3n² + 5n + 2 ≤ 3n² + 5n² + 2n² = 10n² for all n ≥ 1
It should be clear that the argument used in the previous example can be used to prove that a polynomial f(n) of degree k is O(n^k). See the text for the proof.
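A numeric spot-check of the chosen witnesses is not a proof, but it is a useful sanity check; a sketch in Python:

```python
# Verify the witnesses c = 10, M = 1 for f(n) = 3n^2 + 5n + 2 being O(n^2)
# over a finite range (the proof above covers all n >= 1).
def f(n):
    return 3 * n * n + 5 * n + 2

c, M = 10, 1
assert all(f(n) <= c * n * n for n in range(M, 10_000))
```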
Some concepts from calculus provide a method that is often useful in proving big-Oh relationships.
Theorem. Let f(n) and g(n) be positive integer-valued functions of a positive integer variable n. If lim_{n→∞} f(n)/g(n) exists and is a constant k < ∞, then f(n) is O(g(n)). Moreover, if k = 0, then g(n) is not O(f(n)).
Proof. If lim_{n→∞} f(n)/g(n) = k, then by the definition of limit as n goes to infinity, we have:
  For any ε > 0, there is an integer M such that | f(n)/g(n) − k | < ε for all n ≥ M.
Choosing ε = 1, we thus have | f(n)/g(n) − k | < 1, hence −1 < f(n)/g(n) − k < 1.
It then follows that f(n)/g(n) < k + 1, hence f(n) < (k+1)·g(n), for all n ≥ M.
Choosing c = k + 1, we have thus satisfied the definition for f(n) being O(g(n)).
Proof continued. If lim_{n→∞} f(n)/g(n) = 0, then lim_{n→∞} g(n)/f(n) = ∞. It then follows that for any K, there is an integer M such that
  (1) g(n)/f(n) ≥ K for all n ≥ M.
Suppose g(n) is O(f(n)). Then there is a constant c > 0 and a value M′ such that
  (2) g(n) ≤ c·f(n), hence g(n)/f(n) ≤ c, for all n ≥ M′.
But if we choose K = c + 1 and apply (1), we see there is a value M such that
  (3) g(n)/f(n) ≥ c + 1 for all n ≥ M.
But this means that, for all n ≥ max(M, M′), c + 1 ≤ g(n)/f(n) ≤ c. Since this is a contradiction, the assumption that g(n) is O(f(n)) is false.
Example: Prove, using the above theorem, that ln(n) is O(n^(1/2)).
We need to show that ln(n)/n^(1/2) converges to a constant as n approaches infinity. Note that both the numerator and denominator approach infinity as n approaches infinity. Moreover, each extends to a differentiable function on the real numbers.
Recall l'Hospital's rule from calculus: if f(x) and g(x) are differentiable, lim_{x→∞} f(x) = ∞ and lim_{x→∞} g(x) = ∞, then lim_{x→∞} f(x)/g(x) = lim_{x→∞} f′(x)/g′(x).
Therefore lim_{x→∞} ln(x)/x^(1/2) = lim_{x→∞} (1/x)/((1/2)·x^(−1/2)) = lim_{x→∞} 2/x^(1/2) = 0.
Thus, we may conclude that ln(n) is O(n^(1/2)). Moreover, since the limit is 0, n^(1/2) is not O(ln n); hence ln(n) is not Ω(n^(1/2)), and so not Θ(n^(1/2)).
Example: Running Times
• Find an asymptotic expression for the number of times the statement x = x+1 is executed:

  for i = 1 to n
    for j = 1 to i
      x = x+1

Solution: The inner loop executes i times for each i from 1 to n.
Thus, the statement is executed 1 + 2 + ... + n = n(n+1)/2 times.
Thus the count is Θ(n²).
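The closed form can be checked by instrumenting the loop directly; a Python sketch:

```python
def count_nested(n):
    """Count how many times x = x+1 runs in the doubly nested loop."""
    count = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            count += 1
    return count

# Matches 1 + 2 + ... + n = n(n+1)/2 exactly, hence Theta(n^2).
assert all(count_nested(n) == n * (n + 1) // 2 for n in range(0, 50))
```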
Example: Running Times
• Find an asymptotic expression for the number of times the statement x = x+1 is executed (2):

  j = n
  while (j ≥ 1) {
    for i = 1 to j
      x = x+1
    j = j/2
  }

Solution: The inner loop is executed first n times, then n/2 times, then n/4, etc.
Thus, the statement is executed n + n/2 + n/4 + ... + n/2^k times for some k.
  n + n/2 + n/4 + ... + n/2^k = n(1 + 1/2 + ... + 1/2^k)
    = n(1 − 1/2^(k+1))/(1 − 1/2)
    = 2n(1 − 1/2^(k+1)) ≤ 2n
Thus the count is O(n).
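Again the bound can be checked by instrumentation; note that with integer halving the count is still at most 2n (a Python sketch):

```python
def count_halving(n):
    """Count how many times x = x+1 runs in the halving loop."""
    count, j = 0, n
    while j >= 1:
        for i in range(1, j + 1):
            count += 1
        j //= 2                     # j = j/2 with integer division
    return count

# Geometric series bound: n + n/2 + n/4 + ... <= 2n, hence O(n).
assert all(n <= count_halving(n) <= 2 * n for n in range(1, 1000))
```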
Classical Growth Rates
• Θ(1): Constant
• Θ(lg lg n): Log-log
• Θ(lg n): Log
• Θ(n^c), 0 < c < 1: Sublinear
• Θ(n): Linear
• Θ(n lg n): n Log n
• Θ(n²): Quadratic
• Θ(n³): Cubic
• Θ(n^k), k ≥ 1: Polynomial
• Θ(c^n), c > 1: Exponential
• Θ(n!): Factorial
Optimization and Decision Problems
• Two important classes of problems we will examine throughout the course are:
  • Optimization Problems
  • Decision Problems
• There are two kinds of optimization problems: maximization problems and minimization problems
• In an optimization problem we have a set of feasible solutions, and each feasible solution has an associated value
  • In a maximization problem we are looking for the maximum value among all feasible solutions and, usually, a feasible solution that achieves that maximum value
  • In a minimization problem, we seek the minimum value
• Example: Given a graph G and two vertices u and v, the feasible solutions are the simple paths in G from u to v, and the associated value is the length of the path
Optimization Problems
• Example Optimization Problem: Given a graph G and two vertices u and v, the feasible solutions are the simple paths in G from u to v, and the associated value is the length of the path
• The corresponding minimization problem is efficiently solvable by means of breadth-first search, which we discuss later
• No one knows of an algorithm for the maximization problem whose worst-case running time is polynomial
Decision Problems
• In a decision problem, you have a set of instances of the problem and a subset of positive instances of the problem
• A decision algorithm for such a problem takes as input an instance of the problem and gives output "yes" (or 1) for every positive instance and output "no" (or 0) for every other instance
• Example 1: Hamiltonian Cycle Problem
  • Instances are finite, undirected graphs G; a graph is a positive instance if it contains a simple cycle covering all vertices of the graph
  • No one knows of a decision algorithm for the Hamiltonian Cycle Problem that has polynomial running time
• Example 2: Shortest Path Decision Problem
  • Instances are 4-tuples (G,u,v,k), where G is a graph, u and v are vertices of G, and k is an integer between 0 and the number of vertices of G
  • Positive instances are those where there is a simple u-v path in G of length at most k
Decision Version of an Optimization Problem
• Given an optimization problem whose associated instance values are integers in the range from 0 to N, we can construct an associated decision problem as follows:
  • The instances of the decision problem are pairs (X,i), where X is an instance of the optimization problem and i is an integer in {0,…,N}
  • The positive instances of the decision problem are the pairs (X,i) such that X has a feasible solution whose value is
    • less than or equal to i, if X is a minimization problem
    • greater than or equal to i, if X is a maximization problem
Decision Version of an Optimization Problem
• If the decision problem associated with an optimization problem has a polynomial-time decision algorithm, then the optimization problem also has a polynomial-time algorithm
• For a minimization problem instance X:

  i = 0
  do:
    answer = output of decision algorithm for (X,i)
    if answer == no
      i = i+1
  while answer == no and i <= N
  return i

• For a maximization problem, replace i = 0 with i = N, i = i+1 with i = i-1, and i <= N with i >= 0
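The reduction can be sketched in Python with a hypothetical decision oracle decide(X, i); the names and the toy instance below are ours, for illustration only:

```python
def solve_minimization(X, N, decide):
    """Return the least i in {0, ..., N} with decide(X, i) true,
    i.e., the optimal value of the minimization instance X.
    Linear scan as on the slide; a binary search over {0, ..., N}
    would need only O(log N) oracle calls."""
    for i in range(N + 1):
        if decide(X, i):            # "is there a feasible solution <= i?"
            return i
    return None                     # no feasible solution at all

# Toy instance: feasible-solution values {3, 5, 8}; the optimum is 3.
feasible_values = {3, 5, 8}
decide = lambda X, i: any(v <= i for v in feasible_values)
assert solve_minimization(None, 10, decide) == 3
```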
Homework Assignment 1
• Due: next class
• Page 52, # 14, 15, 17, 18
• In the Longest Simple Path optimization problem you are to output the length of a longest simple path from vertex u to vertex v in a connected graph G. If n is the number of vertices in G, then the largest possible value is n−1 and the least possible value is 0.
  (a) Specify the corresponding decision problem
  (b) Suppose A is a decision algorithm for the decision problem. Write down an algorithm to solve the Longest Simple Path optimization problem, using A as needed.