The Design & Analysis of Algorithms. Lecture 1, 2010. M. Sakalli
Topics for today: • Why analysis? What parameters to observe? (Slide #6) • Methodological differences to achieve better outcomes: Euclid's algorithm with three methods. • Asymptotic order of growth. • Examples with insertion and selection sort.
An algorithm: a finite sequence of unambiguous, well-defined instructions for solving a problem, i.e., for obtaining a desired output for any input from a specified range of values. Execution must complete in a finite number of steps, in a finite amount of time. The problem relates the input parameters to certain state parameters; the algorithm is its analytic rendering. [Diagram: input → "computer" running the algorithm → output]
Important points to keep in mind • Reduce ambiguity. • Keep the code simple and clean, with well-defined and correct steps. • Specify the range of applicable inputs. • Investigate other approaches to the problem, to determine whether more efficient algorithms are possible. • Theoretically: • Prove its correctness. • Efficiency: theoretical and empirical analysis. • Its optimality.
Efficiency. • Complexity. • Time efficiency: estimation in the asymptotic sense: big O, Ω (omega), Θ (theta). In each comparison there is also a hidden constant factor. • Space efficiency. • The methods applied: recursive, parallel. • Desired scalability: various ranges of inputs and the size and dimension of the problem under consideration. Examples: video sequences, ROI. • Computational model in terms of an abstract computer: a Turing machine. • http://en.wikipedia.org/wiki/Analysis_of_algorithms
Historical Perspective … • Muhammad ibn Musa al-Khwarizmi – 9th century mathematician www.lib.virginia.edu/science/parshall/khwariz.html • …
Analysis means: • evaluate the costs (time and space), and manage the resources and the methods. • A generic RAM model of computation, in which instructions are executed consecutively, not concurrently or in parallel. Running-time analysis: count the primitive operations executed for every line of the code, at cost ci per line (defined as machine-independently as possible). • Asymptotic analysis: • Ignore machine-dependent constants. • Look at the computational growth of T(n) as n → ∞, where n is the input size that determines the number of iterations. • Relative speed (on the same machine); absolute speed (between computers).
Euclid's Algorithm Problem definition: gcd(m, n) of two nonnegative integers m and n, not both zero, with m > n. Examples: gcd(60, 24) = 12, gcd(60, 0) = 60, gcd(0, 0) = ? Euclid's algorithm is based on the repeated application of the equality gcd(m, n) = gcd(n, m mod n) until the second number reaches 0. Example: gcd(60, 24) = gcd(24, 12) = gcd(12, 0) = 12. r0 = m, r1 = n, ri−1 = ri qi + ri+1, 0 ≤ ri+1 < ri, 1 ≤ i < t, …, rt−1 = rt qt + 0.
Step 1 If (n == 0 or m == n), return m and stop; otherwise go to Step 2. Step 2 Divide m by n and assign the value of the remainder to r. Step 3 Assign the value of n to m and the value of r to n. Go to Step 1.
while n ≠ 0 do
    r ← m mod n
    m ← n
    n ← r
return m
Upper bound on the number of iterations i: i ≤ logφ(n) + 1, where φ = (1+√5)/2 (Lamé's theorem; the worst case occurs for consecutive Fibonacci numbers). Since ri+1 ≤ ri−1/2, i = O(log(max(m, n))). The lower bound is Ω(log(max(m, n))), so the running time is Θ(log(max(m, n))).
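The three steps above translate directly into the while-loop form of the slide; a minimal Python sketch (the function name `gcd` is ours):

```python
def gcd(m, n):
    """Euclid's algorithm: repeatedly replace (m, n) with (n, m mod n)."""
    while n != 0:
        m, n = n, m % n   # Steps 2-3: r <- m mod n; m <- n; n <- r
    return m              # Step 1: n == 0, so gcd is m

print(gcd(60, 24))  # the chain gcd(60,24) = gcd(24,12) = gcd(12,0) = 12
```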
Proof of correctness of Euclid's algorithm • Step 1: If n divides m, then gcd(m, n) = n. • Since gcd(m, n) divides n, gcd(m, n) ≤ n. Since n divides both m and n, n ≤ gcd(m, n). Hence gcd(m, n) = n. • Step 2: gcd(m, n) = gcd(n, m mod n). • If m = nb + r for integers b and r, then gcd(m, n) = gcd(n, r): every common divisor of m and n also divides r. • Proof: let m = cp and n = cq for a common divisor c; then r = m − nb = c(p − qb), so c divides r. In particular gcd(m, n) divides both n and r, which yields gcd(m, n) ≤ gcd(n, r); the symmetric argument gives the reverse inequality.
Other methods for computing gcd(m, n) Consecutive integer checking algorithm — not a good way; it checks every candidate from min(m, n) downward. Step 1 Assign the value of min{m, n} to t. Step 2 Divide m by t. If the remainder is 0, go to Step 3; otherwise, go to Step 4. Step 3 Divide n by t. If the remainder is 0, return t and stop; otherwise, go to Step 4. Step 4 Decrease t by 1 and go to Step 2. Exhaustive, and very slow even if zero inputs are checked for: O(min(m, n)) iterations, and Ω(min(m, n)) when gcd(m, n) = 1. At Θ(1) per operation, the overall complexity is Θ(min(m, n)).
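The four steps above can be sketched as follows (the name `gcd_consecutive` and the zero-input guard are ours; the procedure as stated assumes positive inputs):

```python
def gcd_consecutive(m, n):
    """Consecutive integer checking: Theta(min(m, n)) in the worst case."""
    t = min(m, n)                          # Step 1
    while t > 0:
        if m % t == 0 and n % t == 0:      # Steps 2-3
            return t
        t -= 1                             # Step 4
    return max(m, n)  # guard for a zero input (returns 0 for gcd(0, 0))

print(gcd_consecutive(60, 24))  # prints 12, after trying t = 24, 23, ..., 12
```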
Other methods for gcd(m, n) [cont.] Middle-school procedure: Step 1 Find the prime factorization of m. Step 2 Find the prime factorization of n. Step 3 Find all the common prime factors. Step 4 Compute the product of all the common prime factors and return it as gcd(m, n). Is this an algorithm?
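The procedure is not an algorithm until Steps 1–2 are made unambiguous: "find the prime factorization" is itself an unspecified subproblem. One way to pin it down is trial division, sketched below (the function names are ours):

```python
from collections import Counter

def prime_factors(x):
    """Trial-division factorization: the unspecified step that must be
    filled in before the middle-school procedure becomes an algorithm."""
    factors = Counter()
    d = 2
    while d * d <= x:
        while x % d == 0:
            factors[d] += 1
            x //= d
        d += 1
    if x > 1:                 # leftover factor is prime
        factors[x] += 1
    return factors

def gcd_middle_school(m, n):
    fm, fn = prime_factors(m), prime_factors(n)   # Steps 1-2
    result = 1
    for p in fm.keys() & fn.keys():               # Step 3: common primes
        result *= p ** min(fm[p], fn[p])          # Step 4: their product
    return result

print(gcd_middle_school(60, 24))  # 60 = 2^2*3*5, 24 = 2^3*3, common part 2^2*3 = 12
```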
Sieve of Eratosthenes Input: integer n ≥ 2. Output: the list of primes less than or equal to n; sift out the numbers that are not.
for p ← 2 to n do A[p] ← p
for p ← 2 to n do
    if A[p] ≠ 0        // p hasn't been previously eliminated from the list
        j ← p * p
        while j ≤ n do
            A[j] ← 0   // mark element as eliminated
            j ← j + p
Example (n = 25):
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
2 3   5   7   9    11    13    15    17    19    21    23    25
2 3   5   7        11    13          17    19          23    25
2 3   5   7        11    13          17    19          23
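A direct Python rendering of the pseudocode (the loop over p can stop at √n, since j starts at p² and any p > √n has p² > n):

```python
def sieve(n):
    """Sieve of Eratosthenes: return all primes <= n."""
    A = list(range(n + 1))                  # A[p] = p; 0 marks "eliminated"
    for p in range(2, int(n ** 0.5) + 1):
        if A[p] != 0:                       # p hasn't been eliminated
            j = p * p                       # smaller multiples already crossed out
            while j <= n:
                A[j] = 0                    # mark element as eliminated
                j += p
    return [p for p in range(2, n + 1) if A[p] != 0]

print(sieve(25))  # [2, 3, 5, 7, 11, 13, 17, 19, 23], matching the slide's example
```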
Asymptotic order of growth A way of comparing functions that ignores constant factors and small input sizes • O(g(n)): class of functions f(n) that grow no faster than g(n) • Θ(g(n)): class of functions f(n) that grow at same rate as g(n) • Ω(g(n)): class of functions f(n) that grow at least as fast as g(n)
Establishing order of growth using the definition Definition: f(n) is in O(g(n)) if the order of growth of f(n) ≤ the order of growth of g(n) (within a constant multiple), i.e., there exist a positive constant c and a non-negative integer n0 such that f(n) ≤ c·g(n) for every n ≥ n0. f(n) is o(g(n)) if for every positive constant c, f(n) ≤ c·g(n) for all sufficiently large n; i.e., f grows strictly more slowly than any constant multiple of g, however small the constant. f(n) is Ω(g(n)) if there exists a constant c > 0 such that f(n) ≥ c·g(n) for every n ≥ n0. f(n) is Θ(g(n)) if f(n) is both O(g(n)) and Ω(g(n)). Examples: • 10n is O(n²), • since 10n ≤ 10n² for n ≥ 1, or 10n ≤ n² for n ≥ 10. • 5n+20 is O(n), • since 5n+20 ≤ 5n+20n ≤ 25n for all n ≥ 1 (c = 25), or c = 10 for n ≥ 4.
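The (c, n0) witnesses in the examples can be spot-checked numerically; the helper below only gathers evidence over a finite range, it is not a proof (the function name `holds` is ours):

```python
def holds(f, g, c, n0, upto=10_000):
    """Check f(n) <= c*g(n) for every n in [n0, upto)."""
    return all(f(n) <= c * g(n) for n in range(n0, upto))

# 10n is O(n^2): both witness pairs from the slide
assert holds(lambda n: 10 * n, lambda n: n * n, c=10, n0=1)
assert holds(lambda n: 10 * n, lambda n: n * n, c=1, n0=10)
# 5n + 20 is O(n): c = 25 works from n0 = 1, c = 10 from n0 = 4
assert holds(lambda n: 5 * n + 20, lambda n: n, c=25, n0=1)
assert holds(lambda n: 5 * n + 20, lambda n: n, c=10, n0=4)
# and n^2 is NOT O(n): no fixed c works for long
assert not holds(lambda n: n * n, lambda n: n, c=100, n0=1)
```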
Example of a computational problem: sorting • Statement of the problem: • Input: a sequence of n numbers <a1, a2, …, an> • Problem: reorder into <a′1, a′2, …, a′n> in ascending or descending order. • Desired output: a′i ≤ a′j for i < j (ascending), or a′i ≥ a′j for i < j (descending). • Instance: the sequence <5, 3, 2, 8, 3> • Algorithms: • Selection sort • Insertion sort • Merge sort • (many others)
Selection Sort • Input: array a[1], …, a[n] • Output: array a sorted in non-decreasing order. *** The unsorted part is scanned n−1, n−2, … times, with one swap per pass: (n−1)n/2 comparisons = Θ(n²), independent of the input. • Algorithm (selection in place): for i = 1 to n−1, swap a[i] with the smallest of a[i], …, a[n]
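The one-line algorithm above, written out in Python (0-indexed; the function name is ours):

```python
def selection_sort(a):
    """In-place selection sort: Theta(n^2) comparisons for every input."""
    n = len(a)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):          # scan the unsorted part a[i..n-1]
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]  # one swap per pass
    return a

print(selection_sort([5, 3, 2, 8, 3]))  # [2, 3, 3, 5, 8]
```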
Insertion-Sort and an idea of runtime analysis Input size is n = length[A]; tj is the number of times the line-4 test is executed for a given j.

                                                      cost   times
1: for j ← 2 to n do                                  c1     n
2:   key ← A[j]                                       c2     n−1
     // Insert A[j] into the sorted sequence A[1..j−1]  0    n−1
3:   i ← j − 1                                        c3     n−1
4:   while i > 0 and A[i] > key do                    c4     Σj=2:n tj
5:     A[i+1] ← A[i]                                  c5     Σj=2:n (tj−1)
6:     i ← i − 1                                      c6     Σj=2:n (tj−1)
7:   A[i+1] ← key                                     c7     n−1

The total runtime is T(n) = c1n + c2(n−1) + c3(n−1) + c4 Σj=2:n tj + c5 Σj=2:n (tj−1) + c6 Σj=2:n (tj−1) + c7(n−1). The best case (already sorted, tj = 1): T(n) = c1n + c2(n−1) + c3(n−1) + c4(n−1) + c7(n−1) = an + b, i.e., linear.
Insertion-Sort analysis, continued For the worst-case run time: if the array A is sorted in reverse order, the while loop runs j times for each j, so Σj=2:n tj = Σj=2:n j = n(n+1)/2 − 1 Σj=2:n (tj−1) = Σj=2:n (j−1) = n(n−1)/2 T(n) = ((c4 + c5 + c6)/2) n² + (c1 + c2 + c3 + (c4 − c5 − c6)/2 + c7) n − (c2 + c3 + c4 + c7) = An² + Bn + C Average-case runtime: assume about half of the sorted subarray A[1..j−1] is passed over, so tj ≈ j/2, which leads to a similar quadratic function.
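The pseudocode of the previous slide in runnable form (0-indexed, so the loop starts at j = 1; the function name is ours). On a reverse-sorted input every element of the sorted prefix is shifted, giving the worst-case quadratic behavior analyzed above:

```python
def insertion_sort(A):
    """Insertion sort: insert A[j] into the already-sorted prefix A[0..j-1]."""
    for j in range(1, len(A)):             # line 1
        key = A[j]                          # line 2
        i = j - 1                           # line 3
        while i >= 0 and A[i] > key:        # line 4
            A[i + 1] = A[i]                 # line 5: shift larger element right
            i -= 1                          # line 6
        A[i + 1] = key                      # line 7
    return A

print(insertion_sort([5, 3, 2, 8, 3]))  # [2, 3, 3, 5, 8]
```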
The methods • Brute force • Divide and conquer • Decrease and conquer • Transform and conquer • Greedy approach • Dynamic programming • Iterative improvement • Backtracking • Branch and bound • Randomized algorithms