420 likes | 599 Views
2IL50 Data Structures. Spring 2014 Lecture 2: Analysis of Algorithms. Analysis of algorithms. the formal way …. Analysis of algorithms. Can we say something about the running time of an algorithm without implementing and testing it?. InsertionSort (A) initialize: sort A[1]
E N D
2IL50 Data Structures Spring 2014Lecture 2: Analysis of Algorithms
Analysis of algorithms the formal way …
Analysis of algorithms • Can we say something about the running time of an algorithm without implementing and testing it? • InsertionSort(A) • initialize: sort A[1] • for j = 2 to A.length • do key = A[j] • i = j -1 • while i > 0 and A[i] > key • do A[i+1] = A[i] • i = i -1 • A[i +1] = key
Analysis of algorithms • Analyze the running time as a function of n (# of input elements) • best case • average case • worst case elementary operationsadd, subtract, multiply, divide, load, store, copy, conditional and unconditional branch, return … An algorithm has worst case running time T(n) if for any input of size n the maximal number of elementary operations executed is T(n).
n=10 n=100 n=1000 1568 150698 1.5 x 107 10466 204316 3.0 x 106 InsertionSort 6 x faster InsertionSort 1.35 x faster MergeSort 5 x faster Analysis of algorithms: example InsertionSort: 15 n2 + 7n – 2 MergeSort: 300 n lg n + 50 n The rate of growth of the running time as a function of the input is essential! n= 1,000,000InsertionSort 1.5 x 1013 MergeSort 6 x 1092500 x faster !
Θ-notation Intuition: concentrate on the leading term, ignore constants 19 n3 + 17 n2 - 3nbecomes Θ(n3) 2 n lg n + 5 n1.1 - 5becomes n - ¾ n √nbecomes Θ(n1.1) ---
Θ-notation Let g(n) : N ↦ N be a function. Then we have Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0} “Θ(g(n)) is the set of functions that grow as fast as g(n)”
c2g(n) f(n) c1g(n) 0 n0 n Θ-notation Let g(n) : N ↦ N be a function. Then we have Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0} Notation: f(n) = Θ(g(n))
Θ-notation Let g(n) : N ↦ N be a function. Then we have Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0} Claim: 19n3 + 17n2 - 3n = Θ(n3) Proof: Choose c1= 19, c2 = 36 and n0 = 1. Then we have for all n ≥ n0: 0 ≤ c1n3 (trivial) ≤ 19n3 + 17n2 - 3n (since 17n2 > 3n for n ≥ 1) ≤ c2n3 (since 17n2 ≤ 17n3 for n ≥1)■
Θ-notation Let g(n) : N ↦ N be a function. Then we have Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0} Claim: 19n3 + 17n2 - 3n ≠ Θ(n2) Proof: Assume that there are positive constants c1, c2, and n0 such that for all n ≥ n0 0 ≤ c1n2 ≤ 19n3 + 17n2 - 3n ≤ c2n2 Since 3n – 17n2 ≤ 0 we would have for all n ≥ n019n3 ≤ c2n2 and hence 19n ≤ c2.
O-notation Let g(n) : N ↦ N be a function. Then we have O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0} “O(g(n)) is the set of functions that grow at most as fast as g(n)”
cg(n) f(n) 0 n0 n O-notation Let g(n) : N ↦ N be a function. Then we have O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0} Notation: f(n) = O(g(n))
Ω-notation Let g(n) : N ↦ N be a function. Then we have Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ cg(n) ≤ f(n) for all n ≥ n0} “Ω(g(n)) is the set of functions that grow at least as fast as g(n)”
f(n) cg(n) 0 n0 n Ω-notation Let g(n) : N ↦ N be a function. Then we have Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ cg(n) ≤ f(n) for all n ≥ n0} Notation: f(n) = Ω(g(n))
Asymptotic notation Θ(…) is an asymptotically tight bound O(…) is an asymptotic upper bound Ω(…) is an asymptotic lower bound • other asymptotic notationo(…) → “grows strictly slower than”ω(…) → “grows strictly faster than” “asymptotically equal” “asymptotically smaller or equal” “asymptotically greater or equal”
f(n) = n3 + Θ(n2) means f(n) = means O(1) or Θ(1) means 2n2 + O(n) = Θ(n2) means there is a function g(n) such that f(n) = n3 + g(n) and g(n) = Θ(n2) there is one function g(i) such that f(n) = and g(i) = O(i) a constant for each function g(n) with g(n)=O(n) we have 2n2 + g(n) = Θ(n2) More notation …
O(1) + O(1) = O(1) O(1) + … + O(1) = O(1) = O(n2) ⊆ O(n3) O(n3) ⊆ O(n2) Θ(n2) ⊆ O(n3) An algorithm with worst case running time O(n log n) is always slower than an algorithm with worst case running time O(n) if n is sufficiently large. true false true true false true false Quiz
n log2 n = Θ(n log n) n log2 n = Ω(n log n) n log2 n = O(n4/3) O(2n) ⊆ O(3n) O(2n) ⊆ Θ(3n) false true true true false Quiz
Analysis of InsertionSort InsertionSort(A) • initialize: sort A[1] • for j = 2 to A.length • do key = A[j] • i = j -1 • while i > 0 and A[i] > key • do A[i+1] = A[i] • i = i -1 • A[i +1] = key • Get as tight a bound as possible on the worst case running time. ➨ lower and upper bound for worst case running time Upper bound: Analyze worst case number of elementary operations Lower bound: Give “bad” input example
Analysis of InsertionSort InsertionSort(A) • initialize: sort A[1] • for j = 2 to A.length • do key = A[j] • i = j -1 • while i > 0 and A[i] > key • do A[i+1] = A[i] • i = i -1 • A[i +1] = key Upper bound: Let T(n) be the worst case running time of InsertionSort on an array of length n. We have T(n) = Lower bound: O(1) O(1) worst case:(j-1) ∙ O(1) O(1) { O(1) + (j-1)∙O(1) + O(1) } = O(j) = O(n2) O(1) + Ω(n2) Array sorted in de-creasing order ➨ The worst case running time of InsertionSort isΘ(n2).
Analysis of MergeSort • MergeSort(A) • // divide-and-conquer algorithm that sorts array A[1..n] • if A.length = 1 • thenskip • else • n = A.length ; n1= floor(n/2); n2= ceil(n/2); • copy A[1.. n1] to auxiliary array A1[1.. n1] • copy A[n1+1..n] to auxiliary array A2[1.. n2] • MergeSort(A1); MergeSort(A2) • Merge(A, A1, A2) O(1) O(1) O(n) O(n) ?? O(n) T( n/2 ) + T( n/2 ) MergeSort is a recursive algorithm ➨ running time analysis leads to recursion
Analysis of MergeSort • Let T(n) be the worst case running time of MergeSort on an array of length n. We have O(1) if n = 1 T(n) = T( n/2 ) + T( n/2 ) + Θ(n) if n > 1 frequently omitted since it (nearly) always holds often written as2T(n/2)
Solving recurrences • Easiest: Master theoremcaveat: not always applicable • Alternatively: Guess the solution and use the substitution method to prove that your guess it is correct. • How to guess: • expand the recursion • draw a recursion tree
The master theorem Let a and b be constants, let f(n) be a function, and let T(n) be defined on the nonnegative integers by the recurrence T(n) = aT(n/b) + f(n) Then we have: • If f(n) = O(nlog a – ε) for some constant ε > 0, then T(n) = Θ(nlog a). • If f(n) = Θ(nlog a), then T(n) = Θ(nlog a log n) • If f(n) = Ω(nlog a + ε) for some constant ε > 0, and if af(n/b) ≤ cf(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)) can be rounded up or down note: logba - ε b b b b b
The master theorem: Example T(n) = 4T(n/2) + Θ(n3) • Master theorem with a = 4, b = 2, and f(n) = n3 logba = log24 = 2 ➨ n3 = f(n) = Ω(nlog a + ε) = Ω(n2 + ε) with, for example, ε = 1 • Case 3 of the master theorem gives T(n) = Θ(n3), if the regularity condition holds. choose c = ½ and n0 = 1 ➨af(n/b) = 4(n/2)3 = n3/2 ≤ cf(n) for n ≥ n0 ➨ T(n) = Θ(n3) b
The substitution method • The Master theorem does not always apply • In those cases, use the substitution method: • Guess the form of the solution. • Use induction to find the constants and show that the solution works Use expansion or a recursion-tree to guess a good solution.
Recursion-trees T(n) = 2T(n/2) + n n n/2 n/2 n/4 n/4 n/4 n/4 n/2i n/2i … n/2i Θ(1) Θ(1) … Θ(1)
n n/2 n/2 n/4 n/4 n/4 n/4 n/2i n/2i … n/2i Θ(1) Θ(1) … Θ(1) Recursion-trees T(n) = 2T(n/2) + n n 2 ∙ (n/2) = n 4 ∙ (n/4) = n log n 2i∙ (n/2i) = n n ∙ Θ(1) = Θ(n) + Θ(n log n)
Recursion-trees T(n) = 2T(n/2) + n2 n2 (n/2)2 (n/2)2 (n/4)2 (n/4)2 (n/4)2 (n/4)2 (n/2i)2 (n/2i)2 … (n/2i)2 Θ(1) Θ(1) … Θ(1)
n2 (n/2)2 (n/2)2 (n/4)2 (n/4)2 (n/4)2 (n/4)2 (n/2i)2 (n/2i)2 … (n/2i)2 Θ(1) Θ(1) … Θ(1) Recursion-trees T(n) = 2T(n/2) + n2 n2 2 ∙ (n/2)2 = n2/2 4 ∙ (n/4)2 = n2/4 2i∙ (n/2i) 2 = n2/2i n ∙ Θ(1) = Θ(n) + Θ(n2)
Recursion-trees T(n) = 4T(n/2) + n n n/2 n/2 n/2 n/2 … n/4 n/4 n/4 n/4 … Θ(1) Θ(1) … Θ(1)
Recursion-trees T(n) = 4T(n/2) + n n n n/2 n/2 n/2 n/2 4 ∙ (n/2) = 2n 16 ∙ (n/4) = 4n … n/4 n/4 n/4 n/4 … Θ(1) Θ(1) … Θ(1) n2∙ Θ(1) = Θ(n2) + Θ(n2)
3/2 = 1, 4/2 = 2 ➨ 3 must also be base case The substitution method 2 if n = 1 2T( n/2 ) + n if n > 1 Claim: T(n) = O(n log n) Proof: by induction on n to show: there are constants c and n0suchthat T(n) ≤ c n log n for all n ≥ n0 T(1) = 2➨ choose c = 2(for now)andn0 = 2 • Why n0 = 2? • How many base cases? Base cases: n = 2:T(2) = 2T(1) + 2 ≤ 2∙2 + 2 = 6 = c 2 log 2 for c = 3 n = 3: T(3) = 2T(1) + 3 ≤ 2∙2 + 3 = 7 ≤ c 3 log 3 T(n) = log 1 = 0 ➨ can not prove bound for n = 1
The substitution method 2 if n = 1 2T( n/2 ) + n if n > 1 Claim: T(n) = O(n log n) Proof: by induction on n to show: there are constants c and n0 such that T(n) ≤ c n log n for all n ≥ n0 • choose c = 3andn0 = 2 Inductive step: n > 3 T(n) = 2T( n/2 ) + n ≤ 2 cn/2 log n/2 + n (ind. hyp.) ≤ cn ((log n) - 1) + n ≤ cn log n■ T(n) =
The substitution method Θ(1) if n = 1 2T( n/2 ) + n if n > 1 Claim: T(n) = O(n) Proof: by induction on n Base case: n = n0 T(2) = 2T(1) + 2 = 2c + 2 = O(2) Inductive step: n > n0 T(n) = 2T( n/2 ) + n = 2O( n/2 ) + n (ind. hyp.) = O(n) ■ T(n) = Never use O, Θ, or Ω in a proof by induction!
Analysis of algorithms one more example …
Example • Example (A) • // A is an array of length n • n = A.length • if n=1 • then return A[1] • else begin • Copy A[1… n/2 ] to auxiliary array B[1... n/2 ] • Copy A[1… n/2 ] to auxiliary array C[1… n/2 ] • b =Example(B);c =Example(C) • for i = 1 to n • do for j = 1 to i • do A[i] = A[j] • return 43 • end
Let T(n) be the worst case running time of Example on an array of length n. Lines 1,2,3,4,11, and 12 take Θ(1) time. Lines 5 and 6 take Θ(n) time. Line 7 takes Θ(1) + 2 T( n/2 ) time. Lines 8 until 10 take time. If n=1 lines 1,2,3 are executed, else lines 1,2, and 4 until 12 are executed. ➨ T(n): ➨ use master theorem … Θ(1) if n=1 2T(n/2) + Θ(n2) if n>1 Example • Example (A) • // A is an array of length n • n = A.length • if n=1 • then return A[1] • else begin • Copy A[1… n/2 ] to auxiliary array B[1... n/2 ] • Copy A[1… n/2 ] to auxiliary array C[1… n/2 ] • b =Example(B);c =Example(C) • for i = 1 to n • do for j = 1 to i • do A[i] = A[j] • return 43 • end
Tips • Analysis of recursive algorithms:find the recursion and solve with master theorem if possible • Analysis of loops: summations • Some standard recurrences and sums: • T(n) = 2T(n/2) + Θ(n) ➨ • ½ n(n+1) = Θ(n2) • Θ(n3) T(n) = Θ(n log n)
Announcements • Group sign-up closes tonight (Wed 05-02-2014) at midnight. • Not signed up for group ➨ de-registered from the course on Thursday 06-02-2014 • Some students might be moved to equalize sizes. • Email your assignment as .pdfto your tutor, not to me.