
Algorithm Efficiency Measurement Tools and Analysis

Explore algorithm analysis tools such as Big-Oh notation and basic probability for measuring efficiency. Learn why a faster algorithm can beat a faster CPU, and the roles of the Big-Oh, Big-Omega, and Big-Theta notations in algorithm analysis.



Presentation Transcript


  1. CSCE 411: Design and Analysis of Algorithms, Background Review • Prof. Jennifer Welch

  2. Outline • Algorithm Analysis Tools • Sorting • Graph Algorithms

  3. Algorithm Analysis Tools • Big-Oh notation • Basic probability

  4. How to Measure Efficiency • Machine-independent way: • analyze a "pseudocode" version of the algorithm • assume an idealized machine model: one instruction takes one time unit • "Big-Oh" notation: order of magnitude as problem size increases • Worst-case analyses: safe, the worst case often occurs in practice, and the average case is often just as bad

  5. Faster Algorithm vs. Faster CPU • A faster algorithm running on a slower machine will always win for large enough instances • Suppose algorithm S1 sorts n keys in 2n² instructions • Suppose computer C1 executes 1 billion instructions/sec • When n = 1 million, S1 on C1 takes 2000 sec • Suppose algorithm S2 sorts n keys in 50n log₂ n instructions • Suppose computer C2 executes 10 million instructions/sec • When n = 1 million, S2 on C2 takes about 100 sec
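To make the arithmetic concrete, here is a small Python sketch reproducing the slide's numbers (the instruction counts and machine speeds are taken directly from the slide):

    import math

    n = 1_000_000

    # S1 on C1: 2*n^2 instructions at 10^9 instructions/sec
    t1 = 2 * n**2 / 1e9               # 2000.0 seconds
    # S2 on C2: 50*n*log2(n) instructions at 10^7 instructions/sec
    t2 = 50 * n * math.log2(n) / 1e7  # about 99.7 seconds (the slide rounds to 100)

    print(t1, t2)   # the slower machine wins because it runs the better algorithm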

  6. Caveat • There is no point in finding the fastest algorithm for a part of the program that is not the bottleneck • If the program will only be run a few times, or time is not an issue (e.g., it runs overnight), then there is no point in finding the fastest algorithm

  7. Review of Big-Oh Notation • Measure time as a function of input size and ignore • multiplicative constants • lower-order terms • f(n) = O(g(n)) : f is "≤" g • f(n) = Ω(g(n)) : f is "≥" g • f(n) = Θ(g(n)) : f is "=" g • f(n) = o(g(n)) : f is "<" g • f(n) = ω(g(n)) : f is ">" g

  8. Big-Oh • f(n) = O(g(n)) means, informally, that g is an upper bound on f • Formally: there exist c > 0, n0 > 0 s.t. f(n) ≤ cg(n) for all n ≥ n0. • To show that f(n) = O(g(n)), either find c and n0 that satisfy the definition, or use this fact: • if lim n→∞ f(n)/g(n) = some nonnegative constant, then f(n) = O(g(n)).
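As a quick illustration of the limit test, a Python sketch (the functions f and g here are hypothetical choices for illustration, not from the slides):

    # Approximate lim f(n)/g(n) numerically for f(n) = 3n^2 + 10n, g(n) = n^2.
    def f(n): return 3 * n**2 + 10 * n
    def g(n): return n**2

    for n in [10, 1000, 100_000, 10_000_000]:
        print(n, f(n) / g(n))   # ratios approach the constant 3, so f(n) = O(g(n))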

  9. Big-Omega • f(n) = Ω(g(n)) means, informally, that g is a lower bound on f • Formally: there exist c > 0, n0 > 0 s.t. f(n) ≥ cg(n) for all n ≥ n0. • To show that f(n) = Ω(g(n)), either find c and n0 that satisfy the definition, or use this fact: • if lim n→∞ f(n)/g(n) = some positive constant or ∞, then f(n) = Ω(g(n)).

  10. Big-Theta • f(n) = Θ(g(n)) means, informally, that g is a tight bound on f • Formally: there exist c1 > 0, c2 > 0, n0 > 0 s.t. c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0. • To show that f(n) = Θ(g(n)), either find c1, c2, n0 that satisfy the definition, or use this fact: • if lim n→∞ f(n)/g(n) = some positive constant, then f(n) = Θ(g(n)).

  11. Tying Back to Algorithm Analysis • An algorithm has worst-case running time O(g(n)) if there exist c > 0 and n0 > 0 s.t. for all n ≥ n0, every execution (including the slowest) on an input of size n takes at most cg(n) time. • An algorithm has worst-case running time Ω(g(n)) if there exist c > 0 and n0 > 0 s.t. for all n ≥ n0, at least one execution (in particular the slowest) on an input of size n takes at least cg(n) time.

  12. More Definitions • f(n) = o(g(n)) means, informally, that g is a non-tight upper bound on f • lim n→∞ f(n)/g(n) = 0 • pronounced "little-oh" • f(n) = ω(g(n)) means, informally, that g is a non-tight lower bound on f • lim n→∞ f(n)/g(n) = ∞ • pronounced "little-omega"

  13. How To Solve Recurrences • Ad hoc method: • expand several times • guess the pattern • can verify with proof by induction • Master theorem • general formula that works if the recurrence has the form T(n) = aT(n/b) + f(n) • a is the number of subproblems • n/b is the size of each subproblem • f(n) is the cost of the non-recursive part
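For reference, here is a standard statement of the master theorem in LaTeX (the textbook form; assumed here, since the slide only names the theorem):

    T(n) = a\,T(n/b) + f(n), \qquad a \ge 1,\; b > 1

    T(n) =
    \begin{cases}
    \Theta(n^{\log_b a})         & \text{if } f(n) = O(n^{\log_b a - \varepsilon}) \text{ for some } \varepsilon > 0,\\
    \Theta(n^{\log_b a}\log n)   & \text{if } f(n) = \Theta(n^{\log_b a}),\\
    \Theta(f(n))                 & \text{if } f(n) = \Omega(n^{\log_b a + \varepsilon}) \text{ and } a\,f(n/b) \le c\,f(n) \text{ for some } c < 1.
    \end{cases}

For example, mergesort's recurrence T(n) = 2T(n/2) + Θ(n) has a = b = 2, so n^(log_b a) = n and the second case gives T(n) = Θ(n log n), matching slide 41 below.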

  14. Probability • Every probabilistic claim ultimately refers to some sample space, which is a set of elementary events • Think of each elementary event as the outcome of some experiment • Ex: flipping two coins gives sample space {HH, HT, TH, TT} • An event is a subset of the sample space • Ex: event "both coins flipped the same" is {HH, TT}

  15. Sample Spaces and Events • [Venn diagram: sample space S = {HH, HT, TH, TT} with event A drawn as a subset of S]

  16. Probability Distribution • A probability distribution Pr on a sample space S is a function from events of S to real numbers s.t. • Pr[A] ≥ 0 for every event A • Pr[S] = 1 • Pr[A ∪ B] = Pr[A] + Pr[B] for every two non-intersecting ("mutually exclusive") events A and B • Pr[A] is the probability of event A

  17. Probability Distribution • Useful facts: • Pr[∅] = 0 • If A ⊆ B, then Pr[A] ≤ Pr[B] • Pr[S − A] = 1 − Pr[A] (complement) • Pr[A ∪ B] = Pr[A] + Pr[B] − Pr[A ∩ B] ≤ Pr[A] + Pr[B]

  18. Probability Distribution • [Venn diagram: overlapping events A and B] • Pr[A ∪ B] = Pr[A] + Pr[B] − Pr[A ∩ B]

  19. Example • [diagram: each of HH, HT, TH, TT has probability 1/4] • Suppose Pr[{HH}] = Pr[{HT}] = Pr[{TH}] = Pr[{TT}] = 1/4. • Pr["at least one head"] = Pr[{HH, HT, TH}] = Pr[{HH}] + Pr[{HT}] + Pr[{TH}] = 3/4. • Pr["less than one head"] = 1 − Pr["at least one head"] = 1 − 3/4 = 1/4

  20. Specific Probability Distribution • discrete probability distribution: sample space is finite or countably infinite • Ex: flipping two coins once; flipping one coin infinitely often • uniform probability distribution: sample space S is finite and every elementary event has the same probability, 1/|S| • Ex: flipping two fair coins once

  21. Flipping a Fair Coin • Suppose we flip a fair coin n times • Each elementary event in the sample space is one sequence of n heads and tails, describing the outcome of one "experiment" • The size of the sample space is 2ⁿ. • Let A be the event "k heads and n−k tails occur". • Pr[A] = C(n,k)/2ⁿ. • There are C(n,k) sequences of length n in which k heads and n−k tails occur, and each has probability 1/2ⁿ.

  22. Example • n = 5, k = 3 • Event A is {HHHTT, HHTTH, HTTHH, TTHHH, HHTHT, HTHTH, THTHH, HTHHT, THHTH, THHHT} • Pr[3 heads and 2 tails] = C(5,3)/2⁵ = 10/32
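A brute-force Python check of this count (an illustrative sketch, not part of the slides):

    from itertools import product
    from math import comb

    n, k = 5, 3
    outcomes = list(product("HT", repeat=n))          # all 2^n equally likely sequences
    favorable = [s for s in outcomes if s.count("H") == k]
    print(len(favorable), len(outcomes))              # 10 32
    print(comb(n, k) / 2**n)                          # 0.3125 = 10/32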

  23. Flipping Unfair Coins • Suppose we flip two coins, each of which gives heads two-thirds of the time • What is the probability distribution on the sample space? • [diagram: Pr[{HH}] = 4/9, Pr[{HT}] = Pr[{TH}] = 2/9, Pr[{TT}] = 1/9] • Pr[at least one head] = 8/9

  24. Independent Events • Two events A and B are independent if Pr[A ∩ B] = Pr[A]·Pr[B] • I.e., the probability that both A and B occur is the product of the separate probabilities that A occurs and that B occurs.

  25. Independent Events Example • [diagram: each of HH, HT, TH, TT has probability 1/4, with events A and B marked] • In the two-coin-flip example with fair coins: • A = "first coin is heads" • B = "coins are different" • Pr[A] = 1/2 • Pr[B] = 1/2 • Pr[A ∩ B] = 1/4 = (1/2)(1/2), so A and B are independent
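The same check in Python (a sketch; the event definitions mirror the slide):

    from itertools import product

    space = list(product("HT", repeat=2))     # HH, HT, TH, TT, each with probability 1/4
    def pr(event): return len(event) / len(space)

    A = {s for s in space if s[0] == "H"}     # first coin is heads
    B = {s for s in space if s[0] != s[1]}    # coins are different
    print(pr(A), pr(B), pr(A & B))            # 0.5 0.5 0.25
    print(pr(A & B) == pr(A) * pr(B))         # True, so A and B are independent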

  26. Discrete Random Variables • A discrete random variable X is a function from a finite or countably infinite sample space to the real numbers. • Associates a real number with each possible outcome of an experiment • Define the event "X = v" to be the set of all the elementary events s in the sample space with X(s) = v. • Pr["X = v"] is the sum of Pr[{s}] over all s with X(s) = v.

  27. Discrete Random Variable • [diagram: the sample space partitioned into events "X = v" for the different values v] • Add up the probabilities of all the elementary events in the event "X = v" to get the probability that X = v

  28. Random Variable Example • Roll two fair 6-sided dice. • The sample space contains 36 elementary events (1:1, 1:2, 1:3, 1:4, 1:5, 1:6, 2:1, …) • The probability of each elementary event is 1/36 • Define random variable X to be the maximum of the two values rolled • What is Pr["X = 3"]? • It is 5/36, since there are 5 elementary events with max value 3 (1:3, 2:3, 3:3, 3:2, and 3:1)
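A quick enumeration in Python confirming Pr["X = 3"] = 5/36 (an illustrative sketch):

    from itertools import product

    rolls = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes
    hits = [r for r in rolls if max(r) == 3]       # X = max of the two values
    print(len(hits), len(rolls))                   # 5 36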

  29. Independent Random Variables • It is common for more than one random variable to be defined on the same sample space. E.g.: • X is maximum value rolled • Y is sum of the two values rolled • Two random variables X and Y are independent if for all v and w, the events "X = v" and "Y = w" are independent.

  30. Expected Value of a Random Variable • The most common summary of a random variable is its "average", weighted by the probabilities • called expected value, or expectation, or mean • Definition: E[X] = ∑_v v·Pr[X = v]

  31. Expected Value Example • Consider a game in which you flip two fair coins. • You get $3 for each head but lose $2 for each tail. • What are your expected earnings? • I.e., what is the expected value of the random variable X, where X(HH) = 6, X(HT) = X(TH) = 1, and X(TT) = −4? • Note that no value other than 6, 1, and −4 can be taken on by X (e.g., Pr[X = 5] = 0). • E[X] = 6(1/4) + 1(1/4) + 1(1/4) + (−4)(1/4) = 1
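The same computation as a short Python sketch (the outcomes are uniform, so E[X] is just the average of X over the sample space):

    from itertools import product

    def X(outcome):                               # $3 per head, minus $2 per tail
        return sum(3 if c == "H" else -2 for c in outcome)

    space = list(product("HT", repeat=2))         # HH, HT, TH, TT
    print(sum(X(s) for s in space) / len(space))  # 1.0, matching E[X] = 1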

  32. Properties of Expected Values • E[X+Y] = E[X] + E[Y], for any two random variables X and Y, even if they are not independent! • E[a·X] = a·E[X], for any random variable X and any constant a. • E[X·Y] = E[X]·E[Y], for any two independent random variables X and Y
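A numeric check of linearity of expectation on the dice example from slide 29, where X and Y are not independent (an illustrative sketch):

    from itertools import product

    rolls = list(product(range(1, 7), repeat=2))     # uniform over 36 outcomes
    def E(Z): return sum(Z(r) for r in rolls) / len(rolls)

    X = lambda r: max(r)        # maximum value rolled
    Y = lambda r: sum(r)        # sum of the values (X and Y are NOT independent)
    print(E(lambda r: X(r) + Y(r)), E(X) + E(Y))     # equal: about 11.47 each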

  33. Conditional Probability • Formalizes having partial knowledge about the outcome of an experiment • Example: flip two fair coins. • The probability of two heads is 1/4 • The probability of two heads when you already know that the first coin is a head is 1/2 • The conditional probability of A given that B occurs, denoted Pr[A|B], is defined to be Pr[A ∩ B]/Pr[B]

  34. Conditional Probability • [Venn diagram: overlapping events A and B] • Pr[A] = 5/12, Pr[B] = 7/12, Pr[A ∩ B] = 2/12 • Pr[A|B] = (2/12)/(7/12) = 2/7

  35. Conditional Probability • The definition is Pr[A|B] = Pr[A ∩ B]/Pr[B] • Equivalently, Pr[A ∩ B] = Pr[A|B]·Pr[B]
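The two-heads example from slide 33 in Python (a sketch):

    from itertools import product

    space = list(product("HT", repeat=2))
    def pr(event): return len(event) / len(space)

    A = {("H", "H")}                           # two heads
    B = {s for s in space if s[0] == "H"}      # first coin is a head
    print(pr(A & B) / pr(B))                   # 0.5 = Pr[A|B], vs. Pr[A] = 0.25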

  36. Sorting • Insertion Sort • Heapsort • Mergesort • Quicksort

  37. Insertion Sort Review • How it works: • incrementally build up a longer and longer prefix of the array of keys that is in sorted order • take the current key, find its correct place in the sorted prefix, and shift to make room to insert it • Finding the correct place relies on comparing the current key to keys in the sorted prefix • Worst-case running time is Θ(n²)
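A minimal Python implementation of the scheme described above (a sketch, not the course's code):

    def insertion_sort(a):
        """Sort list a in place; worst case Theta(n^2) comparisons."""
        for i in range(1, len(a)):
            key = a[i]                        # current key to place
            j = i - 1
            while j >= 0 and a[j] > key:      # compare with the sorted prefix
                a[j + 1] = a[j]               # shift right to make room
                j -= 1
            a[j + 1] = key                    # insert into its correct place
        return a

    print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]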

  38. Heapsort Review • How it works: • put the keys in a heap data structure • repeatedly remove the min from the heap • Manipulating the heap involves comparing keys to each other • Worst-case running time is Θ(n log n)
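A compact sketch using Python's heapq module (the library min-heap stands in for the heap data structure on the slide):

    import heapq

    def heapsort(keys):
        heap = list(keys)
        heapq.heapify(heap)                  # build the heap in O(n)
        return [heapq.heappop(heap)          # repeatedly remove the min: O(log n) each
                for _ in range(len(heap))]

    print(heapsort([5, 2, 4, 6, 1, 3]))      # [1, 2, 3, 4, 5, 6]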

  39. Mergesort Review • How it works: • split the array of keys in half • recursively sort the two halves • merge the two sorted halves • Merging the two sorted halves involves comparing keys to each other • Worst-case running time is Θ(n log n)
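A direct Python rendering of this description (an illustrative sketch):

    def mergesort(a):
        if len(a) <= 1:                      # base case: already sorted
            return a
        mid = len(a) // 2                    # split the array in half
        left, right = mergesort(a[:mid]), mergesort(a[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):   # merge by comparing keys
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        return merged + left[i:] + right[j:]

    print(mergesort([5, 2, 4, 6, 1, 3]))     # [1, 2, 3, 4, 5, 6]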

  40. Mergesort Example • [diagram: recursion tree for mergesort on a small array, showing the array split in half, the halves sorted recursively, and the sorted halves merged]

  41. Recurrence Relation for Mergesort • Let T(n) be the worst-case time on a sequence of n keys • If n = 1, then T(n) = Θ(1) (constant) • If n > 1, then T(n) = 2T(n/2) + Θ(n) • two subproblems of size n/2 each that are solved recursively • Θ(n) time to do the merge • Solving the recurrence gives T(n) = Θ(n log n)

  42. Quicksort Review • How it works: • choose one key to be the pivot • partition the array of keys into those keys < the pivot and those ≥ the pivot • recursively sort the two partitions • Partitioning the array involves comparing keys to the pivot • Worst-case running time is Θ(n²)
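A short Python sketch of this scheme (using the first key as pivot for simplicity; the slide does not fix a pivot rule):

    def quicksort(a):
        if len(a) <= 1:
            return a
        pivot = a[0]                                 # chosen pivot key
        less = [x for x in a[1:] if x < pivot]       # keys < pivot
        geq = [x for x in a[1:] if x >= pivot]       # keys >= pivot
        return quicksort(less) + [pivot] + quicksort(geq)

    print(quicksort([5, 2, 4, 6, 1, 3]))             # [1, 2, 3, 4, 5, 6]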

  43. Graphs • Mathematical definitions • Representations • adjacency list, adjacency matrix • Traversals • breadth-first search, depth-first search • Minimum spanning trees • Kruskal’s algorithm, Prim’s algorithm • Single-source shortest paths • Dijkstra’s algorithm, Bellman-Ford algorithm

  44. Graphs • Directed graph G = (V,E) • V is a finite set of vertices • E is the edge set, a binary relation on V (ordered pairs of vertices) • In an undirected graph, edges are unordered pairs of vertices • If (u,v) is in E, then v is adjacent to u, and (u,v) is incident from u and incident to v; v is a neighbor of u • in an undirected graph, u is also adjacent to v, and (u,v) is incident on u and v • The degree of a vertex is the number of edges incident from or to (or on) the vertex

  45. More on Graphs • A path of length k from vertex u to vertex u′ is a sequence of k+1 vertices s.t. there is an edge from each vertex to the next in the sequence • In this case, u′ is reachable from u • A path is simple if no vertices are repeated • A path forms a cycle if the last vertex = the first vertex • A cycle is simple if no vertices are repeated other than the first and last

  46. Graph Representations • So far, we have discussed graphs purely as mathematical concepts • How can we represent a graph in a computer program? We need some data structures. • Two most common representations are: • adjacency list – best for sparse graphs (few edges) • adjacency matrix – best for dense graphs (lots of edges)

  47. Adjacency List Representation • Given graph G = (V,E) • array Adj of |V| linked lists, one for each vertex in V • The adjacency list for vertex u, Adj[u], contains all vertices v s.t. (u,v) is in E • each vertex in the list might be just a string, representing its name, or it might be some other data structure representing the vertex, or it might be a pointer to the data structure for that vertex • Requires Θ(V+E) space (memory) • Requires Θ(degree(u)) time to check if (u,v) is in E

  48. Adjacency List Representation • [diagram: undirected graph on vertices a, b, c, d, e] • Adj[a] = b, c • Adj[b] = a, d, e • Adj[c] = a, d • Adj[d] = b, c, e • Adj[e] = b, d • Space-efficient for sparse graphs
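The lists above as a Python dict (a sketch of one possible in-memory representation):

    # Adjacency lists for the 5-vertex undirected graph on this slide.
    Adj = {
        "a": ["b", "c"],
        "b": ["a", "d", "e"],
        "c": ["a", "d"],
        "d": ["b", "c", "e"],
        "e": ["b", "d"],
    }
    # Checking whether (u,v) is in E scans Adj[u]: Theta(degree(u)) time.
    print("d" in Adj["b"])   # True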

  49. Adjacency Matrix Representation • Given graph G = (V,E) • Number the vertices 1, 2, ..., |V| • Use a |V| × |V| matrix A with the (i,j)-entry of A being • 1 if (i,j) is in E • 0 if (i,j) is not in E • Requires Θ(V²) space (memory) • Requires Θ(1) time to check if (u,v) is in E

  50. Adjacency Matrix Representation • [diagram: the same undirected graph on vertices a, b, c, d, e] • Check for an edge in constant time

          a b c d e
      a   0 1 1 0 0
      b   1 0 0 1 1
      c   1 0 0 1 0
      d   0 1 1 0 1
      e   0 1 0 1 0
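The same matrix in Python (a sketch; vertices a..e are numbered 0..4):

    A = [
        [0, 1, 1, 0, 0],   # a
        [1, 0, 0, 1, 1],   # b
        [1, 0, 0, 1, 0],   # c
        [0, 1, 1, 0, 1],   # d
        [0, 1, 0, 1, 0],   # e
    ]
    u, v = 1, 3            # b and d
    print(A[u][v] == 1)    # constant-time edge check: True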
