CSCE 411: Design and Analysis of Algorithms, Background Review Prof. Jennifer Welch CSCE 411, Background Review
Outline • Algorithm Analysis Tools • Sorting • Graph Algorithms
Algorithm Analysis Tools • Big-Oh notation • Basic probability
How to Measure Efficiency • Machine-independent way: • analyze "pseudocode" version of algorithm • assume idealized machine model • one instruction takes one time unit • "Big-Oh" notation • order of magnitude as problem size increases • Worst-case analyses • safe; the worst case often occurs in practice, and the average case is often just as bad
Faster Algorithm vs. Faster CPU • A faster algorithm running on a slower machine will always win for large enough instances • Suppose algorithm S1 sorts n keys in 2n² instructions • Suppose computer C1 executes 1 billion instruc/sec • When n = 1 million, takes 2000 sec • Suppose algorithm S2 sorts n keys in 50n log₂ n instructions • Suppose computer C2 executes 10 million instruc/sec • When n = 1 million, takes 100 sec
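The arithmetic behind these two figures can be checked directly; a minimal sketch (the slide's 100-second figure is rounded, the exact value is about 99.7 seconds):

```python
import math

def seconds(instructions, instr_per_sec):
    """Running time = total instructions / machine speed."""
    return instructions / instr_per_sec

n = 1_000_000

# S1 on the fast machine C1: 2n^2 instructions at 10^9 instructions/sec
t1 = seconds(2 * n**2, 1e9)                  # 2000 seconds

# S2 on the slow machine C2: 50 n log2 n instructions at 10^7 instructions/sec
t2 = seconds(50 * n * math.log2(n), 1e7)     # ~99.7 seconds, rounded to 100 on the slide

print(t1, t2)
```

Note that S2's asymptotic advantage (n log n vs. n²) overwhelms a 100x difference in machine speed once n is large.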
Caveat • No point in finding fastest algorithm for part of the program that is not the bottleneck • If program will only be run a few times, or time is not an issue (e.g., run overnight), then no point in finding fastest algorithm
Review of Big-Oh Notation • Measure time as a function of input size and ignore • multiplicative constants • lower order terms • f(n) = O(g(n)) : f is "≤" g • f(n) = Ω(g(n)) : f is "≥" g • f(n) = Θ(g(n)) : f is "=" g • f(n) = o(g(n)) : f is "<" g • f(n) = ω(g(n)) : f is ">" g
Big-Oh • f(n) = O(g(n)) means, informally, that g is an upper bound on f • Formally: there exist c > 0, n0 > 0 s.t. f(n) ≤ cg(n) for all n ≥ n0. • To show that f(n) = O(g(n)), either find c and n0 that satisfy the definition, or use this fact: • if lim_{n→∞} f(n)/g(n) = some nonnegative constant, then f(n) = O(g(n)).
Big-Omega • f(n) = Ω(g(n)) means, informally, that g is a lower bound on f • Formally: there exist c > 0, n0 > 0 s.t. f(n) ≥ cg(n) for all n ≥ n0. • To show that f(n) = Ω(g(n)), either find c and n0 that satisfy the definition, or use this fact: • if lim_{n→∞} f(n)/g(n) = some positive constant or ∞, then f(n) = Ω(g(n)).
Big-Theta • f(n) = Θ(g(n)) means, informally, that g is a tight bound on f • Formally: there exist c1 > 0, c2 > 0, n0 > 0 s.t. c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0. • To show that f(n) = Θ(g(n)), either find c1, c2, n0 that satisfy the definition, or use this fact: • if lim_{n→∞} f(n)/g(n) = some positive constant, then f(n) = Θ(g(n)).
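The limit test above can be illustrated numerically. This sketch (with hypothetical example functions f and g) approximates the limit by evaluating the ratio at a large n; it is a heuristic for building intuition, not a proof:

```python
def ratio(f, g, n=10**6):
    """Approximate lim_{n -> infinity} f(n)/g(n) by evaluating at a large n."""
    return f(n) / g(n)

# Example: f(n) = 3n^2 + 5n, g(n) = n^2; the lower-order term vanishes
f = lambda n: 3 * n**2 + 5 * n
g = lambda n: n**2

print(ratio(f, g))   # close to 3, a positive constant, suggesting f(n) = Theta(g(n))
```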
Tying Back to Algorithm Analysis • Algorithm has worst case running time O(g(n)) if there exist c > 0 and n0 > 0 s.t. for all n ≥ n0, every execution (including the slowest) on an input of size n takes at most cg(n) time. • Algorithm has worst case running time Ω(g(n)) if there exist c > 0 and n0 > 0 s.t. for all n ≥ n0, at least one execution (e.g., the slowest) on an input of size n takes at least cg(n) time.
More Definitions • f(n) = o(g(n)) means, informally, g is a non-tight upper bound on f • lim_{n→∞} f(n)/g(n) = 0 • pronounced "little-oh" • f(n) = ω(g(n)) means, informally, g is a non-tight lower bound on f • lim_{n→∞} f(n)/g(n) = ∞ • pronounced "little-omega"
How To Solve Recurrences • Ad hoc method: • expand several times • guess the pattern • can verify with proof by induction • Master theorem • general formula that works if recurrence has the form T(n) = aT(n/b) + f(n) • a is number of subproblems • n/b is size of each subproblem • f(n) is cost of non-recursive part CSCE 411, Spring 2012: Set 3
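For reference, the standard statement of the master theorem (as in CLRS) compares f(n) against n^(log_b a); a sketch of the three cases:

```latex
T(n) = a\,T(n/b) + f(n), \qquad a \ge 1,\ b > 1
\]
\[
\begin{cases}
f(n) = O\!\left(n^{\log_b a - \epsilon}\right) \text{ for some } \epsilon > 0
  & \Rightarrow\ T(n) = \Theta\!\left(n^{\log_b a}\right) \\[4pt]
f(n) = \Theta\!\left(n^{\log_b a}\right)
  & \Rightarrow\ T(n) = \Theta\!\left(n^{\log_b a}\log n\right) \\[4pt]
f(n) = \Omega\!\left(n^{\log_b a + \epsilon}\right) \text{ and } a f(n/b) \le c\,f(n) \text{ for some } c < 1
  & \Rightarrow\ T(n) = \Theta\!\left(f(n)\right)
\end{cases}
```

For example, mergesort's recurrence T(n) = 2T(n/2) + Θ(n) has a = b = 2, so f(n) = Θ(n) = Θ(n^(log₂ 2)) and case 2 gives T(n) = Θ(n log n).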
Probability • Every probabilistic claim ultimately refers to some sample space, which is a set of elementary events • Think of each elementary event as the outcome of some experiment • Ex: flipping two coins gives sample space {HH, HT, TH, TT} • An event is a subset of the sample space • Ex: event "both coins flipped the same" is {HH, TT}
Sample Spaces and Events [Figure: sample space S = {HH, HT, TH, TT} with an event A shown as a subset]
Probability Distribution • A probability distribution Pr on a sample space S is a function from events of S to real numbers s.t. • Pr[A] ≥ 0 for every event A • Pr[S] = 1 • Pr[A ∪ B] = Pr[A] + Pr[B] for every two non-intersecting ("mutually exclusive") events A and B • Pr[A] is the probability of event A
Probability Distribution Useful facts: • Pr[Ø] = 0 • If A ⊆ B, then Pr[A] ≤ Pr[B] • Pr[S − A] = 1 − Pr[A] (complement) • Pr[A ∪ B] = Pr[A] + Pr[B] − Pr[A ∩ B] ≤ Pr[A] + Pr[B]
Probability Distribution [Figure: overlapping events A and B] Pr[A ∪ B] = Pr[A] + Pr[B] − Pr[A ∩ B]
Example [Figure: each of HH, HT, TH, TT has probability 1/4] • Suppose Pr[{HH}] = Pr[{HT}] = Pr[{TH}] = Pr[{TT}] = 1/4. • Pr["at least one head"] = Pr[{HH, HT, TH}] = Pr[{HH}] + Pr[{HT}] + Pr[{TH}] = 3/4. • Pr["less than one head"] = 1 − Pr["at least one head"] = 1 − 3/4 = 1/4
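The calculation above can be reproduced by enumerating the sample space; a small sketch:

```python
from fractions import Fraction
from itertools import product

# Uniform distribution over two fair coin flips
space = list(product("HT", repeat=2))            # [('H','H'), ('H','T'), ('T','H'), ('T','T')]
pr = {outcome: Fraction(1, 4) for outcome in space}

# Pr["at least one head"]: sum the probabilities of the elementary events in the event
at_least_one_head = sum(p for outcome, p in pr.items() if "H" in outcome)

print(at_least_one_head)        # 3/4
print(1 - at_least_one_head)    # 1/4, by the complement rule
```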
Specific Probability Distribution • discrete probability distribution: sample space is finite or countably infinite • Ex: flipping two coins once; flipping one coin infinitely often • uniform probability distribution: sample space S is finite and every elementary event has the same probability, 1/|S| • Ex: flipping two fair coins once
Flipping a Fair Coin • Suppose we flip a fair coin n times • Each elementary event in the sample space is one sequence of n heads and tails, describing the outcome of one "experiment" • The size of the sample space is 2ⁿ. • Let A be the event "k heads and n−k tails occur". • Pr[A] = C(n,k)/2ⁿ. • There are C(n,k) sequences of length n in which k heads and n−k tails occur, and each has probability 1/2ⁿ.
Example • n = 5, k = 3 • Event A is {HHHTT, HHTTH, HTTHH, TTHHH, HHTHT, HTHTH, THTHH, HTHHT, THHTH, THHHT} • Pr[3 heads and 2 tails] = C(5,3)/2⁵ = 10/32
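The count C(5,3) = 10 can be verified by brute-force enumeration of all 2⁵ sequences; a sketch:

```python
from itertools import product
from math import comb

n, k = 5, 3

# Enumerate all 2^n head/tail sequences and keep those with exactly k heads
seqs = [s for s in product("HT", repeat=n) if s.count("H") == k]

print(len(seqs))            # 10, matching C(5,3)
print(comb(n, k) / 2**n)    # 10/32 = 0.3125
```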
Flipping Unfair Coins [Figure: Pr[{HH}] = 4/9, Pr[{HT}] = Pr[{TH}] = 2/9, Pr[{TT}] = 1/9] • Suppose we flip two coins, each of which gives heads two-thirds of the time • What is the probability distribution on the sample space? Pr[at least one head] = 8/9
Independent Events • Two events A and B are independent if Pr[A ∩ B] = Pr[A]·Pr[B] • I.e., probability that both A and B occur is the product of the separate probabilities that A occurs and that B occurs.
Independent Events Example [Figure: A = {HH, HT}, B = {HT, TH}, each outcome probability 1/4] In two-coin-flip example with fair coins: • A = "first coin is heads" • B = "coins are different" Pr[A] = 1/2 Pr[B] = 1/2 Pr[A ∩ B] = 1/4 = (1/2)(1/2) so A and B are independent
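The independence check in this example can be carried out by enumeration; a minimal sketch:

```python
from fractions import Fraction
from itertools import product

space = list(product("HT", repeat=2))
p = Fraction(1, len(space))                  # uniform: each outcome has probability 1/4

A = {s for s in space if s[0] == "H"}        # "first coin is heads"
B = {s for s in space if s[0] != s[1]}       # "coins are different"

pA, pB, pAB = len(A) * p, len(B) * p, len(A & B) * p

print(pA, pB, pAB)        # 1/2, 1/2, 1/4
print(pAB == pA * pB)     # True: A and B are independent
```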
Discrete Random Variables • A discrete random variable X is a function from a finite or countably infinite sample space to the real numbers. • Associates a real number with each possible outcome of an experiment • Define the event "X = v" to be the set of all the elementary events s in the sample space with X(s) = v. • Pr["X = v"] is the sum of Pr[{s}] over all s with X(s) = v.
Discrete Random Variable [Figure: elementary events grouped into the event "X = v"] Add up the probabilities of all the elementary events in the highlighted event to get the probability that X = v
Random Variable Example • Roll two fair 6-sided dice. • Sample space contains 36 elementary events (1:1, 1:2, 1:3, 1:4, 1:5, 1:6, 2:1,…) • Probability of each elementary event is 1/36 • Define random variable X to be the maximum of the two values rolled • What is Pr["X = 3"]? • It is 5/36, since there are 5 elementary events with max value 3 (1:3, 2:3, 3:3, 3:2, and 3:1)
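The dice example lends itself to the same enumeration approach; a sketch:

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))   # 36 equally likely rolls
X = {roll: max(roll) for roll in space}        # random variable: max of the two values

# Pr["X = 3"]: count the elementary events mapped to 3
pr_X3 = Fraction(sum(1 for roll in space if X[roll] == 3), len(space))

print(pr_X3)   # 5/36
```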
Independent Random Variables • It is common for more than one random variable to be defined on the same sample space. E.g.: • X is maximum value rolled • Y is sum of the two values rolled • Two random variables X and Y are independent if for all v and w, the events "X = v" and "Y = w" are independent.
Expected Value of a Random Variable • Most common summary of a random variable is its "average", weighted by the probabilities • called expected value, or expectation, or mean • Definition: E[X] = ∑_v v·Pr[X = v]
Expected Value Example • Consider a game in which you flip two fair coins. • You get $3 for each head but lose $2 for each tail. • What are your expected earnings? • I.e., what is the expected value of the random variable X, where X(HH) = 6, X(HT) = X(TH) = 1, and X(TT) = −4? • Note that no value other than 6, 1, and −4 can be taken on by X (e.g., Pr[X = 5] = 0). • E[X] = 6(1/4) + 1(1/4) + 1(1/4) + (−4)(1/4) = 1
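Applying the definition E[X] = ∑_v v·Pr[X = v] to this game; a sketch:

```python
from fractions import Fraction

# Distribution of X, the earnings: $3 per head minus $2 per tail
dist = {6: Fraction(1, 4),     # HH
        1: Fraction(2, 4),     # HT or TH
        -4: Fraction(1, 4)}    # TT

# E[X] = sum over values v of v * Pr[X = v]
E = sum(v * p for v, p in dist.items())

print(E)   # 1: you expect to win $1 per game on average
```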
Properties of Expected Values • E[X+Y] = E[X] + E[Y], for any two random variables X and Y, even if they are not independent! • E[a·X] = a·E[X], for any random variable X and any constant a. • E[X·Y] = E[X]·E[Y], for any two independent random variables X and Y
Conditional Probability • Formalizes having partial knowledge about the outcome of an experiment • Example: flip two fair coins. • Probability of two heads is 1/4 • Probability of two heads when you already know that the first coin is a head is 1/2 • Conditional probability of A given that B occurs, denoted Pr[A|B], is defined to be Pr[A ∩ B]/Pr[B]
Conditional Probability [Figure: overlapping events A and B] Pr[A] = 5/12, Pr[B] = 7/12, Pr[A ∩ B] = 2/12, Pr[A|B] = (2/12)/(7/12) = 2/7
Conditional Probability • Definition is Pr[A|B] = Pr[A ∩ B]/Pr[B] • Equivalently, Pr[A ∩ B] = Pr[A|B]·Pr[B]
Sorting • Insertion Sort • Heapsort • Mergesort • Quicksort
Insertion Sort Review • How it works: • incrementally build up longer and longer prefix of the array of keys that is in sorted order • take the current key, find correct place in sorted prefix, and shift to make room to insert it • Finding the correct place relies on comparing current key to keys in sorted prefix • Worst-case running time is Θ(n²)
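The steps above can be sketched as:

```python
def insertion_sort(a):
    """Sort list a in place; worst case Theta(n^2) comparisons and shifts."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger keys in the sorted prefix a[0..i-1] one slot right
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key   # insert the current key in its correct place
    return a

print(insertion_sort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]
```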
Heapsort Review • How it works: • put the keys in a heap data structure • repeatedly remove the min from the heap • Manipulating the heap involves comparing keys to each other • Worst-case running time is Θ(n log n)
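As a sketch of the min-heap variant described above, using Python's standard-library heapq module rather than a hand-built heap:

```python
import heapq

def heapsort(keys):
    """Heapsort: build a min-heap, then repeatedly remove the min."""
    heap = list(keys)
    heapq.heapify(heap)   # O(n) heap construction
    # n removals, each O(log n), for Theta(n log n) overall
    return [heapq.heappop(heap) for _ in range(len(heap))]

print(heapsort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]
```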
Mergesort Review • How it works: • split the array of keys in half • recursively sort the two halves • merge the two sorted halves • Merging the two sorted halves involves comparing keys to each other • Worst-case running time is Θ(n log n)
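The split-sort-merge steps above can be sketched as:

```python
def mergesort(a):
    """Split in half, recursively sort the halves, then merge them."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = mergesort(a[:mid]), mergesort(a[mid:])
    # Merge: repeatedly compare the smallest remaining key of each half
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]   # append whichever half has leftovers

print(mergesort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]
```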
Mergesort Example [Figure: mergesort recursion tree showing an array repeatedly split in half, the halves sorted recursively, and the sorted halves merged back together]
Recurrence Relation for Mergesort • Let T(n) be worst case time on a sequence of n keys • If n = 1, then T(n) = Θ(1) (constant) • If n > 1, then T(n) = 2T(n/2) + Θ(n) • two subproblems of size n/2 each that are solved recursively • Θ(n) time to do the merge • Solving recurrence gives T(n) = Θ(n log n)
Quicksort Review • How it works: • choose one key to be the pivot • partition the array of keys into those keys < the pivot and those ≥ the pivot • recursively sort the two partitions • Partitioning the array involves comparing keys to the pivot • Worst-case running time is Θ(n²)
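The pivot-partition-recurse steps above can be sketched as (this simple version allocates new lists rather than partitioning in place):

```python
def quicksort(a):
    """Pivot on the first key; partition into < pivot and >= pivot, recurse."""
    if len(a) <= 1:
        return a
    pivot, rest = a[0], a[1:]
    less = [x for x in rest if x < pivot]    # keys smaller than the pivot
    geq  = [x for x in rest if x >= pivot]   # keys at least the pivot
    return quicksort(less) + [pivot] + quicksort(geq)

print(quicksort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]
```

Note that an already-sorted input drives this version to its Θ(n²) worst case, since every partition is maximally unbalanced.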
Graphs • Mathematical definitions • Representations • adjacency list, adjacency matrix • Traversals • breadth-first search, depth-first search • Minimum spanning trees • Kruskal’s algorithm, Prim’s algorithm • Single-source shortest paths • Dijkstra’s algorithm, Bellman-Ford algorithm
Graphs • Directed graph G = (V,E) • V is a finite set of vertices • E is the edge set, a binary relation on V (ordered pairs of vertices) • In an undirected graph, edges are unordered pairs of vertices • If (u,v) is in E, then v is adjacent to u and (u,v) is incident from u and incident to v; v is a neighbor of u • in an undirected graph, u is also adjacent to v, and (u,v) is incident on u and v • Degree of a vertex is number of edges incident from or to (or on) the vertex
More on Graphs • A path of length k from vertex u to vertex u’ is a sequence of k+1 vertices s.t. there is an edge from each vertex to the next in the sequence • In this case, u’ is reachable from u • A path is simple if no vertices are repeated • A path forms a cycle if last vertex = first vertex • A cycle is simple if no vertices are repeated other than first and last
Graph Representations • So far, we have discussed graphs purely as mathematical concepts • How can we represent a graph in a computer program? We need some data structures. • Two most common representations are: • adjacency list – best for sparse graphs (few edges) • adjacency matrix – best for dense graphs (lots of edges)
Adjacency List Representation • Given graph G = (V,E) • array Adj of |V| linked lists, one for each vertex in V • The adjacency list for vertex u, Adj[u], contains all vertices v s.t. (u,v) is in E • each vertex in the list might be just a string, representing its name, or it might be some other data structure representing the vertex, or it might be a pointer to the data structure for that vertex • Requires Θ(V+E) space (memory) • Requires Θ(degree(u)) time to check if (u,v) is in E
Adjacency List Representation [Figure: undirected graph on vertices a, b, c, d, e] • Adj[a] = b, c • Adj[b] = a, d, e • Adj[c] = a, d • Adj[d] = b, c, e • Adj[e] = b, d Space-efficient for sparse graphs
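The example graph's adjacency lists can be sketched with a plain dictionary of lists (a simple stand-in for the array of linked lists described above):

```python
# Adjacency list for the 5-vertex undirected example graph:
# each vertex maps to the list of its neighbors
adj = {"a": ["b", "c"],
       "b": ["a", "d", "e"],
       "c": ["a", "d"],
       "d": ["b", "c", "e"],
       "e": ["b", "d"]}

def has_edge(u, v):
    """Scan Adj[u]: takes Theta(degree(u)) time."""
    return v in adj[u]

print(has_edge("a", "b"))   # True: (a,b) is an edge
print(has_edge("a", "d"))   # False: a and d are not adjacent
```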
Adjacency Matrix Representation • Given graph G = (V,E) • Number the vertices 1, 2, ..., |V| • Use a |V| x |V| matrix A with (i,j)-entry of A being • 1 if (i,j) is in E • 0 if (i,j) is not in E • Requires Θ(V²) space (memory) • Requires Θ(1) time to check if (u,v) is in E
Adjacency Matrix Representation [Figure: same 5-vertex undirected graph] Rows and columns ordered a, b, c, d, e: • a: 0 1 1 0 0 • b: 1 0 0 1 1 • c: 1 0 0 1 0 • d: 0 1 1 0 1 • e: 0 1 0 1 0 Check for an edge in constant time
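The same example graph as a matrix; a sketch showing the constant-time edge check:

```python
verts = "abcde"
idx = {v: i for i, v in enumerate(verts)}   # map vertex name to row/column index

# Adjacency matrix for the 5-vertex undirected example graph
A = [[0, 1, 1, 0, 0],   # a
     [1, 0, 0, 1, 1],   # b
     [1, 0, 0, 1, 0],   # c
     [0, 1, 1, 0, 1],   # d
     [0, 1, 0, 1, 0]]   # e

def has_edge(u, v):
    """Single array lookup: Theta(1) time."""
    return A[idx[u]][idx[v]] == 1

print(has_edge("a", "b"))   # True
print(has_edge("a", "d"))   # False
```

For an undirected graph the matrix is symmetric, as it is here; the trade-off against the adjacency list is Θ(V²) space even when the graph is sparse.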