CSCE 2100: Computing Foundations 1
Probability Theory
Tamara Schneider
Summer 2013
Why Probability Theory in CS?
• Estimation of running time
  • Very large worst-case running time
  • Better average-case running time
• Decisions in algorithms in the presence of uncertainty
  • Medical diagnosis
  • Allocation of resources based on future needs
Probability Space
• A finite set of points, each of which represents a possible outcome of a specific experiment
• Each point (outcome) has a probability associated with it
• Probabilities are never negative
• The sum of all probabilities is always 1
• Assume an equal probability distribution unless stated otherwise
  • e.g. 1/6 for each number of a die throw (as long as the die is fair)
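Example: Probability Space
• Experiment: one throw of a fair die, so the probability space has 6 equally likely points
• Event E: you roll a 2, 4, or 5
• Imagine you throw a dart randomly at the box that represents the probability space. You will hit the area of E 50% of the time.
• P(E) = 3/6 = 0.5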
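As a small illustration of the die example, the sketch below computes the probability of an event in an equal-probability space by counting points. The function name `uniformProbability` and the use of `std::set<int>` for the space and the event are assumptions made for this example; they are not part of the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Probability of `event` in a finite space whose points are all equally
// likely: (points of the event that lie in the space) / (points in the space).
// Name and signature are hypothetical, chosen for this illustration.
double uniformProbability(const std::set<int>& space, const std::set<int>& event) {
    assert(!space.empty());  // an empty space has no probability distribution
    std::size_t hits = 0;
    for (int point : event) {
        if (space.count(point) > 0) ++hits;  // ignore points that are not outcomes
    }
    return static_cast<double>(hits) / static_cast<double>(space.size());
}
```

Calling `uniformProbability({1, 2, 3, 4, 5, 6}, {2, 4, 5})` returns 0.5, which matches P(E) for the event on this slide.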
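Probability and Combinatorics
• We need to count the number of possible outcomes (the points in the probability space)
• We need to count the number of those points that belong to a specific event
• If all points are equally likely, the probability of the event is the ratio of the two counts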
Example: Craps
• Throw 2 dice and calculate the probability of obtaining a total of 7 or 11
• 36 possible outcomes
• The event contains 8 points: 6 ways to total 7 and 2 ways to total 11
• p = 8/36 ≈ 22%
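The count of 8 points can be checked by enumerating all 36 outcomes. This is a minimal sketch; the function names `countOutcomesWithTotal` and `probSevenOrEleven` are made up for this example, and both dice are assumed to be fair.

```cpp
#include <cassert>

// Number of ordered outcomes (d1, d2) of two six-sided dice with the given total.
int countOutcomesWithTotal(int total) {
    assert(total >= 2 && total <= 12);  // only totals that two dice can produce
    int count = 0;
    for (int d1 = 1; d1 <= 6; ++d1) {
        for (int d2 = 1; d2 <= 6; ++d2) {
            if (d1 + d2 == total) ++count;
        }
    }
    return count;
}

// Probability of a total of 7 or 11, assuming both dice are fair.
double probSevenOrEleven() {
    const int favorable = countOutcomesWithTotal(7) + countOutcomesWithTotal(11);
    return favorable / 36.0;  // 36 equally likely points in the probability space
}
```

The two totals cannot occur together, so their counts (6 and 2) can simply be added before dividing by 36.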
Example: Keno
• 20 distinct numbers are selected at random from the range 1 to 80
• Players guess 5 numbers and are rewarded if 3, 4, or 5 of their guesses are correct
• Probability space: all selections of 20 numbers out of 80
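Example: Keno
• Ways of choosing 20 of 80: $\binom{80}{20}$
• Ways of picking 3 winners out of 5 and 17 losers out of 75: $\binom{5}{3}\binom{75}{17}$
• Probability of exactly 3 correct guesses: p = $\binom{5}{3}\binom{75}{17} / \binom{80}{20}$ ≈ 0.084 = 8.4%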
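The Keno probability can be evaluated numerically. The sketch below is one way to do it, under the assumption that the player's 5 numbers are fixed and the 20 drawn numbers are equally likely to be any selection; `choose` and `kenoExactly` are hypothetical helper names. The binomial coefficients are computed in floating point because the intermediate products are too large for convenient integer arithmetic.

```cpp
#include <cassert>

// Binomial coefficient "n choose k", computed as a double to avoid
// overflowing integer types in the intermediate products.
double choose(int n, int k) {
    assert(k >= 0 && k <= n);
    double result = 1.0;
    for (int i = 1; i <= k; ++i) {
        result *= static_cast<double>(n - k + i) / i;
    }
    return result;
}

// Probability that exactly `hits` of the player's 5 numbers are among the
// 20 numbers drawn from 80: choose `hits` winners out of the 5 guesses and
// the remaining 20 - hits drawn numbers out of the 75 non-guessed numbers.
double kenoExactly(int hits) {
    assert(hits >= 0 && hits <= 5);
    return choose(5, hits) * choose(75, 20 - hits) / choose(80, 20);
}
```

`kenoExactly(3)` evaluates to roughly 0.084, the value on the slide. The probability of being rewarded at all would be `kenoExactly(3) + kenoExactly(4) + kenoExactly(5)`.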
Conditional Probability
• Definition: If E and F are two events in a probability space, the conditional probability of F given E is the sum of the probabilities of the points that are in both E and F (region A) divided by the sum of the probabilities of the points in E.
• Notation: P(F/E), "the probability of F given E"
• [Figure: two overlapping events E and F. Region A: in E and F. Region B: in E but not in F.]
• P(F/E) = P(A) / P(E) = P(A) / (P(A) + P(B))
Example 1: Conditional Probability
• Toss of 2 dice
• Probability space has 36 points, each with equal probability 1/36
• E: the first die comes out 1 (E1)
• F: the second die comes out 1 (E2)
• P(E) = 6/36 = 1/6
• P(F) = 6/36 = 1/6
• P(F/E) = 1/6, since exactly 1 of the 6 points in E (both dice show 1) is also in F
• The experiments are independent, since P(F) = P(F/E). It does not matter whether E occurred; the probability of F stays the same.
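The definition of conditional probability can be applied mechanically to the two-dice space. In the sketch below, the type alias `DiceEvent` and the function `conditionalProbability` are assumptions for illustration. Because all 36 points have the same probability, dividing counts gives the same result as dividing sums of probabilities.

```cpp
#include <cassert>
#include <functional>

// An event over two dice: returns true if the point (d1, d2) is in the event.
using DiceEvent = std::function<bool(int, int)>;

// P(F/E) for two fair dice: (points in both E and F) / (points in E).
double conditionalProbability(const DiceEvent& E, const DiceEvent& F) {
    int inE = 0;
    int inBoth = 0;
    for (int d1 = 1; d1 <= 6; ++d1) {
        for (int d2 = 1; d2 <= 6; ++d2) {
            if (E(d1, d2)) {
                ++inE;
                if (F(d1, d2)) ++inBoth;
            }
        }
    }
    assert(inE > 0);  // P(F/E) is undefined when E contains no points
    return static_cast<double>(inBoth) / inE;
}
```

With E as "first die is 1" and F as "second die is 1", the result is 1/6. Passing an event that is always true as E yields the unconditional P(F), which is also 1/6, so the two values agree as the independence claim requires.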
Example 2: Conditional Probability [1]
• Deal of 2 cards from a 52-card deck
• Number of points in the experiment (probability space): Π(52,2) = 52 × 51 = 2,652
• E: the first card is an ace: 4 × 51 = 204 points (4 choices for the ace, 51 choices for the second card)
  P(E) = 204/2,652 = 1/13
• F: the second card is an ace: 4 × 51 = 204 points (4 choices for the ace, 51 choices for the first card)
  P(F) = 204/2,652 = 1/13
• P(F/E) = 12/204 = 1/17 (= 3/51), since 4 × 3 = 12 of the points in E have an ace as both cards
Example 2: Conditional Probability [2]
• Probability space: 52 × 51 = 2,652 points
• [Figure: the probability space drawn as a box. Region E (first card is an ace, 4 × 51 = 204 points) and region F (second card is an ace, 4 × 51 = 204 points) overlap in the 4 × 3 = 12 points with 2 aces. The rest of the box holds the deals in which none of the cards is an ace.]
• P(E) = 204/2,652
• P(F) = 204/2,652
• P(F/E) = 12/204
• The experiments are not independent, since P(F) ≠ P(F/E). It does matter whether E occurred; the probability of F changes.
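The card counts can be confirmed by enumerating every ordered deal of two different cards. This is only a sketch: numbering the cards 0 to 51 and treating the cards with number divisible by 13 as the four aces is an arbitrary encoding chosen for this example, as are the names `DealCounts` and `countTwoCardDeals`.

```cpp
#include <cassert>

// Point counts for the experiment "deal 2 cards from a 52-card deck".
struct DealCounts {
    int total;      // all ordered deals of two different cards
    int firstAce;   // event E: first card is an ace
    int secondAce;  // event F: second card is an ace
    int bothAces;   // points in both E and F
};

DealCounts countTwoCardDeals() {
    DealCounts c{0, 0, 0, 0};
    for (int first = 0; first < 52; ++first) {
        for (int second = 0; second < 52; ++second) {
            if (first == second) continue;  // the same card cannot be dealt twice
            const bool firstIsAce = (first % 13 == 0);    // assumed encoding of aces
            const bool secondIsAce = (second % 13 == 0);
            ++c.total;
            if (firstIsAce) ++c.firstAce;
            if (secondIsAce) ++c.secondAce;
            if (firstIsAce && secondIsAce) ++c.bothAces;
        }
    }
    assert(c.total == 52 * 51);  // sanity check against the formula on the slide
    return c;
}
```

From the counts, P(F) is `secondAce / total` = 204/2,652 and P(F/E) is `bothAces / firstAce` = 12/204, which are different values.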
Product Rule for Independent Experiments
• For a sequence of outcomes of k independent experiments, we can multiply the probabilities
• Example: E is the event that the last 4 digits of a phone number are 1234
• Ei: the i-th of these digits has the required value; P(E1) = P(E2) = P(E3) = P(E4) = 0.1
• P(E) = (0.1)^4 = 0.0001 = 0.01%
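The product rule amounts to a single multiplication loop. The sketch below assumes the four digits are independent and each value 0 to 9 is equally likely; `productOfProbabilities` and `countEndingsEqualTo` are hypothetical names. The second function cross-checks the result by counting points in the space of all 10,000 four-digit endings.

```cpp
#include <cassert>
#include <vector>

// Product rule: probability of a sequence of outcomes of independent
// experiments, given the probability of each individual outcome.
double productOfProbabilities(const std::vector<double>& probabilities) {
    double product = 1.0;
    for (double p : probabilities) {
        assert(p >= 0.0 && p <= 1.0);  // each entry must be a valid probability
        product *= p;
    }
    return product;
}

// Cross-check by counting: how many of the 10^4 equally likely four-digit
// endings (0000 to 9999) are equal to `target`?
int countEndingsEqualTo(int target) {
    assert(target >= 0 && target <= 9999);
    int count = 0;
    for (int ending = 0; ending <= 9999; ++ending) {
        if (ending == target) ++count;
    }
    return count;
}
```

`productOfProbabilities({0.1, 0.1, 0.1, 0.1})` is approximately 0.0001 (up to floating-point rounding), and exactly 1 of the 10,000 endings equals 1234, which gives the same probability.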
Programming Applications - Example [1]
    bool find(int x, int A[], int n) {
        for (int i = 0; i < n; i++)
            if (A[i] == x) return true;
        return false;
    } // find
• Find out if x is in array A
• If x is not found, running time: O(n)
• If x is found, worst-case running time: O(n)
• Is the average-case running time better?
• Assume that all points are equally likely
Programming Applications - Example [2]
    bool find(int x, int A[], int n) {
        for (int i = 0; i < n; i++)
            if (A[i] == x) return true;
        return false;
    } // find
• Probability space has n points: the positions 0 to n-1 at which x can be found, each with probability 1/n
• If x is in A[k], the loop runs k full iterations before the one that returns
• With cost c per iteration and cost d for initialization, the final comparison, and the return statement, the running time is ck + d
• Average running time: (1/n) Σ_{k=0}^{n-1} (ck + d) = c(n-1)/2 + d, which is still O(n)
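The average case can also be measured directly. The sketch below is an instrumented variant of `find`, not the slide's original function: `findCounting` and `averageComparisons` are hypothetical names, and a `std::vector` replaces the raw array. It counts comparisons, so finding x at A[k] costs k + 1 comparisons, and it assumes x is equally likely to be at each of the n positions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linear search like find, but it also reports how many comparisons it made.
bool findCounting(int x, const std::vector<int>& A, int& comparisons) {
    comparisons = 0;
    for (std::size_t i = 0; i < A.size(); ++i) {
        ++comparisons;
        if (A[i] == x) return true;
    }
    return false;
}

// Average number of comparisons when x is equally likely to be at each of
// the n positions: search once for every position and take the mean.
double averageComparisons(int n) {
    assert(n > 0);
    std::vector<int> A(n);
    for (int i = 0; i < n; ++i) A[i] = i;  // distinct values, so x = k sits at A[k]
    long total = 0;
    for (int k = 0; k < n; ++k) {
        int comparisons = 0;
        findCounting(k, A, comparisons);
        total += comparisons;
    }
    return static_cast<double>(total) / n;
}
```

The average comes out to (n + 1)/2 comparisons, for example 5.5 for n = 10. That is about half the worst case but still grows linearly with n, which is why the average case remains O(n).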
Monte Carlo Algorithms
• Deterministic algorithm: for the same input we will always receive the same output
• Monte Carlo algorithm: makes random selections as it runs, so its answer can be wrong with some (ideally very small) probability
Monte Carlo Example
Computer chip factory
• In an untested box, each chip is bad with probability 0.1 (about 1/10 of the chips are bad)
• Testing all n chips of a box takes O(n) time
• Monte Carlo: select k chips at random from each box and test only those
• If all k tested chips are OK, declare the box OK
• Probability that one tested chip from an untested box is OK: 0.9
• Probability of error (declaring an untested box OK after k tests): (0.9)^k
• For k = 131 tests, (0.9)^k ≈ 0.000001, so if the box is untested, we almost never fail to find a bad chip
• ⇒ We find a bad chip in an untested box with probability 0.999999 = 99.9999%
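The chip test can be sketched as follows. The names `errorProbability` and `declareBoxOk`, the representation of a box as a `std::vector<bool>`, and sampling chips with replacement are all assumptions made for this illustration; the slides do not specify how the k chips are selected.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Probability that k independent tests all pass even though each tested
// chip is bad with probability badChipRate (the Monte Carlo error).
double errorProbability(int k, double badChipRate = 0.1) {
    assert(k >= 0 && badChipRate >= 0.0 && badChipRate <= 1.0);
    return std::pow(1.0 - badChipRate, k);
}

// Monte Carlo test: check k randomly chosen chips (with replacement, an
// assumption of this sketch) and declare the box OK only if all pass.
bool declareBoxOk(const std::vector<bool>& chipIsBad, int k, std::mt19937& rng) {
    assert(!chipIsBad.empty());
    std::uniform_int_distribution<std::size_t> pick(0, chipIsBad.size() - 1);
    for (int t = 0; t < k; ++t) {
        if (chipIsBad[pick(rng)]) return false;  // a bad chip was found: certain answer
    }
    return true;  // no bad chip seen; this answer can be wrong for a bad box
}
```

The answer "not OK" is always correct, while the answer "OK" can be wrong with the probability given by `errorProbability(k)`. The running time depends only on k, not on the box size n.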
Summary
• Probability Space
• Conditional Probabilities
• Independent Experiments
• Monte Carlo Algorithms