PROBABILITY AND BAYES THEOREM

[Diagram: POPULATION and SAMPLE, linked by PROBABILITY and STATISTICAL INFERENCE]

• PROBABILITY: A numerical value expressing the degree of uncertainty regarding the occurrence of an event; a measure of uncertainty.
• STATISTICAL INFERENCE: The science of drawing inferences about the population based only on a part of the population, the sample.

PROBABILITY
• CLASSICAL (RELATIVE FREQUENCY) INTERPRETATION: If a random experiment is repeated an infinite number of times, the relative frequency of any given outcome is the probability of that outcome. The probability of an event is the relative frequency of its occurrence in the long run.
• Example: The probability of observing a head in a fair coin toss is 0.5; the relative frequency of heads approaches 0.5 as the number of tosses grows.
• SUBJECTIVE INTERPRETATION: The assignment of probabilities to events of interest is subjective.
• Example: I am guessing there is a 50% chance of rain today.
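
The long-run relative frequency idea can be illustrated with a small simulation. This is only a sketch: the function name relative_frequency_of_heads and the fixed seed are assumptions made for the illustration, and a pseudo-random generator stands in for a real coin.

```python
import random

def relative_frequency_of_heads(num_tosses, seed=0):
    """Toss a simulated fair coin num_tosses times; return the fraction of heads."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    return heads / num_tosses

# The relative frequency should settle near 0.5 as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

Short runs can be far from 0.5; only the long-run behaviour is what the relative frequency interpretation refers to.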

PROBABILITY
• Random experiment: a process or course of action whose outcome is uncertain.
• Examples:
  Experiment                                 Outcomes
  Flip a coin                                Heads and Tails
  Record a statistics test mark              Numbers between 0 and 100
  Measure the time to assemble a computer    Numbers from zero and above

PROBABILITY
• Performing the same random experiment repeatedly may result in different outcomes; therefore, the best we can do is consider the probability of occurrence of a certain outcome.
• To determine the probabilities, we first need to define and list the possible outcomes.

Sample Space
• Determining the outcomes:
• Build an exhaustive list of all possible outcomes.
• Make sure the listed outcomes are mutually exclusive.
• The set of all possible outcomes of an experiment is called a sample space and is denoted by S.

Sample Space
• Countable: a finite number of elements, or a (countably) infinite number of elements
• Uncountable (continuous)

EXAMPLES
• Countable sample space examples:
  • Tossing a coin: S = {Head, Tail}
  • Rolling a die: S = {1, 2, 3, 4, 5, 6}
  • Determining the sex of a newborn child: S = {girl, boy}
• Uncountable sample space examples:
  • Lifetime of a light bulb: S = [0, ∞)
  • Daily closing price of a stock: S = [0, ∞)

Sample Space • Multiple sample spaces for the same experiment are possible • E.g. with 5 coin tosses we can take: S={HHHHH, HHHHT, …} or if we are only interested in the number of heads we can take S*={0,1,2,3,4,5}

EXAMPLES
• Examine 3 fuses in sequence and note the result of each examination. An outcome for the entire experiment is then any sequence of N's (non-defectives) and D's (defectives) of length 3. Hence, the sample space is
  S = {NNN, NND, NDN, DNN, NDD, DND, DDN, DDD}
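
As a quick check, the fuse sample space can be enumerated mechanically. This is a minimal sketch; the variable name sample_space is chosen here for illustration only.

```python
from itertools import product

# "N" = non-defective, "D" = defective; three fuses examined in sequence.
sample_space = ["".join(outcome) for outcome in product("ND", repeat=3)]

print(sample_space)
print(len(sample_space))  # 2 * 2 * 2 = 8 outcomes
```

The listing order differs from the slide, but the set of outcomes is the same.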

Assigning Probabilities
• Given a sample space S = {O1, O2, …, Ok}, the following characteristics for the probability P(Oi) of the simple event Oi must hold:
  1. 0 ≤ P(Oi) ≤ 1 for each i
  2. P(O1) + P(O2) + … + P(Ok) = 1
• Probability of an event: The probability P(A) of event A is the sum of the probabilities assigned to the simple events contained in A.

Assigning Probabilities • P(A) is the proportion of times the event A is observed.

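Set theory: Definitions
• Set: a set A is a collection of elements (or outcomes)
• Membership: x ∈ A (x is in A), or x ∉ A (x is not in A)
• Complement: A' = {x ∈ S : x ∉ A} (also written A^C)
• Union: A ∪ B = {x : x ∈ A or x ∈ B}
• Intersection: A ∩ B = {x : x ∈ A and x ∈ B}
• Difference: A \ B = A ∩ B'
• Subset: A ⊆ B (A is contained in B)
• Equality: A = B iff A ⊆ B and B ⊆ A
• Symmetric difference: A Δ B = (A \ B) ∪ (B \ A)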

Algebraic laws
• Commutative: A ∪ B = B ∪ A; A ∩ B = B ∩ A
• Associative: (A ∪ B) ∪ C = A ∪ (B ∪ C); (A ∩ B) ∩ C = A ∩ (B ∩ C)
• Distributive: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C); A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
• DeMorgan's (' is complement): (A ∪ B)' = A' ∩ B'; (A ∩ B)' = A' ∪ B'

Intersection
• The intersection of events A and B is the event that occurs when both A and B occur.
• The intersection of events A and B is denoted by (A and B), A ∩ B, or AB.
• The joint probability of A and B is the probability of the intersection of A and B, which is denoted by P(A and B) or P(AB).

Union
• The union of events A and B is the event that occurs when either A or B or both occur.
• That is, at least one of the events occurs.
• It is denoted by (A or B) or A ∪ B.

Addition Rule
• For any two events A and B: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Complement Rule
• The complement of event A (denoted by A^C) is the event that occurs when event A does not occur.
• Together, A and A^C contain all the simple events in the sample space, and they share none. Therefore, P(A) + P(A^C) = 1, so
  P(A^C) = 1 − P(A)

MUTUALLY EXCLUSIVE EVENTS
• Two events A and B are said to be mutually exclusive, or disjoint, if A and B have no common outcomes. That is, A ∩ B = ∅ (the empty set).
• The events A1, A2, … are pairwise mutually exclusive (disjoint) if Ai ∩ Aj = ∅ for all i ≠ j.

EXAMPLE
• The number of spots turning up when a six-sided die is tossed is observed. Consider the following events.
  A: The number observed is at most 2.
  B: The number observed is an even number.
  C: The number 4 turns up.

VENN DIAGRAM
• A graphical representation of the sample space.
• [Figure: Venn diagram of S = {1, 2, 3, 4, 5, 6} showing A = {1, 2}, B = {2, 4, 6}, C = {4}]
• A ∩ B = {2}
• A ∪ B = {1, 2, 4, 6}
• A ∩ C = ∅, so A and C are mutually exclusive.
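
The die events can be checked with Python sets. This sketch assumes a fair die, so that each outcome has probability 1/6; the helper prob is a name introduced here for illustration, not something defined on the slides.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {1, 2}      # the number observed is at most 2
B = {2, 4, 6}   # the number observed is even
C = {4}         # the number 4 turns up

def prob(event):
    """Assumes equally likely outcomes: P(E) = |E| / |S|."""
    return Fraction(len(event & S), len(S))

print(A & B, A | B, A & C)  # intersection, union, and an empty intersection

# Addition rule and complement rule hold for these events.
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)
assert prob(S - A) == 1 - prob(A)
```

Using Fraction keeps the arithmetic exact, so the equalities can be tested without rounding error.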

AXIOMS OF PROBABILITY (KOLMOGOROV AXIOMS)
Given a sample space S, the probability function is a function P that satisfies
1) For any event A, 0 ≤ P(A) ≤ 1.
2) P(S) = 1.
3) If A1, A2, … are pairwise disjoint, then $P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$.

Probability function P : S → [0, 1]
• Domain: the events of the sample space S
• Range: [0, 1]

THE CALCULUS OF PROBABILITIES
• If P is a probability function and A is any set, then
  a. P(∅) = 0
  b. P(A) ≤ 1
  c. P(A^C) = 1 − P(A)

THE CALCULUS OF PROBABILITIES
• If P is a probability function and A and B are any sets, then
  a. P(B ∩ A^C) = P(B) − P(A ∩ B)
  b. If A ⊆ B, then P(A) ≤ P(B)
  c. P(A ∩ B) ≥ P(A) + P(B) − 1 (Bonferroni Inequality)
  d. $P\left(\bigcup_{i=1}^{\infty} A_i\right) \le \sum_{i=1}^{\infty} P(A_i)$ for any sets A1, A2, … (Boole's Inequality)

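Principle of Inclusion-Exclusion
• A generalization of the addition rule to n events:
  $P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} P(A_1 \cap \cdots \cap A_n)$
• Proof by induction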
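
A small numerical check of inclusion-exclusion, as a sketch only. It assumes a finite sample space with equally likely outcomes; the function names prob and inclusion_exclusion and the three example events are chosen here for illustration.

```python
from fractions import Fraction
from itertools import combinations

def prob(event, sample_space):
    """Assumes equally likely outcomes: P(E) = |E| / |S|."""
    return Fraction(len(event), len(sample_space))

def inclusion_exclusion(events, sample_space):
    """P(union of events) as an alternating sum over all k-wise intersections."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = 1 if k % 2 == 1 else -1  # add odd-sized terms, subtract even-sized
        for group in combinations(events, k):
            total += sign * prob(set.intersection(*group), sample_space)
    return total

S = set(range(1, 7))
events = [{1, 2}, {2, 4, 6}, {4}]
direct = prob(set.union(*events), S)  # probability of the union, computed directly

assert inclusion_exclusion(events, S) == direct
print(direct)  # 2/3
```

This verifies one instance of the formula; it does not replace the induction proof.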

EQUALLY LIKELY OUTCOMES • The same probability is assigned to each simple event in the sample space, S. • Suppose that S={s1,…,sN} is a finite sample space. If all the outcomes are equally likely, then P({si})=1/N for every outcome si.

ODDS
• The odds of an event A is defined by P(A)/P(A^C) = P(A)/(1 − P(A)).
• It tells us how much more likely it is that A occurs than that it does not.
• Example: if P(A) = 3/4, then P(A^C) = 1/4 and P(A)/P(A^C) = 3.
• That is, the odds is 3: it is 3 times more likely that A occurs than that it does not.

ODDS RATIO
• The odds ratio (OR) is the ratio of two odds.
• Useful for comparing the odds under two different conditions or for two different groups, e.g. odds for males versus females.
• If the odds of event A are 4.2 for males and 2 for females, then the odds ratio is 4.2/2 = 2.1. The odds of observing event A are 2.1 times as high for males as for females.
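
The conversions between probability and odds can be sketched as two small helpers. The function names odds and probability_from_odds are assumptions for this illustration; the numbers are the ones used on the slides.

```python
from fractions import Fraction

def odds(p):
    """Odds of an event with probability p: P(A) / P(A^C). Requires p < 1."""
    return p / (1 - p)

def probability_from_odds(o):
    """Invert the odds: P(A) = odds / (1 + odds)."""
    return o / (1 + o)

print(odds(Fraction(3, 4)))                # 3
print(probability_from_odds(Fraction(3)))  # 3/4
print(Fraction(42, 10) / Fraction(2))      # odds ratio 4.2 / 2 = 21/10
```

The inverse formula follows from solving o = p/(1 − p) for p.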

CONDITIONAL PROBABILITY • (Marginal) Probability: P(A): How likely is it that an event A will occur when an experiment is performed? • Conditional Probability: P(A|B): How will the probability of event A be affected by the knowledge of the occurrence or nonoccurrence of event B? • If two events are independent, then P(A|B)=P(A)

CONDITIONAL PROBABILITY
• The conditional probability of A given B is P(A|B) = P(A and B)/P(B), provided P(B) > 0.

Example
• Roll two dice.
• S = all possible pairs = {(1,1), (1,2), …, (6,6)}
• Let A = first roll is 1; B = sum is 7; C = sum is 8.
• P(A|B) = ?; P(A|C) = ?
• Solution: P(A|B) = P(A and B)/P(B)
  P(B) = P({(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}) = 6/36 = 1/6
  P(A|B) = P({(1,6)})/(1/6) = (1/36)/(1/6) = 1/6 = P(A)
  So A and B are independent.

Example
• P(A|C) = P(A and C)/P(C) = P(Ø)/P(C) = 0, because A and C are disjoint: if the first roll is 1, the sum is at most 7.
• Out of curiosity: P(C) = P({(2,6), (3,5), (4,4), (5,3), (6,2)}) = 5/36
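
The two-dice example can be reproduced by enumerating all 36 ordered pairs. This is a sketch assuming fair dice; the helper names prob and cond_prob and the event function names are introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))  # 36 equally likely ordered pairs

def prob(event):
    """P(event), where event is a True/False function of an outcome."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B), assuming P(B) > 0."""
    return prob(lambda s: a(s) and b(s)) / prob(b)

def first_is_1(s):   # event A
    return s[0] == 1

def sum_is_7(s):     # event B
    return sum(s) == 7

def sum_is_8(s):     # event C
    return sum(s) == 8

print(cond_prob(first_is_1, sum_is_7), prob(first_is_1))  # 1/6 1/6
print(cond_prob(first_is_1, sum_is_8), prob(sum_is_8))    # 0 5/36
```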

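CONDITIONAL PROBABILITY
• Multiplication rule: P(A and B) = P(A|B) P(B). For several events, P(A1 A2 … An) = P(A1) P(A2|A1) P(A3|A1 A2) … P(An|A1 … An−1).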

Example • Suppose we pick 4 cards at random from a deck of 52 cards containing 4 aces. • A=event that we pick 4 aces • Ai=event that ith pick is an ace (i=1,2,3,4)
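
One way to work this example is to multiply conditional probabilities pick by pick, P(A) = P(A1) P(A2|A1) P(A3|A1 A2) P(A4|A1 A2 A3). The sketch below assumes the cards are drawn without replacement; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

aces, cards = 4, 52
p_all_aces = Fraction(1)
for i in range(4):
    # Given i aces already drawn, (aces - i) aces remain among (cards - i) cards.
    p_all_aces *= Fraction(aces - i, cards - i)

print(p_all_aces)  # 1/270725

# Cross-check: exactly one of the C(52, 4) equally likely hands is all aces.
assert p_all_aces == Fraction(1, comb(52, 4))
```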

BAYES THEOREM
• Suppose you have P(B|A), but need P(A|B).
• Then P(A|B) = P(B|A) P(A) / P(B), provided P(B) > 0.

Example
• Let:
  • D: Event that a person has the disease
  • T: Event that the medical test result is positive
• Given:
  • Previous research shows that 0.3% of the Turkish population carries this disease; i.e., P(D) = 0.3% = 0.003
  • The probability of observing a positive test result for someone with the disease is 95%; i.e., P(T|D) = 0.95
  • The probability of observing a positive test result for someone without the disease is 4%; i.e., P(T|D^C) = 0.04
• Find: the probability that a randomly chosen person has the disease, given that the test result is positive.

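Example
• Solution: Need P(D|T). Use Bayes' theorem:
  P(D|T) = P(T|D) P(D) / P(T)
  P(T) = P(D and T) + P(D^C and T) = 0.95 × 0.003 + 0.04 × 0.997 = 0.00285 + 0.03988 = 0.04273
  P(D|T) = 0.00285 / 0.04273 ≈ 0.0667 = 6.67%
• A positive result is not very reliable evidence of disease: because the disease is rare, most positive results come from people without it.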
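
The calculation can be wrapped in a short function to see how the answer depends on the inputs. This is a sketch; the function name posterior and its parameter names are assumptions made here, not terms from the slides.

```python
def posterior(prior, p_pos_given_disease, p_pos_given_healthy):
    """P(D | T) by Bayes' theorem, with P(T) from the law of total probability."""
    p_positive = p_pos_given_disease * prior + p_pos_given_healthy * (1 - prior)
    return p_pos_given_disease * prior / p_positive

p = posterior(prior=0.003, p_pos_given_disease=0.95, p_pos_given_healthy=0.04)
print(round(p, 4))  # 0.0667
```

Calling posterior with a larger prior, for example 0.1, gives a much larger P(D|T), which shows that the low value here is driven by the rarity of the disease.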

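BAYES THEOREM
• Can be generalized to more than two events.
• If A1, A2, … is a partition of S, then $P(A_i \mid B) = \dfrac{P(B \mid A_i)\,P(A_i)}{\sum_j P(B \mid A_j)\,P(A_j)}$
• Can be rewritten in terms of odds.
• Suppose A1, A2, … are competing hypotheses and B is evidence or data relevant to choosing the correct hypothesis. Then $\dfrac{P(A_1 \mid B)}{P(A_2 \mid B)} = \dfrac{P(B \mid A_1)}{P(B \mid A_2)} \times \dfrac{P(A_1)}{P(A_2)}$
• Posterior odds = likelihood ratio × prior odds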
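
A sketch of the partition form and the odds form, reusing the disease-test numbers with the two-element partition {D, D^C}. The function name bayes_partition and the variable names are assumptions for this illustration.

```python
from fractions import Fraction

def bayes_partition(priors, likelihoods):
    """Posterior P(A_i | B) for a partition A_1, A_2, ... with P(B | A_i) given."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(B | A_i) P(A_i)
    total = sum(joint)                                    # P(B), total probability
    return [j / total for j in joint]

priors = [Fraction(3, 1000), Fraction(997, 1000)]     # P(D), P(D^C)
likelihoods = [Fraction(95, 100), Fraction(4, 100)]   # P(T | D), P(T | D^C)
post = bayes_partition(priors, likelihoods)

prior_odds = priors[0] / priors[1]
likelihood_ratio = likelihoods[0] / likelihoods[1]
posterior_odds = post[0] / post[1]

# Posterior odds = likelihood ratio x prior odds
assert posterior_odds == likelihood_ratio * prior_odds
print(post[0], float(post[0]))
```

Exact fractions are used so that the odds identity can be asserted with equality rather than a tolerance.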

Independence
• A and B are independent iff
  • P(A|B) = P(A) or P(B|A) = P(B), or equivalently
  • P(AB) = P(A)P(B)
• A1, A2, …, An are mutually independent iff for every subset J of {1, 2, …, n} with at least two elements, $P\left(\bigcap_{j \in J} A_j\right) = \prod_{j \in J} P(A_j)$
• E.g. for n = 3, A1, A2, A3 are mutually independent iff
  P(A1A2A3) = P(A1)P(A2)P(A3) and
  P(A1A2) = P(A1)P(A2) and
  P(A1A3) = P(A1)P(A3) and
  P(A2A3) = P(A2)P(A3)
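
The pairwise conditions alone are not enough for mutual independence. The sketch below uses a standard textbook illustration that is not from the slides: two fair coin tosses with A = first toss is heads, B = second toss is heads, C = both tosses agree. The helper names are assumptions.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

S = list(product("HT", repeat=2))  # 4 equally likely outcomes

def prob(event):
    return Fraction(len(event), len(S))

def mutually_independent(events):
    """Check the product rule for every sub-collection of two or more events."""
    for k in range(2, len(events) + 1):
        for group in combinations(events, k):
            if prob(set.intersection(*group)) != prod(prob(e) for e in group):
                return False
    return True

A = {s for s in S if s[0] == "H"}    # first toss is heads
B = {s for s in S if s[1] == "H"}    # second toss is heads
C = {s for s in S if s[0] == s[1]}   # both tosses agree

pairwise = all(prob(x & y) == prob(x) * prob(y)
               for x, y in combinations([A, B, C], 2))
print(pairwise, mutually_independent([A, B, C]))  # True False
```

Here every pair satisfies the product rule, but P(ABC) = 1/4 while P(A)P(B)P(C) = 1/8, so the three-way condition fails.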

Independence
• If n = 4, then the number of conditions for independence is 2^4 − 4 − 1 = 11 (one for each subset of {1, 2, 3, 4} with at least two elements).
• Find these conditions.

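Sequences of events
• A sequence of events A1, A2, … is increasing iff A1 ⊆ A2 ⊆ A3 ⊆ …
• A sequence of events A1, A2, … is decreasing iff A1 ⊇ A2 ⊇ A3 ⊇ …
• If {An} is increasing, then $\lim_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} A_n$
• If {An} is decreasing, then $\lim_{n \to \infty} A_n = \bigcap_{n=1}^{\infty} A_n$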

Examples
• Let S = (0,1) and An = (1/n, 1). Then {An} is increasing. What is the limit of An as n goes to infinity?
• Let S = (0,1) and Bn = (0, 1/n). Then {Bn} is decreasing. What is the limit of Bn as n goes to infinity?

Problems 1. Show that two nonempty events cannot be disjoint and independent at the same time. Hint: First, prove that if they are disjoint, then they are not independent. Second, prove that if they are independent, then they are not disjoint.

Problems 2. If P(A) = 1/3 and P(B^C) = 1/4, can A and B be disjoint? Explain.

Problems 3. Either prove the statement is true or disprove it: If P(B|A) = P(B|A^C), then A and B are independent.

Problems 4. An insurance company has three types of customers – high risk, medium risk, and low risk. Twenty percent of its customers are high risk, and 30% are medium risk. Also, the probability that a customer has at least one accident in the current year is 0.25 for high risk, 0.16 for medium risk, and 0.1 for low risk. a) Find the probability that a customer chosen at random will have at least one accident in the current year. b) Find the probability that a customer is high risk, given that the person has had at least one accident during the current year.

Problems 5. Eleven poker chips are numbered consecutively 1 through 10, with two of them labeled with a 6 and placed in a jar. A chip is drawn at random. • Find the probability of drawing a 6. • Find the odds of drawing a 6 from the jar. • Find the odds of not drawing a 6.

Problems 6. If the odds in favor of winning a horse race are 3:5, find the probability of winning the race.
