810 likes | 1.19k Views
Intro to Probability. Zhi Wei. Outline. Basic concepts in probability theory Random variable and probability distribution Bayes ’ rule. Introduction. Probability is the study of randomness and uncertainty.
E N D
Intro to Probability Zhi Wei
Outline • Basic concepts in probability theory • Random variable and probability distribution • Bayes’ rule
Introduction • Probability is the study of randomness and uncertainty. • In the early days, probability was associated with games of chance (gambling).
Simple Games Involving Probability Game: A fair die is rolled. If the result is 2, 3, or 4, you win $1; if it is 5, you win $2; but if it is 1 or 6, you lose $3. Should you play this game?
Random Experiment • a random experiment is a process whose outcome is uncertain. • Examples: • Tossing a coin once or several times • Picking a card or cards from a deck • Measuring temperature of patients • ...
Events & Sample Spaces Sample Space The sample space is the set of all possible outcomes. Event An event is any subset of the whole sample space Simple Events The individual outcomes are called simple events.
Example Experiment: Toss a coin 3 times. • Sample space = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}. • Examples of events include • A = {at least two heads} • B = {exactly two tails.} = {HHH, HHT,HTH, THH} = {HTT, THT,TTH}
Basic Concepts (from Set Theory) • AB, the union of two events A and B, is the event consisting of all outcomes that are either in Aor in Bor in both events. • AB (AB), the intersection of two events A and B, is the event consisting of all outcomes that are in both events. • Ac, the complement of an event A, is the set of all outcomes in that are not in A. • A-B, the set difference, is the event consisting of all outcomes that in A but not in B • When two events A and B have no outcomes in common, they are said to be mutually exclusive, or disjoint, events.
Example Experiment: toss a coin 10 times and the number of heads is observed. • Let A = { 0, 2, 4, 6, 8, 10}. • B = { 1, 3, 5, 7, 9}, C = {0, 1, 2, 3, 4, 5}. • AB= {0, 1, …, 10} = . • AB contains no outcomes. So A and B are mutually exclusive. • Cc = {6, 7, 8, 9, 10}, A C = {0, 2, 4}, A-C={6,8,10}
Rules • Commutative Laws: • AB = B A, AB = B A • Associative Laws: • (AB) C = A (B C ) • (AB) C = A (B C) . • Distributive Laws: • (AB)C = (A C) (B C) • (AB)C = (A C) (B C) • DeMorgan’s Laws:
Venn Diagram A A B B A∩B A∩B
Probability • A Probability is a number assigned to each subset (events) of a sample space . • Probability distributions satisfy the followingrules:
Axioms of Probability • For any event A, 0 P(A) 1. • P() =1. • If A1, A2, …An is a partition of A, then • P(A) = P(A1)+ P(A2)+...+ P(An) • (A1, A2, …An is called a partition of A if A1A2…An = A and A1, A2, …An are mutually exclusive.)
Properties of Probability • For any event A, P(Ac) = 1 - P(A). • P(A-B)=P(A) – P(A B) • If A B, then P(A - B) = P(A) – P(B) • For any two events A and B, • P(AB) = P(A) + P(B) - P(AB). • For three events, A, B, and C, • P(ABC) = P(A) + P(B) + P(C) - • P(AB) - P(AC) - P(BC) + P(AB C).
Example • In a certain population, 10% of the people are rich, 5% are famous, and 3% are both rich and famous. A person is randomly selected from this population. What is the chance that the person is • not rich? • rich but not famous? • either rich or famous?
Joint Probability • For events A and B, joint probability Pr(AB) stands for the probability that both events happen. • Example: A={HT}, B={HT, TH}, what is the joint probability Pr(AB)?
Independence • Two events A and B are independent in case Pr(AB) = Pr(A)Pr(B)
Independence • Two events A and B are independent in case Pr(AB) = Pr(A)Pr(B) • Example 1: Drug test A = {A patient is a Women} B = {Drug fails} Are A and B independent?
Independence • Two events A and B are independent in case Pr(AB) = Pr(A)Pr(B) • Example 1: Drug test • Example 2: toss a coin 3 times, Let • = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}. A={having both T and H}, B={at most one T} Are A and B independent? How about toss 2 times? A = {A patient is a Women} B = {Drug fails} Are A and B independent?
Independence • If A and B independent, then A and Bc, B and Ac, Ac and Bc independent • Consider the experiment of tossing a coin twice • Example I: • A = {HT, HH}, B = {HT} • Will event A independent from event B? • Example II: • A = {HT}, B = {TH} • Is event A independent from event B? • Disjoint Independence • If A is independent from B, B is independent from C, will A be independent from C?
Conditioning • If A and B are events with Pr(A) > 0, the conditional probability of B given A is
Conditioning • If A and B are events with Pr(A) > 0, the conditional probability of B given A is • Example: Drug test A = {Patient is a Women} B = {Drug fails} Pr(B|A) = ? Pr(A|B) = ?
Conditioning • If A and B are events with Pr(A) > 0, the conditional probability of B given A is • Example: Drug test • Given A is independent from B, what is the relationship between Pr(A|B) and Pr(A)? A = {Patient is a Women} B = {Drug fails} Pr(B|A) = ? Pr(A|B) = ?
A = {Using Drug I} B = {Using Drug II} C = {Drug succeeds} Pr(C|A) ~ 10% Pr(C|B) ~ 50% Simpson’s Paradox: View I Drug II is better than Drug I
Simpson’s Paradox: View II Female Patient A = {Using Drug I} B = {Using Drug II} C = {Drug succeeds} Pr(C|A) ~ 10% Pr(C|B) ~ 5%
Simpson’s Paradox: View II Male Patient A = {Using Drug I} B = {Using Drug II} C = {Drug succeeds} Pr(C|A) ~ 100% Pr(C|B) ~ 50% Female Patient A = {Using Drug I} B = {Using Drug II} C = {Drug succeeds} Pr(C|A) ~ 10% Pr(C|B) ~ 5%
Simpson’s Paradox: View II Drug I is better than Drug II Male Patient A = {Using Drug I} B = {Using Drug II} C = {Drug succeeds} Pr(C|A) ~ 100% Pr(C|B) ~ 50% Female Patient A = {Using Drug I} B = {Using Drug II} C = {Drug succeeds} Pr(C|A) ~ 10% Pr(C|B) ~ 5%
Conditional Independence • Event A and B are conditionally independent given C in case Pr(AB|C)=Pr(A|C)Pr(B|C) • A set of events {Ai} is conditionally independent given C in case
Conditional Independence (cont’d) • Example: There are three events: A, B, C • Pr(A) = Pr(B) = Pr(C) = 1/5 • Pr(AC) = Pr(BC) = 1/25, Pr(AB) = 1/10 • Pr(ABC) = 1/125 • Whether A, B are independent? • Whether A, B are conditionally independent given C? • A and B are independent A and B are conditionally independent
Outline • Basic concepts in probability theory • Random variable and probability distribution • Bayes’ rule
Random Variable and Distribution • A random variable Xis a numerical outcome of a random experiment • The distributionof a random variable is the collection of possible outcomes along with their probabilities: • Categorical case: • Numerical case:
Random Variables Distributions • Cumulative Probability Distribution (CDF): • Probability Density Function (PDF):
Random Variable: Example • Let S be the set of all sequences of three rolls of a die. Let X be the sum of the number of dots on the three rolls. • What are the possible values for X? • Pr(X = 5) = ?, Pr(X = 10) = ?
Expectation value • Division of the stakes problem Henry and Tony play a game. They toss a fair coin, if get a Head, Henry wins; Tail, Tony wins. They contribute equally to a prize pot of $100, and agree in advance that the first player who has won 3 rounds will collect the entire prize. However, the game is interrupted for some reason after 3 rounds. They got 2 H and 1 T. How should they divide the pot fairly? • It seems unfair to divide the pot equally Since Henry has won 2 out of 3 rounds. Then, how about Henry gets 2/3 of $100? • Other thoughts? X is what Henry will win if the game not interrupted
Expectation • Definition: the expectation of a random variable is • , discrete case • , continuous case • Properties • Summation: For any n≥1, and any constants k1,…,kn • Product: If X1, X2, …, Xn are independent
Expectation: Example • Let S be the set of all sequence of three rolls of a die. Let X be the sum of the number of dots on the three rolls. • What is E(X)? • Let S be the set of all sequence of three rolls of a die. Let X be the product of the number of dots on the three rolls. • What is E(X)?
Variance • Definition: the variance of a random variable X is the expectation of (X-E[x])2 : • Properties • For any constant C, Var(CX)=C2Var(X) • If X1, X2, …, Xn are independent
Bernoulli Distribution • The outcome of an experiment can either be success (i.e., 1) and failure (i.e., 0). • Pr(X=1) = p, Pr(X=0) = 1-p, or • E[X] = p, Var(X) = p(1-p) • Using sample() to generate Bernoulli samples > n = 10; p = 1/4; > sample(0:1, size=n, replace=TRUE, prob=c(1-p, p)) [1] 0 1 0 0 0 1 0 0 0 1
Binomial Distribution • n draws of a Bernoulli distribution • Xi~Bernoulli(p), X=i=1nXi, X~Bin(p, n) • Random variable X stands for the number of times that experiments are successful. • n = the number of trials • x = the number of successes • p = the probability of success • E[X] = np, Var(X) = np(1-p)
the binomial distribution in R • dbinom(x, size, prob) • Try 7 times, equally likely succeed or fail > dbinom(3,7,0.5) [1] 0.2734375 >barplot(dbinom(0:7,7,0.5),names.arg=0:7)
what if p ≠ 0.5? • > barplot(dbinom(0:7,7,0.1),names.arg=0:7)
Which distribution has greater variance? p = 0.1 p = 0.5 var = n*p*(1-p) = 7*0.1*0.9=7*0.09 var = n*p*(1-p) = 7*0.5*0.5 = 7*0.25
briefly comparing an experiment to a distribution nExpr = 1000 tosses = 7; y=rep(0,nExpr); for (i in 1:nExpr) { x = sample(c("H","T"), tosses, replace = T) y[i] = sum(x=="H") } hist(y,breaks=-0.5:7.5) lines(0:7,dbinom(0:7,7,0.5)*nExpr) points(0:7,dbinom(0:7,7,0.5)*nExpr) theoretical distribution result of 1000 trials Histogram of y 300 250 200 150 Frequency 100 50 0 0 2 4 6 y
Cumulative distribution P(X=x) P(X≤x) > barplot(dbinom(0:7,7,0.5),names.arg=0:7) > barplot(pbinom(0:7,7,0.5),names.arg=0:7)
cumulative distribution 1.0 1.0 0.8 0.8 0.6 0.6 cumulative distribution probability distribution 0.4 0.4 0.2 0.2 0.0 0.0 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 P(X=x) P(X≤x)
example: surfers on a website • Your site has a lot of visitors 45% of whom are female • You’ve created a new section on gardening • Out of the first 100 visitors, 55 are female. • What is the probability that this many or more of the visitors are female? • P(X≥55) = 1 – P(X≤54) = 1-pbinom(54,100,0.45)
Another way to calculate cumulative probabilities • ?pbinom • P(X≤x) = pbinom(x, size, prob, lower.tail = T) • P(X>x) = pbinom(x, size, prob, lower.tail = F) > 1-pbinom(54,100,0.45) [1] 0.02839342 > pbinom(54,100,0.45,lower.tail=F) [1] 0.02839342
Female surfers visiting a section of a website what is the area under the curve?
Cumulative distribution > 1-pbinom(54,100,0.45) [1] 0.02839342 <3 %