ENGG2012B Lecture 17: Interpretations of probability, Kolmogorov’s axioms
Kenneth Shum
Midterm
• 22nd Mar, F2~F3, MMW LT1
• Bring a calculator and blank paper.
• Closed-book and closed-note exam.
• No make-up exam.
• Coverage:
  • Lecture 1 to 15.
  • Tutorial 1 to 7.
  • Homework 1 to 3.
Last week: Law of total probability
• Suppose A ∪ B ∪ C ∪ D = Ω, with A, B, C, D mutually disjoint.
• Then Pr(E) = Pr(EA) + Pr(EB) + Pr(EC) + Pr(ED).
[Figure: Venn diagram of the sample space split into A, B, C, D, with event E.]
Last week: Conditional probability
• At the beginning, the probability of event A is |A| / |Ω|, assuming that all outcomes in Ω are equally likely.
• If we are given the additional information that the outcome is in event B, then we can update the likelihood to |AB| / |B|.
• If Pr(B) is nonzero, then we define the conditional probability of A given B by Pr(A|B) = Pr(AB)/Pr(B).
[Figure: Venn diagram of events A and B.]
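
The counting definition above can be sketched in a few lines of Python. The fair die and the events A and B below are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair die, all outcomes equally likely.
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # "the roll is even"
B = {4, 5, 6}  # "the roll is at least 4"

def pr(event):
    """Classical probability |event| / |omega|."""
    return Fraction(len(event), len(omega))

def pr_given(a, b):
    """Conditional probability Pr(a|b) = Pr(ab) / Pr(b); needs Pr(b) > 0."""
    if pr(b) == 0:
        raise ValueError("Pr(b) must be nonzero")
    return pr(a & b) / pr(b)

print(pr(A))           # 1/2
print(pr_given(A, B))  # 2/3
```

Knowing that the outcome lies in B raises the probability of A from 1/2 to 2/3, which matches the ratio |AB| / |B| = 2/3.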
Disjoint events and independent events
• If two events E and F are disjoint, i.e., E ∩ F = ∅, then they cannot occur at the same time.
  • Pr(E ∪ F) = Pr(E) + Pr(F)
  • Pr(E ∩ F) = 0
• If two events A and B are statistically independent, then
  • Pr(AB) = Pr(A) Pr(B),
  • AB is in general not empty.
[Figure: two Venn diagrams, one showing events E and F, the other showing events A and B.]
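
A small sketch may help keep the two notions apart. It assumes a sample space of two fair coin tosses; this sample space and the four events are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space: two tosses of a fair coin (4 equally likely outcomes).
omega = set(product("HT", repeat=2))

def pr(event):
    return Fraction(len(event), len(omega))

E = {w for w in omega if w == ("H", "H")}  # both tosses are heads
F = {w for w in omega if w == ("T", "T")}  # both tosses are tails
A = {w for w in omega if w[0] == "H"}      # first toss is a head
B = {w for w in omega if w[1] == "H"}      # second toss is a head

# Disjoint events: empty intersection, and probabilities add.
disjoint_ok = (E & F == set()) and pr(E | F) == pr(E) + pr(F)
# Independent events: Pr(AB) factors, and AB is not empty.
independent_ok = pr(A & B) == pr(A) * pr(B) and len(A & B) > 0

print(disjoint_ok, independent_ok)  # True True
print(pr(E & F) == pr(E) * pr(F))   # False: E and F are disjoint but not independent
```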
Example of dependent events
• There are six balls in a bag.
• Three of them are red and three of them are blue.
• We draw two balls randomly from the bag (without replacement).
• Let A be the event that the first ball is red.
• Let B be the event that the second ball is red.
• Pr(A) = Pr(B) = 1/2
• The two events are not statistically independent, because given that the first ball is red, the second is red with probability 2/5.
• Hence Pr(B) ≠ Pr(B|A).
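
The numbers in this example can be checked by listing all ordered draws. The ball labels R1, R2, R3, B1, B2, B3 are names introduced here only to tell the balls apart.

```python
from fractions import Fraction
from itertools import permutations

# Label the balls so that every ordered draw of two balls is equally likely.
balls = ["R1", "R2", "R3", "B1", "B2", "B3"]
draws = list(permutations(balls, 2))  # 6 * 5 = 30 ordered pairs

A = [d for d in draws if d[0].startswith("R")]  # first ball is red
B = [d for d in draws if d[1].startswith("R")]  # second ball is red
AB = [d for d in A if d[1].startswith("R")]     # both balls are red

pr_A = Fraction(len(A), len(draws))
pr_B = Fraction(len(B), len(draws))
pr_B_given_A = Fraction(len(AB), len(A))

print(pr_A, pr_B, pr_B_given_A)  # 1/2 1/2 2/5
```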
Bayes’ rule
• Let A and B be two events. Suppose that the probability of B is nonzero.
• The probability of A given B can be computed from the probability of B given A by
  Pr(A|B) = Pr(B|A) Pr(A) / Pr(B).
Example: The paradox of the HIV test
• There is a test for AIDS.
• If a person is infected, the test will be positive 99% of the time.
• If a person is not infected, the test will be negative 98% of the time.
• Suppose person A takes the test, and the result is positive. Person A comes from a population with infection rate 0.5%. What is the chance that A is infected?
• The infection rate of 0.5% is called the a priori probability.
Solution
• Let I be the event “infected” and P be the event “test result is positive”.
• We want to calculate Pr(I|P).
• We know that Pr(P|I) = 0.99 and Pr(P^c|I^c) = 0.98.
• By Bayes’ rule: Pr(I|P) = Pr(IP)/Pr(P) = Pr(P|I) Pr(I) / Pr(P).
• But
  Pr(P) = Pr(PI) + Pr(PI^c)   // law of total probability
  = Pr(P|I) Pr(I) + Pr(P|I^c) Pr(I^c)   // by the definition of conditional probability
  = 0.99 × 0.5% + (1 – 0.98) × 99.5% = 0.02485 ≈ 0.0249.
• So, Pr(I|P) = 0.99 × 0.5% / 0.0249 ≈ 0.199.
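
The same calculation as a short Python sketch. The function name `posterior` and the parameter names `sensitivity` (for Pr(P|I)) and `specificity` (for Pr(P^c|I^c)) are labels chosen here; they do not appear in the lecture.

```python
def posterior(prior, sensitivity, specificity):
    """Pr(I|P) from Bayes' rule, with Pr(P) from the law of total probability.

    prior       = Pr(I)
    sensitivity = Pr(P|I)
    specificity = Pr(P^c|I^c), so Pr(P|I^c) = 1 - specificity
    """
    pr_positive = sensitivity * prior + (1 - specificity) * (1 - prior)
    return sensitivity * prior / pr_positive

p = posterior(prior=0.005, sensitivity=0.99, specificity=0.98)
print(round(p, 3))  # 0.199
```

Raising the prior shows why the result feels paradoxical: with the same test, `posterior(0.5, 0.99, 0.98)` is about 0.98, so the low answer above comes from the low infection rate, not from a poor test.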
INTERPRETATIONS OF PROBABILITY
Classical interpretation
• Usually credited to Laplace.
• All outcomes in the sample space have the same probability.
• Pr(E) = |E| / |Ω|
• It requires that the sample space is a finite set.
• Calculation of probability reduces to counting.
• The classical interpretation cannot model a biased coin, loaded dice, etc.
Frequentist interpretation
• Advocated by Richard von Mises (1883-1953).
• We can only talk about probability if the random experiment can be repeated many times.
• The probability of an event is the relative frequency of the occurrence of the event.
• E.g., if we toss a biased coin 1000 times and obtain 421 heads, then we assign the probability 0.421 to the event “head”.
• The frequentist interpretation cannot model statements such as:
  • A bridge will not collapse within the next 50 years with probability 0.999.
  • The nuclear plant in Daya Bay is safe with probability 0.99999.
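
A simulation gives a feel for the relative-frequency idea. This is only a sketch: the bias 0.42 is an assumed value for the simulated coin, and the seed is fixed so that runs are repeatable.

```python
import random

def relative_frequency(p_head, n_tosses, seed=0):
    """Toss a simulated coin n_tosses times and return the fraction of heads.

    p_head is the assumed (hypothetical) bias of the coin.
    """
    rng = random.Random(seed)
    heads = sum(rng.random() < p_head for _ in range(n_tosses))
    return heads / n_tosses

# The relative frequency tends to settle near the bias as the number of tosses grows.
for n in (100, 1_000, 100_000):
    print(n, relative_frequency(0.42, n))
```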
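Subjective interpretation
• Probabilities are defined by personal preference, or personal belief.
• Example (news report, 1 March 2013, 14:47): “Choi Ngai-min (蔡涯棉), a member of the Long Term Housing Strategy Steering Committee, said that insufficient land supply is the fundamental problem of the property market, and that increasing land supply can curb external and investment demand. Given that the government expects 24,000 first-hand units to be launched this year, more than the average of fewer than 10,000 per year over the past five years, Choi estimates that the chance of a sharp rise in property prices is relatively small.”
• Used in social science, decision making, etc.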
Example of subjective probability
• There are two suspects X and Y in a murder case. Both of them are on the run. Initially, the evidence against X and Y is equally strong.
• After more detailed investigation, it is known that the murderer has blood type B.
• We also know that suspect X has blood type B, but we do not know the blood type of suspect Y.
• It is also known that 10% of the population has blood type B.
• What is the probability that Y is the murderer given the new information?
Solution
• Let H1 be the event that X is the murderer and H2 be the event that Y is the murderer.
• Initially we have Pr(H1) = Pr(H2) = 0.5. These are subjective probabilities.
• Let E be the event that suspect X has blood type B, the same blood type as the murderer.
  • Pr(E|H1) = 1
  • Pr(E|H2) = 0.1
• We want to calculate Pr(H2|E).
• By Bayes’ rule: Pr(H2|E) = Pr(H2E) / Pr(E) = Pr(E|H2) Pr(H2) / Pr(E).
• From the law of total probability, Pr(E) = Pr(EH1) + Pr(EH2) = Pr(E|H1) Pr(H1) + Pr(E|H2) Pr(H2) = 1 × 0.5 + 0.1 × 0.5 = 0.55.
• Therefore, Pr(H2|E) = 0.1 × 0.5 / (1 × 0.5 + 0.1 × 0.5) = 1/11 ≈ 9%.
• Pr(H1|E) = 1 – Pr(H2|E) = 10/11 ≈ 91%.
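
The update from prior to posterior can be written as a short sketch over both hypotheses at once. The dictionary layout is an illustrative choice; the numbers are the ones used in the solution.

```python
from fractions import Fraction

# Subjective priors and likelihoods of the evidence E under each hypothesis.
priors = {"H1": Fraction(1, 2), "H2": Fraction(1, 2)}
likelihoods = {"H1": Fraction(1), "H2": Fraction(1, 10)}  # Pr(E|H1), Pr(E|H2)

# Law of total probability: Pr(E) = sum over h of Pr(E|h) Pr(h).
pr_E = sum(likelihoods[h] * priors[h] for h in priors)

# Bayes' rule for each hypothesis.
posteriors = {h: likelihoods[h] * priors[h] / pr_E for h in priors}

print(pr_E)              # 11/20
print(posteriors["H2"])  # 1/11, about 9%
print(posteriors["H1"])  # 10/11, about 91%
```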
Geometric probability
• A.k.a. “stochastic geometry”
• The sample space is a geometric object, like a line segment, circle, square, sphere, etc.
• Pick a point randomly in the sample space. The probability that the chosen point falls within a certain region is directly proportional to the area/volume of the region.
• This is the uniform distribution on the geometric object.
Example
• Pick a point at random inside a circle of radius 1.
• For 0 < r < 1, this point falls within a concentric circle of radius r with probability πr²/π = r².
• The probability of picking any particular point is zero, because a point has zero area.
[Figure: circle of radius 1.]
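
The area ratio can be checked with a Monte Carlo sketch. It assumes the inner circle shares its centre with the unit circle, and it uses rejection sampling from the enclosing square to get uniform points; the function name, sample size, and seed are arbitrary choices.

```python
import random

def estimate_inside(r, n_points=200_000, seed=0):
    """Estimate the probability that a uniform point in the unit disk
    lies within distance r of the centre."""
    rng = random.Random(seed)
    inside = 0
    accepted = 0
    while accepted < n_points:
        # Sample uniformly from the square [-1, 1] x [-1, 1] ...
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        d2 = x * x + y * y
        if d2 > 1:
            continue  # ... and reject points outside the unit disk.
        accepted += 1
        if d2 <= r * r:
            inside += 1
    return inside / n_points

print(estimate_inside(0.5))  # should be close to 0.5 ** 2 = 0.25
```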
AXIOMS OF PROBABILITY
Probability measure
• Regardless of the interpretation, the calculation of probabilities must satisfy some basic rules.
• Three of the most basic rules were identified by the Russian mathematician Kolmogorov as the axioms of probability.
• In Kolmogorov’s formulation, we assign a real number to an event. A probability measure is a function which accepts an event as input and outputs a real number.
• As we saw in the example of geometric probability, it is sometimes more convenient to assign probabilities to events instead of points in the sample space.
• We construct a probability model by assigning probabilities to a sufficiently large number of basic events. The probabilities of other events can then be computed from unions, intersections, and complements of the basic events.
Kolmogorov’s axioms
• Let Ω be a sample space, which may be finite or infinite.
• A probability measure assigns real numbers to some events in Ω, satisfying the following axioms:
  1. Pr(E) is a real number between 0 and 1.
  2. Pr(Ω) = 1.
  3. For any sequence of pairwise disjoint events E_1, E_2, E_3, …, we have Pr(E_1 ∪ E_2 ∪ E_3 ∪ …) = Pr(E_1) + Pr(E_2) + Pr(E_3) + …
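
On a finite sample space the axioms can be checked by brute force. The sketch below assumes a loaded die with made-up weights, a case the classical interpretation cannot model. Because the sample space is finite, the third axiom is only checked for pairs of disjoint events, not for infinite sequences.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical loaded die: face 6 comes up half of the time.
weights = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
           4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}
omega = frozenset(weights)

def pr(event):
    """Probability of an event as the sum of the weights of its outcomes."""
    return sum((weights[w] for w in event), Fraction(0))

def all_events(space):
    """Every subset of a finite sample space."""
    items = sorted(space)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

events = all_events(omega)
axiom1 = all(0 <= pr(e) <= 1 for e in events)
axiom2 = pr(omega) == 1
# Additivity, checked for every pair of disjoint events.
axiom3 = all(pr(e | f) == pr(e) + pr(f)
             for e in events for f in events if not (e & f))

print(axiom1, axiom2, axiom3)  # True True True
```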
Andrey Kolmogorov (1903-1987)
• Russian mathematician.
• The three axioms can be found in his book “Foundations of the Theory of Probability”, published in 1933.
Immediate implications of the axioms
• Although the third axiom is stated for an infinite sequence of events, it includes the case of finitely many events as a special case: if E_1, E_2, …, E_n are pairwise disjoint events, then
  Pr(E_1 ∪ E_2 ∪ … ∪ E_n) = Pr(E_1) + Pr(E_2) + … + Pr(E_n).
• Proof: Apply the third axiom with ∅ = E_{n+1} = E_{n+2} = E_{n+3} = … Then
  Pr(E_1 ∪ E_2 ∪ … ∪ E_n) = Pr(E_1 ∪ E_2 ∪ … ∪ E_n ∪ E_{n+1} ∪ E_{n+2} ∪ …)
  = Pr(E_1) + Pr(E_2) + … + Pr(E_n) + Pr(E_{n+1}) + Pr(E_{n+2}) + …
  = Pr(E_1) + Pr(E_2) + … + Pr(E_n) + 0 + 0 + …   // since Pr(∅) = 0
Immediate implications of the axioms
• For any event E in Ω, let E^c be the complement of the event E. We have Pr(E^c) = 1 – Pr(E), because Pr(E) + Pr(E^c) = Pr(Ω) = 1.
• Let ∅ be the empty set (the null event). Then Pr(∅) = 0.
• Proof:
  Pr(Ω) + Pr(∅) = Pr(Ω ∪ ∅)   // Ω and ∅ are disjoint
  = Pr(Ω)   // Ω ∪ ∅ = Ω
  The result follows from subtracting Pr(Ω) from both sides.
Immediate implications of the axioms
• Let A and B be any two events. Then Pr(A ∪ B) = Pr(A) + Pr(B) – Pr(AB).
• Proof: Let C be the event A ∩ B^c and D be the event B ∩ A^c. We have A ∪ B = C ∪ D ∪ (AB). The right-hand side is a union of disjoint sets.
  Pr(A ∪ B) = Pr(C) + Pr(D) + Pr(AB)   // by the third axiom
  = [Pr(C) + Pr(AB)] + [Pr(D) + Pr(AB)] – Pr(AB)
  = Pr(A) + Pr(B) – Pr(AB)   // by the third axiom again, since A = C ∪ (AB) and B = D ∪ (AB)
[Figure: Venn diagram of A and B with regions C, AB, and D.]
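
The decomposition used in the proof can be traced on a concrete case. The sample space of 12 equally likely outcomes and the events A and B below are hypothetical choices for illustration.

```python
from fractions import Fraction

# Hypothetical sample space: the numbers 1..12, all equally likely.
omega = set(range(1, 13))
A = {w for w in omega if w % 2 == 0}  # multiples of 2
B = {w for w in omega if w % 3 == 0}  # multiples of 3

def pr(event):
    return Fraction(len(event), len(omega))

C = A - B  # A ∩ B^c
D = B - A  # B ∩ A^c

# A ∪ B splits into the three disjoint pieces C, D and AB.
assert A | B == C | D | (A & B)

lhs = pr(A | B)
rhs = pr(A) + pr(B) - pr(A & B)
print(lhs, rhs)  # 2/3 2/3
```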