GG 313 Lecture 4: Probability Basics. 9/1/05.
Why Study Probability? We need to determine what is probable - not just what is possible. Expect that the most probable explanation is the right one. My advisor called this “The principle of minimum astonishment”. This is not always the case, however: rare events do occur, such as huge meteor impacts. But if you think the occurrence of an exciting rare event explains your data, be sure to eliminate the more likely possibilities first - like errors in your data.
One aspect of a viable theory is its ability to predict undetermined values and future events. The theory of plate tectonics predicts speed of the plates, ages of the ocean floor, and locations of earthquakes. The values of these parameters can be predicted to a high probability based on the principles of plate tectonics. Other theories are discarded because their predictions are less reliable. The work done by Chris Conger will increase the probability of finding sand deposits in the shallow ocean. Not too exciting? Most science progresses in small steps capitalizing on the work of others.
Another advantage of studying probability is its use in everyday life. What actions will improve your probability of success? Buying lottery tickets? Investing in junk bonds? Buying gold? Investing in real estate? Smoking? Going to grad school? What events are most probable in my lifetime? What kind of event is likely to kill me? Meteor impact? Earthquake? Tsunami? Terrorism? War? Accident? Heart disease? Cancer? Should you buy insurance? How much should you pay for insurance? None of these probabilities is fixed. As knowledge increases and parameters change, so do the probabilities.
Some basics: Flip a coin 3 times. How many possible outcomes are there? There are 3 events, each with two possible outcomes, so there are a total of 2*2*2 = 8 possible results:
Flip 1   Flip 2   Flip 3
  H        H        H
  T        H        H
  H        T        H
  T        T        H
  H        H        T
  T        H        T
  H        T        T
  T        T        T
The formulation is: the number of possible results N with k trials, with n_i possible outcomes in the i’th trial, is $N = n_1 \, n_2 \cdots n_k$. How many values can a 3-digit binary number have?
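The multiplication rule can be checked by brute force. The following is a minimal Python sketch, not part of the original slides; the helper name `count_outcomes` is made up for illustration. It enumerates the three coin flips explicitly and compares the count with the product of the options per trial. The same count, 2*2*2, answers the 3-digit binary number question.

```python
from itertools import product
from math import prod


def count_outcomes(options_per_trial):
    """Multiplication rule: N = n_1 * n_2 * ... * n_k (hypothetical helper)."""
    return prod(options_per_trial)


# Enumerate every sequence of three coin flips explicitly.
flips = ["".join(seq) for seq in product("HT", repeat=3)]

print(flips)                      # all sequences of H and T of length 3
print(len(flips))                 # number of sequences found by enumeration
print(count_outcomes([2, 2, 2]))  # same number from the multiplication rule
```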
Another example: How many possible license plates are there using three letters and three numbers? N=26*26*26*10*10*10=17,576,000. Permutations: The number of permutations of r objects taken from a set of n distinct objects is the number of ways n things taken r at a time can be arranged. Example: We have 20 rock samples. How many ways can you select 3 samples from the 20, if the order of selection matters? The first rock can be any one of the 20; the 2nd can be any of 19, and the 3rd can be any of 18. So the answer is 20*19*18=6,840. The formulation is: $_nP_r = n(n-1)(n-2)\cdots(n-r+1)$
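Factorial: The factorial operation is defined as: $n! = n(n-1)(n-2)\cdots(2)(1)$. By definition, 0! is set equal to 1. We can re-write the permutation equation as: $_nP_r = \frac{n!}{(n-r)!}$. Example: How many different hands are there in straight poker (no draw)? 52P5=52!/(52-5)!=52*51*50*49*48=311,875,200.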
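As a quick check of the permutation formula, here is a small Python sketch (an illustration added here, not from the slides; `permutations_count` is a hypothetical helper name). It evaluates n!/(n-r)! for the rock-sample and poker examples and compares the result with the standard library's `math.perm`.

```python
from math import factorial, perm


def permutations_count(n, r):
    """Ordered selections of r objects from n distinct objects: n!/(n-r)!."""
    return factorial(n) // factorial(n - r)


rock_orderings = permutations_count(20, 3)  # 20*19*18
poker_ordered = permutations_count(52, 5)   # 52*51*50*49*48, order counted

print(rock_orderings, perm(20, 3))
print(poker_ordered, perm(52, 5))
```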
The poker example isn’t quite correct, because it assumes that the order in which you received the cards is important, which it isn’t. We need another parameter where order isn’t important. COMBINATIONS: When we don’t care about the order of the outcomes (ABC=ACB), then we talk about the number of COMBINATIONS of n objects taken r at a time. This turns out to be the number of permutations divided by r!: $_nC_r = \frac{_nP_r}{r!} = \frac{n!}{r!\,(n-r)!}$. These numbers are also called binomial coefficients.
The reason they are called binomial coefficients is that they are the coefficients of the expansion: $(x+y)^n = \sum_{r=0}^{n} {}_nC_r \, x^{n-r} y^{r}$. For n=2, $(x+y)^2 = 1x^2 + 2xy + 1y^2$.
So how many different poker hands are there really? 52C5=52!/(5!(52-5)!)=2,598,960. How many ways can you pick three marbles from 9 marbles? 9C3=9!/(3!(9-3)!)=9*8*7/(3*2*1)=84. After picking r objects, there are n-r left, and there are as many ways of picking n-r objects from n objects as there are ways to pick r objects from n objects: $_nC_r = {}_nC_{n-r}$
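The combination counts can be verified in the same way. This is a minimal Python sketch added for illustration (`combinations_count` is a hypothetical helper). It divides the number of permutations by r!, compares with `math.comb`, and checks the symmetry between picking r objects and leaving n-r behind.

```python
from math import comb, factorial, perm


def combinations_count(n, r):
    """Unordered selections: permutations divided by r!."""
    return perm(n, r) // factorial(r)


poker_hands = combinations_count(52, 5)
marble_picks = combinations_count(9, 3)

# Binomial coefficients for n = 2: the coefficients of (x + y)^2.
binomial_row_2 = [comb(2, r) for r in range(3)]

print(poker_hands, comb(52, 5))
print(marble_picks, combinations_count(9, 6))  # symmetry: 9C3 equals 9C6
print(binomial_row_2)
```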
Probability: Now that we know how to tell what’s possible, how do we tell what’s probable? The basic concept is: If there are s possible favorable outcomes of an event and there are n outcomes possible, then the probability of success is s/n. p=s/n However, this is only true if all outcomes are equally likely.
Example: What is the probability of drawing an ace from a deck of cards? Since there are 52 cards, there are 52 possible outcomes, and, since there are 4 aces, four of those outcomes are favorable, thus: P=4/52=1/13=7.7% Example: A cancer surgery patient gets biopsies on 6 lymph nodes. If any one is found to contain cancer, then the cancer will be known to have spread and the patient will receive chemotherapy. If only 1 in 10 lymph nodes is actually cancerous, what is the probability of all six sampled nodes coming out negative?
Our possible outcomes are 10 nodes taken 6 at a time, or 10C6=10!/(6!(10-6)!)=10*9*8*7/(4*3*2*1)=10*3*7=210. The outcomes of interest (all six negative) are those that miss the 1 cancerous node, which is the same as drawing all 6 samples from the 9 clear nodes: 9C6=9!/(6!(9-6)!)=9*8*7/(3*2*1)=84. So the probability is 84/210=40%. Lesson to surgeons: sample LOTS of nodes! When the probabilities of some outcomes are greater than those of others, the above calculations don’t work. A better definition is: The probability of an outcome is the fraction of trials in which that outcome is observed, over a large number of trials.
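The lymph-node calculation generalizes to other sample sizes. The sketch below is an illustration only, assuming (as the example does) exactly 1 cancerous node among 10; the function name `p_all_clear` is made up. It shows how the chance of missing the cancerous node falls as more nodes are sampled.

```python
from fractions import Fraction
from math import comb


def p_all_clear(total_nodes, cancerous_nodes, sampled):
    """Probability that every sampled node is clear (hypothetical helper)."""
    clear_nodes = total_nodes - cancerous_nodes
    return Fraction(comb(clear_nodes, sampled), comb(total_nodes, sampled))


# Chance of missing the single cancerous node for several sample sizes.
for k in (1, 3, 6, 9):
    print(k, float(p_all_clear(10, 1, k)))
```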
Example: “The probability of sunshine for more than 2 hours per day in June in Honolulu is 97%.” This statistic, valuable to the Tourist Bureau, is based on a large number of samples of sunshine in Honolulu in June. The Law of Large Numbers: If an experiment is repeated a large number of times, the fraction of times a particular outcome is observed will approach the probability of that outcome.
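The Law of Large Numbers can be illustrated with a simulation. This Python sketch is an added illustration, not from the slides: it repeatedly draws one card from a 52-card deck, treats 4 of the 52 cards as aces, and reports the observed fraction of aces. The seed and the trial counts are arbitrary choices.

```python
import random


def observed_fraction(n_trials, seed=0):
    """Fraction of single-card draws that come up an ace (4 of 52 cards)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if rng.randrange(52) < 4)
    return hits / n_trials


print("expected:", 4 / 52)
for n in (100, 10_000, 100_000):
    print(n, observed_fraction(n))
```

With few trials the observed fraction can be well off 1/13; with many trials it settles close to it.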
Rules and definitions. S: Sample Space: all possible outcomes of an experiment. A: Event: a subset of S. An event may contain more than one outcome. Mutually exclusive: two events that have no common outcomes. The probability of an event must be greater than or equal to 0 and less than or equal to 1: 0 ≤ P(A) ≤ 1. Also, P(S)=1.
If P(A)=1, A is a certainty. • If two events are mutually exclusive, then the probability that one or the other will occur is the sum of their probabilities. • ∪: the “Union” symbol. It means “or” • ∩: the “Intersection” symbol. It means “and” If A and B are mutually exclusive: P(A ∪ B)= P(A)+P(B) and P(A ∩ B)=0 • ’: the “complement” symbol. It means “not” P(A)+P(A’)=1
ODDS: Gamblers use odds rather than probabilities. It is an error to use these two terms interchangeably. If the probability of an event is p, the odds of it occurring are a:b, where a/b=p/(1-p), or p=a/(a+b). Odds are used because they tell you directly what your likely winnings are. 1:1 (say 1-to-1) odds mean even money, or a probability of 50%. Example: There are 3 blue marbles and one red marble. You reach into a hat and draw out 1 marble. Your probability of picking a red marble is 0.25, but the odds of picking red are 1:3. If you make a bet to pick a red marble you should get 3 dollars for every dollar bet if you’re going to break even in the long run.
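The conversion between probability and odds can be written out directly. A minimal Python sketch follows (added for illustration; both function names are hypothetical). It uses exact fractions so that 0.25 comes out as 1:3, and it assumes p is strictly less than 1, since certainty has no finite odds.

```python
from fractions import Fraction


def probability_to_odds(p):
    """Return odds a:b in lowest terms for probability p, with 0 <= p < 1."""
    ratio = Fraction(p) / (1 - Fraction(p))
    return ratio.numerator, ratio.denominator


def odds_to_probability(a, b):
    """Return the probability corresponding to odds a:b, p = a/(a+b)."""
    return Fraction(a, a + b)


print(probability_to_odds(Fraction(1, 4)))  # red marble: 1 red, 3 blue
print(probability_to_odds(Fraction(1, 2)))  # even money
print(odds_to_probability(1, 3))
```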
Additional Probability Addition Rules. [Venn diagram: two overlapping circles in the sample space, with oil only = 0.18, oil and gas = 0.12, gas only = 0.24.] Venn diagrams illustrate the probabilities of non-exclusive events. The circles represent two different events embedded in the sample space. These could be the probabilities of hitting economical oil (orange) or gas (pink). P(oil)=0.18+0.12=0.30 P(gas)=0.24+0.12=0.36 P(gas ∪ oil)=0.18+0.12+0.24=0.54 Note: this is the “inclusive OR” in that both events can occur and still be counted.
If we had used our previous addition rule, P(A ∪ B)=P(A)+P(B)=P(oil)+P(gas)=0.30+0.36=0.66, we would overestimate the probability of finding gas or oil, because the overlap is counted twice. We fix that by writing: P(oil ∪ gas)=P(oil)+P(gas)-P(oil ∩ gas)=0.30+0.36-0.12=0.54. If the events are mutually exclusive, then P(A ∩ B)=0, and the original rule is recovered.
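The general addition rule for the oil and gas example can be checked numerically. This short Python sketch is an added illustration using the three region probabilities from the Venn diagram; the variable names are assumptions.

```python
from math import isclose

# Region probabilities read from the Venn diagram.
p_oil_only, p_both, p_gas_only = 0.18, 0.12, 0.24

p_oil = p_oil_only + p_both
p_gas = p_gas_only + p_both

naive_sum = p_oil + p_gas         # counts the overlap twice
p_union = p_oil + p_gas - p_both  # general addition rule

print(round(naive_sum, 2), round(p_union, 2))
# The corrected value matches the sum of the three separate regions.
print(isclose(p_union, p_oil_only + p_both + p_gas_only))
```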
Conditional Probability. “What if” probabilities are very common - probabilities where an outcome depends on the occurrence of a previous outcome. • If a strength 5 hurricane hits New Orleans, what is the probability that a dike will fail? • If an earthquake occurs off the west coast, what is the probability that a major tsunami will be generated? • If a disaster occurs, what is the probability that our insurance company will not have sufficient funds? • If oil supply drops below demand, what is the probability that we can make do with alternative energy?
Conditional probability is the probability that an event will occur, given that another event has already occurred: $P(A|B) = \frac{P(A \cap B)}{P(B)}$. The probability of A given that B has occurred is equal to the probability of A and B divided by the probability of B. In the oil and gas example, what is the probability of finding oil given that gas was found? P(oil | gas)= 0.12/0.36= 1/3= 33%
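The same definition gives the conditional probability in either direction. A small Python sketch follows (added for illustration, with a hypothetical helper `conditional`), using exact fractions for the oil and gas numbers from the Venn diagram.

```python
from fractions import Fraction


def conditional(p_a_and_b, p_b):
    """P(A|B) = P(A and B) / P(B); undefined when P(B) is zero."""
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_a_and_b / p_b


p_both = Fraction(12, 100)  # P(oil and gas)
p_oil = Fraction(30, 100)
p_gas = Fraction(36, 100)

p_oil_given_gas = conditional(p_both, p_gas)
p_gas_given_oil = conditional(p_both, p_oil)

print(p_oil_given_gas, p_gas_given_oil)
```

Note that P(oil | gas) and P(gas | oil) differ, because the two events have different probabilities.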
Bayes Basic Theorem. Re-writing the above equation, we get: P(A ∩ B) = P(B) P(A | B) and P(A ∩ B) = P(A) P(B | A). If A and B are independent, then whether or not B has already occurred does not affect the probability of A: P(A|B)=P(A). Substituting into Bayes Theorem: P(A ∩ B)= P(A) P(B), if A and B are independent. For n independent events: $P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2) \cdots P(A_n)$
Example: What is the probability of death by meteoroid impact? The probability of a planet-killer meteoroid impact in a given year is about 10^-8. The average person lives about 60 years, and there are about 5x10^9 people. On average, every 10^8 years 5x10^9 people will be killed by an impact, but every 60 years 5x10^9 people will be killed by other causes. So, in 10^8 years, 10^8/60 * 5x10^9 people die of other causes, and 5x10^9 people are killed by an impact. Divide the total deaths by impact by the total deaths by other causes to get the probability of death by impact: P(death by impact) ~ 60/10^8 = 6x10^-7, or about 1 in 1.7 million. This is about the same as death by lightning.
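The arithmetic of the impact estimate can be carried through step by step. This Python sketch is an added illustration that uses only the three inputs stated in the example (annual impact probability, lifespan, population); the variable names are assumptions.

```python
p_impact_per_year = 1e-8  # stated annual probability of a planet-killer impact
lifespan_years = 60       # stated average lifespan
population = 5e9          # stated world population

# Average time between impacts, in years.
window_years = 1 / p_impact_per_year

deaths_by_impact = population
deaths_by_other = window_years / lifespan_years * population

ratio = deaths_by_impact / deaths_by_other  # the population cancels out
print(ratio)      # probability of death by impact
print(1 / ratio)  # "1 in N" form
```

Because the population cancels, the estimate reduces to the lifespan multiplied by the annual impact probability.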
Peak Oil Example. The probability that A) demand for oil will outstrip supply within the next 5 years is ~70%. The probability that B) we will be able to satisfy the excess demand with other energy sources is ~20%. The probability of C) global economic chaos if A ∩ B’ occurs is ~60%. The probability of global economic chaos beginning within the next 5 years: P(A ∩ B’)=P(A) P(B’|A)=0.7*(1-0.2)=0.56, so P(C)=0.56*0.6 ~ 34%. This is the argument that is getting considerable attention now: Google “peak oil”
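The chain of conditional probabilities in the peak oil example can be laid out explicitly. In this Python sketch (an added illustration), the 20% figure is treated as P(B|A), the probability that alternatives cover the shortfall given that a shortfall occurs. That reading is an assumption based on how the slide uses the number.

```python
p_a = 0.7                # P(A): demand outstrips supply within 5 years
p_b_given_a = 0.2        # P(B|A): alternatives cover the gap (assumed reading)
p_c_given_a_not_b = 0.6  # P(C | A and not B): chaos if the gap is not covered

# Multiplication rule applied twice along the chain A -> not B -> C.
p_a_and_not_b = p_a * (1 - p_b_given_a)
p_chaos = p_a_and_not_b * p_c_given_a_not_b

print(round(p_a_and_not_b, 3), round(p_chaos, 3))
```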
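If there is more than one event B_i (all mutually exclusive and together covering every possibility) that is conditionally related to event A, then P(A) is the sum of the conditional probabilities of A given each B_i, weighted by P(B_i): $P(A) = \sum_i P(B_i)\,P(A|B_i)$. This yields: $P(B_j|A) = \frac{P(B_j)\,P(A|B_j)}{\sum_i P(B_i)\,P(A|B_i)}$, which is the general Bayes Theorem.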
Like much of statistics, the formulas are incomprehensible without examples. Consider: An unknown marine fossil fragment was found at the fossil site in a stream bed. You want a better fossil, but there are two possible sources upstream. Drainage basin B1 covers 180 km^2 and B2 covers 100 km^2.
Based on the area alone, the probability that the fossil comes from one basin or the other is: P(B1)=180/280=0.64 P(B2)=100/280=0.36 However, a geological map shows that 35% of the outcrops in B1 are marine, while 80% of the outcrops in B2 are marine. The conditional probabilities are: P(A|B1)=0.35 (probability of a marine fossil given B1) and P(A|B2)=0.80 (probability of a marine fossil given B2). We can now use Bayes theorem to find the probability that the fossil came from B1, given that the fossil is marine:
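That calculation can be sketched in Python as follows. This is an illustration added here, not the slide's own working. The function name `bayes_posteriors` is hypothetical, and exact fractions are used for the areas and outcrop percentages so that rounding of 0.64 and 0.36 does not affect the result.

```python
from fractions import Fraction


def bayes_posteriors(priors, likelihoods):
    """General Bayes theorem: P(B_j|A) = P(B_j)P(A|B_j) / sum_i P(B_i)P(A|B_i)."""
    joint = {name: priors[name] * likelihoods[name] for name in priors}
    total = sum(joint.values())  # this sum is P(A)
    return {name: value / total for name, value in joint.items()}


# Priors from basin area; likelihoods from the fraction of marine outcrops.
priors = {"B1": Fraction(180, 280), "B2": Fraction(100, 280)}
likelihoods = {"B1": Fraction(35, 100), "B2": Fraction(80, 100)}

posteriors = bayes_posteriors(priors, likelihoods)
for basin, p in posteriors.items():
    print(basin, p, round(float(p), 3))
```

Although B1 is the larger basin, the higher share of marine outcrops in B2 makes B2 the more probable source.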