Last Lecture:
• Histograms:
  • Definition
  • Interpretation in terms of probability
  • Estimate of distribution function
• Sample Means, Sample Medians, and Sample Variances / Standard Deviations (also known as “statistics”)
  • Definitions
  • Interpretations
  • Estimates of “true” values
• This Thursday, see www.math.umass.edu/~jstauden/ for homework 2.
• In the HW, you’ll also learn about percentiles and boxplots (from Chapters 2 and 3).
• We’ll learn a lot more about the stuff in the first 3 chapters later in the semester…
Economic Growth Rate is an Estimate
• If 100 economists were asked to estimate the growth of the economy last quarter, a histogram of the estimates might look like this:
  Sample mean growth estimate is about 0.002 (0.2%)
  Min = -0.002, Max = 0.006, Range = 0.008
  Distribution is bell shaped.
• Fact: Most of the data fall within mean +/- 2 std dev for bell shaped distributions, so the range is roughly 4 std devs wide.
• As a result, the sample std dev of the estimates is about 0.008/4 = 0.002. (Could also calculate s = 0.00173 in this case…)
Probability (starting chapter 4) • A probability is a number between zero and one that is assigned to an event. • The higher the probability, the more likely the event. Notation: Pr( event occurs ) • If the “experiment” that generates the event were repeated many times, the probability describes the fraction of the time it would occur.
One way to think about probability:
[Figure: a box representing all possible events, containing an oval labeled Event 1.]
• The total area of the box is one (think about why).
• Pr( Event 1 occurs ) = area of oval
• Pr( Event 1 does not occur ) = 1 - Pr( Event 1 occurs )
Suppose there are 2 events that are “independent” (we’ll define independence later…)
[Figure: a box representing all possible events, containing two overlapping ovals labeled Event 1 and Event 2.]
Pr( Event 1 and Event 2 ) = area of overlap = Pr( Event 1 ) * Pr( Event 2 )
Pr( Event 1 or Event 2 ) = Pr( Event 1 ) + Pr( Event 2 ) - Pr( Event 1 and Event 2 )
                         = Pr( Event 1 ) + Pr( Event 2 ) - Pr( Event 1 ) * Pr( Event 2 )
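The "and" and "or" rules above can be checked by brute force on a small sample space. The sketch below assumes two fair dice and two illustrative events that are not from the slides (die 1 shows a 1, die 2 shows a 2); the helper name `pr` is also just a placeholder.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely (die 1, die 2) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

def pr(event):
    """Fraction of outcomes for which event(outcome) is true."""
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))

# Hypothetical events, chosen only for illustration.
def event_1(o):
    return o[0] == 1  # die 1 shows a 1

def event_2(o):
    return o[1] == 2  # die 2 shows a 2

p1, p2 = pr(event_1), pr(event_2)
p_and = pr(lambda o: event_1(o) and event_2(o))
p_or = pr(lambda o: event_1(o) or event_2(o))

print(p_and, p1 * p2)           # 1/36 1/36
print(p_or, p1 + p2 - p1 * p2)  # 11/36 11/36
```

Counting outcomes directly and applying the formulas give the same numbers, which is what independence of the two dice predicts.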
Example with dice
Pr( rolling a 1 on 1 die in one roll ) = 1/6
Pr( rolling a 2 on 1 die in one roll ) = 1/6
Pr( rolling a 3 on 1 die in one roll ) = 1/6
Pr( rolling a 4 on 1 die in one roll ) = 1/6
Pr( rolling a 5 on 1 die in one roll ) = 1/6
Pr( rolling a 6 on 1 die in one roll ) = 1/6
Pr( rolling less than 4 in one roll )
  = Pr( rolling 1 or 2 or 3 in one roll )
  = Pr( rolling 1 in one roll ) + Pr( rolling 2 in one roll ) + Pr( rolling 3 in one roll ) - Pr( rolling a 1 and a 2 and a 3 in one roll )
  = 1/6 + 1/6 + 1/6 - 0 = 1/2
(A single roll cannot show two different faces, so every overlap has probability 0.)
Example with 2 dice

                      Outcome on die 1
                      1   2   3   4   5   6
                  1   .   .   .   .   .   x
                  2   .   .   .   .   x   .
Outcome on die 2  3   .   .   .   x   .   .
                  4   .   .   x   .   .   .
                  5   .   x   .   .   .   .
                  6   x   .   .   .   .   .

Each square is a possible event.
Pr( any specific event ) = 1/6 * 1/6 = 1/36
Pr( rolling a seven in total ) = 6/36 = 1/6 (squares with x's in them are 7s)
Example with 2 dice (a related interpretation)
Pr( rolling a seven in total ) = (number of ways to roll a seven) / (number of possible outcomes)
In general, when all the possible outcomes are equally probable,
Pr( event ) = (number of ways that the event could occur) / (number of possible outcomes)
What’s a simple expression for Pr( event doesn’t occur )? (hint: it involves Pr( event )…)
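The counting interpretation can be written out directly. This is a minimal sketch assuming two fair dice; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 (die 1, die 2) pairs
sevens = [o for o in outcomes if sum(o) == 7]     # the ways to roll a seven

p_seven = Fraction(len(sevens), len(outcomes))    # ways / possible outcomes
p_not_seven = 1 - p_seven                         # 1 - Pr( event )

print(sevens)
print(p_seven, p_not_seven)  # 1/6 5/6
```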
Aside: Odds
• “Odds” are related to probabilities. More specifically, the odds of an event are: “Pr( event does not occur ) / Pr( event occurs ) to 1”
• At the start of last football season, “the odds”* that the Patriots would win the Super Bowl were 250 to 1.
  (1 - pr(Pats win)) / pr(Pats win) = 250
  (1 - pr(Pats win)) = 250 * pr(Pats win)
  1 = 251 * pr(Pats win)
  pr(Pats win) = 1/251
• Q: How were the odds determined?
• A: Doing that well is how casinos make money. The precise methods are proprietary. One way is to try to estimate the probabilities from historical data…
*i.e. “the odds” = what some casino in NV thought.
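The algebra for turning odds into a probability can be packaged as a pair of small functions. This is only a sketch of the definition given above; the function names are made up for the example.

```python
from fractions import Fraction

def prob_from_odds_against(odds):
    """Convert 'odds to 1' against an event into Pr(event).

    Solves (1 - p) / p = odds for p, giving p = 1 / (odds + 1).
    """
    return 1 / (Fraction(odds) + 1)

def odds_against_from_prob(p):
    """Convert Pr(event) back into 'odds to 1' against the event."""
    p = Fraction(p)
    return (1 - p) / p

p_pats = prob_from_odds_against(250)
print(p_pats)                          # 1/251
print(odds_against_from_prob(p_pats))  # 250
```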
Example 2: A researcher mates 2 fruit flies and observes the traits of 300 offspring:

                   Wing size
Eye color      normal   miniature
normal           140        6
vermilion          3      151

What is pr(a fly in the experiment has normal eye color and normal wing size)?
What is pr(a fly in the experiment has vermilion eyes)?
What is pr(a fly in the experiment has vermilion eyes, miniature wings, or both)?
Each answer = (ways the event can occur) / (total number of possible outcomes); here that is (number of flies with the trait) / 300.
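One way to work the three fruit fly questions is to count directly from the table. The sketch below assumes the table layout is eye color by wing size, as the totals suggest; the dictionary keys and the helper `pr` are illustrative names.

```python
from fractions import Fraction

# Counts from the table, keyed by (eye color, wing size).
counts = {
    ("normal", "normal"): 140,
    ("normal", "miniature"): 6,
    ("vermilion", "normal"): 3,
    ("vermilion", "miniature"): 151,
}
total = sum(counts.values())  # 300 offspring

def pr(event):
    """Pr(event) = flies for which event(eye, wing) is true / all flies."""
    hits = sum(n for (eye, wing), n in counts.items() if event(eye, wing))
    return Fraction(hits, total)

p_normal_both = pr(lambda eye, wing: eye == "normal" and wing == "normal")
p_vermilion = pr(lambda eye, wing: eye == "vermilion")
p_verm_or_mini = pr(lambda eye, wing: eye == "vermilion" or wing == "miniature")

print(p_normal_both, p_vermilion, p_verm_or_mini)  # 7/15 77/150 8/15
```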
Independence Definition: events A and B are independent if Pr(event A and B) = Pr(A)*Pr(B) Idea: Does whether A occurs or not give you any information about whether B occurs or not? If yes, then A and B are not independent.
Independence: example
Consider the following example from a latex glove manufacturer. Each number is a count of 8-hour manufacturing shifts.

Weather        Shifts that produce defects   Shifts that produce no defects
Raining                   90                            9200
Not raining               80                           15000

Q: Are defects and weather independent?
Pr( rain ) = shifts w/ rain / total shifts = (90 + 9200)/(90+9200+80+15000) = .38
Pr( defect ) = shifts w/ defect / total shifts = (90 + 80)/(90+9200+80+15000) = .00698
Pr( defect and rain ) = shifts w/ defects and rain / total shifts = 90/(90+9200+80+15000) = .00369
.00369 does not equal (0.38)*(.00698) = .00265
So the events are not independent in this sample. (Humidity is related to defects…)
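The independence check can be reproduced from the shift counts. This sketch uses the four counts from the glove example; the conditional probabilities at the end are an extra way of asking the same question and are not computed on the slide.

```python
# Shift counts from the glove example, keyed by (weather, outcome).
shifts = {
    ("rain", "defect"): 90,
    ("rain", "no defect"): 9200,
    ("no rain", "defect"): 80,
    ("no rain", "no defect"): 15000,
}
total = sum(shifts.values())

p_rain = (shifts["rain", "defect"] + shifts["rain", "no defect"]) / total
p_defect = (shifts["rain", "defect"] + shifts["no rain", "defect"]) / total
p_both = shifts["rain", "defect"] / total

# Independence would require p_both to equal p_rain * p_defect.
print(round(p_both, 5), round(p_rain * p_defect, 5))

# Same question asked another way: does knowing the weather change the
# chance of a defect?
p_defect_given_rain = shifts["rain", "defect"] / (
    shifts["rain", "defect"] + shifts["rain", "no defect"])
p_defect_given_dry = shifts["no rain", "defect"] / (
    shifts["no rain", "defect"] + shifts["no rain", "no defect"])
print(round(p_defect_given_rain, 5), round(p_defect_given_dry, 5))
```

If the two printed pairs each contained equal numbers, the sample would be consistent with independence; here they differ.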
Random Variables
Let X be a number whose value depends on the outcome of a “chance event”.
Examples:
• A poll question is asked of 100 people: X = 0 if person 1 answers no and 1 if yes, or X = total number of yeses
• X = measurement of a board with a ruler
• X = weight of a randomly selected cat
A probability distribution function (pdf) is associated with every random variable.
Assume for now that X is discrete (takes values that can be mapped to the integers or a subset of the integers).
The probability distribution function is: p(k) = Pr( X = k ) (the argument is a number; the output is a probability)
Capital letter = random variable
Lower case letter = number
Properties of pdfs
• p(k) is greater than or equal to 0 for any k
• p(k) is less than or equal to 1 for any k
• sum of p(k) over all possible k’s = 1
The pdf is a model for how X behaves. Note that histograms estimate pdfs from data. Histograms, sample means, sample variances etc. show how observations of X actually behave.
Ways to determine PDFs: • Given in a table • Given by a formula There are “famous ones”: binomial, Poisson, hypergeometric,…
PDF - probabilities in a table
Let X = coffee cart line length at 10am

Line length (k)   Pr(k)
0                 0.10
1                 0.20
2                 0.25
3                 0.30
4                 0.15

Suppose a line of more than 4 people is impossible.
If you observe the line length on several days and make a histogram, then it will be “close” to pr(k). It gets “closer” as the number of days increases.
Pr( X >= 0 ) =
Pr( X > 3 ) =
Pr( X >= 1 ) =
Pr( X < 1 ) =
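A table pdf is easy to hold in a dictionary, which makes it possible to check the pdf properties and to fill in probabilities like the ones above by summing table entries. This is a minimal sketch; the helper name `pr` is an assumption made for the example.

```python
from fractions import Fraction

# Line-length pdf from the table (exact fractions avoid rounding noise).
pdf = {0: Fraction("0.10"), 1: Fraction("0.20"), 2: Fraction("0.25"),
       3: Fraction("0.30"), 4: Fraction("0.15")}

# Properties of a pdf: every p(k) is in [0, 1] and the p(k) sum to 1.
assert all(0 <= p <= 1 for p in pdf.values())
assert sum(pdf.values()) == 1

def pr(condition):
    """Pr(condition(X) is true) = sum of p(k) over the k that satisfy it."""
    return sum(p for k, p in pdf.items() if condition(k))

print(float(pr(lambda k: k >= 0)))  # 1.0
print(float(pr(lambda k: k > 3)))   # 0.15
print(float(pr(lambda k: k >= 1)))  # 0.9
print(float(pr(lambda k: k < 1)))   # 0.1
```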
Associated with PDFs are true Means and Variances (& true std devs)
Idea: the pdf provides a model. True means and variances are attributes of the model…
• True Mean = sum( k*p(k) ) where the sum is over all the possible k’s.
• True Variance = sum( p(k) * (k-mean)^2 ) where the sum is over all possible k’s.
• Line length:
  Mean = E(X) = 0*.1 + 1*.2 + 2*.25 + 3*.3 + 4*.15 = 2.2
  Variance = Var(X) = (0-2.2)^2*.1 + (1-2.2)^2*.2 + (2-2.2)^2*.25 + (3-2.2)^2*.3 + (4-2.2)^2*.15 = 1.46
• Sample means and sample variances are calculated from datasets.
• True means and true variances are part of the theoretical model for the data.
• KEY IDEA: as the size of the dataset becomes larger, the sample means and variances get closer to the true means and variances…
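The difference between true and sample quantities can be seen by simulation. The sketch below computes the true mean and variance of the line-length pdf, then draws simulated datasets of increasing size; the seed, the dataset sizes, and the function name `sample_stats` are arbitrary choices for illustration.

```python
import random

pdf = {0: 0.10, 1: 0.20, 2: 0.25, 3: 0.30, 4: 0.15}  # line-length pdf

# True mean and variance: attributes of the model.
true_mean = sum(k * p for k, p in pdf.items())
true_var = sum(p * (k - true_mean) ** 2 for k, p in pdf.items())
print(round(true_mean, 2), round(true_var, 2))  # 2.2 1.46

def sample_stats(n, rng):
    """Sample mean and sample variance of n simulated line lengths."""
    data = rng.choices(list(pdf), weights=list(pdf.values()), k=n)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)
    return mean, var

rng = random.Random(0)  # arbitrary seed, only for reproducibility
for n in (10, 1000, 100000):
    print(n, sample_stats(n, rng))
```

With larger n the printed sample values should settle near 2.2 and 1.46, which is the KEY IDEA in action.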
Powerball Example:

# Winners:     0     1     2     3     4     5     6
Probability:   8%    21%   26%   21%   13%   9%    2%

(These are estimates based on historical data, but assume that they are the truth for the sake of the example.)
Probability that I am a winner if I buy 1 ticket = 1/80 million (= 1/80M).
Jackpot (pre tax) = $200 million (if more than 1 person wins, the jackpot is divided).
Assume whether or not I win is independent of the number of winners.
Let X = millions of dollars I win from one ticket.
PDF:

x       pr(x)
0       ?
200     .2283/80M
100     .2826/80M
66.7    .2283/80M
50      .1413/80M
40      .0978/80M
33.3    .0217/80M

1) What does “?” equal? (And how did I compute the other pr(x)’s?) (see next slide for ans)
2) mu = E(X) = sum(x * pr(x)). In dollars this is about $1.26. (you can confirm this)
3) Var(X) = sum((x - mu)^2 * pr(x))
4) Interpretation: If I play Powerball a lot when there is a $200 million jackpot, then I can expect to win $1.26 on average.
5) If tickets are a dollar each, why doesn’t Powerball lose money? (These numbers are all based on real data.)
Answers to question on previous slide
• The ? = (80million-1)/80million
  • You know this is true since the probability that I do not win is one minus the probability that I win (and the probability that I win is given to be 1/80 million).
• How did I compute the pr(x)’s:
  • The probability that I win $200million
    = Pr(I win and there is only one winner given that there is at least 1 winner)
    = Pr(I win) * Pr(there is only one winner given that there is at least one winner)   [uses independence]
    = (1/80million) * (0.21/(.21+.26+.21+.13+.09+.02))   [uses the rule for conditional probability on page 141]
    = (1/80million) * 0.2283
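The whole Powerball pdf, its mean, and its variance can be rebuilt from the number-of-winners table. This is a sketch under the slide's stated assumptions (independence, jackpot split evenly); the variable names are illustrative.

```python
# Number-of-winners probabilities and ticket odds, taken from the slides.
p_winners = {0: 0.08, 1: 0.21, 2: 0.26, 3: 0.21, 4: 0.13, 5: 0.09, 6: 0.02}
p_i_win = 1 / 80e6   # Pr(my one ticket wins)
jackpot = 200.0      # millions of dollars

# Pr(k winners | at least one winner) = p(k) / (1 - p(0)).
p_at_least_one = 1 - p_winners[0]

# pdf of X = millions of dollars I win, keyed by payout x.
pdf_x = {0.0: 1 - p_i_win}
for k in range(1, 7):
    pdf_x[jackpot / k] = p_i_win * p_winners[k] / p_at_least_one

mu = sum(x * p for x, p in pdf_x.items())               # millions of dollars
var = sum((x - mu) ** 2 * p for x, p in pdf_x.items())  # (millions of dollars)^2

print(round(mu * 1e6, 2))  # expected winnings in dollars, about 1.26
print(var)
```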
Cumulative Probability:
• A cumulative probability is the probability that X is less than or equal to some number.
• Ex: Powerball, with X = number of winners:
  Pr(there are 3 or fewer winners) = Pr(X<=3) = Pr(X=0 or X=1 or X=2 or X=3) = Pr(no winners)+Pr(1 winner)+Pr(2 winners)+Pr(3 winners) = 8%+21%+26%+21% = 76%
• Notation: F(3) = Pr(X <= 3) (F(k) = Pr(X<=k) is called the Cumulative Distribution Function or CDF)
• If this helps, think of F(k) as the integral of the PDF from 0 to k (for a discrete X, the sum of p(j) for j from 0 to k).
• Note: Pr(X > 3) = 1-Pr(X<=3) (careful about > and <=…)
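For a discrete random variable the CDF is just a running sum of the pdf. The sketch below builds F(k) for the number of Powerball winners with `itertools.accumulate`; the variable names are illustrative.

```python
from itertools import accumulate

# pdf of X = number of Powerball winners, from the slides.
ks = [0, 1, 2, 3, 4, 5, 6]
pdf = [0.08, 0.21, 0.26, 0.21, 0.13, 0.09, 0.02]

# F(k) = Pr(X <= k) is the running sum of the pdf.
cdf = dict(zip(ks, accumulate(pdf)))

print(round(cdf[3], 2))      # Pr(X <= 3) = 0.76
print(round(1 - cdf[3], 2))  # Pr(X > 3)  = 0.24
```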
Graphically:
[Figure: PDF for the random variable that represents the number of winners; the regions for X <= 3 are shaded.]
Pr( X <= 3 ) = sum of the areas of the shaded regions = 1 - Pr( X > 3 ) = 1 - sum of the areas of the white regions
“Famous” PDFs
• Binomial: “X~bin(n,p)”
• Setup
  • Let X = number of successes out of n identical trials
  • n identical independent trials
  • Each trial results in a success w/ probability p or failure with probability q = 1-p
  • X could possibly be 0,…,n
• PDF:
  • Pr(X = k) = (n choose k) * p^k * q^(n-k)
  • (n choose k) = number of ways to choose k things from n things = n! / (k! (n-k)!)
  • Note that n! = n*(n-1)*…*2*1
  • Also, 0! = 1
• Expectation = E(X) = np
• Variance = Var(X) = npq
• StdDev = sqrt(Var(X))
Example:
• Suppose each person in a 5 person class comes with probability 0.85.
• Let X = number of people in class on a given day.
• What’s the probability that 4 people show up one day?
• X~bin(5,0.85)
• Pr(X = 4) = (5 choose 4) * 0.85^4 * 0.15^1 = 5 * 0.85^4 * 0.15^1 = 0.3915047
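The binomial pdf translates directly into a few lines of code. This sketch uses `math.comb` for (n choose k); the function name `binom_pmf` is a made-up helper, not something from the course materials.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """Pr(X = k) for X ~ bin(n, p): (n choose k) * p^k * q^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 5, 0.85
print(binom_pmf(4, n, p))                             # about 0.3915
print(sum(binom_pmf(k, n, p) for k in range(n + 1)))  # about 1 (a pdf sums to 1)

mean = n * p           # E(X) = np
var = n * p * (1 - p)  # Var(X) = npq
print(mean, var, sqrt(var))
```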
Why the binomial pdf is correct: Example:
• 5 students. Each attends with probability 0.85. What’s the probability of exactly 4 successes?
• There are 5 choose 4 = 5!/(4!*1!) = 5 possible configurations of students (YYYYN, YYYNY, etc).
• Each configuration has probability 0.85^4 * 0.15^1 (probability of 4 people coming times probability of 1 person not coming; remember the “and” rule for independent events).
• Pr(X = 4) = 5 * 0.85^4 * 0.15^1 = 39% (we’re using the “or” rule here: person 1 doesn’t come, or person 2 doesn’t come, or…)
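The argument above can be checked by listing every attendance pattern explicitly. This is a brute-force sketch, assuming "Y" means a student attends and "N" means absent as in the configurations on the slide.

```python
from itertools import product

p_come = 0.85

total = 0.0
configs = []
for config in product("YN", repeat=5):  # Y = attends, N = absent
    if config.count("Y") == 4:
        # "and" rule: multiply the five independent per-student probabilities.
        prob = 1.0
        for c in config:
            prob *= p_come if c == "Y" else 1 - p_come
        configs.append("".join(config))
        # "or" rule: configurations are mutually exclusive, so add.
        total += prob

print(configs)  # the configurations with exactly 4 attendees
print(total)    # about 0.3915
```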
“Famous” PDFs
• Poisson: “X~Pois(r)”
• Setup
  • Let X = number of occurrences of an event in time or space
  • Events are expected to occur at rate r
  • X could possibly be 0,1,2,…
• PDF:
  • Pr(X = k) = r^k * e^(-r) / k!
  • e is 2.718…
  • Note that 0! = 1
• Expectation = E(X) = r
• Variance = Var(X) = r
• StdDev = sqrt(Var(X))
One could show why the Poisson PDF is correct, but the math is more involved. If you’re interested, come talk to me sometime.
Example:
• Inspect an experimental rat’s brain for tumorous cells. You expect 10 tumorous cells in 60 mm^3 of brain. What’s the probability that you see either 2 or 3 tumorous cells in 10 mm^3?
• X = number of tumorous cells found in 10 mm^3 of brain. X~Pois(5/3) (rate per 60 mm^3 is 10, so rate per 10 mm^3 is 10/6 = 5/3)
• Pr(X = 2 or 3) = Pr(X = 2) + Pr(X = 3) = (5/3)^2 * e^(-5/3) / 2! + (5/3)^3 * e^(-5/3) / 3! = 41%
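The Poisson calculation can be confirmed numerically. A minimal sketch, with `poisson_pmf` as an illustrative helper name:

```python
from math import exp, factorial

def poisson_pmf(k, r):
    """Pr(X = k) for X ~ Pois(r): r^k * e^(-r) / k!."""
    return r**k * exp(-r) / factorial(k)

rate = 10 / 6  # expected tumorous cells per 10 mm^3 (10 per 60 mm^3)
p_2_or_3 = poisson_pmf(2, rate) + poisson_pmf(3, rate)
print(round(p_2_or_3, 2))  # 0.41
```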
“Famous” PDFs
• Hypergeometric: “X~Hyp(N,M,n)”
• Setup
  • There are a total of N items. M are of type A and N-M are of type B. n items are chosen at random without replacement.
  • Let X = number of chosen items that are type A
• PDF:
  • Pr(X = k) = (M choose k) * (N-M choose n-k) / (N choose n)
  • Remember: (n choose k) = number of ways to choose k things from n things = n! / (k! (n-k)!)
  • Note that 0! = 1
• Note that the binomial is like the hypergeometric, but the binomial is with replacement… (which results in a fixed p)
Hypergeometric Example
• Cards: probability of being dealt a flush in hearts in a hand of poker (flush = all cards of same suit)
• X = number of hearts in the hand
• N = 52
• M = 52/4 = 13
• n = 5
• Want Pr(X=5) = (13 choose 5) * (39 choose 0) / (52 choose 5) = 1287 * 1 / 2598960 = 0.0004951981 (NOTE THAT THIS NUMBER IS DIFFERENT FROM WHAT I WROTE ON THE BOARD IN THE CLASS)
• What’s the probability of getting a flush in any suit?
• (see Minitab: Calc: Probability Distributions: Hypergeometric)
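The hypergeometric pdf can also be evaluated without Minitab. The sketch below reproduces the hearts calculation and then takes one possible approach to the any-suit question, assuming the four single-suit flushes are mutually exclusive and equally likely; `hyper_pmf` is an illustrative helper name.

```python
from fractions import Fraction
from math import comb

def hyper_pmf(k, N, M, n):
    """Pr(X = k) for X ~ Hyp(N, M, n)."""
    return Fraction(comb(M, k) * comb(N - M, n - k), comb(N, n))

p_hearts = hyper_pmf(5, N=52, M=13, n=5)
print(float(p_hearts))  # about 0.000495

# A hand cannot be a flush in two suits at once, so the four suit
# probabilities add (the "or" rule with zero overlap).
p_any_suit = 4 * p_hearts
print(float(p_any_suit))
```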
For each of the following:
• What is the random variable?
• What is its distribution and what are the numbers for its parameters?
• What is the probability that is being asked for?
• How can it be computed from the probability distribution function?
More Examples:
• There are 4 security checkpoints. The probability of being searched at any one is 0.2. You may be searched more than once and all searches are independent. What’s the probability of being searched at least one time?
• 50 geese in a flock of 200 are tagged by a wildlife biologist. The next year, 10 geese from the flock are captured. Assume the flock still has 200 geese and no tags are lost. What’s the probability that at least 5 of the recaptured geese have tags?
• Suppose a written test has 5 True/False questions. Passing = at least 3 correct answers and the test can be taken at most 3 times. (Assume no learning occurs between tests if one fails!)
  • If one randomly guesses, what’s the probability of passing?
  • What’s the probability that someone who randomly guesses will eventually pass?
• An overloaded server receives an average of 25 emails per second at 12:00PM. If it receives more than 30 emails in a second, it will crash. What’s the probability of a crash at 12:00PM on a given day (based on the traffic in the previous 1 second)?
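One possible way to set up these examples in code is sketched below. The choice of distribution for each problem is an interpretation of the wording (for instance, random guessing is taken to mean probability 0.5 per question, and test attempts are treated as independent), and the helper names are illustrative.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hyper_pmf(k, N, M, n):
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

def poisson_pmf(k, r):
    return r**k * exp(-r) / factorial(k)

# Checkpoints: X ~ bin(4, 0.2), want Pr(X >= 1) = 1 - Pr(X = 0).
p_searched = 1 - binom_pmf(0, 4, 0.2)

# Geese: X ~ Hyp(200, 50, 10), want Pr(X >= 5).
p_tags = sum(hyper_pmf(k, 200, 50, 10) for k in range(5, 11))

# Test: X ~ bin(5, 0.5) correct answers per attempt, want Pr(X >= 3);
# with at most 3 independent attempts, Pr(eventually pass) = 1 - Pr(fail)^3.
p_pass = sum(binom_pmf(k, 5, 0.5) for k in range(3, 6))
p_eventually = 1 - (1 - p_pass) ** 3

# Server: X ~ Pois(25) emails in one second, want Pr(X > 30) = 1 - Pr(X <= 30).
p_crash = 1 - sum(poisson_pmf(k, 25) for k in range(31))

for name, value in [("searched", p_searched), ("tags", p_tags),
                    ("pass", p_pass), ("eventually", p_eventually),
                    ("crash", p_crash)]:
    print(name, round(value, 4))
```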