Explore the probability of the Patriots winning at least 19 coin tosses out of 25 and the odds of winning at craps. Understand the binomial distribution and house edge.
FIN 30210: Managerial Economics Statistical Analysis
Part I: Probability
Probability is about having the truth
• Here are the odds for Blackjack… remember, what happens in Vegas stays in Vegas
• The Cubs have a 12% chance of winning the World Series this year
“Patriots have no need for probability, win coin flip at impossible clip” (Nov. 4th, 2015)
“Belichick has also been extremely lucky. The Pats have won the coin toss 19 of the last 25 times, according to the Boston Globe's Jim McBride.”
So, what are the odds that the Patriots can win at least 19 out of 25 flips?
To do this, we need a probability distribution. For a coin toss, each outcome (head or tail) has probability 1/2.
Side Note: 50 Super Bowl Coin Tosses: Heads: 24 (48%), Tails: 26 (52%)
So, suppose that we wanted the odds that the Patriots got 19 wins in a row…
Probability(A and B) = Probability(A) * Probability(B), for independent events A and B
Probability(A and B) = Probability(A) * Probability(B), and each toss is won with probability 1/2
So, we want the probability of 19 wins in a row: (1/2)^19 (.000191%, 1 in 524,288)
The odds of dying from an asteroid collision with Earth in the next 100 years are 1 in 500,000
This isn’t really what we want, though: getting 19 wins in a row is only one of many ways to get 19 out of 25
What are the odds that the Patriots get 24 out of 25 wins?
Probability(A and B) = Probability(A) * Probability(B), for independent events
Probability(A or B) = Probability(A) + Probability(B), for mutually exclusive events
There are LOTS of ways to get exactly 24 out of 25 wins
• One way would be a loss followed by 24 wins: L W W … W (.00000298%)
• Another way would be a win, then a loss, then 23 wins: W L W … W (.00000298%)
In fact, there are 25 ways to get 24 out of 25 wins (the single loss can fall on any of the 25 tosses), so the answer would be 25 * .00000298% (.000075%, 1 in 1.3 million)
The odds of becoming a movie star are 1 in 1.5 million
The probability of a given number of wins out of a certain number of tries is given by a binomial distribution: k successes in n tries, where the probability of success on each try is p
P(k) = [n! / (k! (n - k)!)] * p^k * (1 - p)^(n - k)
Note: for 24 out of 25 wins, n! / (k! (n - k)!) = 25, the number of ways counted earlier
So, the probability that the Patriots get EXACTLY 19 out of 25 wins would be about .53% (1 in 189)
So, the probability that the Patriots get AT LEAST 19 out of 25 wins would be .73% (1 in 137)
The odds of Notre Dame winning the national title in football this year are 1 in 40
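The coin-toss numbers can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides; the function names `binomial_pmf` and `binomial_at_least` are made up for illustration, and a fair coin (p = 0.5) is assumed.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent tries."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_at_least(k: int, n: int, p: float) -> float:
    """Probability of k or more successes in n independent tries."""
    return sum(binomial_pmf(j, n, p) for j in range(k, n + 1))


n, p = 25, 0.5  # 25 tosses of a fair coin (assumption)

exactly_24 = binomial_pmf(24, n, p)
exactly_19 = binomial_pmf(19, n, p)
at_least_19 = binomial_at_least(19, n, p)

print(f"exactly 24 of 25:  {exactly_24:.7%}")
print(f"exactly 19 of 25:  {exactly_19:.2%}")
print(f"at least 19 of 25: {at_least_19:.2%}")
```

Exactly 19 wins comes out near 0.53% and at least 19 wins near 0.73%, which is where the "1 in 137" figure comes from.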
Here’s the binomial distribution for 25 tosses
[Figure: probability (%) of each number of wins in 25 tosses. Odds of 12 or less = 50%; odds of 13 or more = 50%. Chart labels: 31% and .73%.]
On the other side of the proverbial coin is losing the toss a lot.
• In 2011, the Cleveland Browns lost 11 in a row (.049%, 1 in 2,048). The odds of fatally slipping in the shower are 1 in 2,500
• In 2012, the Carolina Panthers lost 12 in a row (.024%, 1 in 4,096). The odds of getting a hole in one in golf are 1 in 5,000
What are your odds of winning at craps? Easiest Bet – Playing the Pass Line
The Game of Craps – Playing the Pass Line
• If you roll a 2, 3, or 12, you lose (“crap out”)
• If you roll a 7 or 11, you win
• If you roll a 4, 5, 6, 8, 9, or 10, the rolled number becomes the “point”
• If you roll the point, you win; if you roll a seven before rolling the “point”, you lose
• The Pass Line pays even odds
[Figure: probability of each number (1 through 6) on a single die, 1/6 each]
Come Out
• Win (7 or 11) = 22%
• Lose (2, 3, or 12) = 11%
• Roll Again (a “point”) = 67%, made up of: 4 or 10 = 17%, 5 or 9 = 22%, 6 or 8 = 28%
[Figure: probability (%) by come-out total: 2 or 3 “Craps” (8%), 4 to 6 “Point” (33%), 7 “Win” (17%), 8 to 10 “Point” (33%), 11 “Win” (6%), 12 “Craps” (3%)]
67 percent of the time, you have a “point” to make. On each roll after that:
• Come Out = 4 or 10: Win = 8%, Lose = 17%, Roll Again = 75%
• Come Out = 5 or 9: Win = 11%, Lose = 17%, Roll Again = 72%
• Come Out = 6 or 8: Win = 14%, Lose = 17%, Roll Again = 69%
[Figure: probability (%) of each two-dice total, from 3% for a 2 or 12 up to 17% for a 7]
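The come-out percentages can be reproduced by listing all 36 equally likely rolls of two dice. A minimal sketch, assuming two fair six-sided dice; the variable names are chosen here for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die 1, die 2) pairs give each total.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))
prob = {total: Fraction(count, 36) for total, count in ways.items()}

come_out_win = prob[7] + prob[11]              # 7 or 11 wins immediately
come_out_lose = prob[2] + prob[3] + prob[12]   # 2, 3, or 12 "craps out"
come_out_point = 1 - come_out_win - come_out_lose

print("win: ", come_out_win, f"({float(come_out_win):.1%})")
print("lose:", come_out_lose, f"({float(come_out_lose):.1%})")
print("point:", come_out_point, f"({float(come_out_point):.1%})")

# Chance of each pair of points being set on the come-out roll.
for pair in ((4, 10), (5, 9), (6, 8)):
    p_pair = prob[pair[0]] + prob[pair[1]]
    print(pair, f"{float(p_pair):.1%}")
```

Working with `Fraction` keeps the results exact (2/9, 1/9, and 2/3), so any rounding happens only when printing.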
So, what's the probability that you win with a pass bet? These are a bit tricky…..
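What’s the probability that you roll a 4 before you roll a 7? There are many ways it can happen:
• Roll a 4
• Roll something other than a 4 or 7, then roll a 4
• Roll something other than a 4 or 7 twice, then roll a 4
• Roll something other than a 4 or 7 three times, then roll a 4
• …and so on
A useful bit of math: for 0 < r < 1, the infinite sum 1 + r + r^2 + r^3 + … equals 1 / (1 - r)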
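Here is a rough sketch of that calculation in Python. It assumes the "useful bit of math" is the geometric series, which the extracted slide text does not spell out, and it adds up the cases listed above (a 4 right away, a 4 after one neutral roll, after two, and so on).

```python
from fractions import Fraction

p_point = Fraction(3, 36)   # three ways to roll a 4: (1,3), (2,2), (3,1)
p_seven = Fraction(6, 36)   # six ways to roll a 7
p_other = 1 - p_point - p_seven  # any roll that is neither a 4 nor a 7

# Add up "k neutral rolls, then a 4" for k = 0, 1, 2, ...
# 200 terms is an arbitrary cutoff; the remaining terms are negligible.
partial_sum = sum(p_other**k * p_point for k in range(200))

# Geometric series: p_point * (1 + r + r^2 + ...) = p_point / (1 - r)
closed_form = p_point / (1 - p_other)

print("partial sum:", float(partial_sum))
print("closed form:", closed_form)
```

The closed form simplifies to p_point / (p_point + p_seven): only the rolls that end the round matter, so a 4 beats a 7 with probability 3 / (3 + 6) = 1/3.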
Playing the Pass Line
Win = 49.3%, Loss = 50.7%
Win - Loss = -1.4%. This is known as the “House Edge”
“The Gambler’s Ruin”: A gambler who keeps playing a negative expected value game will eventually go broke with probability one!
Playing the Pass Line
Expected Value measures the average outcome over a large number of attempts, given the probabilities of each outcome.
Playing the Pass Line: Expected Percentage Loss (House Edge) for a $1 Pass Bet
Expected Value = (49.3%)(+$1) + (50.7%)(-$1) = -$0.014, a loss of about 1.4% of the bet
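To see where the pass line figures come from, here is a minimal sketch that combines the come-out roll with the chance of making each point. It assumes fair dice and an even-money payout on a $1 bet; `WAYS` and the other names are illustrative, not from the slides.

```python
from fractions import Fraction

# Number of ways (out of 36) to roll each two-dice total.
WAYS = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 4, 10: 3, 11: 2, 12: 1}
prob = {total: Fraction(w, 36) for total, w in WAYS.items()}

# Immediate win on the come-out roll.
p_win = prob[7] + prob[11]

# Otherwise a point is set and must be rolled again before a 7.
for point in (4, 5, 6, 8, 9, 10):
    p_make_point = prob[point] / (prob[point] + prob[7])
    p_win += prob[point] * p_make_point

p_lose = 1 - p_win

# Expected value of a $1 even-money bet: win $1 or lose $1.
expected_value = p_win * 1 + p_lose * (-1)

print(f"win:  {float(p_win):.2%}")
print(f"lose: {float(p_lose):.2%}")
print(f"expected value per $1: {float(expected_value):+.4f}")
```

The exact win probability is 244/495, so the expected value is -7/495 of the bet, which rounds to the 1.4% house edge.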
Suppose that the first roll is a 4. I can now make an additional bet that a 4 is rolled before a 7. This is called “Playing the odds”. The house pays odds equal to the true odds, so the house edge on this additional bet is ZERO! This is the only fair bet in Vegas!
Suppose that you can bet twice your initial bet on the odds
Whatever your initial Pass/Don’t Pass wager, you can up your bet on a point as follows:
• You can bet 2X your initial bet if your point is 4 or 10 (pays 2 to 1)
• You can bet 2X your initial bet if your point is 5 or 9 (pays 3 to 2)
• You can bet 2X your initial bet if your point is 6 or 8 (pays 6 to 5)
For a $1 Initial Bet, Playing Pass w/ 2x odds: Expected Percentage Loss (House Edge)
The expected loss is the same, but your overall bet is bigger, so the percentage loss is smaller!
A common betting system for casino craps is the “3-4-5” system
Whatever your initial Pass/Don’t Pass wager, you can up your bet on a point as follows:
• You can bet 3X your initial bet if your point is 4 or 10 (pays 2 to 1)
• You can bet 4X your initial bet if your point is 5 or 9 (pays 3 to 2)
• You can bet 5X your initial bet if your point is 6 or 8 (pays 6 to 5)
For a $1 Initial Bet, Playing Pass w/ 3-4-5 odds: Expected Percentage Loss (House Edge)
The bigger the multiple allowed, the smaller the house edge!
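A rough sketch of how the odds bet dilutes the house edge. It assumes the edge is measured as expected loss divided by the expected total amount wagered, and that the odds bet itself has zero expected value (as stated for "playing the odds"). The function name `house_edge_with_odds` is hypothetical.

```python
from fractions import Fraction

# Ways (out of 36) to set each point on the come-out roll.
POINT_WAYS = {4: 3, 5: 4, 6: 5, 8: 5, 9: 4, 10: 3}

# Expected loss on a $1 pass bet: 251/495 - 244/495 = 7/495 (about 1.4 cents).
PASS_EXPECTED_LOSS = Fraction(7, 495)


def house_edge_with_odds(multiples: dict) -> Fraction:
    """Expected loss per dollar wagered, given odds multiples by point.

    `multiples` maps each point to the odds bet as a multiple of the $1
    pass bet. The odds bet is fair, so it adds to the amount wagered
    without changing the expected loss.
    """
    expected_odds_bet = sum(
        Fraction(POINT_WAYS[point], 36) * multiple
        for point, multiple in multiples.items()
    )
    return PASS_EXPECTED_LOSS / (1 + expected_odds_bet)


no_odds = house_edge_with_odds({})
double_odds = house_edge_with_odds({pt: 2 for pt in POINT_WAYS})
three_four_five = house_edge_with_odds({4: 3, 10: 3, 5: 4, 9: 4, 6: 5, 8: 5})

for label, edge in [("pass only", no_odds), ("2x odds", double_odds),
                    ("3-4-5 odds", three_four_five)]:
    print(f"{label:>10}: {float(edge):.3%}")
```

Under these assumptions the edge drops from about 1.41% with no odds to about 0.61% with 2x odds and about 0.37% with 3-4-5 odds.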
Here’s a comparison of casino edges on other games…
[Table: house edges for craps bets and for other games]
What are the odds that it will be 80 degrees tomorrow in South Bend? As with the first two examples, this involves a probability distribution
Just as with the coin flip or the dice roll, we can imagine a “truth” out there governing South Bend temperatures. This “truth”, again, is in the form of a probability distribution.
[Figure: probability distribution of temperature]
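We can use the normal distribution to get the probability that the temperature lies within various ranges
[Figure: normal distribution of temperature. On each side of the mean, the successive bands are labeled 34%, 13.5%, 2.3%, and 0.2%; the ranges around the mean are labeled 68%, 95%, and 99.6%.]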
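A minimal sketch of how such range probabilities can be computed, using only the standard library. The helper names `normal_cdf` and `prob_between` are made up here, and the example works in units of standard deviations rather than actual temperatures.

```python
from math import erf, sqrt


def normal_cdf(x: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Probability that a normal variable is less than or equal to x."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))


def prob_between(lo: float, hi: float, mean: float, sd: float) -> float:
    """Probability that a normal variable falls between lo and hi."""
    return normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)


# Probability of landing within 1, 2, and 3 standard deviations of the mean.
within = {k: prob_between(-k, k, mean=0.0, sd=1.0) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} sd: {p:.2%}")
```

The three results are roughly 68%, 95%, and 99.7%, so the chart labels are rounded versions of these values.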
So, for example…
[Figure: the same normal distribution of temperature, with the 68%, 95%, and 99.6% ranges marked]
Conditional distributions give us probabilities conditional on some observable information
What is the probability that the temperature in South Bend is greater than 15 degrees?
[Figure: two temperature distributions, one unconditional and one conditional on February, with shaded probabilities labeled 99.8% and 16%]
Part II: Statistics Statistics is about finding the truth
Law of large numbers: In statistics, as the number of identically distributed, randomly generated variables increases, their sample mean (average) approaches their theoretical mean. The law of large numbers was first proved by the Swiss mathematician Jakob Bernoulli (1655 - 1705).
[Figure: as the number of data points increases, the sample statistics approach “The Truth” (the population mean and population variance)]
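The law of large numbers is easy to watch in a simulation. This is a small sketch, assuming fair coin flips coded as 0 or 1 (so the theoretical mean is 0.5); the seed and the function name `running_means` are arbitrary choices for illustration.

```python
import random


def running_means(n_flips: int, seed: int = 0) -> list:
    """Sample mean of fair coin flips (0 or 1) after each additional flip."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    means = []
    for i in range(1, n_flips + 1):
        total += rng.randint(0, 1)
        means.append(total / i)
    return means


means = running_means(100_000)

# The sample mean drifts toward the theoretical mean of 0.5 as n grows.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7}: sample mean = {means[n - 1]:.4f}")
```

Early values can be far from 0.5, but the later ones settle close to it, which is the sense in which more data gets us closer to "the truth".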
Average Monthly Temperatures in Indiana from 1894 - 2016
We have average monthly temperatures (1894 – 2016) for 36 locations across Indiana. This is what we would call a “cross sectional” dataset (multiple observations at a single point in time)
Sample Statistics
• Average = 50.7
• Std. Dev. = 16.3
• High = 78.1
• Low = 22.1
[Figure: histogram of the temperatures, frequency (%). Sample Statistics: Average = 50.7, Std. Dev. = 16.3, High = 78.1, Low = 22.1]
Suppose that we condition on “Northern Indiana” or “Southern Indiana”
“Northern Indiana” Sample Statistics
• Average = 47.9
• Std. Dev. = 16.8
• High = 72.6
• Low = 22.1
“Southern Indiana” Sample Statistics
• Average = 53.1
• Std. Dev. = 15.7
• High = 78.1
• Low = 24.2
[Figures: histograms of temperature for each region, frequency (%)]
I could also condition on month(s) of the year
January - March Sample Statistics
• Average = 32.7
• Std. Dev. = 6.5
• High = 47.5
• Low = 22.1
June - August Sample Statistics
• Average = 70.8
• Std. Dev. = 2.6
• High = 78.8
• Low = 65.1
[Figures: histograms of temperature for each period, frequency (%)]
Or individual months of the year
February Sample Statistics
• Average = 32.7
• Std. Dev. = 3.5
• High = 37.7
• Low = 23.6
July Sample Statistics
• Average = 72.6
• Std. Dev. = 1.9
• High = 78.1
• Low = 69.0
[Figures: histograms of temperature for each month, frequency (%)]
Or individual months of the year and locations
Northern Indiana in February Sample Statistics
• Average = 25.9
• Std. Dev. = 1.4
• High = 27.4
• Low = 23.6
Northern Indiana in July Sample Statistics
• Average = 70.8
• Std. Dev. = 1.1
• High = 72.6
• Low = 69.0
For Indiana: “I’m 95% sure that the temperature for September will be between 18 and 83 degrees” (the average of 50.7, plus or minus two standard deviations of 16.3)
[Figure: normal distribution of temperature, with the 68%, 95%, and 99.6% ranges marked]
So, for example… for Northern Indiana in September: “I’m 95% sure that the temperature for September will be between 60 and 64 degrees”
[Figure: sample statistics and normal distribution of temperature, with the 68%, 95%, and 99.6% ranges marked]
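The "95% sure" ranges follow from the rule of thumb that about 95% of a normal distribution lies within two standard deviations of the mean. A minimal sketch: the Indiana-wide average (50.7) and standard deviation (16.3) are the sample statistics quoted earlier, while the Northern Indiana September values (62 and 1) are an assumption worked backwards from the quoted 60 to 64 range, since those statistics are not in the extracted text.

```python
def two_sd_range(mean: float, sd: float) -> tuple:
    """Approximate 95% range for a normal variable: mean plus or minus 2 sd."""
    return mean - 2 * sd, mean + 2 * sd


# All of Indiana, all months: statistics quoted earlier in the slides.
indiana_low, indiana_high = two_sd_range(mean=50.7, sd=16.3)

# Northern Indiana in September: ASSUMED values, inferred from the 60-64 range.
north_low, north_high = two_sd_range(mean=62.0, sd=1.0)

print(f"Indiana:               {indiana_low:.1f} to {indiana_high:.1f} degrees")
print(f"N. Indiana, September: {north_low:.1f} to {north_high:.1f} degrees")
```

Conditioning on place and month shrinks the standard deviation, which is why the second range is so much tighter than the first.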
Regressions are about estimating conditional distributions
y = alpha + beta * x + error
where y is the explained variable, x is the independent variable, error is the error term, and alpha and beta are the parameters to be estimated
Linear regressions make several key assumptions:
• Linear Relationship
• Multivariate Normality
• No or Little Multicollinearity
• No Auto-correlation
• Homoscedasticity
[Figure: conditional distribution of Y (frequency)]
The OLS (Ordinary Least Squares) method estimates the parameters alpha and beta by minimizing the sum of squared errors. The result is a pair of estimated coefficients.
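For a single independent variable, the least-squares estimates have a well-known closed form: the slope is the co-variation of x and y divided by the variation of x, and the intercept makes the line pass through the point of means. Here is a minimal sketch; the function name `ols_fit` and the data are made up for illustration.

```python
def ols_fit(x: list, y: list) -> tuple:
    """Return (alpha_hat, beta_hat) minimizing the sum of squared errors
    for the line y = alpha + beta * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Co-variation of x and y, and variation of x, around their means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Made-up data that lies roughly along y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha_hat, beta_hat = ols_fit(x, y)
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

With data that sits exactly on a line, such as `ols_fit([0, 1, 2], [1, 3, 5])`, the function recovers that line's intercept and slope.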
We also have a set of error terms. These errors are a sampling of the population of errors.
[Figure: frequency distribution of the errors]
Each regression gives us a sample of the distribution of errors (not the entire population of errors). Therefore, the estimated coefficients are not the true coefficients; rather, they are draws from a distribution of possible estimates around the true parameter values.
[Figure: frequency distributions of the estimated coefficients]
A few important things regarding these parameter estimates…
The estimated parameters are drawn from a distribution with a mean equal to the true parameter value: we are not making biased predictions!
These parameters are unknown, so we need to estimate them from the data
1) The variance of the parameter estimates is smaller (the estimates are more precise) when the variance of x is large
2) As the number of observations gets large, the variance approaches zero: we learn the truth!
Law of large numbers, again: as the number of observations gets big, the sample estimates approach the population parameters.
We also have some additional “diagnostics” to check the performance of the regression
Total Sum of Squares = Regression Sum of Squares + Sum of Squared Residuals
• Total Sum of Squares: total variation in the data we are trying to explain
• Regression Sum of Squares: total variation in the data we have actually explained
• Sum of Squared Residuals: total variation in the data left unexplained
R Squared of the Regression: the percentage of the variation of Y explained in the regression
Standard Error of the Regression: this is the average error of our estimates
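These diagnostics can be sketched for a one-variable regression as follows. The data and the function name `regression_diagnostics` are made up for illustration, and the standard error divides by n - 2 on the assumption that two parameters (alpha and beta) were estimated.

```python
from math import sqrt


def regression_diagnostics(x: list, y: list) -> dict:
    """Fit y = alpha + beta * x by OLS and return sums of squares,
    R squared, and the standard error of the regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    beta = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))
    alpha = y_bar - beta * x_bar

    fitted = [alpha + beta * xi for xi in x]
    total_ss = sum((yi - y_bar) ** 2 for yi in y)          # to be explained
    regression_ss = sum((fi - y_bar) ** 2 for fi in fitted)  # explained
    residual_ss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # left over

    return {
        "total_ss": total_ss,
        "regression_ss": regression_ss,
        "residual_ss": residual_ss,
        "r_squared": regression_ss / total_ss,
        # n - 2 degrees of freedom: two estimated parameters (assumption).
        "std_error": sqrt(residual_ss / (n - 2)),
    }


# Made-up data that lies roughly along y = 2x.
diag = regression_diagnostics([1.0, 2.0, 3.0, 4.0, 5.0],
                              [2.1, 3.9, 6.2, 7.8, 10.1])
for name, value in diag.items():
    print(f"{name:>14}: {value:.4f}")
```

For an OLS fit with an intercept, the explained and unexplained pieces add up to the total, so R squared can equally be computed as 1 minus residual_ss / total_ss.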
If we would like to make a forecast using our regression data, we need to calculate the conditional distribution
[Figure: conditional distribution of the forecast (frequency)]
• Note that since our estimates are unbiased, our forecasts will also be unbiased!
• As our sample size gets bigger, the variance of our forecasts goes down (our forecasts get more precise)
• If the variance of X is big, we get better forecasts