Finance 30210: Managerial Economics

Demand Estimation and Forecasting

What are the odds that a fair coin flip results in a head? What are the odds that the toss of a fair die results in a 5? What are the odds that tomorrow’s temperature is 95 degrees?

The answers to all these questions come from a probability distribution. A probability distribution is a collection of probabilities describing the odds of any particular event.
[Figure: coin flip, probability 1/2 each for Head and Tail; die roll, probability 1/6 for each of 1 through 6]

The distribution for temperature in South Bend is a bit more complicated because there are so many possible outcomes, but the concept is the same. We generally assume a Normal Distribution, which can be characterized by a mean (average) and a standard deviation (a measure of dispersion).
[Figure: bell curve of Probability against Temperature, with the Mean and Standard Deviation marked]

Without some math, we can’t find the probability of a specific outcome, but we can easily divide up the distribution:
Below Mean-2SD: 2.5%
Mean-2SD to Mean-1SD: 13.5%
Mean-1SD to Mean: 34%
Mean to Mean+1SD: 34%
Mean+1SD to Mean+2SD: 13.5%
Above Mean+2SD: 2.5%
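The band percentages above are rounded rules of thumb. A minimal sketch of where they come from, assuming a standard normal distribution and using Python's built-in `statistics.NormalDist` (the variable names are illustrative only):

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1.
z = NormalDist(mu=0.0, sigma=1.0)

# Cut points at -2, -1, 0, +1, +2 standard deviations from the mean.
edges = [-2, -1, 0, 1, 2]

left_tail = z.cdf(edges[0]) * 100
bands = [(z.cdf(hi) - z.cdf(lo)) * 100 for lo, hi in zip(edges, edges[1:])]
right_tail = (1 - z.cdf(edges[-1])) * 100

# Six areas, in percent, from the far left tail to the far right tail.
areas = [left_tail, *bands, right_tail]
print([round(a, 1) for a in areas])
```

The exact areas are slightly different from the slide's round numbers (each tail is closer to 2.3% than 2.5%), which is fine for back-of-the-envelope work.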

Annual temperature in South Bend has a mean of 59 degrees and a standard deviation of 18 degrees. 95 degrees is 2 standard deviations to the right of the mean, so there is a 2.5% chance the temperature is 95 or greater (and a 97.5% chance it is cooler than 95).
[Figure: normal curve of Probability against Temperature, marked at 23, 41, 59, 77, 95]
Can’t we do a little better than this?

Conditional distributions give us probabilities conditional on some observable information. The temperature in South Bend conditional on the month of July has a mean of 84 with a standard deviation of 7. 95 degrees falls about one and a half standard deviations above the mean ((95 - 84) / 7 = 1.57), so the chance that the temperature is 95 or greater is well below the 16% that lies beyond one standard deviation: roughly 6% under the normal assumption.
[Figure: normal curve of Probability against Temperature, marked at 70, 77, 84, 91, 95, 98]
Conditioning on month gives us more accurate probabilities!
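To make the comparison concrete, here is a small sketch that computes the chance of 95 degrees or more under both the annual and the July distributions. It assumes both are exactly normal with the means and standard deviations quoted on the slides; `NormalDist` is from Python's standard library.

```python
from statistics import NormalDist

annual = NormalDist(mu=59, sigma=18)  # all-year temperatures
july = NormalDist(mu=84, sigma=7)     # temperatures conditional on July

threshold = 95

# Upper-tail probability: P(temperature >= threshold).
p_annual = 1 - annual.cdf(threshold)
p_july = 1 - july.cdf(threshold)

for label, dist, p in (("annual", annual, p_annual), ("July", july, p_july)):
    z_score = (threshold - dist.mean) / dist.stdev
    print(f"{label}: z = {z_score:.2f}, P(T >= {threshold}) = {p:.3f}")
```

Under these assumptions the annual figure is a little over 2% and the July figure is nearer 6%, because 95 degrees sits about 1.57 standard deviations above the July mean rather than exactly one.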

We know that there should be a “true” probability distribution that governs the outcome of a coin toss (assuming a fair coin). Suppose that we were to flip a coin over and over again and, after each flip, calculate the percentage of heads and tails. As the number of flips grows, the sample statistic (the percentage of heads so far) approaches the true probability (1/2). That is, if we collect “enough” data, we can eventually learn the truth!
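The idea that a sample statistic homes in on the true probability can be tried out with a short simulation. This is only a sketch: the seed and the flip counts are arbitrary choices, and `share_of_heads` is a helper name made up for this example.

```python
import random


def share_of_heads(n_flips: int, rng: random.Random) -> float:
    """Flip a fair coin n_flips times and return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


rng = random.Random(30210)  # fixed seed so the run is repeatable
for n in (10, 100, 10_000, 100_000):
    print(n, share_of_heads(n, rng))
```

With only 10 flips the share of heads can be far from 1/2; with 100,000 flips it is typically within a fraction of a percentage point.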

We can follow the same process for the temperature in South Bend. The Truth: Temperature ~ N(mean, variance). We could find this distribution by collecting temperature data for South Bend and computing the sample mean (average) and the sample variance. Note: the standard deviation is the square root of the variance.
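As an illustration of the sample statistics named above, the sketch below computes a sample mean, variance, and standard deviation. The temperature readings are invented for the example, and the choice of the n - 1 divisor (what `statistics.variance` uses) is an assumption, since the slides do not say which divisor they intend.

```python
from statistics import mean, stdev, variance

# Hypothetical temperature readings (degrees F), made up for illustration.
temps = [41, 55, 62, 70, 84, 77, 59, 36, 48, 58]

sample_mean = mean(temps)
sample_var = variance(temps)  # sum of squared deviations divided by n - 1
sample_sd = stdev(temps)      # square root of the sample variance

print(sample_mean, round(sample_var, 1), round(sample_sd, 1))
```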

Some useful properties of probability distributions. Probability distributions are scalable: if X has Mean = 1, Variance = 4, Std. Dev. = 2, then 3 * X has Mean = 3, Variance = 36 (3*3*4), Std. Dev. = 6.

Probability distributions are additive: if X has Mean = 1, Variance = 1, Std. Dev. = 1, and Y has Mean = 2, Variance = 9, Std. Dev. = 3, with COV = 2 between them, then X + Y has Mean = 3, Variance = 14 (1 + 9 + 2*2), Std. Dev. = 3.7.

Suppose we know that the value of a car is determined by its age. The Truth: Value = $20,000 - $1,000 (Age). Car Age: Mean = 8, Variance = 4, Std. Dev. = 2. Value: Mean = $12,000, Variance = 4,000,000 (1,000*1,000*4), Std. Dev. = $2,000.
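The scaling and addition rules, and their use in the car example, can be checked with a few lines of arithmetic. The function names below are made up for this sketch; the numbers are the ones on the slides.

```python
import math


def scale(mean, var, k):
    """Distribution of k * X: the mean scales by k, the variance by k squared."""
    return k * mean, k * k * var


def add(mean_x, var_x, mean_y, var_y, cov_xy=0.0):
    """Distribution of X + Y: means add; variances add plus twice the covariance."""
    return mean_x + mean_y, var_x + var_y + 2 * cov_xy


def linear(mean, var, a, b):
    """Distribution of a + b * X: the constant a shifts the mean only."""
    scaled_mean, scaled_var = scale(mean, var, b)
    return a + scaled_mean, scaled_var


print(scale(1, 4, 3))             # 3 * X
print(add(1, 1, 2, 9, cov_xy=2))  # X + Y with COV = 2

# Value = 20,000 - 1,000 * Age, with Age having mean 8 and variance 4.
value_mean, value_var = linear(8, 4, a=20_000, b=-1_000)
print(value_mean, value_var, math.sqrt(value_var))
```

Note that the negative slope flips the sign of the mean contribution but not of the variance, because the slope is squared.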

We could also use this to forecast. The Truth: Value = $20,000 - $1,000 (Age). How much should a six-year-old car be worth? Value = $20,000 - $1,000 (6) = $14,000. Note: There is NO uncertainty in this prediction.

Searching for the truth... You believe that there is a relationship between age and value, but you don’t know what it is.
• Collect data on values and age
• Estimate the relationship between them
Note that while the true distribution of age is N(8,4), our collected sample will not be exactly N(8,4). This sampling error will create errors in our estimates!

Value = a + b * (Age) + error, where ‘a’ is the intercept and ‘b’ is the slope. We want to choose ‘a’ and ‘b’ to minimize the error!
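One common way to "minimize the error" is ordinary least squares, which picks ‘a’ and ‘b’ to minimize the sum of squared errors. The slides do not name the estimator, so treat that as an assumption. The sketch below implements the textbook formulas by hand; the sample of car ages and values is invented for illustration.

```python
def ols(x, y):
    """Least-squares intercept a and slope b for y = a + b * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # slope: co-movement of x and y over variation in x
    a = y_bar - b * x_bar    # intercept: the fitted line passes through the means
    return a, b


# Hypothetical sample: car ages (years) and observed values ($).
ages = [4, 6, 7, 8, 9, 10, 12]
values = [16_500, 13_200, 13_900, 11_400, 11_800, 9_300, 8_100]

a, b = ols(ages, values)
print(f"Value = {a:,.0f} + ({b:,.0f}) * Age")
```

Because the sample is small and noisy, the estimates will not match the "true" 20,000 and -1,000 exactly; that gap is the sampling error the previous slide warns about.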

We have our estimate of “the truth”: Value = $12,354 - $854 * (Age) + error. Intercept (a): Mean = $12,354, Std. Dev. = $653. Age (b): Mean = -$854, Std. Dev. = $80. T-Stats bigger than 2 in absolute value are considered statistically significant!
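A t-statistic is simply the estimate divided by its standard deviation. A quick sketch using the slide's numbers (the dictionary layout is just a convenience for this example):

```python
# (estimate, standard deviation) pairs taken from the slide.
estimates = {
    "Intercept (a)": (12_354, 653),
    "Age (b)": (-854, 80),
}

# t-statistic = estimate / standard deviation of the estimate.
t_stats = {name: coef / sd for name, (coef, sd) in estimates.items()}

for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}, significant = {abs(t) > 2}")
```

Both t-statistics are far beyond 2 in absolute value (roughly 18.9 and -10.7), so both coefficients clear the slide's rule of thumb.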

R-Square: the percentage of value variance explained by age. Error Term: Mean = 0, Std. Dev. = $2,250.
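The "percentage of variance explained" can be computed as one minus the ratio of unexplained to total variation. The sketch below assumes a least-squares fit and uses an invented sample of car ages and values; `r_squared` is a name chosen for this example.

```python
def r_squared(x, y):
    """Share of the variation in y explained by a least-squares line in x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    a = y_bar - b * x_bar
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)                     # total
    return 1 - ss_res / ss_tot


# Hypothetical sample: car ages (years) and observed values ($).
ages = [4, 6, 7, 8, 9, 10, 12]
values = [16_500, 13_200, 13_900, 11_400, 11_800, 9_300, 8_100]

print(round(r_squared(ages, values), 3))
```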

We can now forecast the value of a 6-year-old car: Value = $12,354 - $854 * (6) + error. Intercept: Mean = $12,354, Std. Dev. = $653. Age coefficient: Mean = -$854, Std. Dev. = $80. Error: Mean = $0, Std. Dev. = $2,250. (Recall, the average car age is 8 years.)

[Figure: forecast of Value against Age, with +95% and -95% forecast interval bands]
Note that your forecast error will always be smallest at the sample mean! Also, your forecast gets worse at an increasing rate as you depart from the mean.
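The widening of the forecast interval away from the sample mean can be seen with the standard prediction-interval formula for a simple regression. This is a sketch under assumptions the slides do not spell out: a least-squares fit, an interval of plus or minus two standard errors, and an invented sample whose average age happens to be 8. The slide's own point forecast is included as plain arithmetic.

```python
import math

# Point forecast from the slide's estimated equation for a 6-year-old car.
slide_forecast = 12_354 - 854 * 6
print("slide point forecast:", slide_forecast)


def forecast_with_interval(x, y, x0, z=2.0):
    """Point forecast at x0 with an approximate 95% interval (+/- z * se)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    a = y_bar - b * x_bar
    # Residual variance with n - 2 degrees of freedom.
    s2 = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    # The (x0 - x_bar)^2 term makes the error grow away from the sample mean.
    se = math.sqrt(s2 * (1 + 1 / n + (x0 - x_bar) ** 2 / s_xx))
    point = a + b * x0
    return point, point - z * se, point + z * se


# Hypothetical sample: car ages (years, mean 8) and observed values ($).
ages = [4, 6, 7, 8, 9, 10, 12]
values = [16_500, 13_200, 13_900, 11_400, 11_800, 9_300, 8_100]

widths = {}
for age in (2, 6, 8, 10, 14):
    point, low, high = forecast_with_interval(ages, values, age)
    widths[age] = high - low
    print(age, round(point), round(widths[age]))
```

The interval is narrowest at age 8, the sample mean, and is equally wide at ages 6 and 10, which sit the same distance from the mean.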

What are the odds that Pat Buchanan received 3,407 votes from Palm Beach County in 2000?

The Strategy:
• Estimate a relationship for Pat Buchanan’s votes using every county EXCEPT Palm Beach
• Using Palm Beach data, forecast Pat Buchanan’s vote total for Palm Beach
Pat Buchanan’s Votes “are a function of” Observable Demographics.

The Data: Demographic Data By County. What variables do you think should affect Pat Buchanan’s vote total? For example: Votes = a + b * (College %) + error, where Votes is the # of Buchanan votes, College % is the % of the county that is college educated, and ‘b’ is the # of votes gained/lost for each percentage point increase in college educated population.

Results: R-Square = .19, so 19% of the variation in Buchanan’s votes across counties is explained by college education. The distribution for ‘b’ has a mean of 15 and a standard deviation of 4: each percentage point increase in college educated (i.e. from 10% to 11%) raises Buchanan’s vote total by 15. There is a 95% chance that the value for ‘b’ lies between 7 and 23. Plug in values for College % to get vote predictions.
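The 95% range for ‘b’ follows the same two-standard-deviation rule of thumb used for the temperature examples. A minimal sketch with the slide's numbers (`approx_95_interval` is a name invented here):

```python
def approx_95_interval(mean, sd):
    """Mean plus or minus two standard deviations (the slides' rule of thumb)."""
    return mean - 2 * sd, mean + 2 * sd


b_mean, b_sd = 15, 4  # slope estimate and its standard deviation from the slide

low, high = approx_95_interval(b_mean, b_sd)
t_stat = b_mean / b_sd
excludes_zero = not (low <= 0 <= high)

print(f"95% range for b: {low} to {high}")
print(f"t-stat = {t_stat}, range excludes zero: {excludes_zero}")
```

A range that excludes zero and a t-statistic above 2 in absolute value are two views of the same fact: the data are unlikely to be consistent with college education having no effect.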

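Let’s try something a little different: LN(Votes) = a + b * (College %) + error, where LN(Votes) is the log of Buchanan votes, College % is the % of the county that is college educated, and ‘b’ measures the proportional increase/decrease in votes for each percentage point increase in college educated population.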

Results: R-Square = .31, so 31% of the variation in Buchanan’s votes across counties is explained by college education. The distribution for ‘b’ has a mean of .09 and a standard deviation of .02. Because votes enter in logs, each percentage point increase in college educated (i.e. from 10% to 11%) raises Buchanan’s vote total by about 9% (a change of .09 in the log). There is a 95% chance that the value for ‘b’ lies between .05 and .13. Plug in values for College % to get vote predictions.

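How about this: Votes = a + b * LN(College %) + error, where Votes is the # of Buchanan votes, LN(College %) is the log of the % of the county that is college educated, and ‘b’ measures the gain/loss in votes as the college educated population rises in percentage terms.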

Results: R-Square = .25, so 25% of the variation in Buchanan’s votes across counties is explained by college education. The distribution for ‘b’ has a mean of 252 and a standard deviation of 54. Because college education enters in logs, each percentage increase in college educated (i.e. from 30% to 30.3%) raises Buchanan’s vote total by about 2.5 votes (252 * .01). There is a 95% chance that the value for ‘b’ lies between 144 and 360. Plug in values for College % to get vote predictions.

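One more: LN(Votes) = a + b * LN(College %) + error, where LN(Votes) is the log of Buchanan votes, LN(College %) is the log of the % of the county that is college educated, and ‘b’ is the percentage gain/loss in votes for each percentage increase in college educated population.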

Results: R-Square = .40, so 40% of the variation in Buchanan’s votes across counties is explained by college education. The distribution for ‘b’ has a mean of 1.61 and a standard deviation of .24: each percentage increase in college educated (i.e. from 30% to 30.3%) raises Buchanan’s vote total by 1.61%. There is a 95% chance that the value for ‘b’ lies between 1.13 and 2.09. Plug in values for College % to get vote predictions.
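The four specifications differ only in which side of the equation is logged, and that changes how ‘b’ is read. The sketch below works out the implied effect of each estimated ‘b’ directly from the functional form. It assumes natural logs and that college share is measured in percentage points (10 means 10%); the function names are invented for this example.

```python
import math


def level_level(b, x0, x1):
    """Votes = a + b * x: change in votes when x moves from x0 to x1."""
    return b * (x1 - x0)


def log_level(b, x0, x1):
    """LN(Votes) = a + b * x: percent change in votes."""
    return (math.exp(b * (x1 - x0)) - 1) * 100


def level_log(b, x0, x1):
    """Votes = a + b * LN(x): change in votes."""
    return b * math.log(x1 / x0)


def log_log(b, x0, x1):
    """LN(Votes) = a + b * LN(x): percent change in votes."""
    return ((x1 / x0) ** b - 1) * 100


# One percentage point increase (10% to 11%) for the models with x in levels.
print("level-level, votes:  ", level_level(15, 10, 11))
print("log-level, percent:  ", round(log_level(0.09, 10, 11), 2))

# One percent increase (30% to 30.3%) for the models with x in logs.
print("level-log, votes:    ", round(level_log(252, 30, 30.3), 2))
print("log-log, percent:    ", round(log_log(1.61, 30, 30.3), 2))
```

Under these assumptions a coefficient of .09 on a log dependent variable corresponds to a change of roughly 9%, and a coefficient of 252 on a logged regressor corresponds to roughly 2.5 votes per 1% increase; only the log-log coefficient reads directly as "percent per percent".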

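It turns out the regression with the best fit uses the log of Buchanan’s vote percentage as the dependent variable, where %Votes = (Buchanan Votes / Total Votes) * 100. The right-hand side contains the parameters to be estimated and an error term.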

The Results: R Squared = .73. Now, we can make a forecast!

This would be our prediction for Pat Buchanan’s vote total!

We know that the log of Buchanan’s vote percentage is distributed normally with a mean of -2.004 and a standard deviation of .2556. There is a 95% chance that LN(%Votes) lies between -2.004 - 2*(.2556) = -2.5152 and -2.004 + 2*(.2556) = -1.4928.

Next, let’s convert the logs to vote percentages by exponentiating. There is a 95% chance that Buchanan’s vote percentage lies between e^(-2.5152) = .081% and e^(-1.4928) = .225%.

Finally, we can convert to actual votes.
[Figure: distribution of Probability against Votes, with the 95% range marked]
There is a 95% chance that Buchanan’s total vote lies in this range. 3,407 votes turns out to be 7 standard deviations away from our forecast!
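The three conversion steps (log range, then percentages, then votes) can be sketched as follows. The mean and standard deviation of LN(%Votes) come from the slides. The total number of votes cast in Palm Beach County is NOT given in the slides, so `total_votes` below is an assumed round figure used only to make the arithmetic concrete.

```python
import math

# Forecast distribution of LN(%Votes), from the slides.
log_mean, log_sd = -2.004, 0.2556

# Step 1: approximate 95% range in logs (mean +/- 2 standard deviations).
low_log, high_log = log_mean - 2 * log_sd, log_mean + 2 * log_sd

# Step 2: exponentiate to get vote percentages.
low_pct, mid_pct, high_pct = (math.exp(v) for v in (low_log, log_mean, high_log))

# ASSUMPTION: total votes cast in Palm Beach County. Illustrative round
# number, not taken from the slides.
total_votes = 433_000

# Step 3: convert percentages to vote counts.
low_votes, mid_votes, high_votes = (
    p / 100 * total_votes for p in (low_pct, mid_pct, high_pct)
)

# How far is the actual count from the forecast, in log standard deviations?
actual_votes = 3_407
actual_log_pct = math.log(actual_votes / total_votes * 100)
sd_away = (actual_log_pct - log_mean) / log_sd

print(f"log range: {low_log:.4f} to {high_log:.4f}")
print(f"percent range: {low_pct:.3f}% to {high_pct:.3f}%")
print(f"vote range: {low_votes:.0f} to {high_votes:.0f} (forecast {mid_votes:.0f})")
print(f"actual result is {sd_away:.1f} standard deviations above the forecast")
```

With the assumed total, the forecast is a few hundred votes and the actual count lands close to 7 standard deviations above it, in line with the slide's claim. A different total would shift the vote counts but, within a plausible range, not the conclusion.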

We know that the quantity of some good or service demanded should be related to some basic variables: Quantity Demanded “is a function of” Price, Income, and Other “Demand Shifters”.
[Figure: demand curve plotted with Price on one axis and Quantity on the other]

Cross sectional estimation holds the time period constant and estimates the variation in demand resulting from variation in the demand factors.
[Figure: Demand Factors plotted against Time (t-1, t, t+1), with the cross section taken at time t]
For example: can we estimate demand for Pepsi in South Bend by looking at selected statistics for South Bend?

Suppose that we have data on sales in 200 different Indiana cities. Let’s begin by estimating a basic demand curve: quantity demanded is a linear function of price. The coefficient on price (to be estimated) is the change in quantity demanded per $ change in price.

That is, we have estimated a demand equation in which every dollar increase in price lowers sales by 46,087 units.

At a price of $1.37, this demand curve predicts sales of 91,903 units.

As we did earlier, we can experiment with different functional forms by using logs. Adding logs changes the interpretation of the coefficients. Here quantity is regressed on the log of price, so the coefficient (to be estimated) captures the change in quantity demanded per percentage change in price.

That is, we have estimated a demand equation with a coefficient of -103,973 on the log of price: every 1% increase in price lowers sales by about 1,040 units (103,973 * .01).

At a price of $1.37, this demand curve predicts sales of 100,402 units.

As we did earlier, we can experiment with different functional forms by using logs. Adding logs changes the interpretation of the coefficients. Here the log of quantity is regressed on price, so the coefficient (to be estimated) captures the percentage change in quantity demanded per $ change in price.

That is, we have estimated a demand equation in which every $1 increase in price lowers sales by 1.22%.

We can now use this estimated demand curve along with the price in South Bend to estimate demand in South Bend: at a price of $1.37, predicted sales are 83,283 units.

As we did earlier, we can experiment with different functional forms by using logs. Adding logs changes the interpretation of the coefficients. Here the log of quantity is regressed on the log of price, so the coefficient (to be estimated) is the percentage change in quantity demanded per percentage change in price.

That is, we have estimated a demand equation in which every 1% increase in price lowers sales by 2.6%.

At a price of $1.37, this demand curve predicts sales of 72,402 units.
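To show how a log-log demand curve is estimated and then used for a forecast, the sketch below generates made-up data for 200 cities with a known price elasticity, fits a least-squares line to the logs, and predicts sales at a price of $1.37. Everything except the elasticity of -2.6 and the $1.37 price is invented: the scale factor, the price range, the noise level, and the seed are all assumptions, so the forecast will not match the slide's figure.

```python
import math
import random


def ols(x, y):
    """Least-squares intercept a and slope b for y = a + b * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    return y_bar - b * x_bar, b


rng = random.Random(0)  # fixed seed so the run is repeatable
true_elasticity = -2.6  # matches the slide's estimate; the rest is made up

# Hypothetical prices and sales for 200 cities.
prices = [round(rng.uniform(1.0, 2.0), 2) for _ in range(200)]
quantities = [
    150_000 * p ** true_elasticity * math.exp(rng.gauss(0, 0.1)) for p in prices
]

# Regress LN(quantity) on LN(price); the slope is the price elasticity.
a, b = ols([math.log(p) for p in prices], [math.log(q) for q in quantities])

# Forecast: plug the log of the price into the fitted line, then exponentiate.
forecast = math.exp(a + b * math.log(1.37))

print(f"estimated elasticity: {b:.2f}")
print(f"predicted sales at $1.37: {forecast:,.0f}")
```

The estimated slope should land close to, but not exactly on, -2.6: with a finite sample, the estimate carries sampling error just as in the car example.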

We can add as many variables as we want in whatever combination. The goal is to look for the best fit. For example, a specification with three coefficients:
• % change in Sales per $ change in price
• % change in Sales per % change in income
• % change in Sales per % change in competitor’s price
R Squared: .46
