Economics 105: Statistics • Go over GH 21 • GH 22 due Tuesday
Multiple Regression • Assumption (7): No perfect multicollinearity, i.e., no X is an exact linear function of the other X’s • Venn diagram • Other implicit assumptions: • the data are a random sample of n observations from the proper population • n > K; in fact, it is good to have n >> K (*much* bigger) • the little xij’s are fixed numbers (the same in repeated samples), or they are realizations of random variables, Xij, that are independent of the error term, in which case inference is done CONDITIONAL on the observed values of the xij’s
Multiple Regression • Interpretation of multiple regression coefficients: for a one-unit change in Xi (specify the units), βi is the average change in Y, ceteris paribus • Venn diagram
Hypothesis Testing of a Single Coefficient • For this test • the test statistic is
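A sketch of the usual form of this test, under stated assumptions: the null value is taken to be zero, the hat and se(·) notation for the estimated coefficient and its standard error is assumed rather than taken from the slide, and the degrees of freedom n − K − 1 follow the convention used in the F-test later in the deck.

```latex
H_0:\ \beta_i = 0 \quad \text{vs.} \quad H_1:\ \beta_i \neq 0,
\qquad
t = \frac{\hat{\beta}_i - 0}{\operatorname{se}(\hat{\beta}_i)}
\ \sim\ t_{\,n-K-1} \ \text{ under } H_0
```

Under these assumptions, H0 is rejected at level α when |t| exceeds the two-sided critical value of the t distribution with n − K − 1 degrees of freedom.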
Nonlinear Relationships • The relationship between the outcome and the explanatory variable may not be linear • Make a scatterplot to examine the relationship • Example: Quadratic model, Y = β0 + β1X + β2X² + ε • Example: Log transformations • Log always means natural log (ln) in economics
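Linear vs. Nonlinear Fit • [Figure: two scatterplots of Y against X, one with a linear fit and one with a nonlinear fit, each with its residuals plotted against X] • A linear fit does not give random residuals • A nonlinear fit gives random residuals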
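The residual pattern described on this slide can be illustrated with a small sketch. The data below are made up for illustration (an exact quadratic), and `fit_line` is a hypothetical helper, not something from the course materials.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Made-up data that follow an exact quadratic relationship (assumption).
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

# Force a straight line through curved data and inspect what is left over.
a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(residuals)
```

The residuals from the straight line are positive at both ends of the X range and negative in the middle. That systematic pattern is the signal that the linear functional form is wrong.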
Quadratic Regression Model Source: http://marginalrevolution.com/marginalrevolution/2012/04/new-cities.html
Testing the Overall Model • Estimate the model to obtain the sample regression equation • The “whole model” F-test: H0: β1 = β2 = β3 = … = β15 = 0 vs. H1: at least one βi ≠ 0 • F-test statistic = MSR/MSE = (SSR/K) / (SSE/(n − K − 1)), here with K = 15 and n = 430
Testing the Overall Model • Critical value (α = 0.01) = 2.082 = F.INV(0.99, 15, 430-15-1) • p-value ≈ 0 = 1 - F.DIST(120.145, 15, 430-15-1, 1) • Since F = 120.145 > 2.082, reject H0
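A minimal sketch of the decision rule, using only the numbers on the slide (F = 120.145, critical value 2.082, n = 430, K = 15). The two helper functions are hypothetical names. They use the algebraically equivalent R² form of the whole-model F statistic, which assumes the model includes an intercept so that R² = SSR/SST.

```python
def f_from_r2(r2, n, k):
    """Whole-model F statistic written in terms of R^2."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def r2_from_f(f, n, k):
    """Invert the relation above: the R^2 implied by a whole-model F."""
    return k * f / (k * f + (n - k - 1))


n, k = 430, 15           # sample size and number of regressors, from the slide
f_stat = 120.145         # F statistic, from the slide
critical_value = 2.082   # F.INV(0.99, 15, 414), from the slide

# Reject H0 when the observed F exceeds the critical value.
reject_h0 = f_stat > critical_value
implied_r2 = r2_from_f(f_stat, n, k)

print("Reject H0:", reject_h0)
print("R^2 implied by F:", round(implied_r2, 3))
```

The second function is included only to show how the whole-model F-test and R² are two views of the same decomposition of variation.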
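Average Effect on Y of a Change in X in Nonlinear Models • Consider a change in X1 of ΔX1 • X2 is held constant! • The average effect on Y is the difference between the population regression functions: ΔY = f(X1 + ΔX1, X2) − f(X1, X2) • The estimate of this population difference uses the estimated regression function: ΔŶ = f̂(X1 + ΔX1, X2) − f̂(X1, X2)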
Example • What is the average effect of an increase in Age from 30 to 40 years? From 40 to 50 years? • 30 to 40: 2.03 × (40 − 30) − 0.02 × (1600 − 900) = 20.3 − 14 = 6.3 • 40 to 50: 2.03 × (50 − 40) − 0.02 × (2500 − 1600) = 20.3 − 18 = 2.3 • Units?!
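The arithmetic in this example can be checked with a short sketch. The coefficients 2.03 (Age) and −0.02 (Age²) are read off the slide; the function name is hypothetical, and all other regressors are assumed to be held constant so that their terms cancel in the difference.

```python
def predicted_change(b1, b2, x_from, x_to):
    """Change in predicted Y from the terms b1*X + b2*X^2 when X moves
    from x_from to x_to, with every other regressor held fixed."""
    return b1 * (x_to - x_from) + b2 * (x_to ** 2 - x_from ** 2)


B_AGE = 2.03      # coefficient on Age, from the slide
B_AGE_SQ = -0.02  # coefficient on Age squared, from the slide

effect_30_40 = predicted_change(B_AGE, B_AGE_SQ, 30, 40)
effect_40_50 = predicted_change(B_AGE, B_AGE_SQ, 40, 50)

print(round(effect_30_40, 2))
print(round(effect_40_50, 2))
```

The same ten-year increase in Age has a smaller effect at older ages because the negative squared term grows faster than the linear term.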
Coefficient of Determination for Multiple Regression • Reports the proportion of total variation in Y explained by all X variables taken together: R² = SSR/SST • Consider this model: pie sales regressed on price and advertising
Multiple Coefficient of Determination (continued) • R² = 0.521: 52.1% of the variation in pie sales is explained by the variation in price and advertising
Adjusted R² • R² never decreases when a new X variable is added to the model, which is a disadvantage when comparing models • What is the net effect of adding a new variable? • We lose a degree of freedom when a new X variable is added • Did the new X variable add enough explanatory power to offset the loss of one degree of freedom?
Adjusted R² (continued) • Penalizes excessive use of unimportant variables • Smaller than R², and can increase, decrease, or stay the same when a variable is added • Useful for comparing models, but don’t rely too heavily on it: use theory and statistical significance
Adjusted R² (continued) • Adjusted R² = 0.442: 44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables
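The degrees-of-freedom penalty described on the adjusted R² slides can be sketched with the standard formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − K − 1). R² = 0.521 and K = 2 come from the pie sales slides. The sample size n = 15 is an assumption made for illustration, since the slides do not state it, and `adjusted_r2` is a hypothetical helper name.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k regressors plus an intercept,
    estimated on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


R2_PIE = 0.521  # from the slide
K_PIE = 2       # price and advertising
N_PIE = 15      # ASSUMED sample size; not stated on the slides

adj = adjusted_r2(R2_PIE, N_PIE, K_PIE)
print(round(adj, 3))

# Holding R^2 fixed, one more regressor lowers adjusted R^2.
print(adjusted_r2(R2_PIE, N_PIE, K_PIE + 1) < adj)
```

With the assumed n, the result lands close to the 44.2% on the slide; any small gap would come from R² being rounded to three decimals. The last line shows the penalty directly: a new variable has to raise R² by enough to offset the lost degree of freedom.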
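Log Functional Forms • Linear-log: Y = β0 + β1 ln(X) + ε • Log-linear: ln(Y) = β0 + β1X + ε • Log-log: ln(Y) = β0 + β1 ln(X) + ε • Taking the log of a variable means the interpretation is in terms of a percentage change in that variable • (don’t forget Mark’s pet peeve)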
Log Functional Forms • Here’s why: ln(x + Δx) − ln(x) = ln(1 + Δx/x) ≈ Δx/x when Δx/x is small • Calculus: d ln(x)/dx = 1/x, so Δln(x) ≈ Δx/x • Numerically: ln(1.01) = .00995 ≈ .01 • ln(1.10) = .0953 ≈ .10 (sort of)
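The quality of this approximation can be checked numerically. The sketch below uses only the standard library. The 1% and 10% cases are the ones on the slide; the 5% and 25% cases are extra illustrative values, not from the slide.

```python
import math


def log_diff(pct_change):
    """Exact change in ln(x) when x rises by the given proportion."""
    return math.log(1.0 + pct_change)


# Compare the exact log difference with the proportional change itself.
for pct in (0.01, 0.05, 0.10, 0.25):
    exact = log_diff(pct)
    print(f"change = {pct:.2f}  ln(1 + change) = {exact:.5f}  gap = {pct - exact:.5f}")
```

The gap between the proportional change and the log difference widens as the change grows, which is why the percentage interpretation is reliable only for small changes.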