
FUNctional Form

Learn how to utilize summary statistics and visual representations to analyze data before conducting regression analysis. Understand the importance of transforming variables for nonlinearity in regression models.

heremon

Presentation Transcript


  1. FUNctional Form: Adding forms of nonlinearity to regression models

  2. Getting to know your data
     • You need to have a basic “feel” for your data to know how to analyze it.
     • Two things you should do before you analyze:
       • Summary statistics (general characteristics)
       • Graphic views of data

  3. Summary Statistics
     • You should know basic information about your data:
       • Measures of central tendency
       • Measures of dispersion
       • Minimum, maximum
     • If you are going to talk about 1-unit changes in a variable, you should know how much that variable actually varies
     • If you are going to graph a relationship, you want to know what range of the relationship your data actually support (in-sample vs. out-of-sample predictions)

  4. How to obtain Summary Stats
     • In Stata, type summarize y x1
     • Note that x1 is a 7-point interval scale
     • There is a problem with x1

         . summarize

             Variable |       Obs        Mean    Std. Dev.       Min        Max
         -------------+--------------------------------------------------------
                    y |        46    4.608696     2.12371          2          9
                   x1 |        46    5.130435    10.94351          1         77

  5. How to Obtain Summary Stats
     • You can get this info for specific variables
     • Command: summarize var1 var2 …

         . summarize x1

             Variable |       Obs        Mean    Std. Dev.       Min        Max
         -------------+--------------------------------------------------------
                   x1 |        46    5.130435    10.94351          1         77

  6. How to Obtain Summary Stats
     • You can get more detailed info
     • Command: summarize x1, detail

         . summarize x1, detail

                                      x1
         -------------------------------------------------------------
               Percentiles      Smallest
          1%            1              1
          5%            1              1
         10%            2              1       Obs                  46
         25%            3              1       Sum of Wgt.          46

         50%            3                      Mean           5.130435
                                 Largest       Std. Dev.      10.94351
         75%            5              6
         90%            6              7       Variance       119.7604
         95%            7              7       Skewness       6.353026
         99%           77             77       Kurtosis       42.25927
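A minimal sketch of how one might track down the kind of problem visible in x1 (a maximum of 77 on a 7-point scale). The data below are made-up stand-ins, not the slides' data, and treating 77 as a missing-data code is an assumption; the right fix depends on the codebook.

```stata
* Toy data standing in for x1 (hypothetical values, not the slides' data)
clear
input x1
1
3
3
5
7
77
end

* Look at the full distribution and list any value outside the 1-7 scale
tabulate x1
list if !inrange(x1, 1, 7) & !missing(x1)

* Assumption: 77 is a missing-data code rather than a real response
mvdecode x1, mv(77)
summarize x1
```

After the recode, the maximum reported by summarize should fall back inside the scale, and the mean and standard deviation are no longer dominated by a single value.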

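  7. Graphic Views of 1 Variable
     • Box and Whisker Plot
     • Essentially puts the data in numerical order and divides it into four parts
     • The box (split at the median) shows where the middle 50% of the data lie
     • Lines (whiskers) show the spread of the rest of the data; in Stata they stop at the most extreme values within 1.5 × IQR of the box, and points beyond them are plotted individually
     • Command:
       • graph box varname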

  8. Graphic Views of Two Variables
     • Use the scatter command
     • scatter y x

  9. Why are Graphic Views of relationships important?

         . regress y x

           Source |       SS       df       MS          Number of obs =      63
         ---------+------------------------------       F(  1,    61) =    0.17
            Model |  .141749236     1  .141749236       Prob > F      =  0.6840
         Residual |  51.7056947    61  .847634339       R-squared     =  0.0027
         ---------+------------------------------       Adj R-squared = -0.0136
            Total |  51.8474439    62  .836249096       Root MSE      =  .92067

         ------------------------------------------------------------------------
                y |      Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
         ---------+--------------------------------------------------------------
                x |  -.0518887   .1268869    -0.41   0.684   -.3056147   .2018373
            _cons |    3.84064   .1168212    32.88   0.000    3.607041   4.074238
         ------------------------------------------------------------------------

     Interpret this regression…

  10. Now Look at the Graph

  11. Let’s Try again

          . regress y3 x

            Source |       SS       df       MS          Number of obs =      63
          ---------+------------------------------       F(  1,    61) =  221.56
             Model |   379.83978     1   379.83978       Prob > F      =  0.0000
          Residual |  104.575914    61  1.71435924       R-squared     =  0.7841
          ---------+------------------------------       Adj R-squared =  0.7806
             Total |  484.415694    62  7.81315635       Root MSE      =  1.3093

          ------------------------------------------------------------------------
                y3 |      Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
          ---------+--------------------------------------------------------------
                 x |   2.686041   .1804527    14.89   0.000    2.325203   3.046878
             _cons |   6.027621   .1661377    36.28   0.000    5.695408   6.359833
          ------------------------------------------------------------------------

      Interpret these results…

  12. And the Graph…

  13. What went wrong?
      • What assumption did we violate?
      • Sometimes there is a clear need to model non-linearity
      • Other times, weigh tradeoffs between parsimony and detail
      • This line perfectly predicts every point, but it doesn’t tell us a great deal about the relationship between x and y in general
        • Hard to generalize out of sample
      • This line summarizes the relationship very parsimoniously, without sacrificing too much

  14. How can we account for non-linearity?
      • Model is linear in the parameters (a and b)
      • We cannot (with OLS) do: [equation not preserved in the transcript]
      • But we can “transform” the variables
      • Examples: [equations not preserved in the transcript]
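The slide's own equations did not survive in the transcript. As an illustrative sketch only, the distinction between "linear in the parameters" and "not linear in the parameters" can be written as follows; the specific functional forms are assumptions chosen to match the transformations discussed on the later slides, with e denoting the error term.

```latex
\begin{align*}
\text{Linear in } a, b \text{ (OLS works):} \quad & y = a + b\,x + e \\
\text{Not linear in } b \text{ (OLS cannot estimate it):} \quad & y = a + x^{b} + e \\
\text{Transformed } x \text{, still linear in } a, b\text{:} \quad & y = a + b\,\ln(x) + e \\
& y = a + b\,\sqrt{x} + e \\
& y = a + b\,(1/x) + e
\end{align*}
```

In each of the last three lines the regressor is a new variable computed from x, so OLS treats it like any other column of data.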

  15. Benefits of Transformation
      • Allows us to account for non-linearity (we get better-fitting models)
      • Allows us to stick with the OLS framework
        • We don’t know anything else
        • Even when you do, the simple and desirable properties of OLS can make OLS with transformations a better choice than some high-fangled nonlinear models

  16. Which Transformation?
      • It is up to the researcher to specify this
      • This may seem ad hoc, but
        • Let theory be your guide
        • It is no less ad hoc than constraining the model to be linear
      • Let me show you a few transformations

  17. Diminishing Marginal Returns
      • One popular option is to take the “natural log” of x
      • This transforms x so that it will have a linear relationship with y
      • [Figure panel] Relationship between y and ln(x): linear, OLS is A-O-K
      • [Figure panel] Relationship between y and x: non-linear, OLS not OK

  18. How does this work?
      • So you graphed the relationship between y and x and found that it looks like x has a diminishing marginal effect on y
      • You decide to transform x with the natural log to deal with this
        • generate lnx = ln(x)
        • regress y lnx
      • But how do we interpret?

  19. First, Let’s Compare Fit

          . regress y x

            Source |       SS       df       MS          Number of obs =      64
          ---------+------------------------------       F(  1,    62) =  197.65
             Model |  363.742893     1  363.742893       Prob > F      =  0.0000
          Residual |  114.098806    62  1.84030333       R-squared     =  0.7612
          ---------+------------------------------       Adj R-squared =  0.7574
             Total |  477.841699    63  7.58478888       Root MSE      =  1.3566

          . regress y lnx

            Source |       SS       df       MS          Number of obs =      64
          ---------+------------------------------       F(  1,    62) =  481.32
             Model |  423.313321     1  423.313321       Prob > F      =  0.0000
          Residual |   54.528378    62  .879489967       R-squared     =  0.8859
          ---------+------------------------------       Adj R-squared =  0.8840
             Total |  477.841699    63  7.58478888       Root MSE      =  .93781

      Which model fits better?

  20. Second, Let’s Interpret

          . regress y lnx

            Source |       SS       df       MS          Number of obs =      64
          ---------+------------------------------       F(  1,    62) =  481.32
             Model |  423.313321     1  423.313321       Prob > F      =  0.0000
          Residual |   54.528378    62  .879489967       R-squared     =  0.8859
          ---------+------------------------------       Adj R-squared =  0.8840
             Total |  477.841699    63  7.58478888       Root MSE      =  .93781

          ------------------------------------------------------------------------
                 y |      Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
          ---------+--------------------------------------------------------------
               lnx |   3.023069   .1377947    21.94   0.000    2.747621   3.298516
             _cons |   2.018158   .1843283    10.95   0.000    1.649691   2.386625
          ------------------------------------------------------------------------

      • A 1-unit increase in the natural log of x results in a 3.02 unit increase in y.
      • What does a 1-unit increase in ln(x) mean?
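One way to answer the question on this slide is to translate changes in ln(x) back into proportional changes in x. The arithmetic below is a sketch using the coefficients reported above (rounded to 3.023 and 2.018); the starting value x_0 and new value x_1 are notation introduced here for illustration.

```latex
\begin{align*}
\hat{y} &= 2.018 + 3.023\,\ln(x) \\
\Delta\hat{y} &= 3.023\,[\ln(x_1) - \ln(x_0)] = 3.023\,\ln(x_1 / x_0) \\
\Delta\ln(x) = 1 \;&\Longleftrightarrow\; x_1 = e \cdot x_0 \approx 2.718\,x_0 \\
x_1 = 1.01\,x_0 \;&\Longrightarrow\; \Delta\hat{y} = 3.023\,\ln(1.01) \approx 0.030 \\
x_1 = 2\,x_0 \;&\Longrightarrow\; \Delta\hat{y} = 3.023\,\ln(2) \approx 2.095
\end{align*}
```

So a 1-unit increase in ln(x) means multiplying x by about 2.7, and a 1% increase in x is associated with roughly a 0.03-unit increase in predicted y, whatever the starting level of x.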

  21. We can “back out” our results
      • First, get predicted values of y from the “logged” equation
      • Then graph the relationship between y and x (NOT y and lnx). The graph shows the predicted relationship between y and x.
      • Example…

  22. Commands:

          . predict predincrease
          . scatter predincrease x, c(l) sort(predincrease) || scatter y x
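The log-transform workflow from the preceding slides (generate the logged variable, regress, predict, then plot against the original x) can be sketched end to end on data that ships with Stata. The auto dataset and the variable names lnweight and mpg_hat are stand-ins chosen for illustration; they are not the slides' data.

```stata
* Illustration on Stata's bundled auto data (not the slides' data)
sysuse auto, clear

* Transform the regressor, then fit the model by OLS
generate lnweight = ln(weight)
regress mpg lnweight

* Fitted values from the logged model
predict mpg_hat

* Plot against weight itself, not lnweight, to see the implied curve
twoway (scatter mpg weight) (line mpg_hat weight, sort)
```

The fitted line is straight when drawn against lnweight but curved when drawn against weight, which is the point of "backing out" the results.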

  23. Warning About Logs
      • The natural log is only defined for strictly positive values (x > 0). If your variable is not always > 0, you can get into trouble here
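A small sketch of what the trouble looks like in practice: Stata's ln() returns a missing value for zero or negative input, so those observations silently drop out of any regression that uses the logged variable. The toy values below are hypothetical.

```stata
* Toy data with one zero (hypothetical values)
clear
input x
0
1
2
4
8
end

* How many observations cannot be logged?
count if x <= 0

* ln() yields missing for x <= 0; Stata reports how many were generated
generate lnx = ln(x)
count if missing(lnx)
```

Checking the count before transforming makes the loss of observations a deliberate decision rather than a surprise in the "Number of obs" line.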

  24. Another Transformation
      • Taking the “square root” of x also can model diminishing marginal returns
      • Use the same procedures
      • Like the log, it is undefined for negative numbers (unlike the log, it is defined at 0)

  25. 1/x transformation
      • Instead of putting x into the regression, we can include 1/x to get a non-linearity like this: [figure not preserved in the transcript]

  26. Problems with 1/x • When x is 0, 1/x is undefined

  27. Exponential transformation
      • Instead of including x, we can use: [expression not preserved in the transcript]
      • This gives us a relationship like: [figure not preserved in the transcript]

  28. What transformations are OK
      • Any transformation of x is OK if it is justified by theory
      • You should be able to get back to your original scale
      • Be careful about transforming with functions that could be undefined for certain parts of your data
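To compare the transformations covered so far, it can help to write down the implied effect of x on y for each one. This is a sketch under assumptions: a and b are the usual intercept and slope, x > 0, and the exponential case is taken to be e^x because the slide's own expression is not preserved.

```latex
\begin{align*}
y = a + b\,\ln(x): \quad & \frac{dy}{dx} = \frac{b}{x} \\
y = a + b\,\sqrt{x}: \quad & \frac{dy}{dx} = \frac{b}{2\sqrt{x}} \\
y = a + b\,(1/x): \quad & \frac{dy}{dx} = -\frac{b}{x^{2}} \\
y = a + b\,e^{x}: \quad & \frac{dy}{dx} = b\,e^{x}
\end{align*}
```

The first three effects shrink in magnitude as x grows, which is the diminishing-returns pattern; the last grows with x. The denominators also show where each transformation breaks down, matching the warnings about x at or below 0.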

  29. What you should know and be able to do:
      • Examine variables using summary statistics and graphs
      • Diagnose non-linear relationships using graphs
      • Choose an appropriate transformation to model nonlinearity
      • Estimate and interpret regression models with transformations of the data using Stata

  30. One obvious transform is left out
      • x²
      • This is actually an interaction term of x with itself.
      • The effect of x on y depends on the value of x that is being observed!
      • As such, we must include x and x² (why?)

  31. The effect of x on y depends on its own value:
      • When x is low, a 1-unit increase in x has a large negative effect
      • When x is just below its mid-range, a 1-unit increase in x has a small negative effect
      • When x gets over its mid-range, a 1-unit increase in x has a small positive effect
      • When x is high, a 1-unit increase in x has a large positive effect
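The pattern described on this slide follows from differentiating the quadratic model. In this sketch, b_1 and b_2 are labels introduced here for the coefficients on x and x², and e is the error term.

```latex
\begin{align*}
y &= a + b_1 x + b_2 x^{2} + e \\
\frac{\partial y}{\partial x} &= b_1 + 2\,b_2\,x \\
x^{*} &= -\frac{b_1}{2\,b_2} \quad \text{(the value of } x \text{ at which the slope changes sign)}
\end{align*}
```

With b_1 < 0 and b_2 > 0 the slope starts out strongly negative, crosses zero at x*, and then turns positive, giving the U shape described above. This also suggests an answer to the "why?" on the previous slide: leaving x out forces b_1 = 0, which pins the turning point at x = 0 whether or not the data agree.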

  32. Example
      • Remember this? Let’s think about age…

          . regress volunteer tvhours sibs educ age

             Source |       SS       df       MS          Number of obs =    1187
          ----------+------------------------------       F(  4,  1182) =   17.17
              Model |  425.853521     4   106.46338       Prob > F      =  0.0000
           Residual |  7330.93334  1182  6.20214326       R-squared     =  0.0549
          ----------+------------------------------       Adj R-squared =  0.0517
              Total |  7756.78686  1186  6.54029246       Root MSE      =  2.4904

          -------------------------------------------------------------------------
          volunteer |      Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
          ----------+--------------------------------------------------------------
            tvhours |  -.0551014   .0328821    -1.68   0.094   -.1196152   .0094124
               sibs |   .0232574    .020254     1.15   0.251   -.0164803   .0629951
               educ |   .1912462   .0266417     7.18   0.000    .1389758   .2435166
                age |   .0141891   .0043829     3.24   0.001    .0055899   .0227882
              _cons |  -.9510863   .4783878    -1.99   0.047    -1.88967  -.0125024
          -------------------------------------------------------------------------

  33. What transformations might we use on age?

          . regress volunteer tvhours sibs educ age age2

             Source |       SS       df       MS          Number of obs =    1187
          ----------+------------------------------       F(  5,  1181) =   14.55
              Model |   449.99501     5  89.9990021       Prob > F      =  0.0000
           Residual |  7306.79185  1181   6.1869533       R-squared     =  0.0580
          ----------+------------------------------       Adj R-squared =  0.0540
              Total |  7756.78686  1186  6.54029246       Root MSE      =  2.4874

          -------------------------------------------------------------------------
          volunteer |      Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
          ----------+--------------------------------------------------------------
            tvhours |  -.0528947   .0328608    -1.61   0.108   -.1173668   .0115774
               sibs |   .0191269   .0203369     0.94   0.347   -.0207736   .0590275
               educ |   .1797836   .0272345     6.60   0.000    .1263502   .2332169
                age |   .0622066   .0246994     2.52   0.012     .013747   .1106662
               age2 |  -.0004876   .0002468    -1.98   0.048   -.0009718  -3.30e-06
              _cons |  -1.824354   .6509468    -2.80   0.005   -3.101495  -.5472129
          -------------------------------------------------------------------------
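As a worked illustration of reading these estimates, the conditional effect of age can be computed directly from the age and age2 coefficients above. The ages 20, 50, and 80 are arbitrary evaluation points chosen here, not values from the slides.

```latex
\begin{align*}
\frac{\partial\,\widehat{\text{volunteer}}}{\partial\,\text{age}}
  &= 0.0622066 + 2\,(-0.0004876)\,\text{age} = 0.0622066 - 0.0009752\,\text{age} \\
\text{age} = 20: \quad & 0.0622066 - 0.0195040 \approx 0.0427 \\
\text{age} = 50: \quad & 0.0622066 - 0.0487600 \approx 0.0134 \\
\text{age} = 80: \quad & 0.0622066 - 0.0780160 \approx -0.0158 \\
\text{slope} = 0 \text{ at age} &= \frac{0.0622066}{0.0009752} \approx 63.8
\end{align*}
```

On these numbers the estimated effect of an extra year is positive but shrinking until the mid-60s and slightly negative afterward, a gentle inverted U.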

  34. What is the result?
      • Compare goodness of fit:
        • Adj. R² improves (0.0540 vs. 0.0517 before)
        • RMSE decreases (2.4874 vs. 2.4904 before)
      • Improvement is minimal
      • Why such slight improvement? Look at the graph.
      • Stata command (predicted values with tvhours held at 2, sibs at 1, and educ at 16): gen vol_hat = _b[_cons] + _b[tvhours]*2 + _b[sibs]*1 + _b[educ]*16 + _b[age]*age + _b[age2]*age2
      • Stata command: scatter vol_hat age, c(l) sort(age) msize(medium) || scatter volunteer age, msize(vsmall)

  35. Which model…
      • Weigh parsimony (simplicity, ease of interpretation)
      • Weigh precision (increases in goodness of fit)
      • Weigh theoretical appropriateness

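  36. One last note:
      • Because polynomial terms are really interaction terms, the standard errors in the Stata output are not the standard errors of the effect of x; they describe the individual coefficients only
      • The effect of x depends on the value of x
      • The conditional slope is obtained by calculus: dy/dx = b1 + 2*b2*x, where b1 and b2 are the coefficients on x and x²
      • So the standard error of the effect of x is: sqrt( Var(b1) + 4*x^2*Var(b2) + 4*x*Cov(b1,b2) )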

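  37. Calculate conditional Std. Err.
      • Run these right after the regression with age and age2; age and age2 are the 4th and 5th coefficients, so V[4,5] is their covariance (in that output, _se[age]^2 ≈ .00061 and _se[age2]^2 ≈ .000000061)
      • matrix V = e(V)
      • gen condslope = _b[age] + 2*_b[age2]*age
      • gen condse = sqrt( _se[age]^2 + 4*age^2*_se[age2]^2 + 4*age*V[4,5] )
      • gen low95 = condslope - 1.96*condse
      • gen hi95 = condslope + 1.96*condse
      • scatter condslope age, c(l) sort(age) || scatter low95 age, c(l) sort(age) || scatter hi95 age, c(l) sort(age) yline(0)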
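A self-contained sketch of the same calculation on Stata's bundled auto data, with lincom used as a cross-check at one value of x. The dataset, the variable weight2, and the evaluation point of 3000 are stand-ins chosen for illustration; they are not the slides' volunteer data.

```stata
* Illustration on Stata's bundled auto data (not the slides' data)
sysuse auto, clear
generate double weight2 = weight^2
regress mpg weight weight2

* Conditional slope and its standard error at every observed weight.
* weight and weight2 are the 1st and 2nd coefficients in e(V).
matrix V = e(V)
generate double condslope = _b[weight] + 2*_b[weight2]*weight
generate double condse = sqrt(V[1,1] + 4*weight^2*V[2,2] + 4*weight*V[1,2])

* Cross-check at weight = 3000: the slope is b1 + 2*3000*b2 = b1 + 6000*b2
lincom weight + 6000*weight2
```

The estimate and standard error that lincom reports should agree with condslope and condse for any car whose weight is exactly 3000, which is a quick way to confirm the hand-built formula.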
