
Lecture 9: Explaining Variation in Y



1. Lecture 9: Explaining Variation in Y
BUEC 333, Summer 2009
Simon Woodcock

2. Explaining Variation in Y
• We've said several times that the goal of regression analysis is to "explain" variation in the dependent variable $Y_i$ on the basis of variation in the independent variables $X_{1i}, X_{2i}, \ldots, X_{ki}$.
• What does this mean?
• And how do we know whether we're doing a good job?
• These are today's topics.

3. The Total Sum of Squares
• When we talk about the "variation" in $Y_i$ to be "explained" by the independent variables, we're talking about how $Y_i$ varies around its mean.
• That is, we want to explain departures of $Y_i$ from its population mean $\mu_Y$.
• Why? Because we can already "explain" the mean pretty well using the sample mean $\bar{Y}$.
• Of course, we don't know $\mu_Y$, so we look at departures of $Y_i$ from its sample mean, $Y_i - \bar{Y}$.
• However, we always have $\sum_{i=1}^{n} (Y_i - \bar{Y}) = 0$ (why?), so trying to "explain" the total of these departures is pretty useless.
• Instead, we focus on what's usually called the Total Sum of Squares (TSS), $TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$, which isn't zero unless there's no variation in $Y_i$ at all (a short numerical sketch follows below).
• When TSS is big, there is lots of variation in $Y_i$ around its mean, and this is what we want to explain using the independent variables.
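
To make this concrete: the slides contain no code, but here is a minimal numpy sketch with made-up sample values, showing why we square the departures before summing.

```python
import numpy as np

# Made-up sample of Y values, purely for illustration
Y = np.array([3.0, 5.0, 4.0, 8.0, 6.0])

dev = Y - Y.mean()          # departures from the sample mean

# The departures always sum to zero (up to floating-point error)...
print(dev.sum())            # ~0.0

# ...so we square them before summing: the Total Sum of Squares
TSS = (dev ** 2).sum()
print(TSS)                  # 14.8 here: the variation to be explained
```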

4. The Decomposition of Variance
• We can always write $Y_i - \bar{Y} = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)$ (all I've done is add and subtract the predicted value $\hat{Y}_i$; draw a picture).
• It follows that (you should be able to show this yourself, and I recommend you try it): $TSS = ESS + RSS$, where $ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$ is the explained sum of squares and $RSS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$ is the residual sum of squares (the sketch below verifies this numerically).
• We've decomposed the total (squared) variation in $Y_i$ around its mean into a component that our regression model explains (ESS) and a component that our regression model cannot explain (RSS).
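
A quick way to see the decomposition at work is to fit a line to simulated data and check that the sums of squares add up. This is a minimal sketch, assuming numpy and invented data; it is not from the original slides.

```python
import numpy as np

# Simulated data, purely for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=100)
Y = 2.0 + 1.5 * X + rng.normal(size=100)

# Fit a simple OLS regression of Y on X (with an intercept)
slope, intercept = np.polyfit(X, Y, deg=1)
Y_hat = intercept + slope * X               # predicted values

TSS = ((Y - Y.mean()) ** 2).sum()           # total sum of squares
ESS = ((Y_hat - Y.mean()) ** 2).sum()       # explained sum of squares
RSS = ((Y - Y_hat) ** 2).sum()              # residual sum of squares

print(np.isclose(TSS, ESS + RSS))           # True: TSS = ESS + RSS
```

The identity holds exactly here because OLS with an intercept forces the residuals to have mean zero and to be uncorrelated with the fitted values.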

5. The Proportion of Variance Explained: R²
• When we build a regression model, we frequently want to know how well it "fits" the data.
• Does our model do a good job of explaining the variation in $Y_i$?
• We can use our decomposition of TSS into ESS and RSS to measure the proportion of the variation in $Y_i$ that is explained by our model.
• We call the proportion of the variation in $Y_i$ that is explained by the regression model $R^2$: $R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$ (a small sketch follows below).
• Notice that $0 \leq R^2 \leq 1$.
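
A small sketch of the definition; the helper name r_squared and the toy data are my own, not from the slides.

```python
import numpy as np

def r_squared(Y, Y_hat):
    """Proportion of the variation in Y around its mean explained by Y_hat."""
    RSS = ((Y - Y_hat) ** 2).sum()
    TSS = ((Y - Y.mean()) ** 2).sum()
    return 1.0 - RSS / TSS      # equals ESS/TSS for a model with an intercept

Y = np.array([1.0, 2.0, 4.0])

# An exact fit explains all the variation: R² = 1
print(r_squared(Y, Y))                            # 1.0

# Predicting the sample mean explains none of it: R² = 0
print(r_squared(Y, np.full_like(Y, Y.mean())))    # 0.0
```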

6. Using R² to Assess Model Fit
• $R^2$ is a useful measure to assess how well our model "fits" the data, that is, how well it explains the variation in $Y_i$.
• When $R^2 = 0$, the regression explains none of the variation in $Y_i$: the regression model explains variation in $Y_i$ no better than the sample mean does (draw a picture).
• When $R^2 = 1$, the regression explains all of the variation in $Y_i$: there is an exact relationship between $Y_i$ and the independent variables (no errors; draw a picture).
• Typically, we don't encounter either of these extremes in real data (draw a picture).
• Usually, bigger values of $R^2$ are "better" in the sense that our regression model does a "better" job of predicting $Y_i$.
• But all it tells us is that there is a strong linear relationship between $Y_i$ and the independent variables; it doesn't imply anything causal.

7. More About R²
• How big should $R^2$ be for us to be confident in our model? That depends on the context.
• In wage regressions (regress wage on education, experience, etc.) there are so many things that affect what a person earns that are hard to measure (luck, ability, motivation, etc.) that we are happy when $R^2$ is above 0.4.
• In "macro" or financial regressions (e.g., regress the unemployment rate on inflation, economic growth, etc.) we are suspicious if $R^2$ is below 0.9.
• There is a temptation to build a model (i.e., choose your independent variables) to maximize $R^2$. Avoid this temptation!
• If you add another independent variable to your model, $R^2$ never decreases, even if the new variable has no "real" relationship with the dependent variable (the sketch below demonstrates this).
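
The claim that $R^2$ never falls when a regressor is added can be checked numerically. This is a minimal sketch, assuming numpy and simulated data in which the extra regressor is pure noise; the helper name ols_r2 is my own.

```python
import numpy as np

# Simulated data: Y depends on X1 only; "junk" is unrelated by construction
rng = np.random.default_rng(1)
n = 50
X1 = rng.normal(size=n)
junk = rng.normal(size=n)
Y = 1.0 + 2.0 * X1 + rng.normal(size=n)

def ols_r2(Y, *cols):
    """R² from an OLS regression of Y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(Y)), *cols])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    return 1.0 - (resid ** 2).sum() / ((Y - Y.mean()) ** 2).sum()

print(ols_r2(Y, X1))          # baseline R²
print(ols_r2(Y, X1, junk))    # at least as large, despite the junk regressor
```

Because OLS minimizes RSS, the fit with the extra column can never have a larger RSS than the fit without it, so $R^2$ can only rise or stay the same.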

8. Motivating Adjusted R²
• There are other reasons to avoid building a model to maximize $R^2$.
• Occam's Razor: "one should not increase, beyond what is necessary, the number of entities required to explain anything" (all else equal, we prefer smaller, simpler models).
• Losing degrees of freedom: a model's degrees of freedom is the number of observations ($n$) minus the number of parameters you estimate ($k$ slope parameters + 1 intercept), i.e., $n - k - 1$. When we add independent variables to the model, we lose degrees of freedom and (we'll see soon) our parameter estimates are less precise.
• So if we add extra variables to the model, we need to trade off a better fit (in terms of $R^2$) against parsimony (having a small, simple model).
• An alternative to $R^2$ that takes this into account is adjusted $R^2$.

9. Adjusted R²
• Another way to measure the quality of a model's fit is adjusted $R^2$: $\bar{R}^2 = 1 - \frac{RSS/(n-k-1)}{TSS/(n-1)} = 1 - (1 - R^2)\frac{n-1}{n-k-1}$ (a small helper function below computes this).
• Adjusted $R^2$ (pronounced "R-bar-squared") penalizes for having lots of independent variables (or few degrees of freedom).
• It can increase, decrease, or stay the same when we add an extra regressor to the model.
• If we add an extra independent variable that is only weakly related to the dependent variable, adjusted $R^2$ will decrease.
• Like $R^2$, adjusted $R^2$ is less than 1, but it is not necessarily positive (if $R^2$ is very close to zero, adjusted $R^2$ can be negative).
• It's not the "be all and end all": to assess whether a regression model is "good" we need to look at plenty of other things. Do regression coefficients have plausible sign and magnitude? Does the model give sensible predictions? Is it missing independent variables that we know matter? Etc.
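
A minimal sketch of the formula above; the helper name adjusted_r2 and the example numbers are my own, not from the slides.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R² given R², sample size n, and k slope parameters.

    Degrees of freedom are n - k - 1 (the extra 1 is for the intercept)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# A weak fit with many regressors and few observations can go negative
print(adjusted_r2(r2=0.10, n=20, k=5))    # about -0.22
```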
