Econ 427 Lecture 3: Linear Regression
[Figures: a scatterplot of the data and the fitted regression line]
Linear Regression • We assume that y is linearly related to x, with an independently and identically distributed (iid) disturbance term that has zero mean and constant variance: y_t = β₀ + β₁x_t + ε_t, t = 1, …, T, where ε_t ~ iid(0, σ²).
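A minimal sketch of this data-generating process in Python (the parameter values β₀ = 1.0, β₁ = 0.5 and σ = 2.0 are hypothetical, chosen only for illustration):

```python
# Simulate y_t = beta0 + beta1*x_t + eps_t, t = 1,...,T, with iid disturbances
import numpy as np

rng = np.random.default_rng(0)

T = 100                                # sample size
beta0, beta1, sigma = 1.0, 0.5, 2.0    # hypothetical parameter values

x = rng.uniform(0.0, 10.0, size=T)     # regressor
eps = rng.normal(0.0, sigma, size=T)   # iid disturbances: zero mean, constant variance
y = beta0 + beta1 * x + eps            # y is linearly related to x
```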
Linear Regression • The regression function gives an estimate of y for a given value of x, which is just the conditional expectation of y given x = x*: E(y | x = x*) = β₀ + β₁x*.
Linear Regression • Since we don’t know the true (population) relationship, we estimate it from the data by choosing the parameter values that minimize the sum of squared errors: min over β₀, β₁ of Σ_t (y_t − β₀ − β₁x_t)².
Linear Regression • We then use the estimated parameters to get a fitted value (also called an “in-sample forecast”) of y, given x: ŷ_t = β̂₀ + β̂₁x_t, where the “hats” indicate estimated values. • The in-sample forecast errors are just the residuals: e_t = y_t − ŷ_t.
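A sketch of the estimation step using the statsmodels library in Python; it assumes the x and y arrays from the simulation sketch above, and the value x* = 5.0 is hypothetical:

```python
# OLS estimation, fitted values, residuals, and a point forecast at x = x*
import statsmodels.api as sm

X = sm.add_constant(x)               # add the intercept column
results = sm.OLS(y, X).fit()         # OLS minimizes the sum of squared errors

beta0_hat, beta1_hat = results.params
y_hat = results.fittedvalues         # in-sample forecasts of y
e = results.resid                    # in-sample forecast errors e_t = y_t - y_hat_t

x_star = 5.0                                   # hypothetical value of x
y_star_hat = beta0_hat + beta1_hat * x_star    # estimate of E(y | x = x*)
```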
Sum of squared residuals • SSR is the sum of squared residuals of the regression (the minimized value of the objective that OLS searches for): SSR = Σ_t e_t².
R-squared • A standard measure of overall goodness of fit is R² (R-squared), technically the percentage of the variance of y explained by the variables in the model: R² = 1 − SSR / Σ_t (y_t − ȳ)².
Adjusted R-squared • The problem with R² is that it always goes up when you add more variables. To avoid “overfitting” (any model will fit the data if there are enough RHS variables), we normally adjust for the degrees of freedom using the “adjusted R-squared”: adj. R² = 1 − [SSR / (T − k)] / [Σ_t (y_t − ȳ)² / (T − 1)], where k is the number of estimated parameters.
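A sketch of how SSR, R-squared, and adjusted R-squared could be computed by hand from the residuals e of the fit above (k is the number of estimated parameters):

```python
# Goodness-of-fit measures from the residuals
import numpy as np

T = len(y)
k = 2                                    # intercept and slope

SSR = np.sum(e ** 2)                     # sum of squared residuals
TSS = np.sum((y - np.mean(y)) ** 2)      # total sum of squares

R2 = 1 - SSR / TSS
R2_adj = 1 - (SSR / (T - k)) / (TSS / (T - 1))

# These should match results.rsquared and results.rsquared_adj from statsmodels.
```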
F-statistic • The F-statistic tests whether all slope coefficients are jointly zero; it is an overall test of the significance of the regression: F = [R² / (k − 1)] / [(1 − R²) / (T − k)].
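Continuing the same sketch, the F-statistic can be formed from R² and the degrees of freedom:

```python
# Overall F test: are all slope coefficients jointly zero?
F = (R2 / (k - 1)) / ((1 - R2) / (T - k))

# statsmodels reports the same quantity as results.fvalue (p-value: results.f_pvalue).
```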
Durbin-Watson statistic • The Durbin-Watson statistic is a measure of serial correlation in the regression errors. (Why do we care about whether the errors are serially correlated?)
Durbin-Watson statistic • The DW statistic tests whether there is first-order autocorrelation in the model errors, i.e. whether φ = 0 in ε_t = φε_{t−1} + v_t. It is computed from the residuals as DW = Σ_{t=2}^{T} (e_t − e_{t−1})² / Σ_{t=1}^{T} e_t². • Values of DW fall in the [0, 4] interval; values significantly below 2 (say, below 1.5) are indicative of positive serial correlation.
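A sketch of the Durbin-Watson computation from the residuals e of the fit above; statsmodels also provides the statistic directly:

```python
# Durbin-Watson statistic for first-order serial correlation in the residuals
import numpy as np
from statsmodels.stats.stattools import durbin_watson

DW = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)   # definition computed by hand
DW_check = durbin_watson(e)                     # same value from statsmodels
```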
Moments • Mean: μ = E(y) • Variance: σ² = E[(y − μ)²] • Skewness: S = E[(y − μ)³] / σ³ • Kurtosis: K = E[(y − μ)⁴] / σ⁴
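A sketch of the four sample moments computed with numpy and scipy (y here is the simulated series from the first sketch; any numeric series would do):

```python
# Sample moments of a series y
import numpy as np
from scipy.stats import skew, kurtosis

mean = np.mean(y)
var = np.var(y)                     # second central moment (divides by T)
S = skew(y)                         # E[(y - mu)^3] / sigma^3
K = kurtosis(y, fisher=False)       # E[(y - mu)^4] / sigma^4 (not excess kurtosis)
```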