Classical Regression Lecture 8
Today’s Plan • For the next few lectures we’ll be talking about the classical regression model • Looking at both estimators for a and b • Inferences on what a and b actually tell us • Today: how to operationalize the model • Looking at BLUE for the bivariate model • Inference and hypothesis tests using the t, F, and χ² distributions • Examples of linear regressions using Excel
Estimating coefficients • Our model: Y = a + bX + e • Two things to keep in mind about this model: 1) It is linear in both variables and parameters • Examples of non-linearity in variables: Y = a + bX² or Y = a + be^X • Example of non-linearity in parameters: Y = a + b²X • OLS can cope with non-linearity in variables but not in parameters
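As a concrete illustration of the linearity point, here is a minimal Python sketch (my own, not from the lecture; the data and variable names are made up): a model that is non-linear in the variable X, such as Y = a + bX² + e, can still be estimated by OLS simply by regressing Y on the transformed variable Z = X².

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data that is non-linear in the variable X but linear in the parameters:
# Y = a + b*X^2 + e
a_true, b_true = 2.0, 0.5
X = rng.uniform(0, 10, size=200)
e = rng.normal(0, 1, size=200)
Y = a_true + b_true * X**2 + e

# OLS copes with this: regress Y on the transformed variable Z = X^2
Z = X**2
z_dev = Z - Z.mean()                        # deviations from the mean
y_dev = Y - Y.mean()
b_hat = (z_dev @ y_dev) / (z_dev @ z_dev)   # slope estimate
a_hat = Y.mean() - b_hat * Z.mean()         # intercept estimate

print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}")  # close to (2.0, 0.5)

# By contrast, a model like Y = a + b^2 * X is non-linear in the parameter b,
# and cannot be handled by this linear least-squares formula.
```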
Estimating coefficients (3) 2) Notation: we’re not estimating a and b anymore • We are estimating coefficients, which are estimates of the parameters a and b • We will denote the coefficients as â (or α̂) and b̂ (or β̂) • We are dealing with a sample size of n • For each sample we will get a different â and b̂ pair
Estimating coefficients (4) • In the same way that you can take a sample to get an estimate of µY, you can take a sample to get an estimate of the regression line, of â and b̂
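To make the sampling idea concrete, here is a small simulation (my own sketch, not part of the original slides): each new sample drawn from the same population gives a different (â, b̂) pair, just as each sample gives a different estimate of µY.

```python
import numpy as np

rng = np.random.default_rng(1)
a_true, b_true, sigma = 1.0, 2.0, 1.5
X = np.linspace(0, 10, 30)          # X is treated as fixed (nonrandom)

def ols_fit(Y, X):
    """Return (a_hat, b_hat) from a bivariate OLS fit of Y on X."""
    x = X - X.mean()
    y = Y - Y.mean()
    b_hat = (x @ y) / (x @ x)
    a_hat = Y.mean() - b_hat * X.mean()
    return a_hat, b_hat

# Draw a few samples from the same population and fit each one
for s in range(5):
    Y = a_true + b_true * X + rng.normal(0, sigma, size=X.size)
    a_hat, b_hat = ols_fit(Y, X)
    print(f"sample {s}: a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}")
# Each sample yields a different (a_hat, b_hat) pair scattered around (1.0, 2.0).
```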
The independent variable • We also have a given variable X whose values are known • This is called the independent variable • Again, the expectation of Y given X is E(Y | X) = a + bX • With constant variance V(Y) = σ²
A graph of the model • [Figure: scatter of the data, showing an observation (Y1, X1), with Y on the vertical axis and X on the horizontal axis]
What does the error term do? • The error term gives us the test statistics and tells us how well the model Y = a + bX + e fits the data • The error term represents: 1) Randomness: since Y is a random variable and e is a function of Y, e is also random 2) Variables not included in the model 3) Random behavior of people 4) Measurement error 5) Parsimony: you don’t want all possible variables in the model if some have little or no influence
Rewriting beta • Our complete model is Y = a + bX + e • We will never know the true value of the error e, so we will estimate the following equation: Yi = â + b̂Xi + êi • For our known values of X we have estimates â, b̂, and êi • So how do we know that our OLS estimators give us the BLUE estimate? • To determine this we want to know the expected value of b̂ as an estimator of b, which is the population parameter
Rewriting beta (2) • To operationalize, we want to think of what we know • We know from lecture 2 that there should be no correlation between the errors and the independent variables • We also know E(e) = E(e | X) = 0 • Now we have that E(Y | X) = a + bX + E(e | X) = a + bX • The variance of Y given X is V(Y) = σ², so V(e | X) = σ²
Rewriting beta (3) • Rewriting b̂: in lecture 2 we found the following estimator for b: b̂ = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)² • Using some definitions we can show: E(b̂) = b
Rewriting beta (4) • We have definitions that we can use: xi = Xi − X̄ and yi = Yi − Ȳ, so that Σxi = 0 and Σyi = 0 • Using the definitions for yi and xi we can rewrite b̂ as b̂ = Σxiyi / Σxi² • We can also write b̂ = ΣxiYi / Σxi², since ΣxiȲ = Ȳ·Σxi = 0
Rewriting beta (5) • We can rewrite b̂ as b̂ = ΣciYi, where ci = xi / Σxj² • The properties of ci: Σci = 0, ΣciXi = Σcixi = 1, and Σci² = 1 / Σxi²
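These properties of the ci weights can be checked numerically. The following sketch (my own, with made-up data) computes ci = xi / Σxj², verifies Σci = 0, ΣciXi = 1, and Σci² = 1/Σxi², and confirms that ΣciYi reproduces the OLS slope.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=50)
Y = 1.0 + 2.0 * X + rng.normal(0, 1, size=50)

x = X - X.mean()                 # deviations from the mean
c = x / (x @ x)                  # the OLS weights c_i = x_i / sum(x_j^2)

print(np.isclose(c.sum(), 0.0))            # sum(c_i)     = 0
print(np.isclose(c @ X, 1.0))              # sum(c_i X_i) = 1
print(np.isclose(c @ c, 1.0 / (x @ x)))    # sum(c_i^2)   = 1 / sum(x_i^2)

# The weighted sum of the Y_i reproduces the usual OLS slope formula
b_hat_weights = c @ Y
b_hat_formula = (x @ (Y - Y.mean())) / (x @ x)
print(np.isclose(b_hat_weights, b_hat_formula))
```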
Showing unbiasedness • What do we know about the expected value of beta? E(b̂) = E(ΣciYi) • We can rewrite this as E(b̂) = E(Σci(a + bXi + ei)) • Multiplying the brackets out we get: E(b̂) = E(aΣci + bΣciXi + Σciei) • Since b is constant, E(b̂) = aΣci + bΣciXi + ΣciE(ei)
Showing unbiasedness (2) • Looking back at the properties for ci we know that Σci = 0 and ΣciXi = 1 • Now we can write this as E(b̂) = b + ΣciE(ei) = b, since E(ei) = 0 • We can conclude that the expected value of b̂ is b and that b̂ is an unbiased estimator of b
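A quick Monte Carlo check of this unbiasedness result (my own sketch, not from the slides): averaging b̂ over many repeated samples, with X held fixed, should come out very close to the true b.

```python
import numpy as np

rng = np.random.default_rng(3)
a_true, b_true, sigma = 1.0, 2.0, 1.5
X = np.linspace(0, 10, 30)       # X held fixed across replications
x = X - X.mean()

n_reps = 20_000
b_hats = np.empty(n_reps)
for r in range(n_reps):
    e = rng.normal(0, sigma, size=X.size)     # E(e) = 0, V(e) = sigma^2
    Y = a_true + b_true * X + e
    b_hats[r] = (x @ (Y - Y.mean())) / (x @ x)

print(f"mean of b_hat over {n_reps} samples: {b_hats.mean():.4f}")  # ~ 2.0
print(f"true b: {b_true}")
```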
Gauss Markov Theorem • We can now ask: is b̂ an efficient estimator? • The variance of b̂ is V(b̂) = σ²Σci² = σ² / Σxi², where ci = xi / Σxj² • How do we know that OLS is the most efficient estimator? • The Gauss-Markov Theorem
Gauss Markov Theorem (2) • Similar to our proof on the estimator for µY • Suppose we use a new weight ci′ = ci + di, giving a new estimator b* = Σci′Yi • We can take the expected value of b*: E(b*) = E(Σci′Yi)
Gauss Markov Theorem (3) • We know that E(b*) = aΣci′ + bΣci′Xi • For b* to be unbiased, the following must be true: Σci′ = 0 and Σci′Xi = 1, which requires Σdi = 0 and ΣdiXi = 0
Gauss Markov Theorem (4) • Efficiency (best)? • We have b* = Σci′Yi, where ci′ = ci + di • Therefore the variance of this new b* is V(b*) = σ²Σci′² = σ²(Σci² + Σdi² + 2Σcidi) = σ²Σci² + σ²Σdi², since Σcidi = 0 under the unbiasedness conditions • If each di ≠ 0, so that ci ≠ ci′, then V(b*) > V(b̂) • So when we use weights ci′ we have an inefficient estimator
Gauss Markov Theorem (5) • We can conclude that b̂ is BLUE: the Best Linear Unbiased Estimator of b
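To see the Gauss-Markov result in action, the sketch below (my own illustration, not from the slides) builds an alternative linear unbiased estimator b* = Σ(ci + di)Yi with Σdi = 0 and ΣdiXi = 0, and compares its sampling variance to that of the OLS slope; the OLS estimator should show the smaller variance, matching σ²/Σxi².

```python
import numpy as np

rng = np.random.default_rng(4)
a_true, b_true, sigma = 1.0, 2.0, 1.5
X = np.linspace(0, 10, 30)
x = X - X.mean()
c = x / (x @ x)                          # OLS weights c_i

# Build a perturbation d with sum(d_i) = 0 and sum(d_i X_i) = 0,
# by projecting an arbitrary vector off the span of {1, X}
raw = rng.normal(size=X.size)
basis = np.column_stack([np.ones_like(X), X])
d = raw - basis @ np.linalg.lstsq(basis, raw, rcond=None)[0]
c_alt = c + d                            # alternative (still unbiased) weights

n_reps = 20_000
b_ols = np.empty(n_reps)
b_alt = np.empty(n_reps)
for r in range(n_reps):
    Y = a_true + b_true * X + rng.normal(0, sigma, size=X.size)
    b_ols[r] = c @ Y
    b_alt[r] = c_alt @ Y

print(f"both unbiased: mean OLS = {b_ols.mean():.3f}, mean alt = {b_alt.mean():.3f}")
print(f"var OLS = {b_ols.var():.5f}  (theory: {sigma**2 / (x @ x):.5f})")
print(f"var alt = {b_alt.var():.5f}  (larger, as Gauss-Markov predicts)")
```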
Wrap up • What did we cover today? • Introduced the classical linear regression model (CLRM) • Assumptions under the CLRM 1) Xi is nonrandom (it’s given) 2) E(ei) = E(ei|Xi) = 0 3) V(ei) = V(ei|Xi) = σ² 4) Cov(ei, ej) = 0 for i ≠ j • Talked about estimating coefficients • Defined the properties of the error term • Proof by contradiction for the Gauss Markov Theorem
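As a closing worked example tying the pieces together (the plan mentions doing this in Excel; here is an equivalent sketch in Python with made-up data): fit the bivariate model, recover â and b̂, and estimate σ² from the residuals. The residual-variance estimate and standard error anticipate the inference topics listed in the plan rather than today's derivation.

```python
import numpy as np

rng = np.random.default_rng(5)

# Made-up data following the CLRM: Y = a + b*X + e with E(e|X) = 0, V(e|X) = sigma^2
X = rng.uniform(0, 10, size=100)
Y = 1.0 + 2.0 * X + rng.normal(0, 1.5, size=100)
n = X.size

x = X - X.mean()
y = Y - Y.mean()
b_hat = (x @ y) / (x @ x)                # OLS slope
a_hat = Y.mean() - b_hat * X.mean()      # OLS intercept

resid = Y - (a_hat + b_hat * X)          # estimated errors e_hat
sigma2_hat = (resid @ resid) / (n - 2)   # unbiased estimate of sigma^2
se_b = np.sqrt(sigma2_hat / (x @ x))     # standard error of b_hat

print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}")
print(f"sigma^2_hat = {sigma2_hat:.3f}, se(b_hat) = {se_b:.3f}")
```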