Lecture 4 Econ 488
Ordinary Least Squares (OLS) • Objective of OLS: Minimize the sum of squared residuals, Σei² • where ei = Yi − Ŷi is the residual for observation i • Remember that OLS is not the only possible estimator of the βs. • But OLS is the best estimator under certain assumptions…
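A minimal numerical sketch of this minimization (the simulated data, variable names, and coefficient values are assumptions for illustration, not from the lecture):

```python
import numpy as np

# Simulated data from Yi = 2 + 3*Xi + eps_i (true betas chosen arbitrarily)
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
y = 2 + 3 * x + rng.normal(0, 1, n)

# OLS picks the beta-hats that minimize the sum of squared residuals ei = Yi - Yhat_i.
# With a constant included, the minimizer solves the normal equations (X'X)b = X'y.
X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

residuals = y - X @ beta_hat
print("beta_hat:", beta_hat)                        # close to (2, 3)
print("sum of squared residuals:", residuals @ residuals)
```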
Classical Assumptions • 1. Regression is linear in parameters • 2. Error term has zero population mean • 3. Error term is not correlated with X’s • 4. No serial correlation • 5. No heteroskedasticity • 6. No perfect multicollinearity • and we usually add: • 7. Error term is normally distributed
Assumption 1: Linearity • The regression model: • A) is linear • It can be written as Yi = β0 + β1X1i + β2X2i + … + βKXKi + εi • This doesn’t mean that the theory must be linear • For example… suppose we believe that CEO salary is related to the firm’s sales and CEO’s tenure. • We might believe the model is something like salaryi = β0 + β1log(salesi) + β2tenurei + β3tenurei² + εi, which is nonlinear in the variables but still linear in the parameters.
Assumption 1: Linearity • The regression model: • B) is correctly specified • The model must have the right variables • No omitted variables • The model must have the correct functional form • This is all untestable. We need to rely on economic theory.
Assumption 1: Linearity • The regression model: • C) must have an additive error term • The model must have an additive + εi term
Assumption 2: E(εi)=0 • Error term has a zero population mean • E(εi)=0 • Each observation has a random error with a mean of zero • What if E(εi)≠0? • This is actually fixed by adding a constant (AKA intercept) term
Assumption 2: E(εi)=0 • Example: Suppose instead the mean of εi was -4. • Then we know E(εi+4)=0 • We can add 4 to the error term and subtract 4 from the constant term: • Yi =β0+ β1Xi+εi • Yi =(β0-4)+ β1Xi+(εi+4)
Assumption 2: E(εi)=0 • Yi =β0+ β1Xi+εi • Yi =(β0-4)+ β1Xi+(εi+4) • We can rewrite: • Yi =β0*+ β1Xi+εi* • Where β0*= β0-4 and εi*=εi+4 • Now E(εi*)=0, so we are OK.
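A quick simulation of this re-parameterization (a sketch; the numbers below are assumptions chosen to mirror the slide’s −4 example):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(0, 1, n)
eps = rng.normal(-4, 1, n)          # error term with mean -4 instead of 0
y = 5 + 2 * x + eps                 # true beta0 = 5, beta1 = 2

X = np.column_stack([np.ones(n), x])
b0_hat, b1_hat = np.linalg.solve(X.T @ X, X.T @ y)

# The slope is still ~2, but the intercept estimates beta0* = beta0 - 4 = 1:
# the constant term absorbs the nonzero mean of the error.
print(b0_hat, b1_hat)               # roughly 1.0 and 2.0
```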
Assumption 3: Exogeneity • Important!! • All explanatory variables are uncorrelated with the error term • E(εi|X1i, X2i, …, XKi) = 0 • Explanatory variables are determined outside of the model (They are exogenous)
Assumption 3: Exogeneity • What happens if assumption 3 is violated? • Suppose we have the model, • Yi =β0+ β1Xi+εi • Suppose Xi and εi are positively correlated • When Xi is large, εi tends to be large as well.
Assumption 3: Exogeneity • [Figures: scatterplot of the data around the “true” line, with the estimated line added. Because Xi and εi are positively correlated, the estimated line is pulled away from the true line.]
Assumption 3: Exogeneity • Why would x and ε be correlated? • Suppose you are trying to study the relationship between the price of a hamburger and the quantity sold across a wide variety of Ventura County restaurants.
Assumption 3: Exogeneity • We estimate the relationship using the following model: • salesi= β0+β1pricei+εi • What’s the problem?
Assumption 3: Exogeneity • What’s the problem? • What else determines sales of hamburgers? • How would you decide between buying a burger at McDonald’s ($0.89) or a burger at TGI Fridays ($9.99)? • Quality differs • In salesi = β0 + β1pricei + εi, quality isn’t an X variable even though it should be. • It becomes part of εi
Assumption 3: Exogeneity • What’s the problem? • But price and quality are highly positively correlated • Therefore x and ε are also positively correlated. • This means that the estimate of β1 will be too high • This is called “Omitted Variables Bias” (More in Chapter 6)
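A small simulation of this story (a sketch; the quality variable, coefficient values, and the strength of the price–quality correlation are assumptions used only to illustrate the direction of the bias):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
quality = rng.uniform(0, 10, n)
price = 1 + 0.8 * quality + rng.normal(0, 0.5, n)             # price rises with quality
sales = 100 - 5 * price + 6 * quality + rng.normal(0, 5, n)   # true price effect: -5

# Regression omitting quality: quality ends up in the error term,
# so price is correlated with the error and exogeneity fails.
X_short = np.column_stack([np.ones(n), price])
b_short = np.linalg.solve(X_short.T @ X_short, X_short.T @ sales)

# Regression including quality recovers the true coefficients.
X_long = np.column_stack([np.ones(n), price, quality])
b_long = np.linalg.solve(X_long.T @ X_long, X_long.T @ sales)

print("price coefficient, quality omitted:", b_short[1])   # biased upward, well above -5
print("price coefficient, quality included:", b_long[1])   # close to -5
```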
Assumption 4: No Serial Correlation • Serial Correlation: The error terms across observations are correlated with each other • i.e. ε1 is correlated with ε2, etc. • This is most important in time series • If errors are serially correlated, an increase in the error term in one time period affects the error term in the next.
Assumption 4: No Serial Correlation • The assumption that there is no serial correlation can be unrealistic in time series • Think of data from a stock market…
Assumption 4: No Serial Correlation Stock data is serially correlated!
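A sketch of what serially correlated errors look like in a simulation (the AR(1) error structure and the 0.8 persistence parameter are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 1000
rho = 0.8                              # persistence of the error term

# AR(1) errors: eps_t = rho * eps_{t-1} + u_t
u = rng.normal(0, 1, T)
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + u[t]

# Consecutive errors are strongly correlated, violating assumption 4:
print(np.corrcoef(eps[1:], eps[:-1])[0, 1])   # roughly 0.8, not 0
```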
Assumption 5: Homoskedasticity • Homoskedasticity: The error has a constant variance • This is what we want…as opposed to • Heteroskedasticity: The variance of the error depends on the values of Xs.
Assumption 5: Homoskedasticity Homoskedasticity: The error has constant variance
Assumption 5: Homoskedasticity Heteroskedasticity: Spread of error depends on X.
Assumption 5: Homoskedasticity Another form of Heteroskedasticity
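A sketch of the difference between the two cases (the specific variance function is an assumption chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10_000
x = rng.uniform(1, 10, n)

eps_homo = rng.normal(0, 2, n)            # constant variance regardless of x
eps_hetero = rng.normal(0, 0.5 * x, n)    # standard deviation grows with x

# Compare the error spread for small vs. large x
small, large = x < 3, x > 8
print("homoskedastic sd  (small x, large x):", eps_homo[small].std(), eps_homo[large].std())
print("heteroskedastic sd (small x, large x):", eps_hetero[small].std(), eps_hetero[large].std())
```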
Assumption 6: No Perfect Multicollinearity • Two variables are perfectly collinear if one can be determined perfectly from the other (i.e. if you know the value of x, you can always find the value of z). • Example: If we regress income on age, and include both age in months and age in years. • But age in years = age in months/12 • e.g. if we know someone is 246 months old, we also know that they are 20.5 years old.
Assumption 6: No Perfect Multicollinearity • What’s wrong with this? • incomei= β0 + β1agemonthsi + β2ageyearsi + εi • What is β1? • It is the change in income associated with a one unit increase in “age in months,” holding age in years constant. • But if you hold age in years constant, age in months doesn’t change!
Assumption 6: No Perfect Multicollinearity • β1 = Δincome/Δagemonths • Holding Δageyears = 0 • If Δageyears = 0, then Δagemonths = 0 • So β1 = Δincome/0 • It is undefined!
Assumption 6: No Perfect Multicollinearity • When one independent variable is a perfect linear combination of the other independent variables, we have perfect multicollinearity • Example: Total Cholesterol, HDL and LDL • Total Cholesterol = LDL + HDL • Can’t include all three as independent variables in a regression. • Solution: Drop one of the variables.
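A sketch of why OLS breaks down under perfect multicollinearity, using the age example (the simulated ages are assumptions; the point is that the design matrix loses full rank):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
age_months = rng.integers(240, 720, n).astype(float)
age_years = age_months / 12.0                   # an exact linear function of age_months

X = np.column_stack([np.ones(n), age_months, age_years])

# The columns of X are linearly dependent, so X'X cannot be inverted and
# there is no unique solution for the three coefficients.
print("column rank of X:", np.linalg.matrix_rank(X))         # 2, not 3
print("condition number of X'X:", np.linalg.cond(X.T @ X))   # astronomically large
```

Many regression packages handle this by dropping one of the collinear variables automatically, which matches the “drop one of the variables” solution above.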
Assumption 7: Normally Distributed Error • This is not required for OLS, but it is important for hypothesis testing • More on this assumption next time.
Putting it all together • Last class, we talked about how to compare estimators. We want: • 1. β̂ is unbiased: E(β̂) = β • on average, the estimator is equal to the population value • 2. β̂ is efficient • The variance of the estimator is as small as possible
Gauss-Markov Theorem • Given OLS assumptions 1 through 6, the OLS estimator of βk is the minimum variance estimator from the set of all linear unbiased estimators of βk for k=0,1,2,…,K • OLS is BLUE • The Best, Linear, Unbiased Estimator
Gauss-Markov Theorem • What happens if we add assumption 7? • Given assumptions 1 through 7, OLS is the best unbiased estimator • Even out of the non-linear estimators • OLS is BUE?
Gauss-Markov Theorem • With Assumptions 1-7, OLS is: • 1. Unbiased: E(β̂k) = βk • 2. Minimum Variance – the variance of the sampling distribution is as small as possible • 3. Consistent – as n→∞, the estimators converge to the true parameters • As n increases, variance gets smaller, so each estimate approaches the true value of β. • 4. Normally Distributed – you can apply standard statistical tests to them.
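A Monte Carlo sketch of unbiasedness and consistency (the true parameters, sample sizes, and number of replications are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(6)
true_b0, true_b1 = 1.0, 0.5

def ols_slope(n):
    """One simulated sample of size n; returns the OLS slope estimate."""
    x = rng.uniform(0, 10, n)
    y = true_b0 + true_b1 * x + rng.normal(0, 1, n)
    X = np.column_stack([np.ones(n), x])
    return np.linalg.solve(X.T @ X, X.T @ y)[1]

for n in (25, 100, 1000):
    estimates = np.array([ols_slope(n) for _ in range(2000)])
    # Unbiased: the average estimate is ~0.5 at every sample size.
    # Consistent: the spread of the estimates shrinks as n grows.
    print(f"n={n:5d}  mean={estimates.mean():.3f}  sd={estimates.std():.3f}")
```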