Quantitative Methods 2 Lecture 3 The Simple Linear Regression Model Edmund Malesky, Ph.D., UCSD
What is Ordinary Least Squares? • Ordinary Least Squares (OLS) finds the linear model that minimizes the sum of the squared errors. • Such a model provides the best explanation/prediction of the data. • Later we’ll show that OLS is the “Best Linear Unbiased Estimator” (BLUE)
“Explained” and “Unexplained” Variation [Figure: plot of Y against X marking a single observation yi at Xi]
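“Explained” and “Unexplained” Variation [Figure: plot of Y against X marking a single observation yi at Xi] • Total deviation (yi minus the mean of y): square this quantity and sum across all observations and we have our SST (Total Sum of Squares) • Residual (yi minus the fitted value): square this quantity and sum across all observations and we have our SSR (Residual Sum of Squares) • Explained deviation (fitted value minus the mean of y): square this quantity and sum across all observations and we have our SSE (Explained Sum of Squares)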
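The three sums of squares can be checked numerically. The sketch below uses a small hypothetical data set (not from the lecture) and NumPy's `polyfit` as a stand-in for the least-squares fit, then confirms that the total variation splits into an explained and an unexplained part.

```python
import numpy as np

# Hypothetical data for illustration only (not the lecture's data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares line; for deg=1 polyfit returns [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x

sst = np.sum((y - y.mean()) ** 2)      # total variation
sse = np.sum((y_hat - y.mean()) ** 2)  # explained variation
ssr = np.sum((y - y_hat) ** 2)         # residual (unexplained) variation

# For a least-squares line with an intercept, SST = SSE + SSR.
print(sst, sse + ssr)
```

The identity holds only for the least-squares line with an intercept; for an arbitrary line the two printed numbers would generally differ.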
Some Useful Properties of Summation • Proofs of properties 7 and 8 are in Appendix A of Wooldridge
Minimizing the Sum of Squared Errors • How to put the Least in OLS? • In mathematical jargon we seek to minimize the residual sum of squares (SSR), where: $SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
Picking the Parameters • To minimize SSR, we need parameter estimates. • In calculus, if you wish to know where a function is at its minimum, you take the first derivative and set it equal to zero. • In this case we must take partial derivatives since we have two parameters (β0 & β1) to worry about.
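Minimize the Squared Errors • The SSR function is: $SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ • Substitute in our equation for $\hat{y}_i$, which is $\hat{y}_i = \beta_0 + \beta_1 x_i$: $SSR = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$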
Here comes the magic, Baby! • Simplify Terms • Partial Derivative with respect to β0 • Partial Derivative with respect to β1
Simplify Terms • Separate terms using $(A-B)^2 = A^2 - 2AB + B^2$ • This comes from $(A-B)^2 = A^2 - BA - AB + B^2$: First, Outside, Inside, Last (F.O.I.L.) • Apply F.O.I.L. to the remaining squared term • Multiply $-2y_i$ through
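The three steps on this slide can be written out as follows. This is a sketch that assumes the fitted line $\hat{y}_i = \beta_0 + \beta_1 x_i$ and treats $A = y_i$ and $B = \beta_0 + \beta_1 x_i$; the notation is an assumption, not a copy of the slide's equations.

```latex
\begin{aligned}
SSR &= \sum_{i=1}^{n} \bigl( y_i - (\beta_0 + \beta_1 x_i) \bigr)^2 \\
    &= \sum_{i=1}^{n} \Bigl[ y_i^2 - 2 y_i (\beta_0 + \beta_1 x_i) + (\beta_0 + \beta_1 x_i)^2 \Bigr]
       && \text{separate terms} \\
    &= \sum_{i=1}^{n} \Bigl[ y_i^2 - 2 y_i (\beta_0 + \beta_1 x_i) + \beta_0^2 + 2 \beta_0 \beta_1 x_i + \beta_1^2 x_i^2 \Bigr]
       && \text{F.O.I.L.} \\
    &= \sum_{i=1}^{n} \Bigl[ y_i^2 - 2 \beta_0 y_i - 2 \beta_1 x_i y_i + \beta_0^2 + 2 \beta_0 \beta_1 x_i + \beta_1^2 x_i^2 \Bigr]
       && \text{multiply } -2 y_i \text{ through}
\end{aligned}
```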
Partial Derivative with respect to β0 • Take the derivative only of terms which include β0 • Simplify
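A sketch of this step, assuming the expanded objective $SSR = \sum_{i} [\, y_i^2 - 2\beta_0 y_i - 2\beta_1 x_i y_i + \beta_0^2 + 2\beta_0\beta_1 x_i + \beta_1^2 x_i^2 \,]$. Only the terms containing $\beta_0$ survive differentiation:

```latex
\begin{aligned}
\frac{\partial SSR}{\partial \beta_0}
  &= \sum_{i=1}^{n} \bigl( -2 y_i + 2 \beta_0 + 2 \beta_1 x_i \bigr) \\
  &= -2 \sum_{i=1}^{n} \bigl( y_i - \beta_0 - \beta_1 x_i \bigr)
\end{aligned}
```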
Partial Derivative with respect to β1 • Take the derivative only of terms which include β1 • Simplify
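The same idea applied to $\beta_1$, again assuming the expanded objective $SSR = \sum_{i} [\, y_i^2 - 2\beta_0 y_i - 2\beta_1 x_i y_i + \beta_0^2 + 2\beta_0\beta_1 x_i + \beta_1^2 x_i^2 \,]$. The factored form in the last line is one of several equivalent ways to write it:

```latex
\begin{aligned}
\frac{\partial SSR}{\partial \beta_1}
  &= \sum_{i=1}^{n} \bigl( -2 x_i y_i + 2 \beta_0 x_i + 2 \beta_1 x_i^2 \bigr) \\
  &= 2 \sum_{i=1}^{n} (-x_i) \bigl( y_i - \beta_0 - \beta_1 x_i \bigr)
\end{aligned}
```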
Partial Derivatives for β0 and β1 • First equation is the partial derivative with respect to β0 • Second equation is with respect to β1
Simplify and Set Equal to Zero • First equation is for β0, second is for β1 • Set = 0 to find minimum point • (Hats denote that parameters are estimates)
The Normal Equations • Divide equation 1 by -2 and equation 2 by 2 • Multiply through by -x in β1’s equation • Separate summation terms and rearrange to yield:
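Written out, and assuming the first-order conditions take the factored form shown in the first pair of lines, these steps lead to the two normal equations in their usual form:

```latex
\begin{aligned}
&\text{First-order conditions:} &
-2 \sum_{i=1}^{n} \bigl( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \bigr) &= 0, &
2 \sum_{i=1}^{n} (-x_i) \bigl( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \bigr) &= 0 \\
&\text{Divide and multiply through:} &
\sum_{i=1}^{n} \bigl( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \bigr) &= 0, &
\sum_{i=1}^{n} \bigl( -x_i y_i + \hat{\beta}_0 x_i + \hat{\beta}_1 x_i^2 \bigr) &= 0 \\
&\text{Normal equations:} &
\sum_{i=1}^{n} y_i &= n \hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} x_i, &
\sum_{i=1}^{n} x_i y_i &= \hat{\beta}_0 \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2
\end{aligned}
```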
Solving the Normal Equations • Now we have two equations with two unknown terms: β0 and β1 • These can be solved using algebra to calculate the values of both β0 and β1
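Because the normal equations are linear in the two unknowns, they can also be handed to a linear solver. The sketch below assumes the standard form $\sum y_i = n\beta_0 + \beta_1 \sum x_i$ and $\sum x_i y_i = \beta_0 \sum x_i + \beta_1 \sum x_i^2$; the function name `solve_normal_equations` and the data are hypothetical.

```python
import numpy as np

def solve_normal_equations(x, y):
    """Solve the two normal equations of simple regression as a 2x2 system."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # Coefficient matrix: rows are the first and second normal equation.
    A = np.array([[n,       x.sum()],
                  [x.sum(), np.sum(x ** 2)]])
    # Right-hand side: sum of y and sum of x*y.
    b = np.array([y.sum(), np.sum(x * y)])
    beta0, beta1 = np.linalg.solve(A, b)
    return beta0, beta1

# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
beta0_hat, beta1_hat = solve_normal_equations(x, y)
print(beta0_hat, beta1_hat)
```

The system has a unique solution only when the x values are not all identical; otherwise the matrix is singular and `np.linalg.solve` raises an error.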
Solving for β1 • Multiply first normal equation by the sum of xi • Multiply second normal equation by n.
Still Solving for β1 … • Now subtract the first equation from the second • This yields:
Still Solving for β1 … • The β0 terms cancel one another out • Then we factor out β1 from both terms on the right-hand side • Then divide through by the quantity multiplying β1 on the right-hand side to yield:
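The elimination can be sketched as follows, assuming the normal equations $\sum y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum x_i$ and $\sum x_i y_i = \hat{\beta}_0 \sum x_i + \hat{\beta}_1 \sum x_i^2$:

```latex
\begin{aligned}
\Bigl(\sum x_i\Bigr)\Bigl(\sum y_i\Bigr) &= n \hat{\beta}_0 \sum x_i + \hat{\beta}_1 \Bigl(\sum x_i\Bigr)^2
  && \text{first equation times } \textstyle\sum x_i \\
n \sum x_i y_i &= n \hat{\beta}_0 \sum x_i + n \hat{\beta}_1 \sum x_i^2
  && \text{second equation times } n \\
n \sum x_i y_i - \Bigl(\sum x_i\Bigr)\Bigl(\sum y_i\Bigr)
  &= \hat{\beta}_1 \Bigl[ n \sum x_i^2 - \Bigl(\sum x_i\Bigr)^2 \Bigr]
  && \text{subtract and factor} \\
\hat{\beta}_1 &= \frac{n \sum x_i y_i - \bigl(\sum x_i\bigr)\bigl(\sum y_i\bigr)}{n \sum x_i^2 - \bigl(\sum x_i\bigr)^2}
\end{aligned}
```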
A Solution for β1 • Now multiply numerator & denominator by 1/n • Tricky: multiply by $\frac{1}{n} \cdot 1 = \frac{1}{n} \cdot \frac{n}{n} = \frac{n}{n^2}$ • Why? The term is a product of three separate numbers, so I can’t simply split the 1/n across the factors. Imagine I want ½(5 × 10) = 25. I can’t get it by multiplying ½(5) × ½(10), which equals 12.5. That is like multiplying by ¼. I need to multiply by ½ × 1 = ½ × 2/2 = 2/4. Now, 2(½ × 5)(½ × 10) = 25 • Recall that: $\bar{x} = \frac{1}{n}\sum x_i$ and $\bar{y} = \frac{1}{n}\sum y_i$ • This yields:
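A sketch of the result, starting from $\hat{\beta}_1 = \bigl[n\sum x_i y_i - \sum x_i \sum y_i\bigr] / \bigl[n\sum x_i^2 - (\sum x_i)^2\bigr]$. The final step uses the summation identities $\sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - n\bar{x}\bar{y}$ and $\sum (x_i - \bar{x})^2 = \sum x_i^2 - n\bar{x}^2$:

```latex
\begin{aligned}
\hat{\beta}_1
  &= \frac{\sum x_i y_i - \tfrac{1}{n} \bigl(\sum x_i\bigr)\bigl(\sum y_i\bigr)}
          {\sum x_i^2 - \tfrac{1}{n} \bigl(\sum x_i\bigr)^2} \\
  &= \frac{\sum x_i y_i - n \bar{x} \bar{y}}{\sum x_i^2 - n \bar{x}^2} \\
  &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\end{aligned}
```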
Now Solving for β0 • Take the first normal equation • Then divide both sides by n and rearrange to yield:
A Solution for β0 • Now recall again that: $\bar{x} = \frac{1}{n}\sum x_i$ and $\bar{y} = \frac{1}{n}\sum y_i$ • Thus:
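In symbols, and assuming the first normal equation is $\sum y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum x_i$, the step can be sketched as:

```latex
\begin{aligned}
\frac{1}{n} \sum_{i=1}^{n} y_i &= \hat{\beta}_0 + \hat{\beta}_1 \, \frac{1}{n} \sum_{i=1}^{n} x_i \\
\bar{y} &= \hat{\beta}_0 + \hat{\beta}_1 \bar{x} \\
\hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x}
\end{aligned}
```

One consequence is that the fitted line always passes through the point of means $(\bar{x}, \bar{y})$.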
But What Does It Mean? • Equation for β1 may not seem to make intuitive sense at first • But if we break it down into pieces, we can begin to see the logic
Understanding What Makes β1 • Numerator for β1 is made up of TWO parts • Deviations of X from its mean • Deviations of Y from its mean • Then we multiply those deviations • And sum them up across all observations • We know this as… Covariance (up to division by n − 1).
Understanding What Makes β1 • Denominator of β1 is made up of the deviation of x from its mean times itself • We square this term. • And sum up across all observations • We know this as… Variance in the Independent Variable (up to division by n − 1)
Understanding What Makes β1 • Thus β1 is made up of changes in x times changes in y, divided by changes in x squared • A.K.A. “rise over run” • Notice that if the changes in x are EQUAL to the changes in y, then β1 = 1
Understanding What Makes β1 • If the changes in y are LARGER than the changes in x, then β1 > 1 • i.e., a 1-unit change in x creates more than a 1-unit change in y • If the changes in y are SMALLER than the changes in x, then β1 < 1 • i.e., a 1-unit change in x creates less than a 1-unit change in y
Understanding What Makes β1 • This corresponds to our intuitive understanding of the slope of a line • How much change in y do we observe for each change in x? • We can also see that β1 is measured in units of the dependent variable per unit of the independent variable. • It is changes in the dependent variable over changes in the independent variable
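The covariance-over-variance reading of β1 can be checked directly. The sketch below uses hypothetical data and NumPy's sample covariance and variance (both dividing by n − 1), and compares the result with the ratio of raw sums of deviations; the n − 1 factors cancel, so the two should agree.

```python
import numpy as np

# Hypothetical data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Sample covariance of x and y, and sample variance of x (both use n - 1).
cov_xy = np.cov(x, y, ddof=1)[0, 1]
var_x = np.var(x, ddof=1)
beta1_from_moments = cov_xy / var_x

# The same slope from the raw sums of deviations.
beta1_from_sums = (np.sum((x - x.mean()) * (y - y.mean()))
                   / np.sum((x - x.mean()) ** 2))

print(beta1_from_moments, beta1_from_sums)
```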
Calculating β0 & β1 • Mean of x is 4 • Mean of y is 14
Calculating β0 & β1 • $\sum (x_i - \bar{x})(y_i - \bar{y}) = 186$ • $\sum (x_i - \bar{x})^2 = 62$ • $\hat{\beta}_1 = 186 / 62 = 3$
Calculating β0 and β1 • β0 = mean of y - β1 (mean of x) • Recall that mean of y = 14 & mean of x = 4 • β0 = 14 - 3(4) • β0 = 2 • Our equation is: y = 2 + 3x
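The arithmetic can be traced step by step in code. The data table from the lecture is not reproduced here, so the five points below are hypothetical, chosen only so that they have the same means (4 and 14) and the same sums (186 and 62) as the example.

```python
# Hypothetical data chosen to match the example's summary numbers;
# this is NOT the lecture's data table.
x = [0, 1, 4, 5, 10]
y = [2, 5, 14, 17, 32]
n = len(x)

x_bar = sum(x) / n  # mean of x
y_bar = sum(y) / n  # mean of y

# Numerator: sum of cross-products of deviations from the means.
numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
# Denominator: sum of squared deviations of x from its mean.
denominator = sum((xi - x_bar) ** 2 for xi in x)

beta1_hat = numerator / denominator
beta0_hat = y_bar - beta1_hat * x_bar

print(x_bar, y_bar)            # 4.0 14.0
print(numerator, denominator)  # 186.0 62.0
print(beta0_hat, beta1_hat)    # 2.0 3.0
```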
Calculating R² • Let’s return to SSR • Plug in β0 and β1 and solve to get SST, SSR, and SSE
Calculating R² • R² = SSE/SST • Our model perfectly explains the variation in y, so R² = 1.
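A short sketch of the R² calculation, using the slope and intercept formulas β1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and β0 = ȳ − β1x̄. The function name `r_squared` and both data sets are hypothetical: the first lies exactly on y = 2 + 3x, and the second moves two points off that line to show what an imperfect fit looks like.

```python
import numpy as np

def r_squared(x, y):
    """Fit a simple regression of y on x and return SSE / SST."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    beta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    beta0 = y_bar - beta1 * x_bar
    y_hat = beta0 + beta1 * x
    sse = np.sum((y_hat - y_bar) ** 2)  # explained sum of squares
    sst = np.sum((y - y_bar) ** 2)      # total sum of squares
    return sse / sst

# Hypothetical data, not the lecture's table.
x = [0, 1, 4, 5, 10]
y_exact = [2, 5, 14, 17, 32]  # lies exactly on y = 2 + 3x
y_noisy = [2, 6, 13, 17, 32]  # two points nudged off the line

r2_exact = r_squared(x, y_exact)
r2_noisy = r_squared(x, y_noisy)
print(r2_exact, r2_noisy)
```

When every point sits on the fitted line, SSR is zero and SSE equals SST, which is why the ratio comes out to 1; any scatter around the line pushes it below 1.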