Welcome to BUAD 310 Instructor: Kam Hamidieh Lecture 19, Wednesday April 2, 2014
Agenda & Announcement • Today: • Continue with Chapters 19-22 • HW 5 is due today at 5 PM. Please do not ask for extensions. • HW 6 is up. It is due on Wednesday April 9, 5 PM. BUAD 310 - Kam Hamidieh
An Interesting Article http://www.latimes.com/sports/la-sp-1028-nba-stats-20131028,0,6776428.story#axzz2xbHNzfH4
Simple Linear Regression Model Study the relationship between two quantitative variables Y (response) and X (predictor): $y = \beta_0 + \beta_1 x + \varepsilon$ Estimated line from the least squares method: $\hat{y} = b_0 + b_1 x$ Residuals: $e = y - \hat{y}$
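To make the least squares step concrete, here is a minimal sketch using the standard closed-form estimates for the slope and intercept. The data values and the helper names `fit_line` and `residuals` are made up for illustration; they are not the Apple/S&P 500 data from the lecture.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept
    return b0, b1


def residuals(x, y, b0, b1):
    """Return the residuals e = y - y-hat for each observation."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)
e = residuals(x, y, b0, b1)
print("b0 =", round(b0, 4), "b1 =", round(b1, 4))
print("residuals:", [round(ei, 4) for ei in e])
```

With an intercept in the model, least squares residuals always sum to zero (up to rounding), which is a quick sanity check on any fit.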
Apple vs. S&P 500 (Fitted Apple Returns) = 0.0013 + 0.465 (S&P Returns)
Correlation • Recall correlation for data (slide 8, lecture 10): • r estimates the population correlation ρ (rho). • Can you guess the correlation for the Apple vs. S&P 500 data?
Interpretation of $r^2$ • The value of $r^2$: • Will always be between 0 and 1. WHY? • Will be unit-less. WHY? • $r^2$ can be interpreted as the percentage of the response variation accounted for by the regression. • Two questions: • Why do we care about response variation? • What is meant by “accounting for response variation”?
More on $r^2$ • $r^2$ measures only the degree of linear association. • $1 - r^2$ is the fraction of the variation that is left in the residuals. • In hard sciences, $r^2$ values near 1 are not unusual. In social sciences you typically get much lower values. • For the Apple example, $r^2 \approx 5\%$: about 5% of the variation in the Apple returns is accounted for by the predictor variable, the S&P 500 returns.
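The "fraction of variation" reading can be checked numerically. The sketch below, on made-up data, computes the squared correlation directly and also as 1 minus the ratio of residual variation (SSE) to total response variation (SST); the two agree for a least squares line. The variable names are illustrative assumptions.

```python
import math

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.5, 3.5, 3.0, 5.5, 4.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)   # total response variation (SST)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Sample correlation
r = sxy / math.sqrt(sxx * syy)

# Least squares line and the variation left in the residuals (SSE)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

r2_from_fit = 1 - sse / syy
print("r squared from correlation:", round(r ** 2, 6))
print("1 - SSE/SST:               ", round(r2_from_fit, 6))

# Going the other way for the Apple example: r^2 of about 0.05 and a
# positive slope imply r of roughly sqrt(0.05).
print("implied r for r^2 = 0.05:", round(math.sqrt(0.05), 3))
```

Because the fitted Apple slope (0.465) is positive, the implied correlation takes the positive square root.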
Residuals & Model Assumptions • Recall that the residuals e estimate the errors ε. • Recall that one assumption of the simple linear model is: $\varepsilon \sim N(0, \sigma_\varepsilon^2)$ • Then the ei’s should be approximately normally distributed *if* the linear model assumptions hold. • WHY?....SO WHAT?
Residuals & Model Assumptions • Check: • We can look at the histogram and QQ plots(?) of the residuals to see if the normality assumption holds. • We can also plot the residuals versus x to see if σε is approximately constant. We should not see any patterns. • If the above assumptions do not pan out, then our model is not appropriate. • Sometimes we can make some adjustments to correct for the violations.
Apple vs. S&P 500 Residual Plots I see no discernible pattern on the left plot. The residuals seem normally distributed.
QQ Plots • A QQ plot is a graph used to assess whether your data come from a normal population or not. • QQ = Quantile-Quantile • Many histograms look bell shaped but the data may not come from a normal population. (Recall the t distribution?) • You plot the quantiles of your data against the quantiles of a true normal distribution. If the points fall along a straight line, the data are consistent with a normal population. WHY?
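As a rough sketch of what a QQ plot is built from, the code below pairs each sorted data value with the matching quantile of a standard normal distribution. The plotting positions (i - 0.5)/n are one common convention and an assumption here; statistical software may use a slightly different one. The helper name `qq_points` and the simulated data are illustrative.

```python
import random
from statistics import NormalDist


def qq_points(data):
    """Return (theoretical normal quantile, sample quantile) pairs."""
    n = len(data)
    sample_q = sorted(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n
    theo_q = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theo_q, sample_q))


# Simulated normal data (n = 100), standing in for a set of residuals
random.seed(310)
simulated = [random.gauss(0.0, 0.0156) for _ in range(100)]

points = qq_points(simulated)
for theo, samp in points[:3]:
    print(f"{theo:.3f}  {samp:.4f}")
```

Plotting these pairs as a scatter plot gives the QQ plot. For data drawn from a normal population the sorted values grow roughly in proportion to the normal quantiles, which is why the points line up.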
Q-Q Plot of Apple/S&P 500 Residuals
Some Simulated QQ Plots (n=100)
Some Simulated QQ Plots (n=1000)
More on Residuals • We can estimate σε by using the residuals as follows: $S_e = \sqrt{\dfrac{\sum_{i=1}^{n} e_i^2}{n-2}}$ • Has various names: • Standard error of the regression • Root mean squared error (RMSE) • The standard deviation of the residuals measures the typical distance of the points from the line. • Note that Se will be in the same units as the response.
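A minimal sketch of this estimate, assuming the usual n - 2 divisor for simple regression (two estimated coefficients). The residual values and the function name are made up for illustration.

```python
import math


def standard_error_of_regression(residuals):
    """Estimate sigma_epsilon from simple-regression residuals.

    Assumes the n - 2 divisor, since b0 and b1 were both estimated.
    """
    n = len(residuals)
    if n <= 2:
        raise ValueError("need more than two residuals")
    sse = sum(e ** 2 for e in residuals)  # sum of squared residuals
    return math.sqrt(sse / (n - 2))


# Hypothetical residuals, in the units of the response
example_residuals = [0.5, -1.0, 0.25, 0.75, -0.5]
se = standard_error_of_regression(example_residuals)
print("Se =", round(se, 4))
```

Since the residuals are in the units of the response and are only squared, summed, and square-rooted, Se comes out in those same units.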
Summarizing the Fit for Apple & S&P 500 [Regression output, with callouts marking r and $r^2$, Se, and b0 and b1.] Se: On average, Apple returns varied from the estimated regression line by about 1.6%.
Another use of Se: Unusual Points • You can use Se as a way to see how unusual a point is: • Pick a point of interest. • Find its distance from the line: the residual value! • Divide this distance by Se. • The result tells you how many standard deviation units an observation falls from the line. (Large values indicate unusual observations.)
Unusual Points - Example Point of interest: (0.005, 0.032). The residual at x = 0.005: $e = 0.032 - (0.0013 + 0.465 \times 0.005) \approx 0.0284$. We know: Se = 0.0156. The point falls 0.0284/0.0156 = 1.82 standard deviation units above the line.
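The arithmetic in this example can be reproduced from the fitted Apple line given earlier in the lecture (intercept 0.0013, slope 0.465) and Se = 0.0156. The variable names below are illustrative.

```python
# Fitted line from the lecture: (Fitted Apple Returns) = 0.0013 + 0.465 (S&P Returns)
b0 = 0.0013
b1 = 0.465
se = 0.0156              # standard error of the regression

x0, y0 = 0.005, 0.032    # the point of interest

fitted = b0 + b1 * x0    # value on the line at x0
residual = y0 - fitted   # vertical distance of the point from the line
units = residual / se    # distance in standard deviation units

print("fitted value:", round(fitted, 4))
print("residual:    ", round(residual, 4))
print("SD units:    ", round(units, 2))
```

A positive result means the point lies above the line; a negative one would mean it lies below.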
Example On Homework 6, Problem 12. (YouTube video is posted.)
In Class Exercise 1 Consumer Reports tests different point-and-shoot digital cameras every year. The overall scores for the tested cameras range from 0 to 100, with higher scores indicating better overall test results. Here is the main question we are interested in answering: Will a consumer really get a better camera by spending more? The data is shown on the next slide, followed by the simple regression output. Here, let x = price, y = score. Answer the following questions: • Does the relationship look reasonably linear? • What are the estimates of β0 and β1? • Write down the equation for the estimated line. • Interpret the slope and the intercept. • Interpret the $r^2$. • What is an estimate of σε? Interpret this value. • What is the predicted score of a camera that costs $250? (Also verify by the line.) • Using the residual plots, comment on the assumption that residuals come from a normal distribution with a constant standard deviation. • Will a consumer get a better camera by spending more? Can you quantify this? • Any other comments?
Next Time • Inference for Simple Linear Regression and Transformations.