Least Squares Regression Line (LSRL) Presentation 2-5
Introduction • [Scatterplot: Size of Diamond vs. Price] • Many times the scatterplot shows some pattern in the data. • For now, we will look at the analysis of data that falls in a straight line pattern.
Introduction • When we see a straight line pattern, we want to model the data with a linear equation. • This will allow us to make predictions and actually use our data.
Linear Relations • We know lines from algebra in the form y = mx + b, where m is the slope and b is the y-intercept. • In statistics, we use y = a + bx for the equation of a straight line. Now a is the intercept and b is the slope. • The slope (b) of the line is the amount by which y changes when x increases by 1 unit. • This interpretation is very important. • The intercept (a), sometimes called the vertical intercept, is the height of the line when x = 0.
Example • Consider the equation: y = 7 + 3x • [Graph of y = 7 + 3x: a = 7 is the height of the line at x = 0; y increases by b = 3 when x increases by 1] • The slope is 3. • For every increase of 1 in the x-variable, there will be an increase of 3 in the y-variable. • The intercept is 7. • When the x-variable is 0, the y-variable is 7.
Example • Consider the equation: y = 17 - 4x • [Graph of y = 17 - 4x: a = 17 is the height of the line at x = 0; y changes by b = -4 when x increases by 1] • The slope is -4. • For every increase of 1 in the x-variable, there will be a decrease of 4 in the y-variable. • The intercept is 17. • When the x-variable is 0, the y-variable is 17.
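The two example lines can be checked numerically. This is a minimal sketch; the helper name `line` is made up for illustration and is not part of the presentation.

```python
def line(a, b):
    """Return a function that computes y = a + b*x."""
    return lambda x: a + b * x

rising = line(7, 3)     # y = 7 + 3x
falling = line(17, -4)  # y = 17 - 4x

# Intercept: the height of the line when x = 0.
print(rising(0), falling(0))  # 7 17

# Slope: the change in y when x increases by 1 unit.
print(rising(5) - rising(4), falling(5) - falling(4))  # 3 -4
```

Any two x-values that are 1 unit apart give the same change in y, which is what makes the slope interpretation work.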
Least Squares Line • How can we find the best line to fit the data? • We would like the points, taken together, to be as close to the line as possible. • The distance is measured vertically from each point to the line. • Go to the following applet and start plotting points to see how this process works. • Applet: Make your own regression line
Least Squares Line • You first get a line once you plot two points. • When you plot the third, green bars appear representing the error (properly called the residual) of the line at each point. • These show how far off your line is for each of the points. • The least squares line is the one that minimizes the total of the squared lengths of the green bars (all put together).
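The green bars can be imitated in a few lines of code. This is a sketch only: the three points and the two candidate lines are made up for illustration, and the function names are hypothetical. The least squares criterion adds up the squares of the bar lengths.

```python
def residuals(points, a, b):
    """Vertical distance from each point to the line y = a + b*x (observed minus predicted)."""
    return [y - (a + b * x) for x, y in points]

def sum_squared_residuals(points, a, b):
    """Total of the squared residuals; smaller means a better fit."""
    return sum(e ** 2 for e in residuals(points, a, b))

# Made-up points, not data from the presentation.
points = [(1, 2), (2, 3), (3, 5)]

# Candidate line 1: y = 1 + 1x passes through the first two points exactly.
print(residuals(points, 1, 1))              # [0, 0, 1]
print(sum_squared_residuals(points, 1, 1))  # 1

# Candidate line 2: y = 0 + 1.5x misses every point a little but has a smaller total.
print(sum_squared_residuals(points, 0, 1.5))  # 0.5
```

By this criterion the second candidate is the better of the two, even though the first passes exactly through two of the points.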
Guess the best fit line • Go to the following applet to practice your skills at estimating an LSRL. • Plot a bunch of points. • Then click the draw line button and draw what you think is the best fit line. • Then, check the “show least squares line” checkbox. • Link: To the applet
The details of the LSRL • The mathematics involved in calculating the LSRL is a bit complicated.
Least Squares Line • The most widely used criterion for measuring the goodness of fit of a line y = a + bx to bivariate data (x1, y1), (x2, y2), ..., (xn, yn) is the sum of the squared deviations about the line: $\sum_{i=1}^{n} [y_i - (a + b x_i)]^2$ • The line that gives the best fit to the data is the one that minimizes this sum; it is called the least squares line or sample regression line.
Coefficients a and b • The slope is: $b = r \frac{s_y}{s_x}$, where r is the correlation coefficient and $s_y$ and $s_x$ are the sample standard deviations of y and x (kinda like rise over run). • The intercept is: $a = \bar{y} - b\bar{x}$, where $\bar{y}$ and $\bar{x}$ are the means of y and x respectively. • The equation of the least squares regression line is written as: $\hat{y} = a + bx$ • The little symbol above the y is a hat! The equation is read as, “y-hat equals a plus bx.” • The ‘y-hat’ indicates that this is a regression line and that the model (equation) is to be used to make predictions.
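The coefficient calculation can be sketched in code, assuming the standard formulas b = r(s_y / s_x) and a = y-bar - b(x-bar), with r the correlation coefficient. The function name `lsrl` and the data are made up for illustration; the data lie exactly on y = 1 + 2x so the answer is easy to check by eye.

```python
from math import sqrt

def lsrl(xs, ys):
    """Return (a, b, r) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Correlation coefficient r.
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x      # slope
    a = y_bar - b * x_bar  # intercept
    return a, b, r

# Made-up data lying exactly on y = 1 + 2x (not from the presentation).
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b, r = lsrl(xs, ys)
print(round(a, 6), round(b, 6), round(r, 6))
```

For points that fall exactly on a rising line, r comes out as 1 (up to rounding) and the recovered a and b match the line that generated the data.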
Three Important Questions • To examine how useful or effective the line is at summarizing the relationship between x and y, we consider the following three questions. • Is a line an appropriate way to summarize the relationship between the two variables? • Are there any unusual aspects of the data set that we need to consider before proceeding to use the regression line to make predictions? • If we decide that it is reasonable to use the regression line as a basis for prediction, how accurate can we expect predictions based on the regression line to be?
Example #1 - Finding the LSRL • Consider the following data: • With this data, find the LSRL • Start by entering this data into list 1 and list 2
Example #1 - Finding the LSRL • Go to the following website and follow along to complete the regression analysis with your calculator. Regression on the TI83/84
Example #1 - Finding the LSRL • You should then see the results of the regression. • a = 53.24 • b = 1.65 • r-squared = .8422 • r = .9177 (this is the correlation coefficient for the scatterplot)
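The reported r and r-squared can be cross-checked against each other. This is a small sketch using only the values shown on the slide; it relies on the fact that r has the same sign as the slope b.

```python
from math import sqrt, copysign

r_squared = 0.8422  # from the regression output above
b = 1.65            # slope from the regression output above

# r is the square root of r-squared, carrying the sign of the slope.
r = copysign(sqrt(r_squared), b)
print(round(r, 4))  # 0.9177
```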
Example #2 – Interpreting LSRL • Interpreting the intercept • When your shoe size is 0, the predicted height is about 53.24 inches. • Of course this does not make much sense in the context of the problem. • Interpreting the slope • For each increase of 1 in the shoe size, we would expect the height to increase by 1.65 inches.
Example #3 – Using LSRL • Making predictions • How tall might you expect someone to be who has a shoe size of 12.5? • Just plug in 12.5 for the shoe size in the regression equation, so… • Height = 53.24 + 1.65(12.5) = 73.865 inches • Of course this is a prediction and is therefore not exact.
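The prediction step can be written as a small function. This is a sketch using the a and b reported in Example #1; the function name `predict_height` is made up for illustration.

```python
def predict_height(shoe_size, a=53.24, b=1.65):
    """Predicted height in inches from the LSRL: height-hat = a + b * shoe_size."""
    return a + b * shoe_size

print(round(predict_height(12.5), 3))  # 73.865
```

The same function reproduces the intercept interpretation: `predict_height(0)` returns 53.24, the value that does not make much sense in context.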
Least Squares Regression Line (LSRL) • This concludes this presentation.