LSRLs: Interpreting r vs. r²
• r: “the correlation coefficient” tells you the strength and direction of the linear relationship between two variables (x and y, for example, height v. weight, r = ?). -1 ≤ r ≤ 1
• r only measures linear association; it will not work for non-linear relationships
• r does not have units (r = .30, never .30 pounds)
• r is not resistant to outliers! Consider the effect of outliers when looking at r; report r both with and without the outliers
• r is the same regardless of which variable is explanatory and which is response
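The properties of r listed above can be checked numerically. The following is a minimal sketch, not part of the original slides: the height and weight values are invented for illustration, and `correlation` is a hypothetical helper written here from the standard formula for r.

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient r for paired lists x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [105, 120, 135, 155, 165, 180]

# Swapping explanatory and response variables does not change r.
r_xy = correlation(heights_in, weights_lb)
r_yx = correlation(weights_lb, heights_in)

# Changing units (inches to centimeters) does not change r either.
heights_cm = [h * 2.54 for h in heights_in]
r_cm = correlation(heights_cm, weights_lb)

# One outlier (a short, very heavy student) pulls r down noticeably.
r_with_outlier = correlation(heights_in + [61], weights_lb + [250])

# A perfect but non-linear (quadratic) relationship can still give r = 0.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [xi ** 2 for xi in curve_x]
r_curve = correlation(curve_x, curve_y)

print(r_xy, r_yx, r_cm, r_with_outlier, r_curve)
```

In this made-up data set, r stays the same under the swap and the unit change, drops when the outlier is included, and is zero for the symmetric curve even though y is completely determined by x.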
Understanding what is expected with LSRLs
Note: When finding the LSRL, the placement of the explanatory and response variables DOES matter!
y_hat = _x + _ (prediction equation, the equation of the line of best fit)
Found by minimizing the sum of the squares of the residuals. *Extra credit for manual calculation from the packet.
1. Find the LSRL using the calculator: STAT -> CALC -> 8 or 4 (LinReg). For the packet examples, the output has the form y_hat = a + bx (#8) or y_hat = ax + b (#4).
2. Find the LSRL using the mathematical formula for minimizing a quadratic function (extra credit).
3. Find the LSRL using computer output.
4. Find the LSRL using b = r(sy/sx). (You are not given data; you are given statistics: sy, sx, x_bar, y_bar, and r.) Find b. Because the LSRL always passes through (x_bar, y_bar), substitute b, x_bar, and y_bar into y_bar = a + b(x_bar). Solve for a. Substitute a and b into y_hat = a + bx and you are done.
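Method 4 can be sketched in a few lines. This is an illustrative example only: `lsrl_from_summary` is a hypothetical helper name, and the summary statistics below are invented, not taken from the packet.

```python
def lsrl_from_summary(r, s_x, s_y, x_bar, y_bar):
    """Return (a, b) for y_hat = a + b*x from summary statistics."""
    b = r * s_y / s_x        # slope: b = r * (sy / sx)
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b

# Hypothetical summary statistics (height in inches, weight in pounds).
a, b = lsrl_from_summary(r=0.8, s_x=4.0, s_y=20.0, x_bar=66.0, y_bar=150.0)

# Here b = 0.8 * 20 / 4 = 4 and a = 150 - 4 * 66 = -114,
# so the prediction equation is y_hat = -114 + 4x.
print(f"y_hat = {a:.3f} + {b:.3f}x")
```

Plugging x_bar back into the resulting equation returns y_bar, which is a quick way to check the arithmetic.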
Examining LSRLs: r v. r²
• Students' height v. weight
y_hat = 4.915x - 157.613
predicted weight = 4.915(height) - 157.613
r =
r² =
To answer the question in your packet: which is the better prediction equation (which would be more accurate in making a prediction)?
• The one with the highest r² value!
• The higher the value, the larger the percentage of the variation in y that is explained by the LSRL of y on x.
Theory behind r²
• It tells us how much better a line with a slope is at predicting than the horizontal line y = y_bar.
• It compares the vertical deviations (residuals) from the sloped line with the vertical deviations from the horizontal line (y = y_bar) and tells how much better the sloped line is at accounting for this variation.
• This math and theory can be found in the book.
• You don’t have to know the mathematical formulas for finding it for the AP Test or my test.
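The comparison described above can be written out directly. The sketch below assumes the usual textbook form r² = 1 - SSE/SST, where SSE is the sum of squared deviations from the sloped line and SST is the sum of squared deviations from y = y_bar. The function names and data are hypothetical and are not from the slides.

```python
def fit_lsrl(x, y):
    """Least-squares intercept a and slope b for y_hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    return a, b

def r_squared(x, y):
    """Return (r2, sse, sst) comparing the LSRL with the line y = y_bar."""
    a, b = fit_lsrl(x, y)
    y_bar = sum(y) / len(y)
    # Squared vertical deviations from the sloped line (the residuals).
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    # Squared vertical deviations from the horizontal line y = y_bar.
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst, sse, sst

# Hypothetical data, invented for illustration only.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [105, 120, 135, 155, 165, 180]

r2, sse, sst = r_squared(heights_in, weights_lb)
print(f"SSE (sloped line) = {sse:.2f}, SST (y = y_bar) = {sst:.2f}, r^2 = {r2:.4f}")
```

The smaller SSE is relative to SST, the closer r² is to 1, which matches the idea that the sloped line accounts for most of the variation the horizontal line leaves unexplained.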
What You Should Know: Summary of r²
• r² tells us how accurate our LSRL is at making predictions.
• Do you think the x value in each observation tells you something about y? How much is it actually telling you?
• When r² = 1, we say “100% of the variation in weight is explained by the LSRL.”
• When r² = .64, we say “64% of the variation in weight is explained by the LSRL.”
• r² tells us the fraction of the variation in y that is explained by the LSRL of y on x.
• MUST USE THIS SPECIFIC LANGUAGE TO INTERPRET r² ON THE AP TEST AND MY TEST!!!
What is a residual?
• The vertical deviation of each observation from the LSRL: residual = y - y_hat (observed y minus predicted y).
• The residual values (the vertical deviations) are stored in your calculator each time you run a linear regression, LinReg(a+bx).
• These residuals can be found in RESID in your calculator: 2nd -> Stat -> RESID
What do the residuals tell us?
• The residuals tell us whether a line is an appropriate fit (maybe a non-linear function, exponential or power, would fit the data better and help us predict better).
• How to create a residual plot:
• Plot x, the explanatory variable (L1), on the horizontal axis vs. RESID on the vertical axis (x vs. RESID).
• If the plot shows a pattern (the points are not randomly scattered), then a line is not the best fit.
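A small numeric example shows what a patterned residual plot looks like. This sketch fits a line to deliberately curved data (y = x², chosen here for illustration and not taken from the slides) and lists the (x, residual) pairs that would be plotted; `fit_lsrl` is a hypothetical helper.

```python
def fit_lsrl(x, y):
    """Least-squares intercept a and slope b for y_hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    return a, b

# Curved (quadratic) data, invented to force a pattern in the residuals.
x = [1, 2, 3, 4, 5, 6, 7]
y = [xi ** 2 for xi in x]

a, b = fit_lsrl(x, y)  # for this data: y_hat = -12 + 8x

# residual = observed y minus predicted y_hat
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# These pairs are the points of the residual plot (x vs. residual).
for xi, res in zip(x, residuals):
    print(xi, res)
# The residuals come out as 5, 0, -3, -4, -3, 0, 5: positive at both ends
# and negative in the middle, a clear U-shaped pattern.
```

The residuals always sum to zero for an LSRL, so the sum alone says nothing about fit; it is the U shape in the plot that signals a line is the wrong model for this data.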