Lesson Objectives • Know what the equation of a straight line is, in terms of slope and y-intercept. • Learn how to find the equation of the least squares regression line. • Know how to draw a regression line on a scatterplot. • Know how to use the regression equation to estimate the mean of Y for a given value of X.
Scatterplot: The best graphical tool for “seeing” the relationship between two quantitative variables. Use it to identify: • Patterns (relationships) • Unusual data (outliers)
[Figure: four scatterplots of Y versus X. Panels: Positive Linear Relationship; Negative Linear Relationship; Nonlinear Relationship (need to change the model); No Relationship (X is not useful).]
Equation of a straight line. Algebra form (“days of algebra”): $Y = mx + b$, where m = slope = “rate of change” and b = the “y” intercept. Statistics form: $\hat{Y} = a + bx$, where b = slope, a = the “y” intercept, and $\hat{Y}$ = estimate of the mean of Y for some X value.
Equation of a straight line. How are the slope and y-intercept determined? • By “eyeball”. • By using equations by hand. • By hand calculator. • By computer: Minitab, Excel, etc.
Equation of a straight line: $\hat{Y} = a + bx$. [Figure: a line plotted against the X-axis, starting at 0; a marks the “y” intercept and the slope is b = rise / run.]
Example 1: Is height a good estimator of mean weight? Population: All ST 260 students. Measure: Y = Weight in pounds, X = Height in inches. Each value of X defines a subpopulation of “height” values. The goal is to estimate the true mean weight for each of the infinite number of subpopulations.
Example 1: Sample of n = 5 students. Y = Weight in pounds, X = Height in inches.
Case  Ht  Wt
1     73  175
2     68  158
3     67  140
4     72  207
5     62  115
Step 1?
Example 1: Where should the line go? [Figure: scatterplot of the five (Ht, Wt) points, WEIGHT (100 to 220) versus HEIGHT (60 to 76).]
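Equation of Least Squares Regression Line (page 615). Slope: $b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$. y-intercept: $a = \bar{y} - b\bar{x}$. These are not the preferred computational equations.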
Basic intermediate calculations (look at your formula sheet):
(1) $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$
(2) $S_{xx} = \sum (x_i - \bar{x})^2$ (numerator part of $S^2$)
(3) $S_{yy} = \sum (y_i - \bar{y})^2$
Alternate intermediate calculations (look at your formula sheet):
(1) $S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$
(2) $S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$ (numerator part of $S^2$)
(3) $S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$
Example 1: Column sums
Case  x (Ht)  y (Wt)  xy (Ht*Wt)  x^2 (Ht^2)  y^2 (Wt^2)
1     73      175     12775       5329        30625
2     68      158     10744       4624        24964
3     67      140     _____       ____        _____
4     72      207     _____       ____        _____
5     62      115     _____       ____        _____
Sum   342     795     54933       23470       131263
Example 1: Intermediate Summary Values
(1) $S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 54933 - \frac{(342)(795)}{5} = 555.0$
(2) $S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 23470 - \frac{(342)^2}{5} = 77.2$
(3) $S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 131263 - \frac{(795)^2}{5} = 4858.0$
Once these values are calculated, the rest is easy!
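As an illustrative check that is not part of the original slides, the three intermediate summary values can be computed from the Example 1 data with both the basic and the alternate formulas. The function names here are made up for this sketch.

```python
# Example 1 data from the slides: heights (inches) and weights (pounds).
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]


def summary_values_basic(x, y):
    """Return (Sxy, Sxx, Syy) using deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy, s_xx, s_yy


def summary_values_alternate(x, y):
    """Return (Sxy, Sxx, Syy) using the column sums (alternate formulas)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    return s_xy, s_xx, s_yy


print(summary_values_basic(heights, weights))
print(summary_values_alternate(heights, weights))
```

Both functions should agree with the hand-calculated values 555.0, 77.2, and 4858.0, up to floating-point rounding.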
Least Squares Regression Line. Prediction equation: $\hat{Y} = a + bX$, where $b = \frac{(1)}{(2)} = \frac{S_{xy}}{S_{xx}}$ is the estimated slope and $a = \bar{y} - b\bar{x}$ is the estimated Y-intercept.
Example 1: Slope, for Weight vs. Height: $b = \frac{(1)}{(2)} = \frac{555}{77.2} = 7.189$
Example 1: Intercept, for Weight vs. Height: $\bar{y} = \frac{795}{5} = 159$ and $\bar{x} = \frac{342}{5} = 68.4$, so $a = \bar{y} - b\bar{x} = 159 - (7.189)(68.4) = -332.73$
Example 1: Prediction equation: $\hat{Y} = a + bX$, so $\hat{Y} = -332.73 + 7.189X$, that is, $\widehat{Wt} = -332.73 + 7.189\,Ht$
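The slope and intercept calculation can be sketched in Python as follows. This is an illustration rather than the course's prescribed method, and the function name `least_squares_line` is hypothetical. Because the code carries the unrounded slope into the intercept, it gives an intercept of about -332.74, slightly different from the -332.73 obtained by hand with the rounded slope 7.189.

```python
# Example 1 data from the slides: heights (inches) and weights (pounds).
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]


def least_squares_line(x, y):
    """Return (a, b) for the least squares line Y-hat = a + b * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # estimated slope: (1) / (2)
    a = y_bar - b * x_bar    # estimated Y-intercept
    return a, b


a, b = least_squares_line(heights, weights)
print(f"Wt-hat = {a:.2f} + {b:.3f} * Ht")
```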
Example 1: Draw the line on the plot. $\hat{Y} = -332.7 + 7.189X$. [Figure: scatterplot of WEIGHT (100 to 220) versus HEIGHT (60 to 76).]
Example 1: Draw the line on the plot. At X = 60: $\hat{Y} = -332.7 + 7.189(60) = 98.64$. At X = 76: $\hat{Y} = -332.7 + 7.189(76) = 213.7$. [Figure: the same scatterplot with the two computed points marked X.]
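A small sketch of the same idea in Python: evaluate the prediction equation at two X values near the edges of the plot, then join the two points with a straight edge. The rounded coefficients are taken from the slide; the names `predict` and `endpoints` are illustrative only.

```python
# Rounded coefficients from the slide: Y-hat = -332.7 + 7.189 * X.
A = -332.7
B = 7.189


def predict(height):
    """Estimated mean weight (pounds) for a given height (inches)."""
    return A + B * height


# Two X values near the left and right edges of the HEIGHT axis.
endpoints = [(x, predict(x)) for x in (60, 76)]
for x, y_hat in endpoints:
    print(f"X = {x}: Y-hat = {y_hat:.2f}")
```

Any two distinct X values would do; picking them far apart makes the line easier to draw accurately.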
What a regression equation gives you: • The “line of means” for the Y population. • A prediction of the mean of the population of Y-values defined by a specific value of X. • Each value of X defines a subpopulation of Y-values; the value of the regression equation is the “least squares” estimate of the mean of that Y subpopulation.
Example 2: Estimate the weight of a student 5’ 5” tall. $\hat{Y} = a + bX = -332.73 + 7.189X$
Example 2: $\hat{Y} = -332.7 + 7.189(65) =$ ____ [Figure: scatterplot of WEIGHT (100 to 220) versus HEIGHT (60 to 76).]
Why was your estimate not exact? Calculate your own weight.
Regression: Know How To: • Calculate the least squares regression line. • Plot the data and draw the line through the data. • Predict Y for a given X. • Interpret the meaning of the regression line.
Sample Correlation Coefficient, r: A numerical summary statistic that measures the strength of the linear association between two quantitative variables.
Notation: r = sample correlation. ρ = population correlation, “rho”. r is an “estimator” of ρ.
Interpreting correlation: -1.0 ≤ r ≤ +1.0. r > 0.0: Pattern runs upward from left to right; “positive” trend. r < 0.0: Pattern runs downward from left to right; “negative” trend.
Upward & downward trends: Slope and correlation must have the same sign. [Figure: two scatterplots of Y versus X, one labeled r > 0.0 and one labeled r < 0.0.]
All data exactly on a straight line. [Figure: two scatterplots of Y versus X.] Perfect positive relationship: r = _____ Perfect negative relationship: r = _____
Which has stronger correlation? [Figure: two scatterplots of Y versus X.] r = _____________ r = _____________
“Strength”: How tightly the data follow a straight line. r close to -1 or +1 means _________________________ linear relation. r close to 0 means _________________________ linear relation.
Which has stronger correlation? [Figure: two scatterplots of Y versus X.] r = ________________ r = ________________
Which has stronger correlation? [Figure: two scatterplots of Y versus X, one showing a strong parabolic pattern.] Strong parabolic pattern! We can fix it. r = ________________ r = ________________
Computing Correlation • by hand using the formula • using a calculator (built-in) • using a computer: Excel, Minitab, . . . .
Formula for Sample Correlation (page 627; look at your formula sheet): $r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} = \frac{(1)}{\sqrt{(2)(3)}}$
Example 1: Weight versus Height. Calculating Correlation: $r = \frac{(1)}{\sqrt{(2)(3)}} =$ ____ “Go to Slide 18 for values.” Look at your formula sheet.
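The correlation formula can also be tried in Python. The following is a minimal sketch using the Example 1 data; `sample_correlation` is a name chosen for this illustration, and only the standard library is used.

```python
import math

# Example 1 data from the slides: heights (inches) and weights (pounds).
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]


def sample_correlation(x, y):
    """Return r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = sample_correlation(heights, weights)
print(f"r = {r:.4f}")
```

The printed value can be compared with the hand calculation from intermediate values (1), (2), and (3).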
Example 6: Real estate data, previous section. Positive Linear Relationship. r = ____
Example 7: AL school data, previous section. Negative Linear Relationship. r = ____
Example 9: Rainfall data, previous section. No Linear Relationship. r = ____
Comment 1: Size of “r” does NOT reflect the steepness of the slope, “b”; but “r” and “b” must have the same sign. $r = b \cdot \frac{s_x}{s_y}$ and $b = r \cdot \frac{s_y}{s_x}$
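The link between r and b can be checked numerically. This sketch, which is an illustration and not part of the original slides, uses the Example 1 data and the standard library's `statistics.stdev` for the sample standard deviations $s_x$ and $s_y$.

```python
import math
import statistics

# Example 1 data from the slides: heights (inches) and weights (pounds).
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(weights) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
s_xx = sum((x - x_bar) ** 2 for x in heights)
s_yy = sum((y - y_bar) ** 2 for y in weights)

b = s_xy / s_xx                      # least squares slope
r = s_xy / math.sqrt(s_xx * s_yy)    # sample correlation

s_x = statistics.stdev(heights)      # sample standard deviation of X
s_y = statistics.stdev(weights)      # sample standard deviation of Y

r_from_b = b * s_x / s_y             # r = b * s_x / s_y
b_from_r = r * s_y / s_x             # b = r * s_y / s_x
print(math.isclose(r, r_from_b), math.isclose(b, b_from_r))
```

Since $s_x$ and $s_y$ are always positive, the two identities also show why r and b must share the same sign.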
Comment 2: Changing the units of Y and X does not affect the size of r. Examples: • Inches to centimeters • Pounds to kilograms • Celsius to Fahrenheit • X to Z (standardized)
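To see Comment 2 in action, the sketch below converts the Example 1 data from inches and pounds to centimeters and kilograms and recomputes both r and the slope. The helper names are hypothetical; the conversion factors (2.54 cm per inch, 0.45359237 kg per pound) are the standard definitions.

```python
import math

# Example 1 data from the slides: heights (inches) and weights (pounds).
heights_in = [73, 68, 67, 72, 62]
weights_lb = [175, 158, 140, 207, 115]

# Same students, metric units.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]


def correlation_and_slope(x, y):
    """Return (r, b) for the given paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy), s_xy / s_xx


r_imperial, b_imperial = correlation_and_slope(heights_in, weights_lb)
r_metric, b_metric = correlation_and_slope(heights_cm, weights_kg)

print(f"r: {r_imperial:.4f} (in, lb) vs {r_metric:.4f} (cm, kg)")
print(f"b: {b_imperial:.3f} lb per in vs {b_metric:.3f} kg per cm")
```

The correlation is the same in both unit systems, while the slope changes because it carries the units of Y per unit of X.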
Comment 3: High correlation does not always imply causation. Causation: Changes in X actually do cause changes in Y. Consistency, responsiveness, mechanism. Example: X = dryer temperature, Y = drying time for clothes.
Comment 4: Common Response: Both X and Y change as some unobserved third variable changes. Example: In basketball, there is a high correlation between points scored and personal fouls committed over a season. Third variable is ___?