Simple Linear Regression and Correlation by Asst. Prof. Dr. Min Aung
When to use SLR? • To study a relationship between two variables • Paired-sample or matched data • Interval- or ratio-level measurement
Independent and dependent variables • You want to guess or estimate or compute the values of the dependent variable. • In estimating, you will use the values of the independent variable.
Predictor and Predicted variables • Predictor = independent variable. • Predicted variable = dependent variable.
Scatter Diagram • X-axis = independent variable. • Y-axis = dependent variable. • Each pair of data = a point (x, y). [Figure: the pair x = 2, y = 3 plotted as the point (2, 3)]
Purpose of Drawing Scatter Diagram • Is there a linear relationship between the two variables X and Y? • Linear relationship = the scatter points form (at least roughly) the shape of a straight line. [Figure: two scatter diagrams, one showing a linear relationship and one showing no linear relationship]
Measuring Strength of Linear Relationship • Pearson’s coefficient of correlation r • Formula (2) (not used in the exam; just for knowledge)
Calculator Work for Casio 350MS • Switch the calculator on. • Set the calculator in LR (Linear Regression) mode: press Mode, press 3 for Reg (Regression), press 1 for Linear. • Check n (checking whether there are old data): press Shift 1, next 3, and then =.
Calculator Work for r • Enter the data in pairs: x-value , y-value M+ (repeat for every pair). • Check n again: press Shift 1, next 3, and then =. • Press Shift 2, move to the right with the arrow key, press 3 for r, and then press =. The display now shows the value of r.
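As a cross-check on the calculator result, r can also be computed in a few lines of Python. This is only a sketch: Formula (2) is not reproduced in these slides, so the code assumes it is the usual definition of Pearson's r (sum of cross-deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient r for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (not from the slides)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7746
```

Entering the same five pairs on the calculator should give the same r, up to rounding.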
Interpretation of r (Direct linear relationship) • If r is 1 or -1, then all scatter points are on a straight line. • If r is 1, all points are on a straight line with a positive slope. • If r is -1, all points are on a straight line with a negative slope. • A straight line with a positive slope rises to the right. • On a straight line with a positive slope, if x increases, then y increases for the points (x, y) on it: small x goes with small y, and large x goes with large y. • In this situation, we say that the two variables X and Y are directly or positively correlated.
Interpretation of r (Inverse linear relationship) • If r is -1, all points are on a straight line with a negative slope. • On a straight line with a negative slope, if x increases, then y decreases for the points (x, y) on it: small x goes with large y, and large x goes with small y. • In this situation, we say that the two variables X and Y are inversely or negatively correlated.
Interpretation of r (strength) • If r is not exactly 1 or -1, but it is 0.9 or -0.9, then the points lie around a straight line. They are close to a straight-line shape. • If r is 0.8 or -0.8, then the points are close to a straight-line shape, but not as close as in the case of 0.9 or -0.9. • Thus, the closer r is to 1 or -1, the closer the points are to a straight-line shape. • Thus, the closer r is to 0, the farther the points are from a straight-line shape. • Among r-values, 0.9 indicates a stronger linear relationship than 0.8, and 0.8 a weaker one than 0.9.
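Interpretation of r (strength) [Figure: number line of r-values from -1 to 1. r = -1 and r = 1: perfect linear relationship. Between -1 and -0.5 and between 0.5 and 1: strong. Between -0.5 and 0 and between 0 and 0.5: weak linear relationship. r = 0: no linear relationship.]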
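The scale above can be written as a small lookup. This is a sketch only: the function name `describe_r` is made up, and treating the boundary values exactly at 0.5 and -0.5 as "weak" is an assumption, because the scale does not say which side they belong to.

```python
def describe_r(r):
    """Describe direction and strength of a linear relationship from r."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no linear relationship"
    direction = "direct (positive)" if r > 0 else "inverse (negative)"
    size = abs(r)
    if size == 1:
        strength = "perfect"
    elif size > 0.5:
        strength = "strong"
    else:
        # Assumption: exactly 0.5 (or -0.5) is counted as weak
        strength = "weak"
    return f"{strength} {direction} linear relationship"

print(describe_r(0.9))   # strong direct (positive) linear relationship
print(describe_r(-0.3))  # weak inverse (negative) linear relationship
```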
Testing Linear Relationship • Pearson invented a formula to measure the strength and direction of a linear relationship between two variables. • The number given by his formula is called the correlation coefficient. We call it Pearson’s coefficient of correlation. • We write r for this value in a sample, and we write ρ for this value in a population. • Testing whether the correlation is significant means making a scientific guess, based on the sample, about whether there is a correlation in the population between the two variables under consideration.
Null and Alternate Hypothesis • Test correlation: H0: ρ = 0 and Ha: ρ ≠ 0 • Test direct (positive) correlation: H0: ρ ≤ 0 and Ha: ρ > 0 • Test inverse (negative) correlation: H0: ρ ≥ 0 and Ha: ρ < 0
Three types of test • H0: ρ = 0 and Ha: ρ ≠ 0 → Two-tailed test • H0: ρ ≥ 0 and Ha: ρ < 0 → Left-tailed test • H0: ρ ≤ 0 and Ha: ρ > 0 → Right-tailed test
Critical value • Read the critical value from the t table. • Degrees of freedom (Df) = n - 2 • n = number of pairs of data • Right-tailed test → positive sign • Left-tailed test → negative sign • Two-tailed test → both positive and negative signs
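The bookkeeping on this slide (degrees of freedom, then the sign of the critical value) can be sketched as follows. The function names are made up for illustration, and the table entries 2.306 and 1.860 are standard t-table values for Df = 8 at α = 0.05 (two-tailed and one-tailed respectively); they are not taken from these slides.

```python
def degrees_of_freedom(n_pairs):
    """Df = n - 2, where n is the number of (x, y) pairs."""
    if n_pairs < 3:
        raise ValueError("need at least 3 pairs so that Df is positive")
    return n_pairs - 2

def signed_critical_values(table_value, tail):
    """Attach the sign(s) required by the type of test to a t-table entry."""
    value = abs(table_value)
    if tail == "right":
        return (value,)
    if tail == "left":
        return (-value,)
    if tail == "two":
        return (-value, value)
    raise ValueError("tail must be 'right', 'left' or 'two'")

df = degrees_of_freedom(10)                   # 10 pairs of data
print(df)                                     # 8
print(signed_critical_values(2.306, "two"))   # (-2.306, 2.306)
print(signed_critical_values(1.860, "left"))  # (-1.86,)
```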
Test Statistic • Test statistic = strength of the evidence supporting the alternate hypothesis Ha • The original test statistic for testing ρ is r. • Convert r to t by Formula (10). • Learn to compute t correctly with your calculator.
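Formula (10) is not reproduced in these slides. The sketch below assumes it is the standard conversion t = r·√(n − 2) / √(1 − r²), which has n − 2 degrees of freedom; the function name `t_from_r` and the numbers are made up for illustration.

```python
import math

def t_from_r(r, n_pairs):
    """Convert a sample correlation r into a t statistic with n - 2 Df."""
    if n_pairs < 3:
        raise ValueError("need at least 3 pairs of data")
    if not -1 < r < 1:
        raise ValueError("t is undefined when r is exactly 1 or -1")
    return r * math.sqrt(n_pairs - 2) / math.sqrt(1 - r ** 2)

# Illustrative values: r = 0.8 from n = 11 pairs
t = t_from_r(0.8, 11)
print(round(t, 3))  # 4.0
```

Note that t keeps the sign of r, so a negative r gives a negative test statistic.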
Rejection region 1 • For a two-tailed test, the rejection region is to the right of the positive critical value and to the left of the negative critical value. • Total area of the rejection region = level of significance = probability = α. [Figure: t curve over the real number line of t values, with rejection regions in both tails, beyond the negative and positive critical values on either side of 0]
Rejection region 2 • For a left-tailed test, the rejection region is to the left of the (negative) critical value. • Area of the rejection region = level of significance = probability = α. [Figure: t curve over the real number line of t values, with the rejection region in the left tail, to the left of the (negative) critical value]
Rejection region 3 • For a right-tailed test, the rejection region is to the right of the (positive) critical value. • Area of the rejection region = level of significance = probability = α. [Figure: t curve over the real number line of t values, with the rejection region in the right tail, to the right of the (positive) critical value]
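Decision Rule • If the test statistic (TS) is in the rejection region, then reject H0. Otherwise, fail to reject H0. • Reject H0 = conclude, at the chosen level of significance, that H0 is false and hence Ha is true. • Fail to reject H0 = the sample does not give enough evidence to support Ha, so H0 is retained.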
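The decision rule, combined with the three rejection regions, can be sketched as one function. The name `decide` is made up for illustration, and the critical value 2.262 in the example is the standard t-table entry for Df = 9 at α = 0.05 (two-tailed), not a value from these slides.

```python
def decide(t_stat, critical_value, tail):
    """Return the decision for a t test of correlation.

    critical_value is the t-table entry; its sign is ignored and
    re-applied according to the type of test.
    """
    cv = abs(critical_value)
    if tail == "two":
        reject = t_stat > cv or t_stat < -cv
    elif tail == "left":
        reject = t_stat < -cv
    elif tail == "right":
        reject = t_stat > cv
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    return "reject H0" if reject else "fail to reject H0"

print(decide(4.0, 2.262, "two"))   # reject H0
print(decide(1.5, 2.262, "two"))   # fail to reject H0
print(decide(4.0, 2.262, "left"))  # fail to reject H0
```

The last call shows why the type of test matters: a large positive t is never in the rejection region of a left-tailed test.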
Conclusion • Conclusion = Decision • Decision is the last step of statistical procedure. • Conclusion is the report to the one who asked the original question.