The Scientific Method Interpreting Data — Correlation and Regression Analysis
Lecture 5: Interpreting Data – Correlation and Regression Analysis

Summary data may suggest a relationship between independent and dependent variables: a CORRELATION.

Correlation analysis is a statistical tool for determining whether an apparent association between two variables is unlikely to be due to chance alone. It is used when there is a linear (straight-line) relationship between the variables, or when a non-linear (curved) relationship can be transformed into a linear one.

The Correlation Coefficient (r) is a mathematical measure of how closely the observed data follow a theoretical straight line through the data, the line of best fit, which minimises the variation of each observation from that line.
Pearson’s Product Moment Correlation Coefficient

The correlation coefficient is calculated from the equation:

r = Σdxdy / √(Σdx² × Σdy²)

where dx and dy are the differences of each x and each y from their respective means. Stated in words: “the sum of the differences of the x’s times the differences of the y’s, divided by the square root of the sum of the squares of the differences of the x’s times the sum of the squares of the differences of the y’s”.

For the example mouse length/weight data, STEP 1 is to plot the data to see whether the two variables are linearly related.
STEP 2 is to set out a table to calculate and record the values Σx, Σy, Σx², Σy², Σxy, and x̄ and ȳ (the means of x and y).
STEP 3: calculate Σdx² = Σx² – (Σx)²/n = 251 – (37²)/7 = 251 – 195.6 = 55.4
STEP 4: calculate Σdy² = Σy² – (Σy)²/n = 1278 – (82²)/7 = 1278 – 960.6 = 317.4
STEP 5: calculate Σdxdy = Σxy – (Σx × Σy)/n = 553 – (37 × 82)/7 = 553 – 433.4 = 119.6
STEP 6: calculate the correlation coefficient r = Σdxdy / √(Σdx² × Σdy²) = 119.6 / √(55.4 × 317.4) = 119.6 / 132.6 = 0.90
STEP 7: look up r in the table of correlation coefficients (ignoring the + or – sign). The number of degrees of freedom is n – 2 because two means have been calculated (in the example, 7 – 2 = 5 df). If the calculated r value exceeds the tabulated value at p = 0.05, the correlation is significant and we may reject the null hypothesis. Here 0.90 exceeds the tabulated value of 0.754. It also exceeds the tabulated value for p = 0.01, but not the value for p = 0.001.
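The arithmetic in STEPS 3 to 7 can be checked with a short script. This is a minimal sketch that starts from the summary values quoted in the worked example (Σx = 37, Σy = 82, Σx² = 251, Σy² = 1278, Σxy = 553, n = 7) and applies r = Σdxdy / √(Σdx² × Σdy²); the variable names are illustrative, and the critical value 0.754 is simply the tabulated figure quoted in the lecture rather than something the script computes.

```python
import math

# Summary values from the worked mouse example (n = 7 observations).
n = 7
sum_x, sum_y = 37, 82
sum_x2, sum_y2 = 251, 1278
sum_xy = 553

# STEPS 3-5: sums of squared deviations and of cross-products,
# using the shortcut forms (e.g. sum of x^2 minus (sum of x)^2 / n).
sum_dx2 = sum_x2 - sum_x**2 / n
sum_dy2 = sum_y2 - sum_y**2 / n
sum_dxdy = sum_xy - sum_x * sum_y / n

# STEP 6: Pearson's product moment correlation coefficient.
r = sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)

# STEP 7: compare |r| with the tabulated value for n - 2 degrees of freedom.
df = n - 2
critical_r_p05 = 0.754  # value quoted in the lecture for 5 df at p = 0.05
significant = abs(r) > critical_r_p05

print(f"r = {r:.2f}, df = {df}, significant at p = 0.05: {significant}")
```

Working from the unrounded intermediate values gives the same r of 0.90 as the hand calculation.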
A correlation does not necessarily demonstrate a CAUSAL relationship. A significant correlation only shows that two factors vary in a related way (positively or negatively). In the mouse example, there is no logical reason to think that weight influences the length of the animal; both factors are influenced by age, diet or growth stage.
Fitting a regression line to the data plot

The algebraic regression equation for a straight line of y on x is y = mx + c, where m is the slope (or gradient) and c is the intercept (the point where the line crosses the y axis).

We calculate m as: m = Σdxdy / Σdx²
We calculate c as: c = ȳ – m x̄

Substituting the calculated values of m and c into y = mx + c gives the equation for this straight line. To draw the line through the data points on the graph, we substitute values of x into this equation, e.g. when x = 4, y = 0.38 and when x = 7, y = 0.49. Only two points are needed to draw the line.
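The slope and intercept calculation can be sketched in code. The example below assumes the standard least-squares formulas m = Σdxdy / Σdx² and c = ȳ – m x̄; the function name `fit_line` and the four data points are hypothetical, chosen only so that the result is easy to verify by hand, and they are not the lecture's data.

```python
def fit_line(xs, ys):
    """Return (m, c) for the least-squares regression line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sum of squared x deviations.
    sum_dxdy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_dx2 = sum((x - mean_x) ** 2 for x in xs)
    m = sum_dxdy / sum_dx2      # slope
    c = mean_y - m * mean_x     # intercept
    return m, c


# Hypothetical illustration data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
m, c = fit_line(xs, ys)

# Two points are enough to draw the fitted line on the plot.
points_to_draw = [(x, m * x + c) for x in (min(xs), max(xs))]
print(m, c, points_to_draw)
```

Picking the smallest and largest x values for the two plotted points keeps the drawn line within the range of the data.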
[Figure: plots of y against x, labelled y1, y2 and y3]

If you think about what the equation y = mx + c is telling you, it makes sense! It says “the value of y depends on the value of x, the slope of the line and where it crosses the axis”. So, for a given value of x, changing the slope of the line will give different values of y, and, again for a given value of x, changing the position of the line will give different values of y. The intercept defines the position of the line on the y axis.
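A few lines of code make the same point numerically. This is only an illustrative sketch: the helper `line_y` and the particular slopes and intercepts are made-up values, not taken from the lecture.

```python
def line_y(x, m, c):
    """Value of y on the straight line y = m*x + c."""
    return m * x + c


x = 2  # one fixed value of x

# Same intercept (c = 1), different slopes: y changes with the slope.
ys_by_slope = [line_y(x, m, 1) for m in (0.5, 1, 2)]

# Same slope (m = 1), different intercepts: y changes with the position of the line.
ys_by_intercept = [line_y(x, 1, c) for c in (0, 1, 2)]

print(ys_by_slope)      # [2.0, 3, 5]
print(ys_by_intercept)  # [2, 3, 4]
```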