Exploring Bivariate Data: Scatter Plot Analysis & Correlation Coefficient

This activity focuses on constructing scatter plots and finding correlation coefficients to graphically represent and analyze the relationship between two quantitative variables. Learn how to interpret positive, negative, or no correlation, understand linear correlations, and calculate the correlation coefficient using computational formulas and z-scores. Explore the range of correlation values, the importance of linear correlation analysis, and the coefficient of determination. Utilize a graphing calculator for data analysis and correlation calculations.

kburg

Presentation Transcript


  1. Warm Up Scatter Plot Activity

  2. Bivariate Data – Scatter Plots and Correlation Coefficient

  3. Objective: Construct scatter plots and find the correlation coefficient using the formula.

  4. Relevance: To be able to graphically represent two quantitative variables and analyze the strength of the relationship.

  5. 2 Quantitative Variables • We represent 2 quantitative variables by using a scatter plot. • Scatter Plot: a plot of ordered pairs (x, y) of bivariate data on a coordinate axis system. It is a visual or pictorial way to describe the nature of the relationship between 2 variables.

  6. Input and Output Variables • X: a. Input variable b. Independent variable c. Controlled variable • Y: a. Output variable b. Dependent variable c. Results from the controlled variable

  7. Example • When dealing with height and weight, which variable would you use as the input variable and why? • Answer: Height would be used as the input variable because weight is often predicted based on a person’s height. Normal acceptable weight ranges are based on a person’s height!

  8. Constructing a scatter plot • Do a scatter plot of the following data:

  9. What do we look for? • A. Is it a positive correlation, negative correlation, or no correlation? • B. Is it a strong or weak correlation? • C. What is the shape of the graph?

  10. Answer using GDC

  11. Notice the following: A. Strong positive: as x increases, y also increases. B. Linear: the points follow a line.

  12. Example 2 – NO GDC

  13. Example 2 using GDC

  14. Notice the following: A. Strong negative: as x increases, y decreases. B. Linear: the points follow a line.

  15. Example 3 – NO GDC

  16. Example 3 using GDC

  17. Notice: There seems to be no correlation between the hours of exercise a person performs and the amount of milk they drink.

  18. Steps for Scatter Plot using GDC • Put the x’s in L1 and the y’s in L2. • Press “2nd y=”. • Set the scatter plot to look like the screen to the right. • Press zoom 9, or set your own window and then press graph.
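
For readers without a GDC, the same idea can be sketched in plain Python. This is a minimal text-only sketch, not the calculator's method: the function name `ascii_scatter` is hypothetical, and so are the five (x, y) pairs, since the slide's data did not survive in this transcript.

```python
def ascii_scatter(xs, ys, width=20, height=10):
    """Return a text scatter plot of the ordered pairs (x, y)."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid dividing by zero when every value on an axis is the same.
    x_span = (x_max - x_min) or 1
    y_span = (y_max - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        # Row 0 of the grid is the top line, so flip the y direction.
        grid[height - 1 - row][col] = "*"
    lines = ["|" + "".join(r) for r in grid]
    lines.append("+" + "-" * width)
    return "\n".join(lines)


# Hypothetical data for illustration only (x's as L1, y's as L2).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(ascii_scatter(xs, ys))
```

The plot is scaled to the smallest and largest values on each axis, which is roughly what an automatic statistics window does.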

  19. Linear Correlation

  20. Correlation • Definition: a statistical method used to determine whether a relationship exists between variables. • 3 Types of Correlation: A. Positive B. Negative C. No Correlation

  21. • Positive Correlation: as x increases, y increases, or as x decreases, y decreases. • Negative Correlation: as x increases, y decreases. • No Correlation: there is no relationship between the variables.

  22. Linear Correlation Analysis • Primary Purpose: to measure the strength of the relationship between the variables. • *This is a test question!!!!

  23. Coefficient of Linear Correlation • The numerical measure of the strength and direction of the linear relationship between 2 variables. • This number is called the correlation coefficient. • The symbol used to represent the correlation coefficient is “r.”

  24. The range of “r” values • The range of the correlation coefficient is -1 to +1. • The closer r is to 0, the weaker the correlation; the closer r is to -1 or +1, the stronger the correlation.

  25. Range • r = -1: Strong Negative • r = 0: No Linear Relationship • r = +1: Strong Positive
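
The scale above can be turned into a small labeling helper. This is a sketch under stated assumptions: the function name `describe_r` is hypothetical, and the cutoffs 0.1, 0.5, and 0.8 are illustrative choices only. The slides give no cutoffs; these were picked so that the r values quoted in this presentation get the same labels the slides give them.

```python
def describe_r(r):
    """Label a correlation coefficient by direction and strength.

    The cutoffs below are assumed for illustration; they are not
    defined anywhere in the presentation.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and +1")
    size = abs(r)
    if size < 0.1:
        return "no linear correlation"
    if size < 0.5:
        strength = "weak"
    elif size < 0.8:
        strength = "moderate"
    else:
        strength = "strong"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


# The r values quoted on the slides.
for r in (0.61, 0.897, -0.944, 0.067):
    print(r, "->", describe_r(r))
```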

  26. Computational Formula using z-scores of x and y
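
The formula itself did not survive in this transcript. The standard z-score form of the correlation coefficient, assuming sample standard deviations (the n - 1 version) for both variables, is:

```latex
r = \frac{\sum z_x z_y}{n - 1},
\qquad
z_x = \frac{x - \bar{x}}{s_x},
\qquad
z_y = \frac{y - \bar{y}}{s_y}
```

Here n is the number of ordered pairs, and the sum runs over all pairs (x, y).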

  27. Example 1 • Find the correlation coefficient (r) of the following example. • Use the lists in the calculator.

  28. Find mean and standard deviation first • Since you will be using a formula that uses z-scores, you will need to know the mean and standard deviation of the x and y values. • Put the x’s in L1 and the y’s in L2. • Run stat calc one var stats on L1 and write down the mean and st. dev. • Run stat calc one var stats on L2 and write down the mean and st. dev. • Better option: 2nd Stat Math Mean (L?), then store it!
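
The same two summaries can be computed off the calculator. This is a minimal sketch with hypothetical data and a hypothetical helper name `summarize`; it assumes the sample standard deviation is the one intended, since that is the version that pairs with the n - 1 in the z-score formula.

```python
from statistics import mean, stdev

# Hypothetical data for illustration only (x's as L1, y's as L2).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def summarize(values):
    """Return (mean, sample standard deviation) without rounding."""
    return mean(values), stdev(values)


# Keep the unrounded values in variables, like storing them on the GDC.
x_bar, s_x = summarize(xs)
y_bar, s_y = summarize(ys)
print("x: mean =", x_bar, " st. dev. =", s_x)
print("y: mean =", y_bar, " st. dev. =", s_y)
```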

  29. x values: y values: Shown on GDC – Write Down

  30. Calculator Lists

  31. Calculate “r” • From the lists, n = 5.
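
The list-based calculation can be sketched in Python as follows. The function name `correlation_r` and the five data pairs are hypothetical (the slide's lists are not in this transcript), and the sketch assumes the z-score form of r with sample standard deviations.

```python
from statistics import mean, stdev


def correlation_r(xs, ys):
    """Correlation coefficient as the sum of z-score products over n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    # One z-score list per variable, like extra list columns on the GDC.
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]
    # Do not round until the final answer.
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)


# Hypothetical data with n = 5.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(correlation_r(xs, ys), 3))  # about 0.775
```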

  32. What does that mean? • Since r = 0.61, the correlation is moderate. • Do we want to make predictions from this? It depends on how precise the answer needs to be.

  33. Example 2 • Find the correlation coefficient (r) for the following data. • Do you remember what we found from the scatter plot?

  34. Let’s do this one together • Remember to use your lists in the calculator. • Don’t round numbers until your final answer. • Find the mean and st. dev. for x and y. • Explain what you found.

  35. X Values: Y Values:

  36. List values you should have

  37. Compute “r”

  38. Describe it • Since r = 0.897, there is a strong positive correlation.

  39. Example 3…… • Find the correlation coefficient for the following data. • Do you remember what we found from the scatter plot?

  40. X Values: Y Values:

  41. List Values using GDC

  42. Compute “r”

  43. Describe it • Since r = -0.944, there is a strong negative correlation.

  44. Example 4 • Find the correlation coefficient of the following data. • Do you remember what we found from the scatter plot?

  45. x Values: y Values:

  46. List Values using GDC

  47. Compute “r”

  48. Describe it • Since r = 0.067, no correlation exists.

  49. What is r²? • It is the coefficient of determination. • It is the percentage of the total variation in y which can be explained by the relationship between x and y. • A way to think of it: The value tells you how much your ability to predict is improved by using the regression line compared with NOT using the regression line.
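
The "improvement in prediction" reading can be checked numerically. This sketch compares the squared error from predicting every y with the mean of y (not using the line) against the squared error from the least-squares regression line. The function name `r_squared` and the data are hypothetical, and the least-squares line is assumed to be the regression line the slide has in mind.

```python
from statistics import mean


def r_squared(xs, ys):
    """Coefficient of determination as 1 - SSE / SST."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # SST: total squared error when every prediction is just the mean of y.
    sst = sum((y - y_bar) ** 2 for y in ys)
    # SSE: squared error left over when predicting with the regression line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - sse / sst


# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(r_squared(xs, ys), 3))  # 0.6
```

For this data the line removes 60% of the squared error, and 0.6 is also the square of the correlation coefficient for the same pairs (about 0.775).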

  50. For Example • If r² = 0.89, it means that 89% of the variation in y can be explained by the relationship between x and y. • It is a good fit.
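
As a worked check, a correlation coefficient like the r = -0.944 found in an earlier example gives a value of about 89% when squared. The transcript does not say which example the 89% comes from, so the pairing here is an assumption:

```latex
r = -0.944
\quad\Longrightarrow\quad
r^2 = (-0.944)^2 = 0.891136 \approx 0.89 = 89\%
```

The sign of r drops out when squaring, so the coefficient of determination describes the strength of the fit but not its direction.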
