Statistical Analysis – Chapter 9 Regression-Correlation Pt. II Dr. Roderick Graham Fashion Institute of Technology
Objectives • In the last lecture we discussed the conceptual background behind regression lines: • The purpose of scatter plots • How we read scatter plots • How SPSS can construct scatter plots and regression lines for very large datasets • In this lecture, we will learn the calculations necessary to construct our own regression lines and make predictions.
Calculating Regression Line Equations – Notes Before Beginning • Remember that regression lines are used to summarize the relationship between two variables, x and y. • Thus, we start our calculations with values for x and y. • Think back to how we calculated standard deviation: you had a formula, and you needed to set up a chart to get the values that formula required. • Solving for a regression line works the same way: you will have the formulas, and you just have to “plug and play.”
Let’s start with a scatter plot… • Imagine that we have this data… • The scatter plot would be… • Now we can solve for the regression line…
Using the formulas… • With the formula y = a + bx, we can predict any future value of y. (Technically, we could also use it to work back to values of x, but logically we treat x as the cause of y.)
Using the formulas… • But we also need a (the y-intercept) and b (the slope) in order to use this formula. • The equations for a and b seem formidable, but it is just plug and play. You are given x and y, and all you need to do is set up a table to plug in the numbers.
Using the formulas… • Let’s take a closer look at the formulas for a and b. • Look at the formulas: what other columns and rows do we need in order to use these equations?
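The formula images for a and b did not survive on these slides. As a reference, and assuming the lecture uses the standard least-squares "computational" formulas (which share one denominator, as the later slide on denominators suggests), they can be written as follows, where n is the number of (x, y) pairs:

```latex
\[
b = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {n\sum x^{2} - \left(\sum x\right)^{2}}
\qquad
a = \frac{\left(\sum y\right)\left(\sum x^{2}\right) - \left(\sum x\right)\left(\sum xy\right)}
         {n\sum x^{2} - \left(\sum x\right)^{2}}
\]
```

Every sum in these expressions comes from a column total: Σx, Σy, Σxy, and Σx².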
Using the formulas… • Let’s take a closer look at the formulas for a and b. • Columns and rows needed to solve for a and b: columns for x, y, xy, and x², plus a row of totals (Σ) for each column.
Using the formulas… • Setting up tables… • Look at the denominators (below the line) for each formula. What do you notice?
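The table method can be sketched in a few lines of Python. This is only an illustration: the data values are hypothetical (the table from the slides is not reproduced here), the function name `regression_coefficients` is made up for this example, and the formulas are assumed to be the standard least-squares ones, in which a and b share one denominator.

```python
# Hypothetical (x, y) pairs for illustration only; NOT the data from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def regression_coefficients(xs, ys):
    """Return (a, b) for the line y = a + bx using column totals."""
    n = len(xs)
    sum_x = sum(xs)                              # total of the x column
    sum_y = sum(ys)                              # total of the y column
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # total of the xy column
    sum_x2 = sum(x * x for x in xs)              # total of the x-squared column

    # Both formulas use the same denominator.
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


a, b = regression_coefficients(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")
```

For this made-up data the column totals are Σx = 15, Σy = 20, Σxy = 66, and Σx² = 55, so the shared denominator is 50 and the line works out to y = 2.20 + 0.60x.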
Analyzing the regression formula… • Given the table and scatter plot below, we now have a formula to solve for future values of y. • Our formula is y = -8.83 + 1.75x
Predicting future values… • Now that we have our formula, let’s predict and plot points for two new values of y. Let’s say we have x values of 8 and 19. • We plug these into our new formula, y = -8.83 + 1.75x:
For x = 8: y = -8.83 + 1.75(8) = -8.83 + 14 = 5.17
For x = 19: y = -8.83 + 1.75(19) = -8.83 + 33.25 = 24.42
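The same plug-in step can be checked with a short Python sketch. The coefficients come from the slide's formula; the function name `predict` is just an illustrative choice.

```python
# Coefficients from the lecture's regression formula y = -8.83 + 1.75x.
A = -8.83  # intercept
B = 1.75   # slope


def predict(x, a=A, b=B):
    """Predict y for a given x using y = a + bx."""
    return a + b * x


for x in (8, 19):
    print(f"x = {x}: y = {predict(x):.2f}")
```

This prints 5.17 for x = 8 and 24.42 for x = 19, matching the hand calculation.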
Using new values to plot a regression line… • We can use the new values to plot a regression line. • We plot the new x and y values on a new scatter plot and connect the points. • The line through those points is our regression line.
Calculating r • Here is our initial data. Let’s check ourselves by using our new regression formula to predict the y’s from these x’s. • Someone calculate the y value for an x of 7 and an x of 15.
Calculating r • Even though we have a formula to predict y for any value of x, we know that this formula is not 100% accurate. We saw this by going back to our original data and using the original values of x to predict y. • r is the linear correlation coefficient, and it is a measure of the ability of one variable (x) to predict another (y). • The closer this measure is to -1 or 1, the more accurately one variable predicts the other.
Calculating r • Here is our original table. What new column is needed?
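The formula image for r did not survive on this slide. As a reference, and assuming the lecture uses the standard computational formula for the linear correlation coefficient, it can be written as follows, where n is the number of (x, y) pairs:

```latex
\[
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
\]
```

The numerator is the same as the numerator for the slope b, so the sign of r always matches the direction of the regression line.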
Calculating r • Now we need to add a y² column, and solve for r!
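Here is a minimal Python sketch of the r calculation with the extra y² column. The data values are hypothetical (not the table from the slides), the function name `correlation_r` is made up for this example, and the standard computational formula for r is assumed.

```python
import math

# Hypothetical (x, y) pairs for illustration only; NOT the data from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def correlation_r(xs, ys):
    """Return the linear correlation coefficient r from column totals."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)  # the new y-squared column

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = correlation_r(xs, ys)
print(f"r = {r:.2f}")
```

For this made-up data r comes out to about 0.77, a weaker relationship than the .96 found for the lecture's data.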
Calculating % explained and unexplained • Our correlation coefficient (r) is .96. • Statisticians turn this number into something more “real world.” In order to show how much x explains changes in y (% explained variation), we use this formula: • % explained variation = 100r² • The % explained variation = 100(.96)² = 92.16 • This also means that the percent unexplained is around 8% (100 − 92.16 = 7.84).
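The conversion from r to percent explained and unexplained variation is a one-line calculation. A short Python sketch, using the lecture's r of .96 (the function name `explained_variation` is illustrative):

```python
def explained_variation(r):
    """Return (% explained, % unexplained) variation for a correlation r."""
    explained = 100 * r ** 2
    unexplained = 100 - explained
    return explained, unexplained


explained, unexplained = explained_variation(0.96)
print(f"{explained:.2f}% explained, {unexplained:.2f}% unexplained")
```

Because r is squared, a negative correlation of the same size (r = -.96) gives the same percentages.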
Here is a sample problem…question 9.4 from your textbook. Let’s do this one at your desks (individually or in groups). You can turn this in for bonus points on your test.