Please click in. Set your clicker to channel 41. My last name starts with a letter somewhere between A. A – D B. E – L C. M – R D. S – Z.
Introduction to Statistics for the Social Sciences
SBS200, COMM200, GEOG200, PA200, POL200, SOC200
Lecture Section 001, Fall 2011
Room 201 Physics-Atmospheric Sciences (PAS)
10:00 - 10:50 Mondays & Wednesdays + Lab Session
Welcome
Please double check: all cell phones and other electronic devices are turned off and stowed away
http://www.youtube.com/watch?v=oSQJP40PcGI
Use this as your study guide
By the end of lecture today (11/21/11):
• Logic of hypothesis testing with correlations
• Interpreting correlations and scatterplots
• Simple regression
• Using correlation for predictions
• r versus r²
Readings for next exam Lind Chapter 13: Linear Regression and Correlation Chapter 14: Multiple Regression Chapter 15: Chi-Square Plous Chapter 17: Social Influences Chapter 18: Group Judgments and Decisions
Homework due next class (November 23rd)
Assignment 14: Regression worksheet (can be found on class website)
Homework questions?
Correlation
Negative correlation, zero correlation, positive correlation
[Figure: correlation matrix showing the p value for the correlation of X with Y, the correlation of X with X, the correlation of X with Y, and the correlation of X with Z]
Five steps to hypothesis testing
Step 1: Identify the research problem (hypothesis). Describe the null and alternative hypotheses. For correlation, the null is that r = 0 (no relationship)
Step 2: Decision rule
• Alpha level? (α = .05 or .01)
• Critical statistic (e.g. critical r) value from table?
Step 3: Calculations
Step 4: Make decision whether or not to reject null hypothesis. If the observed r is bigger than the critical r (ignoring the sign), then reject the null
Step 5: Conclusion - tie findings back in to research problem
Finding a statistically significant correlation
• The result is “statistically significant” if:
• the observed correlation is larger than the critical correlation: we want our r to be big if we want it to be significantly different from zero!! (either negative or positive, but just far away from zero)
• the p value is less than 0.05 (which is our alpha): we want our “p” to be small!!
• then we reject the null hypothesis
• and we have support for our alternative hypothesis
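The decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard conversion between r and t (with n - 2 degrees of freedom), the function names are hypothetical, the example values (r = -0.85 from n = 5 pairs) are chosen for illustration, and 3.182 is the two-tailed critical t for df = 3 at α = .05 as read from a standard t table.

```python
import math

def t_from_r(r, n):
    """Convert a Pearson r based on n pairs into a t statistic (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def critical_r(t_crit, df):
    """Critical r implied by a critical t value with the same df."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

def is_significant(r, n, t_crit):
    """Two-tailed decision: reject the null (r = 0) when |r| exceeds critical r."""
    return abs(r) > critical_r(t_crit, n - 2)

# Assumed illustration: r = -0.85 from n = 5 pairs.
# 3.182 is the two-tailed critical t for df = 3 at alpha = .05 (t table).
r_obs, n_pairs, t_crit_df3 = -0.85, 5, 3.182

print(round(critical_r(t_crit_df3, n_pairs - 2), 3))   # critical r
print(is_significant(r_obs, n_pairs, t_crit_df3))      # reject the null?
```

With only five pairs the critical r is large, so even a correlation this far from zero does not reach significance in this sketch; sample size matters as much as the size of r.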
Correlation: Independent and dependent variables
• When used for prediction, we refer to the predicted variable as the dependent variable and the predictor variable as the independent variable
[Figure: two examples, each labeled “What are we predicting?”, “Dependent Variable”, and “Independent Variable”]
Correlation - What do we need to define a line?
Y-intercept = “a” (also “b0”): where the line crosses the Y axis
Slope = “b” (also “b1”): how steep the line is
• The predicted variable goes on the “Y” axis and is called the dependent variable
• The predictor variable goes on the “X” axis and is called the independent variable
[Figure: yearly income plotted against expenses per year; “If you spend this much, you probably make this much”]
Draw a scatterplot
X     Y     XY    X²    Y²
1     5     5     1     25
3     4     12    9     16
2     3     6     4     9
3     2     6     9     4
5     1     5     25    1
Σ 14  15    34    48    55
Draw a regression line and write the regression equation
r = -0.85, b = -0.91 (slope), a = 5.5 (intercept)
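As a check on these values, here is a minimal sketch that recomputes r, the slope, and the intercept from the column totals of the worked table (ΣX = 14, ΣY = 15, ΣXY = 34, ΣX² = 48, ΣY² = 55, n = 5). The function names are hypothetical, and the computational (raw-score) formulas are assumed to be the ones intended by the slides.

```python
import math

# Column totals from the worked table (n = 5 pairs).
n = 5
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = 14, 15, 34, 48, 55

def pearson_r(n, sx, sy, sxy, sx2, sy2):
    """Pearson r from raw sums."""
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))
    return numerator / denominator

def slope(n, sx, sy, sxy, sx2):
    """Least-squares slope b from raw sums."""
    return (n * sxy - sx * sy) / (n * sx2 - sx ** 2)

def intercept(n, sx, sy, b):
    """Least-squares intercept a = mean(Y) - b * mean(X)."""
    return sy / n - b * (sx / n)

r = pearson_r(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
b = slope(n, sum_x, sum_y, sum_xy, sum_x2)
a = intercept(n, sum_x, sum_y, b)

print(round(r, 2), round(b, 2), round(a, 2))
```

The intercept comes out near 5.55 before rounding, which matches the slide's 5.5 when reported to one decimal place.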
Prediction line: Y’ = a + b1X1
Frequency of teeth brushing = 5.5 + (-.91) Cavities (Y-intercept = 5.5, slope = -.91)
If “Cavities” = 3, what is the prediction for “Frequency of teeth brushing”?
Frequency of teeth brushing = 5.5 + (-.91) (3)
Frequency of teeth brushing = 5.5 + (-2.73) = 2.77
If number of cavities = 3, frequency of teeth brushing will be about 2.77: the point (3.0, 2.77) on the line
Other problems: What is the expected frequency of teeth brushing for having one cavity?
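The same substitution can be written as a small function. This is a sketch only: the function name is hypothetical, and the default coefficients are the rounded values from the slide (a = 5.5, b = -.91).

```python
def predict_brushing(cavities, a=5.5, b=-0.91):
    """Predicted frequency of teeth brushing: Y' = a + b * X."""
    return a + b * cavities

# Prediction for 3 cavities, as in the worked example.
print(round(predict_brushing(3), 2))
```

Calling the function with a different number of cavities gives the corresponding point on the prediction line.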
How well does the prediction line predict the Ys from the Xs? Residuals
[Figure: scatterplot of number of times per day teeth are brushed (Y axis, 0 to 5) against number of cavities (X axis, 0 to 5), with green lines from each point to the prediction line]
• Shorter green lines suggest better prediction (smaller error)
• Longer green lines suggest worse prediction (larger error)
• Why are green lines vertical?
• Remember, we are predicting the variable on the Y axis
• So, error would be how we are wrong about Y (vertical)
How well does the prediction line predict the Ys from the Xs? Residuals
[Figure: scatterplot of number of times per day teeth are brushed (Y axis, 0 to 5) against number of cavities (X axis, 0 to 5)]
• Slope doesn’t give “variability” info
• Intercept doesn’t give “variability” info
• Correlation “r” does give “variability” info
• Residuals do give “variability” info
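A residual is the vertical distance between an observed Y and the predicted Y’. The sketch below computes one residual per point. It assumes the X and Y columns of the worked table are the cavities and brushing data, and it uses the rounded coefficients a = 5.5 and b = -.91; the variable and function names are hypothetical.

```python
# Assumed data: X = number of cavities, Y = times per day teeth are brushed.
cavities = [1, 3, 2, 3, 5]
brushing = [5, 4, 3, 2, 1]

def residuals(xs, ys, a=5.5, b=-0.91):
    """Observed Y minus predicted Y' for each point (vertical error)."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

errors = residuals(cavities, brushing)
for x, y, e in zip(cavities, brushing, errors):
    print(f"cavities={x} observed={y} residual={e:+.2f}")
```

Positive residuals are points above the line and negative residuals are points below it; the size of each value corresponds to the length of a green line.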
Finding the standard error of the estimate (line)
What if we want to know the “average deviation score”? Sound familiar??
Standard error of the estimate:
• a measure of the average amount of predictive error
• the average amount that Y’ scores differ from Y scores
• a mean of the lengths of the green lines
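One common way to compute this quantity is the usual textbook formula: the square root of the sum of squared residuals divided by n - 2. The sketch below assumes that formula (the slides only describe the idea informally), assumes the worked table holds the cavities and brushing data, and uses the rounded coefficients a = 5.5 and b = -.91; the function name is hypothetical.

```python
import math

# Assumed data: X = number of cavities, Y = times per day teeth are brushed.
cavities = [1, 3, 2, 3, 5]
brushing = [5, 4, 3, 2, 1]

def standard_error_of_estimate(xs, ys, a=5.5, b=-0.91):
    """sqrt( sum((Y - Y')^2) / (n - 2) ), with Y' = a + b * X."""
    squared_errors = [(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared_errors) / (len(xs) - 2))

se = standard_error_of_estimate(cavities, brushing)
print(round(se, 3))
```

Squaring before averaging keeps positive and negative errors from cancelling, which is the same logic as a standard deviation; dividing by n - 2 rather than n reflects the two values (slope and intercept) estimated from the data.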
Which minimizes error better?
[Figure: two scatterplots of number of times per day teeth are brushed (Y axis, 0 to 5) against number of cavities (X axis, 0 to 5)]
r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable
How much better does the regression line predict the observed results? r². Wow!
What is r²?
r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable
Example: If mother’s and daughter’s heights are correlated with an r = .8, then what amount (proportion or percentage) of variance of mother’s height is accounted for by daughter’s height?
.64 because (.8)² = .64
What is r²?
r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable
Example: If mother’s and daughter’s heights are correlated with an r = .8, then what proportion of variance of mother’s height is not accounted for by daughter’s height?
.36 because (1.0 - .64) = .36
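The two height examples reduce to squaring r and subtracting from 1. A minimal sketch, with hypothetical function names:

```python
def variance_explained(r):
    """Proportion of variance accounted for: r squared."""
    return r ** 2

def variance_unexplained(r):
    """Proportion of variance not accounted for: 1 - r squared."""
    return 1 - r ** 2

r_heights = 0.8  # mother/daughter height correlation from the example
print(round(variance_explained(r_heights), 2))
print(round(variance_unexplained(r_heights), 2))
```

Because r is squared, the sign of the correlation does not matter here: r = -.8 accounts for the same proportion of variance as r = +.8.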
Correlation - the prediction line - what is it good for?
Prediction line
• makes the relationship easier to see (even if specific observations - dots - are removed)
• identifies the center of the cluster of (paired) observations
• identifies the central tendency of the relationship (kind of like a mean)
• can be used for prediction
• should be drawn to provide a “best fit” for the data
• should be drawn to provide maximum predictive (explanatory) power for the data (r²)
• should be drawn to provide minimum predictive error
Let’s try one: Which of these correlations would be most likely to have the highest positive value for r? a. Scatterplot A b. Scatterplot B c. Scatterplot C d. Cannot be determined from the information given
Let’s try one: Which of these scatterplots will have the smallest “y intercept”? a. Scatterplot A b. Scatterplot B c. Scatterplot C d. Cannot be determined from the information given
Let’s try one: Which of these correlations would be most likely to represent the correlation between salary and expenses? a. Scatterplot A b. Scatterplot B c. Scatterplot C d. Cannot be determined from the information given
Let’s try one: Which of the following correlations would allow you the most accurate predictions? a. r = + 0.01 b. r = - 0.10 c. r = + 0.40 d. r = - 0.65
Let’s try one: After duplicate correlations have been discarded and trivial correlations have been ignored, there remain a. two correlations b. three correlations c. six correlations d. nine correlations
Let’s try one: Which of the following conclusions cannot be made from the data in the matrix? a. There is a significant correlation between Science and Reading b. There is a significant correlation between Math and Reading c. There is a significant correlation between Math and Science
Let's assume that this scatterplot is depicting a correlation between heights and weights of children, and that this correlation is +0.6. From this we can conclude that what proportion of the variance of height is due to weight? a. 0.12 b. 0.36 c. 0.60 d. 1.00
Thank you! See you next time!!