[Seating chart: Integrated Learning Center, ILC 120. Screen, cabinets, lecturer’s desk, table, and computer storage cabinet at the front; numbered seats in Rows A through L; one desk marked as broken.]
Introduction to Statistics for the Social Sciences. SBS200, COMM200, GEOG200, PA200, POL200, or SOC200. Lecture Section 001, Spring 2013. Room 120 Integrated Learning Center (ILC). 10:00 - 10:50 Mondays, Wednesdays & Fridays. Welcome
Homework due: None due Wednesday (April 17th). Please click in: My last name starts with a letter somewhere between A. A – D B. E – L C. M – R D. S – Z
Lab sessions Labs continue this week
Use this as your study guide. Next couple of lectures (4/15/13): Simple and Multiple Regression. Using correlation for predictions; r versus r². Regression uses the predictor variable (independent) to make predictions about the predicted variable (dependent). Coefficient of correlation is the name for “r”. Coefficient of determination is the name for “r²” (remember it is always positive, so it carries no direction info). Standard error of the estimate is our measure of the variability of the dots around the regression line (average deviation of each data point from the regression line, like standard deviation). Coefficient of regression is the “b” for each variable (like slope).
Schedule of readings. Before next exam (Monday April 29th): Please read chapters 10 – 14. Please read Chapters 17 and 18 in Plous. Chapter 17: Social Influences. Chapter 18: Group Judgments and Decisions
Exam 4 – Optional Times for Final • Two options for completing Exam 4 • Monday (4/29/13) • Wednesday (5/1/13) • Must sign up to take Exam 4 on Friday (4/26) • Only need to take one exam – these are two optional times
Correlation: Independent and dependent variables • When used for prediction we refer to the predicted variable as the dependent variable and the predictor variable as the independent variable. [Figure labels: What are we predicting? Dependent Variable; Independent Variable]
r = +0.92; positive; strong. The relationship between the hours worked and weekly pay is a strong positive correlation. This correlation is significant, r(3) = 0.92; p < 0.05. up down. Intercept = 55.286; slope = 6.0857; y' = 6.0857x + 55.286. Predictions: 207.43 and 85.71. r² = .846231 or 84%. 84% of the total variance of “weekly pay” is accounted for by “hours worked”. For each additional hour worked, weekly pay will increase by $6.09.
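The arithmetic behind these answers can be checked with a short sketch. The inputs of 25 hours and 5 hours are not stated on the slide; they are assumed here because they reproduce the two reported predictions.

```python
# Regression line from the slide: y' = 6.0857x + 55.286
SLOPE = 6.0857       # b: change in weekly pay per additional hour worked
INTERCEPT = 55.286   # a: predicted weekly pay at zero hours


def predict_pay(hours):
    """Return the predicted weekly pay for a given number of hours worked."""
    return SLOPE * hours + INTERCEPT


# Assumed inputs (not given on the slide): 25 hours and 5 hours.
pay_25 = round(predict_pay(25), 2)
pay_5 = round(predict_pay(5), 2)

# Coefficient of determination is the square of the coefficient of correlation.
r = 0.92
r_squared = r ** 2

print(pay_25, pay_5, round(r_squared, 4))
```

Squaring the rounded r of 0.92 gives about 0.8464; the slide's .846231 presumably comes from the unrounded r.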
[Scatterplot: Wait Time (y-axis, 280 to 400) versus Number of Operators (x-axis, 4 to 8)]
Critical r = 0.878. No, we do not reject the null. r = -.73; negative; strong. The relationship between wait time and number of operators working is negative and strong. This correlation is not significant, r(3) = -0.73; n.s. As number of operators increases, wait time decreases. Intercept = 458; slope = -18.5; y' = -18.5x + 458. Predictions: 365 seconds and 328 seconds. r² = .53695 or 54%. The proportion of total variance of wait time accounted for by number of operators is 54%. For each additional operator added, wait time will decrease by 18.5 seconds.
[Scatterplot: Percent of BAs (y-axis, 21 to 39) versus Median Income (x-axis, 45 to 66)]
Critical r = 0.632. Yes, we reject the null. Predicted variable: percent of residents with a BA degree. n = 10; df = 8. r = 0.8875; positive; strong. The relationship between median income and percent of residents with BA degree is strong and positive. This correlation is significant, r(8) = 0.89; p < 0.05. As median income goes up, so does percent of residents who have a BA degree. Intercept = 3.1819; slope = 0.0005; y' = 0.0005x + 3.1819. Predictions: 25% of residents and 35% of residents. r² = .78766 or 78%. The proportion of total variance of % of BAs accounted for by median income is 78%. For each additional $1 in income, percent of BAs increases by .0005.
[Scatterplot: Crime Rate (y-axis, 12 to 30) versus Median Income (x-axis, 45 to 66)]
Critical r = 0.632. No, we do not reject the null. Predicted variable: crime rate. n = 10; df = 8. r = -0.6293; negative; moderate. The relationship between crime rate and median income is negative and moderate. This correlation is not significant, r(8) = -0.63; n.s. [0.6293 is not bigger than critical of 0.632]. As median income goes up, crime rate tends to go down. Intercept = 4662.5; slope = -0.0499; y' = -0.0499x + 4662.5. Predictions: 2,417 thefts and 1,418.5 thefts. r² = .396 or 40%. The proportion of total variance of thefts accounted for by median income is 40%. For each additional $1 in income, thefts go down by .0499.
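The reject / do-not-reject decisions on these slides all follow the same rule: compare the absolute value of the observed r with the critical r. A minimal sketch, using the r and critical r values reported on the slides (the function name and labels are made up for illustration):

```python
def is_significant(r, critical_r):
    """Reject the null hypothesis only when |r| exceeds the critical r."""
    return abs(r) > critical_r


# (label, observed r, critical r) as reported on the slides
examples = [
    ("wait time vs. number of operators", -0.73, 0.878),
    ("percent of BAs vs. median income", 0.8875, 0.632),
    ("crime rate vs. median income", -0.6293, 0.632),
]

results = {label: is_significant(r, crit) for label, r, crit in examples}

for label, reject in results.items():
    print(label, "-> reject null" if reject else "-> do not reject null")
```

The sign of r gives the direction of the relationship; only its size matters for the comparison, which is why the negative correlations are compared using their absolute values.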
Multiple regression equations. How many independent variables? 1. How many dependent variables? 1. Prediction line Y’ = b₁X₁ + b₀ • We can predict amount of crime in a city from • the number of bathrooms in city. How many independent variables? 2. How many dependent variables? 1. Prediction line Y’ = b₁X₁ + b₂X₂ + b₀ • We can predict amount of crime in a city from • the number of bathrooms in city • the amount spent on education in city. How many independent variables? 3. How many dependent variables? 1. Prediction line Y’ = b₁X₁ + b₂X₂ + b₃X₃ + b₀ • We can predict amount of crime in a city from • the number of bathrooms in city • the amount spent on education in city • the amount spent on after-school programs
Multiple regression • Used to describe the relationship between several independent variables and a dependent variable. Can we predict amount of crime in a city from the number of bathrooms and the amount spent on education and on after-school programs? Prediction line Y’ = b₁X₁ + b₂X₂ + b₃X₃ + b₀ • X₁, X₂ and X₃ are the independent variables. • Y is the dependent variable (amount of crime) • b₀ is the Y-intercept • b₁ is the net change in Y for each unit change in X₁, holding X₂ and X₃ constant. It is called a regression coefficient.
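The prediction line and the meaning of b1 can be sketched in a few lines. All numbers below are hypothetical and chosen only for illustration; none of them come from the lecture.

```python
def predict(b0, coefficients, xs):
    """Y' = b0 + b1*X1 + b2*X2 + ... for any number of predictors."""
    if len(coefficients) != len(xs):
        raise ValueError("need one X value per regression coefficient")
    return b0 + sum(b * x for b, x in zip(coefficients, xs))


# Hypothetical values (not from the lecture).
b0 = 100.0                  # Y-intercept
bs = [2.0, -0.5, -1.5]      # b1, b2, b3
city = [40.0, 30.0, 10.0]   # X1, X2, X3 for one city

baseline = predict(b0, bs, city)

# Raise X1 by one unit while holding X2 and X3 constant.
bumped = predict(b0, bs, [city[0] + 1, city[1], city[2]])

print(baseline, bumped - baseline)
```

The difference between the two predictions equals b1, which is what "net change in Y for each unit change in X1, holding the others constant" means.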
Multiple regression will use multiple independent variables to predict the single dependent variable. The predicted variable goes on the “Y” axis and is called the dependent variable. The predictor variable goes on the “X” axis and is called the independent variable. [Figure: Yearly Income plotted against Expenses per year. Labels: Dependent Variable (Predicted), “You probably make this much”; Independent Variable 1 (Predictor) and Independent Variable 2 (Predictor), “If you spend this much” and “If you save this much”]
Regression Plane for a 2-Independent Variable Linear Regression Equation
Multiple regression equations • Can use variables to predict • behavior of stock market • probability of accident • amount of pollution in a particular well • quality of a wine for a particular year • which candidates will make best workers
Multiple Linear Regression - Example Can we predict heating cost? Three variables are thought to relate to the heating costs: (1) the mean daily outside temperature, (2) the number of inches of insulation in the attic, and (3) the age in years of the furnace. To investigate, Salisbury's research department selected a random sample of 20 recently sold homes. It determined the cost to heat each home last January
The Multiple Regression Equation – Interpreting the Regression Coefficients b1 = The regression coefficient for mean outside temperature (X1) is -4.583. The coefficient is negative and shows a negative correlation between heating cost and temperature. As the outside temperature increases, the cost to heat the home decreases. The numeric value of the regression coefficient provides more information. If we increase temperature by 1 degree and hold the other two independent variables constant, we can estimate a decrease of $4.583 in monthly heating cost.
The Multiple Regression Equation – Interpreting the Regression Coefficients b2 = The regression coefficient for mean attic insulation (X2) is -14.831. The coefficient is negative and shows a negative correlation between heating cost and insulation. The more insulation in the attic, the less the cost to heat the home. So the negative sign for this coefficient is logical. For each additional inch of insulation, we expect the cost to heat the home to decline $14.83 per month, regardless of the outside temperature or the age of the furnace.
The Multiple Regression Equation – Interpreting the Regression Coefficients b3 = The regression coefficient for age of furnace (X3) is 6.101. The coefficient is positive and shows a positive correlation between heating cost and furnace age. As the age of the furnace goes up, the cost to heat the home increases. Specifically, for each additional year older the furnace is, we expect the cost to increase $6.10 per month.
Applying the Model for Estimation What is the estimated heating cost for a home if: • the mean outside temperature is 30 degrees, • there are 5 inches of insulation in the attic, and • the furnace is 10 years old?
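A minimal sketch of the estimate, using the three regression coefficients from the preceding slides. The intercept b0 does not appear in these slides, so it is left as an argument rather than guessed; the function names are made up for illustration.

```python
# Regression coefficients reported on the preceding slides.
B_TEMPERATURE = -4.583   # b1: dollars per degree of mean outside temperature
B_INSULATION = -14.831   # b2: dollars per inch of attic insulation
B_FURNACE_AGE = 6.101    # b3: dollars per year of furnace age


def predictor_contribution(temperature, insulation, furnace_age):
    """Return b1*X1 + b2*X2 + b3*X3, the part of Y' that excludes b0."""
    return (B_TEMPERATURE * temperature
            + B_INSULATION * insulation
            + B_FURNACE_AGE * furnace_age)


def estimate_heating_cost(b0, temperature, insulation, furnace_age):
    """Return Y' = b0 + b1*X1 + b2*X2 + b3*X3 for a supplied intercept b0."""
    return b0 + predictor_contribution(temperature, insulation, furnace_age)


# Home from the slide: 30 degrees, 5 inches of insulation, 10-year-old furnace.
contribution = predictor_contribution(30, 5, 10)
print(round(contribution, 3))
```

The three predictors together lower the estimate by about $150.64 relative to the intercept, so the estimated heating cost is b0 minus that amount once b0 is known.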
Thank you! See you next time!!