My last name starts with a letter somewhere between A. A – D B. E – L C. M – R D. S – Z

lyndon

Presentation Transcript


  1. Please click in. Set your clicker to channel 41. My last name starts with a letter somewhere between: A. A – D; B. E – L; C. M – R; D. S – Z

  2. Introduction to Statistics for the Social Sciences. SBS200, COMM200, GEOG200, PA200, POL200, SOC200. Lecture Section 001, Fall 2011. Room 201, Physics-Atmospheric Sciences (PAS). 10:00 - 10:50, Mondays & Wednesdays + Lab Session. Welcome! Please double check: all cell phones and other electronic devices are turned off and stowed away. http://www.youtube.com/watch?v=oSQJP40PcGI

  3. Use this as your study guide. By the end of lecture today (11/23/11): • Logic of hypothesis testing with correlations • Interpreting correlations and scatterplots • Simple and multiple regression • Using correlation for predictions • r versus r² • Regression uses the predictor variable (independent) to make predictions about the predicted variable (dependent) • Coefficient of correlation is the name for “r” • Coefficient of determination is the name for “r²” (remember it is always positive: no direction info) • Standard error of the estimate is our measure of the variability of the dots around the regression line (the average deviation of each data point from the regression line, like a standard deviation) • Coefficient of regression will be “b” for each variable (like slope)

  4. Readings for next exam. Lind: Chapter 13: Linear Regression and Correlation; Chapter 14: Multiple Regression; Chapter 15: Chi-Square. Plous: Chapter 17: Social Influences; Chapter 18: Group Judgments and Decisions.

  5. Homework due next class, November 30th. Assignment 14: Regression worksheet (can be found on the class website).

  6. Regression Example. The manager of a copier company wants to determine whether there is a relationship between the number of sales calls made in a month and the number of copiers sold that month. The manager selects a random sample of 10 representatives and determines the number of sales calls each representative made last month and the number of copiers each sold.

  7. Scatter Diagram. What are we predicting?

  8. Correlation Coefficient– Excel Example

  9. Correlation Coefficient – Excel Example • Interpret r = 0.759 (Excel output: 0.759014) • Positive relationship between the number of sales calls and the number of copiers sold. • Strong relationship • Remember, we have not demonstrated cause and effect here, only that the two variables (sales calls and copiers sold) are related.

  10. Correlation Coefficient – Excel Example • Interpret r = 0.759 (Excel output: 0.759014) • Does this correlation reach significance? • n = 10, df = n - 2 = 8 • alpha = .05 • Observed r is larger than critical r (0.759 > 0.632), therefore we reject the null hypothesis. • r(8) = 0.759; p < 0.05
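The comparison of observed r with critical r can also be checked by converting to a t statistic. The sketch below is only an illustration: the function names are hypothetical, and the critical t value of 2.306 (two-tailed, alpha = .05, df = 8) is taken from a standard t table rather than from the slides.

```python
import math

def t_from_r(r: float, n: int) -> float:
    """t statistic for testing H0: no correlation, with df = n - 2."""
    df = n - 2
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)

def critical_r(t_crit: float, df: int) -> float:
    """Convert a critical t value into the equivalent critical r."""
    return t_crit / math.sqrt(t_crit * t_crit + df)

R_OBSERVED = 0.759   # from the slide
N = 10               # from the slide
T_CRIT_DF8 = 2.306   # assumption: standard two-tailed t table value, alpha = .05, df = 8

t_obs = t_from_r(R_OBSERVED, N)
r_crit = critical_r(T_CRIT_DF8, N - 2)
reject_null = abs(R_OBSERVED) > r_crit

print(f"t({N - 2}) = {t_obs:.3f}, critical r = {r_crit:.3f}, reject H0: {reject_null}")
```

Both routes give the same decision: the observed r exceeds the critical r exactly when the observed t exceeds the critical t.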

  11. Coefficient of Determination – Excel Example • Interpret r² = 0.576 (0.759² = 0.576) • We can say that 57.6 percent of the variation in the number of copiers sold is explained, or accounted for, by the variation in the number of sales calls. • Remember, we lose the directionality of the relationship with r².
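As a small illustration of why r² carries no direction information, the sketch below squares the r reported in the Excel output and also squares its negative. The function name is hypothetical.

```python
def coefficient_of_determination(r: float) -> float:
    """Square the correlation coefficient to get the share of variation explained."""
    return r * r

r_excel = 0.759014  # value shown in the Excel output on the slide

r_squared = coefficient_of_determination(r_excel)
r_squared_if_negative = coefficient_of_determination(-r_excel)

print(round(r_squared, 3))                  # share of variation explained
print(r_squared == r_squared_if_negative)   # the sign of r is lost after squaring
```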

  12. Regression Analysis – Least Squares Principle. When we calculate the regression line we try to: • minimize the distance between the predicted Ys and the actual (data) Y points (the length of the green lines) • remember, because negative and positive values cancel each other out, we have to square those distances (deviations) • so we are trying to minimize the “sum of squares of the vertical distances between the actual Y values and the predicted Y values”
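The least squares principle can be sketched in a few lines. The data below are made up purely for illustration (they are not the copier data from the lecture), and the function names are hypothetical. The point is that the fitted line has a smaller sum of squared vertical distances than any nudged version of it.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y' = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances between actual Y and predicted Y'."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical illustration data, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
best_sse = sum_squared_errors(xs, ys, a, b)
shifted_sse = sum_squared_errors(xs, ys, a + 0.5, b)   # same slope, higher intercept
tilted_sse = sum_squared_errors(xs, ys, a, b + 0.1)    # same intercept, steeper slope

print(f"Y' = {a:.2f} + {b:.2f}X, SSE = {best_sse:.2f}")
print(f"shifted line SSE = {shifted_sse:.2f}, tilted line SSE = {tilted_sse:.2f}")
```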

  13. Regression Equation – Example. If you make this many calls (X), you probably sell this much (Y’). Step 3: State the regression equation: Y’ = a + bX, so Y’ = 18.9476 + 1.1842 X. What is the expected number of copiers sold by a representative who made 20 calls? Step 4: Solve for Y’ at the given value of X: Y’ = 18.9476 + 1.1842 (20) = 42.63

  14. Regression Equation – Example. If you make this many calls (X), you probably sell this much (Y’). Step 3: State the regression equation: Y’ = a + bX, so Y’ = 18.9476 + 1.1842 X. What is the expected number of copiers sold by a representative who made 40 calls? Step 4: Solve for Y’ at the given value of X: Y’ = 18.9476 + 1.1842 (40) = 66.3156
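The two worked predictions can be reproduced with a short function. The intercept and slope come from the slides; the function and constant names are hypothetical.

```python
INTERCEPT_A = 18.9476  # "a" from the slide
SLOPE_B = 1.1842       # "b" from the slide

def predict_copiers_sold(sales_calls: float) -> float:
    """Apply the regression equation Y' = a + bX."""
    return INTERCEPT_A + SLOPE_B * sales_calls

for calls in (20, 40):
    print(f"{calls} calls -> predicted copiers sold: {predict_copiers_sold(calls):.4f}")
```

As with any regression line, predictions are most trustworthy for values of X inside the range of the sampled data.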

  15. The Standard Error of Estimate. The standard error of estimate measures the scatter, or dispersion, of the observed values around the line of regression. A formula that can be used to compute the standard error of the estimate (line): $s_{y \cdot x} = \sqrt{\frac{\sum (Y - Y')^2}{n - 2}}$

  16. The Standard Error of Estimate Step 1: List all the Y data points

  17. The Standard Error of Estimate Step 1: List all the Y data points Step 2: Find all the predicted Y’ data points

  18. The Standard Error of Estimate Step 3: Find deviations Step 4: Square and add up deviations

  19. Then simply plug in the numbers and solve for the standard error of the estimate: $s_{y \cdot x} = \sqrt{\frac{784.211}{10 - 2}} = 9.901$. Remember, conceptually this is like the average length of those green lines.
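Steps 1 through 4 and the final plug-in can be sketched as follows. The sum of squared deviations (784.211) and n = 10 come from the slide; the function names are hypothetical, and the small data set used for the step-by-step version is made up for illustration.

```python
import math

def standard_error_from_sse(sse: float, n: int) -> float:
    """Standard error of the estimate from the sum of squared deviations."""
    return math.sqrt(sse / (n - 2))

def standard_error_of_estimate(ys, predicted_ys) -> float:
    """Steps 1-4: list Y, list Y', find deviations, square and add them, then plug in."""
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predicted_ys))
    return standard_error_from_sse(sse, len(ys))

# Values from the slide: sum of squared deviations and sample size.
copier_se = standard_error_from_sse(784.211, 10)
print(round(copier_se, 3))

# Hypothetical illustration data, not from the slides.
toy_ys = [2, 4, 5, 4, 5]
toy_predicted = [2.8, 3.4, 4.0, 4.6, 5.2]
toy_se = standard_error_of_estimate(toy_ys, toy_predicted)
print(round(toy_se, 3))
```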

  20. Multiple regression equations • Can use variables to predict • behavior of stock market • probability of accident • amount of pollution in a particular well • quality of a wine for a particular year • which candidates will make best workers

  21. Multiple regression equations • Prediction line: Y’ = b1X1 + b0. We can predict the amount of crime in a city from the number of bathrooms in the city. How many independent variables? 1. How many dependent variables? 1. • Prediction line: Y’ = b1X1 + b2X2 + b0. We can predict the amount of crime in a city from the number of bathrooms in the city and the amount spent on education in the city. How many independent variables? 2. How many dependent variables? 1. • Prediction line: Y’ = b1X1 + b2X2 + b3X3 + b0. We can predict the amount of crime in a city from the number of bathrooms in the city, the amount spent on education in the city, and the amount spent on after-school programs. How many independent variables? 3. How many dependent variables? 1.

  22. Multiple regression • Used to describe the relationship between several independent variables and a dependent variable. Can we predict the amount of crime in a city from the number of bathrooms and the amounts spent on education and on after-school programs? Prediction line: Y’ = b1X1 + b2X2 + b3X3 + b0 • X1, X2 and X3 are the independent variables. • Y is the dependent variable (amount of crime). • b0 is the Y-intercept. • b1 is the net change in Y for each unit change in X1, holding X2 and X3 constant. It is called a regression coefficient.

  23. Multiple regression will use multiple independent variables to predict the single dependent variable. The predicted variable goes on the “Y” axis and is called the dependent variable. The predictor variable goes on the “X” axis and is called the independent variable. [Figure: yearly income (dependent variable, predicted) plotted against expenses per year and savings (independent variables, predictors): if you spend this much and save this much, you probably make this much.]

  24. Regression Plane for a 2-Independent Variable Linear Regression Equation

  25. Multiple regression equations. Very often we want to select students or employees who have the highest probability of success in our school or company. Andy is an administrator at a paralegal program and he wants to predict the Grade Point Average (GPA) for the incoming class. He thinks these independent variables will be helpful in predicting GPA: • High School GPA (X1) • SAT - Verbal (X2) • SAT - Mathematical (X3). Andy completes a multiple regression analysis and comes up with this regression equation. Prediction line: Y’ = b1X1 + b2X2 + b3X3 + a. With his coefficients: Y’ = 1.2 X1 + .00163 X2 - .00194 X3 - .411, that is, Y’ = 1.2 gpa + .00163 satverb - .00194 satmath - .411

  26. Here comes Victoria. Her scores are as follows: • High School GPA = 3.81 • SAT - Verbal = 500 • SAT - Mathematical = 600. What would we predict her GPA to be in the paralegal program? Prediction line: Y’ = b1X1 + b2X2 + b3X3 + a, that is, Y’ = 1.2 gpa + .00163 satverb - .00194 satmath - .411. Y’ = 1.2 (3.81) + .00163 (500) - .00194 (600) - .411 = 4.572 + .815 - 1.164 - .411 = 3.812. We predict Victoria will have a GPA of 3.812. Predict Victor’s GPA. His scores are as follows: • High School GPA = 2.63 • SAT - Verbal = 469 • SAT - Mathematical = 440. Y’ = 1.2 (2.63) + .00163 (469) - .00194 (440) - .411 = 3.156 + .76447 - .8536 - .411 = 2.656 (about 2.66). We predict Victor will have a GPA of 2.656.
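Both predictions can be reproduced with a short function built from Andy's regression equation. The coefficients and student scores come from the slides; the function and constant names are hypothetical.

```python
# Coefficients from Andy's regression equation on the slide.
B_HS_GPA = 1.2
B_SAT_VERBAL = 0.00163
B_SAT_MATH = -0.00194
INTERCEPT_A = -0.411

def predict_program_gpa(hs_gpa: float, sat_verbal: float, sat_math: float) -> float:
    """Apply Y' = b1*X1 + b2*X2 + b3*X3 + a."""
    return (B_HS_GPA * hs_gpa
            + B_SAT_VERBAL * sat_verbal
            + B_SAT_MATH * sat_math
            + INTERCEPT_A)

victoria_gpa = predict_program_gpa(3.81, 500, 600)
victor_gpa = predict_program_gpa(2.63, 469, 440)

print(f"Victoria: {victoria_gpa:.3f}")
print(f"Victor:   {victor_gpa:.3f}")
```

Each coefficient is the net change in predicted GPA for a one-unit change in that variable, holding the other two constant.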

  27. Thank you! See you next time!!
