
Regression


Presentation Transcript


  1. Regression single and multiple

  2. Overview • Defined: A model for predicting one variable from other variable(s). • Variables: IV(s) is continuous, DV is continuous • Relationship: Relationship amongst variables • Example: Can we predict height from weight? (or weight from height? or either one from multiple variables?) • Assumptions: Normality. Linearity. No excessive multicollinearity among IVs

  3. Regression is about finding the best straight line

  4. The best straight line is the one that minimizes S, the sum of the squared residuals (the squared vertical distances between each data point and the line)
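The least-squares idea above can be sketched in a few lines of NumPy. The height/weight numbers below are made up for illustration; `np.polyfit` with degree 1 returns the slope and intercept that minimize S.

```python
# Sketch: finding the intercept and slope that minimize S, the sum of
# squared residuals, on hypothetical height/weight data.
import numpy as np

weight = np.array([60.0, 65.0, 70.0, 75.0, 80.0])       # X (made-up values)
height = np.array([160.0, 165.0, 172.0, 175.0, 181.0])  # Y (made-up values)

# np.polyfit(x, y, 1) returns [slope, intercept] of the least-squares line.
slope, intercept = np.polyfit(weight, height, 1)

predicted = intercept + slope * weight
S = np.sum((height - predicted) ** 2)  # the quantity the best line minimizes

print(intercept, slope, S)  # intercept ≈ 97.8, slope ≈ 1.04, S = 2.8
```

Any other slope/intercept pair plugged into the same formula would give a larger S; that is what "best straight line" means here.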

  5. Once we find the best straight line, we know the “intercept” and the “slope”:

  6. Same Intercept, Different slope

  7. Same slope, Different Intercept

  8. Relationship between correlation and regression • Correlation • expresses the strength and direction of the relationship between two variables. • Regression • is an extension of correlation that allows you to make predictions about one variable from other variable(s) • Bivariate regression (1 IV and 1 DV) produces the same result as correlation • Multiple regression (1+ IVs and 1 DV) goes a step further than correlation

  9. Relationship between correlation and regression • Hypothesis: • What is the relationship between gun ownership and murder rate within a city? • Correlation: • Imagine you are a researcher interested in the relationship between the number of registered weapons (“weapons”) and the murder rate (“murder”), so you collect data on those two variables from many different cities. You find a strong positive relationship (.885) between the two variables that is statistically significant (p=.003).

  10. Relationship between correlation and regression • Regression: • Now, imagine you are the Mayor of Los Angeles. You are considering lifting the ban on automatic weapons. You want to predict whether lifting the ban (so increasing the number of automatic weapons on the streets) will impact the murder rate. • You are going to use the data (from the above 8 cities) to PREDICT the relationship for a 9th city – Los Angeles. You find a strong positive relationship (.885) between the two variables that is statistically significant (p=.003).

  11. Relationship between correlation and regression • Regression: • We can now use numbers from output to create a “regression line” • For example, the regression line is: Y = a + bX Y = the unknown score on the variable you are predicting. a = the Y-intercept of the regression line. b = the slope of the regression line. X = the known score on the other variable you are using to make a prediction. • Y = a + b * X Murders = 4.047 + .853 * Weapons

  12. Relationship between correlation and regression • Regression: • Y = a + b * X Murders = 4.047 + .853 * Weapons • If you are the Mayor of Los Angeles, simply insert into the regression equation the number of weapons on the street in Los Angeles (X), and you can predict the number of murders (Y) • If 1000 weapons, then murders will be = 857 If 2000 weapons, then murders will be = 1710 If 3000 weapons, then murders will be = 2563
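The predictions on this slide come straight from plugging X into the fitted equation. A minimal sketch, using the slide's own coefficients (a = 4.047, b = .853):

```python
# Sketch: predictions from the slide's regression equation
# Murders = 4.047 + 0.853 * Weapons
def predict_murders(weapons):
    return 4.047 + 0.853 * weapons

for weapons in (1000, 2000, 3000):
    print(weapons, round(predict_murders(weapons)))
# 1000 -> 857, 2000 -> 1710, 3000 -> 2563, matching the slide
```

Note that these are extrapolations: Los Angeles was not one of the 8 cities the line was fit to, so the prediction assumes the same linear relationship holds there.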

  13. Multiple Regression • Using several “predictors” simultaneously • Example: Study about internalizing violence (DV) • Degree of witnessing violence X1 • Measure of life stress X2 • Measure of social support X3

  14. Multiple Regression • Given this diagram, what would you want to know? • (1) When all three are entered, the overall prediction (variance explained) of the DV

  15. Multiple Regression • (2) unique prediction of each variable

  16. Multiple Regression • The three things you typically want to know are… • Overall effect (of all variables) = R2 • Unique effect of each variable, while controlling for the others = Beta • Unique effect of each variable, without controlling for others = correlation matrix (same as separate bivariate regressions)
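The three quantities on this slide can be computed directly with NumPy. The data below are simulated stand-ins for the witnessing-violence (x1), life-stress (x2), and social-support (x3) example; the true coefficients are chosen arbitrarily for illustration.

```python
# Sketch: R2, standardized betas, and zero-order correlations for a
# three-predictor multiple regression on simulated data.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)                     # witnessing violence (simulated)
x2 = rng.normal(size=n)                     # life stress (simulated)
x3 = rng.normal(size=n)                     # social support (simulated)
y = 0.5 * x1 + 0.3 * x2 - 0.4 * x3 + rng.normal(size=n)  # internalizing (DV)

X = np.column_stack([np.ones(n), x1, x2, x3])
b, *_ = np.linalg.lstsq(X, y, rcond=None)   # intercept + raw coefficients

# Overall effect of all predictors together = R2.
y_hat = X @ b
r_squared = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Unique effect of each variable, controlling for the others = Beta
# (raw coefficient rescaled by the SDs of X and Y).
betas = b[1:] * np.array([x1.std(), x2.std(), x3.std()]) / y.std()

# Effect of each variable WITHOUT controlling for the others =
# zero-order correlations (the correlation matrix column for the DV).
corrs = [np.corrcoef(x, y)[0, 1] for x in (x1, x2, x3)]

print(r_squared, betas, corrs)
```

Comparing `betas` with `corrs` shows the difference the slide is drawing: betas partial out the other predictors, zero-order correlations do not.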

  17. Multiple Regression • What we have just talked about is: • Entry(all simultaneously) • But you have other options as well: • Hierarchical(you specify order) • Stepwise (computer chooses based on criteria) • Backward • Forward • Stepwise

  18. Hierarchical • You enter the variables in a specified order (called steps or blocks). • Block 1 tells you unique effect of the variable(s) • Block 2 tells you unique effect of the new variable(s) • And so forth
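Hierarchical entry amounts to fitting the model block by block and reading off the R2 change at each step. A minimal sketch, assuming blocks are simply groups of columns added to the design matrix in a researcher-specified order:

```python
# Sketch: hierarchical entry as R-squared change per block,
# on simulated data (block contents are hypothetical).
import numpy as np

def r_squared(X, y):
    """R2 of the least-squares fit of y on X (intercept added)."""
    M = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(M, y, rcond=None)
    resid = y - M @ b
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

rng = np.random.default_rng(1)
n = 150
block1 = rng.normal(size=(n, 1))   # Block 1: e.g. a control variable
block2 = rng.normal(size=(n, 2))   # Block 2: the new predictors
y = block1[:, 0] + block2[:, 0] + rng.normal(size=n)

r2_step1 = r_squared(block1, y)
r2_step2 = r_squared(np.hstack([block1, block2]), y)

print(r2_step1)               # unique effect of Block 1
print(r2_step2 - r2_step1)    # unique effect of Block 2, over and above Block 1
```

The R2 change at each step is what "unique effect of the new variable(s)" means on this slide.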

  19. Forward • Computer first enters the predictor with the highest correlation to the DV • Computer then enters the predictor with the highest semi-partial correlation to the DV • (if V1 explained 40% of the DV, then 60% is unexplained, so which variable best explains that 60%?) • Computer then enters the predictor with the next highest semi-partial correlation to the DV • (if V1 and V2 explained 80%, then which variable best explains the remaining 20%, etc.) • and so forth… • Stops when no new variable significantly explains the residual variation.
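The forward procedure above can be sketched as a loop that repeatedly adds whichever remaining predictor most increases R2. Real packages stop using an F-test or p-value criterion; the simple R2-gain threshold `min_gain` below is a hypothetical stand-in for that stopping rule.

```python
# Sketch: forward selection, stopping when the best remaining
# predictor adds less than min_gain to R2 (hypothetical criterion).
import numpy as np

def r2(cols, y, X):
    """R2 of y regressed on the listed columns of X (intercept added)."""
    M = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    b, *_ = np.linalg.lstsq(M, y, rcond=None)
    resid = y - M @ b
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def forward_select(X, y, min_gain=0.01):
    chosen, remaining = [], list(range(X.shape[1]))
    while remaining:
        base = r2(chosen, y, X)
        gains = {j: r2(chosen + [j], y, X) - base for j in remaining}
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:   # no new variable explains enough residual
            break
        chosen.append(best)
        remaining.remove(best)
    return chosen

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))
y = 2.0 * X[:, 1] - 1.0 * X[:, 3] + rng.normal(size=300)
print(forward_select(X, y))  # expected to enter column 1 first, then column 3
```

The first pick maximizes the plain correlation with the DV; every later pick maximizes the gain over what is already entered, which is the semi-partial-correlation idea from the slide.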

  20. Backward • Computer enters all variables and calculates the unique contribution of each. • A removal criterion is set, and variables that don’t meet it are removed from the analysis. • The new model is then analyzed, and again variables that don’t meet the criterion are removed. • Stops when no remaining variable meets the removal criterion

  21. Stepwise • Combination of Forward and Backward • Similar to Forward in that… • Computer first enters the predictor with the highest correlation to the DV • Computer then enters the predictor with the highest semi-partial correlation to the DV • Similar to Backward in that… • A removal criterion is set, and already-entered variables that no longer meet it are removed from the analysis

  22. How to choose which variables and how • Correlation matrix • Choose IVs that are somewhat correlated with the DV • Choose IVs that are not too highly correlated with the other IVs • Regression • Analyze your hypothesis first • Then start “exploratory” analysis • Statisticians frown upon too much exploratory work as “fishing” • Entry and Hierarchical are preferred over stepwise methods. If stepwise, Backward is preferred over the others.
