Advanced Data Analysis: Multiple Regression
Advanced Data Analysis: Multiple Regression • What is regression analysis? • Statistical technique that allows researchers to investigate the relationship between a dependent variable (Y) and one (X1) or several (X1, X2, etc.) independent variables • Provides a mathematical statement of the relationship • Allows the relationship between Y and several Xs to be examined simultaneously • Variables must be of interval or ratio scale (?)
Advanced Data Analysis: Multiple Regression • Correlation versus regression • Correlation -- closeness of the relationship between two variables (Y and X) • Regression -- derivation of a linear equation that explains the relationship between two or more variables (Y, X1, X2, etc.) • General equation: Y = a + b1X1 + b2X2 + ... + e • a is the intercept term; each bi represents the change in Y that is explained (or predicted) by a one-unit change in Xi, holding the other Xs constant; e is an error term
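To make the contrast concrete, the sketch below computes both the correlation r and the regression slope b for one small data set with a single X, and checks the link between them: b = r * (sd of Y / sd of X). The data values and function names are made up for illustration and are not taken from the slides.

```python
import math

def mean(values):
    return sum(values) / len(values)

def correlation_and_slope(x, y):
    """Return (r, b, a) for a single predictor X and outcome Y."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    r = sxy / math.sqrt(sxx * syy)   # correlation: closeness of the relationship
    b = sxy / sxx                    # slope: change in Y per one-unit change in X
    a = y_bar - b * x_bar            # intercept
    return r, b, a

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r, b, a = correlation_and_slope(x, y)
sd_ratio = math.sqrt(
    sum((yi - mean(y)) ** 2 for yi in y) / sum((xi - mean(x)) ** 2 for xi in x)
)
print(f"r = {r:.3f}, b = {b:.3f}, a = {a:.3f}")
print(f"r * (sd_Y / sd_X) = {r * sd_ratio:.3f}")  # matches b
```

Correlation is unit-free and symmetric in X and Y, while the slope carries the units of Y per unit of X, which is why the two numbers differ even though they come from the same sums.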
Advanced Data Analysis: Multiple Regression • The regression equation (example): • Y = 32 + .55X + e • When X = 0, predicted Y = 32 • For each one-unit increase in X, Y increases by .55 • When X = 1, Y = 32.55 • When X = 2, Y = 33.10
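The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch: the function name `predict` is hypothetical, and the error term e is ignored, so the values are predicted rather than observed Ys.

```python
def predict(x, a=32.0, b=0.55):
    """Predicted Y from the example equation Y = 32 + .55X (error term omitted)."""
    return a + b * x

for x in (0, 1, 2):
    print(f"X = {x}: predicted Y = {predict(x):.2f}")
# X = 0: predicted Y = 32.00
# X = 1: predicted Y = 32.55
# X = 2: predicted Y = 33.10
```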
Advanced Data Analysis: Multiple Regression • Coefficient of determination (r2 or R2) • r2 = 1 - [unexplained variation (in Y by X) / total variation in Y] or • r2 = explained variation (in Y by X) / total variation in Y • R2 – proportion of variation in Y explained by all X’s • In a perfect world r2 (R2) = 1 • Should be > 0 -- will test this!
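The two forms of r2 give the same number when the predictions come from a least-squares line. The sketch below checks this on a small made-up data set; the data, the fitted line Y = 2.2 + 0.6X (the least-squares line for these five points), and the function names are illustrative assumptions.

```python
def r_squared_from_unexplained(y, y_hat):
    """r2 = 1 - unexplained variation / total variation."""
    y_bar = sum(y) / len(y)
    total = sum((yi - y_bar) ** 2 for yi in y)
    unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1 - unexplained / total

def r_squared_from_explained(y, y_hat):
    """r2 = explained variation / total variation."""
    y_bar = sum(y) / len(y)
    total = sum((yi - y_bar) ** 2 for yi in y)
    explained = sum((fi - y_bar) ** 2 for fi in y_hat)
    return explained / total

# Hypothetical data; 2.2 + 0.6X is the least-squares line for these points
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = [2.2 + 0.6 * xi for xi in x]

r2_a = r_squared_from_unexplained(y, y_hat)
r2_b = r_squared_from_explained(y, y_hat)
print(f"1 - unexplained/total = {r2_a:.2f}")  # 0.60
print(f"explained/total       = {r2_b:.2f}")  # 0.60
```

The agreement relies on the predictions being least-squares predictions with an intercept; for an arbitrary line the two formulas can disagree.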
Advanced Data Analysis: Multiple Regression • How do researchers derive the mathematical relationship? • Estimate the “best” linear equation: the one that minimizes the sum of squared errors (Ordinary Least Squares algorithm)
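As a minimal sketch of what Ordinary Least Squares does, the code below fits a line for the single-X case using the closed-form solution and then confirms that nudging the intercept or slope only increases the sum of squared errors. The data and the names `ols_fit` and `sse` are hypothetical; with several Xs the same criterion is minimized, but the solution requires solving a system of equations rather than this one-line formula.

```python
def ols_fit(x, y):
    """Least-squares intercept a and slope b for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def sse(a, b, x, y):
    """Sum of squared errors for the line Y = a + bX."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, for illustration only
x = [0, 1, 2, 3, 4]
y = [32.3, 32.4, 33.2, 33.6, 34.3]

a, b = ols_fit(x, y)
best = sse(a, b, x, y)
print(f"a = {a:.2f}, b = {b:.2f}, SSE = {best:.4f}")

# Any small change to a or b gives a larger SSE than the OLS line
for da, db in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]:
    print(f"a{da:+.2f}, b{db:+.2f}: SSE = {sse(a + da, b + db, x, y):.4f}")
```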
Advanced Data Analysis: Multiple Regression • Interpretation of Results • Overall Model Evaluation • Ho: R2 = 0; Ha: R2 > 0 • F-test • Are individual b coefficients significant? • Ho: bi = 0; Ha: bi ≠ 0 • t-test • EXAMPLE
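The two test statistics can be sketched as follows. The F statistic uses the standard form F = (R2 / k) / ((1 - R2) / (n - k - 1)) for k predictors and n observations, and the t statistic is the coefficient divided by its standard error. The numbers are made up (a one-predictor fit with n = 5, R2 = 0.6, b = 0.6), and the function names are hypothetical; deciding significance still requires comparing each statistic with a critical value from an F or t table, which this sketch does not do.

```python
import math

def f_statistic(r2, n, k):
    """F statistic for Ho: R2 = 0, with k predictors and n observations."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

def t_statistic(b, se_b):
    """t statistic for Ho: b = 0."""
    return b / se_b

# Hypothetical one-predictor example
n, k = 5, 1
r2 = 0.6
b = 0.6
unexplained = 2.4          # sum of squared errors
sxx = 10.0                 # sum of squared deviations of X from its mean
mse = unexplained / (n - k - 1)
se_b = math.sqrt(mse / sxx)   # standard error of b in the one-predictor case

F = f_statistic(r2, n, k)
t = t_statistic(b, se_b)
print(f"F = {F:.2f} with ({k}, {n - k - 1}) degrees of freedom")
print(f"t = {t:.3f} with {n - k - 1} degrees of freedom")
print(f"t squared = {t ** 2:.2f}")  # equals F when there is only one X
```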
Advanced Data Analysis: Multiple Regression • Multicollinearity -- two or more X variables are highly correlated with each other • Reduces the explanatory power attributed to each individual variable (b-values become unstable and their standard errors increase) • Check correlations among the IVs (VIF) • Non-Linear Relationship -- relationship between X and Y cannot be explained with a straight line • Check for a non-linear relationship (e.g., transform X to X squared)
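Assuming the check on this slide refers to the variance inflation factor (VIF), the sketch below shows the calculation for the simplest case of two predictors, where VIF = 1 / (1 - r2) and r is the correlation between X1 and X2. The data and function names are hypothetical; with more than two predictors, r2 is replaced by the R2 from regressing each X on all the others.

```python
import math

def pearson_r(u, v):
    """Pearson correlation between two equal-length lists."""
    u_bar = sum(u) / len(u)
    v_bar = sum(v) / len(v)
    suv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    suu = sum((ui - u_bar) ** 2 for ui in u)
    svv = sum((vi - v_bar) ** 2 for vi in v)
    return suv / math.sqrt(suu * svv)

def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two Xs."""
    r = pearson_r(x1, x2)
    # Perfectly correlated predictors (r = 1 or -1) would divide by zero here
    return 1 / (1 - r ** 2)

# Hypothetical predictors, for illustration only
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

r = pearson_r(x1, x2)
vif = vif_two_predictors(x1, x2)
print(f"r(X1, X2) = {r:.2f}, VIF = {vif:.2f}")  # r = 0.80, VIF = 2.78
```

A VIF of 1 means the predictors are uncorrelated; larger values mean more of each predictor's variation is shared with the other.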
Advanced Data Analysis: Multiple Regression • X variables are nominal or ordinal scaled • Use dummy or effects coding • Dummy coding: one X with two levels – code 0 or 1 • Multiple levels – need k - 1 coded variables for k levels (e.g., two variables for three levels) • Interpret results in same way • Effects coding: one X with two levels – code -1 or 1 • Intercept term (a) is the mean value of Y (the grand mean, when group sizes are equal) • X (X1, X2) variables may “interact” • Create INTERACTION = X1 * X2
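The difference between the two coding schemes shows up in how the intercept and slope are read. The sketch below codes a two-level variable both ways and fits a one-predictor least-squares line each time, then builds an interaction term as an element-wise product. The group labels, data, and helper names are hypothetical, and the groups are deliberately of equal size.

```python
def ols_fit(x, y):
    """Least-squares intercept a and slope b for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical data: two groups of three observations each
group = ["A", "A", "A", "B", "B", "B"]
y = [10, 12, 14, 20, 22, 24]

dummy = [0 if g == "A" else 1 for g in group]     # dummy coding: 0 or 1
effects = [-1 if g == "A" else 1 for g in group]  # effects coding: -1 or 1

a_dummy, b_dummy = ols_fit(dummy, y)
a_effects, b_effects = ols_fit(effects, y)

# Dummy coding: a is the mean of the group coded 0, b is the difference in group means
print(f"dummy:   a = {a_dummy:.1f}, b = {b_dummy:.1f}")      # a = 12.0, b = 10.0
# Effects coding: a is the overall mean of Y, b is each group's deviation from it
print(f"effects: a = {a_effects:.1f}, b = {b_effects:.1f}")  # a = 17.0, b = 5.0

# Interaction term: element-wise product of two predictors
x1 = dummy
x2 = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]   # a second, hypothetical predictor
interaction = [p * q for p, q in zip(x1, x2)]
print(interaction)  # [0.0, 0.0, 0.0, 1.0, 2.0, 3.0]
```

Both codings give the same predicted Y for each group (12 for A, 22 for B); only the meaning of a and b changes.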