BAD 6243: Applied Univariate Statistics
Multiple Regression
Professor Laku Chidambaram
Price College of Business, University of Oklahoma
Basics of Multiple Regression
• Multiple regression examines the relationship between one interval/ratio-level variable and two or more interval/ratio (or dichotomous) variables
• As in simple regression, the dependent (or criterion) variable is Y, and the other variables are the independent (or predictor) variables X_i
• The intent of the regression model is to find the linear combination of Xs that correlates best with Y
• The model is expressed as: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon_i$
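
A minimal sketch of how the coefficients in the model could be estimated, assuming ordinary least squares (the slides do not name the estimation method). The helper names `solve_linear_system` and `fit_ols` and the data are made up for illustration; a statistics package would normally do this work.

```python
def solve_linear_system(a, b):
    """Solve a.x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]; rows are copied so the inputs stay untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Zero out this column in every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(xs, y):
    """Return [b0, b1, ..., bn] that minimize the sum of squared residuals."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 carries the intercept
    k = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data generated from Y = 1 + 2*X1 + 3*X2 with no error term,
# so the fit should recover those three coefficients.
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
coefficients = fit_ols(xs, y)
print([round(b, 6) for b in coefficients])
```

With real data the error term is not zero, so the estimates only approximate the underlying coefficients.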
A Graphical Representation
Objective: To graphically represent the equation $Y = \beta_0 + \beta_1\,\text{Exp\_X1} + \beta_2\,\text{RlExp\_X2} + \varepsilon_i$
Selecting Predictors
• Rely on theory to inform selection
• Examine the correlation matrix to determine the strength of relationships with Y
• Use variables based on your knowledge
• Let the computer decide based on the data set
Selecting Method of Inclusion
• Enter
• Enter – Block
• Stepwise
  • Forward selection
  • Backward elimination
  • Stepwise
What to Look For?
• b-values vs. standardized beta weights (β)
• R: represents the correlation between observed values and predicted values of Y
• R-squared: represents the proportion of variance in Y shared with all the predictors combined
• Adjusted R-squared
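
The quantities listed above can be computed from observed and fitted values. The sketch below is illustrative only: the function names and numbers are made up, and the adjusted R-squared formula used is the common one, 1 - (1 - R²)(n - 1)/(n - p - 1), which the slides do not spell out.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(ss_a * ss_b)


def fit_summary(y, y_hat, n_predictors):
    """Return multiple R, R-squared and adjusted R-squared."""
    n = len(y)
    mean_y = sum(y) / n
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    r_squared = 1 - ss_resid / ss_total
    # Penalize R-squared for the number of predictors relative to sample size.
    adjusted = 1 - (1 - r_squared) * (n - 1) / (n - n_predictors - 1)
    return {"R": pearson_r(y, y_hat), "R2": r_squared, "adj_R2": adjusted}


# Made-up observed and fitted values. The residuals were chosen to sum to zero
# and to be uncorrelated with the fitted values, as they would be after a
# least-squares fit with an intercept, so R squared equals R2 here.
y_obs = [2, 1, 2, 5, 6, 5, 6, 9]
y_fit = [1, 2, 3, 4, 5, 6, 7, 8]
summary = fit_summary(y_obs, y_fit, n_predictors=2)
print(summary)
```

Adjusted R-squared is never larger than R-squared, and the gap widens as predictors are added to a small sample.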
First Order Assumptions
• Continuous variables (also see next slide)
• Linear relationships between Y and Xs
• Sufficient variance in values of predictors
• Predictors uncorrelated with external variables
Including Categorical Variables
• Dichotomous variables: e.g., Gender
  • Coded as 0 or 1
• Dummy variables: e.g., Political affiliation
  • Create d - 1 dummy variables, where d is the number of categories
  • So, with four categories, you need three dummy variables
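
As a sketch of the d - 1 rule, the snippet below codes a four-category variable into three 0/1 columns. The category labels, the choice of reference category and the function name `dummy_code` are assumptions made for illustration.

```python
def dummy_code(values, reference):
    """Return (column_names, rows) with one 0/1 column per non-reference category."""
    categories = sorted(set(values))
    kept = [c for c in categories if c != reference]  # d - 1 categories remain
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


# Hypothetical political-affiliation data with four categories.
affiliation = ["Democrat", "Republican", "Independent", "Other", "Republican"]
columns, rows = dummy_code(affiliation, reference="Other")

print(columns)
for label, row in zip(affiliation, rows):
    print(label, row)
```

The reference category is the one coded as all zeros; each b-value for a dummy variable is then read as a difference from that category.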
Second Order Assumptions
• Independence of independent variables
• Equality of variance
• Normal distribution of error terms
• Independence of observations
Violations of Assumptions
Multicollinearity
• High correlations among predictors
• Can result in:
  • Lower value of R
  • Difficulty of judging relative importance of predictors
  • Increased instability of the model
• Possible solutions:
  • Examine correlation matrices, VIFs and tolerances to judge if predictor(s) need to be dropped
  • Rely on computer-assisted means
  • Other options
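
A rough sketch of the tolerance and VIF calculation, assuming the simplest case of exactly two predictors. In that case the R-squared from regressing one predictor on the other equals their squared correlation, so tolerance = 1 - r² and VIF = 1 / tolerance. With more predictors, each one would be regressed on all the others instead. The data and function names are made up.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(ss_a * ss_b)


def tolerance_and_vif(x1, x2):
    """Tolerance and variance inflation factor for a two-predictor model."""
    r = pearson_r(x1, x2)
    tolerance = 1 - r ** 2  # share of a predictor's variance not explained by the other
    return tolerance, 1 / tolerance


# Made-up predictors that are fairly strongly correlated.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
tolerance, vif = tolerance_and_vif(x1, x2)
print(round(tolerance, 3), round(vif, 3))
```

A VIF of 1 means a predictor is uncorrelated with the others; larger values indicate more overlap.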
Heteroskedasticity
• Systematic increase or decrease in variance
• Can result in:
  • Confidence intervals being too wide or narrow
  • Unstable estimates
• Possible solutions:
  • Transform data
  • Other options
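
One informal way to look for a systematic change in variance is to compare the spread of residuals at low and high fitted values. The sketch below is an illustration under that assumption, not a formal test and not a method taken from the slides; the function name and numbers are made up.

```python
def spread_ratio(fitted, residuals):
    """Mean squared residual in the upper half of fitted values over the lower half."""
    pairs = sorted(zip(fitted, residuals))  # order cases by fitted value
    half = len(pairs) // 2
    low = [e for _, e in pairs[:half]]
    high = [e for _, e in pairs[-half:]]

    def mean_square(errors):
        return sum(e * e for e in errors) / len(errors)

    return mean_square(high) / mean_square(low)


fitted = [1, 2, 3, 4, 5, 6, 7, 8]

# Made-up residuals whose spread grows with the fitted value.
fanning = [0.1, -0.1, 0.2, -0.2, 0.8, -0.9, 1.5, -1.6]
# Made-up residuals with roughly constant spread.
even = [0.5, -0.5, 0.4, -0.6, 0.5, -0.4, 0.6, -0.5]

ratio_fanning = spread_ratio(fitted, fanning)
ratio_even = spread_ratio(fitted, even)
print(round(ratio_fanning, 1), round(ratio_even, 2))
```

A ratio far from 1 suggests unequal variance; a plot of residuals against fitted values shows the same pattern visually.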
Outliers
• Undue influence of extreme values
• Can result in:
  • Incorrect estimates and inaccurate confidence intervals
• Possible solutions:
  • Identify and eliminate value(s), but …
  • Transform data
  • Other options
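
As a sketch of the identification step, the snippet below flags cases whose standardized residual is large. The cutoff of 2 is a common convention assumed here, not a value from the slides, and the residuals and function name are made up.

```python
from math import sqrt


def flag_outliers(residuals, n_predictors, cutoff=2.0):
    """Return indices of cases whose standardized residual exceeds the cutoff."""
    n = len(residuals)
    # Standard error of estimate: residual sum of squares over its degrees of freedom.
    std_error = sqrt(sum(e * e for e in residuals) / (n - n_predictors - 1))
    return [i for i, e in enumerate(residuals) if abs(e / std_error) > cutoff]


# Made-up residuals from a two-predictor model; the last case is extreme.
residuals = [0.2, -0.1, 0.3, -0.2, 0.1, -0.3, 0.0, 3.0]
flagged = flag_outliers(residuals, n_predictors=2)
print(flagged)
```

Flagging a case is only a prompt to inspect it; whether to drop it remains a judgment call.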
Autocorrelation
• Observations are not independent (typically, observations over time)
• Can result in:
  • Lower standard error of estimate
  • Lower standardized beta values
• Possible solutions:
  • Search for key “missing” variables
  • Cochrane-Orcutt Procedure
  • Other options
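
A simplified sketch of the core step in the Cochrane-Orcutt procedure: estimate the lag-1 autocorrelation (rho) of the residuals, then quasi-difference Y and each X before refitting. The full procedure repeats this until rho stabilizes, which is omitted here. The function names and numbers are made up for illustration.

```python
def estimate_rho(residuals):
    """Lag-1 coefficient from regressing e_t on e_(t-1) without an intercept."""
    numerator = sum(prev * cur for prev, cur in zip(residuals, residuals[1:]))
    denominator = sum(prev ** 2 for prev in residuals[:-1])
    return numerator / denominator


def quasi_difference(series, rho):
    """Transform z_t into z_t - rho * z_(t-1); the first observation is dropped."""
    return [cur - rho * prev for prev, cur in zip(series, series[1:])]


# Made-up residuals that drift slowly over time (positive autocorrelation).
residuals = [1.0, 0.8, 0.5, 0.1, -0.3, -0.6, -0.9, -0.6]
rho = estimate_rho(residuals)
print(round(rho, 3))

# Hypothetical series transformed with a fixed rho to show the arithmetic.
transformed = quasi_difference([10, 12, 14], rho=0.5)
print(transformed)
```

The model is then refit on the transformed Y and Xs, whose errors should be closer to independent.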
A Graphical Representation