Multiple Regression and Model Building • Purposes of multiple regression • Applications • Model and OLS criterion • Inferences • Model building - variable selection • Data considerations • LINE assumptions
Purposes of multiple regression • Prediction - y hat • To predict values of Y, the response variable, for given levels of X, the vector of predictor variables • Estimation - beta hat • To estimate the effect of individual predictor variables on the response variable Y
Applications • Relating portfolio return to market return • Negotiating professional sports salaries • Examining implications of a nation’s education policy for infant mortality • Assessing the effect of training on employee performance • Pricing models for residential real estate • Compensation models for Title VII compliance
Model and OLS criterion • $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \varepsilon$ (page 528) • $\beta_j$ is the expected change in $y$ associated with a unit change in $x_j$, all other variables remaining unchanged • Minimize $\sum_i (Y_i - \hat{Y}_i)^2 = \sum_i (Y_i - X_i B)^2$ • where $X$ is the matrix of x values with an initial column of ones • and $B$ is the vector of OLS estimates of the beta vector • $B = (X'X)^{-1} X'Y$
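The OLS formula on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the textbook's procedure: the function name `ols_fit` and the small data set are hypothetical, and the data are generated without an error term so that the estimates recover the coefficients exactly.

```python
import numpy as np

def ols_fit(x, y):
    """Return the OLS estimates B = (X'X)^{-1} X'Y for predictors x and response y.

    A column of ones is prepended to x, so B[0] is the intercept estimate.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x.reshape(-1, 1)
    X = np.column_stack([np.ones(len(y)), x])
    # Solve the normal equations (X'X) B = X'Y instead of forming the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2 (no error term).
x_demo = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]], dtype=float)
y_demo = 1 + 2 * x_demo[:, 0] + 3 * x_demo[:, 1]
b_demo = ols_fit(x_demo, y_demo)
print(b_demo)
```

Solving the normal equations gives the same B as the matrix formula when X'X is invertible; with nearly collinear predictors, a least-squares routine such as `np.linalg.lstsq` is usually the more stable choice.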
Inferences • Hypothesis of model usefulness (utility), page 535 • Tests of hypotheses $H_0: \beta_j = 0$, page 539 • F drop tests of hypotheses $H_0: \beta_g = \beta_{g+1} = \dots = \beta_k = 0$, page 554 • Interval estimates of coefficients • Interval estimates of conditional means • Prediction intervals for individual values
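Partitioning the sum of squares • $\sum_i (y_i - \bar{y})^2 = \sum_i (\hat{y}_i - \bar{y})^2 + \sum_i (y_i - \hat{y}_i)^2$, i.e. TSS = SS(Model) + SSE • Complete model: TSS = SS(Model)c + SSEc • Reduced model: TSS = SS(Model)r + SSEr • SSEc ≤ SSEr, since adding predictors can never increase the error sum of squares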
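The partition can be checked numerically. The sketch below uses hypothetical data and a hypothetical helper `sums_of_squares`; it assumes every design matrix already contains the column of ones, which the identity TSS = SS(Model) + SSE requires.

```python
import numpy as np

def sums_of_squares(X, y):
    """Fit OLS and return (TSS, SS_model, SSE).

    X must already include the leading column of ones.
    """
    B, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ B
    y_bar = y.mean()
    tss = float(np.sum((y - y_bar) ** 2))
    ss_model = float(np.sum((y_hat - y_bar) ** 2))
    sse = float(np.sum((y - y_hat) ** 2))
    return tss, ss_model, sse

# Hypothetical data: the complete model uses x1 and x2, the reduced model only x1.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])
ones = np.ones_like(y)

X_complete = np.column_stack([ones, x1, x2])
X_reduced = np.column_stack([ones, x1])

tss_c, ssm_c, sse_c = sums_of_squares(X_complete, y)
tss_r, ssm_r, sse_r = sums_of_squares(X_reduced, y)
print(tss_c, ssm_c + sse_c)   # the two totals agree
print(sse_c, sse_r)           # complete-model SSE is no larger than reduced-model SSE
```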
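Testing nested models: the F drop test • Is SSEc significantly less than SSEr? • $H_0: \beta_g = \beta_{g+1} = \dots = \beta_k = 0$ • $H_A: \beta_j \neq 0$ for at least one $j$ in $g, \dots, k$ • Test statistic: • $F = \dfrac{(SSE_r - SSE_c)/(\text{number of variables dropped})}{SSE_c/(\text{error df in complete model})}$ • Under $H_0$, F has an F distribution with (number of variables dropped, error df in complete model) degrees of freedom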
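As a rough sketch of the test statistic, the function below computes F from the two error sums of squares. The names `sse` and `f_drop` and the data are hypothetical; the design matrices are assumed to include the column of ones, and the reduced model's columns are assumed to be a subset of the complete model's.

```python
import numpy as np

def sse(X, y):
    """Error sum of squares from an OLS fit of y on the columns of X."""
    B, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ B
    return float(resid @ resid)

def f_drop(X_complete, X_reduced, y):
    """Return (F, numerator df, denominator df) for the nested-model F drop test."""
    n, p_complete = X_complete.shape
    p_reduced = X_reduced.shape[1]
    dropped = p_complete - p_reduced      # number of variables dropped
    error_df = n - p_complete             # error df in the complete model
    sse_c = sse(X_complete, y)
    sse_r = sse(X_reduced, y)
    f = ((sse_r - sse_c) / dropped) / (sse_c / error_df)
    return f, dropped, error_df

# Hypothetical data: test whether x2 and x3 can be dropped from the complete model.
x1 = np.arange(1.0, 9.0)
x2 = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
x3 = x1 ** 2
y = np.array([2.0, 4.1, 5.9, 8.3, 9.8, 12.4, 13.7, 16.2])
ones = np.ones_like(y)

X_c = np.column_stack([ones, x1, x2, x3])
X_r = np.column_stack([ones, x1])
f_stat, dropped, error_df = f_drop(X_c, X_r, y)
print(f_stat, dropped, error_df)
```

The statistic is then compared with the upper tail of the F distribution on (`dropped`, `error_df`) degrees of freedom; if SciPy is installed, `scipy.stats.f.sf(f_stat, dropped, error_df)` returns the corresponding p-value.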
Model Building - Blocks • Observed variables • Higher order terms • Powers (e.g., squared or cubed variables) • Interactions (products of variables) • Qualitative variables introduced with indicator or dummy (0, 1) variables • Transformations (e.g., ln y)
Quantitative variables with higher order terms • Interactions: X1 * X2 • Age * Height • Height * Yrspro • Powers: (X1)^2 • Age squared, i.e., salary increases with age up to a point, beyond which it declines
Models with Indicator (Dummy) Variables • Parallel lines - one quantitative X, one dummy variable D • Nonparallel lines - X, D, and X*D • Equidistant parabolas - X, X^2, and D • Non-equidistant parabolas - X, X^2, D, and X*D
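The four model forms differ only in which columns enter the design matrix. The sketch below builds those columns with a hypothetical helper `design_matrix` and fits the nonparallel-lines model to made-up, error-free data, so the fitted coefficients can be read directly as the intercept shift (coefficient on D) and the slope shift (coefficient on X*D).

```python
import numpy as np

def design_matrix(x, d, squared=False, interaction=False):
    """Build columns [1, X, (X^2), D, (X*D)] for one quantitative X and one dummy D."""
    cols = [np.ones_like(x), x]
    if squared:
        cols.append(x ** 2)
    cols.append(d)
    if interaction:
        cols.append(x * d)
    return np.column_stack(cols)

# Hypothetical data: group D=0 follows y = 1 + 2x, group D=1 follows y = 4 + 3x.
x = np.array([0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0])
d = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
y = 1 + 2 * x + 3 * d + 1 * x * d

X_parallel = design_matrix(x, d)                        # X, D
X_nonparallel = design_matrix(x, d, interaction=True)   # X, D, X*D
X_parabolas = design_matrix(x, d, squared=True)         # X, X^2, D

# Coefficients are ordered [intercept, X, D, X*D] for the nonparallel-lines model.
b_nonparallel, *_ = np.linalg.lstsq(X_nonparallel, y, rcond=None)
print(b_nonparallel)
```

Here the D=1 line has intercept 1 + 3 = 4 and slope 2 + 1 = 3, which is why the interaction term is what lets the two lines be nonparallel.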
Multiplicative models and transformations • $Y = E(Y) \cdot \delta$ • $\ln Y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \varepsilon$, where $\varepsilon = \ln \delta$
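A multiplicative model of this kind can be fit by regressing ln Y on the predictors with ordinary least squares. The sketch below is illustrative only: `fit_log_linear` is a hypothetical name, and the data are generated with the multiplicative error fixed at 1 so that the fit recovers the coefficients exactly.

```python
import numpy as np

def fit_log_linear(x, y):
    """Regress ln(y) on x (with intercept) and return the coefficient estimates.

    Requires y > 0, since the log transformation is undefined otherwise.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    return b

# Hypothetical data from y = exp(0.5 + 0.2*x), with the multiplicative error set to 1.
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_demo = np.exp(0.5 + 0.2 * x_demo)

b_log = fit_log_linear(x_demo, y_demo)
y_back = np.exp(b_log[0] + b_log[1] * x_demo)  # back-transform to the original scale
print(b_log)
```

On the log scale a unit change in x adds the slope to ln Y, so on the original scale it multiplies Y by the exponential of the slope.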