PSYC 3030 Review Session Gigi Luk December 7, 2004
Overview • Matrix • Multiple Regression • Indicator variables • Polynomial Regression • Regression Diagnostics • Model Building
Matrix: Basic Operations • Addition and subtraction: possible only when the dimensions are the same • Multiplication: possible only when the inside dimensions match (e.g., 2×3 & 3×2) • Inverse: exists only when |A| ≠ 0, i.e., A is non-singular and all rows (columns) are linearly independent
Matrix: Inverse • Linearly independent rows (columns): |A| ≠ 0, so A⁻¹ exists • Linearly dependent rows (columns), e.g., one row a multiple of another: |A| = 0, so no inverse
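A minimal numpy sketch (not from the slides; the matrices are made-up examples) showing the dimension rule for multiplication and the determinant test for whether an inverse exists:

```python
import numpy as np

# Made-up example matrices
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])      # rows are linearly independent
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])      # second row = 2 * first row (dependent)

print(np.linalg.det(A))         # != 0, so A is non-singular
print(np.linalg.inv(A))         # inverse exists
print(np.linalg.det(B))         # 0, so B is singular (no inverse)

# Multiplication needs matching inside dimensions: (2x3)(3x2) -> 2x2
C = np.arange(6).reshape(2, 3) @ np.arange(6).reshape(3, 2)
print(C.shape)                  # (2, 2)
```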
Some notations • n = sample size • p = number of parameters • c = number of distinct X values (used in the lack-of-fit test; cf. LOF, p. 85) • g = number of family members in a Bonferroni procedure (cf. p. 92) • J = the n×n matrix of all 1s • I = the identity matrix • H = X(X'X)⁻¹X' (the hat matrix)
Matrix: estimates & residuals • LS estimates solve the normal equations X'Xb = X'y, so b = (X'X)⁻¹X'y • Fitted values: ŷ = Xb = Hy • Residuals: e = y − ŷ = y − Xb = [I − H]y
Matrix: Application in Regression • SSE = e'e = y'y − b'X'y, df = n − p, MSE = SSE/(n − p) • SSM = nȳ² = y'(J/n)y (correction for the mean), df = 1 • SSR = b'X'y − SSM, df = p − 1, MSR = SSR/(p − 1) • SST = y'y (uncorrected total), df = n • SSTO = y'(I − J/n)y = y'y − SSM, df = n − 1 (see the numpy sketch below)
Matrix: Variance-Covariance • var-cov(Y) = σ²{Y} = σ²I • var-cov(b) = σ²{b} = σ²(X'X)⁻¹ • Estimated: s²{b} = MSE·(X'X)⁻¹
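The three matrix slides above can be checked numerically. This is an illustrative numpy sketch with a made-up design matrix X and response y; it computes b = (X'X)⁻¹X'y, the hat matrix, the residuals, the sums of squares, and s²{b} = MSE·(X'X)⁻¹:

```python
import numpy as np

# Toy (made-up) data: column of 1s plus two predictors (n = 6, p = 3)
X = np.array([[1, 2.0, 5.0],
              [1, 3.0, 4.0],
              [1, 4.0, 6.0],
              [1, 5.0, 5.0],
              [1, 6.0, 7.0],
              [1, 7.0, 8.0]])
y = np.array([3.0, 4.0, 6.0, 5.0, 8.0, 9.0])
n, p = X.shape

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y            # LS estimates: b = (X'X)^-1 X'y
H = X @ XtX_inv @ X.T            # hat matrix
e = (np.eye(n) - H) @ y          # residuals e = (I - H)y

SSE  = e @ e                     # y'y - b'X'y
SSM  = n * y.mean()**2           # correction for the mean, df = 1
SSTO = y @ y - SSM               # y'(I - J/n)y, df = n - 1
SSR  = SSTO - SSE                # b'X'y - SSM, df = p - 1

MSE = SSE / (n - p)
s2_b = MSE * XtX_inv             # estimated var-cov matrix of b
print(b, SSE, SSR, SSTO, np.sqrt(np.diag(s2_b)))
```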
Multiple Regression • Model with two or more independent variables, e.g., yi = β0 + β1Xi1 + β2Xi2 + εi
MR: R-square • Coefficient of multiple determination: R² = SSR/SSTO, 0 ≤ R² ≤ 1 • Alternative: adjusted R² = 1 − [(n − 1)/(n − p)]·SSE/SSTO, which penalizes added predictors • Coefficient of partial determination, e.g., R²_{Y2|1} = SSR(X2|X1)/SSE(X1)
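A sketch (made-up data; the sse() helper is hypothetical, not from the course) of R², the adjusted-R² alternative, and the coefficient of partial determination R²_{Y2|1}:

```python
import numpy as np

def sse(X, y):
    """SSE from an OLS fit of y on the columns of X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(X1, y, rcond=None)
    e = y - X1 @ b
    return e @ e

# Made-up data
rng = np.random.default_rng(0)
x1 = rng.normal(size=30)
x2 = rng.normal(size=30)
y  = 1 + 2*x1 + 0.5*x2 + rng.normal(scale=0.3, size=30)

n, p = 30, 3
SSTO = np.sum((y - y.mean())**2)
SSE_full = sse(np.column_stack([x1, x2]), y)
R2      = 1 - SSE_full / SSTO                       # SSR/SSTO
R2_adj  = 1 - (n - 1) / (n - p) * SSE_full / SSTO   # adjusted R^2
R2_partial = (sse(x1, y) - SSE_full) / sse(x1, y)   # SSR(X2|X1)/SSE(X1)
print(R2, R2_adj, R2_partial)
```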
[Diagram: decomposition of SSTO into SSR(X1), SSR(X2), SSR(X1,X2), the extra sums of squares SSR(X1|X2) and SSR(X2|X1), and the corresponding SSE(X1), SSE(X2), SSE(X1,X2)]
MR: Hypothesis testing • Test for regression relation (the overall test): Ho: β1 = β2 = … = βp-1 = 0, Ha: not all βk = 0; F* = MSR/MSE; if F* ≤ F(1−α; p−1, n−p), conclude Ho • Test for a single βk: Ho: βk = 0, Ha: βk ≠ 0; t* = bk/s(bk); if |t*| ≤ t(1−α/2; n−p), conclude Ho; note (t*)² = F* = MSR(Xk | all others)/MSE
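An illustrative sketch of both tests, using scipy only for the critical values; the data, n, and α are made up:

```python
import numpy as np
from scipy import stats

# Made-up data
rng = np.random.default_rng(1)
n, alpha = 40, 0.05
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # p = 3
p = X.shape[1]
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
SSE, SSTO = e @ e, np.sum((y - y.mean())**2)
MSE, MSR = SSE / (n - p), (SSTO - SSE) / (p - 1)

# Overall test: Ho: beta1 = beta2 = 0
F_star = MSR / MSE
print(F_star, stats.f.ppf(1 - alpha, p - 1, n - p))   # conclude Ha if F* > critical value

# Test for a single beta_k (here k = 2): t* = b_k / s(b_k)
s_bk = np.sqrt(MSE * XtX_inv[2, 2])
t_star = b[2] / s_bk
print(t_star, stats.t.ppf(1 - alpha/2, n - p))        # note (t*)^2 equals the partial F*
```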
MR: Hypothesis Testing (cont’) • Test for LOF: Ho: E{Y} = βo + β1X1+β2X2+….+ βp-1Xp-1 Ha: E{Y} ≠ βo + β1X1+β2X2+….+ βp-1Xp-1 If F* ≤ F(1-α; c-p, n-p), conclude Ho. F* = (SSLF/c-p)/(SSPE/n-c) • Test whether some βk=0: Ho: βh = βh+1 =….. =βp-1 =0 If F* ≤ F(1-α; p-1, n-p), conclude Ho. F* = [MSR(xh…xp-1|x1…xh-1)]/MSE
MR: Extra SS (p. 141, CK) • Full model: y = β0 + β1X1 + β2X2, giving SSR(X1,X2) • Reduced model: y = β0 + β1X1, giving SSR(X1) • SSR(X2|X1) = SSR(X1,X2) − SSR(X1) = effect of X2 adjusted for X1 = SSE(X1) − SSE(X1,X2) • General linear test of Ho: β2 = 0 vs. Ha: β2 ≠ 0: F* = {[SSE(R) − SSE(F)]/(dfR − dfF)} / [SSE(F)/dfF] = [SSR(X2|X1)/1] / MSE(X1,X2)
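A sketch of the extra sum of squares and the general linear test, with made-up data and a hypothetical sse() helper:

```python
import numpy as np
from scipy import stats

# Made-up data
rng = np.random.default_rng(3)
n = 30
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1 + 2 * x1 + 0.8 * x2 + rng.normal(size=n)

def sse(X):
    X = np.column_stack([np.ones(n), X])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ b) ** 2), n - X.shape[1]

SSE_R, df_R = sse(x1)                          # reduced: x1 only
SSE_F, df_F = sse(np.column_stack([x1, x2]))   # full: x1 and x2
SSR_x2_given_x1 = SSE_R - SSE_F                # extra sum of squares

# General linear test of Ho: beta2 = 0
F_star = (SSR_x2_given_x1 / (df_R - df_F)) / (SSE_F / df_F)
print(F_star, stats.f.ppf(0.95, df_R - df_F, df_F))
```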
Indicator variables • Parallel-lines model: ŷ = b0 + b1X1 + b2X2, where X2 is an indicator (1 = girl, 0 = boy) • Boys’ line: ŷ = b0 + b1X1 (intercept b0); girls’ line: intercept b0 + b2; both lines share slope b1 [Figure: Y = expressive vocabulary plotted against X = receptive vocabulary, parallel lines for girls and boys]
• Interaction model: ŷ = b0 + b1X1 + b2X2 + b12X1X2 • If b12 ≠ 0, there is an interaction: boys and girls have different slopes in the relation of X and Y [Figure: Y = expressive vocabulary plotted against X = receptive vocabulary, non-parallel lines for boys and girls]
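The two indicator-variable models can be fit side by side. The sketch below is illustrative: the variable names (receptive, expressive, girl) mirror the slide’s example, but the data are simulated:

```python
import numpy as np

# Simulated (made-up) data in the spirit of the slide's example
rng = np.random.default_rng(4)
n = 40
receptive = rng.normal(50, 10, size=n)            # X1: receptive vocabulary
girl = np.arange(n) % 2                           # X2: indicator, 1 = girl, 0 = boy
expressive = 5 + 0.8 * receptive + 4 * girl + 0.2 * receptive * girl \
             + rng.normal(scale=2, size=n)        # Y: expressive vocabulary

# Parallel-lines model: separate intercepts, common slope
X_add = np.column_stack([np.ones(n), receptive, girl])
b_add, *_ = np.linalg.lstsq(X_add, expressive, rcond=None)
print("boys' intercept b0:", b_add[0], "girls' intercept b0+b2:", b_add[0] + b_add[2])

# Interaction model: different slopes (b1 for boys, b1+b12 for girls)
X_int = np.column_stack([np.ones(n), receptive, girl, receptive * girl])
b_int, *_ = np.linalg.lstsq(X_int, expressive, rcond=None)
print("boys' slope:", b_int[1], "girls' slope:", b_int[1] + b_int[3])
```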
Polynomial Regression • 2nd order: Yi = β0 + β1Xi + β11Xi² + εi • 3rd order: Yi = β0 + β1Xi + β11Xi² + β111Xi³ + εi • Two predictors with interaction: Yi = β0 + β1Xi1 + β2Xi2 + β11Xi1² + β22Xi2² + β12Xi1Xi2 + εi (linear, quadratic, and interaction terms) — see the sketch after the partial F-test slide
PR: Partial F-test (p. 303, 5th ed.) • Test whether a 1st-order model would be sufficient: Ho: β11 = β22 = β12 = 0, Ha: not all βs in Ho = 0; F* = [SSR(X1², X2², X1X2 | X1, X2)/3] / MSE • To obtain this SSR you need the sequential SS (see top of p. 304 in the text); this test is a modified test for extra SS
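A sketch of the partial F-test that the first-order model suffices, built from full vs. reduced SSE (made-up data; the predictors are centered, a common step to reduce collinearity among the polynomial terms):

```python
import numpy as np
from scipy import stats

# Made-up data
rng = np.random.default_rng(5)
n = 50
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1 + x1 + 0.5 * x2 + 0.3 * x1**2 + rng.normal(size=n)

x1c, x2c = x1 - x1.mean(), x2 - x2.mean()          # centered predictors

def sse(X):
    X = np.column_stack([np.ones(n), X])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ b) ** 2), n - X.shape[1]

SSE_R, df_R = sse(np.column_stack([x1c, x2c]))                           # first-order model
SSE_F, df_F = sse(np.column_stack([x1c, x2c, x1c**2, x2c**2, x1c*x2c]))  # second-order model

# Partial F for Ho: beta11 = beta22 = beta12 = 0
F_star = ((SSE_R - SSE_F) / (df_R - df_F)) / (SSE_F / df_F)
print(F_star, stats.f.ppf(0.95, df_R - df_F, df_F))
```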
Regression Diagnostics • Collinearity • Effects: (1) poor numerical accuracy, (2) poor precision of estimates • Danger signs: several large s(bk); determinant of X'X ≈ 0; the number of eigenvalues of X'X near 0 = the number of linear dependencies • Condition index: (λmax/λi)^(1/2) — 15-30 watch out, > 30 trouble, > 100 disaster (see the collinearity sketch below)
Regression Diagnostics • VIF (variance inflation factor): VIFk = 1/(1 − R²k), where R²k comes from regressing Xk on the other predictors; worry when VIF reaches about 10 • TOL (tolerance) = 1/VIFk = 1 − R²k
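A sketch of both diagnostics on made-up, deliberately collinear data: condition indices from the eigenvalues of the standardized X'X, and VIF/TOL for each predictor:

```python
import numpy as np

# Made-up data with x2 nearly collinear with x1
rng = np.random.default_rng(6)
n = 50
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# Condition indices from the eigenvalues of the standardized X'X
Z = (X - X.mean(0)) / X.std(0)
lam = np.linalg.eigvalsh(Z.T @ Z)
cond_index = np.sqrt(lam.max() / lam)               # (lambda_max / lambda_i)^(1/2)
print("condition indices:", np.sort(cond_index))    # > 30 signals trouble

# VIF_k = 1 / (1 - R^2_k), R^2_k from regressing X_k on the other predictors
for k in range(X.shape[1]):
    others = np.column_stack([np.ones(n), np.delete(X, k, axis=1)])
    b, *_ = np.linalg.lstsq(others, X[:, k], rcond=None)
    e = X[:, k] - others @ b
    R2_k = 1 - (e @ e) / np.sum((X[:, k] - X[:, k].mean()) ** 2)
    print(f"VIF_{k+1} = {1/(1-R2_k):.1f}, TOL = {1-R2_k:.3f}")
```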
Model Building • Goals: make R² large or MSE small; keep the cost of data collection and s(b) small • Selection criteria: R² (look at ∆R² as variables are added); MSE (can increase or decrease as variables are added, since SSE never increases but n − p falls)
Model Building (cont’) • Mallows’ Cp estimates (1/σ²)·Σ{ var(ŷi) [random error] + [E(ŷi) − μi]² [bias²] } • Cp = SSEp/MSEall − (n − 2p) = p + (m + 1 − p)(Fp − 1) • A good subset has Cp ≈ p • m: # of available predictors; Fp: incremental F for the predictors omitted
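A sketch of Cp over all predictor subsets (made-up data; m and n are illustrative), using the Cp = SSEp/MSEall − (n − 2p) form from the slide:

```python
import numpy as np
from itertools import combinations

# Made-up data with m available predictors
rng = np.random.default_rng(7)
n, m = 60, 4
X_all = rng.normal(size=(n, m))
y = 1 + 2*X_all[:, 0] + 1.5*X_all[:, 1] + rng.normal(size=n)

def sse(cols):
    X = np.column_stack([np.ones(n)] + [X_all[:, j] for j in cols])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ b) ** 2)

MSE_all = sse(range(m)) / (n - (m + 1))             # MSE from the model with all predictors

# Cp = SSE_p / MSE_all - (n - 2p); look for subsets with Cp close to p
for r in range(1, m + 1):
    for cols in combinations(range(m), r):
        p = r + 1                                   # parameters = predictors + intercept
        Cp = sse(cols) / MSE_all - (n - 2 * p)
        print(cols, "p =", p, "Cp =", round(Cp, 2))
```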
Model Building (cont’) • Variable Selection Procedure • Choose min MSE & Cp≈ p • SAS tools: • Forward • Backward • Stepwise • Guided selection: key vars, promising vars, haystack • Substantive knowledge of the area • Examination of each var: expected sign & magnitude coefficients