
PSYC 3030 Review Session




  1. PSYC 3030 Review Session Gigi Luk December 7, 2004

  2. Overview • Matrix • Multiple Regression • Indicator variables • Polynomial Regression • Regression Diagnostics • Model Building

  3. Matrix: Basic Operation
     • Addition, Subtraction: possible only when dimensions are the same
     • Multiplication: possible only when inside dimensions are the same (e.g., 2x3 & 3x2)
     • Inverse: exists when |A| ≠ 0, i.e., A is non-singular and all rows (columns) are linearly independent
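The dimension rules for these operations can be checked with a short NumPy sketch. The matrices `A` and `B` are made-up examples, not taken from the slides.

```python
import numpy as np

# Hypothetical matrices chosen only to illustrate the dimension rules.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])      # 2x3
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])           # 3x2

S = A + A                  # addition: both operands are 2x3
P = A @ B                  # multiplication: inside dimensions match (2x3 times 3x2 gives 2x2)

det_P = np.linalg.det(P)   # a nonzero determinant means P is non-singular
P_inv = np.linalg.inv(P)   # so the inverse exists

print(P)
print(det_P)
```

Trying `A + B` or `A @ A` here raises an error, because the dimensions do not conform.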

  4. Matrix: Inverse Linearly independent: Linearly Dependent:
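The slide's own example matrices are not preserved in this transcript. As a stand-in, the sketch below uses two hypothetical 2x2 matrices to show how linear dependence shows up in the rank and the determinant.

```python
import numpy as np

# Hypothetical examples; the matrices on the original slide are not preserved.
independent = np.array([[1.0, 2.0],
                        [3.0, 4.0]])
dependent = np.array([[1.0, 2.0],
                      [2.0, 4.0]])   # second row = 2 * first row

rank_indep = np.linalg.matrix_rank(independent)   # full rank: inverse exists
rank_dep = np.linalg.matrix_rank(dependent)       # rank deficient: no inverse

det_indep = np.linalg.det(independent)
det_dep = np.linalg.det(dependent)                # zero up to rounding

print(rank_indep, det_indep)
print(rank_dep, det_dep)
```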

  5. Some notations
     • n = sample size
     • p = number of parameters
     • c = number of distinct values of X (cf. LOF, p. 85)
     • g = number of family members in a Bonferroni test (cf. p. 92)
     • J = n×n matrix of ones
     • I = n×n identity matrix
     • H = x(x’x)⁻¹x’ (hat matrix)

  6. Matrix: estimates & residuals
     • LS estimates: normal equations x’y = (x’x)b, so b = (x’x)⁻¹x’y
     • Residuals: e = y – ŷ = y – xb = [I – H]y
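A minimal NumPy sketch of the least-squares estimates and residuals in matrix form. The five data points are invented for illustration, and the variable names are my own.

```python
import numpy as np

# Hypothetical data: one predictor plus an intercept column.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])

XtX = X.T @ X
Xty = X.T @ y
b = np.linalg.solve(XtX, Xty)            # solves the normal equations (x'x)b = x'y

H = X @ np.linalg.inv(XtX) @ X.T         # hat matrix H = x(x'x)^-1 x'
e = y - X @ b                            # residuals e = y - xb
e_via_H = (np.eye(len(y)) - H) @ y       # the same residuals as [I - H]y

print(b)
print(e)
```

Using `np.linalg.solve` on the normal equations avoids forming the inverse just to get `b`; the explicit inverse is only used here to build `H`.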

  7. Matrix: Application in Regression (sum of squares; df; MS)
     • SSE = e’e = y’y – b’x’y; df = n–p; MSE = SSE/(n–p)
     • SSM = y’(J/n)y; df = 1
     • SSR = b’x’y – SSM; df = p–1; MSR = SSR/(p–1)
     • SST = y’y; df = n
     • SSTO = y’(I – J/n)y = y’y – SSM; df = n–1
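The sums of squares on this slide can be computed directly from their matrix forms. The sketch below uses a small invented data set; `J` is the n×n matrix of ones and SSM is taken as y’(J/n)y, which follows from SSTO = y’y – SSM.

```python
import numpy as np

# Hypothetical data: one predictor plus an intercept column.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

b = np.linalg.solve(X.T @ X, X.T @ y)
J = np.ones((n, n))
I = np.eye(n)

SSE = y @ y - b @ X.T @ y         # e'e
SSM = y @ J @ y / n               # correction for the mean
SSR = b @ X.T @ y - SSM
SSTO = y @ (I - J / n) @ y        # equals y'y - SSM

MSE = SSE / (n - p)
MSR = SSR / (p - 1)

print(SSE, SSM, SSR, SSTO, MSE, MSR)
```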

  8. Matrix: Variance-Covariance
     • Var-cov(Y) = σ²(Y) = σ²I
     • Var-cov(b) = σ²(b) = σ²(x’x)⁻¹
     • Estimated var-cov(b) = s²(b) = MSE (x’x)⁻¹

  9. Matrix: Variance-Covariance
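A short sketch of the estimated variance-covariance matrix of b, s²(b) = MSE (x’x)⁻¹, on invented data. The square roots of its diagonal are the standard errors s(bk).

```python
import numpy as np

# Hypothetical data: one predictor plus an intercept column.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
MSE = (e @ e) / (n - p)

s2_b = MSE * XtX_inv               # estimated var-cov matrix of b
se_b = np.sqrt(np.diag(s2_b))      # standard errors s(b0), s(b1)

print(s2_b)
print(se_b)
```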

  10. Multiple Regression • Model with 2 or more independent variables, e.g. with two: y = β0 + β1X1 + β2X2 + ε

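  11. MR: R-square
     • Coefficient of multiple determination: R² = SSR/SSTO, with 0 ≤ R² ≤ 1
     • Alternative form: (formula not preserved in this transcript)
     • Coefficients of partial determination: (formulas not preserved in this transcript)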

  12. Sums of squares shown on this slide (diagram not preserved): SSTO; SSR(X1), SSR(X2), SSR(X1,X2); SSR(X1|X2), SSR(X2|X1); SSE(X1), SSE(X2), SSE(X1,X2)
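For reference, the usual textbook definitions that tie these sums of squares to R² and to the coefficients of partial determination are written out below. They are the standard forms for a two-predictor model and may not match exactly what the original slides displayed.

```latex
\[
R^2 = \frac{SSR(X_1, X_2)}{SSTO} = 1 - \frac{SSE(X_1, X_2)}{SSTO}
\]
\[
R^2_{Y1|2} = \frac{SSR(X_1 \mid X_2)}{SSE(X_2)},
\qquad
R^2_{Y2|1} = \frac{SSR(X_2 \mid X_1)}{SSE(X_1)}
\]
```

Each partial coefficient is the share of the variation left unexplained by one predictor that the other predictor then accounts for.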

  13. MR: Hypothesis testing
     • Test for regression relation (the overall test):
       Ho: β1 = β2 = … = βp-1 = 0; Ha: not all βs = 0
       F* = MSR/MSE; if F* ≤ F(1-α; p-1, n-p), conclude Ho.
     • Test for βk:
       Ho: βk = 0; Ha: βk ≠ 0
       t* = bk/s(bk); if |t*| ≤ t(1-α/2; n-p), conclude Ho.
       Equivalent F test: (t*)² = F* = MSR(xk|all others)/MSE
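The two test statistics can be sketched in NumPy as follows. The data and the helper `fit_sse` are invented for illustration; critical values are not computed here and would come from F and t tables (or a statistics library).

```python
import numpy as np

def fit_sse(X, y):
    """Return LS coefficients and SSE for design matrix X (hypothetical helper)."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    return b, float(e @ e)

# Hypothetical data with two predictors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 4.0])
y = np.array([3.0, 4.0, 8.0, 8.0, 13.0, 11.0])
n = len(y)
X = np.column_stack([np.ones(n), x1, x2])
p = X.shape[1]

b, sse = fit_sse(X, y)
ssto = float(np.sum((y - y.mean()) ** 2))
mse = sse / (n - p)
msr = (ssto - sse) / (p - 1)

F_overall = msr / mse                          # overall test: F* = MSR/MSE

s2_b = mse * np.linalg.inv(X.T @ X)
t_star = b / np.sqrt(np.diag(s2_b))            # t* = bk / s(bk)

# Partial F for beta2: extra SS from adding x2 to a model that already has x1.
_, sse_reduced = fit_sse(X[:, :2], y)
F_partial_x2 = (sse_reduced - sse) / 1 / mse   # equals t_star[2] ** 2

print(F_overall, t_star, F_partial_x2)
```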

  14. MR: Hypothesis Testing (cont’)
     • Test for LOF:
       Ho: E{Y} = βo + β1X1 + β2X2 + … + βp-1Xp-1; Ha: E{Y} ≠ βo + β1X1 + β2X2 + … + βp-1Xp-1
       F* = [SSLF/(c-p)]/[SSPE/(n-c)]; if F* ≤ F(1-α; c-p, n-c), conclude Ho.
     • Test whether some βk = 0:
       Ho: βh = βh+1 = … = βp-1 = 0; Ha: not all βk in Ho = 0
       F* = MSR(xh…xp-1|x1…xh-1)/MSE; if F* ≤ F(1-α; p-h, n-p), conclude Ho.
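A minimal sketch of the lack-of-fit computation, assuming a single predictor with replicated X values. The data are invented; SSPE is the pooled within-level variation, and SSLF = SSE – SSPE.

```python
import numpy as np

# Hypothetical data with two replicates at each of c = 3 distinct X levels.
x = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0])
y = np.array([1.0, 2.0, 2.0, 4.0, 6.0, 5.0])
n = len(y)
X = np.column_stack([np.ones(n), x])
p = X.shape[1]

b = np.linalg.solve(X.T @ X, X.T @ y)
sse = float(np.sum((y - X @ b) ** 2))

levels = np.unique(x)
c = len(levels)
# Pure error: squared deviations of y around the mean of its own X level.
sspe = float(sum(np.sum((y[x == v] - y[x == v].mean()) ** 2) for v in levels))
sslf = sse - sspe                                # lack-of-fit sum of squares

F_lof = (sslf / (c - p)) / (sspe / (n - c))      # compare with F(1-alpha; c-p, n-c)

print(sspe, sslf, F_lof)
```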

  15. MR: Extra SS (p. 141, CK)
     • Full: y = βo + β1X1 + β2X2 → SSR(x1,x2)
     • Reduced: y = βo + β1X1 → SSR(x1)
     • SSR(x2|x1) = SSR(x1,x2) – SSR(x1) = SSE(x1) – SSE(x1,x2) = effect of X2 adjusted for X1
     • General Linear Test: Ho: β2 = 0; Ha: β2 ≠ 0
       F* = [SSR(x2|x1)/1]/[SSE(x1,x2)/(n-3)]
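The extra sum of squares can be checked numerically by fitting the full and reduced models. This is a sketch on invented data; `ssr_sse` is a hypothetical helper, and the F* line uses the standard general linear test form with 1 and n–3 degrees of freedom.

```python
import numpy as np

def ssr_sse(X, y):
    """Return (SSR, SSE) for design matrix X with an intercept column (hypothetical helper)."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    sse = float(np.sum((y - X @ b) ** 2))
    ssto = float(np.sum((y - y.mean()) ** 2))
    return ssto - sse, sse

# Hypothetical data with two predictors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 4.0])
y = np.array([3.0, 4.0, 8.0, 8.0, 13.0, 11.0])
n = len(y)
ones = np.ones(n)

ssr_full, sse_full = ssr_sse(np.column_stack([ones, x1, x2]), y)   # full model
ssr_red, sse_red = ssr_sse(np.column_stack([ones, x1]), y)         # reduced model

extra_ss = ssr_full - ssr_red          # SSR(x2|x1)
extra_ss_alt = sse_red - sse_full      # same quantity via the SSEs

F_star = (extra_ss / 1) / (sse_full / (n - 3))

print(extra_ss, extra_ss_alt, F_star)
```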

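  16. Indicator variables
     • Figure (not preserved): Y = expressive vocabulary plotted against X = receptive vocabulary, with one line for girls and one for boys
     • The two lines are y-hat = bo + b1X1 + b2X2 and y-hat = bo + b1X1
     • Both lines have slope = b1; the intercepts are bo and bo + b2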

  17. y-hat = bo + b1X1 + b2X2 + b12X1X2
     • Figure (not preserved): Y = expressive vocabulary plotted against X = receptive vocabulary, with separate lines for boys and girls
     • If b12 ≠ 0, then there is an interaction: boys and girls have different slopes in the relation of X and Y.
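A small sketch of how the indicator and interaction terms separate the two groups' lines. The data are constructed so the answer is known exactly, and the coding (0 = boys, 1 = girls) is an assumption, since the slides do not state which group is coded 1.

```python
import numpy as np

# Hypothetical, noise-free data. Assumed coding: x2 = 0 for boys, 1 for girls.
x1 = np.array([1.0, 2.0, 3.0, 1.0, 2.0, 3.0])
x2 = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
y = np.where(x2 == 1, 1 + 2 * x1, x1)      # boys: y = x1; girls: y = 1 + 2*x1

# Columns: intercept, X1, X2 (indicator), X1*X2 (interaction).
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
b = np.linalg.solve(X.T @ X, X.T @ y)      # [bo, b1, b2, b12]

slope_boys = b[1]                # slope when x2 = 0
slope_girls = b[1] + b[3]        # slope when x2 = 1
intercept_girls = b[0] + b[2]

print(b, slope_boys, slope_girls)
```

With `b12 = 0` the two slopes coincide and only the intercepts differ, which is the parallel-lines picture on the previous slide.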

  18. Polynomial Regression
     • 2nd Order: Y = βo + β1X + β2X² + ε
     • 3rd Order: Y = βo + β1X + β2X² + β3X³ + ε
     • Interaction: Y = βo + β1X1 + β2X2 + β11X1² + β22X2² + β12X1X2 + ε
       (linear terms: β1X1 + β2X2; quadratic terms: β11X1² + β22X2²; interaction term: β12X1X2)
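Polynomial regression is still linear in the coefficients, so it can be fitted with the same matrix machinery once the powers of X are added as columns. A minimal sketch on invented, noise-free data:

```python
import numpy as np

# Hypothetical data generated from a known 2nd-order curve (no noise).
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = 1 + 2 * x + 3 * x ** 2

# Design matrix for the 2nd-order model: columns 1, X, X^2.
X = np.column_stack([np.ones_like(x), x, x ** 2])
b = np.linalg.solve(X.T @ X, X.T @ y)

print(b)
```

Here the x values are already centered at zero; with raw, uncentered predictors the X and X² columns tend to be highly correlated, which is one reason centering is commonly recommended.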

  19. PR: Partial F-test (p. 303, 5th ed.)
     • Test whether a 1st order model would be sufficient:
       Ho: β11 = β22 = β12 = 0; Ha: not all βs in Ho = 0
       F* = [SSR(X1², X2², X1X2 | X1, X2)/3]/MSE
     • To obtain this SSR, you need sequential SS (see top of p. 304 in text). This test is a modified test for extra SS.

  20. Regression Diagnostics
     • Collinearity:
       • Effects: (1) poor numerical accuracy (2) poor precision of estimates
       • Danger sign: several large s(bk)
       • Determinant of x’x ≈ 0
       • Eigenvalues of c: the number of eigenvalues ≈ 0 = # of linear dependencies
       • Condition #: (λmax/λi)^(1/2)
         • 15-30: watch out
         • > 30: trouble
         • > 100: disaster
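The condition number rule can be illustrated with a deliberately near-collinear pair of predictors. This sketch assumes the eigenvalues are taken from the correlation matrix of the predictors, which is one common convention; the slides do not say which matrix is meant.

```python
import numpy as np

# Hypothetical predictors: x2 is x1 plus a tiny perturbation, so they are nearly collinear.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = x1 + 0.01 * np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])

# Assumption: eigenvalues of the predictor correlation matrix.
R = np.corrcoef(np.column_stack([x1, x2]), rowvar=False)
eigvals = np.linalg.eigvalsh(R)                  # ascending order, symmetric matrix

cond_index = np.sqrt(eigvals.max() / eigvals)    # (lambda_max / lambda_i)^(1/2)

print(eigvals)
print(cond_index)
```

One eigenvalue is close to zero, signalling one near linear dependency, and the largest condition index lands well past the "disaster" threshold of 100.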

  21. Regression Diagnostics
     • VIF (Variance Inflation Factor): VIFi = 1/(1 – R²i), where R²i comes from regressing Xi on the other predictors
     • When to worry? When VIF ≈ 10 or larger
     • TOL (Tolerance): TOLi = 1/VIFi
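A sketch of the VIF computation, regressing each predictor on the others to get R²i. The data and the function name `vif` are invented for illustration.

```python
import numpy as np

def vif(Xp):
    """VIF for each column of Xp (predictors only, no intercept column)."""
    n, k = Xp.shape
    out = []
    for i in range(k):
        target = Xp[:, i]
        others = np.delete(Xp, i, axis=1)
        Z = np.column_stack([np.ones(n), others])      # regress X_i on the other predictors
        coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
        resid = target - Z @ coef
        r2 = 1 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Hypothetical predictors.
Xp = np.column_stack([
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 1.0, 4.0, 3.0, 6.0, 4.0],
    [1.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])

vifs = vif(Xp)
tol = 1.0 / vifs       # tolerance = 1 - R^2_i

print(vifs)
print(tol)
```

The same values appear on the diagonal of the inverse of the predictor correlation matrix, which gives a quick cross-check.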

  22. Model Building
     • Goals:
       • Make R² large or MSE small
       • Keep cost of data collection, s(b) small
     • Selection Criteria:
       • R²: look at ∆R²
       • MSE: can ↑ or ↓ as variables are added

  23. Model Building (cont’)
     • Cp ≈ p
     • Cp = est. of (1/σ²) Σ{var(yhat) + [yhattrue – yhatp]²}, where var(yhat) is the random error part and [yhattrue – yhatp]² is the bias part
       = SSEp/MSEall – (n – 2p)
       = p + (m+1–p)(Fp – 1)
     • m: # available predictors
     • Fp: incremental F for predictors omitted
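The two computational forms of Cp can be checked against each other on a small example. This is a sketch on invented data with m = 2 available predictors and a candidate subset that keeps only the first one; `sse_of` is a hypothetical helper.

```python
import numpy as np

def sse_of(X, y):
    """SSE of the LS fit of y on design matrix X (hypothetical helper)."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    return float(np.sum((y - X @ b) ** 2))

# Hypothetical data with m = 2 available predictors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 4.0])
y = np.array([3.0, 4.0, 8.0, 8.0, 13.0, 11.0])
n, m = len(y), 2
ones = np.ones(n)

sse_all = sse_of(np.column_stack([ones, x1, x2]), y)
mse_all = sse_all / (n - (m + 1))

# Candidate subset: intercept + x1, so p = 2 parameters.
p = 2
sse_p = sse_of(np.column_stack([ones, x1]), y)

Cp = sse_p / mse_all - (n - 2 * p)

# Incremental F for the omitted predictor(s), then the equivalent form of Cp.
Fp = ((sse_p - sse_all) / (m + 1 - p)) / mse_all
Cp_alt = p + (m + 1 - p) * (Fp - 1)

# For the full model, Cp equals its own number of parameters by construction.
Cp_full = sse_all / mse_all - (n - 2 * (m + 1))

print(Cp, Cp_alt, Cp_full)
```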

  24. Model Building (cont’)
     • Variable Selection Procedure
       • Choose min MSE & Cp ≈ p
       • SAS tools: Forward, Backward, Stepwise
       • Guided selection: key vars, promising vars, haystack
     • Substantive knowledge of the area
     • Examination of each var: expected sign & magnitude of coefficients
