
General Structural Equation (LISREL) Models Week 4 #1


dudley

Presentation Transcript


  1. General Structural Equation (LISREL) Models, Week 4 #1
     • Non-normal data: summary of approaches
     • Missing data approaches: summary, review and computer examples
     • Longitudinal data analysis: lagged dependent variables in LISREL models

  2. Major approaches:
     • Transform data to normality before using in SEM software
       • Can be done with any stats package
       • Common transformations: log, sqrt, square
     • ADF (also called WLS [in LISREL], AGLS [in EQS]) estimation
       • Requires construction of asymptotic covariance matrix
       • Requires large Ns
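
To illustrate the first approach, here is a minimal Python sketch (not part of the original slides) that compares the skewness of a right-skewed variable before and after the sqrt and log transformations. The variable `income` and its lognormal distribution are simulated assumptions. The square transformation listed above would be the analogous choice for left-skewed data.

```python
import numpy as np

def skewness(x):
    """Sample skewness: the mean of the cubed standardized scores."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

# Hypothetical right-skewed variable (simulated, not from the course data).
rng = np.random.default_rng(0)
income = rng.lognormal(mean=10.0, sigma=0.8, size=5000)

skew_raw = skewness(income)            # strongly right-skewed
skew_sqrt = skewness(np.sqrt(income))  # milder skew
skew_log = skewness(np.log(income))    # close to symmetric

print(f"raw: {skew_raw:.2f}  sqrt: {skew_sqrt:.2f}  log: {skew_log:.2f}")
```

The transformed variable would then be saved and passed to the SEM software in place of the original.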

  3. Major approaches to non-normal data
     • Transform data to normality before using in SEM software
     • ADF (also called WLS [in LISREL], AGLS [in EQS]) estimation
     • Scaled test statistics (Satorra-Bentler)
       • also referred to as “robust test statistics”
     • Bootstrapping
     • New approaches (Muthén)
     • Polychoric correlations (PM matrix)
       • Require asymptotic covariance matrix
       • Not suitable for small Ns

  4. Scaled test statistics • Generate an asymptotic covariance matrix in PRELIS as well as the usual covariance matrix

  5. Scaled Test Statistics
     Added statistics provided when asymptotic covariance matrix specified in LISREL program
     Part 2A: ML estimation but scaled chi-square statistic
     DA NI=14 NO=1456
     CM FI=e:\classes\icpsr2004\Week3Examples\nonnormaldata\relmor1.cov
     AC FI=e:\classes\icpsr2004\Week3Examples\nonnormaldata\relmor1.acc
     …PROGRAM MATRIX SPECIFICATION LINES
     ou me=mll sc nd=3 mi
     Degrees of Freedom = 67
     Minimum Fit Function Chi-Square = 407.134 (P = 0.0)
     Normal Theory Weighted Least Squares Chi-Square = 409.627 (P = 0.0)
     Satorra-Bentler Scaled Chi-Square = 319.088 (P = 0.0)
     Chi-Square Corrected for Non-Normality = 342.559 (P = 0.0)

  6. Scaled Test Statistics
     Added statistics provided when asymptotic covariance matrix specified in LISREL program
     Caution: the LISREL manual suggests the standard errors are “robust” SEs, but in version 8.54 they are identical to the regular ML standard errors. Use nested chi-square LR tests if needed.
     Degrees of Freedom = 67
     Minimum Fit Function Chi-Square = 407.134 (P = 0.0)
     Normal Theory Weighted Least Squares Chi-Square = 409.627 (P = 0.0)
     Satorra-Bentler Scaled Chi-Square = 319.088 (P = 0.0)
     Chi-Square Corrected for Non-Normality = 342.559 (P = 0.0)

  7. Categorical Variable Model Jöreskog: with ordinal variables, there are “no units of measurement ... variances and covariances have no meaning ... the only information we have is counts of cases in each cell of a multiway contingency table.”

  8. Categorical Variable Model

  9. Categorical Variable Model

  10. Categorical Variable Model

  11. Categorical Variable Model

  12. Categorical Variable Model

  13. Categorical Variable Model
      Bivariate normality: not testable in a 2x2 table
      Issue: zero cells (skipped)
      Too many zero cells: imprecise estimates
      Only one non-zero cell in a row or column: estimation breaks down (for tetrachoric correlations, PRELIS replaces 0 with 0.5; this will affect the estimates)

  14. Categorical Variable Model
      Polychoric correlation: very robust to violations of underlying bivariate normality (doctoral dissertation, Ana Quiroga, 1992, Uppsala)
      LR chi-square: very sensitive
      RMSEA measure: no serious effects unless RMSEA > 0.1 (PRELIS will issue a warning)

  15. Categorical Variable Model
      What if underlying bivariate normality does not hold, even approximately?
      - reduce the number of categories
      - eliminate offending variables
      - assess whether it holds conditional on covariates

  16. Bivariate data patterns not fitting the model

  17. Insert if time permits: brief overview of LISREL CVM approach • Subdirectory Week4Examples\OrdinalData

  18. Bootstrapping
      Hasn’t caught on as much as one might have thought.
      Sample with replacement, repeat B times, get a set of values for the parameters and observe the distribution across “draws”.
      Typically, bootstrap N = sample N (some literature suggests m < n might be preferred, but n is standard).
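
The resampling idea on this slide can be sketched in a few lines of Python (not part of the original slides). The statistic here is a simple correlation rather than an SEM parameter, and the data are simulated. Both are assumptions made to keep the example short, and in practice the model would be re-estimated on every bootstrap sample.

```python
import numpy as np

def bootstrap(data, statistic, n_boot=500, seed=0):
    """Draw n_boot samples of size n (with replacement) and collect the statistic."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # bootstrap N = sample N
        draws[b] = statistic(data[idx])
    return draws

def corr(d):
    """Pearson correlation between the first two columns."""
    return float(np.corrcoef(d[:, 0], d[:, 1])[0, 1])

# Simulated two-variable data set (hypothetical).
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = 0.5 * x + rng.normal(size=300)
data = np.column_stack([x, y])

draws = bootstrap(data, corr, n_boot=500)
boot_se = draws.std(ddof=1)                   # bootstrap standard error
lo, hi = np.percentile(draws, [2.5, 97.5])    # percentile interval
print(f"estimate={corr(data):.3f}  se={boot_se:.3f}  95% CI=({lo:.3f}, {hi:.3f})")
```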

  19. Bootstrapping
      Notes on technique: Yung and Bentler in Marcoulides and Schumacker, Advanced SEM (text supp.) + article in Br. J. Math. & Stat. Psych. 47: 63-84, 1994.
      Important development: see Bollen and Stine in Bollen and Long (eds.), Testing Structural Equation Models.

  20. Bootstrapping in AMOS
      • Under analysis options, Bootstrapping tab
      0 bootstrap samples were unused because of a singular covariance matrix.
      0 bootstrap samples were unused because a solution was not found.
      500 usable bootstrap samples were obtained.

  21. Bootstrapping in AMOS

  22. Bootstrapping in AMOS

  23. Missing Data
      • The major approaches we discussed last class:
        • EM algorithm to “replace” case values and estimate Σ, μ
        • Nearest neighbor imputation
        • FIML

  24. The “mechanics” of working with missing data in PRELIS/LISREL
      Nearest Neighbor: in PRELIS syntax:
      IM (V356 SEX ) (V147 V176 V355) VR=.5 XN or XL

  25. The “mechanics” of working with missing data in PRELIS/LISREL The “matching variables” should have relatively few missing cases (for a given case, imputation will fail if any of the matching variables is missing). Matching variables may include variables in the “imputed variables” list (though if any of these variables has a large number of missing cases, this would not be a good idea).
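
The matching logic described here can be sketched in Python. This is a simplified illustration, not PRELIS's exact algorithm: the function name `impute_by_matching`, the squared-distance matching on standardized variables, and the way the variance ratio is computed are all assumptions. It does reproduce the three outcomes that appear in a PRELIS imputation listing: imputed from the matching cases, not imputed because the variance ratio is too large, and not imputed because a matching variable is missing.

```python
import numpy as np

def impute_by_matching(y, X, vr=0.5):
    """Impute missing y values from the nearest complete cases on X.

    y  : 1-D array, np.nan marks a missing value (the "imputed variable")
    X  : 2-D array of "matching variables"
    vr : largest acceptable ratio of donor variance to total variance of y
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    out = y.copy()
    x_complete = ~np.isnan(X).any(axis=1)
    donors = np.flatnonzero(x_complete & ~np.isnan(y))
    # Standardize the matching variables using the donor cases.
    Z = (X - X[donors].mean(axis=0)) / X[donors].std(axis=0, ddof=1)
    var_y = y[donors].var(ddof=1)
    for a in np.flatnonzero(np.isnan(y)):
        if not x_complete[a]:
            continue  # imputation fails: a matching variable is missing
        dist = ((Z[donors] - Z[a]) ** 2).sum(axis=1)
        nearest = donors[np.isclose(dist, dist.min())]
        if len(nearest) == 1:
            out[a] = y[nearest[0]]            # single match: copy its value
        elif y[nearest].var(ddof=1) / var_y < vr:
            out[a] = y[nearest].mean()        # several matches that agree closely
        # otherwise the matches disagree too much: leave the value missing
    return out

# Hypothetical data: two matching variables, one variable to impute.
X = np.array([[1, 1], [1, 1], [2, 2], [5, 5], [5, 5],
              [1, 1], [2, 2], [5, 5], [np.nan, 2]], dtype=float)
y = np.array([2, 2, 4, 7, 1, np.nan, np.nan, np.nan, np.nan])
result = impute_by_matching(y, X, vr=0.5)
print(result)
```

In this toy data set the sixth and seventh cases are imputed, the eighth is rejected because its two matches (7 and 1) disagree, and the ninth is skipped because a matching variable is missing.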

  26. PRELIS imputation Can save results of imputation in raw data file

  27. Imputation It is even possible to then re-run PRELIS and do other imputations (the saved raw data file would need to be read back into PRELIS). Although not advised, a variable that has been imputed can now be used as a “matching variable”. It is also possible to make another attempt at imputation for the same variable using different “matching variables”.

  28. Sample listing (IM)
      SAMPLE listing:
      Case 13 imputed with value 7 (Variance Ratio = 0.000), NM= 1
      Case 14 not imputed because of Variance Ratio = 0.939 (NM= 2)
      Case 21 not imputed because of missing values for matching variables

      Number of Missing Values per Variable After Imputation
            V9     V147     V151     V175     V176     V304     V305     V307
      -------- -------- -------- -------- -------- -------- -------- --------
            16       13       54       38        9       21       35       56
          V308     V309     V310     V355     V356      SEX     OCC1     OCC2
      -------- -------- -------- -------- -------- -------- -------- --------
            32       37       36       29       62       13        0        0
          OCC3     OCC4     OCC5
      -------- -------- --------
             0        0        0

      Distribution of Missing Values
      Total Sample Size = 1839
      Number of Missing Values     0    1    2    3    4    5    6    7    8    9
      Number of Cases           1584  162   50   17   10    5    7    2    1    1

  29. EM algorithm: PRELIS

  30. EM algorithm: PRELIS syntax:
      !PRELIS SYNTAX: Can be edited
      SY='G:\Missing\USA5.PSF'
      SE 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
      EM CC = 0.00001 IT = 200
      OU MA=CM SM=emcovar1.cov RA=usa6.psf AC=emcovar1.acm XT XM
      -------------------------------
      EM Algoritm for missing Data:
      -------------------------------
      Number of different missing-value patterns= 80
      Convergence of EM-algorithm in 4 iterations
      -2 Ln(L) = 98714.48572
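
For readers who want to see what the EM step is doing, here is a minimal Python sketch of EM estimation of a mean vector and covariance matrix under multivariate normality. It is not the PRELIS implementation. The function name `em_mean_cov` and the simulated data are assumptions, and the `tol` and `max_iter` arguments only loosely mirror the CC and IT keywords above. Starting values are the available-case means and a diagonal covariance matrix.

```python
import numpy as np

def em_mean_cov(data, max_iter=200, tol=1e-5):
    """EM estimates of mean and covariance from data with np.nan for missing."""
    X = np.asarray(data, dtype=float)
    n, p = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for iteration in range(1, max_iter + 1):
        filled = np.where(miss, 0.0, X)
        extra = np.zeros((p, p))  # conditional covariance of the missing parts
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():  # nothing observed for this case
                filled[i] = mu
                extra += sigma
                continue
            s_oo = sigma[np.ix_(o, o)]
            s_mo = sigma[np.ix_(m, o)]
            coef = np.linalg.solve(s_oo, s_mo.T).T  # regression of missing on observed
            # E-step: conditional mean of the missing values given the observed ones
            filled[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
            extra[np.ix_(m, m)] += sigma[np.ix_(m, m)] - coef @ s_mo.T
        # M-step: update the mean and covariance from the completed data
        new_mu = filled.mean(axis=0)
        centered = filled - new_mu
        new_sigma = (centered.T @ centered + extra) / n
        change = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if change < tol:
            break
    return mu, sigma, iteration

# Simulated example (hypothetical): 15% of the values deleted completely at random.
rng = np.random.default_rng(42)
true_mu = np.array([0.0, 1.0, 2.0])
true_sigma = np.array([[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]])
data = rng.multivariate_normal(true_mu, true_sigma, size=2000)
data[rng.random(data.shape) < 0.15] = np.nan

mu_hat, sigma_hat, n_iter = em_mean_cov(data)
print(n_iter, np.round(mu_hat, 3))
print(np.round(sigma_hat, 3))
```

The resulting covariance matrix plays the role of the EM covariance matrix that PRELIS saves for later analysis in LISREL.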

  31. Multiple Group Approach
      Allison, Soc. Methods & Res., 1987
      Bollen, p. 374 (uses old LISREL matrix notation)

  32. Multiple Group Approach Note: 13 elements of matrix have “pseudo” values - 13 df

  33. Multiple group approach Disadvantage: - Works only with a relatively small number of missing patterns
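
Because the multiple-group approach needs one group per missing-data pattern, it helps to count the patterns before trying it. A minimal Python sketch follows, assuming the data are in a numeric array with `np.nan` for missing values. The function name `missing_patterns` and the toy data are hypothetical.

```python
import numpy as np

def missing_patterns(data):
    """Return the distinct missing-data patterns (1 = missing) and their frequencies."""
    mask = np.isnan(np.asarray(data, dtype=float)).astype(int)
    patterns, counts = np.unique(mask, axis=0, return_counts=True)
    order = np.argsort(-counts, kind="stable")  # most frequent pattern first
    return patterns[order], counts[order]

# Toy data set: 5 cases, 3 variables.
toy = np.array([[1.0, 2.0, 3.0],
                [1.0, np.nan, 3.0],
                [4.0, 5.0, 6.0],
                [np.nan, np.nan, 1.0],
                [2.0, np.nan, 2.0]])

patterns, counts = missing_patterns(toy)
for pattern, count in zip(patterns, counts):
    print(pattern, count)
```

Each distinct pattern would become one group, so a data set with dozens of patterns, many of them containing only a handful of cases, quickly becomes unmanageable.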

  34. Other missing data option: FIML estimation
      LISREL PROGRAM FOR SEXUAL MORALITY AND RELIGIOSITY EXAMPLE
      DA NI=19 NO=1839 MA=CM
      RA FI='G:\MISSING\USA1.PSF'
      --------------------------------
      EM Algorithm for missing Data:
      --------------------------------
      Number of different missing-value patterns= 80
      Convergence of EM-algorithm in 5 iterations
      -2 Ln(L) = 98714.48567
      Percentage missing values= 1.81
      Note: The Covariances and/or Means to be analyzed are estimated by the EM procedure and are only used to obtain starting values for the FIML procedure
      SE
      V9 V151 V175 V176 V147 V304 V305 V307 V308 V309 V310 V355 V356 SEX/
      MO NY=11 NE=2 LY=FU,FI PS=SY TE=SY BE=FU,FI NX=3 NK=3 LX=ID C
      PH=SY,FR TD=ZE GA=FU,FR
      VA 1.0 LY 5 1 LY 8 2
      FR LY 1 1 LY 2 1 LY 3 1 LY 4 1
      FR LY 11 2 LY 7 2 LY 6 2 LY 9 2 LY 10 2
      FR BE 2 1
      OU ME=ML MI SC ND=4
      LISREL IMPLEMENTATION

  35. FIML
      GAMMA
                   V355       V356        SEX
               --------   --------   --------
      ETA 1     -0.0137     0.0604     0.4172
               (0.0024)   (0.0202)   (0.0828)
                -5.7192     2.9812     5.0358
      ETA 2     -0.0066     0.1583    -0.3198
               (0.0025)   (0.0215)   (0.0871)
                -2.6128     7.3654    -3.6705

      GAMMA -- regular ML, listwise
                    AGE       EDUC        SEX
               --------   --------   --------
      ETA 1     -0.0130     0.0732     0.4257
               (0.0025)   (0.0205)   (0.0904)
                -5.2198     3.5626     4.7098
      ETA 2     -0.0076     0.1562    -0.3112
               (0.0028)   (0.0227)   (0.0970)
                -2.7180     6.8715    -3.2087

  36. FIML (also referred to as “direct ML”)
      • Available in AMOS and in LISREL
      • AMOS implementation fairly easy to use (check off means and intercepts, input data with missing cases and … voila!)
      • LISREL implementation a bit more difficult: must input raw data from PRELIS into LISREL
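
The idea behind “direct ML” is that each case contributes to the likelihood only through the variables it actually has. The Python sketch below (not AMOS or LISREL code) evaluates -2 ln L casewise for a given mean vector and covariance matrix. The function name `fiml_minus2_loglik` and the toy data are assumptions. In a real FIML run the mean vector and covariance matrix are the model-implied moments, and an optimizer searches for the parameter values that minimize this quantity.

```python
import numpy as np

def fiml_minus2_loglik(data, mu, sigma):
    """Casewise -2 ln L under multivariate normality; np.nan marks missing values."""
    X = np.asarray(data, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    total = 0.0
    for row in X:
        o = ~np.isnan(row)          # variables observed for this case
        k = int(o.sum())
        if k == 0:
            continue                # a fully missing case carries no information
        resid = row[o] - mu[o]
        s_oo = sigma[np.ix_(o, o)]  # covariance submatrix for the observed variables
        _, logdet = np.linalg.slogdet(s_oo)
        total += k * np.log(2 * np.pi) + logdet + resid @ np.linalg.solve(s_oo, resid)
    return float(total)

# Toy check: standard bivariate normal, one incomplete case and one complete case.
toy = np.array([[1.0, np.nan],
                [0.0, 0.0]])
value = fiml_minus2_loglik(toy, mu=[0.0, 0.0], sigma=np.eye(2))
print(value)
```

No case is dropped and no value is filled in, which is what distinguishes this approach from listwise deletion and from imputation.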

  37. FIML

  38. FIML

  39. FIML

  40. (INSERT PRELIS/LISREL DEMO HERE) • EM covariance matrix • Nearest neighbour imputation • FIML

  41. EM algorithm: in SAS
      • PROC MI
      Example: religiosity/morality problem. /Week4Examples/MissingData/SAS SASMIProc1.sas

  42. SAS MI procedure
      libname in1 'e:\classes\icpsr2005\Week4Examples\MissingData2\SAS';
      data one; set in1.wvssub3a;
      proc mi; em outem=in1.cov;
        var V9 V151 V175 V176 V147 V304 V305 V307 V308 V309 V310 v355 v356 SEX;
      run;
      proc calis data=in1.cov cov mod;
      [calis procedure specifications]

  43. SAS MI procedure
      Data Set                            WORK.ONE
      Method                              MCMC
      Multiple Imputation Chain           Single Chain
      Initial Estimates for MCMC          EM Posterior Mode
      Start                               Starting Value
      Prior                               Jeffreys
      Number of Imputations               5
      Number of Burn-in Iterations        200
      Number of Iterations                100
      Seed for random number generator    1254

      Missing Data Patterns
      Group V9   V151 V175 V176 V147 V304 V305 V307 V308 V309 V310 V355 V356 SEX  Freq
      1     X    X    X    X    X    X    X    X    X    X    X    X    X    X    1456
      2     X    X    X    X    X    X    X    X    X    X    X    X    .    X    173
      3     X    X    X    X    X    X    X    X    X    X    X    .    X    X    10

  44. SAS MI procedure
      Missing Data Patterns
      Group V9   V151 V175 V176 V147 V304 V305 V307 V308 V309 V310 V355 V356 SEX  Freq
      4     X    X    X    X    X    X    X    X    X    X    X    .    .    X    10
      5     X    X    X    X    X    X    X    X    X    X    .    X    X    X    5
      6     X    X    X    X    X    X    X    X    X    .    X    X    X    X    9
      7     X    X    X    X    X    X    X    X    X    .    X    X    .    X    1
      8     X    X    X    X    X    X    X    X    X    .    .    X    X    X    2
      9     X    X    X    X    X    X    X    X    .    X    X    X    X    X    3
      10    X    X    X    X    X    X    X    X    .    X    .    X    X    X    1
      11    X    X    X    X    X    X    X    .    X    X    X    X    X    X    13
      12    X    X    X    X    X    X    X    .    X    X    X    X    .    X    2
      13    X    X    X    X    X    X    X    .    X    X    X    .    .    X    1
      14    X    X    X    X    X    X    X    .    X    X    .    X    X    X    3
      15    X    X    X    X    X    X    X    .    X    X    .    .    X    X    1
      16    X    X    X    X    X    X    X    .    X    .    X    X    X    X    1
      17    X    X    X    X    X    X    X    .    X    .    X    X    .    X    1

  45. SAS MI procedure
      Initial Parameter Estimates for EM
      _TYPE_  _NAME_        V9       V151       V175       V176       V147
      MEAN            1.720790   1.174790   1.414770   8.058470   3.958927

      Initial Parameter Estimates for EM
          V304       V305       V307       V308       V309       V310       V355
      1.876238   2.151885   3.049916   2.395683   4.001110   4.896284  46.792265

      Initial Parameter Estimates for EM
          V356        SEX
      7.775246   0.489396

  46. SAS MI procedure
      Initial Parameter Estimates for EM
      _TYPE_  _NAME_        V9       V151       V175       V176       V147
      COV     V9      0.808388          0          0          0          0
      COV     V151           0   0.168983          0          0          0
      COV     V175           0          0   0.483982          0          0
      COV     V176           0          0          0   6.783348          0
      COV     V147           0          0          0          0   6.575298

  47. SAS MI procedure
      EM (MLE) Parameter Estimates
      _TYPE_  _NAME_         V9       V151       V175       V176       V147
      MEAN             1.721840   1.180968   1.420315   8.046136   3.959583
      COV     V9       0.807215   0.184412   0.307067  -1.599731   1.301326
      COV     V151     0.184412   0.170271   0.137480  -0.626684   0.454568
      COV     V175     0.307067   0.137480   0.485803  -1.073616   0.753307
      COV     V176    -1.599731  -0.626684  -1.073616   6.805023  -3.428576
      COV     V147     1.301326   0.454568   0.753307  -3.428576   6.567477
      COV     V304     0.390792   0.165856   0.263160  -1.368173   1.069671
      COV     V305     0.455902   0.114129   0.249936  -1.353161   0.993579

  48. SAS PROC MI
      Multiple Imputation Variance Information
                                                                       Relative     Fraction
                -----------------Variance-----------------             Increase      Missing
      Variable      Between        Within         Total        DF   in Variance  Information
      V9        0.000000239      0.000439      0.000439    1834.4      0.000653     0.000653
      V151      0.000002904   0.000092789   0.000096275    1120.1      0.037561     0.036832
      V175      0.000002180      0.000264      0.000266    1741.7      0.009913     0.009863
      V176      0.000002364      0.003710      0.003713    1834.1      0.000765     0.000764
      V147      0.000025982      0.003571      0.003602    1760.1      0.008731     0.008692
      V304      0.000002260      0.001621      0.001623    1830.6      0.001674     0.001672
      V305      0.000034129      0.001946      0.001987    1509.7      0.021050     0.020824
      V307      0.000027451      0.003995      0.004028    1767.2      0.008245     0.008211
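
The Total, Relative Increase in Variance, and Fraction Missing Information columns appear to follow Rubin's combining rules for m imputations. The Python sketch below recomputes them for the V151 row with m = 5, the number of imputations reported earlier in the PROC MI output. The function name `rubin_variance` is hypothetical. The DF column is not reproduced: the sketch uses the unadjusted degrees of freedom internally, whereas the printed DF values presumably reflect a small-sample adjustment (an assumption).

```python
def rubin_variance(between, within, m):
    """Combine between- and within-imputation variance (Rubin's rules)."""
    total = within + (1 + 1 / m) * between
    rel_increase = (1 + 1 / m) * between / within
    df = (m - 1) * (1 + 1 / rel_increase) ** 2   # unadjusted degrees of freedom
    frac_missing = (rel_increase + 2 / (df + 3)) / (rel_increase + 1)
    return total, rel_increase, frac_missing

# V151 row of the table: Between = 0.000002904, Within = 0.000092789, m = 5.
total, rel_increase, frac_missing = rubin_variance(0.000002904, 0.000092789, m=5)
print(f"total={total:.9f}  relative increase={rel_increase:.6f}  "
      f"fraction missing={frac_missing:.6f}")
```

The results agree with the printed values for V151 to within the rounding of the Between and Within columns.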

  49. SAS PROC MI
      SAS log:
      115  proc mi; em outem=in1.cov; var
      NOTE: This is an experimental version of the MI procedure.
      116  V9 V151 V175 V176 V147 V304 V305 V307 V308 V309 V310 v355 v356 SEX; run;
      NOTE: The data set IN1.COV has 15 observations and 16 variables.
      NOTE: PROCEDURE MI used:
            real time 2.77 seconds
            cpu time 2.65 seconds

  50. CALIS (SAS)
      proc calis data=in1.cov cov nobs=1836 mod;  /* nobs= not needed if working with raw data */
      lineqs
        v9   = 1.0 F1 + e1,
        V175 = b1 F1 + e2,
        V176 = b2 F1 + e3,
        V147 = b3 F1 + e4,
        V304 = 1.0 F2 + e5,
        V305 = b4 F2 + e6,
        V307 = b5 F2 + e7,
        V308 = b6 F2 + e8,
        V309 = b7 F2 + e9,
        V310 = b8 F2 + e10,
        F1 = b9 V355 + b10 V356 + b11 SEX + d1,
        F2 = b12 V355 + b13 V356 + b14 SEX + d2;
      std
        e1-e10 = errvar:,  /* special convention for more than 1 at a time (generates warning msg.) */
        v355 = vv355, v356 = vv356, sex = vsex, d1 = vd1, d2 = vd2;
      cov d1 d2 = covD1D2;
      run;
