Structural Equation Modeling: Problems and Ambiguities with “Well Fitting” Models

Andrew Tomarken

Overview of Talk • Brief introduction to structural equation modeling (SEM) with emphasis on core concept of model fit • Review of several ambiguities and problems associated with well-fitting models that are typically ignored by users • Conclusions: • It is important for users to bear in mind what precisely is being tested when assessing model fit • Users need to look beyond omnibus measures of fit

What is SEM? • A set of methods for estimating and testing models that are hypothesized to account for the variances and covariances (and possibly mean structures) among a set of variables • Such models typically consist of sets of linear equations containing free, fixed, or otherwise constrained parameters • Two types of linear relations can be specified • between latent constructs (or factors) and their observable indicators (measurement model) • between latent constructs (structural model) • One way to think about it: Combines simultaneous equation/econometric approaches and factor-analytic/psychometric approaches

SEM as a General Statistical Approach • Most statistical procedures conventionally used to test hypotheses can be considered special cases of SEM • Parallels the development of generalized linear models (GLMs) in the 1970s and 1980s as a liberalization of classic linear models • The recent development of multilevel and mixture modeling within the SEM domain represents a further extension of GLMs to latent continuous and categorical variables • Thus SEM is arguably the most general data-analytic framework at the present time (Tomarken & Waller, 2005)

Some Advantages of SEM • High level of explicitness: Forces researchers to specify a model with a high level of detail • Typically aligns the statistical null hypothesis with the research hypothesis • In principle, allows for separate assessments of relations between observable indicators and latent variables (measurement model) and among latent variables • Can test models that are difficult or impossible to test with other procedures (e.g., factor of curves, associative growth) • Allows you to test the overall fit of even very complex models – and that’s the focus of today’s talk

Path Analysis Model (Lynam et al., 1993)

The Figure Implies Linear Equations • Imp = a SES + b TE + c VIQ + e1 • Del = d SES + e Imp + e2
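One way to make the restrictions behind these two equations concrete is to count knowns and free parameters. The sketch below is illustrative only. Because the figure is not reproduced here, it assumes that SES, TE, and VIQ are exogenous with freely estimated variances and covariances, and that the residuals e1 and e2 are uncorrelated. The function name `count_df` is hypothetical.

```python
def count_df(n_observed, n_free):
    """Degrees of freedom = distinct variances/covariances minus free parameters."""
    n_knowns = n_observed * (n_observed + 1) // 2
    return n_knowns - n_free

# Assumed parameter inventory for the two equations (not confirmed by the figure)
paths = 5                  # a, b, c, d, e
residual_variances = 2     # Var(e1), Var(e2)
exogenous_variances = 3    # SES, TE, VIQ
exogenous_covariances = 3  # among SES, TE, VIQ

n_free = paths + residual_variances + exogenous_variances + exogenous_covariances
df = count_df(5, n_free)   # 5 observed variables: SES, TE, VIQ, Imp, Del
print(df)
```

Under these assumptions the remaining degrees of freedom correspond to the two direct paths the equations leave out: TE to Del and VIQ to Del.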

Confirmatory Factor Analysis Model

The Figure Implies Linear Equations • Visperc = Spatial + e_v • Cubes = a Spatial + e_c • Lozenges = b Spatial + e_l • Paragraph = Verbal + e_p • Sentence = c Verbal + e_s • Wordmean = d Verbal + e_w

Latent Variable Causal Model (Trull, 2001)

A SEM Analysis: What Do We Want to Do?

Estimate Coefficients and Standard Errors

Assess Overall Fit

Model Comparisons

The Concept of Model Fit in SEM • The question: Does the structure implied by the model account for the observed variances and covariances among a set of variables? • We compare the observed covariance matrix to the covariance matrix implied by the model • A fitting function (F) assesses the discrepancy between S (the sample covariance matrix) and Σ̂ (the estimated population covariance matrix implied by the model) • Example: ML fitting function: F_ML = ln|Σ̂| - ln|S| + tr(S Σ̂^(-1)) - p, where p is the number of observed variables • F – or something very much like F – appears in the formulae for all conventionally used statistical tests of fit and fit indices • Estimates of free parameters are chosen that meet two potentially competing goals: • minimize the discrepancy between the implied and observed matrices • respect the restrictions (constraints) on the covariance matrix implied by the model
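A minimal sketch of the ML discrepancy in its standard form, F_ML = ln|Sigma| - ln|S| + tr(S Sigma^-1) - p, where Sigma is the implied matrix and p is the number of observed variables. The helper names and the 2 x 2 matrices are hypothetical, and the small Gauss-Jordan routine is there only to keep the example free of external libraries. A real analysis would rely on SEM software.

```python
import math

def inverse_and_logdet(a):
    """Gauss-Jordan inverse and log-determinant of a positive-definite matrix."""
    n = len(a)
    # Augment the matrix with the identity
    m = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    logdet = 0.0
    for k in range(n):
        pivot = m[k][k]
        if pivot <= 0:
            raise ValueError("matrix is not positive definite")
        logdet += math.log(pivot)
        m[k] = [v / pivot for v in m[k]]
        for i in range(n):
            if i != k:
                f = m[i][k]
                m[i] = [vi - f * vk for vi, vk in zip(m[i], m[k])]
    return [row[n:] for row in m], logdet

def f_ml(S, Sigma):
    """ML discrepancy between sample matrix S and implied matrix Sigma."""
    p = len(S)
    sigma_inv, logdet_sigma = inverse_and_logdet(Sigma)
    _, logdet_s = inverse_and_logdet(S)
    trace = sum(S[i][j] * sigma_inv[j][i] for i in range(p) for j in range(p))
    return logdet_sigma - logdet_s + trace - p

S = [[1.0, 0.5], [0.5, 1.0]]             # hypothetical sample matrix
independence = [[1.0, 0.0], [0.0, 1.0]]  # implied matrix that forces Cov = 0

print(f_ml(S, S))             # zero up to rounding: implied equals observed
print(f_ml(S, independence))  # positive: the zero-covariance restriction is violated
```

The discrepancy is zero only when the implied matrix reproduces S, and it grows as the model's restrictions pull the implied matrix away from the data.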

Example of Model-Imposed Restrictions: 3 Variable Mediational Model

Example of Model-Imposed Restrictions: Confirmatory Factor Model • 10 knowns: the 4 variances and 6 covariances among v1-v4 • 8 free parameters to estimate: 4 factor loadings (a-d) and the variances of the four error terms (e1-e4) • This model also implies a set of constraints on the covariances among the observable variables: C(1,3)C(2,4) = C(1,4)C(2,3) = C(1,2)C(3,4) • This model will result in an estimated or implied covariance matrix that respects these constraints • If the sample and implied matrices agree, the model fits
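The constraints can be checked numerically. The sketch below assumes a single factor with its variance fixed at 1, which is consistent with the 8 free parameters listed, and uses made-up loadings and error variances. Whatever values are chosen, the three products of covariances come out equal.

```python
import math

# Hypothetical loadings a-d and error variances e1-e4 (factor variance fixed at 1)
loadings = [0.8, 0.7, 0.6, 0.5]
error_variances = [0.36, 0.51, 0.64, 0.75]

def implied_covariance(loadings, error_variances):
    """Implied covariance of a one-factor model: lambda * lambda' + diag(theta)."""
    n = len(loadings)
    return [[loadings[i] * loadings[j] + (error_variances[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]

C = implied_covariance(loadings, error_variances)

# The three products from the slide (variables numbered 1-4, Python indices 0-3)
t1 = C[0][2] * C[1][3]  # C(1,3)C(2,4)
t2 = C[0][3] * C[1][2]  # C(1,4)C(2,3)
t3 = C[0][1] * C[2][3]  # C(1,2)C(3,4)
constraints_hold = math.isclose(t1, t2) and math.isclose(t2, t3)

knowns = 4 * 5 // 2
free = len(loadings) + len(error_variances)
print(constraints_hold, knowns - free)
```

The two independent equalities among the three products match the 10 - 8 = 2 degrees of freedom of the model.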

This Should Sound Familiar • Although the models and the specific criteria minimized may differ, the notion that statistical tests and fit indices evaluate model-imposed restrictions is completely consistent with general principles of statistical modeling, particularly in specific contexts (e.g., ML estimation)

How Do Users Typically Assess Overall Fit? • Hypothesis-testing using inferential statistical tests • Likelihood ratio chi-square test of exact fit – Compares target model to a saturated (just-identified model) • Nested chi-square tests for competing models (very important for model comparisons) • Fit indices that indicate degree of fit • Historically, more methodological papers on SEM have focused on measures of fit than any other topic
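As a rough sketch of the exact-fit test, the statistic is the minimized fitting function scaled by sample size and referred to a chi-square distribution with degrees of freedom equal to knowns minus free parameters. The (N - 1) multiplier is one common convention (some programs use N). The input values and function names are hypothetical, and the chi-square tail probability is computed from a textbook series so that the example needs only the standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability via the series for the lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    lower = math.exp(-z + a * math.log(z) - math.lgamma(a)) * total
    return max(0.0, 1.0 - lower)

def exact_fit_test(f_min, n, n_observed, n_free):
    """Likelihood ratio test of exact fit: T = (N - 1) * F_min."""
    t = (n - 1) * f_min
    df = n_observed * (n_observed + 1) // 2 - n_free
    return t, df, chi2_sf(t, df)

# Hypothetical inputs: F_min = 0.03, N = 201, 4 observed variables, 8 free parameters
t, df, p = exact_fit_test(0.03, 201, 4, 8)
print(round(t, 2), df, round(p, 3))
```

Because T grows with N for a fixed discrepancy, the same F_min that is unremarkable in a small sample can lead to rejection in a large one.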

Below the Radar • Both methodological literature and empirical applications heavily emphasize statistical tests and descriptive indices of fit • This focus can blind users to an important point: Even “well-fitting” models can have substantial problems and uncertainties that are often ignored by researchers • Tomarken and Waller’s (2003) review indicated a number of respects in which users ignore several potential problems with models that appear to fit well • Ironically, these issues are not particularly subtle. Rather they are linked to core features of the concept of “model fit” in the SEM context

Potential Problems/Ambiguities with Well-Fitting Models (and/or the Researchers Who Test Them) • Lack of clarity concerning what exactly is being tested • A poorly fitting structural (i.e., path) component that is masked by a well-fitting composite model • A large number of equivalent models that will always yield identical fit to the target model • Questionable lower-order components of fit • Omitted variables that influence constructs included in the model • The presence of a number of non-equivalent and non-nested alternative models that could fit better but are rarely ever tested • Low power or sensitivity to detect critical misspecifications • Specifications driven by hidden post-hoc modifications that lower the validity and replicability of the results

Issue # 1: Do You Know What Exactly is Being Tested? • SEM models impose restrictions on variances and covariances among the observed variables (and sometimes on means too) • Unfortunately: • Researchers are often unaware of the restrictions tested by even simple models • Such restrictions sometimes do not reflect what the researcher would identify as core features of the model, i.e., the questions that motivated the study in the first place • Many models impose so many restrictions that it’s typically impossible for even specialists to figure them all out or render them comprehensible in a more global way • In short: • People often are unaware of what exactly is being assessed by statistical tests of fit or fit measures, and what is being assessed is often not exactly what the researcher had in mind

This Does Not Mean Overall Model Fit is Irrelevant! • One might argue: Let’s just ignore fit indices and look at what we’re really interested in • Flawed argument: One would not want to test coefficients, estimate direct and indirect effects, estimate proportion of variance, etc., in a model that does not fit well and appears to be misspecified. Parameter estimates and standard errors will be inaccurate. • Don’t ignore fit, but see it as a first step or necessary condition for looking at what you really are interested in. It is not an end in itself.

Why the Problem? • Educational • Perceptual/cognitive biases • Feature-positive effect: We attend more to presence (what’s there) than to absence (what’s not there) • Model restrictions are usually characterized by absence (e.g., coefficients that are fixed at 0). • Reliance on graphics and other user-friendly mechanisms for specifying models in software • Complexity of many models makes it impossible to catalogue all restrictions

Issue # 2: A Well-Fitting Composite Model that Masks an Ill-Fitting Structural (Path) Model • In many latent variable SEM models it’s important to distinguish between: • Measurement model: Relations between manifest indicators and latent constructs • Structural (path) model: Relations among latent constructs • Composite model: The whole model that combines both the measurement and structural components • Typically, in latent variable models the clear majority of the restrictions are imposed at the level of the measurement model, and that component often fits well • Common result: A well-fitting composite model that masks an ill-fitting structural component • But the main motivation for the study typically is the structural component!

Chi-Square Tests of the C, M, and S Models • Composite: Global test of the composite model • Measurement: Global test of the measurement model • Structural: Nested test assessing relative fit of the composite and measurement models
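Written out, the nested test for the structural component is a chi-square difference. The subscripts C, M, and S follow the slide's labels. The expression assumes the usual setup in which the composite model is nested within the measurement model, because the measurement model leaves the relations among the latent constructs unrestricted.

```latex
\chi^2_S = \chi^2_C - \chi^2_M, \qquad df_S = df_C - df_M
```

Because the structural component typically contributes few degrees of freedom, a difference that is large relative to df_S can be diluted in the composite test, where it is spread over the much larger df_C.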

Illustrating the Problem

Issue # 3: Equivalent Models • Two models are equivalent when their assessed fit across all possible samples is identical because they impose identical restrictions on the data • Such models are ubiquitous in statistics • In the context of SEM, two models are equivalent when their implied covariance matrices are identical because they impose the same restrictions on the variances and covariances • If their implied covariance matrices are identical, then for any given sample, their discrepancy functions will be identical. • If their discrepancy functions are identical, the values of all conventionally used fit indices will be identical.

The Problem • The typical structural equation model has many equivalent models that impose the same restrictions on the data • Typically, at least some are compelling theoretical alternatives to the target model of interest • Such equivalent models are almost never acknowledged by researchers

3 Equivalent Causal Models • These 3 models share the same restriction: [Cov(x,z)*Var(y)] - [Cov(x,y)*Cov(y,z)] = 0 • If variables are standardized, this restriction is: r_xz - r_xy*r_yz = 0 • All 3 models predict that the partial correlation between x and z, adjusting for y, equals 0 • The overall fit of these 3 models will always be the same • However, they represent three radically different claims about causal structure
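The shared restriction can be verified directly from model-implied moments. The figure is not reproduced here, so the sketch assumes that the three models are the chain x -> y -> z, the reversed chain z -> y -> x, and the fork x <- y -> z. All coefficient values and function names are hypothetical.

```python
import math

def restriction(cov_xy, cov_yz, cov_xz, var_y):
    """Cov(x,z)*Var(y) - Cov(x,y)*Cov(y,z); zero under all three models."""
    return cov_xz * var_y - cov_xy * cov_yz

def chain(a, b, var_x, var_ey):
    """x -> y -> z with y = a*x + e_y and z = b*y + e_z."""
    var_y = a * a * var_x + var_ey
    return dict(cov_xy=a * var_x, cov_yz=b * var_y, cov_xz=a * b * var_x, var_y=var_y)

def reverse_chain(a, b, var_z, var_ey):
    """z -> y -> x with y = b*z + e_y and x = a*y + e_x."""
    var_y = b * b * var_z + var_ey
    return dict(cov_xy=a * var_y, cov_yz=b * var_z, cov_xz=a * b * var_z, var_y=var_y)

def fork(a, b, var_y):
    """x <- y -> z with x = a*y + e_x and z = b*y + e_z."""
    return dict(cov_xy=a * var_y, cov_yz=b * var_y, cov_xz=a * b * var_y, var_y=var_y)

models = {
    "chain": chain(0.5, 0.4, 1.0, 0.75),
    "reverse chain": reverse_chain(0.5, 0.4, 1.0, 0.84),
    "fork": fork(0.5, 0.4, 1.0),
}
for name, moments in models.items():
    print(name, math.isclose(restriction(**moments), 0.0, abs_tol=1e-12))
```

The residual variances of x and z never enter the restriction, which is why no choice of them lets the data discriminate among the three causal stories.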

Three Equivalent Measurement Models • All 3 models impose the same restriction on the implied covariance matrix: [Cov(x1,x3)*Cov(x2,x4)] - [Cov(x1,x4)*Cov(x2,x3)] = 0

Recommendations • Researchers need to acknowledge the presence of equivalent models • Use designs that limit the number of plausible equivalents (e.g., longitudinal rather than cross-sectional designs, a rarely noted advantage of the former)

Issue # 4: Fixated on Fit: Inattention to Lower-order Components • What are “lower-order components”? • Specific model parameters (e.g., path coefficients) • Measures that can be derived from parameters • Direct, indirect, and total effects • Proportion of variance • In most other statistical procedures that we use (e.g., multiple regression), the focus is on lower-order components • There can be dissociations between measures of overall fit and lower-order components • A model can fit perfectly, yet have problematic or disappointing lower-order components • Lower-order components can indicate very strong effects, yet the overall fit can be terrible • Problem: Applied researchers often inappropriately de-emphasize lower-order components in favor of reliance on global fit indices

Sample Covariance Matrix SA Sample Covariance Matrix SB

How Can a Model with Problematic Lower-order Components Fit Well? • Residuals are part of the model! • Two types of residuals in SEM • Residual matrix that is difference between observed and implied covariance matrices • Residual variances and covariances (e.g., variance of an endogenous variable not accounted for by its predictors) that are model parameters • Residual variances • Typically, are just-identified (impose no restrictions) • Can easily “fill in the difference” to reproduce the observed variance of a variable even when predictors account for very small proportion of variance • In essence, a weak theory can be bailed out by residual terms
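A small numerical illustration of how a residual variance "fills in the difference". It treats a one-predictor regression as a just-identified model and uses made-up moments. The implied variance of the outcome matches the observed variance exactly even though the predictor explains about 1% of it.

```python
# Hypothetical observed moments for a predictor x and an outcome y
var_x, var_y, cov_xy = 1.0, 4.0, 0.2

# Free parameters of the regression of y on x
slope = cov_xy / var_x
residual_variance = var_y - slope ** 2 * var_x

# Model-implied moments rebuilt from the parameters
implied_var_y = slope ** 2 * var_x + residual_variance
implied_cov_xy = slope * var_x

# Proportion of variance in y accounted for by x
r_squared = 1.0 - residual_variance / var_y

print(abs(implied_var_y - var_y) < 1e-12, abs(implied_cov_xy - cov_xy) < 1e-12)
print(round(r_squared, 4))
```

Measures of overall fit see only the match between implied and observed moments, so nothing in them registers how little of the outcome is actually explained.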

Residual Covariances are Often Critical Too

Other Respects in Which Local Features of a Model are Ignored • Confidence intervals around parameter estimates rarely reported • Potential problems with tests of parameters often ignored • Reliance on Wald tests • Incorrect chi-square distributions for tests at the boundary of the parameter space • Often invariance across different parameterizations is mistakenly assumed • Issue of assessment of fit at the level of individual subjects is typically ignored (e.g. no analysis of residuals or of individual contributions to fit) • Irony: In many cases, a more rigorous assessment of a model is afforded by a more traditional multiple regression approach!

Issue # 5: Omitted Variables • Sometimes measures of fit are sensitive to the problem of omitted variables (4A tested model, 4B true model) • Sometimes they are not (4A tested, 4C true model) • Thus, a well-fitting model could -- and typically does -- omit important variables
