Structural Equation Modeling

What Is SEM?
• Swiss Army Knife of Statistics
• Can replicate virtually any model from “canned” stats packages (with some limitations for categorical variables)
• Can test process (B mediates the relationship between A and C)
• Can test competing models
• Simultaneous estimation of multiple equations
• Error-free estimates (if used correctly)

Regression Logic
• Estimated using Ordinary Least Squares (OLS)
• Assumes normally distributed, independent errors
• Why? Not for estimating the B’s, but for estimating the standard errors of the B’s!
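To make the last bullet concrete, here is a minimal sketch of a one-predictor regression in plain Python. The data and the function name `simple_ols` are made up for illustration. Note that the slope comes straight from sums of squares, while the standard error is the piece that leans on the error assumptions.

```python
import math

def simple_ols(x, y):
    """Return (intercept, slope, standard error of the slope) for y = b0 + b1*x + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope: no distributional assumption needed
    b0 = mean_y - b1 * mean_x           # intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in residuals) / (n - 2)   # residual variance
    se_b1 = math.sqrt(s2 / sxx)         # this formula relies on independent, equal-variance errors
    return b0, b1, se_b1

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, se_b1 = simple_ols(x, y)
```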

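SEM Logic
[Path diagram: X1 → Y with coefficient β1; X2 → Y with coefficient β2; intercept β0; error term e → Y with its path fixed to 1; X1 ↔ X2 correlated (r)]
• This will result in the same parameter estimates as the regression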
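One way to see why the path model and the regression agree: the regression weights can be recovered from the variances, covariances, and means alone, which is the information SEM works from. The sketch below assumes two predictors and uses made-up, error-free data generated from Y = 1 + 2·X1 + 3·X2; the helper names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def path_coefficients(x1, x2, y):
    """Solve for b0, b1, b2 using only covariances and means (two-predictor case)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2          # zero if X1 and X2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return b0, b1, b2

# Hypothetical data generated without error from Y = 1 + 2*X1 + 3*X2
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]
b0, b1, b2 = path_coefficients(x1, x2, y)
```

The recovered values match the generating weights, which is the sense in which the diagram and the regression equation are the same model.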

Symbols
• One-headed arrows: A → B
• “A causes B”
• Two-headed arrows: A ↔ B
• “A is correlated with B,” also referred to as an “unanalyzed relationship”

More Symbols
• Important Note: No arrow between two variables fixes the relationship to zero. It is an argument that there is no relationship, not just one that you are not analyzing.
• A → B with the path fixed to 0 is the same as A and B drawn with no arrow between them

More Symbols
• X: An observed variable (i.e., in your dataset)
• F: An unobserved variable (i.e., not in your dataset). This is called a latent trait.
• If used correctly, these guys can get you error-free estimates of parameters

Latent Trait
• In a confirmatory factor analysis:
• Latent trait is the shared variance
• Remember, error doesn’t correlate with anything!
[Path diagram: F → X1, F → X2, F → X3]
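As a small illustration of "shared variance," consider a one-factor model with three standardized indicators and the factor variance fixed to 1. Under those assumptions each correlation is the product of two loadings (r_ij = λ_i·λ_j), so the loadings can be solved by hand. The correlations below are made up, and the function name is hypothetical.

```python
import math

def one_factor_loadings(r12, r13, r23):
    """Loadings of a one-factor, three-indicator model from the three correlations.

    Assumes standardized indicators, factor variance fixed to 1,
    uncorrelated errors, and all-positive correlations.
    """
    l1 = math.sqrt(r12 * r13 / r23)
    l2 = math.sqrt(r12 * r23 / r13)
    l3 = math.sqrt(r13 * r23 / r12)
    return l1, l2, l3

# Hypothetical correlations among X1, X2, X3
loadings = one_factor_loadings(0.56, 0.48, 0.42)

# Whatever the factor does not explain is unique (error) variance
unique_variances = [1 - l ** 2 for l in loadings]
```

Here the shared variance in each indicator is the squared loading, and the remainder is error that, by assumption, correlates with nothing else.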

You Already Know About Latent Traits!
• You just might not know that you know
• MANOVA
[Path diagram: X1, latent F, and Y1, Y2, Y3, with error e]
• F would be the first characteristic root

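Let’s Practice
• Write out this model as a regression equation
[Path diagram: SES → Obesity (β1); Height → Obesity (β2); Days Walked to School → Obesity (β3); error e → Obesity with its path fixed to 1; the three predictors are correlated with one another (r)]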

Estimation
• Maximum Likelihood:
• “Plug in” numbers until you reach the point where the error is minimized (which is the same as the point where the likelihood is maximized)
• Have to make one very strong assumption: multivariate normality
• But you can violate this to some degree without causing major problems (and most models will).
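The "plug in numbers" idea can be sketched with a deliberately tiny problem: estimating a single mean under a normal model with known variance. This is not an SEM estimator, only an illustration of iterative likelihood maximization; the data, step sizes, and function names are made up.

```python
import math

def log_likelihood(mu, data, sigma=1.0):
    """Normal log-likelihood of the data for a candidate mean mu (sigma assumed known)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )

def fit_mean_by_search(data, start=0.0, step=1.0, tol=1e-6):
    """Try nearby values, keep the one with the highest likelihood, shrink the step when stuck."""
    mu = start
    while step > tol:
        # Current value is listed first so that ties keep us in place
        best = max((mu, mu - step, mu + step), key=lambda m: log_likelihood(m, data))
        if best == mu:
            step /= 2        # no neighbour is better: search more finely
        else:
            mu = best        # move to the better candidate
    return mu

# Hypothetical data
data = [4.2, 5.1, 3.8, 4.9, 5.0]
mu_hat = fit_mean_by_search(data)
sample_mean = sum(data) / len(data)
```

The search lands on the sample mean, which is also the value that minimizes the squared error: minimizing error and maximizing likelihood coincide here.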

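Problem When Estimating SEM
• We can all solve [equation not preserved in source]
• But most of us can’t solve [equation not preserved in source]
• In SEM we call this under-identification: more unknown parameters than known pieces of information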
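A quick check for under-identification is the counting rule: compare the number of unique variances and covariances among the observed variables with the number of free parameters. The sketch below uses hypothetical function names, and the rule is a necessary condition only; a model can pass the count and still fail to be identified.

```python
def degrees_of_freedom(n_observed, n_free_params):
    """Known pieces of information minus free parameters."""
    knowns = n_observed * (n_observed + 1) // 2   # unique variances and covariances
    return knowns - n_free_params

def identification_status(n_observed, n_free_params):
    """Classify a model by the counting rule alone (necessary, not sufficient)."""
    df = degrees_of_freedom(n_observed, n_free_params)
    if df < 0:
        return "under-identified"
    if df == 0:
        return "just-identified"
    return "over-identified"

# One-factor models with the factor variance fixed to 1:
# free parameters = one loading + one error variance per indicator
two_indicators = identification_status(2, 4)     # 3 knowns, 4 unknowns
three_indicators = identification_status(3, 6)   # 6 knowns, 6 unknowns
four_indicators = identification_status(4, 8)    # 10 knowns, 8 unknowns
```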

Model Fit
• Remember, we are just drawing the representation of a bunch of equations
• You can ask the question, at the end, how well these equations recreate the correlation matrix
• The test of this is the Chi-Squared test for Goodness of Fit (surprise, surprise!)
• As a rule of thumb, this stat is only trustworthy when the sample size is between about 50 and 200; with larger samples it tends to reject even models with trivial misfit
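The sample-size sensitivity follows from how the statistic is commonly computed: the minimized fit function multiplied by N − 1 (some programs multiply by N instead). A minimal sketch, with a made-up fit-function value:

```python
def model_chi_square(f_min, n):
    """Chi-square test statistic from the minimized fit function and sample size."""
    return (n - 1) * f_min

# The same amount of misfit (hypothetical f_min = 0.05) at two sample sizes
chi2_small_sample = model_chi_square(0.05, 101)
chi2_large_sample = model_chi_square(0.05, 2001)
```

With identical misfit, the statistic is twenty times larger in the bigger sample, so a model that "fits" at N = 101 can be rejected at N = 2001.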

Other Fit Indices
• RMSEA: Root Mean Square Error of Approximation
• This works well for larger sample sizes.
• It is a parsimony-adjusted measure (favors simpler models, punishes unneeded parameters)
• Values lower than .10 are considered adequate, lower than .05 considered good
• CFI: Comparative Fit Index (values closer to 1 are better)
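Both indices can be computed from chi-square values and degrees of freedom. The sketch below uses the common textbook formulas with N − 1 in the RMSEA denominator (software packages differ slightly on this); the input numbers are made up, and the "baseline" model is the usual independence model in which all variables are uncorrelated.

```python
import math

def rmsea(chi2, df, n):
    """Root Mean Square Error of Approximation; misfit per degree of freedom."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative Fit Index: improvement of the model over the baseline model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    if d_base == 0.0:
        return 1.0
    return 1.0 - d_model / d_base

# Hypothetical results: model chi-square 30 on 20 df, baseline 530 on 30 df, N = 201
example_rmsea = rmsea(30.0, 20, 201)
example_cfi = cfi(30.0, 20, 530.0, 30)
```

Dividing by the degrees of freedom is what makes RMSEA favor simpler models: the same excess chi-square counts for less when the model has more df.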

Assumptions
• Now in Maximum Likelihood World
• Iterative method
• Multivariate normality
• Much stronger assumption than regression
• Estimating population coefficients: not a good small-sample method (some newer methods of estimation may solve this, though)
• Same problems with multicollinearity (results in a mathematical error)
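The "mathematical error" from multicollinearity can be seen in the smallest possible case, a 2×2 correlation matrix. Estimation needs the inverse of such matrices, and when two variables are perfectly correlated the determinant is zero and no inverse exists. A minimal sketch with hypothetical function names:

```python
def correlation_determinant(r):
    """Determinant of the 2x2 correlation matrix [[1, r], [r, 1]]."""
    return 1.0 - r * r

def invert_correlation_matrix(r):
    """Inverse of [[1, r], [r, 1]]; fails when the variables are perfectly collinear."""
    det = correlation_determinant(r)
    if det == 0.0:
        raise ValueError("singular matrix: the two variables are perfectly collinear")
    return [[1.0 / det, -r / det], [-r / det, 1.0 / det]]

moderate = invert_correlation_matrix(0.5)      # fine
near_collinear = correlation_determinant(0.999)  # tiny determinant, unstable inverse
```

Correlations close to, but not exactly, 1 do not raise an error here, but the tiny determinant makes the inverse blow up, which is where unstable estimates come from.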

On To More Exciting Models!
• Remember our good friend ANCOVA?
• Analysis of Covariance: Linear Dependent Variable, Categorical Independent Variable (this guy is 1 for treatment, 0 otherwise) with a linear covariate
• Common form: Did treatment condition predict post-test after controlling for pre-test?
• More powerful than repeated measures t-test (assumes some measurement error).
• Just another incarnation of our even better friend, the general linear model! (Like many popular statistics!)
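Written out as a general linear model, the common pre/post form of ANCOVA is a regression of post-test on pre-test and a treatment dummy. The subscript i (one row per person) is added here for clarity:

```latex
\text{Post}_i = \beta_0 + \beta_1\,\text{Pre}_i + \beta_2\,\text{Treatment}_i + e_i,
\qquad \text{Treatment}_i \in \{0, 1\}
```

Here β2 is the treatment effect after controlling for pre-test: the expected difference in post-test between treated and untreated people with the same pre-test score.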

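Let’s Draw an ANCOVA using Path Analysis
[Path diagram: Pre → Post (β1); Treatment (1,0) → Post (β2); intercept β0; error e → Post with its path fixed to 1; Pre ↔ Treatment correlated (r)]
• Major Problem: Error!
• How can we judge effects when 20+ percent of our measures are error variance?
• Attenuation AND Inflation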
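The attenuation half of the problem can be quantified with the classical correction-for-attenuation formula: the correlation you observe is the true correlation shrunk by the square root of the product of the two reliabilities. The sketch below uses a made-up true correlation and a reliability of .80, matching the "20 percent error variance" case; it does not cover inflation, which arises when error in a covariate leaks into the estimates of other effects.

```python
import math

def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Observed correlation implied by a true correlation and two reliabilities."""
    return true_r * math.sqrt(reliability_x * reliability_y)

def disattenuated_correlation(observed_r, reliability_x, reliability_y):
    """Estimated true correlation after correcting for unreliability."""
    return observed_r / math.sqrt(reliability_x * reliability_y)

# Hypothetical: true correlation .50, both measures 80% reliable
observed = attenuated_correlation(0.50, 0.80, 0.80)
recovered = disattenuated_correlation(observed, 0.80, 0.80)
```

A true correlation of .50 shows up as roughly .40 when each measure carries 20 percent error variance.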

Solution?
• What if I had all of the answers to all of the items making up the scales of my pre- and post-test? Or multiple subscales making up a larger scale?
• ANCOVA becomes something more exciting:
[Path diagram: Latent Pre and Latent Post, each measured by indicators I1, I2, I3; Treatment (0,1)]

MIMIC Models
• These are called MIMIC models, or Multiple Indicator, Multiple Cause models
• Our treatment effect is (theoretically) not attenuated by error
[Path diagram: Latent Pre and Latent Post, each measured by indicators I1, I2, I3; Treatment (0,1)]

More MIMIC
• These can get more interesting than an ANCOVA
[Path diagram: Latent Pre, Latent Post, 2nd Latent Pre, and 2nd Latent Post, each measured by indicators I1, I2, I3; Treatment (0,1)]
• This model is arguing that the treatment only changes the first latent variable, while the second latent variable is just influenced by the first

Latent Mediation
[Path diagram: latent variables A, B, and C, each measured by indicators I1, I2, I3]

One More Interesting Model
• Latent Mediation Models
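In a mediation model where B mediates the relationship between A and C, the mediated (indirect) effect is usually summarized as the product of the A → B path and the B → C path. The sketch below computes that product and a Sobel-type standard error; the path estimates and standard errors are made up, and the function names are hypothetical. Bootstrapped intervals are often preferred in practice, so treat this as an illustration of the logic only.

```python
import math

def indirect_effect(a, b):
    """Mediated effect: path A -> B times path B -> C."""
    return a * b

def sobel_standard_error(a, se_a, b, se_b):
    """First-order (Sobel) standard error of the product a*b."""
    return math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)

# Hypothetical path estimates and standard errors
a, se_a = 0.5, 0.1   # A -> B
b, se_b = 0.4, 0.1   # B -> C
ab = indirect_effect(a, b)
se_ab = sobel_standard_error(a, se_a, b, se_b)
z = ab / se_ab       # compare against a standard normal distribution
```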
