Topic Outline

• Motivation
• Representing/Modeling Causal Systems
• Estimation and Updating
• Model Search
• Linear Latent Variable Models
• Case Study: fMRI

Discovering Pure Measurement Models
Richard Scheines, Carnegie Mellon University
Ricardo Silva*, University College London
Clark Glymour and Peter Spirtes, Carnegie Mellon University

Outline
• Measurement Models & Causal Inference
• Strategies for Finding a Pure Measurement Model
  • Purify
  • MIMbuild
  • Build Pure Clusters
• Examples
  • Religious Coping
  • Test Anxiety

Goals:
• What Latents are out there?
• Causal Relationships Among Latent Constructs
[Figure: alternative causal structures relating Relationship Satisfaction and Depression]

Needed: Ability to detect conditional independence among latent variables

Lead and IQ
[Figure: Parental Resources, Lead Exposure, IQ, with error terms e2 and e3]
PR ~ N(m = 10, s = 3)
Lead = 15 - .5*PR + e2, with e2 ~ N(m = 0, s = 1.635)
IQ = 90 + 1*PR + e3, with e3 ~ N(m = 0, s = 15)
Lead _||_ IQ | PR

Pseudorandom sample: N = 2,000
Regression of IQ on Lead, PR
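The simulation on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "s" in the model denotes a standard deviation, and the seed and the helper name `ols` are hypothetical choices made here.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed (assumption)
n = 2000

# Structural equations from the "Lead and IQ" slide, reading s as a standard deviation.
pr = rng.normal(10.0, 3.0, n)
lead = 15.0 - 0.5 * pr + rng.normal(0.0, 1.635, n)
iq = 90.0 + 1.0 * pr + rng.normal(0.0, 15.0, n)

def ols(y, *xs):
    """Least-squares coefficients (intercept first) and their standard errors."""
    X = np.column_stack([np.ones_like(y), *xs])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

beta_marg, se_marg = ols(iq, lead)      # IQ on Lead only: confounded by PR
beta_cond, se_cond = ols(iq, lead, pr)  # IQ on Lead and PR: Lead is screened off

print("Lead coefficient, IQ ~ Lead:     ", beta_marg[1], "+/-", se_marg[1])
print("Lead coefficient, IQ ~ Lead + PR:", beta_cond[1], "+/-", se_cond[1])
```

Without PR in the regression the Lead coefficient should come out clearly negative; with PR included it should be statistically indistinguishable from zero, which is what Lead _||_ IQ | PR predicts.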

Measuring the Confounder
[Figure: X1, X2, X3 measure Parental Resources, with error terms e1, e2, e3; Parental Resources, Lead Exposure, IQ]
X1 = g1*Parental Resources + e1
X2 = g2*Parental Resources + e2
X3 = g3*Parental Resources + e3
PR_Scale = (X1 + X2 + X3) / 3

Scales Don't Preserve Conditional Independence
[Figure: X1, X2, X3, Parental Resources, Lead Exposure, IQ]
PR_Scale = (X1 + X2 + X3) / 3

Indicators Don't Preserve Conditional Independence
[Figure: X1, X2, X3, Parental Resources, Lead Exposure, IQ]
Regress IQ on: Lead, X1, X2, X3
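A rough numerical illustration of the last two slides is sketched below. The loadings g1 = g2 = g3 = 1, the indicator noise standard deviation of 3, the seed, and the large sample size are all assumptions made here so the population effect is easy to see; the slides do not give these values.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed (assumption)
n = 200_000                     # large n so sampling noise does not hide the effect

pr = rng.normal(10.0, 3.0, n)
lead = 15.0 - 0.5 * pr + rng.normal(0.0, 1.635, n)
iq = 90.0 + 1.0 * pr + rng.normal(0.0, 15.0, n)

# Noisy indicators of Parental Resources (loadings and noise are hypothetical).
g = (1.0, 1.0, 1.0)
indicator_noise_sd = 3.0
x1, x2, x3 = (gi * pr + rng.normal(0.0, indicator_noise_sd, n) for gi in g)
pr_scale = (x1 + x2 + x3) / 3.0

def lead_coefficient(*controls):
    """Coefficient on Lead when IQ is regressed on Lead plus the given controls."""
    X = np.column_stack([np.ones(n), lead, *controls])
    beta, *_ = np.linalg.lstsq(X, iq, rcond=None)
    return beta[1]

b_latent = lead_coefficient(pr)         # conditioning on the true confounder
b_scale = lead_coefficient(pr_scale)    # conditioning on the averaged scale
b_items = lead_coefficient(x1, x2, x3)  # conditioning on the raw indicators

print(b_latent, b_scale, b_items)
```

Under these assumed values only the regression that conditions on PR itself should drive the Lead coefficient to about zero; the scale and the indicators carry measurement error, so a spurious negative Lead coefficient remains.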

Structural Equation Models Work
[Figure: X1, X2, X3, Parental Resources, Lead Exposure, IQ; coefficient b]
• Structural Equation Model (p-value = .499)
• Lead and IQ “screened off” by PR

Local Independence / Pure Measurement Models
• For every measured item xi and every other measured item xj:
• xi _||_ xj | latent parent of xi

Local Independence Desirable

Correct Specification Crucial

Strategies
• Find a Locally Independent Measurement Model
• Correctly specify the MM, including deviations from Local Independence

Correctly Specify Deviations from Local Independence

Correctly Specifying Deviations from Local Independence is Often Very Hard

Finding Pure Measurement Models - Much Easier

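Tetrad Constraints
• Fact: given a graph with this structure
[Figure: a single latent L with measured children W, X, Y, Z, loadings $\lambda_1, \ldots, \lambda_4$ and error terms $\varepsilon_1, \ldots, \varepsilon_4$]
$W = \lambda_1 L + \varepsilon_1$
$X = \lambda_2 L + \varepsilon_2$
$Y = \lambda_3 L + \varepsilon_3$
$Z = \lambda_4 L + \varepsilon_4$
• it follows that
$\mathrm{Cov}_{WX}\,\mathrm{Cov}_{YZ} = (\lambda_1 \lambda_2 \sigma^2_L)(\lambda_3 \lambda_4 \sigma^2_L) = (\lambda_1 \lambda_3 \sigma^2_L)(\lambda_2 \lambda_4 \sigma^2_L) = \mathrm{Cov}_{WY}\,\mathrm{Cov}_{XZ}$
• tetrad constraints: $\rho_{WX}\,\rho_{YZ} = \rho_{WY}\,\rho_{XZ} = \rho_{WZ}\,\rho_{XY}$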
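The tetrad constraints for a single-latent model can be checked numerically from the implied covariance matrix. The sketch below uses hypothetical loadings and variances chosen only for illustration; any values give the same equalities.

```python
import numpy as np

# Hypothetical parameters for W, X, Y, Z, each a linear function of one latent L.
loadings = np.array([0.9, 0.7, 1.2, 0.5])   # one loading per measured variable
var_L = 2.0                                  # variance of the latent
noise_var = np.array([1.0, 0.5, 0.8, 1.5])   # independent error variances

# Implied covariance: var_L * loadings loadings^T off the diagonal, plus noise on it.
cov = var_L * np.outer(loadings, loadings) + np.diag(noise_var)

W, X, Y, Z = range(4)
t_wx_yz = cov[W, X] * cov[Y, Z]
t_wy_xz = cov[W, Y] * cov[X, Z]
t_wz_xy = cov[W, Z] * cov[X, Y]

# All three products equal the product of the four loadings times var_L squared.
print(t_wx_yz, t_wy_xz, t_wz_xy)
```

Each product depends only on the four loadings and the latent variance, which is why the three are equal regardless of the error variances.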

Early Progenitors
Charles Spearman (1904)
Statistical Constraints → Measurement Model Structure
[Figure: general factor g with measures m1, m2, r1, r2]
rm1 * rr1 = rm2 * rr2

Impurities/Deviations from Local Independence defeat tetrad constraints selectively
• r(x1,x2) * r(x3,x4) = r(x1,x3) * r(x2,x4)
• r(x1,x2) * r(x3,x4) = r(x1,x4) * r(x2,x3)
• r(x1,x3) * r(x2,x4) = r(x1,x4) * r(x2,x3)
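To see what "selectively" means, consider one hypothetical impurity: correlated errors between x1 and x2 in an otherwise pure single-latent model. The parameter values below are assumptions for illustration only. The impurity changes cov(x1, x2) alone, so only the tetrads that involve that covariance are defeated.

```python
import numpy as np

# Hypothetical pure model: x1..x4 load on one latent with unit variance.
loadings = np.array([0.8, 0.6, 0.9, 0.7])
noise_var = np.array([1.0, 1.0, 1.0, 1.0])
cov = np.outer(loadings, loadings) + np.diag(noise_var)

# Hypothetical impurity: the errors of x1 and x2 are correlated.
cov[0, 1] += 0.3
cov[1, 0] += 0.3

p12_34 = cov[0, 1] * cov[2, 3]
p13_24 = cov[0, 2] * cov[1, 3]
p14_23 = cov[0, 3] * cov[1, 2]

print("12*34 == 13*24 ?", np.isclose(p12_34, p13_24))  # defeated by the impurity
print("12*34 == 14*23 ?", np.isclose(p12_34, p14_23))  # defeated by the impurity
print("13*24 == 14*23 ?", np.isclose(p13_24, p14_23))  # still holds
```

The covariance form is used here for simplicity; the same pattern holds for the correlation form on the slide, because the variance terms cancel on both sides of each constraint.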

Purify True Model Initially Specified Measurement Model

Purify
Iteratively remove the item whose removal most improves measurement model fit (tetrads or χ²); stop when confirmatory fit is acceptable.
[Figure: Remove x4, Remove z2]

Purify
Detectably Pure Subset of Items
Detectably Pure Measurement Model

Purify

How a pure measurement model is useful
1. Consistently estimate covariances/correlations among latents, then test conditional independence with the estimated latent correlations
2. Test for conditional independence among latents directly

2. Test conditional independence relations among latents directly
Question: L1 _||_ L2 | {Q1, Q2, ..., Qn}?
b21 = 0 ⇔ L1 _||_ L2 | {Q1, Q2, ..., Qn}

MIMbuild
Input:
- Purified Measurement Model
- Covariance matrix over set of pure items
MIMbuild: PC algorithm with independence tests performed directly on latent variables
Output: Equivalence class of structural models over the latent variables

Purify & MIMbuild

Goal 2: What Latents are out there? • How should they be measured?

Latents and the clustering of items they measure imply tetrad constraints differentially

Build Pure Clusters (BPC)
Input:
- Covariance matrix over set of original items
BPC:
1) Cluster (complicated boolean combinations of tetrads)
2) Purify
Output: Equivalence class of measurement models over a pure subset of original items

Build Pure Clusters

Build Pure Clusters
• Qualitative Assumptions
  • Two types of nodes: measured (M) and latent (L)
  • M ↛ L (measured don’t cause latents)
  • Each m ∈ M measures (is a direct effect of) at least one l ∈ L
  • No cycles involving M
• Quantitative Assumptions:
  • Each m ∈ M is a linear function of its parents plus noise
  • P(L) has second moments, positive variances, and no deterministic relations

Build Pure Clusters
Output - provably reliable (pointwise consistent): Equivalence class of measurement models over a pure subset of M
For example: [Figure: True Model and Output]

Build Pure Clusters
Measurement models in the equivalence class are at most refinements, but never coarsenings or permuted clusterings.
[Figure: Output]

Build Pure Clusters
• Algorithm Sketch:
  1. Use particular rank (tetrad) constraints on the measured correlations to find pairs of items mj, mk that do NOT share a single latent parent
  2. Add a latent for each subset S of M such that no pair in S was found NOT to share a latent parent in step 1.
  3. Purify
  4. Remove latents with no children
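The second step of the sketch can be illustrated with a toy example. This is not the BPC implementation: it assumes the first step has already produced a list of pairs found NOT to share a latent parent, and it reads "each subset S" as each maximal such subset. The function name `maximal_compatible_subsets` and the item names are hypothetical.

```python
def maximal_compatible_subsets(items, incompatible_pairs):
    """Maximal subsets of items containing no pair from incompatible_pairs.

    Equivalent to listing maximal cliques of the 'may share a latent' graph,
    done here with a plain Bron-Kerbosch recursion.
    """
    bad = {frozenset(pair) for pair in incompatible_pairs}
    neighbors = {
        a: {b for b in items if b != a and frozenset((a, b)) not in bad}
        for a in items
    }
    found = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            found.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & neighbors[v], excluded & neighbors[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(items), set())
    return found

# Toy input: step 1 is assumed to have separated {x1, x2, x3} from {x4, x5, x6}.
items = ["x1", "x2", "x3", "x4", "x5", "x6"]
incompatible = [(a, b) for a in items[:3] for b in items[3:]]
clusters = maximal_compatible_subsets(items, incompatible)
print(sorted(sorted(c) for c in clusters))
```

Each returned subset would receive one candidate latent; Purify and the removal of childless latents would then follow as in the remaining steps of the sketch.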

Build Pure Clusters + MIMbuild

Case Studies
• Stress, Depression, and Religion (Lee, 2004)
• Test Anxiety (Bartholomew, 2002)

Case Study: Stress, Depression, and Religion
• Masters Students (N = 127)
• 61-item survey (Likert Scale)
  • Stress: St1 - St21
  • Depression: D1 - D20
  • Religious Coping: C1 - C20
Specified Model: p = 0.00

Case Study: Stress, Depression, and Religion Build Pure Clusters

Case Study: Stress, Depression, and Religion
• Assume Stress temporally prior
• MIMbuild to find Latent Structure: p = 0.28

Case Study: Test Anxiety
Bartholomew and Knott (1999), Latent variable models and factor analysis
• 12th Grade Males in British Columbia (N = 335)
• 20-item survey (Likert Scale items): X1 - X20
Exploratory Factor Analysis:

Case Study : Test Anxiety Build Pure Clusters:

Case Study : Test Anxiety Build Pure Clusters: Exploratory Factor Analysis: p-value = 0.00 p-value = 0.47

Case Study: Test Anxiety
MIMbuild
Scales: No Independencies or Conditional Independencies
p = .43
Uninformative

Limitations
• In simulation studies, requires large sample sizes (~ 400-500) to be really reliable.
• 2 pure indicators must exist for a latent to be discovered and included
• Moderately computationally intensive (O(n^6)).
• No error probabilities.

Open Questions/Projects
• IRT models?
• Bi-factor model extensions?
• Appropriate incorporation of background knowledge

References
• Tetrad: www.phil.cmu.edu/projects/tetrad_download
• Spirtes, P., Glymour, C., Scheines, R. (2000). Causation, Prediction, and Search, 2nd Edition, MIT Press.
• Pearl, J. (2000). Causality: Models, Reasoning, and Inference, Cambridge University Press.
• Silva, R., Glymour, C., Scheines, R., and Spirtes, P. (2006). “Learning the Structure of Linear Latent Variable Models,” Journal of Machine Learning Research, 7, 191-246.
• Silva, R., Scheines, R., Glymour, C., and Spirtes, P. (2003). “Learning Measurement Models for Unobserved Variables,” in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence, U. Kjaerulff and C. Meek, eds., Morgan Kaufmann.
