Graphical Diagnostic Tools for Evaluating Latent Class Models: An Application to Depression in the ECA Study • Elizabeth S. Garrett • Department of Biostatistics, Johns Hopkins University
GOAL • Provide tools for choosing the most appropriate latent class model. • Interpret objective diagnostic methods in reference to the latent class model.
Table of Contents • Introduction • Previous Work • Model Estimation • Diagnostic Methods for Latent Class Models • Extensions to Latent Class Regression • Application to the ECA Study • Validating Diagnostic Criteria for Depression Using LCM • Discussion and Further Research
Outline • Depression in relation to the LCM • Approach to Estimation • The ECA Study • Predicted Frequency Check Plot • Latent Class Estimability Display • Interpretation of Findings • Revisions
Motivating Question How should we describe “major depression”? • not depressed, depressed • none, moderate, severe • none, mild, moderate, severe • none, mood symptoms, somatic symptoms, both
How we conceptualize “major depression” • We use indicators of symptoms such as self-reported presence of sadness, weight change, etc. • A combination of these indicators is thought to define depression. • Using these combinations, we commonly seek to categorize individuals into depression classes. • These classes represent the construct “depression.” • “Depression” is a latent variable. • The construct of “Depression” can then be used for classification, description, and prediction
Depression in the Diagnostic and Statistical Manual of Mental Disorders, 3rd Edition DSM-III Criteria (generally): A. Dysphoria for 2 or more weeks B. Reported symptoms in 4 or more of the following symptom groups: 1. loss of appetite, weight change 2. insomnia, hypersomnia 3. retarded movement, restlessness 4. disinterest in sex 5. fatigue 6. feelings of guilt or worthlessness 7. trouble concentrating, thoughts slow or mixed 8. morbid thoughts, suicidal thoughts/attempts
Latent Class Model: Main Ideas • There are M classes of depression (e.g., none, mild, severe). $\pi_m$ represents the proportion of individuals in the population in class m (m = 1,…,M). • Each person is a member of one of the M classes, but we do not know which. The latent class of individual i is denoted by $\eta_i$. • Symptom prevalences vary by class. The prevalence of symptom j in class m is denoted by $p_{mj}$. • Given class membership, the symptoms are independent.
Latent Class Model • $M$: number of classes • $p_{\eta_i}$: vector of symptom probabilities given latent class $\eta_i$ • $\pi_m$: probability of being in latent class m, m = 1,…,M • $\eta_i$: the true latent class of individual i • $Y_i$: vector of individual i’s reported symptoms
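The conditional independence assumption implies a standard mixture form for the probability of any symptom pattern. The block below writes it out, using $\pi_m$ for the class proportions and $p_{mj}$ for the class-specific symptom prevalences; the symbol $J$ (the number of symptom indicators) and the pattern notation $y = (y_1, \dots, y_J)$ are introduced here for illustration and do not appear on the slides.

```latex
P(Y_i = y) \;=\; \sum_{m=1}^{M} \pi_m \prod_{j=1}^{J} p_{mj}^{\,y_j} \, (1 - p_{mj})^{\,1 - y_j},
\qquad y_j \in \{0, 1\}.
```

Each term of the sum is the probability of belonging to class m multiplied by the probability of the pattern within that class, where the within-class probability factorizes over symptoms because of conditional independence.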
Estimation Approach Bayesian Approach: Quantify beliefs about $p$, $\pi$, and $\eta$ before and after observing data. Bayesian Terminology: • Prior Probability: What we believe about unknown parameters before observing data. • Posterior Probability: What we believe about the parameters after observing data.
Bayesian Estimation Approach We estimated the models using a Markov chain Monte Carlo (MCMC) algorithm: • Specify the prior probability distribution: $P(p, \pi, \eta)$ • Combine the prior with the likelihood to obtain the posterior distribution: $P(p, \pi, \eta \mid Y) \propto P(p, \pi, \eta) \times L(Y \mid p, \pi, \eta)$ • Estimate the posterior distribution of each parameter using an iterative procedure, e.g. $P(\pi_1 \mid Y) = \int P(p, \pi, \eta \mid Y)$, integrating over all parameters other than $\pi_1$.
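The slides do not describe the MCMC algorithm itself, so the following is only a minimal sketch of one common choice: a Gibbs sampler with conjugate priors. The uniform Dirichlet(1,…,1) prior on the class proportions, the Beta(1,1) priors on the symptom prevalences, the function names, and the simulated data are all assumptions made for illustration, not the author's implementation.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def gibbs_lcm(Y, M, n_iter=200, seed=0):
    """Hypothetical Gibbs sampler for an M-class latent class model.

    Y is a list of 0/1 symptom vectors. Priors (assumed, not from the
    slides): Dirichlet(1,...,1) on pi and Beta(1,1) on every p[m][j].
    Returns a list of (pi, p) draws, one per iteration.
    """
    rng = random.Random(seed)
    J = len(Y[0])
    # Initialise the parameters with draws from their priors.
    pi = sample_dirichlet([1.0] * M, rng)
    p = [[rng.betavariate(1.0, 1.0) for _ in range(J)] for _ in range(M)]
    draws = []
    for _ in range(n_iter):
        # Step 1: sample each class membership eta_i given pi and p.
        eta = []
        for y in Y:
            weights = []
            for m in range(M):
                w = pi[m]
                for j in range(J):
                    w *= p[m][j] if y[j] else 1.0 - p[m][j]
                weights.append(w)
            eta.append(rng.choices(range(M), weights=weights)[0])
        # Step 2: sample pi given the class counts (Dirichlet update).
        counts = [eta.count(m) for m in range(M)]
        pi = sample_dirichlet([1.0 + c for c in counts], rng)
        # Step 3: sample each p[m][j] given eta and Y (Beta update).
        for m in range(M):
            for j in range(J):
                s = sum(y[j] for y, e in zip(Y, eta) if e == m)
                p[m][j] = rng.betavariate(1.0 + s, 1.0 + counts[m] - s)
        draws.append((list(pi), [row[:] for row in p]))
    return draws

# Simulated two-class data with made-up prevalences (not ECA data).
data_rng = random.Random(1)
true_p = [[0.1] * 4, [0.8] * 4]
Y = []
for _ in range(200):
    m = 0 if data_rng.random() < 0.7 else 1
    Y.append([1 if data_rng.random() < q else 0 for q in true_p[m]])

draws = gibbs_lcm(Y, M=2, n_iter=50)
```

A histogram of the sampled values of one parameter across iterations (after discarding early iterations) approximates that parameter's marginal posterior. Class labels can switch between runs or within a run, so draws usually need to be relabeled before they are summarized.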
The Epidemiologic Catchment Area Study • 3481 community-dwelling individuals in Baltimore were interviewed using the NIMH Diagnostic Interview Schedule. • 8 self-reported symptom groups were completed for 2938 individuals*. • 6-month prevalence of symptoms was assessed. * Those with organic brain disorder were omitted as per DSM-III criterion.
Predicted Frequency Check (PFC) Plot Compare observed symptom pattern frequencies to what the model predicts for a new sample of data from the same population. Symptom patterns: • 000000000: no reported symptoms • 000000001: report dysphoria only • 111111111: report all symptoms $2^9 = 512$ possible patterns
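To make the idea of a model-predicted pattern frequency concrete, the sketch below enumerates all 512 patterns and computes the expected count of one pattern for a single set of parameter values. The class proportions follow the 84% / 14% / 2% split implied by the interpretation slide; the symptom prevalences are invented for illustration and are not ECA estimates. A full check would presumably repeat this over posterior draws rather than one fixed parameter set.

```python
from itertools import product

def pattern_probability(pattern, class_props, symptom_probs):
    """P(Y = pattern) under a latent class model with conditional independence."""
    total = 0.0
    for pi_m, p_m in zip(class_props, symptom_probs):
        prob = pi_m
        for y, p in zip(pattern, p_m):
            prob *= p if y == 1 else 1.0 - p
        total += prob
    return total

n_items = 9          # 8 symptom groups plus dysphoria
n_individuals = 2938

# Every possible 0/1 response pattern across the 9 indicators.
patterns = list(product((0, 1), repeat=n_items))

# Class proportions for none / mild / severe.
class_props = [0.84, 0.14, 0.02]
# Hypothetical prevalences, identical across symptoms within a class.
symptom_probs = [[0.05] * n_items, [0.35] * n_items, [0.80] * n_items]

# Expected count of pattern 001000001 under these assumed values.
target = tuple(int(c) for c in "001000001")
expected_count = n_individuals * pattern_probability(
    target, class_props, symptom_probs
)
```

Because the pattern probabilities sum to one over all 512 patterns, the expected counts sum to the sample size, which is a useful sanity check on any implementation.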
Example: Pattern 001000001: • restlessness/retarded movement • dysphoria We observed 24 individuals with this symptom pattern.
Example: 95% confidence interval for frequency? Non-parametric (saturated model) estimate:
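The slides do not say which interval method was used for the saturated-model estimate. As one possible illustration, the sketch below computes a Wilson score interval for the proportion 24/2938 and rescales it to a frequency; the choice of the Wilson method and the function name are assumptions.

```python
import math

def wilson_interval(count, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z defaults to 95%)."""
    phat = count / n
    denom = 1.0 + z * z / n
    centre = (phat + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(phat * (1.0 - phat) / n + z * z / (4.0 * n * n)) / denom
    return centre - half, centre + half

n_total = 2938   # individuals with complete symptom data
observed = 24    # individuals reporting pattern 001000001

prop_lo, prop_hi = wilson_interval(observed, n_total)
# Rescale the proportion bounds to expected frequencies out of n_total.
freq_lo, freq_hi = prop_lo * n_total, prop_hi * n_total
```

This interval uses no model structure at all: it treats the pattern as a single binomial outcome, which is what makes it a useful reference against a model-based prediction interval.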
Model-Based Estimation Predicted frequency of pattern 001000001 and prediction interval in the 3-class model. [Figure: predicted frequency with 2.5% and 97.5% prediction limits]
Model-Based Estimation Comparison of the model-based prediction interval to the empirical confidence interval. [Figure: observed frequency shown against the 2.5% and 97.5% limits]
Latent Class Estimability Display (LCED) Is there enough data to estimate all of the parameters in the model? • 2-class model: 19 parameters • 3-class model: 29 parameters • 4-class model: 39 parameters Problems arise with: • a small data set • a small class size, e.g. N = 1000 and class size = 0.01 ⇒ only 10 individuals in the class from which to estimate symptom prevalences • a small data set and a small class size
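The parameter counts above follow from 9 symptom prevalences per class plus one free class proportion for each class but the last (the proportions sum to one). A short sketch, with a hypothetical function name, reproduces the arithmetic:

```python
def lcm_parameter_count(n_classes, n_items=9):
    """Free parameters in a latent class model with binary indicators.

    n_classes * n_items symptom prevalences, plus n_classes - 1 free
    class proportions (the last is fixed by the sum-to-one constraint).
    """
    return n_classes * n_items + (n_classes - 1)

counts = {m: lcm_parameter_count(m) for m in (2, 3, 4)}
# counts == {2: 19, 3: 29, 4: 39}

# Expected number of individuals in a class of proportion 0.01 when N = 1000.
expected_in_class = round(1000 * 0.01)  # 10
```

Each additional class adds 10 parameters, while the data available for the smallest class can shrink to a handful of individuals, which is why estimability becomes a concern as M grows.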
Weak “Identifiability” (Weak Estimability) Definition: A parameter in a (Bayesian) model is weakly identified if the posterior distribution of the parameter is approximately the same as the prior: $P(\pi_1) \approx P(\pi_1 \mid Y)$. If a model is weakly identified it is still “valid”, but we cannot make inferences from the data about the weakly identified parameters.
Interpretation • Depression appears to be ‘dimensional’: none, mild, severe • 2% of the population is in the severe class • 14% is in the mild class: are they depressed or not? • How does this compare to the DSM-III definition?
Work Not Included in Talk • MCMC Algorithm • Log Odds Ratio Check Plot • Predicted Class Assignment Display • Extensions to Regression
Revisions Already Implemented • New example for Chapter 5 (LCRR) • Background/justification of latent class model as “gold standard” in validation • S-PLUS programs: on website with a “user’s guide”