Incorporating Nonresponse in a Markov Latent Class Measurement Error Model of Consumer Expenditure. Brian Meekins and Clyde Tucker (Bureau of Labor Statistics); Paul Biemer (RTI International).
Any opinions expressed in this paper are those of the authors and do not constitute policy of the Bureau of Labor Statistics.
Markov Latent Class Analysis • Uses repeated measurements from panel survey data to estimate classification error • Used successfully in the evaluation of labor force data (e.g., Biemer & Bushery, 2001; Tucker, Biemer, & Meekins, 2012) • Does not require external validation data; error estimates come directly from the panel data • Applied to reports of expenditures in the CEIS
U.S. Consumer Expenditure Interview Survey (CEIS) • ~6,000 consumer units (CUs) per year • CUs interviewed every 3 months about the prior 3 months' expenditures • 4 consecutive interviews for each CU • New data: N = 23,719 • From 2005.2 to 2009.2 • Unweighted analysis
1st Order Markov Model [Diagram: W → X → Y → Z; each state depends only on the state at the preceding interview]
Definition of Indicator Variables
For Interview 1, define A = 1 if the CU is reported as a purchaser for the quarter, and A = 2 if it is reported as a non-purchaser. B, C, and D are defined similarly for the 2nd, 3rd, and 4th interviews.
2nd Order Markov Model [Diagram: W, X, Y, Z; each state depends on the states at the two preceding interviews]
Mover-Stayer Model [Diagram: latent class M with W, X, Y, Z]
M = 1: P(W=1) = P(X=1) = P(Y=1) = P(Z=1) = 1
M = 2: P(W=1) = P(X=1) = P(Y=1) = P(Z=1) = 0
M = 3: P(W), P(X), P(Y), and P(Z) are unconstrained.
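The mover-stayer structure can be illustrated with a minimal Python sketch. It assumes that the unconstrained class (M = 3) follows a first-order Markov chain, which the slide does not state, and every parameter value below is hypothetical rather than an estimate from the CEIS data.

```python
from itertools import product

def markov_path_prob(path, initial, transition):
    """P(W, X, Y, Z) under a first-order Markov chain; states are 1 or 2."""
    prob = initial[path[0]]
    for prev, curr in zip(path, path[1:]):
        prob *= transition[prev][curr]
    return prob

def mover_stayer_path_prob(path, class_probs, initial, transition):
    """Mix two stayer classes (M = 1, M = 2) with a mover class (M = 3)."""
    always = 1.0 if all(s == 1 for s in path) else 0.0    # M = 1: purchaser every quarter
    never = 1.0 if all(s == 2 for s in path) else 0.0     # M = 2: never a purchaser
    mover = markov_path_prob(path, initial, transition)   # M = 3: assumed Markov chain
    return (class_probs[1] * always
            + class_probs[2] * never
            + class_probs[3] * mover)

# Hypothetical parameter values, for illustration only.
class_probs = {1: 0.2, 2: 0.3, 3: 0.5}
initial = {1: 0.6, 2: 0.4}
transition = {1: {1: 0.7, 2: 0.3}, 2: {1: 0.4, 2: 0.6}}

# Probabilities over all 16 possible purchase patterns should sum to 1.
total = sum(
    mover_stayer_path_prob(p, class_probs, initial, transition)
    for p in product((1, 2), repeat=4)
)
```

The stayer classes put all of their mass on the patterns (1, 1, 1, 1) and (2, 2, 2, 2), so only movers contribute to mixed patterns.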
Measurement Error Model [Diagram: latent W, X, Y, Z; "=" constraints; indicators A, B, C, D]
Markov Latent Class Model Notation
True expenditure status is a latent variable.
Interview      Latent Var.   Indicator Var.
Interview 1    W             A
Interview 2    X             B
Interview 3    Y             C
Interview 4    Z             D
Definition of Latent Variables
W = 1 if there were one or more purchases of an item during the quarter ("purchaser"), and W = 2 if there was no purchase ("non-purchaser"). X, Y, and Z are defined similarly for the 2nd, 3rd, and 4th interviews.
Model Assumptions • Markov or Mover-Stayer model assumptions • Equal measurement error across all interviews • No False Positives
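Under these assumptions, the probability of an observed report pattern can be sketched by summing over every possible true purchase pattern. The sketch below assumes a first-order Markov chain for the latent states; the function names and all parameter values are hypothetical and chosen only for illustration.

```python
from itertools import product

def emission(report, truth, accuracy):
    """P(report | truth), with equal error at every interview and no false positives."""
    if truth == 1:
        return accuracy if report == 1 else 1.0 - accuracy
    # A true non-purchaser never reports a purchase.
    return 1.0 if report == 2 else 0.0

def observed_pattern_prob(reports, initial, transition, accuracy):
    """P(A, B, C, D): sum the joint probability over all latent paths."""
    total = 0.0
    for truth in product((1, 2), repeat=len(reports)):
        p = initial[truth[0]]
        for prev, curr in zip(truth, truth[1:]):
            p *= transition[prev][curr]
        for r, t in zip(reports, truth):
            p *= emission(r, t, accuracy)
        total += p
    return total

# Hypothetical parameter values, for illustration only.
initial = {1: 0.6, 2: 0.4}
transition = {1: {1: 0.7, 2: 0.3}, 2: {1: 0.4, 2: 0.6}}
accuracy = 0.8

p_all_reported = observed_pattern_prob((1, 1, 1, 1), initial, transition, accuracy)
```

Because false positives are ruled out, a pattern of four reported purchases can only arise from four true purchases, so its probability reduces to a single term.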
Model Selection • Limitations of LEM forced separate estimation of two pieces of the full model: measurement error (ME) and nonresponse (NR) • Multiple iterations were run to avoid local maxima • The best refusal, noncontact, and measurement error variables were selected based on fit in these component models • The combined model was constructed using the best variables
Estimates: ME Model • No covariates: P(A=1|W=1) = P(B=1|X=1) = P(C=1|Y=1) = P(D=1|Z=1) = 0.763
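A short worked sketch of what this estimate implies, assuming the no-false-positives assumption holds so that P(A=1) = 0.763 × P(W=1). The 0.763 figure is from the slide; the helper names and any observed rate passed in are hypothetical.

```python
ACCURACY = 0.763  # P(report purchase | true purchaser), from the slide

def miss_rate(accuracy=ACCURACY):
    """Share of true purchasers who fail to report the purchase."""
    return 1.0 - accuracy

def implied_true_rate(observed_rate, accuracy=ACCURACY):
    """Back out P(W=1) from P(A=1), assuming no false positives."""
    return observed_rate / accuracy

def implied_observed_rate(true_rate, accuracy=ACCURACY):
    """P(A=1) implied by a given true purchase rate."""
    return true_rate * accuracy
```

For example, `implied_true_rate(0.40)` scales a hypothetical 40% reported purchase rate up by 1/0.763.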
Measurement Error Model [Diagram: latent W, X, Y, Z; "=" constraints; indicators A, B, C, D; measurement error indicators]
Measurement Error Model [Diagram: latent W, X, Y, Z; indicators A, B, C, D; measurement error indicators]
ME Variables • Missing income data (CU) • Type and frequency of records used (interview) • Length of interview (interview) • Ratio of expenditures in the last month to the entire quarter (interview) • Combination of record type and interview length (interview) • Number of expenditure questions imputed/allocated (interview) • Completion mode (interview) • Family size (CU)
Nonresponse Error Model [Diagram: variables E, F, G, H; M (1 = Reports All Qtr, 2 = Reports Some, 3 = Reports None); noncontact covariates; refusal covariates]
Nonresponse Error Model [Diagram: variables E, F, G, H; M = latent likelihood of interview, with classes 1, ..., c; noncontact covariates; refusal covariates]
Noncontact Variables • Number of noncontact problems reported in CHI (interview) • Age (CU) • Owner/Renter (CU) • Urbanicity (CU)
Refusal Variables • Factor1 variables – Reluctance privacy concerns (CU) • Factor2 variables – Reluctance time concerns (CU) • Any reluctance mentioned (CU) • Conversion Refusal – only wave 1 available? (CU) • Region (CU)
Combined Model [Diagram: latent W, X, Y, Z; "=" constraints; indicators A, B, C, D; measurement error indicators; variables E, F, G, H; M (1 = Reports All Qtr, 2 = Reports Some, 3 = Reports None); noncontact covariates; refusal covariates]
Missing Reports • Of the estimated 1,939 missed reports: • 1,721 resulted from item nonresponse • 218 resulted from unit nonresponse • Reminder: nonresponse is conditioned on first-wave response
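The split of missed reports can be expressed as shares with a few lines of arithmetic. The counts are taken from the slide; the variable names are illustrative.

```python
# Counts of estimated missed reports, as given on the slide.
item_nonresponse = 1721
unit_nonresponse = 218
total_missed = item_nonresponse + unit_nonresponse  # 1,939

# Share of missed reports attributed to each source.
item_share = item_nonresponse / total_missed
unit_share = unit_nonresponse / total_missed

print(f"Item nonresponse: {item_share:.1%}")
print(f"Unit nonresponse: {unit_share:.1%}")
```

Roughly nine in ten of the estimated missed reports are attributed to item nonresponse rather than unit nonresponse.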
Discussion • Can be conducted for each commodity category • Models need further specification & adjustments, thorough evaluation of covariates • Estimates can be used to inform: • Allocation of resources • Adjustment: relationship of NR and ME • Fatigue, conditioning, etc.
Brian Meekins, Office of Survey Methods Research, U.S. Bureau of Labor Statistics, 202-691-7594, meekins.brian@bls.gov
Objective Diagnostics • Fit Statistics • L-square • Dissimilarity Index • BIC
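As a rough sketch, the three fit statistics can be computed from observed and model-expected cell counts using their common textbook definitions. The slides do not give the exact formulas used, so the definitions below (including the L-square-based form of BIC) are assumptions, and the function names are hypothetical.

```python
import math

def l_squared(observed, expected):
    """Likelihood-ratio chi-square: 2 * sum(obs * ln(obs / exp)); empty cells add 0."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

def dissimilarity_index(observed, expected):
    """Share of cases that would have to change cells for a perfect fit."""
    n = sum(observed)
    return sum(abs(o - e) for o, e in zip(observed, expected)) / (2.0 * n)

def bic_from_l_squared(l2, df, n):
    """BIC in its L-square form: L2 - df * ln(N); lower values are preferred."""
    return l2 - df * math.log(n)

# Hypothetical cell counts, for illustration only.
observed = [50, 30, 20]
expected = [40.0, 40.0, 20.0]

l2 = l_squared(observed, expected)
di = dissimilarity_index(observed, expected)
bic = bic_from_l_squared(l2, df=1, n=sum(observed))
```

A dissimilarity index of 0.1 here means that 10% of cases would need to move to another cell for the expected counts to match the observed ones.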
Subjective Diagnostics • True purchase pattern given estimated purchase pattern • Mover-stayer classification by purchase pattern • Accuracy rates by subgroup • Accuracy is the percent of true purchasers that reported purchasing that commodity
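The first diagnostic, the true purchase pattern given the reported pattern, can be sketched with Bayes' rule over all latent paths. This sketch assumes a first-order Markov chain for true status, equal accuracy at every interview, and no false positives; the function names and parameter values are hypothetical.

```python
from itertools import product

def joint_prob(truth, reports, initial, transition, accuracy):
    """P(true pattern, reported pattern) under the stated assumptions."""
    p = initial[truth[0]]
    for prev, curr in zip(truth, truth[1:]):
        p *= transition[prev][curr]
    for t, r in zip(truth, reports):
        if t == 1:
            p *= accuracy if r == 1 else 1.0 - accuracy
        else:
            p *= 1.0 if r == 2 else 0.0  # no false positives
    return p

def posterior_true_patterns(reports, initial, transition, accuracy):
    """P(true pattern | reported pattern), keeping only patterns with nonzero mass."""
    joint = {
        t: joint_prob(t, reports, initial, transition, accuracy)
        for t in product((1, 2), repeat=len(reports))
    }
    total = sum(joint.values())
    return {t: p / total for t, p in joint.items() if p > 0}

# Hypothetical parameter values, for illustration only.
initial = {1: 0.6, 2: 0.4}
transition = {1: {1: 0.7, 2: 0.3}, 2: {1: 0.4, 2: 0.6}}

posterior = posterior_true_patterns((1, 2, 1, 2), initial, transition, accuracy=0.8)
```

With no false positives, every true pattern in the posterior must show a purchase wherever one was reported, so uncertainty remains only for the quarters reported as non-purchases.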