Estimating the Level of Underreporting of Expenditures among Expenditure Reporters: A Further Micro-Level Latent Class Analysis.
DISCLAIMER: The views expressed in this presentation are those of the authors and do not necessarily represent the views of the Bureau of Labor Statistics or the Department of Labor.
Outline • Background • Research Goals • Past Approach • Past Results • Current Approach • Results • Future Research
U.S. Consumer Expenditure (CE) Interview Survey • ~ 6,000 households/year • Interviewed every 3 months about the prior 3 months' expenditures • 5 consecutive interviews for each household • 6 years of CE data: 1996 – 2002
Research Goals This multi-year project has three goals: • Identify patterns of underreporting of expenditures in different commodities • Identify the characteristics of respondents contributing most to the underreports • Use the knowledge gained to design new procedures for overcoming underreporting
Phases of Research • Phase 1 (2003) • Markov LCA on macro level data • Non-reporters only • Phase 2 (2004) • (Ordered) LCA on micro level data for total consumer expenditures • Reporters only • Phase 2 (current) • (Ordered) LCA on micro level data for separate commodity expenditures • Examination of possible causal linkages to respondent characteristics • Reporters only • Phase 3 (future) • Combine the macro and micro analyses • All sample • Produce overall estimates of underreporting by category and respondent characteristics
Phase 1: Approach • Used 4 consecutive CE interviews • Screening question: “Since the 1st of (month, 3 months ago), have you (or any members of your household) had any expenses for __________?” • Used 1st and 2nd order Markov LCA to fit models to the dichotomous responses to the screening question • Explored effect on underreporting of: • family size, income, age, family type, gender, education, record use, interview length
Phase 1: Design • Obtained estimates of false negative probability i.e. P(no purchase reported | made a purchase) • Produced estimates for each commodity of: • True proportion of “purchasers” • Accuracy rate i.e. P(report a purchase | truly made a purchase) • Used these estimates to examine relationships between demographic variables and probability of accurate reporting
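The quantities on this slide are linked by simple probability rules, sketched below. All numbers are hypothetical placeholders, not estimates from the CE analysis, and the sketch assumes respondents who made no purchase never report one (no false positives), which the slides do not state.

```python
# Hypothetical values for illustration only; not estimates from the CE analysis.
true_purchase_rate = 0.40   # assumed true proportion of "purchasers"
accuracy = 0.85             # assumed P(report a purchase | truly made a purchase)
false_positive = 0.0        # assumption: non-purchasers never report a purchase

# The false negative probability is the complement of the accuracy rate,
# because both are conditional on a purchase truly having been made.
false_negative = 1.0 - accuracy

# Expected share of respondents answering "yes" to the screening question.
observed_report_rate = (
    true_purchase_rate * accuracy
    + (1.0 - true_purchase_rate) * false_positive
)

# Share of all respondents who made a purchase but did not report it.
underreporting_share = true_purchase_rate * false_negative

print(false_negative, observed_report_rate, underreporting_share)
```

Under these assumed inputs, the gap between the true purchase rate and the observed report rate equals the share of purchasers who answer "no".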
Phase 1: Conclusions • Model fit was adequate for all commodities • Levels of underreporting vary by commodity • Variables found to be positively related to accurate reporting included: • Education • Family Size • Income • Use of Records • Length of Interview • The effect of age was highly variable
Phase 2 in 2004 • Differences between Phase 1 and Phase 2: • Used only Interview 2 data, not Markov LCA • Micro level analysis • Reporters only • Latent variable represents level of underreporting, as opposed to purchasing status as in Phase 1
Approach • Analysis Plan • Ran both ordered and unordered latent class models • Order was determined based on the theoretical relationship between values of indicators and level of underreporting • Ran all combinations of indicators in groups of 3 • Using only reporters • Using only 2nd interview data
Application of Model • For the final model: • Each combination of indicator values was assigned to a latent class • The probability of being in that class given the values of the indicators was used to assign classes • Each respondent was assigned to a latent class given the values of their indicator variables • Expenditure means were found for each latent class
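The assignment step above can be sketched as follows, assuming a standard latent class model in which indicators are independent within a class and each response pattern goes to the class with the highest posterior probability (modal assignment). The class sizes, conditional probabilities, expenditure figures, and function names are all hypothetical; the slides do not give the fitted parameters or the software used.

```python
from collections import defaultdict

# Hypothetical parameters for a 2-class model with 3 indicators.
class_sizes = [0.6, 0.4]  # P(class); the class labels are illustrative only
# cond_probs[c][j][v] = P(indicator j takes coded value v | class c)
cond_probs = [
    [{1: 0.2, 2: 0.8}, {1: 0.7, 2: 0.3}, {1: 0.1, 2: 0.9}],
    [{1: 0.7, 2: 0.3}, {1: 0.2, 2: 0.8}, {1: 0.6, 2: 0.4}],
]

def posterior(pattern):
    """Return P(class | indicator pattern) for every class."""
    joint = []
    for c, size in enumerate(class_sizes):
        p = size
        for j, value in enumerate(pattern):
            p *= cond_probs[c][j][value]  # local independence within a class
        joint.append(p)
    total = sum(joint)
    return [p / total for p in joint]

def assign_class(pattern):
    """Modal assignment: the class with the highest posterior probability."""
    post = posterior(pattern)
    return max(range(len(post)), key=post.__getitem__)

def class_means(respondents):
    """Mean expenditure per assigned class for (pattern, expenditure) pairs."""
    sums, counts = defaultdict(float), defaultdict(int)
    for pattern, expenditure in respondents:
        c = assign_class(pattern)
        sums[c] += expenditure
        counts[c] += 1
    return {c: sums[c] / counts[c] for c in sums}

# Hypothetical respondents: (coded indicator values, reported expenditure).
sample = [((2, 1, 2), 1000.0), ((2, 1, 2), 1200.0), ((1, 2, 1), 600.0)]
print(class_means(sample))
```

Comparing the resulting class means is the kind of check the slide describes: if the classes are ordered by level of underreporting, mean reported expenditures would be expected to fall as underreporting rises.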
Summary of Findings in 2004 • Levels of underreporting were found to vary by interview level characteristics including: • Number of contacts • Missing income data • Type and frequency of records used • Length of interview • Total expenditure means for respondents assigned to each latent class confirmed this
Current Phase 2 • Using same general methodology as 2004 • Refine indicators • Apply methodology to separate commodity categories • Identify best model for each commodity and assign respondents to latent classes • Examine the pattern of mean expenditures for each latent class to confirm results • Run demographic analysis to identify characteristics of members of each latent class
Indicators • Interview level indicators considered: • Number of contacts • Ratio of respondents/household members • Missing income data • Type and frequency of records used • Length of interview • Ratio of expenditures in last month to quarter • Combination of type of record and interview length
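The 2004 analysis plan ran all combinations of indicators in groups of 3. Assuming the same plan is applied to the seven indicators listed here, the candidate models can be enumerated as in the sketch below. The short variable names are hypothetical labels, and the slides do not say whether the combined records-and-length indicator was ever paired with its two component indicators.

```python
from itertools import combinations

# Hypothetical short labels for the seven interview-level indicators.
indicators = [
    "contacts",                 # number of contacts
    "resp_hh_ratio",            # ratio of respondents to household members
    "income_missing",           # missing income data
    "records_use",              # type and frequency of records used
    "interview_length",         # length of interview
    "month3_ratio",             # expenditures in last month relative to quarter
    "records_length_combined",  # combination of record type and interview length
]

# Every unordered group of 3 indicators defines one candidate model.
candidate_models = list(combinations(indicators, 3))

print(len(candidate_models))
```

A latent class model would then be fitted to each triple and the fits compared to pick the best model for each commodity.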
Indicator Coding • #contacts (1=0-2; 2=3-5; 3=6+) • Resp/hh size (1= <.5; 2= .5+) • Income missing (1=present; 2=missing) • Records use (1=never; 2=single type or sometimes; 3=multiple types and always) • Interview length (1= <45; 2=45-90; 3= 90+) • Month3 expn/all (1= <.25; 2= .25-.5; 3= >.5) • Combined records and length (1= poor; 2= fair; 3=good)
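A minimal sketch of how three of these codings might be applied to raw interview values. The function names are hypothetical, interview length is assumed to be in minutes, and values falling exactly on a shared boundary (90 for length, .5 for the month-3 ratio) are assumed to fall in the middle category; the slide does not specify either point.

```python
def code_contacts(n_contacts):
    """Number of contacts: 1 = 0-2, 2 = 3-5, 3 = 6 or more."""
    if n_contacts <= 2:
        return 1
    if n_contacts <= 5:
        return 2
    return 3

def code_interview_length(minutes):
    """Interview length: 1 = under 45, 2 = 45-90, 3 = over 90.

    Assumption: the unit is minutes and exactly 90 is coded 2.
    """
    if minutes < 45:
        return 1
    if minutes <= 90:
        return 2
    return 3

def code_month3_ratio(ratio):
    """Month-3 share of quarterly expenditures: 1 = <.25, 2 = .25-.5, 3 = >.5.

    Assumption: exactly .5 is coded 2.
    """
    if ratio < 0.25:
        return 1
    if ratio <= 0.5:
        return 2
    return 3

# Hypothetical respondent: 4 contacts, 50-minute interview, 30% of spending in month 3.
pattern = (code_contacts(4), code_interview_length(50), code_month3_ratio(0.30))
print(pattern)
```

The resulting tuple of coded values is the response pattern on which a latent class model operates.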
Demographic Coding • CU size (1=1; 2=2; 3=3+) • Age (1= <30; 2= 30-49; 3=50+) • Education (1=< H.S.; 2= H.S.+) • Income rank (1= <=.25; 2=.25-.75 and missing; 3= >.75) • Race (1= White; 2= Other) • Tenure (1= renter; 2= owner) • Urban (1= urban; 2= rural)
Future Research • Other categories and total expenditures • Add a Markov component • Combine the macro and micro analyses (underreporting for both reporters and nonreporters) • Produce overall estimates of underreporting by category and respondent characteristics