Measurement Error • A survey record differs from its true value. • Sampling error: arises from random sampling variation when only n units are measured instead of all N units. • Measurement error (systematic error, nonsampling error):
• Does not decrease with the overall sample size. • Decreases with repeated measurement. • Possible sources of measurement error in survey data: • Interviewer • Coder
Simplified Model • Normal distribution
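One common way to write a simplified measurement error model with normal errors is sketched below. The notation (y_i for the survey record, \mu_i for the true value, e_i for the error) is assumed here for illustration and is not taken from the slide.

```latex
% Assumed notation: y_i = recorded value, \mu_i = true value, e_i = measurement error
y_i = \mu_i + e_i, \qquad e_i \sim N(0, \sigma_e^2)
```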
Effect • When the measurement error has mean zero, the usual estimators of the mean and total remain unbiased or consistent. • For more complex parameters, this property may not hold. • Fuller (1995) points out that the usual estimators of distribution functions, quantiles, and regression coefficients are biased.
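The contrast between the mean and more complex parameters can be illustrated with a small simulation. This is a minimal sketch under assumed values (a normal covariate with variance 1, zero-mean normal error with variance 1, a true slope of 0.5); the function name and all parameters are hypothetical and not from the slides.

```python
import numpy as np

def simulate_zero_mean_error(n=200_000, error_sd=1.0, seed=0):
    """Compare estimators computed on true values and on error-prone records."""
    rng = np.random.default_rng(seed)
    # True covariate values (assumed normal with mean 10, sd 1)
    x_true = rng.normal(10.0, 1.0, size=n)
    # Outcome generated from the true covariate (assumed slope 0.5)
    y = 2.0 + 0.5 * x_true + rng.normal(0.0, 0.5, size=n)
    # Recorded covariate = truth + zero-mean measurement error
    x_obs = x_true + rng.normal(0.0, error_sd, size=n)

    return {
        "mean_true": x_true.mean(),
        "mean_obs": x_obs.mean(),
        "q90_true": np.quantile(x_true, 0.9),
        "q90_obs": np.quantile(x_obs, 0.9),
        # np.polyfit returns coefficients from highest degree down: [slope, intercept]
        "slope_true": np.polyfit(x_true, y, 1)[0],
        "slope_obs": np.polyfit(x_obs, y, 1)[0],
    }
```

Under these assumptions the two means should agree closely, while the upper quantile of the recorded values is pushed outward and the regression slope on the recorded covariate is pulled toward zero (roughly halved when error and true variances are equal).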
If the measurement error has a non-zero constant mean, the usual estimators of the mean, total, and proportion are also biased. • Furthermore, if errors are correlated among individuals who share an interviewer, the usual estimators of standard errors are also biased.
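The effect of correlated errors on standard errors can be checked by simulation. The sketch below assumes a simple additive interviewer effect shared by every respondent of the same interviewer; the function name, the workloads, and the standard deviations are hypothetical choices for illustration.

```python
import numpy as np

def naive_vs_actual_se(n_interviewers=10, workload=50, sd_interviewer=0.3,
                       sd_respondent=1.0, n_reps=2000, seed=1):
    """Return (empirical SD of the sample mean, average naive standard error)."""
    rng = np.random.default_rng(seed)
    n = n_interviewers * workload
    means = np.empty(n_reps)
    naive_ses = np.empty(n_reps)
    for r in range(n_reps):
        # One random effect per interviewer, shared by all of that interviewer's cases
        b = rng.normal(0.0, sd_interviewer, size=n_interviewers)
        y = np.repeat(b, workload) + rng.normal(0.0, sd_respondent, size=n)
        means[r] = y.mean()
        # Naive standard error that treats all n records as independent
        naive_ses[r] = y.std(ddof=1) / np.sqrt(n)
    return means.std(ddof=1), naive_ses.mean()
```

With these assumed values the naive standard error should come out well below the actual variability of the sample mean across replications, which is the bias the slide refers to.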
If estimates of the measurement error variances are available, it is possible to obtain bias-adjusted estimators. • Repeated measurement of sub-samples • Allocate resources at the design stage to make repeated observations on a sub-sample. • Hartley and Rao (1978) and Hartley and Biemer (1978) provided interviewer and coder assignment conditions that permit the estimation of sampling and measurement errors. • Measurement error variance model.
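As one illustration of a bias-adjusted estimator, the sketch below uses two measurements of the same units in a sub-sample to estimate the measurement error variance, then applies a method-of-moments correction to a regression slope. This is an assumed textbook-style correction, not the specific procedure of Hartley and Rao or Hartley and Biemer; the function and argument names are hypothetical.

```python
import numpy as np

def bias_adjusted_slope(x_obs, y, x_rep1, x_rep2):
    """Method-of-moments slope correction for measurement error in x.

    x_obs, y       : full-sample recorded covariate and outcome
    x_rep1, x_rep2 : two independent measurements of the same sub-sample units
    """
    x_rep1 = np.asarray(x_rep1, dtype=float)
    x_rep2 = np.asarray(x_rep2, dtype=float)
    # With independent, equal-variance errors: Var(x_rep1 - x_rep2) = 2 * error variance
    error_var = np.var(x_rep1 - x_rep2, ddof=1) / 2.0
    cov = np.cov(x_obs, y, ddof=1)  # 2x2 sample covariance matrix
    naive_slope = cov[0, 1] / cov[0, 0]
    # Remove the error variance from the observed variance of x
    adjusted_slope = cov[0, 1] / (cov[0, 0] - error_var)
    return naive_slope, adjusted_slope, error_var
```

The correction assumes the error is independent of the true value and of the outcome, and it becomes unstable when the estimated error variance is close to the observed variance of the covariate.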
Interviewer Effect • The earliest examinations of measurement error in surveys focused on evaluating the impact of interviewers on the data. • Measured values collected by the same interviewer are correlated. • The Hansen, Hurwitz and Bershad model shows how this correlation inflates the variance of survey estimates.
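A commonly quoted form of the interviewer variance result is sketched below. The notation is assumed for illustration (interviewer i, respondent j, equal workloads of m respondents per interviewer, total sample size n, and \sigma_y^2 the total variance of a recorded value) and may differ from the slide's own.

```latex
% Assumed notation, not from the slide
y_{ij} = \mu_{ij} + b_i + e_{ij}, \qquad
b_i \sim (0, \sigma_b^2), \quad e_{ij} \sim (0, \sigma_e^2)

\rho_{\mathrm{int}} = \frac{\sigma_b^2}{\sigma_y^2}, \qquad
\mathrm{Var}(\bar{y}) \approx \frac{\sigma_y^2}{n}\,\bigl[\,1 + (m - 1)\,\rho_{\mathrm{int}}\,\bigr]
```

Even a small intra-interviewer correlation is multiplied by the workload m, so large workloads can inflate the variance substantially.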
An ANOVA model can be specified to estimate interviewer variability. • The model is appropriate for continuous responses. • For binary responses, the results underestimate the intra-interviewer variability.
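For continuous responses, the ANOVA approach can be sketched as a balanced one-way random-effects analysis with interviewers as groups. The function below is a minimal illustration under that assumption (equal workloads, no other clustering); its name and interface are hypothetical.

```python
import numpy as np

def interviewer_anova(y):
    """One-way random-effects ANOVA for a balanced interviewer design.

    y : 2-D array with one row per interviewer and one column per respondent.
    Returns (interviewer variance, within-interviewer variance, intra-interviewer correlation).
    """
    y = np.asarray(y, dtype=float)
    k, m = y.shape  # k interviewers, m respondents each
    group_means = y.mean(axis=1)
    grand_mean = y.mean()
    # Between-interviewer and within-interviewer mean squares
    msb = m * np.sum((group_means - grand_mean) ** 2) / (k - 1)
    msw = np.sum((y - group_means[:, None]) ** 2) / (k * (m - 1))
    # E[MSB] = sigma_e^2 + m * sigma_b^2, so solve for sigma_b^2 (truncated at zero)
    var_interviewer = max((msb - msw) / m, 0.0)
    rho = var_interviewer / (var_interviewer + msw)
    return var_interviewer, msw, rho
```

The returned correlation is the share of total variance attributable to interviewers; as the slide notes, applying this directly to 0/1 responses tends to understate it.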
References • Fuller, W.A. (1995). Estimation in the presence of measurement error. • Scott, A. and Davis, P. Estimating interviewer effects for survey responses. • Hartley, H.O. and Biemer, P. (1978). The estimation of nonsampling variances in current surveys. • Hartley, H.O. and Rao, J.N.K. (1978). The estimation of nonsampling variance components in sample surveys. • Biemer, P., et al. Measurement Errors in Surveys.
Binary Data and Interviewer Effects: An Example
Medical Questionnaires • Often use binary variables • Interested in proportion parameters • Very specialized studies • Few reviewers (highly skilled) • Very expensive to train • Large case loads • Interviewer variability is usually ignored because it affects binary data less than continuous data
New Zealand Quality of Health Care Study • Studying ‘adverse events’ in New Zealand hospitals • 2-stage design • PPS sample of hospitals • Systematic sample of 575 medical records drawn from each hospital for 1998 admissions • Average case load per reviewer is 300 (problem!) • Typical of such studies • Interest in proportion of hospital admissions associated with an adverse event
Model • Random Effects Model • Hospital Effect • Interviewer Effect • Respondent Effect • Assume an underlying continuous variable initially and then extend to the binary case
Assumptions • Hospital effect: normally distributed • Interviewer effect: normally distributed • Respondent effect: logistically distributed • All these effects are assumed to be uncorrelated with all other effects
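A latent-variable form consistent with these assumptions can be sketched as follows. The symbols are assumed for illustration (h indexes hospitals, i interviewers, j respondents) and the exact parameterisation used in the study may differ.

```latex
% Assumed notation: a_h = hospital effect, b_i = interviewer effect, e_{hij} = respondent effect
y^{*}_{hij} = \mu + a_h + b_i + e_{hij}, \qquad
a_h \sim N(0, \sigma_a^2), \quad b_i \sim N(0, \sigma_b^2), \quad e_{hij} \sim \mathrm{Logistic}(0, 1)

% Binary response obtained by thresholding the underlying continuous variable
y_{hij} = \begin{cases} 1 & \text{if } y^{*}_{hij} > 0 \\ 0 & \text{otherwise} \end{cases}
\qquad\Longrightarrow\qquad
P(y_{hij} = 1 \mid a_h, b_i) = \frac{\exp(\mu + a_h + b_i)}{1 + \exp(\mu + a_h + b_i)}
```

The logistic respondent effect is what turns the thresholded model into a logistic regression with random hospital and interviewer intercepts.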
Design Effect • Represents the inflation in variance due to the interviewer and cluster effects (i.e. the inflation that arises when observations are not independent) • This assumes small interviewer and PSU effects
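When both correlations are small, a widely used additive approximation to the design effect adds one term per source of clustering. The helper below is a sketch of that approximation only; the formula is an assumption about the form intended on the slide, and the function name, argument names, and example inputs are hypothetical.

```python
def approx_design_effect(rho_psu, psu_size, rho_interviewer, workload):
    """Additive small-correlation approximation to the design effect.

    rho_psu         : intra-PSU (cluster) correlation
    psu_size        : average number of sampled units per PSU
    rho_interviewer : intra-interviewer correlation
    workload        : average number of cases per interviewer
    """
    return 1.0 + (psu_size - 1) * rho_psu + (workload - 1) * rho_interviewer


# Hypothetical inputs chosen for easy arithmetic, not taken from the study
example = approx_design_effect(rho_psu=0.01, psu_size=101,
                               rho_interviewer=0.02, workload=51)
```

With these made-up inputs each clustering term contributes about 1, giving a design effect of about 3. The approximation makes visible why a small correlation combined with a large workload can dominate the variance.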
Results • = 0.04 • = 0.002 • = 687 • Design Effect = 29.0 • So the variability is increased by a multiplicative factor of 29!
Conclusions • With high case loads, even small interviewer variability can have a large impact on estimates of population means and proportions • Binary data pose special challenges, and more research is needed for cases where the PSU and interviewer correlations are not small