Explore measurement error in covariates, correction methods, autopower tool, and its implications on power analysis with practical examples.
Sample size and power estimation when covariates are measured with error Michael Wallace London School of Hygiene and Tropical Medicine
Outline • Measurement error – what is it and what problems can it cause? • What can we do about it? • The problem of power – introducing autopower
Measurement error – a crash course • Often impossible to measure covariates accurately, e.g. dietary intake, blood pressure, weight • Instead, we have error-prone observations • How these relate to the underlying true values is our 'measurement error model' • Common model: “classical” error: Observed = True + Measurement Error • ...but other models are available.
Why does it matter? • Simple linear regression: $Y = \alpha + \beta X + \varepsilon$ • Classical measurement error: $W = X + U$, with $U$ independent of $X$ and $\varepsilon$ • Regress Y on W to obtain an estimate of $\beta^* = \lambda\beta$, where $\lambda = \sigma_X^2 / (\sigma_X^2 + \sigma_U^2) \le 1$, so the estimated effect is attenuated towards zero.
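The attenuation can be checked with a small simulation. The sketch below assumes the model Y = α + βX + ε with classical error W = X + U, for which the slope from regressing Y on W is expected to be close to λβ with λ = σ_X² / (σ_X² + σ_U²). All parameter values and names are illustrative assumptions, not taken from the slides.

```python
import random


def ols_slope(x, y):
    """Least-squares slope from regressing y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)              # fixed seed so the run is repeatable
n, alpha, beta = 20_000, 1.0, 2.0   # illustrative values only
sigma_x, sigma_u, sigma_e = 1.0, 1.0, 1.0

x = [rng.gauss(0.0, sigma_x) for _ in range(n)]                 # true covariate
w = [xi + rng.gauss(0.0, sigma_u) for xi in x]                  # observed = true + error
y = [alpha + beta * xi + rng.gauss(0.0, sigma_e) for xi in x]   # outcome depends on true X

lam = sigma_x**2 / (sigma_x**2 + sigma_u**2)   # attenuation factor
slope_true = ols_slope(x, y)     # regression on the true covariate
slope_naive = ols_slope(w, y)    # regression on the error-prone covariate

print(f"lambda = {lam:.2f}")
print(f"slope using X = {slope_true:.3f}, slope using W = {slope_naive:.3f}")
```

With equal variances for X and U, as here, the naive slope should come out at roughly half the true slope.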
What can we do about it? • Need additional data to tell us about the measurement error • Validation (accurate measurements on some) • Replication (multiple measurements) • Validation 'best', but replication more practical • Huge variety of 'correction methods' available to try and remove bias induced by measurement error. • Two that are already available in Stata: • Regression calibration (Stata command: rcal) • Simulation extrapolation (Stata command: simex) • ...but these don't produce consistent effect estimates in general.
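To illustrate how replication helps, here is a minimal sketch of regression calibration in a simple linear model where every subject is measured twice. It is not the implementation behind the rcal command; it assumes classical error with independent replicates, and all names and values are hypothetical. The error variance is estimated from the within-subject differences, and the mean of the two measurements is then shrunk towards the overall mean before fitting.

```python
import random


def mean(v):
    return sum(v) / len(v)


def var(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)


def ols_slope(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)


rng = random.Random(2)
n, beta = 20_000, 2.0            # illustrative values only
sigma_x, sigma_u = 1.0, 1.0

x = [rng.gauss(0.0, sigma_x) for _ in range(n)]        # unobserved truth
w1 = [xi + rng.gauss(0.0, sigma_u) for xi in x]        # first measurement
w2 = [xi + rng.gauss(0.0, sigma_u) for xi in x]        # second measurement
y = [1.0 + beta * xi + rng.gauss(0.0, 1.0) for xi in x]

wbar = [(a + b) / 2 for a, b in zip(w1, w2)]

# Var(W1 - W2) = 2 * sigma_u^2 when the two errors are independent.
sigma_u2_hat = var([a - b for a, b in zip(w1, w2)]) / 2

# Reliability of the two-measurement mean, whose error variance is sigma_u^2 / 2.
lam_hat = (var(wbar) - sigma_u2_hat / 2) / var(wbar)

# Calibrated covariate: estimated best linear predictor of X given the mean.
mu = mean(wbar)
x_hat = [mu + lam_hat * (wb - mu) for wb in wbar]

slope_naive = ols_slope(wbar, y)   # still attenuated
slope_rc = ols_slope(x_hat, y)     # approximately corrected

print(f"estimated error variance = {sigma_u2_hat:.3f}")
print(f"naive slope = {slope_naive:.3f}, calibrated slope = {slope_rc:.3f}")
```

In the linear model this correction amounts to dividing the naive slope by the estimated reliability; for nonlinear models such as logistic regression the same substitution is only approximate.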
Conditional Score • If there is measurement error, then solving estimating equations as normal will give inconsistent effect estimates. • Conditional score solves modified estimating equations to avoid this. • Unlike regression calibration and simulation extrapolation, it produces consistent effect estimates for a range of models, including logistic regression. • We have produced cscore for Stata to implement this method in the case of logistic regression.
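For reference, the idea can be written out for logistic regression with a single covariate. The sketch below assumes classical, normally distributed error with known variance $\sigma_U^2$; the notation is an assumption chosen for illustration and does not describe the internals of cscore.

```latex
\begin{align*}
\text{Model:}\quad & \Pr(Y = 1 \mid X) = H(\beta_0 + \beta_x X), \qquad
  H(t) = \frac{1}{1 + e^{-t}}, \qquad W = X + U, \quad U \sim N(0, \sigma_U^2) \\
\text{Sufficient statistic for } X\text{:}\quad & \Delta = W + Y \sigma_U^2 \beta_x \\
\text{Conditional model:}\quad & \Pr(Y = 1 \mid \Delta)
  = H\!\left(\beta_0 + \beta_x \Delta - \tfrac{1}{2} \sigma_U^2 \beta_x^2\right) \\
\text{Estimating equations:}\quad & \sum_{i=1}^{n}
  \left\{ Y_i - H\!\left(\beta_0 + \beta_x \Delta_i - \tfrac{1}{2} \sigma_U^2 \beta_x^2\right) \right\}
  \begin{pmatrix} 1 \\ \Delta_i \end{pmatrix} = 0
\end{align*}
```

Because the conditional distribution of Y given Δ does not involve the unobserved X, these equations are unbiased without any assumption about the distribution of X. Note that Δ itself depends on $\beta_x$, so the equations have to be solved iteratively.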
The problem of power • Measurement error hits us with a 'double whammy': • Bias • Wider confidence intervals • Bias will often remain a problem even if a correction method is used. • Analytic sample size calculations are generally not possible. • Simulation studies are the only recourse. • autopower aims to remove the leg work.
autopower in brief • autopower simulates datasets that suffer from measurement error. • Then sees how methods perform on these datasets. • Variety of methods available: • 'naïve', rcal, simex, cscore • Assumes: • Univariate logistic regression • Subjects are measured either once or twice
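The simulate-then-analyse loop can be sketched in a few lines. This is not autopower's code or syntax: it is a hypothetical standard-library illustration that covers only the 'naïve' analysis, with one measurement per subject, a standard normal true covariate, and a two-sided Wald test at the 5% level. The function names and parameter values are assumptions.

```python
import math
import random


def expit(t):
    """Numerically safe logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic(w, y, max_iter=25, tol=1e-8):
    """Newton-Raphson fit of logit P(Y=1) = b0 + b1*w.

    Returns (b1, se_b1), or None if the information matrix is singular.
    The standard error uses the information matrix from the last iteration.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for wi, yi in zip(w, y):
            p = expit(b0 + b1 * wi)
            r, v = yi - p, p * (1.0 - p)
            g0 += r
            g1 += r * wi
            h00 += v
            h01 += v * wi
            h11 += v * wi * wi
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            return None
        d0 = (h11 * g0 - h01 * g1) / det   # Newton step for b0
        d1 = (h00 * g1 - h01 * g0) / det   # Newton step for b1
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b1, math.sqrt(h00 / det)


def simulate_power(n, beta0, beta1, sigma_u, n_sims=200, seed=0, z_crit=1.96):
    """Share of simulated datasets in which the naive Wald test rejects beta1 = 0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]              # true covariate
        w = [xi + rng.gauss(0.0, sigma_u) for xi in x]           # error-prone version
        y = [1 if rng.random() < expit(beta0 + beta1 * xi) else 0 for xi in x]
        fit = fit_logistic(w, y)
        if fit is not None and abs(fit[0] / fit[1]) > z_crit:
            rejections += 1
    return rejections / n_sims


power_clean = simulate_power(n=200, beta0=0.0, beta1=0.7, sigma_u=0.0)
power_noisy = simulate_power(n=200, beta0=0.0, beta1=0.7, sigma_u=1.5)
print(f"power without error: {power_clean:.2f}, with error: {power_noisy:.2f}")
```

Swapping the fitting step for a correction method would show how that method performs on the same simulated designs, which is the comparison the slide describes.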
Example: specific design • “How well should regression calibration perform on this dataset?”
Example: estimating sample size • “What sample size do I need to achieve 80% power?”
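A sample size search of this kind can be sketched as a search over n for the smallest value that reaches the target power. The code below is a hypothetical illustration, not autopower's algorithm. To keep it fast and deterministic, it replaces the simulation with a large-sample approximation for the naive analysis in a linear model (not the logistic model autopower assumes), and it assumes power does not decrease as n grows.

```python
import math
from statistics import NormalDist


def approx_power(n, beta, sigma_x=1.0, sigma_u=1.0, sigma_e=1.0, alpha=0.05):
    """Approximate power of a two-sided z-test on the naive slope (linear model).

    Stand-in for a simulation: Y = a + beta*X + e, W = X + U, regress Y on W.
    """
    lam = sigma_x**2 / (sigma_x**2 + sigma_u**2)               # attenuation factor
    beta_star = lam * beta                                     # what the naive slope estimates
    resid_var = sigma_e**2 + beta**2 * sigma_x**2 * (1.0 - lam)
    se = math.sqrt(resid_var / (n * (sigma_x**2 + sigma_u**2)))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(abs(beta_star) / se - z)


def smallest_n(power_fn, target=0.8, n_min=10, n_max=1_000_000):
    """Smallest n with power_fn(n) >= target, assuming power_fn is nondecreasing."""
    hi = n_min
    while power_fn(hi) < target:       # doubling phase: find an n that is large enough
        hi *= 2
        if hi > n_max:
            raise ValueError("target power not reached within n_max")
    if hi == n_min:
        return n_min
    lo = hi // 2                       # known to fall short of the target
    while hi - lo > 1:                 # bisection phase
        mid = (lo + hi) // 2
        if power_fn(mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi


n_noisy = smallest_n(lambda n: approx_power(n, beta=0.3, sigma_u=1.0))
n_clean = smallest_n(lambda n: approx_power(n, beta=0.3, sigma_u=0.0))
print(f"n needed with error: {n_noisy}, without error: {n_clean}")
```

When power is estimated by simulation rather than a formula, the estimates are noisy, so a search like this would typically need more replications near the target or some smoothing across n.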
Example: cost minimization • “Obtaining second observations is expensive, can I save money by considering a design where not everyone is measured twice?” • User specifies how much more it costs to measure a subject twice rather than once. • autopower then searches the 'r1-r2' space: • r1 = subjects measured once • r2 = subjects measured twice • Various tricks for practical speed.
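The search over the r1-r2 space can be sketched as a grid search for the cheapest design that reaches the target power. This is a hypothetical illustration and does not reproduce autopower's search or its speed-ups. The power function is a crude stand-in that treats the error variance as known and gives each subject "effective information" proportional to the reliability of their measurements, so it ignores the role replicates play in estimating the error variance. Costs are in units of one single measurement, and `cost_ratio` is the assumed cost of measuring a subject twice relative to once.

```python
import math
from statistics import NormalDist


def approx_power_design(r1, r2, beta, sigma_x=1.0, sigma_u=1.0, sigma_e=1.0, alpha=0.05):
    """Rough stand-in for simulated power of a design with r1 single and r2 double measurements."""
    lam1 = sigma_x**2 / (sigma_x**2 + sigma_u**2)          # reliability, one measurement
    lam2 = sigma_x**2 / (sigma_x**2 + sigma_u**2 / 2.0)    # reliability, mean of two
    info = sigma_x**2 * (r1 * lam1 + r2 * lam2) / sigma_e**2
    if info <= 0:
        return 0.0
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(abs(beta) * math.sqrt(info) - z)


def cheapest_design(power_fn, cost_ratio, target=0.8, max_subjects=1000, step=10):
    """Grid search for the lowest-cost (r1, r2) reaching the target power.

    Returns (cost, r1, r2), or None if no design on the grid is adequate.
    Assumes power_fn is nondecreasing in r1 for fixed r2.
    """
    best = None
    for r2 in range(0, max_subjects + 1, step):
        for r1 in range(0, max_subjects - r2 + 1, step):
            if power_fn(r1, r2) >= target:
                cost = r1 + cost_ratio * r2
                if best is None or cost < best[0]:
                    best = (cost, r1, r2)
                break   # adding more single-measurement subjects only raises the cost
    return best


def power_fn(r1, r2):
    return approx_power_design(r1, r2, beta=0.3)


cheap_repeats = cheapest_design(power_fn, cost_ratio=1.2)
costly_repeats = cheapest_design(power_fn, cost_ratio=3.0)
print("cost ratio 1.2:", cheap_repeats)
print("cost ratio 3.0:", costly_repeats)
```

Under this stand-in, the preferred design shifts towards single measurements as second measurements become more expensive; a simulation-based power function would slot into `cheapest_design` in the same way.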
References • General overview: Carroll, R. J., D. Ruppert, L. A. Stefanski, and C. M. Crainiceanu. 2006. Measurement Error in Nonlinear Models: A Modern Perspective, Second Edition. Chapman & Hall/CRC. • Regression calibration: Carroll, R. J., and L. A. Stefanski. 1990. Approximate quasilikelihood estimation in models with surrogate predictors. Journal of the American Statistical Association 85: 652–663. • Simulation extrapolation: Cook, J. R., and L. A. Stefanski. 1994. Simulation-extrapolation estimation in parametric measurement error models. Journal of the American Statistical Association 89: 1314–1328. • Conditional score: Stefanski, L. A., and R. J. Carroll. 1987. Conditional scores and optimal scores in generalized linear measurement error models. Biometrika 74: 703–716.