Non response and missing data in longitudinal surveys

csykes

Presentation Transcript


  1. Non response and missing data in longitudinal surveys

  2. Traditional ways of handling attrition and missing data
     • Weighting is typically used for attrition
     • Sample design and initial non-response provide the basic weights
     • With several waves, define ‘typical’ pathways and provide weights for each one, e.g. LSYP may require 12 or more
     • For item non-response, use ‘hot deck’ single imputation

  3. Problems with weighting procedures
     • Inefficient – can only use complete data for each combination of variables analysed
     • Restrictive, since weights are only provided for chosen ‘pathways’
     • Possibly inconsistent results through different weights for different analyses
     • Not very transparent to use
     • Problematic for ‘structurally missing’ items

  4. Problems with hot deck imputation
     • Not theoretically based
     • Selection of ‘matched’ cases may not always be possible – especially in multilevel data
     • Single imputation does not allow easy computation of standard errors
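A minimal sketch of hot deck single imputation, to make the matching problem concrete. The record layout, the `id`, `school`, `sex` and `score` fields, and the function name `hot_deck_impute` are all hypothetical; real hot deck procedures differ in how matching cells and donors are chosen.

```python
import random

def hot_deck_impute(records, item, match_keys, rng):
    """Fill missing `item` values by copying the value of a random donor
    that agrees with the recipient on every key in `match_keys`."""
    # Group the observed values (donors) by matching cell.
    donors = {}
    for rec in records:
        if rec[item] is not None:
            cell = tuple(rec[k] for k in match_keys)
            donors.setdefault(cell, []).append(rec[item])

    completed, unmatched = [], []
    for rec in records:
        rec = dict(rec)  # copy, so the input records are left untouched
        if rec[item] is None:
            cell = tuple(rec[k] for k in match_keys)
            if cell in donors:
                rec[item] = rng.choice(donors[cell])
            else:
                unmatched.append(rec["id"])  # no matched case available
        completed.append(rec)
    return completed, unmatched

# Hypothetical data: pupils nested in schools, some scores missing.
records = [
    {"id": 1, "school": "A", "sex": "F", "score": 52},
    {"id": 2, "school": "A", "sex": "F", "score": None},
    {"id": 3, "school": "A", "sex": "M", "score": 47},
    {"id": 4, "school": "B", "sex": "M", "score": None},
]
completed, unmatched = hot_deck_impute(
    records, "score", ["school", "sex"], random.Random(0)
)
```

Here record 2 has a single donor in its cell, while record 4 has none because no one in school B has an observed score. Matching on a level 2 identifier thins the cells quickly, which is one way the second bullet can arise. The single filled-in value also carries no indication of its own uncertainty.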

  5. Multiple imputation – briefly and simply
     Consider the model of interest (MOI). We turn this into a multivariate response model and obtain residual estimates (from an MCMC chain) where x or y is missing. Use these to ‘fill in’ and produce a complete data set. Do this independently n times (e.g. n = 20). Fit the MOI to each data set and combine according to rules to get estimates and standard errors.
     Note that at the imputation stage we can use auxiliary data. Note also that we can handle attrition as missing data.
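The slide does not name the combining rules. The sketch below assumes they are Rubin's rules, the usual choice in multiple imputation: the pooled estimate is the mean of the per-data-set estimates, and the total variance is the average within-imputation variance plus (1 + 1/n) times the between-imputation variance. The function name `pool_rubin` and the example numbers are made up for illustration.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Combine one coefficient across n completed data sets (Rubin's rules).

    estimates  : the coefficient estimated from each completed data set
    std_errors : its standard error from each completed data set
    """
    n = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # divisor n - 1
    total = within + (1 + 1 / n) * between
    return pooled, math.sqrt(total)

# Hypothetical results from fitting the MOI to 5 completed data sets.
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
std_errors = [0.2, 0.2, 0.2, 0.2, 0.2]
pooled, pooled_se = pool_rubin(estimates, std_errors)
```

The pooled standard error is larger than any single-data-set standard error because the between-imputation term carries the extra uncertainty due to the missing values.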

  6. What not to do
     • Omit all records with missing data – inefficient
     • In categorical data, use an extra category for missing – biased
     • Plug in the mean over the non-missing values – biased
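A small numerical illustration of one way the mean plug-in goes wrong, using made-up values: the mean is unchanged, but the variance shrinks, so standard errors and associations computed from the filled-in data are distorted.

```python
from statistics import mean, variance

# Hypothetical variable: 4 observed values, 4 missing.
observed = [2.0, 4.0, 6.0, 8.0]
n_missing = 4

fill = mean(observed)                    # the plug-in value
plugged = observed + [fill] * n_missing  # 'completed' data

var_observed = variance(observed)  # sample variance of the observed values
var_plugged = variance(plugged)    # sample variance after the plug-in
```

The sum of squared deviations is the same in both cases (20), but after the plug-in it is divided by 7 instead of 3, so the apparent variance falls from about 6.67 to about 2.86.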

  7. Multiple imputation in MLwiN
     • Existing methods assume normality. For multilevel data they cannot handle level 2 variables with missing data
     • They cannot handle discrete variables with missing data
     • REALCOM-IMPUTE links REALCOM with MLwiN and can handle level 2 and discrete variables
     • It works by transforming discrete variables to normality using a ‘latent variable’ model, so that all response variables have a joint multivariate normal distribution, and then applies MI theory
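The latent variable idea can be sketched for the simplest case, a binary variable. This is an illustration under stated assumptions, not REALCOM-IMPUTE's actual algorithm: it assumes a probit-type model in which y = 1 exactly when a latent normal z is positive, with a hypothetical mean `mu` and unit variance, and it draws z by simple rejection.

```python
import random

def draw_latent(y, mu, rng, max_tries=100_000):
    """Draw z ~ N(mu, 1) restricted to z > 0 if y == 1 and z <= 0 if y == 0."""
    for _ in range(max_tries):
        z = rng.gauss(mu, 1.0)
        if (z > 0) == (y == 1):
            return z
    raise RuntimeError("no draw fell in the region implied by y")

def to_binary(z):
    """Map a latent normal value back to the observed binary scale."""
    return 1 if z > 0 else 0

rng = random.Random(1)
observed_y = [1, 0, 1, 1, 0]
latent = [draw_latent(y, mu=0.3, rng=rng) for y in observed_y]
recovered = [to_binary(z) for z in latent]
```

Once every discrete variable is represented by such a latent normal value, the responses can be treated as jointly normal for imputation, and an imputed z for a missing case is mapped back to a category with `to_binary`.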

  8. Partially observed data values
     • Where we have a prior (estimated) probability distribution (PD) for a missing discrete variable value, we simply insert an extra MCMC step that accepts the ‘standard’ MI value with a probability that is just the probability given by the PD. A corresponding step is used for normal data
     • This thus uses all of the data efficiently. No data are discarded so long as it is possible to assign a PD
     • May also reduce ‘partial response bias’
     • Several completed data sets are produced and combined as in standard MI
     These procedures are computationally intensive, but once the completed data sets are produced they can be used for many different models, so long as a model uses only variables that have been involved in the imputation procedure.
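A minimal sketch of the extra acceptance step for a discrete variable. The slide does not say what happens when a proposed value is rejected; this sketch assumes the standard draw is simply repeated until a value is accepted. The names `draw_with_prior` and `standard_draw`, the categories and the probabilities are all hypothetical.

```python
import random

def draw_with_prior(standard_draw, prior, rng, max_tries=10_000):
    """Propose values from the standard MI step and accept a proposed
    value v with probability prior[v]; redraw on rejection (assumption)."""
    for _ in range(max_tries):
        value = standard_draw(rng)
        if rng.random() < prior.get(value, 0.0):
            return value
    raise RuntimeError("no proposal accepted; prior and proposals may not overlap")

def standard_draw(rng):
    """Stand-in for the 'standard' MI draw: uniform over three categories."""
    return rng.choice(["low", "mid", "high"])

# Hypothetical PD for one respondent: 'low' has been ruled out.
prior = {"low": 0.0, "mid": 0.3, "high": 0.7}

rng = random.Random(42)
draws = [draw_with_prior(standard_draw, prior, rng) for _ in range(2000)]
counts = {v: draws.count(v) for v in prior}
```

A category with prior probability 0 is never imputed, and the accepted values follow the standard MI distribution reweighted by the PD, so partial information about the missing value is used rather than discarded.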

  9. References
     • Goldstein, H., Carpenter, J., Kenward, M., Levin, K. (2009). Multilevel models with multivariate mixed response types. Statistical Modelling (to appear). Gives methodological background.
     • Handling attrition and non-response in longitudinal data. International Journal of longitudinal and Life Course studies. April 2009. http://www.journal.longviewuk.com/index.php/llcs Discusses issues for longitudinal studies in detail.

  10. Sampling weights
     • Consider a 2-level model:
     • Write level 2 weights as
     • Level 1 weights for j-th level 2 unit as
     Final level 1 weights
     We use as the level 1 random part explanatory variable instead of the constant =1
     This will be used for imputation and for MOI
