Guide to Handling Missing Information
• Contacting researchers
• Algebraic recalculations, conversions and approximations
• Imputation method (substituting missing data)

Imputation Method
• Used when recalculations are not possible, e.g. no standard deviation is reported for a study
• Uses available data from other studies or from other meta-analyses

Imputation Method
a. Within-study imputation
b. Multiple imputations

Within-study imputation
Method 1. (Means)

$$\widetilde{SD}_j = \bar{X}_j \, \frac{\sum_i^k SD_i}{\sum_i^k \bar{X}_i}$$

$\widetilde{SD}_j$ = imputed standard deviation (SD) for the missing data from study j
$\bar{X}_j$ = mean from the study with the missing SD
$\sum_i^k SD_i$ = sum of all known SDs from the different studies
$\sum_i^k \bar{X}_i$ = sum of the means from the different studies other than j

Assumptions (Method 1)

$$\widetilde{SD}_j = \bar{X}_j \, \frac{\sum_i^k SD_i}{\sum_i^k \bar{X}_i}$$

• Assumes the SD-to-mean ratio is on the same scale for all studies
  - Experimental scales can differ tremendously between different taxonomic groups or experimental designs

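The Method 1 formula can be sketched in a few lines of Python. This is only an illustration: the function name `impute_sd_from_means` and the numbers are hypothetical and do not come from the slides.

```python
def impute_sd_from_means(mean_j, known_means, known_sds):
    """Impute a missing SD as mean_j * (sum of known SDs / sum of known means)."""
    if len(known_means) != len(known_sds) or not known_means:
        raise ValueError("need matching, non-empty lists of means and SDs")
    return mean_j * sum(known_sds) / sum(known_means)


# Hypothetical data: three complete studies, plus one study (mean 15) with no SD.
known_means = [10.0, 20.0, 30.0]
known_sds = [2.0, 4.0, 6.0]

imputed_sd = impute_sd_from_means(15.0, known_means, known_sds)
print(imputed_sd)
```

In this made-up example every complete study has an SD-to-mean ratio of 0.2, so the imputed SD is simply 0.2 times the mean of the incomplete study. When the ratios differ widely between studies, the assumption listed above is violated and the imputed value is less trustworthy.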
Method 2. (Sample size)

$$\tilde{s}_j = \alpha + \beta n_j$$

• Uses regression techniques
• Applies when a study reports its sample size but is missing the information needed to calculate the pooled SD (required for Hedges' d)

$\alpha$ = intercept
$\beta$ = slope of the linear regression of n vs. s (s regressed on n)
$n_j$ = observed sample size of the study with missing data

Assumptions (Method 2)

$$\tilde{s}_j = \alpha + \beta n_j$$

• Assumes n (the observed sample size of the study with missing data) is a good predictor of s

Method 3. (No. of studies)

$$\tilde{s}_j = \frac{\sum_i^K s_i \sqrt{n_i}}{K \sqrt{n_j}}$$

K = number of studies with complete information on s and n
n = sample size of an individual study

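Methods 2 and 3 can be sketched as follows. The function names and data are hypothetical. The regression is an ordinary least-squares fit of s on n over the complete studies, which is one reasonable reading of "regression techniques". The Method 3 sum is taken over the known $s_i$ of the complete studies, which is also an assumption about the flattened slide formula.

```python
import math


def fit_line(n_values, s_values):
    """Ordinary least-squares fit of s = alpha + beta * n."""
    k = len(n_values)
    mean_n = sum(n_values) / k
    mean_s = sum(s_values) / k
    sxx = sum((n - mean_n) ** 2 for n in n_values)
    sxy = sum((n - mean_n) * (s - mean_s) for n, s in zip(n_values, s_values))
    beta = sxy / sxx
    alpha = mean_s - beta * mean_n
    return alpha, beta


def impute_sd_regression(n_j, n_values, s_values):
    """Method 2: predict the missing s from the sample size n_j."""
    alpha, beta = fit_line(n_values, s_values)
    return alpha + beta * n_j


def impute_sd_sqrt_n(n_j, n_values, s_values):
    """Method 3: sum(s_i * sqrt(n_i)) / (K * sqrt(n_j))."""
    k = len(n_values)
    total = sum(s * math.sqrt(n) for n, s in zip(n_values, s_values))
    return total / (k * math.sqrt(n_j))


# Hypothetical complete studies (sample sizes and SDs).
s_method2 = impute_sd_regression(25, [10, 20, 30], [5.0, 4.0, 3.0])
s_method3 = impute_sd_sqrt_n(9, [4, 16], [3.0, 1.5])
print(s_method2, s_method3)
```

Note that the regression sketch returns a single point prediction and ignores the uncertainty in the fitted intercept and slope.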
Method 4. Follmann et al. (1992), Furukawa et al. (2006)

$$\tilde{s}_j = \sqrt{\frac{\sum_i^k (n_i - 1)\,\sigma_i^2}{\sum_i^k (n_i - 1)}}$$

$\sigma^2$ = variance
n = sample size of an individual study

Assumptions
• Some degree of homogeneity among the observed SDs and means ($\bar{X}$) across studies
• Assumes information is missing at random and not because of reporting biases (non-random)
  - Imputations retain their original units.
  - Large variation among estimates will bias imputations.

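The Method 4 formula (the square root of the degrees-of-freedom-weighted average variance) can be sketched like this. The function name and the numbers are hypothetical.

```python
import math


def impute_sd_pooled(n_values, sd_values):
    """Method 4: sqrt( sum((n_i - 1) * sd_i^2) / sum(n_i - 1) )."""
    numerator = sum((n - 1) * sd ** 2 for n, sd in zip(n_values, sd_values))
    denominator = sum(n - 1 for n in n_values)
    return math.sqrt(numerator / denominator)


# Hypothetical complete studies: two studies of size 5 with SDs 3 and 4.
pooled_sd = impute_sd_pooled([5, 5], [3.0, 4.0])
print(pooled_sd)
```

Because larger studies contribute more degrees of freedom, their variances dominate the imputed value. The result stays in the original measurement units, as noted above.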
Multiple imputations
• Use a random sampling approach
• Average over the repeated sampling for the missing data
• Result: an overall imputed synthesis

Advantage of multiple imputations
• Variability is explicitly modeled, so an imputed value is not treated as a true observation
• e.g. the single regression imputation $\tilde{s}_j = \alpha + \beta n_j$ does not account for the error associated with $\alpha$ or $\beta$

Methods: Multiple imputations
• Various methods: use maximum likelihood or Bayesian models
• Requires specialized software
• e.g. Hot deck: to calculate the pooled s when several SD values are missing
  - A random sample of s is drawn, with replacement, from the possible s
  - The process is repeated, with replacement, from the possible s
  - Repeat until we have m complete data sets

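Methods: Hot deck
• Calculate the effect size $\delta_l$ for each of the m data sets
• Calculate its variance $\sigma^2(\delta_l)$

Pooled effect size:

$$\bar{\delta} = \frac{\sum_{l=1}^{m} \delta_l}{m}$$

Variance:

$$\sigma^2(\bar{\delta}) = \frac{\sum_{l=1}^{m} \sigma^2(\delta_l)}{m} + \left(1 + \frac{1}{m}\right) \frac{\sum_{l=1}^{m} (\delta_l - \bar{\delta})^2}{m - 1}$$

Rubin and Schenker (1991):
• If 30% of the data are missing -> m = 3
• If 50% of the data are missing -> m = 5
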
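The hot-deck steps and the pooling formulas above can be sketched as follows. Everything here is illustrative: the study data, the function names, the uncorrected standardized mean difference used as the effect size, and the fixed-effect synthesis of each completed data set are all assumptions, not details given in the slides.

```python
import random


def hot_deck_fill(sds, rng):
    """Replace each missing SD (None) with a draw, with replacement, from the observed SDs."""
    observed = [s for s in sds if s is not None]
    return [s if s is not None else rng.choice(observed) for s in sds]


def smd_and_variance(mean_t, mean_c, n_t, n_c, s):
    """Simplified standardized mean difference (no small-sample correction) and its variance."""
    d = (mean_t - mean_c) / s
    var = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return d, var


def fixed_effect(effects, variances):
    """Inverse-variance weighted mean effect and its variance."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return pooled, 1.0 / sum(weights)


def rubin_pool(deltas, variances):
    """Pooled effect and total variance across m imputed data sets."""
    m = len(deltas)
    delta_bar = sum(deltas) / m
    within = sum(variances) / m
    between = sum((d - delta_bar) ** 2 for d in deltas) / (m - 1)
    return delta_bar, within + (1.0 + 1.0 / m) * between


# Hypothetical studies: (mean_T, mean_C, n_T, n_C, pooled s or None if missing).
studies = [
    (12.0, 10.0, 20, 20, 4.0),
    (15.0, 11.0, 15, 15, 5.0),
    (9.0, 8.0, 30, 30, None),
    (14.0, 10.0, 25, 25, 6.0),
    (11.0, 10.5, 10, 10, None),
]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
m = 5
deltas, variances = [], []
for _ in range(m):
    filled = hot_deck_fill([st[4] for st in studies], rng)
    pairs = [smd_and_variance(*st[:4], s) for st, s in zip(studies, filled)]
    delta_l, var_l = fixed_effect([p[0] for p in pairs], [p[1] for p in pairs])
    deltas.append(delta_l)
    variances.append(var_l)

delta_bar, total_var = rubin_pool(deltas, variances)
print(delta_bar, total_var)
```

The second term of the total variance grows when the m imputed data sets disagree with each other, which is how the uncertainty of the imputation itself enters the final estimate.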
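Non-parametric analyses and bootstrapping
• Alternative to Hedges' d
• Uses a weighting scheme
• Does not require SD
• e.g. log response ratio

$$\ln R = \ln\left(\frac{\bar{X}_T}{\bar{X}_C}\right)$$

If the sample size is available but no SD, each study is weighted by

$$w(\ln R) = \frac{n_T n_C}{n_T + n_C}$$

which is the inverse of a simplified estimate of the variance, $\sigma^2(\ln R) \approx \frac{n_T + n_C}{n_T n_C}$.

T = treatment
C = control
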
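A minimal sketch of the log response ratio with sample-size weights, assuming the weight $n_T n_C / (n_T + n_C)$ described above. The study values and function names are hypothetical, and bootstrapped confidence intervals are left out to keep the example short.

```python
import math


def log_response_ratio(mean_t, mean_c):
    """lnR = ln(mean_T / mean_C); both means must be positive."""
    return math.log(mean_t / mean_c)


def sample_size_weight(n_t, n_c):
    """Weight n_T * n_C / (n_T + n_C), usable when no SD is reported."""
    return n_t * n_c / (n_t + n_c)


def weighted_mean_lnr(studies):
    """Weighted mean lnR over (mean_T, mean_C, n_T, n_C) tuples."""
    weights = [sample_size_weight(n_t, n_c) for _, _, n_t, n_c in studies]
    effects = [log_response_ratio(m_t, m_c) for m_t, m_c, _, _ in studies]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


# Hypothetical studies: (mean_T, mean_C, n_T, n_C), no SDs needed.
studies_lnr = [(20.0, 10.0, 10, 10), (10.0, 10.0, 30, 30)]
mean_lnr = weighted_mean_lnr(studies_lnr)
print(mean_lnr)
```

Here the larger study (weight 15) pulls the weighted mean toward its lnR of zero, away from the smaller study (weight 5) that shows a doubling of the mean.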
Effects of Imputation
• No standardized method for imputation -> bias, e.g. Rubin and Schenker (1991)
• The appropriateness of imputed data can be evaluated using a sensitivity analysis
• Benefits despite potential bias
  - Improved variance estimate (i.e. smaller CI) over exclusion
  - May potentially improve the representation of null studies
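One simple form of sensitivity analysis is to run the synthesis twice, once excluding the studies with imputed data and once including them, and compare the results. The sketch below assumes a fixed-effect model and a normal-approximation 95% CI (the 1.96 multiplier). The effect sizes and variances are invented for illustration.

```python
import math


def fixed_effect_ci(effects, variances):
    """Inverse-variance pooled effect with a normal-approximation 95% CI."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)


# Hypothetical (effect, variance) pairs.
complete = [(0.40, 0.04), (0.55, 0.05), (0.30, 0.06)]
imputed = [(0.10, 0.08), (0.05, 0.09)]  # variances rest on imputed SDs

pooled_excl, ci_excl = fixed_effect_ci(*zip(*complete))
pooled_incl, ci_incl = fixed_effect_ci(*zip(*(complete + imputed)))

width_excl = ci_excl[1] - ci_excl[0]
width_incl = ci_incl[1] - ci_incl[0]
print(pooled_excl, width_excl)
print(pooled_incl, width_incl)
```

In this made-up case, including the imputed studies narrows the CI and lowers the pooled effect because the added studies are near null. If the conclusion changes materially between the two runs, the imputed values deserve closer scrutiny.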