This study examines various meta-analytic methods for detecting selective outcome reporting in the presence of dependent effect sizes, such as multiple outcomes or follow-ups. The goal is to evaluate the performance of different approaches and provide recommendations for handling dependency in meta-analysis.
Evaluating Meta-analytic Methods to Detect Selective Outcome Reporting in the Presence of Dependent Effect Sizes
Melissa A. Rodgers & James E. Pustejovsky, UT Austin
SRSM 2019, Chicago, IL
Detecting Selective Outcome Reporting • Systematic censoring of primary study results based on statistical significance or magnitude of effect size. • Publication bias: censoring of full studies • Selective outcome reporting (SOR): censoring of specific results • e.g., 5 outcome measures collected, but only the 2 significant results reported; similar to p-hacking • A threat to validity, so it is important to include SOR detection in research syntheses
Tests for Selective Outcome Reporting • Trim & Fill (Duval & Tweedie, 2000) • Egger's Regression Variants (Egger et al., 1997; Stanley & Doucouliagos, 2014) • Selection/Weight Function Models (3PSM; Hedges & Vevea, 2005)
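As a rough illustration of the second family of tests above, here is a minimal sketch of an Egger-type regression: effect estimates are regressed on their standard errors with inverse-variance weights, and a nonzero slope suggests funnel-plot asymmetry. The function name `egger_test` and all implementation details are assumptions for illustration, not the authors' code.

```python
import numpy as np
from scipy import stats

def egger_test(y, v):
    """Egger-type regression test for funnel-plot asymmetry (sketch).

    y : array of effect size estimates
    v : array of their sampling variances
    Regresses y on the standard error, weighting by 1/v; a slope
    significantly different from zero indicates small-study effects,
    consistent with selective reporting.
    """
    se = np.sqrt(v)
    w = 1.0 / v
    X = np.column_stack([np.ones_like(se), se])
    # Weighted least squares: beta = (X'WX)^{-1} X'Wy
    XtW = X.T * w
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    resid = y - X @ beta
    df = len(y) - 2
    s2 = (w * resid**2).sum() / df          # multiplicative dispersion
    cov = s2 * np.linalg.inv(XtW @ X)
    t_slope = beta[1] / np.sqrt(cov[1, 1])
    p_value = 2 * stats.t.sf(abs(t_slope), df)
    return beta[1], p_value
```

This corresponds to the classic Egger test; the Stanley & Doucouliagos (2014) PET/PEESE variants modify the regressor and the quantity tested.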
Handling Dependent Effect Sizes • Primary studies commonly report multiple statistically dependent effect sizes • Multiple outcomes, multiple follow-ups, or multiple treatment comparisons • 4 methods (Becker, 2000) • Ignore (violates the assumption of independence) • Aggregate (average; requires knowledge of the correlation between outcomes) • Sub-classify/Sample (shift the unit of analysis) • Model (MLMA; Van den Noortgate et al., 2015; RVE; Hedges et al., 2010)
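The "Aggregate" option above can be sketched as follows: the study's effect sizes are averaged, and the variance of the average includes covariance terms that require an assumed correlation between outcomes. The function name and interface are hypothetical; this is a standard aggregation formula, not code from the study.

```python
import numpy as np

def aggregate_effects(y, v, r):
    """Average m correlated effect sizes from one study (sketch).

    y : array of m effect size estimates
    v : array of their sampling variances
    r : assumed correlation between outcomes (must be supplied or guessed,
        which is exactly the practical difficulty of this approach)
    """
    m = len(y)
    y_bar = np.mean(y)
    se = np.sqrt(v)
    # Var of the mean = (sum of variances + r * cross-product terms) / m^2
    cov_sum = v.sum() + r * (np.outer(se, se).sum() - (se**2).sum())
    v_bar = cov_sum / m**2
    return y_bar, v_bar
```

For example, two effects of 0.2 and 0.4 with variances 0.04 and assumed r = 0.5 aggregate to 0.3 with variance 0.03, larger than the 0.02 one would get by (wrongly) treating them as independent.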
Goal of Study • Evaluate the performance of available SOR detection tests under different approaches to handling dependency • Common in practice: ignore dependence or create synthetic univariate datasets, then run univariate SOR tests • Modeling: Egger Regression + RVE = "Egger Sandwich" • Used in a handful of applied studies, but with no methodological evaluation to date
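The "Egger Sandwich" combination mentioned above can be sketched as an Egger-type regression with cluster-robust (sandwich) standard errors clustered by study. This is a simplified CR0-style sandwich for illustration, not the exact RVE estimator of Hedges, Tipton & Johnson (2010), and the function name is hypothetical.

```python
import numpy as np
from scipy import stats

def egger_sandwich(y, v, study):
    """Egger-type WLS regression with cluster-robust SEs by study (sketch).

    y, v  : effect estimates and sampling variances (one row per effect)
    study : integer study labels marking which effects are dependent
    """
    se = np.sqrt(v)
    w = 1.0 / v
    X = np.column_stack([np.ones_like(se), se])
    XtW = X.T * w
    bread = np.linalg.inv(XtW @ X)
    beta = bread @ (XtW @ y)
    resid = y - X @ beta
    # "Meat": sum over study clusters of (X'W e)(X'W e)'
    meat = np.zeros((2, 2))
    for s in np.unique(study):
        idx = study == s
        u = (X[idx].T * w[idx]) @ resid[idx]
        meat += np.outer(u, u)
    V = bread @ meat @ bread
    df = len(np.unique(study)) - 2        # crude df; RVE uses Satterthwaite
    t_slope = beta[1] / np.sqrt(V[1, 1])
    p_value = 2 * stats.t.sf(abs(t_slope), df)
    return beta[1], p_value
```

The key point is that the variance estimator only requires dependent effects to share a study label; no within-study correlation needs to be specified, unlike aggregation.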
Methods • Design: two-group experiment with multiple, correlated outcomes • Effect size index: standardized mean difference • Number of ES per study and corresponding sample sizes sampled from an empirical dataset (Lehtonen et al., 2018) • Number of studies: k = 20, 50, 80 • Parameters • Average effect size: • Between-study heterogeneity: • Performance criteria • Type-I error rate: no SOR; • Power: SOR at varying levels of censoring;
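The simulation design above can be sketched as follows: each study generates correlated standardized mean differences around a study-specific true effect, and selective outcome reporting is induced by dropping non-significant outcomes with some probability. All names, the compound-symmetric correlation structure, and the approximate SMD variance formula are illustrative assumptions, not the authors' exact data-generating model.

```python
import numpy as np

def simulate_study(m, n, delta, tau, r, censor_prob, rng):
    """Simulate one study with m correlated SMD outcomes, then apply
    selective outcome reporting (sketch of the general design).

    m, n        : number of outcomes and per-group sample size
    delta, tau  : average effect size and between-study SD
    r           : correlation between outcomes (compound symmetry)
    censor_prob : probability a non-significant outcome is censored
    """
    # Approximate sampling variance of an SMD with n per group
    v = 2.0 / n + delta**2 / (2 * (2 * n - 2))
    Sigma = v * (r * np.ones((m, m)) + (1 - r) * np.eye(m))
    mu = delta + tau * rng.standard_normal()       # study-level true effect
    d = rng.multivariate_normal(np.full(m, mu), Sigma)
    se = np.sqrt(v)
    significant = np.abs(d / se) > 1.96
    # Keep significant outcomes; censor the rest with prob censor_prob
    keep = significant | (rng.random(m) >= censor_prob)
    return d[keep], np.full(keep.sum(), v)
```

Setting `censor_prob = 0` yields the no-SOR condition used for Type-I error; increasing it toward 1 yields the varying censoring levels used for power.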
Thanks Melissa A. Rodgers Melissa.A.Rodgers@utexas.edu
References • Becker, B. J. (2000). Multivariate meta-analysis. In S. D. Brown & H. E. A. Tinsley (Eds.), Handbook of applied multivariate statistics and mathematical modeling (pp. 499–525). San Diego, CA: Academic Press. • Duval, S., & Tweedie, R. (2000). A nonparametric "trim and fill" method of accounting for publication bias in meta-analysis. Journal of the American Statistical Association, 95(449), 89–98. • Egger, M., Smith, G. D., Schneider, M., & Minder, C. (1997). Bias in meta-analysis detected by a simple, graphical test. BMJ, 315(7109), 629–634. • Hedges, L. V., Tipton, E., & Johnson, M. C. (2010). Robust variance estimation in meta-regression with dependent effect size estimates. Research Synthesis Methods, 1(1), 39–65. • Hedges, L., & Vevea, J. (2005). Selection method approaches. In Publication bias in meta-analysis: Prevention, assessment, and adjustments (pp. 145–174). Chichester, England: John Wiley & Sons. • Lehtonen, M., Soveri, A., Laine, A., Järvenpää, J., de Bruin, A., & Antfolk, J. (2018). Is bilingualism associated with enhanced executive functioning in adults? A meta-analytic review. Psychological Bulletin, 144(4), 394–425. • Stanley, T. D., & Doucouliagos, H. (2014). Meta-regression approximations to reduce publication selection bias. Research Synthesis Methods, 5(1), 60–78. • Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2015). Meta-analysis of multiple outcomes: A multilevel approach. Behavior Research Methods, 47(4), 1274–1294. doi:10.3758/s13428-014-0527-2
Conclusions & Feedback • Type-I error • Ignoring dependence inflates Type-I error rates for SOR detection tests • 3PSM and Trim & Fill show distorted Type-I error rates under aggregating/sampling, particularly at lower average effect sizes • Egger's Regression with sampling maintains Type-I error rates • The Egger Sandwich maintains Type-I error rates • Power • Limited, especially when between-study heterogeneity is high and the level of censoring is low • Egger's Regression has more power than Trim & Fill • 3PSM has a power advantage over Egger's Regression and the Egger Sandwich • Future work: expand detection tests to additional multivariate methods