Effect size calculation in educational and behavioral research. Wim Van den Noortgate, ‘Power training’, Faculty of Psychology and Educational Sciences, K.U.Leuven, Leuven, October 10, 2003. Questions and comments: Wim.VandenNoortgate@ped.kuleuven.ac.be
Applications • A measure for each situation • Some specific topics
Applications • Expressing size of association • Comparing size of association • Determining power
[Figure: two overlapping score distributions, groups M and F]
Application 1: Expressing size of association. Example: μM = 8; μF = 8.5; σM = σF = 1.5 ⇒ δ = (μF − μM)/σ = 0.5/1.5 = 0.33
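A minimal sketch (not from the original slides) of how the δ above is obtained, using the group means and the common standard deviation shown on the slide:

```python
# Standardized mean difference delta = (mu_F - mu_M) / sigma,
# using the slide's example values.
mu_M, mu_F = 8.0, 8.5     # group means
sigma = 1.5               # common within-group standard deviation

delta = (mu_F - mu_M) / sigma
print(f"delta = {delta:.2f}")   # delta = 0.33
```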
Suppose the simulated data come from 10 studies that are replications of each other:
Comparing individual study results and combined study results • Individual studies: observed effect sizes may be negative, small, moderate, or large; confidence intervals are relatively wide; 0 is often included in the confidence intervals • Combined: the combined effect size is close to the population effect size; the confidence interval is relatively narrow; 0 is not included in the confidence interval
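A sketch of what such a combination looks like, assuming 10 simulated replications with 25 participants per group and fixed-effect, inverse-variance pooling of standardized mean differences (the sample sizes and the weighting scheme are illustrative assumptions, not taken from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
delta, n = 0.33, 25                      # population effect size, n per group

g_list, var_list = [], []
for _ in range(10):                      # 10 replication studies
    control = rng.normal(0.0, 1.0, n)
    treatment = rng.normal(delta, 1.0, n)
    s_pooled = np.sqrt((control.var(ddof=1) + treatment.var(ddof=1)) / 2)
    g = (treatment.mean() - control.mean()) / s_pooled
    v = (n + n) / (n * n) + g**2 / (2 * (n + n))   # large-sample variance of g
    g_list.append(g)
    var_list.append(v)

w = 1 / np.array(var_list)
g_combined = np.sum(w * np.array(g_list)) / np.sum(w)
se = np.sqrt(1 / np.sum(w))
print(f"combined g = {g_combined:.2f}, "
      f"95% CI [{g_combined - 1.96 * se:.2f}, {g_combined + 1.96 * se:.2f}]")
```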
Meta-analysis: Gene Glass (Educational Researcher, 1976, p.3): “Meta-analysis refers to the analysis of analyses”
Application 2: Comparing the size of association Example: Raudenbush & Bryk (2002)
Results of the meta-analysis: • The variation between the observed effect sizes is larger than can be expected on the basis of sampling variance alone: the population effect size is probably not the same for all studies. • The effect depends on the amount of previous contact.
Application 3: Power calculations. Power = the probability of rejecting H0 (when H0 is false). Power depends on: • δ • α • N
‘Powerful’ questions: • Suppose the population effect size is small (δ = 0.20): how large should my sample size (N) be to have a high probability (say, .80) of concluding that there is an effect (power), when testing at an α-level of .05? • I did not find an effect, but maybe the chance of finding an effect (power) with such a small sample was small anyway? (Take N and α from the study and assume, for instance, that δ = g.)
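A sketch of the first question using a simple normal approximation for a two-group comparison (an approximation, not necessarily the exact method used in the workshop):

```python
from scipy.stats import norm

delta, alpha, power = 0.20, 0.05, 0.80

z_alpha = norm.ppf(1 - alpha / 2)        # critical value, two-sided test
z_beta = norm.ppf(power)

# per-group sample size for an independent-groups comparison
n_per_group = 2 * (z_alpha + z_beta) ** 2 / delta ** 2
print(round(n_per_group))                # roughly 393 participants per group
```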
Dichotomous independent-dichotomous dependent variable • Risk difference: .87 − .60 = .27 • Relative risk: .87/.60 = 1.45 • Phi: (130 × 20 − 20 × 30)/sqrt(150 × 50 × 160 × 40) = 0.29 • Odds ratio: (130 × 20)/(20 × 30) = 4.33
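The same four measures computed from the 2 × 2 table implied by these numbers (the cell labels "success"/"failure" are an assumption for the sketch):

```python
# 2x2 table: rows = groups, columns = outcome
a, b = 130, 20   # group 1: success, failure
c, d = 30, 20    # group 2: success, failure

risk1 = a / (a + b)                      # .87
risk2 = c / (c + d)                      # .60

risk_difference = risk1 - risk2          # .27
relative_risk = risk1 / risk2            # 1.45
odds_ratio = (a * d) / (b * c)           # 4.33
phi = (a * d - b * c) / ((a + b) * (c + d) * (a + c) * (b + d)) ** 0.5   # 0.29

print(risk_difference, relative_risk, odds_ratio, phi)
```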
Dichotomous independent-continuous dependent variable • Independent groups, homogeneous variance • Independent groups, heterogeneous variance • Repeated measures (one group) • Repeated measures (independent groups) • Nonparametric measures • r_pb (point-biserial correlation)
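A sketch of two standardized mean differences in this family: g with a pooled standard deviation (homogeneous variances) and Glass's Δ, which standardizes by the control-group SD only; the data are illustrative, not from the slides:

```python
import numpy as np

def hedges_g(exp, ctrl):
    """Standardized mean difference with the pooled SD (no small-sample correction)."""
    n_e, n_c = len(exp), len(ctrl)
    s_pooled = np.sqrt(((n_e - 1) * np.var(exp, ddof=1) +
                        (n_c - 1) * np.var(ctrl, ddof=1)) / (n_e + n_c - 2))
    return (np.mean(exp) - np.mean(ctrl)) / s_pooled

def glass_delta(exp, ctrl):
    """Standardized mean difference using the control-group SD only."""
    return (np.mean(exp) - np.mean(ctrl)) / np.std(ctrl, ddof=1)

rng = np.random.default_rng(0)
exp = rng.normal(8.5, 1.5, 40)
ctrl = rng.normal(8.0, 1.5, 40)
print(hedges_g(exp, ctrl), glass_delta(exp, ctrl))
```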
Nominal independent-nominal dependent variable • Contingency measures, e.g.: • Pearson's contingency coefficient • Cramér's V • Phi coefficient • Goodman-Kruskal tau • Uncertainty coefficient • Cohen's kappa
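A sketch of one of these measures, Cramér's V, obtained from the chi-square statistic of a contingency table (the table reuses the earlier 2 × 2 counts purely as an illustration):

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[130, 20],
                  [30, 20]])

chi2, p, dof, expected = chi2_contingency(table, correction=False)
n = table.sum()
cramers_v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
print(f"chi2 = {chi2:.2f}, Cramér's V = {cramers_v:.2f}")   # V equals |phi| for a 2x2 table
```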
Nominal independent-continuous dependent variable • ANOVA: multiple g's • η² • ICC (intraclass correlation coefficient)
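A sketch of η² as the proportion of the total sum of squares accounted for by the grouping factor, computed from illustrative data for three groups:

```python
import numpy as np

groups = [np.array([6.5, 7.2, 8.1, 7.8]),
          np.array([8.0, 8.4, 9.1, 8.6]),
          np.array([7.0, 7.5, 6.9, 7.4])]

grand_mean = np.mean(np.concatenate(groups))
ss_total = sum(((g - grand_mean) ** 2).sum() for g in groups)
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

eta_squared = ss_between / ss_total
print(f"eta^2 = {eta_squared:.2f}")
```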
Continuous independent-continuous dependent variable • r • Non-normal data: Spearman ρ • Ordinal data: Kendall's τ, Somers' D, gamma coefficient • Weighted kappa
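A sketch of the parametric and rank-based measures named above, computed with scipy on illustrative paired data:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr, kendalltau

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 0.5 * x + rng.normal(scale=0.8, size=50)

print("Pearson r   :", pearsonr(x, y)[0])
print("Spearman rho:", spearmanr(x, y)[0])
print("Kendall tau :", kendalltau(x, y)[0])
```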
More complex situations • Two or more independent variables • Regression models
Y continuous: Y_i = a + bX_i + e_i • X continuous: b is estimated by the ordinary least-squares slope • X dichotomous (1 = experimental, 0 = control): b is estimated by the difference between the two group means • Y dichotomous: logit(P(Y = 1)) = a + bX; if X is dichotomous, b is estimated by the log odds ratio
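A sketch illustrating both unstandardized regression effect sizes: with a dummy-coded X the OLS slope equals the raw mean difference, and with a dichotomous Y the logistic slope equals the log odds ratio (the data and the 2 × 2 counts are illustrative):

```python
import numpy as np

# continuous Y, dichotomous X (1 = experimental, 0 = control)
rng = np.random.default_rng(3)
x = np.repeat([0.0, 1.0], 50)
y = 8.0 + 0.5 * x + rng.normal(scale=1.5, size=100)

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # OLS slope
print(b, y[x == 1].mean() - y[x == 0].mean())        # identical values

# dichotomous Y, dichotomous X: the logistic slope is the log odds ratio
a_, b_, c_, d_ = 130, 20, 30, 20                     # 2x2 cell counts
print(np.log((a_ * d_) / (b_ * c_)))                 # ln(4.33) ≈ 1.47
```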
More complex situations • Two or more independent variables • Regression models • Stratification • Contrast analyses in factorial designs (Rosenthal, Rosnow & Rubin, 2000) • Multilevel models • Two or more dependent variables • Single-case studies
Single-case studies: • Y_i = b0 + b1 phase_i + e_i • Y_i = b0 + b1 time_i + b2 phase_i + b3 (time_i × phase_i) + e_i
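A sketch of fitting the second single-case model by ordinary least squares; the series length and the point at which the phase changes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
time = np.arange(20.0)
phase = (time >= 10).astype(float)          # 0 = baseline, 1 = treatment phase
y = 5 + 0.1 * time + 2 * phase + 0.3 * time * phase + rng.normal(scale=1.0, size=20)

# design matrix: intercept, time, phase, time x phase
X = np.column_stack([np.ones_like(time), time, phase, time * phase])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print("b0, b1 (time), b2 (phase), b3 (time x phase):", np.round(b, 2))
```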
Comparability of effect sizes. Example: g_IG (independent groups) vs. g_gain (gain scores)
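One way to see why these two metrics are not directly comparable (a standard result, not spelled out on this slide): the standard deviation of gain scores depends on the pre-post correlation ρ, so the same mean difference yields different numbers in the two metrics. A sketch, assuming equal raw-score SDs and ρ = .7:

```python
import numpy as np

delta_IG, rho = 0.33, 0.7                    # illustrative values
# SD of gain scores: S_gain = S * sqrt(2 * (1 - rho)), hence
delta_gain = delta_IG / np.sqrt(2 * (1 - rho))
print(round(delta_gain, 2))                  # about 0.43: same difference, larger number
```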
Comparability of effect sizes • Estimating different population parameters (e.g., g_IG vs. g_gain) • Estimating with different precision (e.g., g vs. Glass's Δ)
Choosing a measure • Design and measurement level • Assumptions • Popularity • Simplicity of the sampling distribution, e.g.: • Fisher's Z = 0.5 ln[(1+r)/(1−r)] • Log odds ratio • ln(RR) • Directional effect size
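A sketch of these transformations, with an approximate standard error for Fisher's Z added for illustration (the sample values are assumptions):

```python
import numpy as np

r, n = 0.45, 60
fisher_z = 0.5 * np.log((1 + r) / (1 - r))   # approximately normally distributed
se_z = 1 / np.sqrt(n - 3)                    # its (approximate) standard error

log_odds_ratio = np.log(4.33)
log_relative_risk = np.log(1.45)

print(fisher_z, se_z, log_odds_ratio, log_relative_risk)
```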
Threats to effect sizes • ‘Bad data’ • Measurement error • Artificial dichotomization • Imperfect construct validity • Range restriction • Bias