Resampling: power analysis
Austin Hilliard, Feb 19 2009
We've done an experiment...
• Data:

ctrl = 8 13 14 13 16 13 15 3 9 9
exp = 4 13 12 18 12 6 11 4 9 9

boxplot([ctrl exp], 'labels', {'ctrl' 'exp'})
...we've looked at the data...

figure
subplot(1,2,1), hist(ctrl,5), xlabel('ctrl')
subplot(1,2,2), hist(exp,5), xlabel('exp')
...we've done some stats

>> out = bootstrap2groups(@median, ctrl, exp)
out = 
        gp1desc: 13
        gp2desc: 10
       testStat: 3
       distMean: -0.0081
    reflectLeft: -3.0162
      pvalRight: 0.0955
       pvalLeft: 0.0991
           pval: 0.1946
         groups: [10x2 double]

But there was no significant effect!
Did we have sufficient power?
• power = the chance of detecting a difference of a particular size (δ), using a particular statistic, with a particular n
• What can increase power?
  • increased n
  • increased across-group variation
  • decreased within-group variation
  • probably a cheat, but it does increase power: increase α → relax the significance threshold
Our experiment
• What was our n? → 10
• What statistic? → difference of medians
• What was δ? → ~23% decrease
• What was α? → 0.05
• Across-group variation?
• Within-group variation?

ctrl = 8 13 14 13 16 13 15 3 9 9   (median = 13, stdev = 3.97)
exp = 4 13 12 18 12 6 11 4 9 9    (median = 10, stdev = 4.37)
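As a quick check of δ (a minimal sketch; the data are just the values above, entered as column vectors since the later code expects column vectors):

ctrl = [8 13 14 13 16 13 15 3 9 9]';   % control group, column vector
exp  = [4 13 12 18 12 6 11 4 9 9]';    % experimental group, column vector
delta = (median(ctrl) - median(exp)) / median(ctrl)   % (13 - 10) / 13 ≈ 0.23, i.e. ~23% decrease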
Retrospective power
• We got a null result; is it “real”?
• Maybe we didn't have a large enough n
• Maybe we chose poor descriptors/statistic, resulting in a small δ
• Maybe there really is a difference, but our study design prevented us from detecting it (i.e., from reaching p < α)
  • study design = n, statistic, significance level (α)
Retrospective power
• Basic strategy
  • Decide what δ to test → use the actual δ (-23%), or generate exp groups with different δ by scaling the ctrl group
  • Generate a null distribution of the test stat
  • Compute the 95% confidence interval for the null dist (assuming α = .05)
  • Generate an alternate dist of the test stat using δ (preserve the difference between groups when resampling)
  • See how many test stats in the alt dist fall outside the confidence interval of the null dist, divide by the size of the dist → power
What will this tell us?
• Given a true 23% decrease from ctrl to exp, with α = .05, 10 subjects/group, and the difference between medians as our test statistic, the power is the percentage of times we will actually detect this difference
• We create the null distribution as usual, but then, instead of testing just our actual data against it, we test resampled versions of the actual data to see how often we get a significant result
Generate null distribution

ctrl = 8 13 14 13 16 13 15 3 9 9
exp = 4 13 12 18 12 6 11 4 9 9

boxNULL = [ctrl; exp];                            % combine datasets
statsNULL = [];                                   % initialize vec for pseudo stats
for trials = 1:10000                              % run 10,000 times
    pseudo_ctrl = sample(length(ctrl), boxNULL);  % create pseudo groups
    pseudo_exp = sample(length(exp), boxNULL);
    pseudo_stat_null = ...                        % compute pseudo test stat
        median(pseudo_ctrl) - median(pseudo_exp);
    statsNULL = [statsNULL pseudo_stat_null];     % store pseudo test stat
end
NULLconfint = ...                                 % compute 95% confidence interval
    percentile(statsNULL, [.025 .975]);
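Note: sample, percentile, and count (used below) are not built-in MATLAB functions; they appear to be helpers from the course toolbox. A minimal sketch of what they seem to do, inferred from how they are called here (assumed behavior, for readers who want to run the code; define in the workspace or save each as its own .m file):

sample = @(n, data) data(randi(numel(data), n, 1));   % n draws with replacement from the pool 'data'
percentile = @(data, fracs) quantile(data, fracs);    % values at fractional quantiles, e.g. [.025 .975]
                                                      % (quantile requires the Statistics Toolbox)
count = @(x) sum(x);                                  % number of elements satisfying a logical condition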
Null distribution 95% confidence interval

NULLconfint = -3.9700 3.9900
Generate alternate distribution

% Don't combine groups here; we want to preserve the
% difference, since this distribution reflects the
% values of the test stat when there IS a difference
statsALT = [];                                    % initialize vec for pseudo stats
for trials = 1:10000                              % run 10,000 times
    pseudo_ctrlALT = sample(length(ctrl), ctrl);  % create pseudo groups:
    pseudo_expALT = sample(length(exp), exp);     % only sample from ctrl for pseudo ctrl,
                                                  % only sample from exp for pseudo exp
    pseudo_stat_alt = ...                         % compute pseudo test stat
        median(pseudo_ctrlALT) - median(pseudo_expALT);
    statsALT = [statsALT pseudo_stat_alt];        % store pseudo test stat
end
Compute power

% Count test stats from the alternate dist that
% fall outside the significance threshold
% (defined here as the 95% confidence interval) of the null dist
right_power = count(statsALT > NULLconfint(2));   % test stats in the alt dist greater than
                                                  % the 97.5th percentile of the null dist
left_power = count(statsALT < NULLconfint(1));    % test stats in the alt dist less than
                                                  % the 2.5th percentile of the null dist
power = (right_power + left_power) / trials       % sum counts and divide by size of dist
                                                  % to compute power
power = 0.2963 → NOT GOOD

mean(statsNULL) = -.02    mean(statsALT) = 2.78
std(statsNULL) = 1.99     std(statsALT) = 2.11
Test different values of δ
[Figure: power for different values of δ, with our experiment's actual δ (~23% decrease) marked]
Try a different test stat
• difference of means, instead of difference of medians
• but remember what our data looked like? The mean is a poor descriptor here → likely over-estimates power
Try a different test stat
• absolute value of the difference of medians
function out = power_EXAMPLE(ctrl, change, Func)
% ctrl is the control group, should be a column vector
% change is the factor to scale the ctrl group by to create a phantom exp
% Func is the method for computing descriptors

exp = change * ctrl;

% --- null hypothesis ---
boxNULL = [ctrl; exp];                            % combine datasets
statsNULL = [];                                   % initialize vec for pseudo stats
for trials = 1:10000                              % run 10,000 times
    pseudo_ctrl = sample(length(ctrl), boxNULL);  % create pseudo groups
    pseudo_exp = sample(length(exp), boxNULL);
    pseudo_stat_null = ...                        % compute pseudo test stat
        feval(Func, pseudo_ctrl) - feval(Func, pseudo_exp);
    statsNULL = [statsNULL pseudo_stat_null];     % store pseudo test stat
end
out.NULLconfint = ...                             % compute 95% confidence interval
    percentile(statsNULL, [.025 .975]);

% --- alternate hypothesis ---
% Don't combine groups here; we want to preserve the
% difference, since this distribution reflects the
% values of the test stat when there IS a difference
statsALT = [];                                    % initialize vec for pseudo stats
for trials = 1:10000                              % run 10,000 times
    pseudo_ctrlALT = sample(length(ctrl), ctrl);  % create pseudo groups:
    pseudo_expALT = sample(length(exp), exp);     % only sample from ctrl for pseudo ctrl,
                                                  % only sample from exp for pseudo exp
    pseudo_stat_alt = ...                         % compute pseudo test stat
        feval(Func, pseudo_ctrlALT) - feval(Func, pseudo_expALT);
    statsALT = [statsALT pseudo_stat_alt];        % store pseudo test stat
end

% Count test stats from the alternate dist that
% fall outside the significance threshold
% (defined here as the 95% confidence interval) of the null dist
out.right_count = count(statsALT > out.NULLconfint(2));  % test stats in the alt dist greater than
                                                         % the 97.5th percentile of the null dist
out.left_count = count(statsALT < out.NULLconfint(1));   % test stats in the alt dist less than
                                                         % the 2.5th percentile of the null dist
out.power = (out.right_count + out.left_count) / trials; % sum counts and divide by size of dist
                                                         % to compute power

figure
subplot(1,2,1), hist(statsNULL), title('Null distribution')
subplot(1,2,2), hist(statsALT), title('Alternate distribution')
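As a usage sketch (the change factor 0.77 corresponds to the actual ~23% decrease; the output variable names are just illustrative):

out_med  = power_EXAMPLE(ctrl, 0.77, @median);   % retrospective power at the observed delta, using medians
out_med.power
out_mean = power_EXAMPLE(ctrl, 0.77, @mean);     % same delta, but with the mean as the descriptor (see caveat above)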
% Use this script to try different values of 'change'
% using power_EXAMPLE.m
% 'change' is a scalar used to create the phantom exp group by scaling the ctrl group
% example: change = .9 will create a phantom exp group that shows a 10%
% decrease compared to ctrl, since ctrl will be multiplied by .9

clear powers
clear conf_ints
clear runs
clear more

powers = [];
conf_ints = [];
runs = [.99 .95 .9 .85 .8 .75 .7 .65 .6 .55 .5 .45 .4 .35 .3 .25 .2 .15 .1 .01];
more = length(runs);

for it = runs
    out = power_EXAMPLE(ctrl, it, @median);
    powers = [powers out.power];
    conf_ints = [conf_ints; out.NULLconfint];
    more                       % display countdown of runs remaining
    more = more - 1;
end

figure
subplot(1,2,1)
plot(100.*(1-runs), powers, '-o', 'MarkerEdgeColor', 'b', 'MarkerSize', 12, 'Color', 'k')
xlabel('Percent decrease from ctrl -> exp', 'fontsize', 14)
ylabel('Power', 'fontsize', 14)
subplot(1,2,2)
hold
plot(100.*(1-runs), conf_ints(:,1), 'x', 'Color', 'r'), text(1, conf_ints(1,1)-2, 'lower bound')
plot(100.*(1-runs), conf_ints(:,2), 'o', 'Color', 'b'), text(1, conf_ints(1,2)+2, 'upper bound')
xlabel('Percent decrease from ctrl -> exp', 'fontsize', 14)
ylabel('Null distribution 95% confidence interval bounds', 'fontsize', 14)
hold
Best way to increase power?
• Increase n
• Our power seemed to plateau around .7 (when using medians)
• Maybe we just need more subjects
• How many more? → Prospective power analysis
Prospective power
• Basic strategy
  • Choose the n and δ to test
  • Generate a phantom ctrl group
    • sample n times from a normal distribution with the mean and standard deviation of the real ctrl group
  • Generate a phantom exp group
    • scale the phantom ctrl by δ, or use pilot data to generate a phantom exp with a bigger n, as with the ctrl group above
  • Proceed with the “retrospective” power analysis
Generate groups

function power = power_prospectEXAMPLE(ctrl, change, n)
% ctrl is the control group
% change is the factor to scale ctrl by to create exp
% n is the sample size to test

u = mean(ctrl);                  % compute mean and stdev of the actual ctrl group
sig = std(ctrl);

ctrl = normrnd(u, sig, n, 1);    % generate the phantom ctrl group by sampling n times
                                 % from a normal dist w/ the same mean and stdev
                                 % as the real ctrl group
exp = change * ctrl;             % generate the exp group by scaling the ctrls

...Now just run the retrospective analysis on these groups
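One possible way to complete the function (a sketch, not necessarily how the original power_prospectEXAMPLE finished) is to hand the phantom ctrl group back to power_EXAMPLE, which recreates exp = change * ctrl and runs the same null/alternate resampling:

out = power_EXAMPLE(ctrl, change, @median);   % retrospective analysis on the phantom groups
power = out.power;

Note that power_EXAMPLE opens a figure on every call, so the histogram plotting may be worth suppressing when it is called in a loop.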
Variability
• Prospective power values can fluctuate from run to run, so run the analysis at least 10 times
• Example: n = 20, run 100x → mean power = .47, median power = .43
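A minimal sketch of that repetition, assuming power_prospectEXAMPLE returns a single power value as above:

nruns = 100;                     % e.g. the 100x example on this slide
pvals = zeros(1, nruns);
for r = 1:nruns
    pvals(r) = power_prospectEXAMPLE(ctrl, 0.77, 20);   % n = 20, ~23% decrease
end
mean(pvals)      % ~.47 in the example above
median(pvals)    % ~.43 in the example above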
Try different values of n
• Use δ from the actual data: 23% decrease from ctrl → exp
• Each point is the mean of 10 runs for that n
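A sketch of how that sweep could be run with the functions above (the list of n values is illustrative):

ns = 10:10:100;                      % candidate sample sizes to test (illustrative)
nreps = 10;                          % runs per n, as on the slide
meanPower = zeros(size(ns));
for i = 1:length(ns)
    p = zeros(1, nreps);
    for r = 1:nreps
        p(r) = power_prospectEXAMPLE(ctrl, 0.77, ns(i));   % delta = 23% decrease
    end
    meanPower(i) = mean(p);
end
figure
plot(ns, meanPower, '-o')
xlabel('n per group'), ylabel('Power (mean of 10 runs)')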