Subgroup Analysis in Clinical Trials: The Skeptic, The Believer, and The Agnostic

Should subgroup analyses be performed if the overall results of a clinical trial show no difference between treatment groups?
PUBH7420 – Group 7
April 23rd, 2012

Three types of people
• The Skeptic
  • Automatically dismisses subgroup findings as most likely due to the wiliness of chance
• The Believer
  • Data are unlikely to play tricks; data could indicate that a subgroup truly differs from others
• The Agnostic
  • Perhaps sides with the skeptic but has an uneasy feeling that subgroup analysis may in fact show something real
Witte (2009)

The Skeptic
• The trial probably was not powered to find differences within subgroups, thus:
  • Subgroup analysis is a great way to all but guarantee finding a statistically significant difference somewhere: test enough subgroups and one will cross the threshold by chance.
  • If we divide the data into subgroups, even just 2, it is unlikely that we’ll see the same estimated effect in each subgroup.
  • If the overall treatment effect seems to be approaching significance, one subgroup may appear to have a large effect despite reduced power and an inflated type 1 error rate.
• These analyses are typically not pre-planned or pre-specified.
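
The Skeptic's "guarantee" can be made concrete with a small sketch. It assumes k independent subgroup tests, each run at a two-sided level of 0.05, with no true treatment effect anywhere; real subgroups often overlap or are correlated, so this is an illustration rather than a description of any particular trial. The function names are hypothetical.

```python
import random
from statistics import NormalDist


def familywise_error(k, alpha=0.05):
    """Chance of at least one false positive among k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k


def simulate_any_significant(k, alpha=0.05, n_trials=20000, seed=0):
    """Simulate null trials and count how often any of k subgroups looks significant."""
    rng = random.Random(seed)
    # Two-sided critical value for a standard normal test statistic
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(n_trials):
        # Under the null, each subgroup's z-statistic is standard normal
        if any(abs(rng.gauss(0, 1)) > crit for _ in range(k)):
            hits += 1
    return hits / n_trials
```

Under these assumptions, `familywise_error(2)` is 0.0975 and `familywise_error(10)` is roughly 0.40, so ten unplanned subgroup looks give about a 40% chance of at least one spurious "finding".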

The Believer
• Pre-planned is not intrinsically better than post-hoc.
  • Is the strength of evidence for a hypothesis based upon whether someone asked the question before or after the data were collected?
  • Is it more likely to be right simply because somebody thought of it?
  • Perhaps, but then what we are doing is implicitly putting prior beliefs on hypotheses.
• If the estimated treatment effect differs between subgroups, the data are providing real information which we should acknowledge.

The Skeptic
• The overall result of an RCT is a better estimate of treatment effect in the various subgroups examined than are the observed effects in those individual subgroups.
  • In statistics, we call this phenomenon Stein’s paradox, and it applies to meta-analyses as well.
• An observed difference between subgroups is not necessarily due to the characteristic used to define the subgroups.
  • The difference could arise due to chance.
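
One way to see the Skeptic's point is to compare raw subgroup estimates with estimates pulled toward the overall mean. The sketch below uses a positive-part James-Stein estimator that shrinks toward the mean of the subgroup estimates; this is a technique chosen here for illustration, not something the slides specify. It assumes at least four subgroups, a known and equal standard error in every subgroup, and normally distributed estimates. All names are hypothetical.

```python
import random


def james_stein_toward_mean(estimates, se):
    """Shrink subgroup estimates toward their mean (positive-part James-Stein).

    Assumes each estimate has the same known standard error `se`.
    """
    k = len(estimates)
    if k < 4:
        raise ValueError("shrinking toward the mean needs at least 4 subgroups")
    mean = sum(estimates) / k
    spread = sum((y - mean) ** 2 for y in estimates)
    if spread == 0:
        return list(estimates)
    # Shrinkage factor: near 0 when subgroup differences look like pure noise
    shrink = max(0.0, 1 - (k - 3) * se ** 2 / spread)
    return [mean + shrink * (y - mean) for y in estimates]


def compare_squared_error(true_effects, se, n_sims=5000, seed=1):
    """Average total squared error of raw vs. shrunken subgroup estimates."""
    rng = random.Random(seed)
    raw_total = 0.0
    shrunk_total = 0.0
    for _ in range(n_sims):
        observed = [rng.gauss(t, se) for t in true_effects]
        shrunk = james_stein_toward_mean(observed, se)
        raw_total += sum((o - t) ** 2 for o, t in zip(observed, true_effects))
        shrunk_total += sum((s - t) ** 2 for s, t in zip(shrunk, true_effects))
    return raw_total / n_sims, shrunk_total / n_sims
```

Calling `compare_squared_error([0.2] * 6, se=0.5)` simulates six subgroups that share one true effect; the shrunken estimates should show a smaller average squared error than the raw subgroup estimates, which is the sense in which the pooled result is the "better" estimate.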

The Believer
• Stein’s paradox is a heuristic.
  • Does this point apply to hypothesis testing? Power is greater with the larger sample size of the full sample, but if there is a statistically significant result in the subgroup analysis, doesn’t that mean low power was overcome?
• Randomization should diminish differences that arise by chance: confounding is controlled for, and we expect confounders to be distributed similarly between treatment groups.

The Believer
• Challenge the authority of the principle that thou shalt adjust for multiple testing.
  • This principle applies uncontroversially if there is a compelling reason to group the tests/intervals as part of one decision process.
  • Alternatively, if you want to protect against an inflated type 1 error rate, adjust for the multiple tests.
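
For readers who do decide to treat a set of subgroup tests as one decision process, the adjustment itself is mechanical. The sketch below shows two standard family-wise corrections, Bonferroni and Holm; the slides do not name a method, so the choice of these two and the function names are assumptions made for illustration.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply each by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity: an adjusted p-value never drops below an earlier one
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted
```

With four subgroup p-values of 0.01, 0.04, 0.03, and 0.20, Bonferroni gives 0.04, 0.16, 0.12, and 0.80, while Holm gives 0.04, 0.09, 0.09, and 0.20. Holm is never more conservative than Bonferroni and controls the same family-wise error rate.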

The Skeptic
• Post-hoc subgroup analysis should only be considered as a hypothesis-generating mechanism for further research.
  • Caveat: subgroup analyses can lead, and have led, to follow-up trials that waste time and money and put patients at risk.
  • The course text has numerous examples of failed trials based on the findings of a subgroup analysis.
• Publication bias: significant subgroup findings are more likely to be reported than null ones.

What to do…
• Distinguish between prior and data-derived hypotheses.
• Do not calculate p-values for data-derived hypotheses.
• Place greater emphasis on the overall result than on what may be apparent within a particular subgroup.
• Use tests of interaction and/or correct for the multiplicity of statistical comparisons.
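
A test of interaction asks whether two subgroup effects differ from each other, rather than whether each differs from zero. A minimal sketch follows, assuming two independent subgroups whose effect estimates (for example, mean differences or log hazard ratios) are approximately normal with known standard errors. The function name and the numbers in the usage note are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def interaction_test(effect_a, se_a, effect_b, se_b):
    """Two-sided z-test for the difference between two independent subgroup effects.

    Returns (difference, standard error of the difference, z, p-value).
    """
    diff = effect_a - effect_b
    # Variances add because the subgroups are assumed independent
    se_diff = sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, se_diff, z, p
```

Suppose subgroup A shows an effect of -0.50 (SE 0.20, so z = -2.5, nominally significant) and subgroup B shows -0.10 (SE 0.20, not significant). `interaction_test(-0.50, 0.20, -0.10, 0.20)` gives a p-value of about 0.16: "significant in one subgroup but not the other" is not, by itself, evidence that the subgroups differ.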
