Case Study 8 The Physicians’ Health Study of Aspirin in the prevention of Myocardial Infarction: Power and Gender Discrimination? Principal Investigator: C. Hennekens, MD.
Materials: The Physicians’ Health Study: The PHS web site: http://phs.bwh.harvard.edu/ . Gender discrimination: Dougherty & Coulter. “Gender Balance in Cardiovascular Research: Importance to Women's Health.” Tex Heart Inst J. 38 (2011): 148–150.
Key words: Power, Sample size, Gender discrimination, Patient sample enrichment [with males].
Rick Chappell, Ph.D., Professor, Department of Biostatistics and Medical Informatics, University of Wisconsin Medical School. Stat 542 – Spring 2018, Week ??
Outline:
A. Background for the PHS
B. The PHS’ Sample Size Calculations
C. Does selecting only males for the PHS imply gender discrimination?
A. Background for the PHS - from their website: "A team of investigators from Harvard ... created the PHS as an effective way to test whether aspirin did, indeed, prevent myocardial infarction ... The investigators decided to also examine whether beta-carotene, which at the time was being touted as a wonder drug, prevented cancer. Why choose physicians as the study population? ... As a group, physicians would report their medical histories and health status more accurately than participants drawn from a general population. They would also be more likely to identify possible side effects of the study agents."
"Under the direction of PI Charles H. Hennekens, MD, and with funding from the NCI and the NHI, the PHS was launched in 1980. It was the first large, randomized trial conducted entirely by mail.” “A total of 11,037 [male] physicians were randomized to aspirin and 11,034 to aspirin placebo.”
"It employed ... a 2x2 factorial design [to examine both MI and colon cancer prevention strategies]. It assigned participants to get one of four possible [aspirin/beta-carotene/double-dummy placebo] combinations ... .” “The combination of a trial by mail and the 2x2 factorial design allowed the PHS to be conducted at a fraction of the cost of a standard primary prevention trial."
Justifying the Sample Size – are 22,071 Subjects Enough? Apparently not for purposes of detecting an effect of aspirin on CV mortality. From the study publication: “ . . . the evidence [of a beneficial prophylactic effect of aspirin] concerning stroke and total cardiovascular deaths remains inconclusive because of the inadequate numbers of physicians with these end points.” "The Data Monitoring Board recommended the early termination of the blinded aspirin component of the trial on December 18, 1987. This decision was based on all available evidence, including ... the fact that no effect of aspirin on cardiovascular mortality could be detected in the trial until the year 2000 or later."
Despite the lack of an effect of aspirin on CV mortality, the primary outcome, there was a substantial effect on first MI. The website puts a more favorable spin on the results: “The trial's DSMB stopped the aspirin arm of the PHS several years ahead of schedule because it was clear that aspirin had a significant effect on the risk of a first myocardial infarction. As reported in the July 20, 1989 NEJM, aspirin reduced the risk of first myocardial infarction by 44% (P less than 0.00001).” Both this statement and the preceding one are true.
B. The PHS’ Sample Size Calculations: Using the observed proportion of cardiovascular mortality amongst (male) subjects in the PHS receiving placebo (as of 12/87): $p_c = 44/(.5 \times 22071) = .0040$, and with a 25% reduction with aspirin, $p_t = .75 \times p_c = .0030$. Say size $\alpha = .05$ (one-tailed) and $1 - \beta$ = power = 90%. Then
$$n = \frac{(Z_\alpha + Z_\beta)^2}{\left[\sin^{-1}\sqrt{p_t} - \sin^{-1}\sqrt{p_c}\right]^2} = \frac{(1.645 + 1.282)^2}{\left[\sin^{-1}\sqrt{.0030} - \sin^{-1}\sqrt{.0040}\right]^2} \approx 119{,}000.$$
Whoops.
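The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `arcsine_sample_size` and its parameters are illustrative, the Z values come from the standard library's normal quantile function instead of the rounded 1.645 and 1.282, and reading n as the total over both arms is an interpretation (the formula has no per-arm factor of 2).

```python
from math import asin, sqrt
from statistics import NormalDist


def arcsine_sample_size(p_control, p_treated, alpha=0.05, power=0.90):
    """Sample size from the arcsine-square-root formula, one-tailed size alpha.

    Hypothetical helper for illustration; interpreted here as the total
    number of subjects across both arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # about 1.645 for alpha = .05
    z_beta = NormalDist().inv_cdf(power)       # about 1.282 for 90% power
    delta = asin(sqrt(p_treated)) - asin(sqrt(p_control))
    return (z_alpha + z_beta) ** 2 / delta ** 2


# Observed placebo CV mortality (.0040) and a 25% reduction (.0030).
n_observed_rate = arcsine_sample_size(0.0040, 0.0030)
print(f"Required sample size: {n_observed_rate:,.0f}")
```

The result lands close to the 119,000 quoted above, several times the 22,071 physicians actually enrolled.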
How did the PHS’ Designers Settle on 22,071 Subjects?
How could the designers have justified cardiovascular mortality as an endpoint with "only" 22,071 subjects? They originally expected a rate of 733/22071 = .033 events per 4.7 years (average followup), drawn from figures for U.S. white males (which proved to be off by an order of magnitude). Using the same power as above and size = .05 (the figures below correspond to a two-tailed test, Z = 1.96, not the one-tailed 1.645 used above):
Suppose the placebo rate is .033 and aspirin causes a 50% reduction for a rate of .0165. Then the predicted sample size is 3,620.
Suppose the placebo rate is .033 and aspirin causes a 25% reduction for a rate of .025. Then the predicted sample size is 18,408.
Note that sample size is roughly inversely proportional to treatment effect squared.
How could the designers have justified cardiovascular mortality as an endpoint with "only" 22,071 subjects? The study was originally designed to last 8 years, for a (conservatively extrapolated) rate of .033 x 8/4.7 = .056. Suppose the placebo rate for physicians is less than half of this, .025; aspirin causes a 25% reduction for a rate of .0188. Then the predicted sample size is 22,912.
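The design scenarios can be recomputed with the same arcsine formula. The sketch below is not the designers' own calculation. Recomputing suggests the quoted figures (3,620; 18,408; 22,912) correspond to a two-tailed size of .05 (Z = 1.96) and, in the last case, to the unrounded treated rate .01875; both are inferences from the arithmetic, not statements made in the slides. The helper name is hypothetical.

```python
from math import asin, sqrt
from statistics import NormalDist


def arcsine_sample_size(p_control, p_treated, z_alpha, z_beta):
    """Arcsine-square-root sample size formula (hypothetical helper)."""
    delta = asin(sqrt(p_treated)) - asin(sqrt(p_control))
    return (z_alpha + z_beta) ** 2 / delta ** 2


z_beta = NormalDist().inv_cdf(0.90)         # 90% power, about 1.282
z_two_tailed = NormalDist().inv_cdf(0.975)  # ASSUMPTION: two-tailed .05, about 1.96

# (placebo rate, aspirin rate) for each scenario discussed in the text.
scenarios = {
    "50% reduction from .033": (0.033, 0.0165),
    "25% reduction from .033 (rate rounded to .025)": (0.033, 0.025),
    "25% reduction from .025 (unrounded .01875)": (0.025, 0.01875),
}

results = {
    label: arcsine_sample_size(p_c, p_t, z_two_tailed, z_beta)
    for label, (p_c, p_t) in scenarios.items()
}

for label, n in results.items():
    print(f"{label}: n = {n:,.0f}")
```

Halving the reduction from 50% to 25% multiplies the required sample size by about five here, somewhat more than the factor of four that the inverse-square rule of thumb alone would suggest.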
C. Does selecting only males for the PHS imply gender discrimination? The PHS has been used as an example of excluding females from research. Dougherty & Coulter say: "Cardiovascular disease is an equal opportunity killer. Nonetheless, the typical CVD clinical trial comprises a population that is 85% male; those women who do participate are predominantly postmenopausal. Consider the Coronary Drug Project, the 1st major clinical trial funded in 1965 by the ... NHLBI.”
Further: "Why were randomized controlled trials systematically avoided in women in the 20th century? Perhaps the most proximate cause was the contemporary revelation of horrific birth defects in children exposed to thalidomide in utero. The recognition that pharmaceutical intervention in pregnant women could result in fetal injury chilled testing of investigational new drugs in women of childbearing potential and laid the legislative groundwork for the modern FDA."
Another perspective via a thought-experiment: What would be the best trial population to study if we are concerned with preventing heart attacks only in women using non-hormonal treatments? Should it be one enriched with men because of their higher event rates & presumably higher treatment effect?
The number of U.S. heart disease deaths per 100,000 in 2015 for women (134) was slightly under 2/3 that for men (211). Suppose that this ratio holds for both the aspirin and placebo groups, so that the treatment effect is about 2/3 as big for women as for men. Then the sample size will need to be $1/(134/211)^2 \approx 2.48$ (roughly 2.5) times bigger, or 22,912 x 2.48 ≈ 56,821.
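The scaling step can be written out explicitly. This sketch only applies the inverse-square rule of thumb stated earlier to the 134 and 211 figures; it does not rerun the arcsine formula with women's event rates, which would be a separate calculation. Variable names are illustrative.

```python
# Heart disease deaths per 100,000 in 2015, as quoted in the text.
women_rate = 134
men_rate = 211

# ASSUMPTION (from the text): the treatment effect scales with this ratio.
effect_ratio = women_rate / men_rate

# Rule of thumb: sample size is inversely proportional to effect squared.
inflation = 1 / effect_ratio ** 2

n_men = 22_912  # predicted sample size for the all-male design
n_women = n_men * inflation

print(f"effect ratio = {effect_ratio:.3f}")
print(f"inflation factor = {inflation:.2f}")
print(f"women-only sample size = {n_women:,.0f}")
```

The unrounded ratio gives an inflation factor near 2.48. Using exactly 2/3 would give 2.25, so the figure in the text depends on the unrounded rates.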
Is gender-specificity worth it, or do we prefer the enriched population? Do men make a good experimental animal for cardiovascular disease prevention in women?