Follow-up, compliance and analysis. Analysis (very brief): Standard analysis More exotic stuff Compliance/adherence How to measure Why bother? Follow-up Importance of complete follow-up Analysis issues: ITT, etc. Special topics Subgroup analysis.
Follow-up, compliance and analysis • Analysis (very brief): • Standard analysis • More exotic stuff • Compliance/adherence • How to measure • Why bother? • Follow-up • Importance of complete follow-up • Analysis issues: ITT, etc. • Special topics • Subgroup analysis
Analysis for clinical trials (review?) • 2 groups simplest • Analysis depends on type of outcome variable • Continuous • Binary • Binary, time to event
Analysis of trials with continuous outcomes • Compare mean in placebo with mean in active • e.g., effect of statins on lipids, beta-blocker on MI • Usually compare mean change across two groups • Increased power • Can compare “after” only • Use t-test if normal distribution or close to it • If radically non-normal, use non-parametric analogue
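A minimal sketch of the comparison of mean change between two groups. The slide only says "t-test"; Welch's unequal-variance form is used here as an assumption, and the `active` and `placebo` values are hypothetical % changes, not trial data.

```python
import math
import statistics


def welch_t(changes_active, changes_placebo):
    """Return (difference in mean change, t statistic, approximate df)."""
    n_a, n_p = len(changes_active), len(changes_placebo)
    mean_a = statistics.mean(changes_active)
    mean_p = statistics.mean(changes_placebo)
    var_a = statistics.variance(changes_active)    # sample variance
    var_p = statistics.variance(changes_placebo)
    se = math.sqrt(var_a / n_a + var_p / n_p)      # SE of the difference
    t = (mean_a - mean_p) / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (var_a / n_a + var_p / n_p) ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_p / n_p) ** 2 / (n_p - 1)
    )
    return mean_a - mean_p, t, df


# Hypothetical % change from baseline in each participant
active = [2.1, 3.0, 1.8, 2.6, 2.9, 2.2]
placebo = [-0.4, 0.3, -1.0, 0.1, -0.6, 0.2]

diff, t_stat, df = welch_t(active, placebo)
print(f"difference in mean change = {diff:.2f}, t = {t_stat:.2f}, df = {df:.1f}")
```

The t statistic is then referred to a t distribution with `df` degrees of freedom to get the p-value.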
Multiple Outcomes of Raloxifene Evaluation (MORE Trial)* • 7,705 postmenopausal women with: • BMD T-score below -2.5 or vertebral fractures • International, 189 centers • Placebo vs. 60 or 120 mg raloxifene (a SERM) • * Ettinger, Black, et al. JAMA, 8/99
Effect of Raloxifene on BMD • [Figure: two panels, Lumbar Spine and Hip; y-axis % change in BMD (-2 to 4), x-axis months (0 to 36); raloxifene (RLX) vs. placebo (PBO) curves, annotated 2.5%* and 2%*] • *p<.01 (t-test)
Analysis of trials with binary outcomes • Compare proportion in placebo vs. active groups • e.g., occurrence of vertebral fracture on baseline vs. follow-up x-ray (yes/no, don’t know date) • Use a chi-square test
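A minimal sketch of the binary-outcome comparison: relative risk with a log-scale 95% CI, plus a Pearson chi-square test on the 2x2 table. The counts are hypothetical, and the function name `two_by_two` is illustrative only. For 1 degree of freedom the chi-square p-value can be written with the complementary error function, so no external library is needed.

```python
import math


def two_by_two(events_active, n_active, events_placebo, n_placebo):
    """Return (RR, (lower, upper) 95% CI, chi-square statistic, p-value)."""
    rr = (events_active / n_active) / (events_placebo / n_placebo)
    # Standard error of log(RR)
    se_log_rr = math.sqrt(
        1 / events_active - 1 / n_active + 1 / events_placebo - 1 / n_placebo
    )
    lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
    upper = math.exp(math.log(rr) + 1.96 * se_log_rr)
    # Pearson chi-square for a 2x2 table (no continuity correction)
    a, b = events_active, n_active - events_active
    c, d = events_placebo, n_placebo - events_placebo
    n = n_active + n_placebo
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of chi-square with 1 df
    p = math.erfc(math.sqrt(chi2 / 2))
    return rr, (lower, upper), chi2, p


# Hypothetical trial: 50/1000 fractures on active, 100/1000 on placebo
rr, ci, chi2, p = two_by_two(50, 1000, 100, 1000)
print(f"RR = {rr:.2f} ({ci[0]:.2f}, {ci[1]:.2f}), chi2 = {chi2:.1f}, p = {p:.2g}")
```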
3 Years of Raloxifene: Effect on Vertebral Fracture • [Figure: % with fracture for PBO, RLX 60 and RLX 120] • RR 0.65 (0.53, 0.79) (p<.01) • RR 0.54 (0.44, 0.67) (p<.01)
Analysis of trials with time-to-event outcomes • Compare survival curves in active vs. placebo groups • HERS trial: 1st occurrence of secondary MI • Adjust for differential follow-up time • Due to long recruitment period • Conceptual: • Everyone will have the event if followed long enough • Those without event are censored • Use log rank test • Stratified chi-square at each “failure” time • Equivalent to proportional hazards model with single binary predictor
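A minimal sketch of the log rank test as described above: at each failure time, compare the observed events in one group with the number expected from the combined risk set, then sum over failure times. The follow-up times and event indicators are hypothetical, and `log_rank` is an illustrative helper, not a library function.

```python
import math


def log_rank(times_a, events_a, times_b, events_b):
    """Two-group log rank test; events are 1 (event) or 0 (censored).

    Returns (observed minus expected events in group A, chi-square, p).
    """
    data = [(t, e, "a") for t, e in zip(times_a, events_a)]
    data += [(t, e, "b") for t, e in zip(times_b, events_b)]
    failure_times = sorted({t for t, e, _ in data if e == 1})

    obs_minus_exp = 0.0
    variance = 0.0
    for ft in failure_times:
        # Risk sets: everyone still under follow-up just before ft
        n_a = sum(1 for t, _, g in data if g == "a" and t >= ft)
        n_b = sum(1 for t, _, g in data if g == "b" and t >= ft)
        d_a = sum(1 for t, e, g in data if g == "a" and t == ft and e == 1)
        d_b = sum(1 for t, e, g in data if g == "b" and t == ft and e == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:  # hypergeometric variance of d_a at this failure time
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    chi2 = obs_minus_exp ** 2 / variance
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return obs_minus_exp, chi2, p


# Hypothetical months to first event (1) or censoring (0)
active_t = [6, 7, 10, 15, 19, 25, 30, 34, 36, 36]
active_e = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0]
placebo_t = [1, 2, 3, 4, 5, 8, 9, 12, 20, 36]
placebo_e = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]

o_minus_e, chi2_lr, p_lr = log_rank(active_t, active_e, placebo_t, placebo_e)
print(f"O - E (active) = {o_minus_e:.2f}, chi2 = {chi2_lr:.2f}, p = {p_lr:.3f}")
```

A negative O - E for the active group means fewer events than expected if the two survival curves were the same.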
Raloxifene and Risk of Breast Cancer (MORE trial) • [Figure: % of participants (0 to 1.25) over 0 to 4 years, placebo vs. raloxifene] • Placebo: 3.8 per 1,000 • Raloxifene: 1.7 per 1,000 • p < 0.001
3 Years of Raloxifene Did Not Significantly Decrease Risk of Non-spine Fractures • [Figure: % with fractures (0 to 15) over 0 to 36 months, placebo vs. raloxifene (60 + 120)] • RR* = 0.91 (0.79, 1.06) • * relative hazard from PH model
Analysis for clinical trials: more exotic stuff • Repeated measures analysis • When outcome is repeated • Continuous: several measurements • Dichotomous: more than one occurrence of event • Cluster randomization designs • Randomize/analyze clusters • Techniques for correlated data (random effects ANOVA, etc.) • Adjusted analysis • Use linear regression, logistic or PH to adjust for BL variables • Problematic unless specified a priori (never)
Follow-up in RCTs • What happens after randomization • Carefully lay out procedures to be followed • Describe on forms and in Operations Manual • First reaction: do everything on everyone at every visit • e.g. labs at all visits • But great opportunities for efficiencies • Ask the following: • Do only at some visits? • Do only on a subset? • Don’t do at all?
Large and Simple Trials • Get a whole lot of people • Randomize, do as few follow-up measurements as possible • Difficult to carry out in practice • Examples • Physicians’ Health study: Randomize to aspirin or placebo, mail out drugs, follow-up by mail • Use data collected for other purposes for follow-up/endpoints • Population mortality • Medical info (Medicare, Kaiser)
Compliance or (more politically correct) adherence • Trial is meaningless unless participants adhere to interventions • Two aspects • 1. Adherence to medications/interventions • 2. Adherence to visit schedules/reporting • Lack of adherence leads to: • Bias • Decreased power • Uninterpretable results
Effect of incomplete visit follow-up on results in clinical trials • Fracture Intervention Trial (alendronate vs. placebo) • X-rays obtained at baseline, 2 years, 3 years • Vertebral fractures defined from changes in radiographs • FU radiographs on 97% of participants at year 3
Time (yrs)   Relative risk (CI)
BL to 2      0.34 (66% reduction)
BL to 3      0.49 (51% reduction)
Effect of Incomplete Follow-up: Virtual Experiment • FIT I: Follow-up x-rays on 97% of surviving participants at year 3 • What if follow-up less complete? • Randomly “lose” 50% between year 2 and 3
Use of Survival Analysis for X-Rays in FIT I: Virtual Experiment
Time (yrs)      Relative risk
2               0.34
3               0.49
3 (50% LTFU)    0.37
LTFU = Lost to follow-up
Effect of High Rate of Loss to Follow-up on Results • If early results differ from later results, could create bias when comparing one study to another • Even a “random” (therefore unbiased) loss to follow-up can affect results
Measuring adherence • Medication-taking • Just ask! (self report) • Pill counts • Biochemical assays for some drugs • High tech pill bottles • Visit schedule • N missed visits • Visits within schedule • etc.
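A minimal sketch of two of the measures listed above: a pill-count adherence fraction and a tally of visits within the scheduling window. The function names, the 14-day window and all input numbers are assumptions for illustration; each trial defines its own rules.

```python
def pill_count_adherence(dispensed, returned, days_elapsed, pills_per_day=1):
    """Fraction of expected pills apparently taken, from a pill count."""
    expected = days_elapsed * pills_per_day
    if expected <= 0:
        raise ValueError("days_elapsed and pills_per_day must be positive")
    return (dispensed - returned) / expected


def visit_summary(scheduled_days, actual_days, window_days=14):
    """Count on-time, out-of-window and missed visits.

    actual_days holds the study day of each visit, or None if missed.
    """
    summary = {"on_time": 0, "out_of_window": 0, "missed": 0}
    for scheduled, actual in zip(scheduled_days, actual_days):
        if actual is None:
            summary["missed"] += 1
        elif abs(actual - scheduled) <= window_days:
            summary["on_time"] += 1
        else:
            summary["out_of_window"] += 1
    return summary


# Hypothetical participant: 100 pills dispensed, 28 returned after 90 days
adherence = pill_count_adherence(dispensed=100, returned=28, days_elapsed=90)
visits = visit_summary([180, 360, 540], [185, None, 560])
print(f"pill-count adherence = {adherence:.0%}", visits)
```

Pill counts assume every missing pill was swallowed, which is one reason they are usually combined with self-report or assays.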
Adherence goals • Ideal: all participants continue to take medication (perfectly) throughout the trial and attend all follow-up visits until the very end • Why might participants stop medication? • Side effects (real or perceived) • Complex regimens • Want to take true active medication • New info on old medication • New competing medication • Want to stop active medication • New info on old medication (e.g., ERT increases BC risk)
Some Examples of “Bad Adherence Days” • Women’s Health Initiative • After first year, letter sent to all participants “observed a small increase in cardiovascular disease among ppts on HRT”… • Many stopped medications • PROOF trial (effect of Calcitonin on osteoporosis) • 1994 to 1999 • 1997: Alendronate approved with significant marketing and excellent results
Effect of stopping medication: Classical interpretation • Placebo participants start active medication ==> become more like actives • Actives stop active medication and start “inactive” ==> become more like placebo • Two groups become more similar • Treatment effect is underestimated/conservative • Comforting • “Classical interpretation” may not hold: • Example: patients stop study meds to take a medication that is better than active study medication
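A minimal sketch of the classical interpretation as arithmetic. It assumes (these are simplifications, not claims from the trials above) that non-adherers switch at the start of the trial, that anyone off active drug has the placebo risk, and that adherence is unrelated to underlying risk. All input values are hypothetical.

```python
def observed_relative_risk(risk_placebo, true_rr, drop_out, drop_in):
    """Relative risk seen when groups are compared as randomized.

    drop_out: fraction of the active arm that stops active medication
    drop_in:  fraction of the placebo arm that starts active medication
    """
    risk_on_drug = risk_placebo * true_rr
    # Each arm is a mixture of people on and off the active drug
    risk_active_arm = (1 - drop_out) * risk_on_drug + drop_out * risk_placebo
    risk_placebo_arm = (1 - drop_in) * risk_placebo + drop_in * risk_on_drug
    return risk_active_arm / risk_placebo_arm


# Hypothetical: drug halves a 10% risk; 20% of actives stop, 10% of placebos start
full_adherence = observed_relative_risk(0.10, 0.5, drop_out=0.0, drop_in=0.0)
diluted = observed_relative_risk(0.10, 0.5, drop_out=0.2, drop_in=0.1)
print(f"RR with full adherence = {full_adherence:.2f}, with crossover = {diluted:.2f}")
```

The diluted estimate sits between the true relative risk and 1, which is the "conservative" direction. The example on the slide (switching to a better drug) breaks the assumption that everyone off study drug has the placebo risk.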
Strategies to enhance compliance • Warm and fuzzy stuff • Participants to feel appreciated • Staff in clinic spend enough time • Sensitive to ppts. scheduling needs • Parties/events with all participants • Ease of logistics/transportation to clinics • Birthday cards • Gifts • Information, Newsletters, other
Strategies to enhance compliance II • Most drop outs occur in early study period • FIT (4 years total): 2/3 of drop outs occurred in first year, most of those in first 6 months • Make certain that ppts understand study requirements • Run-in period • Trial run of drug/treatment • Typically 2-4 weeks, usually of placebo (not always) • Value controversial
Study adherence: follow-up visits • Goal: visits all on time (within window) • Set appointments flexibly • Reminders prior to appt. • Give study calendar • Listen to concerns/problems
Need for consideration of compliance: Coronary Drug Project (CDP, NEJM 1980)
5-year mortality        Overall   Adherence > 80% (2/3)   Adherence < 80% (1/3)
Clofibrate (n=1065)     18%       15%                     25%
Placebo (n=2695)        19%       15%                     28%
Lessons • Unknown/unmeasured confounders associated with compliance • Differ in placebo and active groups
Adherence to medication is not the same as adherence to visit schedule • “Drop out” is very vague term • Can have perfect visit adherence (come to all visits on time) but-- • Not take a single study med pill • Take only 60% of pills • If miss visits or stop coming to visits, then generally don’t take study medication • Exceptions do occur: Trial of once-yearly infusion treatment. May have perfect medication compliance but poor visit compliance
Follow-up visits for those who have stopped study medications? • Practice varies dramatically across studies • Option 1: Stop follow-up as soon as drug stops • Option 2: Continue to collect follow-up info • Advantages of each • ??
Follow-up visits for those who have stopped study medications? • Practice varies dramatically across studies • Option 1: Stop follow-up as soon as drug stops • Option 2: Continue to collect follow-up info • Advantages of each • O-2: Biased per previous slides (generally conservative) • O-1: Biased, but cannot predict direction • Choice related to analysis (ITT)
Intention to Treat Analysis (ITT) • ITT coined by AB Hill in a statistics textbook (1961) • One of the main Commandments of RCT bible • Original definition: “All subjects will be analyzed according to the treatment group they were originally intended by the randomization process” • All: Analyze even if no pills taken or later found to be ineligible… • Originally intended: Regardless of compliance, analyze according to original assignment. • Alternative: randomized to treatment, took no pills. Analyze as a placebo
Beware of “we did an ITT analysis” • Generally considered sacred, almost god-like virtue • The term “ITT” used differently in different studies • ITT does NOT always mean that people were followed beyond stopping study medications • Examples where ITT may not guarantee holiness: • Patient stopped meds after 1 week and she was discontinued from study (including further follow-up) at that time. • Patient stopped meds after 1 week and follow-up continued. But in analysis, only follow-up until stopped meds is counted.
Alternatives in Analysis • Per-protocol or as-treated analysis • If all ppts. are followed regardless of adherence to medications, several types of options • Include only those patients who took all study medications and completed all protocol visits (per-protocol; not ITT under the original definition) • Include all patients but only for the time that they remained on study medications (on-treatment; not ITT under the original definition) • If obtain complete follow-up on all ppts., can run several different types of analyses and any discrepancies could be informative.
Analysis based on post-randomization variables • Per-protocol limits analysis to adherers • Per-protocol is one example of analysis which stratifies based on post-randomization experience • Other examples? • More generally, subgroup analyses by post-rand. factors are biased
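A minimal sketch contrasting an ITT comparison with a per-protocol comparison on the same records. The data are hypothetical and constructed so that the drug has no effect, poor adherers do worse in both arms, and adherence differs between arms. The record layout and helper names are assumptions for illustration.

```python
def make_rows(arm, adherent, n, n_events):
    """Build n participant records, the first n_events of which had the event."""
    return [
        {"arm": arm, "adherent": adherent, "event": i < n_events}
        for i in range(n)
    ]


def risk(records, arm, adherers_only=False):
    """Event proportion in one randomized arm, optionally adherers only."""
    rows = [
        r for r in records
        if r["arm"] == arm and (r["adherent"] or not adherers_only)
    ]
    return sum(r["event"] for r in rows) / len(rows)


# Hypothetical trial, 300 per arm, 60 events per arm (no true drug effect)
records = (
    make_rows("active", True, 200, 30)       # adherers: 15% events
    + make_rows("active", False, 100, 30)    # non-adherers: 30% events
    + make_rows("placebo", True, 250, 45)    # adherers: 18% events
    + make_rows("placebo", False, 50, 15)    # non-adherers: 30% events
)

# ITT: everyone, analyzed by randomized arm
itt_rr = risk(records, "active") / risk(records, "placebo")
# Per-protocol: restrict both arms to adherers (a post-randomization variable)
pp_rr = risk(records, "active", adherers_only=True) / risk(
    records, "placebo", adherers_only=True
)
print(f"ITT RR = {itt_rr:.2f}, per-protocol RR = {pp_rr:.2f}")
```

Here the per-protocol comparison suggests a benefit that does not exist, because the adherers in the two arms are no longer comparable groups.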
Problems with ITT/full follow-up approach • ITT/full follow-up not holy grail • Does not estimate full biologic efficacy of drug/intervention • Advising individual patients may depend on efficacy • Utility underestimated • May be anti-conservative for adverse effects • per-protocol may be preferred
Subgroups • After primary analysis, want to look at subgroups • Does effectiveness vary by subgroup? • If drug effective, is it more effective in some populations? • If results overall show no effect, does drug work in subgroup of participants?
Example: Efficacy of alendronate • FIT II: Women with BMD T-score < -1.6 (osteopenic--only 1/3 osteoporotic) • Women without existing vertebral fractures (2) • Overall results: 14% reduction, p=.07 • Wimpy
RR for clinical fracture of alendronate (FIT II, Cummings, JAMA 1999) • [Figure: relative risk, y-axis 0 to 1.5] • Overall: 0.86 (0.73 - 1.01), P=0.07
RR for clinical fracture of alendronate by baseline BMD groups • [Figure: relative risk (0 to 1.5) by baseline femoral neck BMD T-score]
Group               RR (CI)
Overall             0.86 (0.73 - 1.01)
T < -2.5            0.64 (0.50 - 0.82)
-2.5 < T < -2.0     1.03 (0.77 - 1.39)
T > -2.0            1.14 (0.82 - 1.60)
Subgroup analysis in HERS • Overall no effect of HRT or perhaps harm in year 1 • Is there a subgroup who benefit? • Is there subgroup with significant harm? • Look at relative hazard (RH) within subgroups defined by baseline variables • Medication use at baseline • Prior disease • Health habits • Compare RH in those with and without risk factor • RH in those using beta blockers compared to those not using • RH > 1 ==> harm • Get p-value for significance of difference of RH in those with and without
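A minimal sketch of one way to get a p-value for the difference between two subgroup estimates: a z-test on the difference in log relative hazards, with standard errors recovered from the confidence intervals. This is an approximation assumed here for illustration (it treats the intervals as 95% and the subgroups as independent); trials usually fit an interaction term in the PH model instead. The inputs are two of the FIT II subgroup estimates quoted in these slides.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(RH) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def interaction_test(rh_1, ci_1, rh_2, ci_2):
    """Return (ratio of relative hazards, z statistic, two-sided p)."""
    diff = math.log(rh_1) - math.log(rh_2)
    se_diff = math.sqrt(se_from_ci(*ci_1) ** 2 + se_from_ci(*ci_2) ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return math.exp(diff), z, p


# Two subgroup estimates from the FIT II slide
ratio, z_stat, p_int = interaction_test(0.64, (0.50, 0.82), 1.14, (0.82, 1.60))
print(f"ratio of RRs = {ratio:.2f}, z = {z_stat:.2f}, p = {p_int:.3f}")
```

The test asks whether the two estimates differ from each other, which is a different question from whether either one differs from 1.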
HERS: 4 years of HRT increased then decreased CHD Events
Year     E + P   Placebo   RH    p-value
1        57      38        1.5   .04
2        47      48        1.0   1.0
3        35      41        0.9   .6
4 + 5    33      49        0.7   .07
> 5      ???
P for trend = 0.009
Subgroups: the final frontier in HERS • Relative hazard (E vs. placebo)
Subgroup                      N (%)       RH within subgroup   RH among others   p*
history of smoking            1712 (62)   1.01                 3.39              .01
current smoker                360 (13)    0.55                 1.92              .03
digitalis use                 275 (10)    4.98                 1.26              .04
>= 3 live births              1616 (58)   1.09                 2.72              .04
lives alone                   775 (28)    2.97                 1.14              .05
prior mi by chart review      1409 (51)   2.14                 0.93              .05
beta-blocker use              899 (33)    2.89                 1.15              .06
age >= 70 at randomization    1019 (37)   2.65                 1.14              .06
* Statistical significance of interaction
Lots of subgroups were analyzed in HERS
Columns: subgroup, N (%), RH within subgroup, RH among others, ratio of the two RHs, p for interaction
• history of smoking (at rv) 1712 (62) 1.01 3.39 0.30 .01
• current smoker (at rv) 360 (13) 0.55 1.92 0.29 .03
• digitalis use (at rv) 275 (10) 4.98 1.26 3.96 .04
• >= 3 live births 1616 (58) 1.09 2.72 0.40 .04
• lives alone (at rv) 775 (28) 2.97 1.14 2.60 .05
• prior mi by chart review (cr) 1409 (51) 2.14 0.93 2.30 .05
• beta-blocker use (at rv) 899 (33) 2.89 1.15 2.51 .06
• age >= 70 at randomization 1019 (37) 2.65 1.14 2.32 .06
• prior mi in most distant tertile 447 (16) 2.64 0.93 2.82 .07
• walk 10m or in exercise program (at rv) 1770 (64) 2.35 1.11 2.12 .08
• prior ptca by chart review (cr) 1189 (43) 0.92 1.98 0.46 .08
• prior mi within 2 years 420 (15) 3.20 1.28 2.50 .11
• tg > median (at rv) 1377 (50) 2.02 1.05 1.93 .12
• rales in the lungs (at rv) 80 (3) 0.43 1.65 0.26 .13
• digitalis or ace-inhibitor use (at rv) 653 (24) 2.33 1.24 1.88 .16
• previous ert for >= 12 months 302 (11) 4.19 1.41 2.98 .18
• serious medical conditions 1028 (37) 1.05 1.81 0.58 .21
• age >= 53 at lmp 578 (21) 3.19 1.38 2.31 .23
• hdl > median (at rv) 1315 (48) 1.18 1.95 0.61 .24
• lp(a) > median (at rv) 1378 (50) 1.26 2.08 0.60 .25
• use of non-statin llm (at rv) 420 (15) 0.89 1.69 0.52 .25
• married (at rv) 1588 (57) 1.26 1.98 0.64 .29
• lvef <= 40% 178 (6) 2.16 1.01 2.13 .31
• prior mi within 4 years 765 (28) 2.07 1.32 1.57 .32
• previous ert use for >= 1 year 327 (12) 2.86 1.41 2.03 .32
• prior mi within 1 year 194 (7) 2.88 1.43 2.02 .33
• chest pain (at rv) 982 (36) 1.25 1.88 0.67 .33
• dbp >= 90 mmhg (at rv) 149 (5) 0.91 1.62 0.56 .35
• prior ptca within 1 year 206 (7) 3.94 1.46 2.71 .38
• prior mi within 3 years 612 (22) 2.05 1.37 1.50 .40
• prior ptca within 4 years 838 (30) 1.15 1.70 0.68 .40
• use of any llm (at rv) 1296 (47) 1.23 1.76 0.70 .40
• diuretic use (at rv) 775 (28) 1.89 1.33 1.42 .41
• signs and symptoms of chf (at rv) 118 (4) 0.94 1.60 0.58 .42
• ace inhibitor use (at rv) 483 (17) 2.05 1.40 1.46 .44
• total cholesterol > median (at rv) 1377 (50) 1.32 1.80 0.74 .47
• l-thyroxine use (at rv) 414 (15) 2.29 1.43 1.60 .47
• poor/fair self-rated health (at rv) 665 (24) 1.30 1.72 0.76 .51
• heart murmur (at rv) 540 (20) 1.89 1.42 1.34 .53
• sbp >= 140 mmhg (at rv) 1051 (38) 1.37 1.72 0.80 .59
• prior ptca within 3 years 695 (25) 1.27 1.61 0.78 .62
• s3 heart sounds (at rv) 19 (1) 2.74 1.50 1.82 .63
• htn by physical exam (at rv) 557 (20) 1.32 1.62 0.81 .64
• >= 2 severely obstructed main vessels 1312 (47) 1.53 1.26 1.22 .69
• statin use (at rv) 1004 (36) 1.34 1.59 0.84 .71
• have you ever been pregnant 2564 (93) 1.55 1.15 1.35 .72
• calcium-channel blocker (at rv) 1511 (55) 1.61 1.38 1.17 .73
• previous hrt for >= least 12 months 132 (5) 1.24 1.60 0.78 .77
• ldl > median (at rv) 1373 (50) 1.44 1.63 0.89 .77
• prior ptca within 2 years 475 (17) 1.35 1.56 0.87 .81
• baseline left bundle branch block 212 (8) 1.31 1.55 0.85 .82
• white 2451 (89) 1.48 1.62 0.92 .88
• ever told you had diabetes 634 (23) 1.48 1.53 0.97 .94
• aspirin use (at rv) 2183 (79) 1.51 1.56 0.97 .95
• any alcohol consumption (at rv) 1081 (39) 1.54 1.57 0.98 .97
• gallstones or gallbladder dis. 633 (23) 1.55 1.52 1.02 .97
• baseline atrial fibrillation/flutter 33 (1) - 1.50 - -
Total subgroups examined: 102
Total subgroups with p< .05: 6
Subgroups: conclusions • Subgroups are full of statistical problems • Multiple comparisons may lead to erroneous conclusions • Limited power for subgroup analyses • Subgroups based on baseline variables are less bad • Subgroups based on post-randomization variables are more problematic
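A short worked calculation of the multiple-comparisons problem, using the HERS count of 102 subgroups examined. It assumes the tests are independent, which overlapping subgroups are not, so the numbers are a rough guide only. The Bonferroni threshold is shown as one common correction, not as the method used in HERS.

```python
n_tests = 102   # subgroups examined in HERS (from the slide above)
alpha = 0.05    # nominal significance level per test

# Expected number of "significant" subgroups if no true interactions exist
expected_false_positives = n_tests * alpha

# Chance of at least one false positive, assuming independent tests
p_at_least_one = 1 - (1 - alpha) ** n_tests

# Bonferroni-corrected per-test threshold
bonferroni_alpha = alpha / n_tests

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one false positive): {p_at_least_one:.3f}")
print(f"Bonferroni threshold: {bonferroni_alpha:.5f}")
```

About 5 nominally significant subgroups would be expected by chance alone, which is close to the 6 reported.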
Follow-up and analysis: summary • Best trial: • All participants remain on medication • All participants are followed until end of study • Pre-planned analysis • Where possible, minimize subjectivity and ad hoc decisions