
Blinding Index for Clinical Trials



    1. Blinding Index for Clinical Trials Heejung Bang, PhD Weill Medical College of Cornell University

    2. Outline
        - Background
        - Statistical methods
        - Examples: CRISP and WASID trials
        - Simulation study (if time permits)
        - Discussion
        - Current and future studies

    3. Background
        - Human behavior is influenced by what we know or believe, and everybody is tempted to find out something.
        - Blinding or masking: single, double, triple (Dictionary for Clinical Trials 1999)
        - Reduces selection and information bias, and improves compliance
        - Huge efforts are directed at disguising the dissimilarity between treatments (e.g., taste, smell, appearance, mode of delivery)
        - Treatment assignment: the use of blocks of random lengths is common practice (suggested by ICH-E9)

    4. Background
        - Blinding is not always feasible or relevant:
            -- some surgical treatments
            -- treatment vs. nothing
            -- early vs. late interventions (e.g., HIV)
            -- some patients don’t know if they are on treatment
            -- animal studies??
        - Imperfect blinding is preferable to an open design (Furberg & Soliman 2008).
        - Bias can occur in every aspect of trials: data reporting, collection, assessment and classification.
        - Bias may be more pronounced in trials with commercial sponsors.

    5. Some alternative or back-up methods
        - A third party can be employed in open trials in order to reduce study investigator bias.
        - Dose adjustment of the study medications may be handled in a run-in phase prior to randomization.
        - Prospective, randomized, open-label, blinded-endpoint (PROBE) design? Perhaps, not yet.

    6. Important illustrative example
        - A single-blind, placebo-controlled trial investigating the effect of zinc on taste disorders showed a statistically significant benefit.
        - The identical trial was repeated with only one difference, double-blinding, and showed no benefit.
        - Problems: 1. responses were very subjective 2. ‘vested interest’ 3. bias from unblinding

    7. FDA said: “[drug name]-related side effects have the potential to unblind subjects and investigators. Unblinding may result in ascertainment bias of subjective study endpoints. We recommend that you administer a questionnaire at study completion to investigate the effectiveness of blinding the subjects and treating and evaluating physicians.” Office of Therapeutics Research and Review, Center for Biologics Evaluation and Research, FDA 2003

    8. FDA also said: “DRUDP requests that subjects and investigators state at the end of the subject’s participation as to what treatment assignment they think was made, in order to assess the adequacy of blinding.” Office of Drug Evaluation III (ODE III), Center for Drug Evaluation and Research, FDA 2005

    9. CONSORT (revised version, 2001): Recommends reporting “how the success of blinding was evaluated.”

    10. More Background
        - Everybody knows blinding is important but…
        - Grossly incomplete reporting of procedures and of any test for blinding. Call for urgent improvement (Schulz et al. 1996; Fergusson et al. 2004; Hróbjartsson et al. 2007, among others)
        - Not many statistical methods are available. Only two blinding indices in the literature.
        - Most medical papers are exploratory, and some are even incorrect.
        - How to handle a “Don’t know” answer? It is different from missing data!

    11. Common formats of blinding questionnaire
        - With 3 response categories about the guess: “Drug”, “Placebo” or “Don’t know (DK)”
        - With 5 response categories about the guess and the certainty of the guess:
            “Strongly believe the treatment is drug”
            “Somewhat believe the treatment is drug”
            “DK”
            “Somewhat believe the treatment is placebo”
            “Strongly believe the treatment is placebo”
        Remarks:
        1. We may re-ask those who answered DK initially.
        2. Some don’t allow DK and force participants to guess. I don’t think this is a good idea.

    12. Common data structures: 2x3 format

    13. Common data structures: 2x5 format

    14. We may also collect ancillary data from those who answered DK
        Remark: n3. (in ancillary data) = n.3 (in 2x3 or 2x5 format) if there are no missing data

    15. Existing methods
        1. Chi-Square test (Hughes & Krahn, 1985)
            - Compares the proportions of correct and incorrect answers
            - 2x2 Chi-Square test to compare Pcor and Pinc among participants excluding DKs
            - 2x3 Chi-Square test to compare Pcor and Pinc among all participants
            - If the above analyses indicated that blinding was not maintained, blinding was assessed in each arm separately.
        Remark: Strictly speaking, this is like performing a one-sample binomial test!
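The remark about the one-sample binomial test can be made concrete with a minimal sketch. It is an exact two-sided binomial test of whether correct guesses occur with probability 0.5 among participants who made a guess (DK excluded), not Hughes & Krahn's chi-square procedure itself. The function name and the counts in the usage note are hypothetical.

```python
from math import comb


def binomial_two_sided_p(n_correct: int, n_incorrect: int) -> float:
    """Exact two-sided binomial p-value for H0: P(correct guess) = 0.5.

    DK responses are assumed to have been dropped before counting.
    """
    n = n_correct + n_incorrect
    if n == 0:
        raise ValueError("no guesses to test")
    observed = comb(n, n_correct)
    # Under H0 every outcome k has probability comb(n, k) / 2**n, so the
    # p-value sums the outcomes that are no more likely than the observed one.
    tail = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return tail / 2 ** n
```

For example, `binomial_two_sided_p(30, 10)` tests a hypothetical arm in which 30 of 40 guessers were correct; a small p-value suggests the guesses are not at chance level.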

    16. 2. Kappa statistic: note that Kappa measures agreement, but we should measure disagreement!

    17. 3. Blinding Index (James et al., 1996) Modified version of kappa statistics - BI = {1+PDK+(1-PDK)*KD}/2 where
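As a rough illustration of the formula on this slide, the sketch below simply evaluates BI from two inputs. It assumes PDK is the proportion of DK responses and treats KD as a number supplied by the caller, since the definition of KD is not reproduced in this transcript. The function name is hypothetical.

```python
def james_bi(p_dk: float, k_d: float) -> float:
    """Evaluate BI = {1 + PDK + (1 - PDK) * KD} / 2 for given PDK and KD.

    p_dk: proportion of "Don't know" responses (assumed meaning of PDK).
    k_d:  the kappa-type term KD; its definition is not in the transcript,
          so it is passed in rather than computed here.
    """
    if not 0.0 <= p_dk <= 1.0:
        raise ValueError("p_dk must be a proportion in [0, 1]")
    return (1 + p_dk + (1 - p_dk) * k_d) / 2
```

Because KD is multiplied by (1 - PDK), its influence shrinks as the share of DK answers grows: with `p_dk = 1.0` the function returns 1.0 whatever `k_d` is.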

    19. Limitations of existing methods
        - Most methods are descriptive or use naïve or incorrect statistics (e.g., Hughes and Krahn 1985; Howard et al. 1982).
        - James’ method is dominated by DK responses (DK should be real DK).
        - Existing methods cannot 1) detect different behaviors of the two arms, 2) detect qualitatively different scenarios, or 3) give the proportion of unblinding beyond chance.

    20. Define ri|i = Pi|i / (P1|i + P2|i), with i = 1 for drug and i = 2 for placebo (i.e., the proportion of correct guesses among participants with certain identification on the i-th arm)
        - Without DK: new BIi = 2ri|i - 1 (i.e., the proportion of participants who answer correctly on the i-th arm beyond chance level)
        - With DK: new BIi = (2ri|i - 1)*(P1|i + P2|i), estimated by
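The definition above can be sketched in a few lines for the 2x3 format. This is an illustration only: it plugs sample proportions from one arm into the slide's formula, and the function name and counts are hypothetical.

```python
def new_bi_2x3(n_correct: int, n_incorrect: int, n_dk: int) -> float:
    """New BI for one arm from 2x3-format counts.

    Computes (2 * r - 1) * (P_correct + P_incorrect), where r is the share
    of correct guesses among participants who made a guess.
    """
    n = n_correct + n_incorrect + n_dk
    if n == 0:
        raise ValueError("arm has no responses")
    p_correct = n_correct / n
    p_incorrect = n_incorrect / n
    guessed = p_correct + p_incorrect
    if guessed == 0:
        # Everyone answered DK: no guesses, so no unblinding signal.
        return 0.0
    r = p_correct / guessed
    return (2 * r - 1) * guessed
```

With hypothetical counts of 50 correct, 30 incorrect and 20 DK, r = 0.625 and the index is 0.25 * 0.8 = 0.2, the same as (50 - 30) / 100.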

    21. Remark: new BIi is identical to P1|1 - P2|1 for the drug arm and to P2|2 - P1|2 for the placebo arm under the trinomial distribution.
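The identity in this remark follows directly from the definition of ri|i, as the short derivation below shows. The variance line is the standard result for a difference of two multinomial proportions; it is not taken from the slides, and n_i (the number of respondents on arm i) is notation introduced here.

```latex
\begin{align*}
\text{new BI}_i &= (2 r_{i|i} - 1)\,(P_{1|i} + P_{2|i})
                 = 2 P_{i|i} - (P_{1|i} + P_{2|i}), \\
\text{new BI}_1 &= P_{1|1} - P_{2|1}, \qquad
\text{new BI}_2 = P_{2|2} - P_{1|2}, \\
\operatorname{Var}\!\left(\hat P_{1|i} - \hat P_{2|i}\right)
  &= \frac{P_{1|i}(1 - P_{1|i}) + P_{2|i}(1 - P_{2|i}) + 2 P_{1|i} P_{2|i}}{n_i}.
\end{align*}
```

The first line uses r_{i|i}(P_{1|i} + P_{2|i}) = P_{i|i}; the sign of the variance is the same for both arms, so it applies to new BI_2 as well.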

    22. New blinding index (2x5 format)
        More general 2x5 format (& ancillary data for DK):
        new BIi = P1|i + w2|iP2|i + w31|iP31|i - P5|i - w4|iP4|i - w32|iP32|i
        subject to 0 ≤ w31|i, w32|i ≤ w2|i, w4|i ≤ 1 and P1|i + P2|i + P31|i + P32|i + P4|i + P5|i = 1
        Remarks:
        1. -1 ≤ new BI ≤ 1.
        2. “2x3 format” and “data without ancillary data” are special cases of this format.
        3. Suggested weights: w31|i = w32|i = 0.25 and w2|i = w4|i = 0.5, or others for sensitivity analysis.
        4. Multi-arms: Bonferroni correction suggested.
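A minimal sketch of the weighted index, transcribing the formula on this slide with the suggested weights as defaults. The dictionary keys mirror the subscripts (1, 2, 31, 32, 4, 5); the function name and the counts in the usage note are hypothetical. The formula is applied exactly as written, and the transcript does not spell out how the categories are relabelled for the placebo arm.

```python
def new_bi_2x5(counts, w2=0.5, w4=0.5, w31=0.25, w32=0.25):
    """Weighted new BI for one arm from 2x5-format counts plus DK ancillary data.

    counts: dict with keys "1", "2", "31", "32", "4", "5" holding the number
            of participants in each response category on this arm.
    """
    for w in (w2, w4, w31, w32):
        if not 0.0 <= w <= 1.0:
            raise ValueError("weights must lie in [0, 1]")
    n = sum(counts.values())
    if n == 0:
        raise ValueError("arm has no responses")
    p = {key: value / n for key, value in counts.items()}
    return (p["1"] + w2 * p["2"] + w31 * p["31"]
            - p["5"] - w4 * p["4"] - w32 * p["32"])
```

Passing different weights gives the sensitivity analysis mentioned in remark 3; setting `w31 = w32 = 0` ignores the ancillary DK data, and with no "somewhat" answers the result reduces to the 2x3 index.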

    24. Example 1: Cholesterol Reduction in Seniors Program (CRISP)
        - A pilot study for cholesterol lowering in the elderly.
        - Cholesterol levels continue to be predictors of coronary heart disease in people >65 years.
        - The CRISP was a 5-center pilot study to assess feasibility of recruitment and efficacy of cholesterol lowering in this age group.
        - The main paper: LaRosa, Applegate, Crouse, Hunninghake, Grimm, Knopp, Eckfeldt, Davis & Gordon (Arch Int Med 1994)

    25. METHODS: A double-blinded RCT with placebo vs. 20- vs. 40-mg lovastatin. 1 year follow-up. Endpoints were changes in lipid levels.
        RESULTS: 431 subjects with LDL in 159-221 mg/dL were randomized. In the 20- and 40-mg lovastatin groups, total cholesterol levels fell 17% and 20%; LDL fell 24% and 28%; triglyceride levels fell 4.4% and 9.9%, respectively. HDL rose 7.0% and 9.0%, respectively. No changes in the placebo group.
        CONCLUSION: Older subjects of both genders and a variety of racial/ethnic groups can be successfully recruited into a cholesterol-lowering trial. Lovastatin has effects similar to those reported in younger subjects. There is little advantage to the higher lovastatin daily dose.

    26. CRISP study

    27. CRISP results
        Hughes & Krahn’s method (with correction)
            - Overall: Pcor = 68.1% > Pinc = 31.9% (p < 0.0001), unblinded
            - Lovastatin: Pcor = 76.6% > Pinc = 23.4% (p < 0.0001), unblinded
            - Placebo: Pcor = 51.8% > Pinc = 48.2% (p = 0.7893), blinded
        James et al.’s BI
            - BI = 0.75 (95% CI: 0.71, 0.78), blinded
        Bang et al.’s New BI
            - Using data in 2x3 format:
                Lovastatin: BI = 0.21 (95% CI: 0.15, 0.26), unblinded
                Placebo: BI = 0.01 (95% CI: -0.07, 0.10), blinded
            - Using data in 2x5 format + ancillary data for DK:
                Lovastatin: BI = 0.16 (95% CI: 0.11, 0.21), unblinded
                Placebo: BI = 0.01 (95% CI: -0.06, 0.08), blinded

    28. Example 2: Warfarin-Aspirin Symptomatic Intracranial Disease (WASID) trial
        - Hertzberg et al. (2008) investigated whether the use of a dose modification schedule is effective for blinding trials of warfarin in the WASID study (Chimowitz et al. NEJM 2005).
        - Compared with blinding in the SPINAF trial (Ezekowitz et al. NEJM 1992).
        - The WASID team paid great attention to blinding issues.


    31. Results
        - In the WASID and SPINAF trials, new BI uniformly showed greater unblinding for warfarin than for aspirin, whereas other indices could not capture this fact.
        - If you combine BIs from different arms, a cancel-out effect can occur.
        - Summarizing a pattern can be important.
        - The observed trend may be explained by other occurrences associated with warfarin (e.g., number of dose changes, hemorrhage)

    32. Comments from Dr. Canette (at Stata): ‘It is clear that BI and New BI cannot be compared, because they are based on different paradigms. James et al. believe that the most important observations are "DK", and the new index doesn't rely much on these observations. I believe that some research needs to be done to determine the circumstances under which one of the indexes should be chosen over the other. At this point, the subject of which paradigm to choose is a totally subjective matter.’

    33. Example 3: How to interpret this data?
        A sample run of the ‘Blinding’ module in Stata using an artificial example (provided by Dr. Canette):

        ------------------------------------------------------------------
        Methods       |  Index   Std. Err.     z     P-value     95% CI
        ------------------------------------------------------------------
        James         |   .69      .08       2.29     .99       .55 - .82
        Bang Drug     |   .25      .14       1.79     .04       .02 - .48
        Bang Placebo  |   .25      .14       1.79     .04       .02 - .48
        ------------------------------------------------------------------

        Some researchers would be happy with the blinding results, but Bang’s new BI rejects the blinding hypothesis. Some people would say that the new BI is being harsh; the response is that it all depends on your interpretation of DK.
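The "Bang" rows of this output can be approximately reproduced from the index and its standard error alone. The sketch below assumes a one-sided normal test of BI = 0 against BI > 0 and an interval of the form estimate ± 1.645 × SE. Both assumptions are inferred from the printed numbers, not stated in the transcript or taken from the Stata module's documentation, and the function name is hypothetical.

```python
from math import erfc, sqrt


def summarize_new_bi(index: float, std_err: float, multiplier: float = 1.645):
    """Return (z, one-sided p-value, interval) for a new BI estimate.

    Assumes a normal approximation; the one-sided alternative and the
    1.645 multiplier are guesses chosen to match the printed output.
    """
    z = index / std_err
    p_one_sided = 0.5 * erfc(z / sqrt(2))  # upper-tail normal probability
    interval = (index - multiplier * std_err, index + multiplier * std_err)
    return z, p_one_sided, interval


z, p, (low, high) = summarize_new_bi(0.25, 0.14)
print(round(z, 2), round(p, 2), round(low, 2), round(high, 2))
```

With the rounded inputs .25 and .14 this gives values that round to the printed 1.79, .04 and .02 - .48. The James row is tested against a different reference value and is not covered by this sketch.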

    34. Bang’s response
        - Yes. It all depends on the validity of DK. This can be one of New BI's limitations, but we want to focus on estimation, not testing.
        - Even when DK is truly DK (so James’ BI might be better), we still want to know the % of unblinding in each arm. New BI clearly shows that.
        - We may classify each trial into one of 9 blinding scenarios.
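One way to picture the nine scenarios is to cross three possible patterns per arm. The sketch below assumes those patterns are "unblinded (correct guesses)", "random guessing" and "opposite guesses", decided by whether the arm's confidence interval for new BI lies above zero, covers zero, or lies below zero. The scenario table itself is not in this transcript, so the labels, the decision rule and the function names are all assumptions.

```python
def classify_arm(ci_low: float, ci_high: float) -> str:
    """Label one arm from the confidence interval of its new BI (assumed rule)."""
    if ci_low > 0:
        return "unblinded (correct guesses)"
    if ci_high < 0:
        return "opposite guesses"
    return "random guessing"


def blinding_scenario(drug_ci, placebo_ci):
    """Pair the two per-arm labels; 3 x 3 labels give nine combinations."""
    return classify_arm(*drug_ci), classify_arm(*placebo_ci)
```

Applied to the CRISP intervals in 2x3 format, `blinding_scenario((0.15, 0.26), (-0.07, 0.10))` labels the lovastatin arm as unblinded and the placebo arm as random guessing, in line with the calls on the CRISP results slide.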


    36. Our suggestion in practice Blinding assessment should be reported in publications for relevant trials, if not all trials. When unblinding (e.g., by new BI) is >20% in any arm. Perhaps, BI and new BI are reported together, especially, when the two methods yield different conclusions. Remark: Selective reporting can be problematic!

    37. Another important & controversial question: When to ask blinding questions? (Letters & Response in CCT)
        - Shortly after randomization vs. during vs. after the trial?
        - Henneicke-von Zepelin (2005) and Hemilä (2005) both claim ‘assessment may be inappropriate after the trial’ due to confounding between efficacy and correct guessing.
        - Sackett (BMJ 2004) and others agree.

    38. Bang et al.’s response (2005):
        - Statistically speaking, of course, the best approach is to ask twice or more. However, we still prefer ‘after the trial’.
        - Although we may not be able to know if blinding is true blinding or DK is DK, we never want to make participants try to guess. As we ask more, they may become curious. ‘Less is more’ or ‘Let it be’.
        - Blinding conveys stories during the entire course of the trial, including early and late, efficacy and side effects.
        - If you want to test the blinding at the beginning, do so with a third party or in a pilot study.

    39. We may ask some more qualitative questions, for example, ‘Why do you believe you were on treatment x?’ or ‘When did you find out?’, preferably together with other general questions (e.g., participants’ satisfaction, problems or other comments) at the study close-out. Again, we do not want to make participants try to guess, even in their future trials.

    42. Simulation results

    43. Simulation results (cont’d)

    44. Discussion
        - New BI is directly interpreted (as % of unblinding beyond chance), detects a relatively low degree of unblinding, and captures different behaviors in different arms. DO NOT COMBINE!
        - New BI’s extension to >2 arms is easy.
        - We should encourage all the participants to provide their honest guess for the treatment and may include extra question(s) to evaluate 1) the credibility of DK and 2) reasons for the guess, etc. (Some don’t like this idea.)
        - Subgroup analyses can be important (Vitamin C trial 1975; Hemilä 1996).

    45. Assessment of blinding can be quite straightforward statistically, but the final conclusion relies on the subjectivity of investigators and the nature of the study. (For example, how large is large enough?)
        - 1) BI estimation (& statistical testing), 2) classification into the 9 blinding scenarios, and 3) careful interpretation and identification of potential causes can provide a comprehensive evaluation of the blinding of clinical trials.
        - If unblinding occurs, it is important to identify the causes and fix the problems, if possible (except for treatment effect), for future study planning and conduct.

    46. Discussion
        - Blinding research is destined to be subjective, qualitative, and imperfect. However, empirical (quantitative) evidence is almost always good to have.
        - At the end of the day, impacts on primary treatment effect?? Unblinding may not invalidate primary results.

    47. Personal belief about good treatments/trials: ‘Treatment effect’ should be greater than ‘Noncompliance effect’, which should be greater than ‘Unblinding effect’.

    48. Current and future research (collaborators: Drs. Park and Canette)
        - Literature review
        - Meta-analytic approaches
        - Classify existing trials into different blinding scenarios
        - The effect of blinding on effectiveness
        - BI by type of intervention
        - More simulation studies comparing blinding indices
        - Developing a blinding questionnaire and procedure
        - Park, Bang, Canette. Blinding in controlled trials, time to do it better (Editorial). Complementary Therapies in Medicine 2008.
        - www.blindingindex.org

    49. Acknowledgements
        - Dr. Isabel Canette and Ms. Jiefeng Chen in the Stata team: "blinding" module to compute the two BIs, available from Stata since March 2008.
        - Dr. Jongbae Park, Mr. Stephen Flaherty and the IT team at UNC-Medical School
        - Co-authors: Ms. Liyun Ni (at Amgen) and Dr. Clarence E. Davis (at UNC)
