BMS 617

sloan

Presentation Transcript


  1. BMS 617 Lecture 8 – Comparing Proportions Marshall University Genomics Core Facility

  2. Types of independent and dependent variables • Last lecture examined t-tests • A (two-class) t-test is applicable when • The dependent (outcome) variable is a continuous, interval variable • The independent (input) variable is a nominal (or ordinal) variable with two possible values • For example, in the GRHL2 comparison, the independent variable was “Basal type” (with values “Basal A” and “Basal B”) and the dependent variable was “log2 expression” Marshall University School of Medicine

  3. Independent and dependent variables that are both nominal • Another class of tests involves independent and dependent variables that are both nominal • Very common in clinical studies • Independent: treated vs not treated • Dependent: disease vs no disease

  4. Contingency Tables • In such experiments, data are usually presented in a contingency table • The table shows how the value of the dependent variable is contingent on the independent variable • The aim is to compare the proportions between the two groups: is A/(A+B) different from C/(C+D), where A and B are the two outcome counts in the first group and C and D are those in the second?

  5. Types of study • There are four different study designs that lead to data presented in contingency tables: • Cross-sectional studies • A sample is selected at random from the population. The sample is then divided into two groups depending on prior treatment or exposure to a risk factor. Disease prevalence is compared between the groups. • Prospective (longitudinal) studies • Two groups are selected: one with exposure to the risk factor (or treated) and one without. The groups are then followed over time to see how many in each group develop the disease. • Experimental studies • Samples are selected and divided randomly into two groups. One group receives the treatment (or is exposed to the risk); the other does not. Incidence of disease is compared between the groups. • Case-control studies • Two groups of samples are selected: one with the disease (cases) and one without (controls). Each group is examined to see how many were treated or exposed to the risk prior to the study.

  6. Example experimental study • Study from Frye et al. (1996, NEJM) • Compared two treatments for coronary artery disease: CABG and PTCA • 1829 patients in the study, randomly assigned to CABG or PTCA • Outcome is 5-year survival

  7. Calculations for Frye data • The risk for the CABG group is 372/914=40.7% • The risk for the PTCA group is 378/915=41.3% • The relative risk (for the PTCA group compared to the CABG group) is 41.3%/40.7%=1.01 • The PTCA group is 1.01 times as likely as the CABG group not to survive 5 years • The 95% confidence interval for the relative risk is 0.936 to 1.031 • There is little difference between the risk in these groups

  8. Frye data for diabetic patients • Frye et al. also examined a subgroup of their patients who had diabetes • Still an experimental study • Controlled which patients received which treatment, observed outcome

  9. Risk and relative risk for diabetic patients • Risk for CABG group is 87/180=48.3% • Risk for PTCA group is 104/173=60.1% • Relative risk for PTCA group (relative to CABG group) is 60.1%/48.3%=1.244 • For diabetic patients, the risk of dying within 5 years if you receive PTCA is 1.244 times the risk of dying within 5 years if you receive CABG • 95% confidence interval is 1.028 to 1.632

  10. Another example: AZT and HIV • Cooper et al. (via Motulsky): • Study of the effectiveness of AZT in preventing HIV infection from progressing to AIDS • Studied 936 patients, randomly assigned to treatment with either AZT or a placebo • After three years, compared the proportion in each group for whom the disease had progressed to AIDS

  11. Risk and relative risk for AZT • We can perform the same analyses as before: • Risk for AZT group is 76/475=16% • Risk for placebo group is 129/461=28% • Relative risk for AZT group is 16%/28%=0.57 • The AZT group is 0.57 times as likely to experience disease progression in three years as the placebo group • 95% confidence interval for relative risk is 0.444 to 0.736
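
The relative-risk arithmetic on this slide can be reproduced in a few lines. The following is a minimal sketch, not code from the lecture: the function name `relative_risk_ci` is hypothetical, and the interval is built on the log scale with a normal approximation (often called the Katz method). The slide does not name its method, but this one reproduces the reported 0.444 to 0.736.

```python
import math

def relative_risk_ci(events_1, total_1, events_2, total_2, z=1.96):
    """Relative risk of group 1 versus group 2 with an approximate 95% CI.

    Assumption: the CI uses the log-scale normal approximation (Katz method);
    the slide does not state which method was used.
    """
    risk_1 = events_1 / total_1
    risk_2 = events_2 / total_2
    rr = risk_1 / risk_2
    # Standard error of log(RR)
    se_log_rr = math.sqrt(1 / events_1 - 1 / total_1 + 1 / events_2 - 1 / total_2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Counts from the slide: 76 of 475 (AZT) and 129 of 461 (placebo) progressed
rr, lower, upper = relative_risk_ci(76, 475, 129, 461)
print(f"RR = {rr:.2f}, 95% CI {lower:.3f} to {upper:.3f}")
```

Because the interval is symmetric on the log scale, it is not symmetric around the relative risk itself.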

  12. The attributable risk • The difference between the two incidence rates is called the attributable risk: • Attributable risk = 28%-16% = 12% • A risk of 12% (of the disease progressing in three years) is attributable to not taking AZT • 95% Confidence interval for attributable risk is from 6.68% to 17.28% • Attributable risk is an intuitively useful value when studying risk factors
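
The attributable risk and its interval can be sketched as follows. This is an illustration under stated assumptions: `attributable_risk_ci` is a hypothetical helper, and the slide does not say how its interval was computed. A normal approximation with a pooled standard error reproduces the reported 6.68% to 17.28%; the unpooled (Wald) version gives a slightly different interval of about 6.72% to 17.24%.

```python
import math

def attributable_risk_ci(events_control, total_control,
                         events_treated, total_treated,
                         pooled=True, z=1.96):
    """Difference in risk (control minus treated) with an approximate 95% CI.

    Assumption: normal approximation. pooled=True uses one common proportion
    for the standard error, which matches the slide's numbers; pooled=False
    uses the unpooled (Wald) standard error.
    """
    p_control = events_control / total_control
    p_treated = events_treated / total_treated
    diff = p_control - p_treated
    if pooled:
        p = (events_control + events_treated) / (total_control + total_treated)
        se = math.sqrt(p * (1 - p) * (1 / total_control + 1 / total_treated))
    else:
        se = math.sqrt(p_control * (1 - p_control) / total_control
                       + p_treated * (1 - p_treated) / total_treated)
    return diff, diff - z * se, diff + z * se

# Placebo: 129 of 461 progressed; AZT: 76 of 475 progressed
ar, lower, upper = attributable_risk_ci(129, 461, 76, 475)
print(f"AR = {ar:.2%}, 95% CI {lower:.2%} to {upper:.2%}")
```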

  13. Number Needed to Treat (NNT) • Since the attributable risk is 12%, we can interpret this as: • Of those who didn’t receive AZT, 12% (about 1 in 8) progressed to the full disease in 3 years because they didn’t receive AZT • Another 16% also progressed to the disease, but they would have done so anyway… • Another way to look at this is that, for every 8 people or so who are treated, one is prevented from progressing to the full disease • The Number Needed to Treat (NNT) is the reciprocal of the attributable risk • NNT = 1/0.1198 = 8.35 (using the unrounded attributable risk of 11.98%) • On average, 8.35 people need to be treated with AZT to prevent 1 from progressing to the full disease in 3 years • The 95% confidence interval is computed from the 95% CI for the attributable risk: • 1/0.1728 to 1/0.0668, or 5.79 to 14.97
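
The NNT calculation is just a pair of reciprocals, as the short sketch below shows. The helper name `number_needed_to_treat` is hypothetical. The guard is an added assumption: taking reciprocals of the interval endpoints only makes sense when the whole interval for the attributable risk lies above zero.

```python
def number_needed_to_treat(attributable_risk, ci_lower, ci_upper):
    """NNT and its CI as reciprocals of the attributable risk and its CI.

    Assumption: the CI for the attributable risk lies entirely above zero;
    otherwise the reciprocal interval is not a simple range.
    """
    if ci_lower <= 0:
        raise ValueError("CI for the attributable risk must be entirely above zero")
    nnt = 1 / attributable_risk
    # The upper bound of the risk difference gives the lower bound of the NNT
    return nnt, 1 / ci_upper, 1 / ci_lower

# Values from the slides: attributable risk 11.98%, 95% CI 6.68% to 17.28%
nnt, lower, upper = number_needed_to_treat(0.1198, 0.0668, 0.1728)
print(f"NNT = {nnt:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```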

  14. Significance tests for contingency tables • The confidence intervals for relative risk and/or attributable risk provide plenty of information about the differences between proportions in the contingency table • If needed, a p-value can also be provided • The p-value is associated with a null hypothesis: • The proportion of “positive” outcomes is independent of the treatment group

  15. Fisher’s Exact Test and Chi-squared tests • The best statistical test for a contingency table is Fisher’s Exact Test • For large numbers (very large numbers), this test is computationally prohibitive • In this case, a Chi-squared test can be used as an approximation • Historically, Chi-squared tests were always used, but increased computing power makes this unnecessary
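
To make the idea of an exact test concrete, here is a minimal sketch of a two-sided Fisher's exact test for a 2x2 table using only the standard library. It is an illustration, not the software used in the lecture. The function name is hypothetical, and the two-sided rule used here (sum the probabilities of all tables that are no more likely than the observed one) is one common convention; other software may define two-sided p-values slightly differently.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumption: "two-sided" means summing the hypergeometric probabilities of
    every table (with the same margins) that is no more probable than the
    observed table.
    """
    row_1, row_2 = a + b, c + d
    col_1 = a + c
    denom = comb(row_1 + row_2, col_1)

    def prob(k):
        # Probability of k in the top-left cell when all margins are fixed
        return comb(row_1, k) * comb(row_2, col_1 - k) / denom

    p_observed = prob(a)
    k_min = max(0, col_1 - row_2)
    k_max = min(row_1, col_1)
    # Small relative tolerance so ties with the observed table are included
    total = sum(p for p in (prob(k) for k in range(k_min, k_max + 1))
                if p <= p_observed * (1 + 1e-7))
    return min(1.0, total)

# Small illustrative table (not from the lecture)
print(fisher_exact_two_sided(3, 1, 1, 3))
```

The loop visits every table with the observed margins, which is why the cost of an exact test grows with the size of the table entries.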

  16. p-value for the CABG-PTCA study • For the Frye et al. study, the null hypothesis is: the 5-year survival rate is the same for those treated with PTCA as for those treated with CABG • Using Fisher’s Exact test for these data, we get p=0.8122 • Assuming there is no difference in the survival rates between those treated with PTCA and those treated with CABG, there is an 81.22% chance of seeing a difference at least as big as the one observed

  17. p-value for Frye’s data with diabetic patients • For the restriction to diabetic patients, the p-value, by Fisher’s exact test, is 0.0325 • Assuming the survival rate for diabetic patients is the same for those treated with PTCA as for those treated with CABG, there is a 3.25% chance of seeing a difference at least as large as the one observed

  18. Case-control studies • In a case-control study, the investigator selects two groups of subjects: • One group with the disease (or outcome of interest) • One group without • Compare this to a prospective or experimental study where the investigator controls the groups based on the independent variable (treatment or risk factor) • In a case-control study, the investigator then looks back within each group to see how many were exposed to the risk factor or treatment • Sometimes called a “retrospective study”

  19. Example: cholera vaccine • Example (Lucas et al. via Motulsky) • Performed a case-control study to measure the effectiveness of a vaccine for cholera • The scientifically ideal experiment would be to recruit subjects, randomly give half the vaccine and half a placebo, and follow them to see how many in each group develop cholera • Such a study would take many years • It would be unethical if you believe the vaccine works • Instead, investigators recruited a group of 43 subjects who had contracted cholera and 172 who had not, and compared how many had been vaccinated in each group

  20. Lucas et al. study • Note that in this study, investigators control the column totals • In the previous examples, investigators control the row totals • This makes an important difference to the interpretation of calculations

  21. Relative risk is meaningless in case-control studies • It makes no sense to compute the risk or relative risk in case-control studies • The risk is the number affected in each group divided by the total in each group • In a case-control study, this is determined merely by the choice of the investigator as to how many subjects to place in each group

  22. Odds ratios • Results of a case-control study are summarized as an odds ratio • In our example, for the cholera group, the odds of being vaccinated are the number vaccinated divided by the number not vaccinated • 10/33 = 0.303 • The odds of being vaccinated for the controls are 94/78 = 1.205 • The odds ratio is the ratio of the odds: 0.303/1.205 = 0.251 • The odds of having been vaccinated for a cholera victim are 0.251 times the odds of having been vaccinated for a control • The 95% confidence interval for this odds ratio is 0.1166 to 0.5424
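
The odds-ratio arithmetic can be checked with a short sketch. The function name `odds_ratio_ci` is hypothetical, and the interval uses the usual log-scale normal approximation (Woolf's method). The slide does not name its method, but this one agrees with the reported 0.1166 to 0.5424 to within rounding.

```python
import math

def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio (cases versus controls) with an approximate 95% CI.

    Assumption: the CI uses the log-scale normal approximation (Woolf's
    method); the slide does not state which method was used.
    """
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    odds_ratio = odds_cases / odds_controls
    # Standard error of log(OR): square root of the sum of reciprocal cell counts
    se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                          + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Cholera cases: 10 vaccinated, 33 not; controls: 94 vaccinated, 78 not
odds_ratio, lower, upper = odds_ratio_ci(10, 33, 94, 78)
print(f"OR = {odds_ratio:.3f}, 95% CI {lower:.4f} to {upper:.4f}")
```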

  23. Odds ratio and relative risk • If the disease (or other outcome) is rare, then the odds ratio is an approximation to the relative risk • Rare means less than about 10% of the population • So, if we assume cholera is rare in this population, vaccinated individuals have about 25% of the risk of getting cholera that unvaccinated individuals have

  24. Statistical test for case-control studies • The statistical test used for case-control studies is also Fisher’s exact test • The null hypothesis in our example is that the proportion who received the vaccine is the same for those with cholera as for those without • Fisher’s exact test gives p=0.0003 in this case • So if there is no difference in the proportion who received the vaccine between those with and those without cholera, the chance of seeing data showing at least as strong a relationship between the two due to sampling alone would be 0.0003 (or 0.03%)

  25. Fisher’s exact test and Chi-squared tests • The best test to use to produce a p-value for a contingency table is Fisher’s exact test • Because this is a computationally intensive test, historically Chi-squared tests were used as an approximation • A “standard” chi-squared test will give a lower p-value than is accurate • Potentially much lower if the sample size is small • A correction to the chi-squared test is available, called the “Yates continuity correction”, which generally gives a higher p-value than is correct • Use Fisher’s exact test • If you are forced to use the Chi-squared test, use the Yates continuity correction • For decent sample sizes it makes little difference
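
The effect of the Yates correction can be seen on the cholera table from slide 22. The sketch below is illustrative only: `chi_squared_2x2` is a hypothetical helper, and it relies on the fact that for one degree of freedom the upper-tail chi-squared probability equals erfc(sqrt(chi2/2)), so no statistics library is needed.

```python
import math

def chi_squared_2x2(a, b, c, d, yates=False):
    """Chi-squared statistic and p-value for the 2x2 table [[a, b], [c, d]].

    With yates=True, the Yates continuity correction subtracts n/2 from
    |ad - bc| before squaring.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Cholera cases: 10 vaccinated, 33 not; controls: 94 vaccinated, 78 not
for use_yates in (False, True):
    chi2, p = chi_squared_2x2(10, 33, 94, 78, yates=use_yates)
    print(f"yates={use_yates}: chi2 = {chi2:.2f}, p = {p:.5f}")
```

For this table the uncorrected p-value comes out at roughly 0.0002 and the corrected one at roughly 0.0004, which sit on either side of the 0.0003 reported for Fisher's exact test on slide 24.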
