Why are multiple tests a problem? Rafael A. Irizarry
Other names • Multiple comparisons • Data snooping • Others?
References • H. Scheffé (1953), “A method for judging all contrasts in the analysis of variance”, Biometrika 40:87-104 • D.B. Duncan (1965), “A Bayesian approach to multiple comparisons”, Technometrics 7:171-222 • J.W. Tukey (1953), “The problem of multiple comparisons”, reprinted in CWJWT Vol. VIII (1994) • R.G. Miller, Simultaneous Statistical Inference, 2nd ed. (Springer 1981)
Thanks to Yoav Benjamini • Benjamini and Hochberg (1995), “Controlling the false discovery rate: a practical and powerful approach to multiple testing”, J. R. Stat. Soc. Ser. B 57:289-300
Example • E. Giovannucci, A. Ascherio, E. Rimm, M. Stampfer, G. Colditz, W. Willett: “Intake of Carotenoids and Retinol in Relation to Risk of Prostate Cancer”, Journal of the National Cancer Institute 87(23):1767-1776 (6 Dec 1995).
“Using responses to a validated, semiquantitative food frequency questionnaire mailed to participants in the Health Professionals Follow-up Study in 1986, we assessed dietary intake for a 1-year period for a cohort of 47,894 eligible subjects initially free of diagnosed cancer.... We calculate the relative risk (RR) for each of the upper categories of intake of a specific food or nutrient by dividing the incidence of prostate cancer among men in each of these categories by the rate among men in the lowest intake level....”
“Of 46 vegetables and fruits or related products, four were significantly associated with lower prostate cancer risk; of the four, tomato sauce (P for trend = 0.001), tomatoes (P for trend = 0.03), and pizza (P for trend = 0.05), but not strawberries, were primary sources of lycopene.”
BUT the Methods section one page later states: “For each of 131 food and beverage items listed ...” And the (presumably strongest) carotenoids and p-values are listed in Table 2 (p. 1770): • Tomato sauce: 0.001 • Tomatoes: 0.03 • Tomato juice: 0.67 • Pizza: 0.05 “Our findings ... suggest that tomato-based foods may be especially beneficial regarding prostate cancer risk.”
What is a p-value again? When nothing protects (every null hypothesis is true), we expect 131 × 0.05 ≈ 7 foods/nutrients to have p-values < 0.05
Microarrays • When no genes are changing between two groups, we expect 20,000 × 0.01 = 200 genes to have p-values < 0.01 • However, false positives are not as bad as in other fields
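The two expected counts above follow from the fact that, when a null hypothesis is true, its p-value falls below a threshold alpha with probability alpha. A minimal sketch of that arithmetic, with a small simulation as a check; the function names are illustrative, and the "at least one" formula assumes the tests are independent, which the slides do not claim:

```python
import random

def expected_false_positives(m, alpha):
    """Expected number of p-values below alpha when all m nulls are true."""
    return m * alpha

def prob_at_least_one(m, alpha):
    """Chance of at least one p-value below alpha (assumes independent tests)."""
    return 1.0 - (1.0 - alpha) ** m

def simulate_null_hits(m, alpha, n_sims=2000, seed=0):
    """Average count of uniform p-values below alpha over n_sims repetitions."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_sims):
        total += sum(1 for _ in range(m) if rng.random() < alpha)
    return total / n_sims

foods = expected_false_positives(131, 0.05)     # the food questionnaire example
genes = expected_false_positives(20000, 0.01)   # the microarray example
simulated_foods = simulate_null_hits(131, 0.05)

print("expected foods with p < 0.05:", foods)
print("expected genes with p < 0.01:", genes)
print("simulated average for foods:", simulated_foods)
print("Pr(at least one food with p < 0.05):", prob_at_least_one(131, 0.05))
```

Under these assumptions, seeing three or four "significant" foods out of 131 is roughly what chance alone would produce.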
What can we do? • p-values no longer mean what they used to… no argument • A histogram of the p-values is a useful plot • What can we do… lots of argument
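One way to see why the histogram of p-values is useful: null p-values are uniform on [0, 1], so the histogram is flat when nothing is changing, and a spike near zero suggests real effects. The sketch below uses simulated z-statistics; the number of changing genes (2,000 of 20,000) and the effect size (a shift of 3) are made-up assumptions for illustration, not values from the slides.

```python
import math
import random

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def simulate_p_values(m, n_alt, shift, seed=0):
    """m z-tests; the first n_alt have mean `shift`, the rest are true nulls."""
    rng = random.Random(seed)
    return [two_sided_p(rng.gauss(shift if i < n_alt else 0.0, 1.0))
            for i in range(m)]

def histogram(p_values, n_bins=10):
    """Counts of p-values in n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for p in p_values:
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts

null_counts = histogram(simulate_p_values(m=20000, n_alt=0, shift=0.0))
mixed_counts = histogram(simulate_p_values(m=20000, n_alt=2000, shift=3.0))

for label, counts in (("all nulls true", null_counts),
                      ("some genes changing", mixed_counts)):
    print(label)
    for i, c in enumerate(counts):
        # one '#' per 200 p-values in the bin
        print(f"  {i / 10:.1f}-{(i + 1) / 10:.1f} {'#' * (c // 200)} {c}")
```

The first histogram should look roughly flat, while the second should show an excess in the leftmost bin on top of a flat background.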
Multiple Hypothesis Testing Null = Equivalent Expression; Alternative = Differential Expression
Error Rates (m = number of hypotheses tested, V = number of Type I errors, R = number of rejected hypotheses) • Per comparison error rate (PCER): the expected number of Type I errors divided by the number of hypotheses, PCER = E(V)/m • Per family error rate (PFER): the expected number of Type I errors, PFER = E(V) • Family-wise error rate (FWER): the probability of at least one Type I error, FWER = Pr(V ≥ 1) • False discovery rate (FDR): the rate at which false discoveries occur, FDR = E(V/R; R>0) = E(V/R | R>0) Pr(R>0) • Positive false discovery rate (pFDR): the rate at which discoveries are false, pFDR = E(V/R | R>0) • Many others
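The error rates defined above can be estimated by Monte Carlo for any fixed rejection rule. The sketch below does this for the simple rule "reject when p < alpha", taking V as the number of true nulls rejected and R as the total number of rejections. The setup (100 z-tests, 10 of them with a mean shift of 3) is a made-up assumption chosen only to make the quantities concrete.

```python
import math
import random

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def estimate_error_rates(m=100, n_alt=10, shift=3.0, alpha=0.05,
                         n_sims=2000, seed=0):
    """Monte Carlo estimates of PCER, PFER, FWER, FDR and pFDR
    for the rule 'reject every hypothesis with p < alpha'."""
    rng = random.Random(seed)
    sum_v = 0          # total Type I errors over all simulations
    sims_with_v = 0    # simulations with at least one Type I error
    sims_with_r = 0    # simulations with at least one rejection
    sum_ratio = 0.0    # sum of V/R over simulations with R > 0
    for _ in range(n_sims):
        v = 0  # rejected true nulls (Type I errors)
        s = 0  # rejected false nulls (true discoveries)
        for i in range(m):
            is_alt = i < n_alt
            p = two_sided_p(rng.gauss(shift if is_alt else 0.0, 1.0))
            if p < alpha:
                if is_alt:
                    s += 1
                else:
                    v += 1
        r = v + s
        sum_v += v
        if v >= 1:
            sims_with_v += 1
        if r > 0:
            sims_with_r += 1
            sum_ratio += v / r
    pfer = sum_v / n_sims
    return {
        "PCER": pfer / m,                       # E(V)/m
        "PFER": pfer,                           # E(V)
        "FWER": sims_with_v / n_sims,           # Pr(V >= 1)
        "FDR": sum_ratio / n_sims,              # E(V/R; R>0)
        "pFDR": (sum_ratio / sims_with_r        # E(V/R | R>0)
                 if sims_with_r else float("nan")),
    }

rates = estimate_error_rates()
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

In any such run the estimates satisfy PCER ≤ FWER ≤ PFER and FDR ≤ pFDR, which matches the definitions: controlling a rate further to the right in the first chain is the stricter requirement.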
Conclusions • Let's do a multiple comparison of the different beers sold by the IF