Significance Tests

Significance Tests: P-values and Q-values

Outline
• Statistical significance in multiple testing
• Empirical distribution of test statistics
• Family-wide p-values
• Correlation and p-values
• False discovery rates

Tests and Test Statistics
• The t-test is fairly robust to skew, but not robust to outliers
  – “thick tails” of the distribution
• Non-parametric tests are robust, but lose too much power (ability to detect differences)
• Robust tests can be useful
• Permutation tests are simple and easy to program
• Some authors use: [formula missing] rather than [formula missing], to reduce the number of genes with low fold-changes among the highly significant scores
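To illustrate how little code a permutation test needs, here is a minimal sketch for one gene measured in two groups. The function name `permutation_p_value`, the choice of the absolute difference in means as the statistic, and the sample values are all assumptions made for illustration; they are not taken from the slides.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign the sample labels
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (hits + 1) / (n_perm + 1)

# Hypothetical expression values: one clearly shifted gene, one unchanged gene.
p_diff = permutation_p_value([5.1, 4.9, 5.3, 5.2], [3.0, 3.2, 2.9, 3.1])
p_same = permutation_p_value([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 2.0])
print(p_diff, p_same)
```

With only four samples per group there are just 70 distinct label assignments, so the smallest attainable p-value is limited by the design, not by the number of permutations drawn.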

Distribution of test statistics
[Figure: quantile plots of t-statistics. Left: random distribution; right: experiment]

Distribution of Set of p-values

Multiple comparisons
• Suppose 10,000 genes on a chip
• None actually differentially expressed
• Each gene has a 5% chance of exceeding the threshold score for a p-value of .05
• This is the definition of Type I error
• On average, 500 genes should exceed the .05 threshold ‘by chance’
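The arithmetic above can be checked with a quick simulation. This sketch assumes that p-values are uniformly distributed when no gene is differentially expressed; the function name `count_null_hits` and the seed are illustrative choices.

```python
import random

def count_null_hits(n_genes=10_000, alpha=0.05, seed=1):
    """Count null genes whose p-value falls below alpha purely by chance."""
    rng = random.Random(seed)
    # Under the null hypothesis, each p-value is uniform on [0, 1).
    p_values = [rng.random() for _ in range(n_genes)]
    return sum(p < alpha for p in p_values)

hits = count_null_hits()
expected = 10_000 * 0.05  # 500 genes expected by chance
print(hits, expected)
```

The count varies from run to run, but it stays close to 500, with a standard deviation of roughly 22 genes.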

Family-Wide Error Rate
• ‘Corrected’ p-value:
• Probability of finding at least one false positive among all N tests
• Normally all tests at the same threshold
• Simplest correction (Bonferroni)
• $p_i^* = \min(N p_i,\ 1)$
• Fairly close to the true false positive rate in simulations of independent tests
• Too conservative in practice!
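As a small worked example, the Bonferroni correction can be written in a few lines. The sketch also includes the Šidák form, 1 − (1 − p)^N, which the next slide refers to; the function names and the raw p-values are made up for illustration.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: min(N * p, 1)."""
    n = len(p_values)
    return [min(n * p, 1.0) for p in p_values]

def sidak(p_values):
    """Sidak-adjusted p-values: 1 - (1 - p)^N, exact for independent tests."""
    n = len(p_values)
    return [1.0 - (1.0 - p) ** n for p in p_values]

# Hypothetical raw p-values for four tests.
raw = [0.00001, 0.0004, 0.03, 0.5]
bonf = bonferroni(raw)
sid = sidak(raw)
for p, b, s in zip(raw, bonf, sid):
    print(p, b, s)
```

The Šidák value is never larger than the Bonferroni value, and for small p the two are nearly identical.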

P-Values from Correlated Genes
[Figure panels: null distribution from independent genes; null distribution from perfectly correlated genes; null distribution from highly correlated genes. Rows: genes; columns: samples; entries: p-values from randomized distribution]

The Effect of Correlation
• If all genes are uncorrelated (independent), the Šidák correction is exact
• If all genes were perfectly correlated
  – p-values for one are p-values for all
  – No multiple-comparisons correction needed
• Typical gene data is highly correlated
  – The first singular component of the SVD may account for more than half the variance
• More sensitive tests are possible if we can generate the joint null distribution of p-values

Re-formulating the Question
• Independent: ~5% of genes exceed the .05 threshold, all the time
• Perfectly correlated: all genes exceed the .05 threshold ~5% of the time
• Realistically correlated: a fraction f1 of genes (.05 < f1 < 1) exceeds the .05 threshold in a fraction f2 of the cases (.05 < f2 < 1)
• New question: for a given f1 and α, how likely is it that a fraction f1 of genes will exceed the α threshold?
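The two extreme cases can be reproduced in a short simulation. This is a sketch under an assumed model that is not in the slides: each gene's null z-score mixes one shared component with its own noise, with weight `rho` on the shared part, and two-sided p-values come from the normal distribution.

```python
import math
import random

def fraction_below_alpha(rho, rng, n_genes=500, alpha=0.05):
    """Fraction of null genes with p < alpha in one simulated experiment."""
    shared = rng.gauss(0.0, 1.0)  # component common to every gene
    hits = 0
    for _ in range(n_genes):
        own = rng.gauss(0.0, 1.0)  # gene-specific noise
        z = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * own
        p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
        hits += p < alpha
    return hits / n_genes

def simulate(rho, n_experiments=400, seed=2):
    rng = random.Random(seed)
    return [fraction_below_alpha(rho, rng) for _ in range(n_experiments)]

independent = simulate(0.0)  # about 5% of genes, in every experiment
perfect = simulate(1.0)      # all genes or none, all in about 5% of experiments
print(sum(independent) / len(independent), max(independent))
print(sum(f == 1.0 for f in perfect) / len(perfect))
```

Intermediate values of `rho` give the realistic case: the fraction of genes below the threshold fluctuates widely from one experiment to the next.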

Step-Down p-Values
• Calculate single-step p-values for genes: p1, …, pN
• Order the smallest k p-values: p(1), …, p(k)
• For each k, ask:
  – How likely are we to get k p-values less than p(k) if no differences are real?
• Generate the null distribution by permutations
• Yields more significant genes, at the same level of Type I error, than single-step procedures
• See Ge et al., Test, 2003
• Bioconductor package multtest
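One common permutation-based step-down scheme is the Westfall-Young maxT procedure, which works on test statistics where the slide's description works on p-values. The sketch below is a simplified illustration of that related technique, not the procedure on the slide and not the multtest implementation; the statistic (absolute difference in means), the function names, and the data are assumptions.

```python
import random
from statistics import mean

def abs_mean_diff(row, labels):
    a = [x for x, g in zip(row, labels) if g == 0]
    b = [x for x, g in zip(row, labels) if g == 1]
    return abs(mean(a) - mean(b))

def step_down_max_t(data, labels, n_perm=500, seed=3):
    """Step-down maxT adjusted p-values (rows of data are genes)."""
    rng = random.Random(seed)
    n_genes = len(data)
    observed = [abs_mean_diff(row, labels) for row in data]
    # Rank genes from most to least extreme observed statistic.
    order = sorted(range(n_genes), key=lambda i: -observed[i])
    counts = [0] * n_genes
    perm = list(labels)
    for _ in range(n_perm):
        rng.shuffle(perm)  # permute the sample labels
        stats = [abs_mean_diff(row, perm) for row in data]
        running_max = 0.0
        # Walk from the least to the most significant gene, keeping the
        # maximum statistic over the genes ranked at or below this one.
        for rank in range(n_genes - 1, -1, -1):
            gene = order[rank]
            running_max = max(running_max, stats[gene])
            if running_max >= observed[gene]:
                counts[rank] += 1
    adjusted = [0.0] * n_genes
    previous = 0.0
    for rank, gene in enumerate(order):
        # Enforce monotonicity: adjusted p-values never decrease with rank.
        previous = max(previous, (counts[rank] + 1) / (n_perm + 1))
        adjusted[gene] = previous
    return adjusted

# Hypothetical data: 20 genes, 5 + 5 samples; gene 0 has a real difference.
rng = random.Random(7)
labels = [0] * 5 + [1] * 5
data = [[rng.gauss(0.0, 1.0) for _ in labels] for _ in range(20)]
data[0] = [0.1, -0.2, 0.0, 0.3, -0.1, 10.2, 9.8, 10.1, 9.9, 10.0]
adjusted = step_down_max_t(data, labels)
print(adjusted[0], min(adjusted[1:]))
```

Because whole sample labels are permuted, each permutation keeps the correlation between genes intact, which is what lets the procedure be less conservative than Bonferroni on correlated data.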

False Discovery Rate
• At threshold t*, what fraction of the genes called significant are likely to be false positives?
• Illustration: 10,000 independent genes
[Figure: illustration with 10,000 independent genes]
• In practice, use a permutation algorithm to compute the FDR
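A simulation with 10,000 independent genes makes the idea concrete. The mixture used here is an assumption for illustration only: 500 genes carry a real effect (z-scores centred at 3) and the remaining 9,500 are null. The function name `simulate_fdr` is likewise hypothetical.

```python
import math
import random

def simulate_fdr(threshold, n_genes=10_000, n_true=500, effect=3.0, seed=4):
    """Return (false positives, genes called, realized false discovery fraction)."""
    rng = random.Random(seed)
    false_pos = true_pos = 0
    for gene in range(n_genes):
        is_real = gene < n_true  # the first n_true genes carry a real effect
        z = rng.gauss(effect if is_real else 0.0, 1.0)
        p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
        if p < threshold:
            if is_real:
                true_pos += 1
            else:
                false_pos += 1
    called = false_pos + true_pos
    return false_pos, called, (false_pos / called if called else 0.0)

fp_05, called_05, fdr_05 = simulate_fdr(0.05)
fp_001, called_001, fdr_001 = simulate_fdr(0.001)
print(fp_05, called_05, fdr_05)
print(fp_001, called_001, fdr_001)
```

In this setup, roughly half of the genes called at the .05 threshold are false positives, while a stricter threshold calls fewer genes but with a much smaller false fraction.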

pFDR
• How to estimate the FDR?
• FDR: E(#false positives/#positives | #positives > 0) * P(#positives > 0)
• ‘positive’ False Discovery Rate (pFDR): E(#false positives/#positives | #positives > 0)
• Simes’ inequality allows this to be computed from p-values
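One widely used way to work from p-values alone is the Benjamini-Hochberg step-up procedure, which rests on Simes' inequality. The sketch below computes Benjamini-Hochberg adjusted p-values; it is offered as an illustration of that procedure and is not necessarily the estimator the slide has in mind. The input p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is the minimum of p * n / rank over this rank and all later ones.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for five genes.
raw_p = [0.01, 0.02, 0.03, 0.04, 0.5]
bh = benjamini_hochberg(raw_p)
print(bh)
```

Calling every gene whose adjusted value is at most 0.05 controls the FDR at 5% for independent tests.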
