Non-specific filtering and control of false positives: an update. Richard Bourgon, 16 March 2009, bourgon@ebi.ac.uk
Experiment-wide type I error rates
• Family-wise error rate (FWER): P(V > 0), where V is the number of false positives, i.e., the probability of one or more false positives. For a large number m0 of true null hypotheses, this is very difficult to keep small.
• False discovery rate (FDR): let Q = V/R, where R is the total number of discoveries, or 0 if R is 0. The FDR is E(Q), i.e., the expected fraction of false positives among all discoveries.
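To see why the FWER is hard to keep small for large m0, here is a minimal sketch. It assumes m0 independent true-null tests, each run at the same per-test level α; independence and the helper name `fwer` are assumptions made for illustration, not something the slides state.

```r
# Assumption: m0 independent true-null tests, each at per-test level alpha.
# Then P(V > 0) = 1 - (1 - alpha)^m0.
fwer <- function(m0, alpha) 1 - (1 - alpha)^m0

alpha <- 0.05
m0 <- c(1, 10, 100, 1000)
print(round(fwer(m0, alpha), 3))

# Per-test level that would be needed to hold the FWER at 0.05
# under the same independence assumption.
per_test_alpha <- 1 - (1 - 0.05)^(1 / m0)
print(signif(per_test_alpha, 3))
```

The required per-test level shrinks roughly like 0.05 / m0, which is why FWER control becomes so stringent when thousands of genes are tested.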
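A nice property of CDFs for continuous RVs
> X = rnorm(100000)         # draws from a standard normal
> F = pnorm                 # its CDF (note: masks R's shorthand F for FALSE)
> hist(X, breaks = 50)      # bell-shaped
> hist(F(X), breaks = 50)   # approximately flat: F(X) is uniform on [0,1]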
A “nice” property?
• To compute a p-value for testing a null hypothesis H0, we typically…
  • Define a test statistic T, and compute its value t for the observed data.
  • Assume we know the distribution of T when H0 is true: F0.
  • Compute p = 1 – F0(t), i.e., define p = P(T > t | H0 is true).
  • Compare p to some α.
• Now define the random variable P = 1 – F0(T). If H0 is true, then…
  • F0(T) is uniformly distributed on [0,1].
  • By symmetry, P is uniformly distributed on [0,1] as well.
• Suppose 20% of genes are differentially expressed, so that…
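The uniformity of P under H0 can be checked numerically. This is a minimal sketch, assuming a one-sided test whose statistic is standard normal under H0 (so F0 is `pnorm`); the variable names are illustrative.

```r
set.seed(1)

# Assumption: T ~ N(0, 1) when H0 is true, so F0 = pnorm.
n_tests <- 100000
T_null <- rnorm(n_tests)
P_null <- 1 - pnorm(T_null)

# If P is uniform on [0,1], then P(P <= alpha) = alpha for every alpha.
alphas <- c(0.01, 0.05, 0.10)
rejection_rate <- sapply(alphas, function(a) mean(P_null <= a))
print(rbind(alpha = alphas, rejection_rate = rejection_rate))

# Bin counts of the p-values: roughly equal across bins.
counts <- hist(P_null, breaks = 10, plot = FALSE)$counts
print(counts)
```

Each observed rejection rate should sit close to its nominal α, which is exactly what makes comparing p to α a valid level-α test.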
Non-specific filtering
• For a given gene, write the data as ((c1,Y1),…,(cp,Yp)).
  • First group (c = 1): i = 1, …, p1.
  • Second group (c = 2): i = p1 + 1, …, p1 + p2.
• Conditions under which we expect little variation in Y:
  • Genes which are absent in both samples. (Probes will still report noise and cross-hybridization, typically at the same level in both groups.)
  • Probe sets which do not respond to target.
  • Genes which are not differentially expressed.
• A “non-specific” filter:
  • Ignores c1, …, cp, i.e., is a function f(Y) of the expression values only.
  • Helps identify any of these three classes, based on our a priori understanding of array behavior.
• Apply standard testing to genes passing the filter, using some g(c,Y).
Increased detection rate
• Stage 1: compute a non-specific filter statistic for each gene, and remove the genes with the θ smallest values.
• Stage 2: standard two-sample t-test for genes passing stage 1.
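The two stages can be sketched in base R as follows. Everything about the data is hypothetical (simulated normal expression values, 5 + 5 samples, 200 shifted genes), the filter statistic is taken to be the overall variance, and θ is interpreted here as the fraction of genes removed, which is an assumption about the slide's notation.

```r
set.seed(2)

# Hypothetical data: 2000 genes, 5 + 5 samples; the first 200 genes are
# shifted upward in group 2 (differentially expressed).
m <- 2000; p1 <- 5; p2 <- 5
cls <- rep(1:2, c(p1, p2))
Y <- matrix(rnorm(m * (p1 + p2)), nrow = m)
Y[1:200, cls == 2] <- Y[1:200, cls == 2] + 2

# Stage 1: filter statistic f(Y) ignores the class labels (overall variance).
f_stat <- apply(Y, 1, var)
theta <- 0.5                      # assumed meaning: fraction of genes removed
keep <- f_stat > quantile(f_stat, theta)

# Stage 2: equal-variance two-sample t-test for each gene.
t_pvalue <- function(y) {
  t.test(y[cls == 1], y[cls == 2], var.equal = TRUE)$p.value
}
p_all <- apply(Y, 1, t_pvalue)
p_keep <- p_all[keep]

# Number of rejections at a BH-adjusted level of 0.1, without and with the filter.
n_reject <- c(unfiltered = sum(p.adjust(p_all, "BH") <= 0.1),
              filtered   = sum(p.adjust(p_keep, "BH") <= 0.1))
print(n_reject)
```

Because the multiplicity adjustment in stage 2 is applied to fewer hypotheses, the filtered analysis can reject more genes; whether it does depends on how well the filter statistic separates true nulls from the rest.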
Increased power?
• An increased detection rate implies increased power only if we are still controlling type I errors at the nominal level.
Result: independence of stage 1 and stage 2 test statistics
• For genes for which the null hypothesis is true, f(Y) and g(c, Y) are statistically independent in both of the following cases:
  • For normally distributed data:
    • Stage 1: overall variance, S2, computed across all samples without reference to c.
    • Stage 2: the standard two-sample t-statistic.
  • Non-parametrically:
    • Stage 1: any function of the data which doesn’t depend on the order of the arguments. S2 above, or the IQR, are both candidates.
    • Stage 2: the Wilcoxon rank sum test statistic.
• Both can be extended to the multi-class context: ANOVA and Kruskal-Wallis.
• Bonferroni and Holm go through easily (in expectation).
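The normal-data case can be checked by simulation. This is a minimal sketch under stated assumptions: every gene is a true null, the data are i.i.d. standard normal with 5 + 5 samples, and the filter keeps the half of the genes with the larger overall variance. If the stage 1 and stage 2 statistics are independent, the null p-values should remain uniform after filtering.

```r
set.seed(3)

# Assumption: all genes are true nulls, i.i.d. N(0, 1) data, 5 + 5 samples.
m <- 20000; p1 <- 5; p2 <- 5
cls <- rep(1:2, c(p1, p2))
Y <- matrix(rnorm(m * (p1 + p2)), nrow = m)

# Stage 1 statistic: overall variance, ignoring class labels.
s2 <- apply(Y, 1, var)

# Stage 2 statistic: equal-variance two-sample t, computed row-wise.
m1 <- rowMeans(Y[, cls == 1]); m2 <- rowMeans(Y[, cls == 2])
v1 <- apply(Y[, cls == 1], 1, var); v2 <- apply(Y[, cls == 2], 1, var)
pooled <- ((p1 - 1) * v1 + (p2 - 1) * v2) / (p1 + p2 - 2)
t_stat <- (m1 - m2) / sqrt(pooled * (1 / p1 + 1 / p2))
p_val <- 2 * pt(-abs(t_stat), df = p1 + p2 - 2)

# Type I error rate at alpha = 0.05 before and after filtering.
keep <- s2 > quantile(s2, 0.5)
rates <- c(before = mean(p_val <= 0.05), after = mean(p_val[keep] <= 0.05))
print(rates)
```

Both rates should be close to 0.05. A filter that uses the class labels (for example, the absolute difference in group means) would not be expected to pass this check.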
Independence: Benjamini & Hochberg and Storey FDR adjustments
• What is the FDR associated with use of cutoff α? Naive estimator: FDR(α) ≈ E(V(α)) / E(R(α)).
  • V is not observable, but E(V) is m0α, bounded by mα.
  • E(R) cannot be computed, but R can be used as an estimator.
• Evaluating at each ordered p-value p(i), where R = i, and using m or an estimate of m0 gives the BH95 or Storey adjustments, respectively: m·p(i)/i or m̂0·p(i)/i, before monotonicity is enforced.
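The link between the naive estimator and the BH95 adjustment can be made concrete in base R. This is a minimal sketch on hypothetical p-values; the Storey-type variant uses a simple plug-in estimate of m0 with a tuning value λ = 0.5, which is an assumed choice for illustration and not necessarily the estimator used in the talk.

```r
set.seed(4)

# Hypothetical p-values: 900 true nulls (uniform) and 100 alternatives (small).
p <- c(runif(900), rbeta(100, 0.1, 5))
m <- length(p)
p_sorted <- sort(p)

# Naive FDR estimate at cutoff alpha = p(i): m * alpha / R, with R = i.
naive <- m * p_sorted / seq_len(m)

# Enforce monotonicity (running minimum from the largest p-value down), cap at 1.
bh_manual <- pmin(1, rev(cummin(rev(naive))))

# Agrees with the BH adjustment in base R's stats::p.adjust.
print(all.equal(bh_manual, p.adjust(p_sorted, method = "BH")))

# Storey-type variant: replace m by an estimate of m0
# (lambda = 0.5 is an assumed tuning value).
lambda <- 0.5
m0_hat <- min(m, sum(p > lambda) / (1 - lambda))
q_manual <- pmin(1, rev(cummin(rev(m0_hat * p_sorted / seq_len(m)))))
print(c(m = m, m0_hat = m0_hat))
```

Since the estimate of m0 never exceeds m, the Storey-type values are never larger than the BH values, which is why BH is the more conservative of the two.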
Independence: Benjamini & Hochberg and Storey FDR adjustments
The foregoing motivation for the BH95 and Storey procedures uses E(V(α)) = m0α. Marginal independence of f(Y) and g(c,Y) for true nulls means that this still applies at stage 2, in expectation. Define M0 to be the random number of true nulls passing stage 1. Then E(V(α)) = E(M0)α.
For true nulls, we show independence between P and f(Y) over repeated data realizations. The P within a single realization may be correlated.
• FDR control is on average only: there are no guarantees for a single realization.
[Figure: genes (stage I and stage II statistics) across repeated data realizations]
Correlation and a single data instance
• Given pervasive correlation (here, all pairs at +ρ), the empirical distribution of p-values for a single data instance can vary widely.
[Figure: most extreme distributions in 1000 trials]
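The variability across data instances can be reproduced with a small simulation. This is a minimal sketch, assuming all genes are true nulls with equicorrelated standard normal test statistics (every pair at correlation ρ), generated from one shared factor per instance; the values of m and ρ are illustrative.

```r
set.seed(6)

# Assumption: m true-null z-statistics per instance, all pairs at correlation rho,
# built as sqrt(rho) * shared factor + sqrt(1 - rho) * independent noise.
m <- 1000; rho <- 0.5; n_trials <- 1000

frac_below <- replicate(n_trials, {
  shared <- rnorm(1)
  z <- sqrt(rho) * shared + sqrt(1 - rho) * rnorm(m)
  p <- 2 * pnorm(-abs(z))
  mean(p <= 0.05)          # fraction of p-values below 0.05 in this instance
})

# Averaged over instances the fraction is near 0.05, but single instances vary widely.
print(c(mean = mean(frac_below), min = min(frac_below), max = max(frac_below)))
```

Each p-value is still marginally uniform, so the average is on target; it is the instance-to-instance spread that makes guarantees "on average" weaker than they may sound.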
FWER: Westfall and Young
• The Westfall and Young (1993) procedure controls FWER with more power, but depends on the joint distribution of all p-values.
• WY93 is valid under subset pivotality. If this holds for the one-stage procedure, it holds for the two-stage non-specific filtering approach as well.
• The distribution of min Pj under the complete null hypothesis is typically estimated by permutation. If filtering changes the correlation structure, the permutation procedure uses the new structure.
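Estimating the distribution of min Pj by permutation can be sketched as follows. This implements the single-step minP adjustment, a simpler relative of the step-down procedure in Westfall and Young, and is not presented as the exact method used in the talk. The data, the helper `row_t_pvalues`, and the number of permutations are all hypothetical.

```r
set.seed(5)

# Hypothetical data: 200 genes (e.g. those passing stage 1), 6 + 6 samples,
# with the first 10 genes shifted in group 2.
m <- 200; p1 <- 6; p2 <- 6
cls <- rep(1:2, c(p1, p2))
Y <- matrix(rnorm(m * (p1 + p2)), nrow = m)
Y[1:10, cls == 2] <- Y[1:10, cls == 2] + 3

# Row-wise equal-variance two-sample t-test p-values for a given labelling.
row_t_pvalues <- function(Y, cls) {
  n1 <- sum(cls == 1); n2 <- sum(cls == 2)
  A <- Y[, cls == 1, drop = FALSE]; B <- Y[, cls == 2, drop = FALSE]
  ssA <- rowSums((A - rowMeans(A))^2)
  ssB <- rowSums((B - rowMeans(B))^2)
  pooled <- (ssA + ssB) / (n1 + n2 - 2)
  t_stat <- (rowMeans(A) - rowMeans(B)) / sqrt(pooled * (1 / n1 + 1 / n2))
  2 * pt(-abs(t_stat), df = n1 + n2 - 2)
}

p_obs <- row_t_pvalues(Y, cls)

# Permute the class labels; whole columns move together, so the
# between-gene correlation structure of Y is preserved in each permutation.
n_perm <- 500
min_p <- replicate(n_perm, min(row_t_pvalues(Y, sample(cls))))

# Single-step minP adjusted p-value: estimated P(min_k P_k <= p_j).
p_adj <- sapply(p_obs, function(p) mean(min_p <= p))
print(head(sort(p_adj), 10))
```

Running the same code on the filtered matrix is what lets the permutation distribution reflect whatever correlation structure remains after stage 1.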
Correlation and FDR control
• Storey et al. q-values. Correlation: all pairs at +ρ.
• Some anti-conservative bias in FDR estimation.
• oFDR substantially greater than nominal for a small fraction of data instances.
• BH is more conservative, since it effectively fixes the estimated proportion of true nulls at 1.
Conclusions
• In actual examples, non-specific filtering leads to (biologically) significant increases in the number of genes identified.
• Commonly used stage 1/stage 2 test statistic pairs are statistically independent for genes which are not differentially expressed.
• Given this independence, Bonferroni and Holm FWER control is valid in the two-stage procedure.
• Correlation structure may change under filtering.
  • Permutation-based Westfall and Young correction accounts for this. FDR control, however, may suffer.
  • The effect of filtering on correlation can be checked, and its impact assessed.