410 likes | 494 Views
Statistical Inference and Random Field Theory. Will Penny SPM short course, London, May 2003. M.Brett et al. Introduction to Random Field Theory, To appear in HBF, 2 nd Edition. image data. parameter estimates. design matrix. kernel. General Linear Model model fitting statistic image.
E N D
Statistical Inference and RandomField Theory Will Penny SPM short course, London, May 2003 M.Brett et al. Introduction to Random Field Theory, To appear in HBF, 2nd Edition.
image data parameter estimates designmatrix kernel • General Linear Model • model fitting • statistic image realignment &motioncorrection Random Field Theory smoothing normalisation StatisticalParametric Map anatomicalreference corrected p-values
Overview 1. Terminology 2. Theory 3. Imaging Data 4. Levels of Inference 5. SPM Results +FDR ?
Overview 1. Terminology 2. Theory 3. Imaging Data 4. Levels of Inference 5. SPM Results
Inference at a single voxel NULL hypothesis, H: activation is zero a = p(t>u|H) p-value: probability of getting a value of t at least as extreme as u. If a is small we reject the null hypothesis. u=2 t-distribution
Sensitivity and Specificity ACTION At u1 Don’t Reject Reject Sens=10/10=100% Spec=7/10=70% H True (o) TN FP H False (x) FN TP TRUTH Sensitivity = TP/(TP+FN) = b Specificity = TN/(TN+FP) = 1 - a FP = Type I error or ‘error’ FN = Type II error a = p-value/FP rate/error rate/significance level b = power At u2 Sens=7/10=70% Spec=9/10=90% Eg. t-scores from regions that truly do and do not activate o o o o o o o x x x o o x x x o x x x x u1 u2
Inference at a single voxel NULL hypothesis, H: activation is zero a = p(t>u|H) We can choose u to ensure a voxel-wise significance level of a. This is called an ‘uncorrected’ p-value, for reasons we’ll see later. We can then plot a map of above threshold voxels. u=2 t-distribution
Signal Inference for Images Noise Signal+Noise
Use of ‘uncorrected’ p-value, a=0.1 11.3% 11.3% 12.5% 10.8% 11.5% 10.0% 10.7% 11.2% 10.2% 9.5% Percentage of Null Pixels that are False Positives Using an ‘uncorrected’ p-value of 0.1 will lead us to conclude on average that 10% of voxels are active when they are not. This is clearly undesirable. To correct for this we can define a null hypothesis for images of statistics.
Family-wise Null Hypothesis • Family of hypotheses • Hk k = {1,…,K} • H = H1H2 … HkHK FAMILY-WISE NULL HYPOTHESIS: Activation is zero everywhere If we reject a voxel null hypothesis at any voxel, we reject the family-wise Null hypothesis A FP anywhere gives a Family Wise Error (FWE) Family-wise error rate = ‘corrected’ p-value
Use of ‘uncorrected’ p-value, a=0.1 Use of ‘corrected’ p-value, a=0.1 FWE
The Bonferroni correction Given a family of N independent voxels and a voxel-wise error rate v the Family-Wise Error rate (FWE) or ‘corrected’ error rate is α = 1 – (1-v)N ~ Nv Therefore, to ensure a particular FWE we choose v = α / N A Bonferroni correction is appropriate for independent tests If v=0.05 then over 100 voxels we’ll get 5 voxel-wise type I errors. But we’ll get a much higher α. To ensure α=0.05 we need v=0.0005 ! A correction for multiple comparisons
The Bonferroni correction Independent Voxels Spatially Correlated Voxels Bonferroni is too conservative for brain images
Overview 1. Terminology 2. Theory 3. Imaging Data 4. Levels of Inference 5. SPM Results
Consider a statistic image as a lattice representation of a continuous random field Use results from continuous random field theory Random Field Theory Lattice representation
Topological measure threshold an image at u excursion set Au (Au)=# blobs - # holes At high u, (Au)=# blobs Reject HΩ if Euler char non-zero α Pr((Au)> 0 ) Expected Euler char p–value (at high u) αE[(Au)] Euler Characteristic (EC)
Example – 2D Gaussian images • α = R (4 ln 2) (2π) -3/2 u exp (-u2/2) Voxel-wise threshold, u Number of Resolution Elements (RESELS), R N=100x100 voxels, Smoothness FWHM=10, gives R=10x10=100
Example – 2D Gaussian images • α = R (4 ln 2) (2π) -3/2 u exp (-u2/2) For R=100 and α=0.05 RFT gives u=3.8 Using R=100 in a Bonferroni correction gives u=3.3 Friston et al. (1991) J. Cer. Bl. Fl. M.
Developments 2D Gaussian fields 3D Gaussian fields 3D t-fields Friston et al. (1991) J. Cer. Bl. Fl. M. (Not EC Method) Worsley et al. (1992) J. Cer. Bl. Fl. M. Worsley et al. (1993) Quant. Brain. Func.
Restricted search regions Box and frame have same number of voxels Box has 16 markers Frame has 32 markers
Au Unified Theory • General form for expected Euler characteristic • 2, F, & t fields • restricted search regions α = S Rd (W)rd (u) rd (u): EC density; depends on type of field (eg. Gaussian, t) and the threshold, u. Rd (W): RESEL count; depends on the search region – how big, how smooth, what shape ? Worsley et al. (1996), HBM
Au Unified Theory • General form for expected Euler characteristic • 2, F, & t fields • restricted search regions α = S Rd (W)rd (u) rd (u):d-dimensional EC density – E.g. Gaussian RF: r0(u) = 1- (u) r1(u) = (4 ln2)1/2 exp(-u2/2) / (2p) r2(u) = (4 ln2) exp(-u2/2) / (2p)3/2 r3(u) = (4 ln2)3/2 (u2 -1) exp(-u2/2) / (2p)2 r4(u) = (4 ln2)2 (u3 -3u) exp(-u2/2) / (2p)5/2 Rd (W): RESEL count R0(W) = (W) Euler characteristic of W R1(W) = resel diameter R2(W) = resel surface area R3(W) = resel volume Worsley et al. (1996), HBM
Resel Counts for Brain Structures (1) Threshold depends on Search Volume (2) Surface area makes a large contribution FWHM=20mm
Overview 1. Terminology 2. Theory 3. Imaging Data 4. Levels of Inference 5. SPM Results
Functional Imaging Data • The Random Fields are the component fields, Y = Xw +E, e=E/σ • We can only estimate the component fields, using estimates of w and σ • To apply RFT we need the RESEL count which requires smoothness estimates
Y = X + ? errors Component fields voxels ? = + parameters design matrix data matrix scans variance componentfields
^ Estimated component fields voxels ? ? = + parameters design matrix errors data matrix scans parameterestimates • estimate residuals estimated variance = Each row is an estimated component field estimatedcomponentfields
Roughness || Point Response Function PRF Gaussian PRF fx 0 0 = 0 fy 0 0 0 fz || = (4ln(2))3/2 / (fx fy fz) RESEL COUNT R3() = () / (fx fy fz) α = R3() (4ln(2))3/2 (u 2 -1) exp(-u 2/2) / (2)2 Smoothness Estimation Approximate the peak of the Covariance function with a Gaussian
Model fit & assumptions valid distributional results Multivariate normality of component images Covariance function of component images must be - Stationary(pre SPM99) - Can be nonstationary (SPM99 onwards) - Twice differentiable Smoothness smoothness » voxel size lattice approximation smoothness estimation practically FWHM 3 VoxDim otherwise conservative Typical applied smoothing: Single Subj fMRI: 6mm PET: 12mm Multi Subj fMRI: 8-12mm PET: 16mm RFT Assumptions
Overview 1. Terminology 2. Theory 3. Imaging Data 4. Levels of Inference 5. SPM Results
Cluster and Set-level Inference • We can increase sensitivity by trading off anatomical specificity • Given a voxel level threshold u, we can compute the likelihood (under the null hypothesis) of getting n or more connected components in the excursion set ie. a cluster containing at least n voxels CLUSTER-LEVEL INFERENCE • Similarly, we can compute the likelihood of getting c clusters each having at least n voxels SET-LEVEL INFERENCE Weak vs Strong control over FWE
n=12 n=82 n=32 Levels of inference voxel-level P(c 1 | n > 0, t 4.37) = 0.048 (corrected) At least one cluster with unspecified number of voxels above threshold set-level P(c 3 | n 12, u 3.09) = 0.019 At least 3 clusters above threshold cluster-level P(c 1 | n 82, t 3.09) = 0.029 (corrected) At least one cluster with at least 82 voxels above threshold
Overview 1. Terminology 2. Theory 3. Imaging Data 4. Levels of Inference 5. SPM Results
SPM99 results I Activations Significant at Cluster level But not at Voxel Level
SPM99 results II Activations Significant at Voxel and Cluster level
False Discovery Rate ACTION At u1 Don’t Reject Reject FDR=3/13=23% a=3/10=30% H True (o) TN FP H False (x) FN TP TRUTH At u2 FDR=1/8=13% a=1/10=10% Eg. t-scores from regions that truly do and do not activate FDR = FP/(FP+TP) o o o o o o o x x x o o x x x o x x x x a = FP/(FP+TN) u1 u2
Signal False Discovery RateIllustration: Noise Signal+Noise
Control of Familywise Error Rate at 10% FWE Occurrence of Familywise Error Control of False Discovery Rate at 10% 6.7% 10.5% 12.2% 8.7% 10.4% 14.9% 9.3% 16.2% 13.8% 14.0% Percentage of Activated Pixels that are False Positives
Summary • We should correct for multiple comparisons • We can use Random Field Theory (RFT) • RFT requires (i) a good lattice approximation to underlying multivariate Gaussian fields, (ii) that these fields are continuous with a twice differentiable correlation function • To a first approximation, RFT is a Bonferroni correction using RESELS. • We only need to correct for the volume of interest. • Depending on nature of signal we can trade-off anatomical specificity for signal sensitivity with the use of cluster-level inference.