This short course provides an overview of statistical inference, multiple comparisons, and random field theory in the context of voxel-by-voxel hypothesis testing. It covers topics such as power, false discovery rate, generalisability, and non-parametric inference. The course also discusses statistical parametric mapping, multiple comparisons terminology, threshold tests, and the Bonferroni correction.
Statistical Inference, Multiple Comparisons and Random Field Theory. Andrew Holmes, SPM short course, May 2002
Overview…
…a voxel-by-voxel hypothesis testing approach
• reliably identify regions showing a significant experimental effect of interest
• Assessment of statistic images
  • multiple comparisons
  • random field theory
  • smoothness
  • spatial levels of inference & power
  • false discovery rate (later...)
• Generalisability, random effects & population inference
  • inferring to the population
  • group comparisons
• Non-parametric inference (later...)
[Figure: SPM analysis pipeline. Labels: image data; realignment & motion correction; smoothing (kernel); normalisation (anatomical reference); General Linear Model (design matrix, model fitting); parameter estimates; statistic image; Statistical Parametric Map; random field theory; corrected p-values]
Statistical Parametric Mapping…
[Figure: voxel-by-voxel modelling of condition 1 – condition 2; the parameter estimate and the variance estimate combine into the statistic image, or SPM]
Classical hypothesis testing…
• Null hypothesis $H$
  • test statistic
  • null distributions
• Hypothesis test
  • control Type I error: incorrectly reject $H$
  • test level $\alpha$: $\Pr(\text{“reject” } H \mid H) \le \alpha$
  • test size: $\Pr(\text{“reject” } H \mid H)$
• p-value
  • min $\alpha$ at which $H$ rejected
  • $\Pr(T \ge t \mid H)$
  • characterising “surprise”
[Figure: null distributions; t-distribution, 32 df; F-distribution, 10,32 df]
Multiple comparisons…
• Threshold at $p$?
  • expect $(100 \times p)\%$ of voxels to exceed the threshold by chance
• Surprise?
  • extreme voxel values: voxel-level inference
  • big suprathreshold clusters: cluster-level inference
  • many suprathreshold clusters: set-level inference
• Power & localisation
  • sensitivity
  • spatial specificity
[Figure: t59 image, Gaussian 10mm FWHM (2mm pixels), thresholded at p = 0.05]
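The "expect (100 p)% by chance" point can be checked with a small simulation. This is only a sketch under a simplifying assumption: the voxels are independent standard normal draws rather than the smoothed t image shown on the slide, and the function name `null_exceedance_fraction` and the voxel count are hypothetical.

```python
import random
from statistics import NormalDist


def null_exceedance_fraction(n_voxels, p, seed=0):
    """Fraction of null 'voxels' exceeding the uncorrected level-p threshold.

    Assumption: voxels are independent N(0,1) draws, unlike the smoothed
    t image on the slide.
    """
    rng = random.Random(seed)
    u = NormalDist().inv_cdf(1.0 - p)  # threshold with upper-tail area p
    hits = sum(1 for _ in range(n_voxels) if rng.gauss(0.0, 1.0) > u)
    return hits / n_voxels


frac = null_exceedance_fraction(n_voxels=100_000, p=0.05)
print(f"fraction of null voxels above threshold: {frac:.4f}")
```

With no effect anywhere, roughly 5% of voxels still pass an uncorrected p = 0.05 threshold, which is why the threshold has to account for the whole family of tests.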
Multiple comparisons terminology…
• Family of hypotheses
  • $H_k$, $k \in \Omega = \{1,\dots,K\}$
  • $H_\Omega = \bigcap_k H_k$
• Familywise Type I error
  • weak control – omnibus test
    • $\Pr(\text{“reject” } H_\Omega \mid H_\Omega) \le \alpha$
    • “anything, anywhere”?
  • strong control – localising test
    • $\Pr(\text{“reject” } H_W \mid H_W) \le \alpha \quad \forall\, W: W \subseteq \Omega \ \&\ H_W$
    • “anything, & where”?
• Adjusted p-values
  • test level at which reject $H_k$
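Simple threshold tests…
• Threshold $u$
  • $t_k > u \Rightarrow$ reject $H_k$
  • reject any $H_k \Rightarrow$ reject $H_\Omega$
  • reject $H_\Omega$ if $t_{\max} > u$
• Valid test
  • weak control: $\Pr(T_{\max} > u \mid H_\Omega) \le \alpha$
  • strong control, since $\forall\, W$: $\Pr(T^W_{\max} > u \mid H_W) \le \alpha$
• Adjusted p-values
  • $\Pr(T_{\max} > t_k \mid H_\Omega)$
[Figure: statistic image with threshold $u$ at p = 0.05, p = 0.0001, p = 0.0000001]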
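The “Bonferroni” correction…
• “The” Bonferroni inequality
  • Carlo Emilio Bonferroni (1936)
  • For any set of events $A_k$: $\Pr\left(\bigcap_k A_k\right) \ge 1 - \sum_k \left[1 - \Pr(A_k)\right]$
• Bonferroni correction
  • $A_k$: correctly “accept” $H_k$, i.e. $T_k < u$ & $H_k$
  • Assess $H_k$ at level $\alpha'$
  • correction: $\alpha' = \alpha / K$
  • threshold: $u_\alpha = \Phi^{-1}(1 - \alpha/K)$
• Adjusted p-values
  • $\min(1, K p_k)$
• Conservative for correlated tests
  • independent: $K$ tests
  • some dependence: ? tests
  • totally dependent: 1 test
[Figure panels: 5mm, 10mm, 15mm]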
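The three Bonferroni quantities on this slide (corrected level, adjusted p-values, and the Gaussian threshold) can be sketched in a few lines. The function names, the example p-values and the count of 100,000 tests are hypothetical and chosen only for illustration; a unit Gaussian statistic is assumed for the threshold.

```python
from statistics import NormalDist


def bonferroni_level(alpha, n_tests):
    """Per-test level alpha' = alpha / K."""
    return alpha / n_tests


def bonferroni_adjusted_p(p_values):
    """Adjusted p-values min(1, K * p_k) for a family of K tests."""
    k = len(p_values)
    return [min(1.0, k * p) for p in p_values]


def bonferroni_z_threshold(alpha, n_tests):
    """Threshold u = Phi^{-1}(1 - alpha / K) for a unit Gaussian statistic."""
    return NormalDist().inv_cdf(1.0 - alpha / n_tests)


p_uncorrected = [0.00001, 0.0004, 0.03, 0.6]  # hypothetical p-values
adjusted = bonferroni_adjusted_p(p_uncorrected)
u = bonferroni_z_threshold(alpha=0.05, n_tests=100_000)  # hypothetical K
print(adjusted, round(u, 2))
```

Because the correction treats every voxel as an independent test, the threshold does not relax as the image gets smoother, which is the conservativeness the slide points out for correlated tests.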
SPM approach: Random fields…
• Consider statistic image as lattice representation of a continuous random field
• Use results from continuous random field theory
[Figure: lattice representation]
Euler characteristic…
• Topological measure of excursion set $A_u$
  • $A_u = \{x \in \mathbb{R}^3 : Z(x) > u\}$
  • $\chi(A_u)$ = # components − # “holes”
• Single threshold test
  • large $u$, near $T_{\max}$: Euler char. $\approx$ # local max
  • Expected Euler char. $\approx$ p-value: $\Pr(Z_{\max} > u) \approx \Pr(\chi(A_u) > 0) \approx E[\chi(A_u)]$
  • single threshold test: $u$ s.t. $E[\chi(A_u)] = \alpha$
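Expected Euler characteristic…
• $E[\chi(A_u)] \approx \lambda(\Omega)\, |\Lambda|^{1/2}\, (u^2 - 1) \exp(-u^2/2) / (2\pi)^2$
  • $\Omega$: large search region, $\Omega \subset \mathbb{R}^3$
  • $\lambda(\Omega)$: volume
  • $|\Lambda|^{1/2}$: smoothness
  • $A_u$: excursion set, $A_u = \{x \in \mathbb{R}^3 : Z(x) > u\}$
  • $Z(x)$: Gaussian random field, $x \in \mathbb{R}^3$
    + Multivariate Normal finite-dimensional distributions
    + continuous
    + strictly stationary
    + marginal $N(0,1)$
    + continuously differentiable
    + twice differentiable at 0
    + Gaussian ACF (at least near local maxima)
[Figure: excursion set $A_u$]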
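Smoothness, PRF, resels...
• Smoothness $|\Lambda|^{1/2}$
  • $\Lambda$: variance-covariance matrix of partial derivatives (possibly location dependent)
• Point Response Function (PRF)
• Full Width at Half Maximum (FWHM)
• Gaussian PRF
  • $\Sigma$: kernel var/cov matrix
  • ACF: $2\Sigma$
  • $\Lambda = (2\Sigma)^{-1}$
  • FWHM: $f = \sigma \sqrt{8 \ln 2}$
  • ignoring covariances: $\Sigma = \frac{1}{8 \ln 2}\, \mathrm{diag}(f_x^2, f_y^2, f_z^2)$
  • $|\Lambda|^{1/2} = (4 \ln 2)^{3/2} / (f_x f_y f_z)$
• Resolution Element (RESEL)
  • Resel dimensions $(f_x \times f_y \times f_z)$
  • $R_3(\Omega) = \lambda(\Omega) / (f_x f_y f_z)$, if strictly stationary
  • $E[\chi(A_u)] \approx R_3(\Omega)\, (4 \ln 2)^{3/2}\, (u^2 - 1) \exp(-u^2/2) / (2\pi)^2$
  • $\approx R_3(\Omega)\, (1 - \Phi(u))$ for high thresholds $u$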
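The resel count and the single-threshold test "u such that E[χ(A_u)] = α" can be sketched numerically. This is a minimal sketch assuming a stationary 3D Gaussian field and only the volume term of the expected Euler characteristic; the function names, the search volume of 1,000,000 mm³ and the 10mm FWHM are hypothetical, and bisection is simply one convenient way to solve for the threshold.

```python
import math


def resel_count(volume_mm3, fwhm_mm):
    """R3 = volume / (fx * fy * fz), with FWHM given per axis in mm."""
    fx, fy, fz = fwhm_mm
    return volume_mm3 / (fx * fy * fz)


def expected_ec_3d(resels, u):
    """E[EC] ~ R3 (4 ln 2)^(3/2) (u^2 - 1) exp(-u^2 / 2) / (2 pi)^2."""
    return (resels * (4 * math.log(2)) ** 1.5 * (u * u - 1)
            * math.exp(-u * u / 2) / (2 * math.pi) ** 2)


def ec_threshold(resels, alpha=0.05, lo=2.0, hi=10.0, tol=1e-10):
    """Bisection for u with E[EC] = alpha (E[EC] decreases for u > sqrt(3))."""
    if not expected_ec_3d(resels, hi) < alpha < expected_ec_3d(resels, lo):
        raise ValueError("alpha is not bracketed on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_ec_3d(resels, mid) > alpha:
            lo = mid  # still too many expected excursions: raise threshold
        else:
            hi = mid
    return 0.5 * (lo + hi)


resels = resel_count(1_000_000, (10.0, 10.0, 10.0))  # hypothetical volume
u = ec_threshold(resels, alpha=0.05)
print(f"resels = {resels:.0f}, threshold u = {u:.3f}")
```

The threshold depends on the search volume only through the resel count, so smoother data (larger FWHM) gives fewer resels and a lower corrected threshold.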
Component fields… (“Image regression”)
• $Y = X\beta + \varepsilon$
• data matrix $Y$ (scans × voxels) = design matrix $X$ × parameters $\beta$ + errors $\varepsilon$, with variance $\sigma^2$
• estimate: parameter estimates $\hat\beta$, residuals, estimated variance $\hat\sigma^2$ = estimated component fields
Smoothness estimation…
• Smoothness
  • from standardised residuals
  • empirical derivatives at each voxel
  • Resels per voxel (RPV) – an “image” of smoothness
  • correction for estimation of variance field $\sigma^2$
    • function of degrees of freedom
  • covariances often ignored
• Euler characteristics
  • using discrete methods
Unified p-values…
• General form for expected Euler characteristic
  • $\chi^2$, $F$, & $t$ fields
  • restricted search regions
  • $D$ dimensions
• $E[\chi(\Omega \cap A_u)] = \sum_d R_d(\Omega)\, \rho_d(u)$
• $R_d(\Omega)$: $d$-dimensional Minkowski functional of $\Omega$ – function of dimension, space $\Omega$ and smoothness:
  • $R_0(\Omega) = \chi(\Omega)$, Euler characteristic of $\Omega$
  • $R_1(\Omega)$ = resel diameter
  • $R_2(\Omega)$ = resel surface area
  • $R_3(\Omega)$ = resel volume
• $\rho_d(u)$: $d$-dimensional EC density of $Z(x)$ – function of dimension and threshold, specific for RF type. E.g. Gaussian RF (strictly stationary &c…):
  • $\rho_0(u) = 1 - \Phi(u)$
  • $\rho_1(u) = (4 \ln 2)^{1/2} \exp(-u^2/2) / (2\pi)$
  • $\rho_2(u) = (4 \ln 2)\, u \exp(-u^2/2) / (2\pi)^{3/2}$
  • $\rho_3(u) = (4 \ln 2)^{3/2} (u^2 - 1) \exp(-u^2/2) / (2\pi)^2$
  • $\rho_4(u) = (4 \ln 2)^2 (u^3 - 3u) \exp(-u^2/2) / (2\pi)^{5/2}$
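The unified sum over dimensions can be sketched for a Gaussian field in up to three dimensions. The resel counts below are hypothetical placeholders, not values from the course, and the d = 2 density is written with its factor of u, as in the standard Gaussian-field result.

```python
import math
from statistics import NormalDist

FOUR_LN2 = 4 * math.log(2)
TWO_PI = 2 * math.pi


def gaussian_ec_density(d, u):
    """EC density rho_d(u) of a unit-variance Gaussian field, d = 0..3."""
    e = math.exp(-u * u / 2)
    if d == 0:
        return 1.0 - NormalDist().cdf(u)
    if d == 1:
        return FOUR_LN2 ** 0.5 * e / TWO_PI
    if d == 2:
        return FOUR_LN2 * u * e / TWO_PI ** 1.5
    if d == 3:
        return FOUR_LN2 ** 1.5 * (u * u - 1) * e / TWO_PI ** 2
    raise ValueError("only d = 0..3 are implemented in this sketch")


def expected_ec(resel_counts, u):
    """Sum over d of R_d * rho_d(u); resel_counts = [R0, R1, R2, R3]."""
    return sum(r * gaussian_ec_density(d, u)
               for d, r in enumerate(resel_counts))


resel_counts = [1.0, 30.0, 300.0, 1000.0]  # hypothetical R0..R3
p_corrected = expected_ec(resel_counts, u=4.5)
print(f"approximate corrected p-value at u = 4.5: {p_corrected:.4f}")
```

For a large search region the R3 term dominates; the lower-dimensional terms matter more when the region is small or thin relative to the smoothness.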
Suprathreshold cluster tests…
• Primary threshold $u$
  • examine connected components of excursion set
  • Suprathreshold clusters
  • Reject $H_W$ for clusters of voxels $W$ of size $S > s$
• Localisation (strong control)
  • at cluster level
  • increased power, esp. at high resolutions (fMRI)
• Thresholds, p-values
  • $\Pr(S_{\max} > s \mid H_\Omega) \le \alpha$
  • Nosko, Friston, (Worsley)
  • Poisson occurrence (Adler)
  • Assume form for $\Pr(S = s \mid S > 0)$
[Figure panels: 5mm FWHM, 10mm FWHM, 15mm FWHM (2mm² pixels)]
Levels of inference…
• Parameters: $u$ = 3.09; $k$ = 12 voxels; $S$ = 32³ voxels; FWHM = 4.7 voxels; $D$ = 3
• voxel-level
  • $P(c \ge 1 \mid n \ge 0,\ t \ge 4.37) = 0.048$ (corrected)
  • $P(t \ge 4.37) = 1 - \Phi\{4.37\} < 0.001$ (uncorrected)
• cluster-level
  • $P(c \ge 1 \mid n \ge 82,\ t \ge 3.09) = 0.029$ (corrected)
  • $P(n \ge 82 \mid t \ge 3.09) = 0.019$ (uncorrected)
• set-level
  • $P(c \ge 3 \mid n \ge 12,\ u \ge 3.09) = 0.019$
• omnibus
  • $P(c \ge 7 \mid n \ge 0,\ u \ge 3.09) = 0.031$
[Figure: suprathreshold clusters of size n = 12, n = 82, n = 32]
Assumptions…
• Model fit & assumptions
  • valid distributional results
• Multivariate normality
  • of component images
• Strict stationarity (pre SPM99)
  • of component images
  • homogeneous spatial structure
• Smoothness
  • smoothness » voxel size
    • lattice approximation
    • smoothness estimation
  • practically: FWHM $\ge$ 3 × VoxDim
  • otherwise: conservative (voxel level), lax (spatial extent)
  • spatial smoothing? temporal smoothing?
Random effects & variance components
• Fixed effects
  • Are you confident that a new observation from any of subjects 1-3 will be greater than zero?
  • Yes! (using within-subjects variance)
  • infer for these subjects – case study
• Random effects
  • Are you confident that a new observation from a new subject will be greater than zero?
  • No! (using between-subjects variance)
  • infer for any subject – population
Multi-subject analysis…?
[Figure: parameter and variance estimates for subjects 1–6; estimated mean activation image, c.f. $\sigma^2_{\text{within}} / nw$; SPM{t} shown at p < 0.001 (uncorrected) and at p < 0.05 (corrected)]
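Two-stage analysis of random effect…
• level-one (within-subject): parameter and variance estimates for subjects 1–6, giving contrast images
• level-two (between-subject): the variance $\hat\sigma^2$ of the contrast images is an estimate of the mixed-effects model variance $\sigma^2_{\text{between}} + \sigma^2_{\text{within}} / w$
• mean of the contrast images, c.f. $\sigma^2 / n = \sigma^2_{\text{between}} / n + \sigma^2_{\text{within}} / nw$
• SPM{t}: p < 0.001 (uncorrected); no voxels significant at p < 0.05 (corrected)
[Figure: contrast images; timecourses at [ 03, -78, 00 ]]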
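At a single voxel, the level-two step amounts to a one-sample t-test on the subjects' contrast values. This is a sketch with hypothetical contrast values for six subjects; the function name `level_two_t` is an assumption, not part of SPM.

```python
import math
from statistics import mean, variance


def level_two_t(contrasts):
    """One-sample t statistic across subjects at one voxel.

    The sample variance of the level-one contrasts estimates the
    mixed-effects variance (between-subject + within-subject / w).
    """
    n = len(contrasts)
    s2 = variance(contrasts)  # unbiased sample variance, n - 1 denominator
    t = mean(contrasts) / math.sqrt(s2 / n)
    return t, n - 1


subject_contrasts = [0.8, 1.1, 0.3, 1.4, 0.9, 0.5]  # hypothetical values
t_stat, df = level_two_t(subject_contrasts)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

The degrees of freedom come from the number of subjects, not the number of scans, which is why a random-effects SPM{t} can show nothing at a corrected threshold even when the fixed-effects map looks strong.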
Two-stage random effects group comparison vs. two-sample t-test
• 12 subjects
• level-one (within-subject): contrast images
• level-two (between-subject)
Multi-stage multi-level modelling…
[Figure: flow diagram. Labels: level-1 data, model & contrast(s); parameter estimation; estimated contrasts from level-1 fits, level-2 model & level-2 contrasts; level 2 estimated contrasts and residual variance; inference; level 2 (population) inference]
Hypothesis testing !?
• Why test?
  • reliability
  • genuine effects
  • integrity of research (hopefully)
• The fallacy…
  • point null hypothesis (no change)
  • things are never the same! (always some small chance change)
  • given enough observations can always reject null hypothesis!
  • fMRI !? (lots of observations)
• …testing, rather than estimating
  • significant vs. important !?
• …and: “absence of evidence is not evidence of absence” !?
Multiple Comparisons & Random Field Theory (Ch5, Ch4)
• Worsley KJ, Marrett S, Neelin P, Evans AC (1992) “A three-dimensional statistical analysis for CBF activation studies in human brain” Journal of Cerebral Blood Flow and Metabolism 12:900-918
• Worsley KJ, Marrett S, Neelin P, Vandal AC, Friston KJ, Evans AC (1995) “A unified statistical approach for determining significant signals in images of cerebral activation” Human Brain Mapping 4:58-73
• Friston KJ, Worsley KJ, Frackowiak RSJ, Mazziotta JC, Evans AC (1994) “Assessing the Significance of Focal Activations Using their Spatial Extent” Human Brain Mapping 1:214-220
• Cao J (1999) “The size of the connected components of excursion sets of χ², t and F fields” Advances in Applied Probability (in press)
• Worsley KJ, Marrett S, Neelin P, Evans AC (1995) “Searching scale space for activation in PET images” Human Brain Mapping 4:74-90
• Worsley KJ, Poline J-B, Vandal AC, Friston KJ (1995) “Tests for distributed, non-focal brain activations” NeuroImage 2:183-194
• Friston KJ, Holmes AP, Poline J-B, Price CJ, Frith CD (1996) “Detecting Activations in PET and fMRI: Levels of Inference and Power” NeuroImage 4:223-235
index • overview • multiple comparisons • random field theory • random effects • hypothesis testing fallacy