SPM short course – May 2009: Linear Models and Contrasts
Jean-Baptiste Poline, Neurospin, I2BM, CEA Saclay, France
[Overview figure: images → realignment & coregistration → normalisation (anatomical reference) → smoothing (spatial filter) → General Linear Model / linear fit (design matrix) → statistical image → statistical map (uncorrected p-values) → Random Field Theory → corrected p-values. Your question enters as a contrast; adjusted data come out of the fit.]
Plan
• Make sure we know all about the estimation (fitting) part ....
• Make sure we understand the testing procedures: t- and F-tests
• But what do we test exactly?
• An example – almost real
One voxel = one test (t, F, ...)
[Figure: an fMRI voxel time course (temporal series, amplitude over time) is fitted with the General Linear Model; the fit at every voxel yields a statistical image (SPM).]
Regression example…
Fit the GLM: voxel time series = b1 × box-car reference function + b2 × mean value + error; here the fit gives b1 = 1 and b2 = 100. [Figure: the three components and their sum.]
Regression example… (larger response)
The same fit on a voxel with a stronger response gives b1 = 5 and b2 = 100. [Figure: voxel time series = b1 × box-car reference function + b2 × mean value + error.]
…revisited: matrix form
Y = b1 × f(t) + b2 × 1 + error: the data are a weighted sum of the regressors plus an error term e.
Box-car regression: design matrix…
data vector (voxel time series) = design matrix × parameter vector (b1, b2) + error vector:
Y = X b + e
Add more reference functions: discrete cosine transform (DCT) basis functions, to model slow (low-frequency) effects.
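As an aside, such a basis set can be generated in a few lines. The sketch below uses the standard orthonormal DCT-II definition; it is illustrative only, not SPM's spm_dctmtx, although the construction is equivalent up to scaling:

```python
import numpy as np

def dct_basis(n, k):
    """DCT basis set: n time points, k functions (standard DCT-II scaling)."""
    t = np.arange(n)
    C = np.zeros((n, k))
    C[:, 0] = 1.0 / np.sqrt(n)                   # constant term
    for j in range(1, k):
        C[:, j] = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * j / (2 * n))
    return C

C = dct_basis(128, 8)
# The columns are mutually orthonormal: C'C = identity
print(np.allclose(C.T @ C, np.eye(8)))
```

Because the columns are orthonormal, adding them to a design matrix does not change the parameter estimates of regressors they are orthogonal to.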
…design matrix = the betas (here: 1 to 9)
data vector = design matrix × parameter vector (b1 … b9) + error vector:
Y = X b + e
Fitting the model = finding some estimate of the betas = minimising the sum of squares of the residuals.
S = sum of the squared values of the residuals, and the residual variance estimate is S² = S / (number of time points − number of estimated betas).
[Figure: raw fMRI time series; adjusted for low-Hz effects; fitted box-car; fitted “high-pass filter”; residuals.]
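The whole fitting step can be sketched in NumPy on simulated data (a hypothetical box-car design and made-up true values b1 = 5, b2 = 100, not the slides' data):

```python
import numpy as np

# Simulate a voxel time series: box-car regressor plus a constant term.
rng = np.random.default_rng(0)
n = 80
boxcar = np.tile([0.0] * 10 + [1.0] * 10, 4)        # off/on blocks of 10 scans
X = np.column_stack([boxcar, np.ones(n)])           # design matrix
Y = X @ np.array([5.0, 100.0]) + rng.normal(0, 2, n)

# OLS: minimise the sum of squared residuals
b = np.linalg.pinv(X) @ Y        # b = (X'X)^+ X'Y
e = Y - X @ b                    # residuals (orthogonal to the design)
p = np.linalg.matrix_rank(X)     # number of estimated betas
s2 = e @ e / (n - p)             # S^2 = e'e / (n - p)
```

The estimates land close to the simulated values, and the residuals are exactly orthogonal to the columns of X, which is the defining property of the least-squares fit.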
Take home ...
• We put in our model regressors (or covariates) that represent how we think the signal varies (of interest and of no interest alike). THE QUESTION IS: WHICH ONES TO INCLUDE?
• Coefficients (= parameters) are estimated by Ordinary Least Squares (OLS), by minimising the fluctuation – variability – variance – of the noise (the residuals).
• Because the parameters depend on the scaling of the regressors included in the model, one should be careful when comparing manually entered regressors.
• The residuals, their sum of squares and the resulting tests (t, F) do not depend on the scaling of the regressors.
Plan • Make sure we all know about the estimation (fitting) part .... • Make sure we understand t and F tests • But what do we test exactly ? • An example – almost real
T-test – one-dimensional contrasts – SPM{t}
A contrast = a linear combination of parameters: c′b.
Is the box-car amplitude > 0? i.e. is b1 > 0? Compute 1×b1 + 0×b2 + 0×b3 + 0×b4 + 0×b5 + ... and divide by the estimated standard deviation:
c′ = [ 1 0 0 0 0 0 0 0 ]
T = contrast of estimated parameters / variance estimate = c′b / sqrt(s² c′(X′X)⁺c) → SPM{t}
How is this computed? (t-test)
Estimation [Y, X] → [b, s]:
Y = X b + e, e ~ N(0, s²I) (Y: at one position)
b = (X′X)⁺ X′Y (b: estimate of b) → beta??? images
e = Y − Xb (e: estimate of e)
s² = e′e / (n − p) (s: estimate of s; n: time points; p: parameters) → 1 image, ResMS
Test [b, s², c] → [c′b, t]:
Var(c′b) = s² c′(X′X)⁺c (computed for each contrast c, proportional to s²)
t = c′b / sqrt(s² c′(X′X)⁺c) (c′b → spm_con??? images; the t images → spm_t??? images)
Under the null hypothesis H0: t ~ Student-t(df), df = n − p.
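On simulated data, the whole t computation above reads as follows (an illustrative sketch with made-up values, not SPM's implementation):

```python
import numpy as np

# Simulated voxel: box-car amplitude b1 = 5, mean b2 = 100 (hypothetical).
rng = np.random.default_rng(2)
n = 80
boxcar = np.tile([0.0] * 10 + [1.0] * 10, 4)
X = np.column_stack([boxcar, np.ones(n)])
Y = X @ np.array([5.0, 100.0]) + rng.normal(0, 2, n)

b = np.linalg.pinv(X) @ Y                     # b = (X'X)^+ X'Y
e = Y - X @ b
p = np.linalg.matrix_rank(X)
s2 = e @ e / (n - p)                          # residual variance estimate

c = np.array([1.0, 0.0])                      # "box-car amplitude > 0?"
var_cb = s2 * c @ np.linalg.pinv(X.T @ X) @ c # Var(c'b)
t = c @ b / np.sqrt(var_cb)                   # ~ Student-t(n - p) under H0
```

With a true amplitude well above the noise level, t comes out large and the null hypothesis would be rejected.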
F-test: a reduced model or ...
Tests multiple linear hypotheses: does X1 model anything?
H0: the true (reduced) model is X0. Compare this (full) model [X0 X1] with the reduced model [X0]:
F ~ (S0² − S²) / S²
i.e. the additional variance accounted for by the tested effects, over the error variance estimate (S0²: error variance of the reduced model; S²: of the full model).
F-test: a reduced model or ... multi-dimensional contrasts?
Tests multiple linear hypotheses, e.g.: does the DCT set model anything?
Test H0: c′b = 0, with H0: b3-9 = (0 0 0 0 ...), i.e. H0: the true model is X0.
c′ is a matrix with one row per tested parameter: each row has a 1 in the column of one of b3 … b9 and zeros elsewhere (an identity block over the tested columns).
Compare this model (X0 alone) with the full one (X0 and X1(b3-9)) → SPM{F}
How is this computed? (F-test)
Estimation [Y, X] → [b, s]:
Y = X b + e, e ~ N(0, s²I)
Reduced estimation [Y, X0] → [b0, s0] (X0: reduced design):
Y = X0 b0 + e0, e0 ~ N(0, s0²I)
b0 = (X0′X0)⁺ X0′Y
e0 = Y − X0 b0 (e0: estimate of e0)
s0² = e0′e0 / (n − p0) (s0: estimate of s0; n: time points; p0: parameters)
Test [b, s, c] → [ess, F]:
F ~ (s0² − s²) / s²: additional variance accounted for by the tested effects, over the error variance estimate (→ image spm_ess???; image of F: spm_F???)
Under the null hypothesis: F ~ F(p − p0, n − p).
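A sketch of the F computation by explicit model comparison (simulated data; the drift regressor is a made-up example of a tested effect, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 80
boxcar = np.tile([0.0] * 10 + [1.0] * 10, 4)
drift = np.cos(np.pi * np.arange(n) / n)           # hypothetical slow drift
X = np.column_stack([boxcar, drift, np.ones(n)])   # full model
X0 = np.ones((n, 1))                               # reduced model: mean only

Y = X @ np.array([5.0, 3.0, 100.0]) + rng.normal(0, 2, n)

def rss(X, Y):
    """Residual sum of squares of the OLS fit of Y on X."""
    e = Y - X @ np.linalg.pinv(X) @ Y
    return e @ e

p, p0 = 3, 1
# F = additional variance explained by the tested effects / error variance
F = ((rss(X0, Y) - rss(X, Y)) / (p - p0)) / (rss(X, Y) / (n - p))
# Under H0 (true model is X0), F ~ F(p - p0, n - p); here F is large.
```

The reduced model can never fit better than the full one, so the numerator is non-negative; it is large only when the extra regressors genuinely explain variance.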
t and F tests: take home ...
• t-tests are simple combinations of the betas; they are signed (b1 − b2 is different from b2 − b1).
• F-tests can be viewed as testing for the additional variance explained by a larger model w.r.t. a simpler model, or
• F-tests test the sum of squares of one or several combinations of the betas.
• When testing a “single contrast” with an F-test, e.g. b1 − b2, the result is the same as for b2 − b1: it is exactly the square of the t-test, testing for both positive and negative effects.
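The last point can be checked numerically: for a single contrast, the F value equals the square of the t value (simulated two-condition data with made-up values):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60
c1 = np.tile([1.0, 0.0, 0.0], 20)      # condition-1 scans
c2 = np.tile([0.0, 1.0, 0.0], 20)      # condition-2 scans
X = np.column_stack([c1, c2, np.ones(n)])
Y = X @ np.array([3.0, 1.0, 10.0]) + rng.normal(0, 1, n)

b = np.linalg.pinv(X) @ Y
e = Y - X @ b
s2 = e @ e / (n - np.linalg.matrix_rank(X))

# t-test of the contrast c'b = b1 - b2
c = np.array([1.0, -1.0, 0.0])
t = c @ b / np.sqrt(s2 * c @ np.linalg.pinv(X.T @ X) @ c)

# F-test of the same single contrast, via the reduced model imposing b1 = b2
X0 = np.column_stack([c1 + c2, np.ones(n)])
e0 = Y - X0 @ np.linalg.pinv(X0) @ Y
F = (e0 @ e0 - e @ e) / s2             # one numerator degree of freedom
```

F = t² here, which is why the single-contrast F-test is two-tailed: it cannot distinguish b1 − b2 from b2 − b1.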
Plan • Make sure we all know about the estimation (fitting) part .... • Make sure we understand t and F tests • But what do we test exactly ? Correlation between regressors • An example – almost real
« Additional variance » : Again Independent contrasts
« Additional variance » : Again Testing for the green correlated regressors, for example green: subject age yellow: subject score
« Additional variance » : Again Testing for the red correlated contrasts
« Additional variance » : Again Testing for the green Entirely correlated contrasts ? Non estimable !
« Additional variance » : Again Testing for the green and yellow If significant ? Could be G or Y ! Entirely correlated contrasts ? Non estimable !
Plan • Make sure we all know about the estimation (fitting) part .... • Make sure we understand t and F tests • But what do we test exactly ? Correlation between regressors • An example – almost real
A real example (almost!)
Experimental design: a factorial design with 2 factors, modality and category; 2 levels for modality (e.g. Visual/Auditory) and 3 levels for category (e.g. 3 categories of words).
Design matrix columns: V, A, C1, C2, C3.
Asking ourselves some questions ...
Design matrix columns: V A C1 C2 C3
Test V > A: c = [ 1 -1 0 0 0 0 ]
Test C1 > C2: c = [ 0 0 1 -1 0 0 ]
Test C1, C2, C3? (F):
c = [ 0 0 1 0 0 0
      0 0 0 1 0 0
      0 0 0 0 1 0 ]
Test the interaction M×C?
• Design matrix not orthogonal
• Many contrasts are non estimable
• Interactions M×C are not modelled
Asking ourselves some questions ...
Design matrix columns: one per cell, C1 C2 C3 crossed with modality (V A V A V A), plus a constant.
Test V > A: c = [ 1 -1 1 -1 1 -1 0 ]
Test C1 > C2: c = [ 1 1 -1 -1 0 0 0 ]
Test the category effect (F):
c = [ 1 1 -1 -1 0 0 0
      0 0 1 1 -1 -1 0
      1 1 0 0 -1 -1 0 ]
Test the interaction M×C (F):
c = [ 1 -1 -1 1 0 0 0
      0 0 1 -1 -1 1 0
      1 -1 0 0 -1 1 0 ]
• Design matrix orthogonal
• All contrasts are estimable
• Interactions M×C modelled
• If there is no interaction ...? The model is too “big”!
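Estimability can be checked numerically: a contrast c is estimable iff c′ lies in the row space of X, i.e. c′ = c′X⁺X. The sketch below uses a hypothetical cell-means parameterisation (one column per cell plus a constant, hence rank-deficient), not the exact design above:

```python
import numpy as np

# Hypothetical 2x3 factorial, 4 scans per cell: 6 cell columns + constant.
cells = np.kron(np.eye(6), np.ones((4, 1)))   # 24 scans, one column per cell
X = np.column_stack([cells, np.ones(24)])     # rank 6, not 7: rank-deficient

def estimable(c, X):
    # c is estimable iff it lies in the row space of X: c' = c' X^+ X
    return np.allclose(c, c @ np.linalg.pinv(X) @ X)

c_main = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 0.0])  # V > A
c_cell = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])     # one cell alone
print(estimable(c_main, X), estimable(c_cell, X))
```

Differences between cells are estimable, but a single cell mean by itself is not, because it cannot be separated from the constant term.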
Asking ourselves some questions ... with a more flexible model
Each condition (C1 C2 C3 × V A) is now modelled with two regressors (the second coding for the delay), plus a constant: 13 columns.
Test C1 > C2? It becomes “test C1 different from C2”, an F-test:
from c = [ 1 1 -1 -1 0 0 0 ]
to  c = [ 1 0 1 0 -1 0 -1 0 0 0 0 0 0
          0 1 0 1 0 -1 0 -1 0 0 0 0 0 ]
Test V > A? c = [ 1 0 -1 0 1 0 -1 0 1 0 -1 0 0 ] is possible, but it is OK only if the regressors coding for the delay are all equal.
[Figure: SPM interface – convolution model; design and contrast; SPM(t) or SPM(F); fitted and adjusted data.]
Toy example: take home ...
• F-tests have to be used when
  – testing for >0 and <0 effects,
  – testing for more than 2 levels,
  – conditions are modelled with more than one regressor.
• F-tests can be viewed as testing for
  – the additional variance explained by a larger model w.r.t. a simpler model, or
  – the sum of squares of one or several combinations of the betas (here the F-test on b1 − b2 is the same as on b2 − b1, but two-tailed compared to a t-test).
Plan • Make sure we all know about the estimation (fitting) part .... • Make sure we understand t and F tests • But what do we test exactly ? Correlation between regressors • A (nearly) real example • A bad model ... And a better one
A bad model ...
True signal (blue, peak at 3 s) and observed signal (---); model (green, peak at 6 s).
Fitting: b1 = 0.2, mean = 0.11. The residual still contains some signal.
=> Test for the green regressor: not significant.
A bad model ...
Y = X b + e
b1 = 0.22, b2 = 0.11; residual variance = 0.3
P(Y | b1 = 0) => p-value = 0.1 (t-test)
P(Y | b1 = 0) => p-value = 0.2 (F-test)
A « better » model ...
True signal + observed signal; model (green and red) and true signal (blue ---).
Red regressor: temporal derivative of the green regressor.
Global fit (blue) and partial fit (green & red); adjusted and fitted signal; residual (a smaller variance).
=> t-test of the green regressor: significant
=> F-test: very significant
=> t-test of the red regressor: very significant
A better model ...
Y = X b + e
b1 = 0.22, b2 = 2.15, b3 = 0.11; residual variance = 0.2
P(Y | b1 = 0) => p-value = 0.07 (t-test)
P(Y | b1 = 0, b2 = 0) => p-value = 0.000001 (F-test)
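Why the temporal derivative rescues a model with the wrong peak time can be sketched as follows (simulated data with made-up numbers; a Gaussian bump stands in for the response): a response delayed by d is approximated to first order by f(t) − d·f′(t), so adding the derivative regressor soaks up the timing error.

```python
import numpy as np

t = np.arange(0, 60, 0.5)
f = np.exp(-(t - 10) ** 2 / 8)        # modelled response (hypothetical shape)
true = np.exp(-(t - 11) ** 2 / 8)     # actual response, delayed by 1 s
df = np.gradient(f, t)                # temporal derivative regressor

rng = np.random.default_rng(6)
Y = true + rng.normal(0, 0.05, t.size)

X1 = np.column_stack([f, np.ones(t.size)])        # without the derivative
X2 = np.column_stack([f, df, np.ones(t.size)])    # with the derivative

def rss(X, Y):
    """Residual sum of squares of the OLS fit."""
    e = Y - X @ np.linalg.pinv(X) @ Y
    return e @ e

print(rss(X2, Y) < rss(X1, Y))   # adding the derivative reduces the residuals
```

The first model leaves the timing mismatch in the residuals; the augmented model absorbs it, which is the toy-example behaviour shown on the slides.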
Summary ... (2)
• The residuals should be looked at ...!
• Test flexible models if there is little a priori information.
• In general, use F-tests to look for an overall effect, then look at the response shape.
• Interpreting the test on a single parameter (one regressor) can be difficult: cf. the delay or magnitude situation.
• BRING ALL PARAMETERS TO THE 2nd LEVEL.
Plan
• Make sure we all know about the estimation (fitting) part ....
• Make sure we understand t and F tests
• A (nearly) real example
• A bad model ... and a better one
• Correlation in our model: do we mind?
Correlation between regressors
True signal; model (green and red); fit (blue: global fit); residual.
Correlation between regressors
Y = X b + e
b1 = 0.79, b2 = 0.85, b3 = 0.06; residual variance = 0.3
P(Y | b1 = 0): p-value = 0.08 (t-test)
P(Y | b2 = 0): p-value = 0.07 (t-test)
P(Y | b1 = 0, b2 = 0): p-value = 0.002 (F-test)
Correlation between regressors – 2
True signal; model (green and red): the red regressor has been orthogonalised with respect to the green one, i.e. everything that correlates with the green regressor has been removed from it. Fit; residual.
Correlation between regressors – 2
Y = X b + e
b1 = 1.47 (was 0.79), b2 = 0.85, b3 = 0.06; residual variance = 0.3
P(Y | b1 = 0): p-value = 0.0003 (t-test)
P(Y | b2 = 0): p-value = 0.07 (t-test)
P(Y | b1 = 0, b2 = 0): p-value = 0.002 (F-test)
See « explore design »
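The effect of orthogonalisation on the betas can be reproduced on simulated data (made-up regressors and values): the parameter of the modified regressor is unchanged, while the unmodified one absorbs the shared variance.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x1 = rng.normal(size=n)                  # "green" regressor
x2 = 0.7 * x1 + rng.normal(size=n)       # "red" regressor, correlated with x1
Y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)

X = np.column_stack([x1, x2, np.ones(n)])
b = np.linalg.lstsq(X, Y, rcond=None)[0]

# Orthogonalise x2 w.r.t. x1: remove everything that correlates with x1
x2o = x2 - x1 * (x1 @ x2) / (x1 @ x1)
Xo = np.column_stack([x1, x2o, np.ones(n)])
bo = np.linalg.lstsq(Xo, Y, rcond=None)[0]

print(np.isclose(b[1], bo[1]))   # modified regressor: parameter unchanged
print(np.isclose(b[0], bo[0]))   # unmodified regressor: parameter changed
```

This is exact algebra, not an accident of the simulation: replacing x2 by x2 − γx1 leaves the model span unchanged, so the x2 coefficient is preserved while the x1 coefficient becomes b1 + γb2.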
Design orthogonality: « explore design »
The display shows, for each pair of regressors (1, 2, ...), their degree of correlation (Corr(1,1), Corr(1,2), ...): black = completely correlated, white = completely orthogonal.
Beware: when there are more than 2 regressors (C1, C2, C3, ...), you may think there is little correlation (light grey) between them, but C1 + C2 + C3 may still be correlated with C4 + C5.
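A sketch of such an orthogonality display (assuming it shows the absolute cosine of the angle between pairs of design columns), together with a direct demonstration of the caveat: regressors can be pairwise nearly orthogonal yet jointly collinear with another column.

```python
import numpy as np

def design_orthogonality(X):
    # Absolute cosine of the angle between every pair of design columns:
    # 0 = orthogonal, 1 = completely correlated.
    Xn = X / np.linalg.norm(X, axis=0)
    return np.abs(Xn.T @ Xn)

# Nine mutually orthonormal regressors, and a tenth that is exactly their
# sum: every pairwise cosine with the tenth is only 1/3 ("light grey"),
# yet the tenth column is perfectly predicted by the other nine.
rng = np.random.default_rng(7)
Q, _ = np.linalg.qr(rng.normal(size=(60, 9)))     # 9 orthonormal columns
x = Q.sum(axis=1)                                 # tenth regressor
O = design_orthogonality(np.column_stack([Q, x]))

print(O[:9, 9])                  # pairwise cosines with x: all 1/3
proj = Q @ (Q.T @ x)             # projection of x onto the other columns
print(np.allclose(proj, x))      # x lies entirely in their span
```

So a pale pairwise display is not proof of a well-conditioned design; joint (multiple) correlation is what matters.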
Summary
• We implicitly test for an additional effect only; be careful if there is correlation.
• Orthogonalisation = decorrelation: not generally needed.
• Parameters and tests on the non-modified regressor change.
• It is always simpler to have orthogonal regressors, and therefore orthogonal designs!
• In case of correlation, use F-tests to see the overall significance. There is generally no way to decide to which regressor the « common » part should be attributed.
• The original regressors may not matter: it is the contrast you are testing that should be as decorrelated as possible from the rest of the design matrix.
Conclusion: check your models
• Check your residuals/model – multivariate toolbox
• Check group homogeneity – distance toolbox
• Check your HRF form – HRF toolbox
www.madic.org