Why N ’ How (I forgot the title)

Donald G. McLaren, Ph.D. • Department of Neurology, MGH/HMS • GRECC, ERNM Veteran’s Hospital • http://www.martinos.org/~mclaren • 11/15/2012

Types of Data

Types of Data – Dependent Variable • Task Data • Single Condition • Multiple Conditions • Multiple Predictors Per Condition • Functional Connectivity – Correlation • Functional Connectivity -- ICA • Context-Dependent Connectivity • VBM • DTI • Other??

Factors, Levels, Groups, Classes • Continuous Variables/Factors: Age, IQ, Volume, Behavioral measures (emotional scale, memory ability), Images, etc. • Discrete Variables/Factors: Gender, Handedness, Diagnosis • Levels of Discrete Factors: Handedness: Left and Right; Gender: Male and Female; Diagnosis: Normal, MCI, AD • Group or Class: Specification of All Discrete Factors, e.g. Left-handed Male MCI, Right-handed Female Normal

Overview • From a line to the GLM and matrices • Statistical Tests • Contrasts • Designs • Power • Caveats

General Linear Model (GLM) • Y = aX + b (a: slope, b: intercept)

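GLM Theory • Is Activity correlated with Age? • [Figure: scatter plot of Activity (Dependent Variable, Measurement; e.g. HRF Amplitude) against Age (Independent Variable; could also be IQ, Height, Weight), with one point per subject: (x1, y1) for Subject 1 and (x2, y2) for Subject 2] • Of course, you’d need more than two subjects …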

Matrix Formulation
• Linear Model as a System of Linear Equations (b = Intercept = Offset, m = Slope):
  y1 = 1*b + x1*m
  y2 = 1*b + x2*m
• In matrix form: $\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \end{bmatrix} \begin{bmatrix} b \\ m \end{bmatrix}$, i.e. Y = X*β
• X = Design Matrix
• β = [b; m] = Regression Coefficients = Parameter estimates = “betas” = Intercepts and Slopes
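A minimal sketch of the two-subject system above, assuming NumPy and made-up ages and activity values (the numbers and variable names are illustrative, not from the slides). With two equations and two unknowns, the intercept b and slope m can be solved for exactly:

```python
import numpy as np

# Hypothetical data: ages (x1, x2) and activity (y1, y2) for two subjects.
ages = np.array([20.0, 60.0])
activity = np.array([1.0, 3.0])

# Design matrix X: a column of ones (intercept b) and a column of ages (slope m).
X = np.column_stack([np.ones_like(ages), ages])

# Two equations, two unknowns: solve Y = X * [b, m] exactly.
coefficients = np.linalg.solve(X, activity)
intercept, slope = coefficients
print("intercept b =", intercept, "slope m =", slope)
```

With only two subjects the line passes through both points, so there is no error term left to estimate; that is why more subjects are needed for any statistical test.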

Hypotheses and Contrasts
• Is Activity correlated with Age? Does m = 0?
• Null Hypothesis: H0: m = 0
• With β = [b; m] (Intercept b, Slope m): m = [0 1]*β
• C = [0 1]: Contrast Matrix
• Test: γ = C*β = 0?

Hypotheses and Contrasts
• Is Activity different from 0? Does b = 0?
• Null Hypothesis: H0: b = 0
• With β = [b; m] (Intercept b, Slope m): b = [1 0]*β
• C = [1 0]: Contrast Matrix
• Test: γ = C*β = 0?

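Hypotheses and Contrasts (intercept-only model)
• Without the Age column, the design matrix is a single column of ones: [y1; y2] = [1; 1]*b, so β = [b]
• Is Activity different from 0? Does b = 0?
• Null Hypothesis: H0: b = 0
• b = [1]*β, so the Contrast Matrix reduces to C = [1] (C = [1 0] applies to the two-column model with a slope)
• Test: γ = C*β = 0?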

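More than Two Data Points
• One equation per data point (Intercept: b, Slope: m), each with its own error term:
  y1 = 1*b + x1*m + n1
  y2 = 1*b + x2*m + n2
  y3 = 1*b + x3*m + n3
  y4 = 1*b + x4*m + n4
• In matrix form: $\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \\ 1 & x_4 \end{bmatrix} \begin{bmatrix} b \\ m \end{bmatrix} + n$, i.e. Y = X*β + n
• n: Model Error • Noise • Uncertainty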
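With more data points than parameters the system can no longer be solved exactly, so the intercept and slope are estimated by least squares and the remainder is the error term n. The sketch below assumes NumPy and four made-up data points; the variable names are illustrative. It also applies the two contrasts from the preceding slides, C = [0 1] for the slope and C = [1 0] for the intercept:

```python
import numpy as np

# Hypothetical data: four subjects' ages and activity values.
ages = np.array([20.0, 30.0, 40.0, 50.0])
activity = np.array([1.1, 1.4, 2.1, 2.4])

# Design matrix: intercept column and age column.
X = np.column_stack([np.ones_like(ages), ages])

# Least-squares estimate of [b, m]; minimizes the sum of squared errors.
coefficients, *_ = np.linalg.lstsq(X, activity, rcond=None)

# Error term n: what the fitted line does not explain.
residuals = activity - X @ coefficients

# Contrasts pick out the parameter (or combination) being tested.
C_slope = np.array([0.0, 1.0])
C_intercept = np.array([1.0, 0.0])
gamma_slope = C_slope @ coefficients
gamma_intercept = C_intercept @ coefficients
print("slope:", gamma_slope, "intercept:", gamma_intercept)
```

The contrast values alone are only effect sizes; deciding whether they differ from 0 also requires the error variance, which is where the t-test and F-test come in.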

The General Linear Model in Matrix Form • observed = predicted + random error • Y = Xβ + ε

Summary of the GLM: Y = Xβ + ε
• Observed data, Y: Imaging uses a mass univariate approach; that is, each voxel is treated as a separate column vector of data. Y is the dependent brain value across subjects/time points at a single voxel.
• Design matrix, X: Several components which explain the observed data, i.e. the BOLD time series for the voxel. Timing info: onset vectors, O_mj, and duration vectors, D_mj. The HRF, h_m, describes the shape of the expected BOLD response over time. Other regressors, e.g. realignment parameters. At the group level, these are covariates or grouping columns (see later slide).
• Parameters, β: Define the contribution of each component of the design matrix to the value of Y. Estimated so as to minimise the error, ε, i.e. least sum of squares.
• Error, ε: Difference between the observed data, Y, and that predicted by the model, Xβ. Not assumed to be spherical in fMRI.

Brain Imaging • From the beginning (almost)…. [ 5 6 7 5 ]

Spatial Normalization, Atlas Space • [Figure: Subject 1 and Subject 2, each shown in Native Space and after normalization to MNI305 Space]

Group Analysis • Inputs per subject: Contrast Amplitudes and their Variances (Error Bars) • Contrast Amplitudes do not have to be all positive!

Mass Univariate Analyses (1) Run the GLM for each voxel. (2) Compute the statistic from the GLM for each voxel (3) Inferences
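The three steps can be sketched as follows, assuming NumPy and simulated data (12 hypothetical subjects, 5 voxels, an age covariate; all names and numbers are illustrative). The same design matrix is fitted independently at every voxel, and a t statistic for the age slope is computed per voxel:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_voxels = 12, 5

# Hypothetical group-level design: intercept and age.
ages = rng.uniform(20.0, 80.0, size=n_subjects)
X = np.column_stack([np.ones(n_subjects), ages])

# Simulated data: one column per voxel; only voxel 0 has a true age effect.
true_coefficients = np.zeros((2, n_voxels))
true_coefficients[1, 0] = 0.05
Y = X @ true_coefficients + rng.normal(0.0, 1.0, size=(n_subjects, n_voxels))

# (1) Run the GLM for each voxel (lstsq fits every column of Y separately).
coefficients, *_ = np.linalg.lstsq(X, Y, rcond=None)

# (2) Compute the statistic for each voxel: t for the age slope.
residuals = Y - X @ coefficients
dof = n_subjects - np.linalg.matrix_rank(X)
error_variance = (residuals ** 2).sum(axis=0) / dof
c = np.array([0.0, 1.0])
contrast_scale = c @ np.linalg.inv(X.T @ X) @ c
t_map = (c @ coefficients) / np.sqrt(error_variance * contrast_scale)

# (3) Inferences: compare t_map against a t distribution with `dof`
# degrees of freedom (thresholding and multiple-comparison correction
# are not shown here).
print(t_map)
```

Because each column of Y is fitted independently, fitting all voxels at once gives the same estimates as looping over voxels one at a time.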

Statistical Parametric Map (SPM) • “Massive Univariate Analysis” -- Analyze each voxel separately • Contrast Amplitude (CON, COPE, CES); color scale from -3% to +3% • Contrast Amplitude Variance (Error Bars) (VARCOPE, CESVAR) • Significance: t-Map (p, z, F), thresholded at p < .01; sig = -log10(p)

SPM/FSL/AFNI/CUSTOM • It is important to recognize that all programs that utilize the GLM will produce the same result when given the same design matrix and the same variance correction method. If your design matrices or variance correction methods differ between programs, then you will see differences. • Some slides show illustrations from FSL, others show illustrations from SPM, MATLAB, or other software. The analyses shown can be done in all of these programs.

Types Of Analysis

“Random Effects (RFx)” Analysis • Model Subjects as a Random Effect • Variance comes from a single source: variance across subjects • Each subject's effect is treated as a draw centered on the population mean, with spread given by the population (between-subject) variance • Does not take first-level noise into account (assumes 0) • “Ordinary” Least Squares (OLS) • Usually less activation than in individual subjects

“Mixed Effects (MFx)” Analysis • Down-weight each subject based on its variance • Weighted Least Squares (vs. “Ordinary” LS) • Protects against unequal variances across group or groups (“heteroskedasticity”) • May increase or decrease significance with respect to simple Random Effects • More complicated to compute • “Pseudo-MFx” – simply weight by first-level variance (easier to compute)
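To illustrate the down-weighting, here is a rough sketch of the “Pseudo-MFx” idea for a one-group mean, assuming NumPy and made-up contrast amplitudes and first-level variances (all values and names are hypothetical). Each subject is weighted by the inverse of its first-level variance. A full mixed-effects analysis would also estimate the between-subject variance, which this sketch leaves out:

```python
import numpy as np

# Hypothetical first-level results for four subjects.
contrast_amplitudes = np.array([1.0, 1.2, 0.8, 3.0])
first_level_variances = np.array([0.1, 0.1, 0.1, 2.0])  # last subject is noisy

# One-group design: a single column of ones (group mean).
X = np.ones((4, 1))

# "Ordinary" least squares: every subject counts equally.
mean_ols = np.linalg.lstsq(X, contrast_amplitudes, rcond=None)[0][0]

# Weighted least squares: weight each subject by 1 / first-level variance.
W = np.diag(1.0 / first_level_variances)
mean_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ contrast_amplitudes)[0]

print("OLS mean:", mean_ols, "WLS mean:", mean_wls)
```

In this example the noisy fourth subject pulls the OLS mean upward, while the weighted estimate stays close to the three low-variance subjects.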

“Fixed Effects (FFx)” Analysis • As if all subjects were treated as a single subject (fixed effect) • Small error bars (with respect to RFx) • Large DOF • Same mean as RFx • Huge areas of activation • Not generalizable beyond the sample
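A simplified numerical comparison of the FFx and RFx error bars, assuming NumPy and made-up values (four hypothetical subjects with equal first-level variances). In this sketch the fixed-effects standard error uses only the within-subject (first-level) variances, while the random-effects standard error uses only the spread across subjects. Real packages differ in how they pool variance and count degrees of freedom:

```python
import numpy as np

# Hypothetical per-subject contrast amplitudes and first-level variances.
contrast_amplitudes = np.array([0.5, 1.5, -0.2, 2.2])
first_level_variances = np.array([0.04, 0.04, 0.04, 0.04])
n = contrast_amplitudes.size

# Both analyses estimate the same mean.
group_mean = contrast_amplitudes.mean()

# FFx: error comes only from within-subject noise.
se_ffx = np.sqrt(first_level_variances.sum()) / n

# RFx: error comes only from variability across subjects.
se_rfx = contrast_amplitudes.std(ddof=1) / np.sqrt(n)

t_ffx = group_mean / se_ffx
t_rfx = group_mean / se_rfx
print("mean:", group_mean, "t (FFx):", t_ffx, "t (RFx):", t_rfx)
```

The mean is identical, but the fixed-effects error bar is much smaller, which is why FFx maps show larger areas of activation and why the result does not generalize beyond the sample.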

Population vs Sample • Group Population (All members): Hundreds? Thousands? Billions? • Sample: 18 Subjects • Do you want to draw inferences beyond your sample? • Does sample represent entire population? • Random Draw?

fMRI Analysis Overview • For each subject (Subject 1, Subject 2, Subject 3, Subject 4): Raw Data -> Preprocessing (MC, STC, B0, Smoothing, Normalization) -> First Level GLM Analysis (design matrix X, contrast C) • First-level contrasts from all subjects -> Higher Level GLM (design matrix X, contrast C)

Second-Level Modeling • These are all random effects (because of variance corrections and using betas from the first level) • The test statistic is essentially the mean across subjects divided by the variability across subjects (the standard error of the mean) • Low effects in subjects with very low variance between them can lead to a significant finding, even if no subject was significant at the single subject level • Implications for analysis (e.g. SLBT??)

Statistical Tests

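Implementing the T-test
• H0: $c^T\beta = 0$
• c = [+1 0 0 0 0 0 0 0]^T
• T = contrast of estimated parameters / variance estimate: $T = \frac{c^T\hat{\beta}}{\sqrt{\mathrm{Var}\; c^T (X^T X)^{-1} c}}$
• Variance Estimate: Sqrt(Var*c^T(X^T X)^{-1}c), where Var is the estimated error (residual) variance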
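A minimal sketch of the t-test for ordinary least squares, assuming NumPy; the function name `t_contrast` and the example data are illustrative, and no variance (non-sphericity) correction is applied:

```python
import numpy as np

def t_contrast(X, y, c):
    """Return (t, dof) for H0: c^T beta = 0 under ordinary least squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    dof = X.shape[0] - np.linalg.matrix_rank(X)
    # Estimated error (residual) variance.
    error_variance = residuals @ residuals / dof
    # Variance estimate of the contrast: Var * c^T (X^T X)^-1 c.
    standard_error = np.sqrt(error_variance * (c @ np.linalg.pinv(X.T @ X) @ c))
    return (c @ beta) / standard_error, dof

# Hypothetical example: is activity related to age?
ages = np.array([20.0, 30.0, 40.0, 50.0, 60.0, 70.0])
activity = np.array([1.0, 1.3, 1.9, 2.2, 2.4, 3.1])
X = np.column_stack([np.ones_like(ages), ages])

t_slope, dof = t_contrast(X, activity, np.array([0.0, 1.0]))
print("t =", t_slope, "with", dof, "degrees of freedom")
```

With a design matrix that is a single column of ones and c = [1], this reduces to the familiar one-sample t-test (mean divided by its standard error).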

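Implementing the F-test
• H0: $c^T\beta = 0$
• $c^T = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$
• F = additional variance accounted for by effects of interest / error variance estimate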
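A sketch of the F-test for a multi-row contrast under ordinary least squares, assuming NumPy and simulated data; the function name `f_contrast` is illustrative, and the contrast matrix is passed with one row per effect being tested. For a contrast that selects whole columns, this value matches the ratio on the slide: extra variance explained by the selected columns over the error variance.

```python
import numpy as np

def f_contrast(X, y, C):
    """Return (F, q, dof_error) for H0: C beta = 0 (one row per tested effect)."""
    C = np.atleast_2d(C)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    dof_error = X.shape[0] - np.linalg.matrix_rank(X)
    error_variance = residuals @ residuals / dof_error
    gamma = C @ beta
    # Unscaled covariance of the contrast estimates: C (X^T X)^-1 C^T.
    contrast_cov = C @ np.linalg.pinv(X.T @ X) @ C.T
    q = np.linalg.matrix_rank(C)
    F = gamma @ np.linalg.solve(contrast_cov, gamma) / (q * error_variance)
    return F, q, dof_error

# Simulated example: intercept plus two regressors of interest.
rng = np.random.default_rng(1)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

# Test both regressors jointly: do they explain anything beyond the intercept?
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
F, q, dof_error = f_contrast(X, y, C)
print("F =", F, "with", q, "and", dof_error, "degrees of freedom")
```

Passing a single-row contrast to the same function gives an F value equal to the square of the corresponding t statistic.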

Contrasts and the Full Model

T/r/F Notes • If F is a single row contrast, then F=T^2 • An F-test has no direction • In many programs, T-tests are one-tailed and thus have a p-value half that of the equivalent F-test (when the effect is in the tested direction) • There are formulas to convert between T/r and other statistics (e.g. Cohen’s d) • To avoid double-dipping: when you extract an ROI to plot the correlation and get the correlation value, DO NOT make inferences from the plots; make them from the voxel-wise analysis.

Contrasts • Identify the Null Hypothesis • H0: A=B • Make the Null Hypothesis equal 0 • H0: A-B=0 • Identify the columns for A and B, apply their weights • H0: 1*A+(-1)*B=0 • Contrast: [1 -1]

Contrasts • What if A and B are not individual columns, as in the case of A1,A2,B1,B2… • [1 1 -1 -1] would work, but will overestimate the magnitude of the effect (it tests the sum, not the average) • A is the average of A1 and A2, or H0: (A1+A2)/2=0 • [½ ½ 0 0] • B is the average of B1 and B2, or H0: (B1+B2)/2=0 • [0 0 ½ ½] • A-B: H0: (A1+A2)/2-(B1+B2)/2=0 • [½ ½ -½ -½]
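A small numerical check of the scaling point above, assuming NumPy, made-up parameter estimates for the columns A1, A2, B1, B2, and a made-up parameter covariance (all values are hypothetical). The summed contrast doubles the effect magnitude relative to the averaged contrast, but the standard error doubles as well, so the t statistic is unchanged:

```python
import numpy as np

# Hypothetical parameter estimates for columns A1, A2, B1, B2.
estimates = np.array([2.0, 4.0, 1.0, 1.0])
# Hypothetical covariance of those estimates (independent, equal variance).
covariance = 0.25 * np.eye(4)

contrast_sum = np.array([1.0, 1.0, -1.0, -1.0])
contrast_avg = np.array([0.5, 0.5, -0.5, -0.5])

# Effect magnitude: sum of A minus sum of B vs. mean of A minus mean of B.
effect_sum = contrast_sum @ estimates
effect_avg = contrast_avg @ estimates

# The standard error scales with the contrast weights too.
se_sum = np.sqrt(contrast_sum @ covariance @ contrast_sum)
se_avg = np.sqrt(contrast_avg @ covariance @ contrast_avg)

print("effect:", effect_sum, "vs", effect_avg)
print("t:", effect_sum / se_sum, "vs", effect_avg / se_avg)
```

So the choice of weights matters when the contrast amplitude itself is reported or carried to a higher-level analysis, even though the significance at this level is the same.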
