Temporal Basis Functions & Correlated Regressors Gary Price & Patti Adank fMRI for Dummies 29-03-06
Correlated Regressors (or: the trouble with multicollinearity…)
by (a slightly puzzled) Patti Adank
Sources:
• Will Penny
• Rik Henson’s slides: www.mrc-cbu.cam.ac.uk/Imaging/Common/rikSPM-GLM.ppt
• previous years’ presenters’ slides
Correlations between regressors
• in multiple regression analysis:
  - problems for behavioural data
  - behavioural example (fictional)
  - solutions
• in the General Linear Model:
  - problems for neuroimaging data
  - PET example
  - solutions?
Multiple Regression Analysis & Correlated Regressors
Multiple regression analysis
• Multiple regression characterises the relationship between several independent variables (or regressors), X1, X2, X3 etc., and a single dependent variable, Y:
  Y = β1X1 + β2X2 + … + βLXL + ε
• The X variables are combined linearly and each has its own regression coefficient β (weight)
• βs reflect the independent contribution of each regressor, X, to the value of the dependent variable, Y
• i.e. the proportion of the variance in Y accounted for by each regressor after all other regressors are accounted for
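A minimal sketch of how the βs in the model above can be estimated by least squares. The data are made up and noise-free (generated from Y = 2X1 + 3X2 + 1), and the helper names `solve` and `fit_betas` are hypothetical; a statistics package would normally do this step for you.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_betas(regressors, y):
    """Least-squares weights for y = sum_k beta_k * regressors[k] (normal equations)."""
    k = len(regressors)
    xtx = [[sum(a * b for a, b in zip(regressors[i], regressors[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * b for a, b in zip(regressors[i], y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up, noise-free data generated from Y = 2*X1 + 3*X2 + 1
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
const = [1.0] * 5                      # constant regressor (intercept)
y = [2 * a + 3 * b + 1 for a, b in zip(x1, x2)]

betas = fit_betas([x1, x2, const], y)  # one weight per regressor, in the same order
print([round(b, 6) for b in betas])
```

Because the toy data contain no noise, the fitted weights should match the generating values. The normal equations become ill-conditioned when regressors are strongly correlated, which is the numerical face of the multicollinearity problem discussed next.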
Multiple regression analysis
• Fit a straight line through the points for Y and X
• some statistics: if the model fits the data well:
  - R2 is high (reflects the proportion of variance in Y explained by the regressor X)
  - the corresponding p value will be low
Multiple regression analysis: multicollinearity
• multiple regression results are sometimes difficult to interpret:
  - the overall p value of a fitted model is very low,
  - but individual p values for the regressors are high
• this means that the model fits the data well, even though none of the X variables appears, on its own, to have a significant impact on predicting Y
• How is this possible?
• it happens when two (or more) regressors are highly correlated: a problem known as multicollinearity
Regression analysis: multicollinearity example
• When is multicollinearity between regressors a problem?
  - no: when you just want to predict Y from X1 and X2, the values of R2 and p will be correct
  - yes: when you want to assess how individual regressors impact the dependent variable:
    - individual p values can be misleading: a p value can be high even though the variable is important;
    - the confidence intervals on the regression coefficients are very wide and may include zero: you cannot be confident whether an increase in the X value is associated with an increase, or a decrease, in Y.
Regression analysis: multicollinearity example
• Measures for multicollinearity:
  - In general: if r > 0.8 between regressors, it can be expected that they show multicollinearity
  - In SPSS:
    - Tolerance: the proportion of a regressor’s variance not accounted for by other regressors in the model; low tolerance values are an indicator of multicollinearity
    - Variance Inflation Factor (VIF): the reciprocal of the tolerance; large VIF values are an indicator of multicollinearity
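A short sketch of the two measures for the simplest case of two regressors, where the variance one regressor shares with the other is just r squared. The function names are hypothetical, the value 0.945 is the loudness/frequency correlation quoted in the example that follows, and the raw values at the bottom are made up.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def tolerance(r):
    """With two regressors, the R2 of one regressor on the other is r squared."""
    return 1.0 - r ** 2


def vif(r):
    """Variance Inflation Factor: reciprocal of the tolerance."""
    return 1.0 / tolerance(r)


r_deck = 0.945                      # correlation quoted in the loudness/frequency example
print(tolerance(r_deck), vif(r_deck))

# Made-up regressor values, only to show the computation from raw data
loudness = [1.0, 2.0, 3.0, 4.0]
frequency = [1.0, 3.0, 2.0, 4.0]
r_toy = pearson_r(loudness, frequency)
print(r_toy, tolerance(r_toy), vif(r_toy))
```

With more than two regressors, r squared is replaced by the R2 obtained when regressing one regressor on all the others; the two-regressor shortcut above does not cover that case.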
Regression analysis: multicollinearity example
• Example:
  - Question: how can the perceived clarity of an auditory stimulus be predicted from the loudness and frequency of that stimulus?
  - perception experiment in which subjects had to judge the clarity of an auditory stimulus
  - model to be fit: Y = β1X1 + β2X2 + ε
    Y = judged clarity of stimulus
    X1 = loudness
    X2 = frequency
Regression analysis: multicollinearity example
• What happens when X1 (loudness) and X2 (frequency) are collinear, i.e., strongly correlated?
• Correlation between loudness and frequency: 0.945 (p < 0.001)
• high loudness values correspond to high frequency values
Regression analysis: multicollinearity example
• Contribution of individual predictors:
  - X1 (loudness) entered as sole predictor:
    Y = 0.859X1 + 24.41
    R2 = 0.74 (74% explained variance in Y), p < 0.001
  - X2 (frequency) entered as sole predictor:
    Y = 0.824X2 + 26.94
    R2 = 0.68 (68% explained variance in Y), p < 0.001
Regression analysis: multicollinearity example
• Collinear regressors X1 and X2 entered together:
  - Resulting model: Y = 0.756X1 + 26.94 (X2?)
    R2 = 0.74 (74% explained variance in Y), p < 0.001
  - Individual regressors:
    X1 (loudness): R2 = , p < 0.001
    X2 (frequency): R2 = 0.555, p = 0.594
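The pattern in this example can be reproduced with simulated data. The sketch below is not the data behind the slides: the sample size, coefficients and noise levels are assumptions chosen so that loudness and frequency correlate at roughly 0.95, and all function names are hypothetical. It compares the slope and standard error of loudness when entered alone and when entered together with frequency.

```python
import math
import random


def centred(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def fit_single(x, y):
    """Slope, its standard error and R2 for y = b*x + intercept."""
    xc, yc = centred(x), centred(y)
    b = dot(xc, yc) / dot(xc, xc)
    rss = dot(yc, yc) - b * dot(xc, yc)          # residual sum of squares
    se = math.sqrt(rss / (len(y) - 2) / dot(xc, xc))
    return b, se, 1.0 - rss / dot(yc, yc)


def fit_pair(x1, x2, y):
    """Slopes with standard errors, and R2, for y = b1*x1 + b2*x2 + intercept."""
    a, c, yc = centred(x1), centred(x2), centred(y)
    s11, s22, s12 = dot(a, a), dot(c, c), dot(a, c)
    s1y, s2y = dot(a, yc), dot(c, yc)
    det = s11 * s22 - s12 ** 2                   # shrinks towards 0 as r approaches 1
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    rss = dot(yc, yc) - b1 * s1y - b2 * s2y
    sigma2 = rss / (len(y) - 3)
    return ((b1, math.sqrt(sigma2 * s22 / det)),
            (b2, math.sqrt(sigma2 * s11 / det)),
            1.0 - rss / dot(yc, yc))


rng = random.Random(0)                           # fixed seed, simulated data only
n = 40
loudness = [rng.gauss(50, 10) for _ in range(n)]
frequency = [l + rng.gauss(0, 3) for l in loudness]   # nearly a copy of loudness
clarity = [0.4 * l + 0.4 * f + rng.gauss(0, 8) for l, f in zip(loudness, frequency)]

lc, fc = centred(loudness), centred(frequency)
r = dot(lc, fc) / math.sqrt(dot(lc, lc) * dot(fc, fc))

b_alone, se_alone, r2_alone = fit_single(loudness, clarity)
(b1, se1), (b2, se2), r2_both = fit_pair(loudness, frequency, clarity)

print("r(loudness, frequency):", round(r, 3))
print("loudness alone: t =", round(b_alone / se_alone, 2), "R2 =", round(r2_alone, 2))
print("both entered:   t1 =", round(b1 / se1, 2), "t2 =", round(b2 / se2, 2),
      "R2 =", round(r2_both, 2))
```

The quantity to watch is the standard error of the loudness slope: adding a regressor that is nearly a copy of loudness barely changes R2 but inflates the standard error, so the individual t value drops.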
Regression analysis: removing multicollinearity
• How to deal with collinearity:
  1. Increase the sample size (no data like more data)
  2. Orthogonalise the correlated regressor variables
     - using factor analysis
     - this will produce linearly independent regressors and corresponding factor scores
     - these factor scores can subsequently be used instead of the original correlated regressor values
General Linear Model & Correlated Regressors
General Linear Model
• the General Linear Model can be seen as an extension of multiple regression (or multiple regression is just a simple form of the General Linear Model):
  - Multiple Regression only looks at ONE Y variable
  - GLM allows you to analyse several Y variables in a linear combination (e.g. the time series in a voxel)
• ANOVA, t-test, F-test, etc. are also forms of the GLM
General Linear Model and fMRI
Y = Xβ + ε
• Observed data, Y: the BOLD signal at various time points at a single voxel
• Design matrix, X: several components which explain the observed data, i.e. the BOLD time series for the voxel
  - Timing info: onset vectors, Omj, and duration vectors, Dmj
  - HRF, hm, describes the shape of the expected BOLD response over time
  - Other regressors, e.g. realignment parameters
  - Experimental manipulations
• Parameters, β: define the contribution of each component of the design matrix to the value of Y
• Error/residual, ε: difference between the observed data, Y, and that predicted by the model, Xβ
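A rough sketch of how one column of the design matrix can be built from onsets, durations and an HRF. Everything numeric here is an assumption for illustration: the TR, the number of scans, the onsets, and in particular the HRF, which is a single gamma-shaped curve peaking at about 5 s rather than the canonical HRF used by SPM.

```python
import math

TR = 2.0          # assumed repetition time in seconds
N_SCANS = 60      # assumed number of volumes


def hrf(t):
    """Simplified single-gamma response (shape 6, scale 1); not SPM's canonical HRF."""
    return t ** 5 * math.exp(-t) / math.factorial(5)


def boxcar(onsets, durations):
    """1 while a stimulus is on, 0 otherwise, sampled once per scan."""
    box = [0.0] * N_SCANS
    for onset, dur in zip(onsets, durations):
        for i in range(N_SCANS):
            if onset <= i * TR < onset + dur:
                box[i] = 1.0
    return box


def convolve(signal, kernel):
    """Discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        out[i] = sum(signal[i - j] * kernel[j] for j in range(min(i + 1, len(kernel))))
    return out


kernel = [hrf(k * TR) * TR for k in range(16)]    # HRF sampled over 32 s
onsets = [10.0, 50.0, 90.0]                       # hypothetical onset vector (seconds)
durations = [10.0, 10.0, 10.0]                    # hypothetical duration vector (seconds)

regressor = convolve(boxcar(onsets, durations), kernel)

# Design matrix X: one row per scan, columns = [task regressor, constant]
X = [[regressor[i], 1.0] for i in range(N_SCANS)]
print(len(X), len(X[0]))
```

Further columns (a second condition, realignment parameters) would be appended in the same way, and it is the correlation between those columns that the following slides are concerned with.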
fMRI: constructing the design matrix
• In analysing fMRI data, the problem of multicollinearity occurs when specifying regressors in the design matrix
• If the regressors are correlated (in the extreme case, linearly dependent), then the results of the GLM are not easy to interpret
  - because variance attributable to an individual regressor may be confounded with other regressor(s)
  - this may lead to misinterpretations of activations in certain brain areas
fMRI: an example
• for example:
  - suppose that a response to a stimulus, Sr, is highly correlated with the associated motor response, Mr;
  - and suppose it is hypothesised that a specific region’s activity for Sr is not influenced by Mr;
  - then this region would be tested only after removing from the regressor for Mr all variance that can be explained by Sr;
  - dangerous: if the motor response does in fact influence the signal in the region, the test for Sr will be overly significant!
    -> variance is wrongly assigned to Sr
fMRI: PET example
• Andrade et al. (1999) Ambiguous Results in Functional Neuroimaging Data Analysis Due to Covariate Correlation, NeuroImage 10, 483-486
• Andrade et al. show how correlated regressors can lead to misinterpretations
• they collected PET data from a single subject and generated a covariate (regressor) variable that correlated strongly with the activation conditions used in the experiment (0 for rest, 1-6 increasing linearly with activation levels in the experiment)
fMRI: PET example
• two purposes:
  1. detect areas where the signal correlated with the generated covariate
  2. search for differences in activation versus control periods
• Implies fitting two models:
  - One with activation-vs-rest plus covariate regressors (r = 0.845): M = C1 (ac-rest) + C2 (covariate)
  - One with variance from covariate C2 removed: M* = C1 + C2* (C2* = 0.845•√(SSC2/SSC1))
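One possible reading of the expression for C2* is that r•√(SSC2/SSC1) is the weight with which C1 is subtracted from C2 to make the two regressors uncorrelated, since for mean-centred regressors that product equals the least-squares slope of C2 on C1. The sketch below checks this identity on a hypothetical 12-scan layout (rest and activation alternating, covariate 0 at rest and 1-6 during activation). The layout is an assumption, gives r of about 0.82, and does not reproduce the 0.845 reported for the real design.

```python
import math

# Hypothetical scan order: rest and activation alternate; the covariate is
# 0 at rest and climbs from 1 to 6 over the activation scans.
c1 = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
c2 = [0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6]


def centred(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


c1c, c2c = centred(c1), centred(c2)
ss_c1, ss_c2 = dot(c1c, c1c), dot(c2c, c2c)       # sums of squares of C1 and C2
r = dot(c1c, c2c) / math.sqrt(ss_c1 * ss_c2)

slope = dot(c1c, c2c) / ss_c1                     # least-squares slope of C2 on C1
slope_from_r = r * math.sqrt(ss_c2 / ss_c1)       # same quantity written with r

# C2 with everything it shares with C1 removed
c2_star = [b - slope * a for a, b in zip(c1c, c2c)]

print(round(r, 3), slope, slope_from_r, round(dot(c1c, c2_star), 12))
```

After this step C2* is uncorrelated with C1, so in a model such as M* all variance the two regressors originally shared is credited to C1.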
fMRI: PET example
• For models M and M*:
  - SPM processing
  - parameters for C1 and C2/C2* were tested using t-tests and transformed into z-scores
• Results:
  - differences between M and M* occurred only for activation related to C1 (the rest/activation regressor)
  - e.g., parahippocampal activation significant in M but not in M*
  - left precuneal, superior temporal, medial frontal activity significant in M* but not in M
fMRI: PET example
• Example voxels:
  - (54, -56, 34): activated in M (p = 0.004), not in M* (p = 0.901)
  - (6, 28, -28): activated in M* (p = 0.014), not in M (p = 0.337)
fMRI: dealing with multicollinearity
• Andrade et al. suggest a technique using the F-statistic to orthogonalise correlated regressors without having to re-estimate the β parameters (which can be very time-consuming), using principles from linear model theory (Christensen, 1996)
• Another technique used to remove correlations from regressors: Gram-Schmidt orthogonalisation (cf. Rik Henson’s slides)
Reference: Christensen, 1996, Plane Answers to Complex Questions: The Theory of Linear Models, Springer-Verlag, Berlin
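A generic sketch of Gram-Schmidt orthogonalisation applied to the columns of a design matrix. This is the textbook procedure, not necessarily the exact variant used in SPM or in the slides referred to above; the function names and the three toy regressors are made up, and the columns are assumed to be linearly independent.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def gram_schmidt(columns):
    """Orthogonalise each column with respect to all earlier columns.

    Columns are not rescaled, so the first column is returned unchanged.
    Assumes the columns are linearly independent (no zero vector results).
    """
    result = []
    for col in columns:
        v = [float(a) for a in col]
        for q in result:
            weight = dot(v, q) / dot(q, q)        # projection of v onto q
            v = [a - weight * b for a, b in zip(v, q)]
        result.append(v)
    return result


constant = [1, 1, 1, 1]
linear = [1, 2, 3, 4]
quadratic = [1, 4, 9, 16]      # strongly correlated with `linear`

orth = gram_schmidt([constant, linear, quadratic])
for column in orth:
    print([round(a, 6) for a in column])
```

The order of the columns matters: whichever regressor comes first keeps all the variance it shares with later ones, which is exactly the kind of choice the stimulus/motor-response example warns about.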
Dealing with multicollinearity in SPM
• Use toolbox “Design Magic”: multicollinearity assessment for fMRI for SPM99 (SPM5?)
  - Author: Matthijs Vink
  - URL: http://www.matthijs-vink.com/tools.html
• Allows you to assess the multicollinearity in your fMRI design by calculating the amount of factor variance that is also accounted for by the other factors in the design (expressed in R2)
• also allows you to reduce correlations between regressors through use of high-pass filters
Conclusion
• When fitting a model in multiple regression analysis or constructing your design matrix, correlations between regressors can lead to misinterpretations of the influence of the independent variables on the dependent variable
• Multicollinearity is a hassle, but can be dealt with, usually through orthogonalisation procedures involving (groups of) regressors
Assessing multicollinearity in SPM
• The end