SGPP: Spatial Gaussian Predictive Process Models for Neuroimaging Data Yimei Li Department of Biostatistics St. Jude Children’s Research Hospital Joint work with Dr. JungWon Hyun and Dr. Hongtu Zhu
Outline • Motivation • Spatial Gaussian Predictive Process Models for Neuroimaging data • Simulation Studies • Real Data Analysis • Discussion
Motivation • Images with missing values → images with predicted values • Baseline images and covariates (gender, age, treatment group, etc.) → future images
Motivation • Voxel-wise analysis: the preprocessed data at a single voxel are regressed on the design matrix to estimate β at that voxel • The power is not optimal (Li et al., 2011; Polzehl et al., 2010) • The voxel-wise analysis is also not optimal in prediction, since it does not account for the spatial dependence of imaging data.
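The voxel-wise analysis described on this slide can be sketched as an ordinary least-squares fit repeated at every voxel. This is a minimal illustration, not the authors' code; the function name, array shapes, and the toy data are assumptions.

```python
import numpy as np

def voxelwise_ols(X, Y):
    """Fit y(d) = X @ beta(d) + error separately at every voxel d.

    X: (n_subjects, p) design matrix shared by all voxels.
    Y: (n_subjects, n_voxels) preprocessed imaging measures.
    Returns beta_hat with shape (p, n_voxels).
    """
    # lstsq solves all voxel-wise regressions at once, one column per voxel.
    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return beta_hat

# Toy data (hypothetical): intercept plus one covariate, 900 voxels.
rng = np.random.default_rng(0)
n, n_vox = 50, 900
X = np.column_stack([np.ones(n), rng.uniform(1, 2, size=n)])
beta_true = rng.normal(size=(2, n_vox))
Y = X @ beta_true + 0.1 * rng.normal(size=(n, n_vox))
beta_hat = voxelwise_ols(X, Y)
```

Each column of `beta_hat` is estimated without reference to neighbouring voxels, which is the limitation the slide points out.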
Spatial Gaussian Predictive Process Model • The first aim is to develop SGPP to delineate the association between high-dimensional imaging data and a set of covariates of interest, such as age, while accurately approximating the spatial dependence of imaging data. • The second aim is to develop a simultaneous estimation and prediction framework for the analysis of neuroimaging data.
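Models • [Model equation: image not recovered] • Response: neuroimaging measures at voxel d_m • Covariates: covariates of interest • Medium-to-long-range dependence: individual imaging variation and the medium-to-long-range spatial dependence • Short-range dependence: spatially correlated errors that capture the short-range, local dependence • Each subject's medium-to-long-range term and short-range term is a copy of a common random process [symbols not recovered]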
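The model equation on this slide did not survive extraction. A form consistent with the listed components is sketched below. The symbols (y, x, β, η, ε, the index ranges, and the covariance names) are assumed notation rather than text recovered from the slide, and treating both random terms as zero-mean Gaussian processes is an assumption suggested by the model's name.

```latex
% Assumed notation: i indexes subjects, j indexes measures, d indexes voxels.
\begin{aligned}
y_{i,j}(d) &= x_i^{\top}\beta_j(d) + \eta_{i,j}(d) + \varepsilon_{i,j}(d), \\
\eta_i(\cdot) &\sim \mathrm{GP}(0, \Sigma_\eta)
  && \text{(medium-to-long-range dependence)}, \\
\varepsilon_i(\cdot) &\sim \mathrm{GP}(0, \Sigma_\varepsilon)
  && \text{(short-range dependence)}.
\end{aligned}
```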
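Medium-to-Long-Range Dependence: FPCA • Spectral decomposition [equation not recovered] • The medium-to-long-range term therefore admits the Karhunen-Loève expansion and is approximated by a truncated version of it [equations not recovered] • The expansion coefficient is referred to as the (j,l)th functional principal component score of the ith subject • The integrals are taken with respect to the Lebesgue measure • The scores are uncorrelated random variables with [moments not recovered]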
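On a discrete voxel grid, a truncated Karhunen-Loève expansion can be estimated with a singular value decomposition of the centred residual images. The sketch below is a generic FPCA illustration under that assumption, not the SGPP estimation procedure itself; the function name, the toy eigenfunctions, and the omission of any voxel-volume scaling are all assumptions.

```python
import numpy as np

def fpca(residuals, n_components):
    """Truncated Karhunen-Loeve expansion of residual images.

    residuals: (n_subjects, n_voxels) images after removing covariate effects.
    Returns eigenvalues (L,), eigenfunctions (L, n_voxels), scores (n_subjects, L).
    """
    centred = residuals - residuals.mean(axis=0)
    # The right singular vectors are eigenvectors of the sample covariance,
    # obtained without forming the n_voxels x n_voxels matrix.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    eigvals = s[:n_components] ** 2 / centred.shape[0]
    eigfuns = vt[:n_components]
    # Scores are projections of each subject's image onto the eigenfunctions.
    scores = centred @ eigfuns.T
    return eigvals, eigfuns, scores

# Toy data (hypothetical): two smooth components with different variances.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 900)
psi = np.vstack([np.sin(2 * np.pi * grid), np.cos(2 * np.pi * grid)])
xi = rng.normal(size=(50, 2)) * np.array([3.0, 1.0])
residuals = xi @ psi
eigvals, eigfuns, scores = fpca(residuals, n_components=2)
```

By construction the estimated eigenfunctions are orthonormal and the score columns are uncorrelated across subjects, mirroring the properties listed on the slide.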
Short-Range Dependence: Multivariate Simultaneous Autoregressive Model • [Model equation not recovered] • Autocorrelation parameter: controls the strength of the local spatial dependence; it is the same across the brain and takes a value between 0 and 1 • |N(d)| denotes the cardinality of the neighborhood N(d) • The innovation terms are independent and identical copies of a common random vector [distribution not recovered] • A vector of unknown parameters • Variance structure for the short-range dependence
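To make the role of the autocorrelation parameter and of |N(d)| concrete, here is a sketch of a scalar simultaneous autoregressive field on a 2-D grid, in which each value equals the parameter times the average of its neighbours plus independent noise. It is a univariate simplification of the multivariate model on the slide; the 4-neighbour definition of N(d), the names `rho` and `sigma`, and the Gaussian noise are assumptions.

```python
import numpy as np

def grid_neighbour_weights(n_rows, n_cols):
    """Row-normalised weights: W[d, d2] = 1/|N(d)| if d2 is a 4-neighbour of d."""
    n = n_rows * n_cols
    W = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            d = r * n_cols + c
            nbrs = [(r + dr, c + dc)
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols]
            for rr, cc in nbrs:
                W[d, rr * n_cols + cc] = 1.0 / len(nbrs)
    return W

def simulate_sar(W, rho, sigma, rng):
    """Draw eps solving eps = rho * W @ eps + e, with e ~ N(0, sigma^2 I)."""
    n = W.shape[0]
    e = sigma * rng.normal(size=n)
    eps = np.linalg.solve(np.eye(n) - rho * W, e)
    return eps, e

rng = np.random.default_rng(1)
W = grid_neighbour_weights(10, 10)
eps, e = simulate_sar(W, rho=0.6, sigma=1.0, rng=rng)
```

Because the rows of `W` sum to one, the system is solvable for any `rho` strictly between 0 and 1, and larger values of `rho` give stronger local dependence.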
Variance and Model Approximation • Variance [equation not recovered] • Model approximation [equation not recovered]
Path Diagram for SGPP Prediction Process • [Diagram boxes: Training Data Set; Estimate SGPP model parameters; Testing Data Set]
Model Validation • We evaluate the prediction accuracy of the proposed model by quantifying the prediction error at all voxels with missing data. Specifically, the root mean squared prediction error (rtMSPE) for each j is given by: [equation not recovered], where the sum runs over the set of all subjects in the test set • Models compared: GLM+SAR, GLM+FPCA+SAR, VWLM, GLM+FPCA
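The rtMSPE formula itself was an image and is not reproduced here. As a hedged sketch, the function below computes a root mean squared prediction error for one coordinate j by pooling squared errors over all test subjects and all voxels flagged as missing; that pooling, the function name, and the argument layout are assumptions and may differ from the normalisation used in the talk.

```python
import numpy as np

def rt_mspe(y_true, y_pred, missing_mask):
    """Root mean squared prediction error over held-out voxels only.

    y_true, y_pred: (n_test_subjects, n_voxels) arrays for one coordinate j.
    missing_mask: boolean array of the same shape, True where data were missing.
    """
    sq_err = (y_true - y_pred) ** 2
    # Only voxels that were missing (and hence predicted) contribute.
    return float(np.sqrt(sq_err[missing_mask].mean()))

# Tiny hypothetical example: two test subjects, four voxels.
y_true = np.zeros((2, 4))
y_pred = np.array([[3.0, 0.0, 4.0, 9.0],
                   [0.0, 3.0, 9.0, 4.0]])
mask = np.array([[True, False, True, False],
                 [False, True, False, True]])
example = rt_mspe(y_true, y_pred, mask)
```

Errors at observed voxels (the unmasked 9.0 entries above) are ignored, so the measure reflects prediction quality only where data were actually withheld.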
Simulation Studies I • We simulated 900 pixels on a 30 × 30 phantom for 50 subjects. At a given pixel, the data were generated from a bivariate Gaussian process model: [equation not recovered] • [symbol not recovered] is generated from Uniform(1, 2) • [symbols not recovered] are independently generated according to: [distributions not recovered] • The regression coefficients and eigenfunctions are set as follows: [figure not recovered]
Simulation Results • (1) β estimates [figure panels: True β; Estimated β] • (2) The first 10 relative eigenvalues of the simulated data [figure]
Simulation Results • (3) Estimation of eigenfunctions [figure panels: True eigenfunctions; Estimated eigenfunctions] • (4) Estimation by REML [results not recovered]
Simulation Results • (5) Prediction • Test set (TE): 15 randomly selected subjects; training set (TR): the other 35 subjects • For each subject in the test set, we considered the imaging data with 10%, 30%, and 50% missingness, respectively. The missing pixels were randomly sampled according to the missingness rate. • We fit the SGPP model to the training set and estimated all the components of the model. • We predicted the missing data in the test set and obtained the rtMSPE. • We compared the rtMSPE of the proposed model with those of the VWLM, GLM+FPCA, and GLM+SAR models [table not recovered]
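The data-splitting part of this procedure (15 test subjects, 35 training subjects, and a random set of missing pixels per test subject) can be sketched as follows. The function name, the per-subject independent sampling of missing pixels, and the seed are assumptions; model fitting and prediction are deliberately left out.

```python
import numpy as np

def split_and_mask(n_subjects, n_pixels, n_test, missing_rate, rng):
    """Split subjects into train/test and draw a missing-pixel mask per test subject."""
    perm = rng.permutation(n_subjects)
    test_idx = np.sort(perm[:n_test])
    train_idx = np.sort(perm[n_test:])
    n_missing = int(round(missing_rate * n_pixels))
    mask = np.zeros((n_test, n_pixels), dtype=bool)
    for row in range(n_test):
        # Each test subject gets its own random set of missing pixels.
        missing = rng.choice(n_pixels, size=n_missing, replace=False)
        mask[row, missing] = True
    return train_idx, test_idx, mask

rng = np.random.default_rng(2)
train_idx, test_idx, mask = split_and_mask(
    n_subjects=50, n_pixels=900, n_test=15, missing_rate=0.3, rng=rng
)
```

Calling the function with `missing_rate` set to 0.1, 0.3, and 0.5 in turn reproduces the three missingness levels considered on the slide.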
Simulation II: Non-Gaussian Random Field • The setup is similar to that of the previous simulation, except that a class of non-Gaussian random fields was generated by squaring Gaussian random fields whose correlations are the square roots of the desired correlations. • We examine the rtMSPE for this setup below: [table not recovered]
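The square-root rule on this slide follows from a standard fact: for two standard normal variables with correlation r, their squares have correlation r². The sketch below checks this numerically for a single pair of locations; the zero-mean, unit-variance assumption, the target value 0.49, and the function name are illustrative choices, not details from the talk.

```python
import numpy as np

def squared_gaussian_pair(rho_target, n_draws, rng):
    """Draw pairs of squared standard normals whose correlation is rho_target."""
    r = np.sqrt(rho_target)  # correlation of the underlying Gaussian pair
    cov = np.array([[1.0, r], [r, 1.0]])
    z = rng.multivariate_normal(np.zeros(2), cov, size=n_draws)
    return z ** 2  # squaring makes the marginals non-Gaussian (chi-square)

rng = np.random.default_rng(3)
w = squared_gaussian_pair(rho_target=0.49, n_draws=200_000, rng=rng)
empirical = np.corrcoef(w[:, 0], w[:, 1])[0, 1]
```

With a Gaussian correlation of 0.7, the empirical correlation of the squared values should be close to the target 0.49, up to Monte Carlo error.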
Real Data Analysis • Surface data set of 43 infants at close to 1 year of age. • The responses were based on the SPHARM-PDM representation of left lateral ventricle surfaces. • The left lateral ventricle surface of each infant is represented by 1002 location vectors, each consisting of the spatial x, y, and z coordinates of the corresponding vertex on the SPHARM-PDM surface. • Gender and gestational age are the covariates considered in the model: [equation not recovered] • i: ith subject • j: jth coordinate • β: coefficient estimates for the jth coordinate, including the intercept, gender, and gestational age
Real Data Analysis • (1) β estimates for the (intercept, gender, gestational age) effects [figure panels: Intercept, Gender, Gestational age; coordinates X, Y, Z]
Real Data Analysis (2) Eigenvalues
Real Data Analysis • (3) Eigenfunctions [figure panels: Component 1, Component 2, Component 3; coordinates X, Y, Z]
Real Data Analysis • (4) -log p-value maps for β [figure panels: Gender, Gestational age; raw and corrected p-values; coordinates X, Y, Z]
Prediction • We randomly selected 13 infants as the test set to estimate the prediction error.
Discussion • SGPP and our prediction method can be used to directly solve missing-data problems in neuroimaging studies • The proposed model can be extended to neuroimaging data obtained from clustered studies • For instance, SGPP can be extended to predict follow-up structural alterations and neural activity based on an individual's baseline image and covariate information • It can also be extended to the prediction of disease diagnosis and to prevention