Meta-analysis of neuroimaging data What, Why, and How Tor D. Wager Columbia University
Uses of meta-analysis in neuroimaging • Meta-analysis is an essential tool for summarizing the vast and growing neuroimaging literature Wager, Lindquist, & Hernandez, in press
Uses of meta-analysis in neuroimaging • Assess consistency of activation across laboratories and task variants • Compare across many types of tasks and evaluate the specificity of activated regions for particular psychological conditions • Identify and define boundaries of functional regions • Co-activation: Develop models of functional systems and pathways Wager, Lindquist, & Kaplan, 2007
Functional networks in meta-analysis • Use regions or distributed networks in a priori tests in future studies
Locating emotion-responsive regions 164 PET/fMRI studies, 437 activation maps, 2478 coordinates
Why identify consistent areas? • Making statistic maps in neuroimaging studies involves many tests (~100,000 per brain map) • Many studies use uncorrected or improperly corrected p-values [Figure: counts of long-term memory study maps by p-value threshold used, uncorrected vs. corrected] • How many false positives? A rough estimate: 663 peaks, 17% of reported activations Wager, Lindquist, & Kaplan, 2007
Consistently activated regions [Figure: reported peaks vs. consistency map; emotion meta-analysis, 163 studies]
[Fig 4: MKDA results on ventral, lateral (R), and medial (L) surfaces; labeled regions include rdACC, pgACC, sgACC, vmPFC, dmPFC, pre-SMA, PCC, pOFC, latOFC, aINS, gyrus rectus, BF, OCC, thalamic CM/MD, amygdala and deep nuclei, mTC, sTC, lFG] Kober et al., in press, NI
Disgust responses: Specificity in insula? Search area: Insula Feldman-Barrett & Wager, 2005; Phan, Wager, Taylor, & Liberzon, 2002; Phan, Wager, Liberzon & Taylor, 2004
Meta-analysis plays a unique role in evaluating findings on “The Neural Correlates of Task X” • Is it reliable? • Would each activated region replicate in future studies? • Would activation be insensitive to minor variations in task design? • Is it task-specific? • Predictive of a particular psychological state or task type? • Diagnostic value?
Meta-analysis: Multilevel kernel density analysis (MKDA) • Peak coordinate locations from each study comparison map (437 maps; e.g., Damasio, 2000; Liberzon, 2000; Wicker, 2003) • Kernel convolution → comparison indicator maps, one per study map • Weighted average → proportion of activated comparison maps at each voxel (from 437 comparisons) • Apply threshold from Monte Carlo: expected maximum proportion under the null hypothesis, permuting blob locations within study maps • Significant regions Wager, Lindquist, & Kaplan, 2007; Etkin & Wager, in press
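The MKDA pipeline can be sketched on a one-dimensional "brain" with hypothetical peak coordinates (the function name, box kernel, and toy data below are illustrative, not the published implementation, which uses spherical kernels in 3-D):

```python
import numpy as np

def mkda_proportion(peak_lists, n_voxels=100, radius=3):
    """Toy 1-D MKDA: convolve each study map's peaks with a box kernel
    to form a binary comparison indicator map, then average across maps
    to get the proportion of study maps activating each voxel."""
    indicator = np.zeros((len(peak_lists), n_voxels))
    for c, peaks in enumerate(peak_lists):
        for p in peaks:
            lo, hi = max(0, p - radius), min(n_voxels, p + radius + 1)
            indicator[c, lo:hi] = 1.0   # binary: peaks within a study cannot stack
    return indicator.mean(axis=0)       # proportion of activating maps per voxel

# Three hypothetical study comparison maps (lists of peak coordinates)
studies = [[10, 12], [11], [50]]
prop = mkda_proportion(studies)         # prop[11] is 2/3: two of three maps activate there
```

Because each study map contributes a 0/1 indicator, a study reporting many nearby peaks counts only once per voxel, which is what prevents any single study from dominating.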
MKDA: Key points • Statistic reflects consistency across studies. Study comparison map is treated as a random effect. Peaks from one study cannot dominate. • Studies are weighted by quality (see additional info on handouts for rationale) • Spatial covariance is preserved in Monte Carlo. Less sensitive to arbitrary standards for how many peaks to report.
Weighted average: whether and how to weight studies/peaks MKDA weights each study comparison map by sqrt(sample size) and by study quality (fixed-effects analyses downweighted relative to random effects). Weighted proportion of activating study maps at voxel v: P(v) = Σc [δc(v) · qc · √Nc] / Σc [qc · √Nc], where δc(v) is the activation indicator (1 or 0) for map c, Nc is the sample size for map c, and qc is the study quality weight
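A minimal sketch of this weighting at a single voxel, with hypothetical numbers and a reduced quality weight standing in for the fixed-effects downweighting:

```python
import numpy as np

def weighted_proportion(activated, n_subjects, quality):
    """Weighted proportion of activating study maps at one voxel.
    Each map's weight is quality * sqrt(sample size), following the
    MKDA weighting scheme described above."""
    activated = np.asarray(activated, dtype=float)  # 1/0 indicator per map
    w = np.asarray(quality) * np.sqrt(np.asarray(n_subjects))
    return np.sum(w * activated) / np.sum(w)

# Hypothetical comparison maps; the third used a fixed-effects analysis
act = [1, 0, 1]            # activation indicators at this voxel
n   = [16, 25, 9]          # sample sizes
q   = [1.0, 1.0, 0.75]     # illustrative quality weights
p = weighted_proportion(act, n, q)   # → 6.25 / 11.25 ≈ 0.556
```

The unweighted proportion would be 2/3; the small fixed-effects study's vote counts less, pulling the weighted proportion toward the large non-activating study.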
Monte Carlo simulation • Simulation vs. theory (e.g., Poisson process) • Simulation allows: • Non-stationary spatial distribution of peaks (clumps) under the null hypothesis; randomize blob locations • Family-wise error rate control with an irregular (brain-shaped) search volume • Cluster-size inference, given a primary threshold • Monte Carlo threshold: expected maximum proportion under the null, E[max(P) | H0]
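A stripped-down 1-D stand-in for this Monte Carlo: each iteration places one fixed-width blob per study map at a random location, and the family-wise threshold is a high quantile of the resulting maximum-proportion distribution (the real procedure permutes the studies' own blobs within a brain-shaped volume):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_threshold(n_studies, n_voxels, blob_width, n_iter=2000, alpha=0.05):
    """Monte Carlo FWE threshold: randomize one blob per study map,
    record the maximum activation proportion across voxels on each
    iteration, and return the (1 - alpha) quantile of those maxima."""
    max_props = np.empty(n_iter)
    for it in range(n_iter):
        ind = np.zeros((n_studies, n_voxels))
        starts = rng.integers(0, n_voxels - blob_width, size=n_studies)
        for c, s in enumerate(starts):
            ind[c, s:s + blob_width] = 1.0
        max_props[it] = ind.mean(axis=0).max()   # max proportion this iteration
    return np.quantile(max_props, 1 - alpha)

thr = mc_threshold(n_studies=20, n_voxels=200, blob_width=5)
```

Any voxel whose observed proportion exceeds `thr` is then significant with family-wise error control, because the threshold was calibrated on the distribution of the map-wise maximum.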
Compare with Activation Likelihood Estimation (ALE) and Kernel Density Analysis (KDA) • Peak coordinates are combined across studies, then convolved with a kernel (density kernel for KDA; Gaussian ALE kernel for ALE) to give a peak density or ALE map • Apply significance threshold → significant results • Ignores the fact that some studies report more peaks than others! Density kernel: Chein, 1998; Phan et al., 2002; Wager et al., 2003, 2004, 2007, in press. Gaussian ALE kernel: Turkeltaub et al., 2002; Laird et al., 2005; others
Comparison with other methods MKDA • Statistic reflects consistency across studies. Study comparison map is treated as a random effect. Peaks from one study cannot dominate. • Studies are weighted by quality • Spatial covariance is preserved in Monte Carlo. Less sensitive to arbitrary standards for how many peaks to report. KDA/ALE • Peaks are lumped together, study is fixed effect. Peaks from one study can dominate, studies that report more peaks dominate. • No weighting, or z-score weighting (problematic) • Spatial covariance is not preserved in Monte Carlo. Effects of reporting standards large. See handouts for more comparison points
ALE approach • Treats reported peaks as if they were Gaussian probability distributions • Summarizes the union of probabilities at each voxel: if pi is the probability that peak Xi lies in a given voxel, the probability that at least one peak “truly” lies in that voxel is ALE = 1 − Πi (1 − pi), where (1 − pi) is the complement of each peak's probability • Null hypothesis: no peaks lie in the voxel; alternative: at least one peak lies in the voxel
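The union-of-probabilities statistic is a one-liner; the numbers below are hypothetical per-peak probabilities at a single voxel:

```python
import numpy as np

def ale_statistic(peak_probs):
    """ALE at one voxel: probability that at least one peak truly lies
    there, the union 1 - prod(1 - p_i) over the per-peak (Gaussian
    kernel) probabilities p_i."""
    p = np.asarray(peak_probs, dtype=float)
    return 1.0 - np.prod(1.0 - p)

ale = ale_statistic([0.2, 0.1, 0.05])   # → 1 - 0.8*0.9*0.95 = 0.316
```

The zero-smoothing pathology mentioned on the next slide falls out directly: with no smoothing, a reported peak puts p = 1 at its own voxel, and a single peak drives the ALE statistic to its maximum of 1 regardless of the other studies.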
ALE meta-analysis • Analyst chooses the smoothing kernel • ALE analysis with zero smoothing: every voxel reported in any study is significant in the meta-analysis • Test case: in a 3-peak meta-analysis where one peak activates a voxel, the ALE statistic there takes its highest possible value! • In practice: 10–15 mm FWHM kernel
Density analysis: Summary [Figure: summary maps across working memory, long-term memory, executive WM, inhibition, task switching, and response selection tasks] Wager et al., 2004; Nee, Wager, & Jonides, 2007; Wager et al., in press; Van Snellenberg & Wager, in press
Specificity • Task-related differences in relative activation frequency across the brain: • MKDA difference maps (e.g., Wager et al., 2008) • Task-related differences in absolute activation frequency • Nonparametric chi-square maps (Wager, Lindquist, & Kaplan, 2007) • Classifier systems to predict task type from distributed patterns of peaks (e.g., Gilbert)
MKDA difference maps: Emotion example (experienced vs. perceived emotion) • Approach: • Calculate density maps for the two conditions; subtract to get difference maps • Monte Carlo: randomize blob locations within each study, re-calculate density difference maps, and save the maximum • Repeat for many (e.g., 10,000) iterations to get the max distribution • Threshold based on the Monte Carlo simulation
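A toy 1-D version of this difference-map test, with fabricated indicator maps and a circular shift standing in for within-study blob randomization:

```python
import numpy as np

rng = np.random.default_rng(1)

def density_difference_test(ind_a, ind_b, n_iter=1000, alpha=0.05):
    """MKDA difference map with a max-statistic Monte Carlo: observed
    difference in activation proportions (A - B), thresholded at the
    (1 - alpha) quantile of the permuted maximum |difference|.
    Blob locations are randomized by circularly shifting each study
    map, a simple stand-in for within-study blob randomization."""
    diff = ind_a.mean(axis=0) - ind_b.mean(axis=0)
    n_vox = ind_a.shape[1]
    max_null = np.empty(n_iter)
    for it in range(n_iter):
        pa = np.array([np.roll(row, rng.integers(n_vox)) for row in ind_a])
        pb = np.array([np.roll(row, rng.integers(n_vox)) for row in ind_b])
        max_null[it] = np.abs(pa.mean(axis=0) - pb.mean(axis=0)).max()
    thr = np.quantile(max_null, 1 - alpha)
    return diff, thr, np.abs(diff) > thr

# Hypothetical maps: all "experience" studies activate voxels 10-14,
# while "perception" studies activate elsewhere
ind_a = np.zeros((15, 50)); ind_a[:, 10:15] = 1.0
ind_b = np.zeros((15, 50)); ind_b[:3, 30:35] = 1.0
diff, thr, sig = density_difference_test(ind_a, ind_b)
```

Using the maximum over voxels on each null iteration is what gives family-wise control over the whole difference map, rather than a per-voxel false positive rate.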
Emotion example: Selective regions [Figure: regions selective for Experience > Perception and Perception > Experience; labeled regions include OFC, mOFC, latOFC, amygdala, temporal pole, IFG, hypothalamus, aIns, vaIns, midbrain/PAG, pgACC, sgACC, dmPFC, and cerebellum] Wager et al., in press, Handbook of Emotion
Task–brain activity associations in meta-analysis • Measures of association: • Chi-square: requires high expected counts (> 5) in each cell; not appropriate for map-wise testing over many voxels • Fisher’s exact test (2 categories only) • Multinomial exact test: computationally impractical! • Nonparametric chi-square: approximation to the exact test; OK for low expected counts
Nonparametric chi-square: Details • Idea of the exact test: • Condition on the marginal counts for activation and task conditions • Null hypothesis: no systematic association between activation and task • The p-value is the proportion of possible null-hypothesis arrangements that produce an activation distribution across task conditions as extreme as that observed, or more extreme
Nonparametric chi-square: Details • Permutation test: • Permute the activation indicator vector, creating null-hypothesis data (no systematic association) • Marginal counts are preserved • Draw 5,000 or more permutation samples and calculate the p-value from the observed null-hypothesis distribution
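The permutation scheme for a single voxel can be sketched as follows; the activation and task vectors are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)

def perm_chi2(activation, task, n_perm=5000):
    """Permutation chi-square at one voxel: compare the observed
    chi-square for the activation-by-task contingency table against
    its distribution under random permutations of the activation
    indicator (which preserve both sets of marginal counts)."""
    def chi2(act, tk):
        tasks = np.unique(tk)
        obs = np.array([[np.sum((tk == t) & (act == a))
                         for a in (0, 1)] for t in tasks], dtype=float)
        row = obs.sum(axis=1, keepdims=True)
        col = obs.sum(axis=0, keepdims=True)
        exp = row * col / obs.sum()          # expected counts under independence
        return np.sum((obs - exp) ** 2 / exp)

    stat = chi2(activation, task)
    null = np.array([chi2(rng.permutation(activation), task)
                     for _ in range(n_perm)])
    return stat, (1 + np.sum(null >= stat)) / (1 + n_perm)

# Hypothetical voxel: task 0 maps activate it far more often than task 1
act  = np.array([1]*9 + [0]*1 + [1]*1 + [0]*9)
task = np.array([0]*10 + [1]*10)
stat, p = perm_chi2(act, task)   # stat = 12.8 for this 2x2 table
```

Because the reference distribution comes from permutations rather than the chi-square distribution, the low expected counts typical of meta-analytic contingency tables are not a problem.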
Density difference vs. chi-square • Relative vs. absolute differences [Figure: density and chi-square profiles for experience vs. perception across voxels of a one-dimensional brain]
Can we predict the emotion from the pattern of brain activity? • Approach: predict study category from the pattern of reported peaks (e.g., Gilbert, 2006) • Use a naïve Bayes classifier (see work by LaConte, Tong, Norman, Haxby). Cross-validate: predict emotion type for new studies that are not part of the training set (experienced vs. perceived emotion)
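A sketch of the classification analysis under stated assumptions: binary peak-indicator features per study, a Bernoulli naïve Bayes model with Laplace smoothing, and leave-one-study-out cross-validation (the data below are fabricated, with two categories given distinct activation regions):

```python
import numpy as np

def nb_loocv_accuracy(X, y):
    """Leave-one-study-out cross-validation of a Bernoulli naive Bayes
    classifier: predict each held-out study's category from its binary
    peak-indicator pattern, using a model fit on the other studies."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes = np.unique(y)
    correct = 0
    for i in range(len(y)):
        tr = np.ones(len(y), dtype=bool); tr[i] = False   # hold out study i
        scores = []
        for c in classes:
            Xc = X[tr & (y == c)]
            prior = np.log(len(Xc) / tr.sum())
            theta = (Xc.sum(axis=0) + 1) / (len(Xc) + 2)   # Laplace smoothing
            ll = np.sum(X[i] * np.log(theta) + (1 - X[i]) * np.log(1 - theta))
            scores.append(prior + ll)
        correct += classes[int(np.argmax(scores))] == y[i]
    return correct / len(y)

# Fabricated studies: 40 studies x 30 regions, sparse background peaks
rng = np.random.default_rng(3)
X = (rng.random((40, 30)) < 0.1).astype(float)
y = np.array([0]*20 + [1]*20)        # 0 = experienced, 1 = perceived
X[:20, :5] = 1.0                     # category 0 activates regions 0-4
X[20:, 5:10] = 1.0                   # category 1 activates regions 5-9
acc = nb_loocv_accuracy(X, y)
```

Cross-validating at the level of whole studies is the key point: accuracy then estimates how well the pattern generalizes to studies outside the training set, which is what a "reverse inference" claim requires.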
Classifying experienced emotion vs. perceived emotion: 80% accurate [Figure: classifier discriminations — experienced vs. perceived emotion; DMPFC vs. pre-SMA; PAG vs. anterior thalamus; deep cerebellar nuclei vs. lateral cerebellum]
Outline: Why and How… • Consistency: replicability across studies • Consistency in single-region results: MKDA • Consistency in functional networks: MKDA + co-activation • Specificity and “reverse inference” • Brain activity–psychological category mappings for individual brain regions: MKDA difference maps; nonparametric chi-square • Brain activity–psychological category mappings for distributed networks: applying classifier systems to meta-analytic data
Extending meta-analysis to connectivity • Co-activation: if a study (contrast map) activates within k mm of voxel 1, is it more likely to also activate within k mm of voxel 2? • Measures of association: Kendall’s tau-b, Fisher’s exact test, nonparametric chi-square, others…
Kendall’s Tau: Details • Ordinal “nonparametric” association between two variables, x and y • Uses ranks; no assumption of linearity or normality (Kendall, 1938, Biometrika) • Takes values in [−1, 1], like Pearson’s correlation • Tau is based on the proportion of concordant pairs of observations, i.e., pairs where sign(x difference) = sign(y difference) • Tau = (# concordant pairs − # discordant pairs) / total # pairs
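The pair-counting definition translates directly to code. This is the simple tau (tau-a); the tau-b used for binary co-activation data additionally corrects the denominator for ties, which this sketch omits:

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant) / total pairs, where a
    pair is concordant when the signs of its x and y differences agree.
    Tied pairs count as neither (tau-a; tau-b also adjusts the
    denominator for ties)."""
    conc = disc = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)   # positive iff the pair is concordant
        if s > 0:
            conc += 1
        elif s < 0:
            disc += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (conc - disc) / n_pairs

tau = kendall_tau([1, 2, 3, 4], [1, 2, 3, 4])    # perfectly concordant
tau2 = kendall_tau([1, 2, 3, 4], [4, 3, 2, 1])   # perfectly discordant
```

For co-activation, x and y would be the 0/1 activation indicators of two regions across study maps, so ties are frequent and the tau-b correction matters in practice.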
Meta-analysis functional networks: Examples • Emotion: Kober et al. (in press), 437 maps
Acknowledgements • Collaborators (meta-analysis of emotion, meta-analysis of cognitive function, statistics): Martin Lindquist, Lisa Feldman Barrett, Ed Smith, Tom Nichols, Luan Phan, Steve Taylor, Israel Liberzon, Derek Nee, John Jonides • Students: Hedy Kober, Lauren Kaplan, Jason Buhle, Jared Van Snellenberg • Funding agencies: National Science Foundation, National Institute of Mental Health
Whether and how to weight studies/peaks • Studies (and peaks) differ in sample size, methodology, analysis type, smoothness, etc. • Advantageous to give more weight to more reliable studies/peaks • Z-score weighting • Advantages: Weights nominally more reliable peaks more heavily • Disadvantages: Small studies can produce variable results. Reporting bias: High z-score peaks are high partially due to error; “capitalizing on chance” • Must convert to common Z-score metric across different analysis types in different studies
Whether and how to weight studies/peaks • Alternative: Sample-size weighting • Advantages: • Weights studies by the quality of information their peaks are likely to reflect • Avoids overweighting peaks reported due to “capitalizing on chance” • Disadvantages: Ignores relative reliability of various peaks within studies