MVPD – Multivariate pattern decoding

23.4.2009, Christian Kaul, MATLAB for Cognitive Neuroscience

Outline
• What is MVPD?
• What types of classifiers are there?
• MVPD in fMRI
• How to design an experiment: a few examples
• The MVPD MATLAB toolbox
• Common problems when thinking about MVPD of fMRI data
• Relevant introduction papers

MVPD – Multivariate pattern decoding: what is MVPD?
• A methodology in which an algorithm is trained to tell two or more conditions apart.
• The algorithm is then presented with a new set of data and classifies it into the conditions it has learned.
• MVPD is a relatively new tool in fMRI, but pattern classification as such has long been developed and used in artificial intelligence and neural networks.
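The train-then-classify idea can be sketched in a few lines of Python. This is only an illustration with invented numbers and a deliberately simple nearest-centroid rule; it is not the toolbox code and not one of the classifiers discussed in the talk. The function names `train` and `classify` are hypothetical.

```python
# Minimal nearest-centroid classifier: learn one mean pattern per condition,
# then assign a new pattern to the condition with the closest mean.
# All numbers are invented for illustration.

def train(patterns, labels):
    """Return one mean pattern (centroid) per condition."""
    sums, counts = {}, {}
    for pattern, label in zip(patterns, labels):
        if label not in sums:
            sums[label] = [0.0] * len(pattern)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], pattern)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def classify(centroids, pattern):
    """Return the condition whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, pattern))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


# Training set: four patterns over three "voxels", two conditions.
train_patterns = [[1.0, 0.2, 0.1], [0.9, 0.1, 0.3],
                  [0.1, 0.8, 1.0], [0.2, 1.0, 0.9]]
train_labels = ["A", "A", "B", "B"]

centroids = train(train_patterns, train_labels)
prediction = classify(centroids, [0.8, 0.3, 0.2])  # a new, unseen pattern
print(prediction)
```

The new pattern is never seen during training, which is the point of the second bullet above.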

What types of classifiers are there?
• The most common classifiers used for fMRI data are:
  • LDA (Linear Discriminant Analysis)
  • SVM (Support Vector Machines); algorithm: maximize the margin!
  • SLR (Sparse Logistic Regression)
• All generally do a good job.
• LDA and SLR find solutions based on linear combinations of features only, as does a linear SVM.
• SVMs with a non-linear kernel can also take non-linear effects into account. This is largely done by mapping the data into a higher-dimensional space (feature space).

Non-linear SVMs - Feature space
• 2 examples: [figures not preserved]
• Downside of non-linear SVMs: more and more parameters have to be optimized during learning.
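The two examples from the slide are not preserved, so here is one standard textbook illustration of a feature-space mapping instead, as a Python sketch. An XOR-like problem cannot be separated by a line in the original two dimensions, but it becomes linearly separable once the product of the two inputs is added as a third feature. The mapping `feature_map` and the hand-picked weights are assumptions made for this sketch; no SVM is actually trained here.

```python
# XOR-like problem: no single straight line in (x1, x2) separates the classes.
points = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
labels = [1, -1, -1, 1]


def feature_map(x1, x2):
    """Map the 2-D input into a 3-D feature space by adding the product term."""
    return (x1, x2, x1 * x2)


# In feature space a plane separates the classes. These weights are chosen
# by hand for illustration; a real SVM would learn them from the data.
weights = (0.0, 0.0, 1.0)


def predict(x1, x2):
    """Linear decision rule applied in feature space."""
    z = feature_map(x1, x2)
    score = sum(w * zi for w, zi in zip(weights, z))
    return 1 if score > 0 else -1


predictions = [predict(x1, x2) for x1, x2 in points]
print(predictions)
```

Kernel SVMs perform this kind of mapping implicitly, and the kernel's own parameters are among the extra parameters that then have to be optimized.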

MVPD in fMRI
• In situations where we do find a univariate effect, a multivariate analysis is unlikely to reveal anything new!
• But when conventional analysis is not feasible, multivariate analysis might be an option.
• What are we actually measuring?
• What does a “pattern of brain activity” mean?
• Example: feature-selective visual information is present in the BOLD signal.

fMRI of basic visual features
• Conventional analysis was thought not to be feasible because of its low spatial resolution compared with invasive single-cell recordings.
[Figure: Haynes & Rees (2006)]

fMRI of basic visual features
[Figure: mean signal vs. pattern (LDA); Kamitani & Tong (2006), Haynes & Rees (2005)]
• Multivariate pattern decoding!
• Often multivariate results are presented ROI-specific...
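The contrast between mean signal and pattern can be made concrete with a toy Python example. The numbers are invented: two conditions produce the same ROI mean, so a mean-signal comparison sees no difference, while the distribution of signal across voxels clearly differs. The helper names `roi_mean` and `pattern_contrast` are hypothetical.

```python
# Two conditions, two samples each, four "voxels" per sample (invented data).
cond_a = [[2.0, 0.0, 2.0, 0.0], [1.8, 0.2, 2.1, -0.1]]
cond_b = [[0.0, 2.0, 0.0, 2.0], [0.1, 1.9, -0.2, 2.2]]


def roi_mean(pattern):
    """Mean signal over all voxels in the ROI (the conventional measure)."""
    return sum(pattern) / len(pattern)


def pattern_contrast(pattern):
    """Signal in voxels 0 and 2 minus signal in voxels 1 and 3."""
    return (pattern[0] + pattern[2]) - (pattern[1] + pattern[3])


means_a = [roi_mean(p) for p in cond_a]
means_b = [roi_mean(p) for p in cond_b]
contrasts_a = [pattern_contrast(p) for p in cond_a]
contrasts_b = [pattern_contrast(p) for p in cond_b]

print(means_a, means_b)          # essentially identical ROI means
print(contrasts_a, contrasts_b)  # opposite signs: the pattern differs
```

A classifier working on the voxel pattern can exploit exactly this kind of difference, which averaging over the ROI removes.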

Multivariate pattern analysis – how to design an experiment
• Does the pattern of activity contain meaningful information we can extract?
• What is addressed is not the level of brain activity but the pattern of information within the activity.
• Questions that can be answered with multivariate pattern analysis:
  • “What have I seen?” Decoding of visual input; the majority of publications.
  • “What have I heard/felt/…?” Decoding of other sensory input should be possible.
  • “What am I going to do next?” Decisions seem to be coded in distinctive patterns of brain activity.

More interesting questions?
• Does the feature-selective information contained in the BOLD signal for an irrelevant stimulus change under different levels of attentional load in a central task?

Experiment 1
• Prediction (from load theory): feature-selective information should be reduced in the high-load condition.
[Figure: expected decoding accuracy (% correct decoded) relative to chance for low and high load, by ROI (V1, V2, V3)]

Decoding Result
• Result: feature-selective information is NOT reduced.
[Figure: actual vs. expected decoding accuracy relative to chance for low and high load, by ROI (V1, V2, V3)]

Question: Example 2 - intentions
• At the beginning of each trial, the word “select” was presented, instructing the subjects to freely and covertly choose one of two possible tasks, addition or subtraction. From the button press, it was possible to determine the subject's covert intention during the preceding delay period.
• Decoding objective: can the subject's decision be decoded?
Haynes et al., 2007c

Example 2, Result
• In the anterior medial prefrontal cortex, decoding was highest during the delay (green bars) but at chance level during task execution (red bars), after onset of the task-relevant stimuli. In contrast, the posterior and superior medial prefrontal cortex (MPFCp) encoded the chosen task only once it had entered the stage of execution, not during the delay period.
• Results are presented with the “searchlight” approach: a spherical searchlight centered on one voxel is used to define a local neighborhood.
Haynes et al., 2007c
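How a spherical searchlight defines a local neighborhood can be sketched in Python as follows. This is a minimal illustration under simple assumptions (isotropic voxels, radius given in voxel units, no brain mask); the function names are hypothetical and it is not the code used in the cited study.

```python
def sphere_offsets(radius):
    """All integer (dx, dy, dz) offsets within `radius` voxels of the origin."""
    r = int(radius)
    return [(dx, dy, dz)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            for dz in range(-r, r + 1)
            if dx * dx + dy * dy + dz * dz <= radius * radius]


def searchlight_neighborhood(center, shape, radius):
    """Voxel coordinates inside the sphere around `center`, clipped to the volume."""
    cx, cy, cz = center
    voxels = []
    for dx, dy, dz in sphere_offsets(radius):
        x, y, z = cx + dx, cy + dy, cz + dz
        if 0 <= x < shape[0] and 0 <= y < shape[1] and 0 <= z < shape[2]:
            voxels.append((x, y, z))
    return voxels


# Neighborhood of a voxel in the middle of a small 5 x 5 x 5 volume.
neighborhood = searchlight_neighborhood((2, 2, 2), (5, 5, 5), radius=2)
print(len(neighborhood))
```

In a full searchlight analysis the classifier would be run once per center voxel on the voxels returned here, and the resulting accuracy written back to the center voxel to form a map.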

Example 3 – Voxel-based tuning functions
• Tuning functions like those from monkey data, with fMRI!
Serences et al., 2008

Example 4 – Real-time reconstruction of seen images
Miyawaki et al., 2008

The MVPD MATLAB “toolbox”
• MATLAB functions to perform MVPD with “any” suitable data.
• Presented is the basic control script.
• The workflow in this control script is quite easy to follow and demonstrates what MVPD using SLR can look like.
• If anyone is interested in working with the code, please contact me directly: c.kaul@ucl.ac.uk

Common problems when thinking about MVPD of fMRI data
• Decoding of what? TRs, block averages, betas.
• Overfitting: too many features for too few data samples.
• Voxel selection.

Common problems when thinking about MVPD of fMRI data: TR, BLOCK or BETA?
• In principle there are 3 different strategies for obtaining your brain pattern: single TRs (raw data), averaged blocks of TRs, or betas (SPM estimates).
[Figure: single TRs, averaged blocks and betas compared in terms of noise and number of observations]
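The block-averaging strategy can be sketched in Python like this. It is an illustration with invented numbers, assuming equally long, non-overlapping blocks; the function name `block_average` is hypothetical and this is not the toolbox code.

```python
def block_average(tr_patterns, block_length):
    """Average consecutive TR patterns in non-overlapping blocks.

    A trailing incomplete block is dropped.
    """
    blocks = []
    last_start = len(tr_patterns) - block_length
    for start in range(0, last_start + 1, block_length):
        block = tr_patterns[start:start + block_length]
        n_voxels = len(block[0])
        blocks.append([sum(p[v] for p in block) / block_length
                       for v in range(n_voxels)])
    return blocks


# Five TRs over two "voxels" (invented data), averaged in blocks of two TRs.
trs = [[1.0, 4.0], [3.0, 2.0], [2.0, 0.0], [4.0, 2.0], [9.0, 9.0]]
blocks = block_average(trs, 2)
print(blocks)  # two block patterns instead of five TR patterns
```

Averaging leaves fewer observations for training but each one is less noisy. For independent noise the standard deviation falls by roughly the square root of the block length, and fMRI noise is temporally correlated, so the real gain is usually smaller.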

Common problems when thinking about MVPD of fMRI data - OVERFITTING
• (1) An SVM classifier is unstable on a small training set.
• (2) The SVM's optimal hyperplane may be biased when there are far fewer positive samples than negative samples.
• (3) Overfitting happens because the number of feature dimensions is much higher than the size of the training set.

Over-fitting and Under-fitting
• To avoid overfitting, cross-validation is used to evaluate the fit obtained with each set of parameter values tried during the grid or pattern search.
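The evaluation step of cross-validation can be sketched in Python as below. The sketch assumes a leave-one-run-out scheme and a simple nearest-centroid classifier, both chosen only for illustration; the data are invented and the function names are hypothetical. A grid search would repeat this evaluation once per candidate parameter value and keep the best-scoring one.

```python
def centroid(patterns):
    """Mean pattern across a list of equally long patterns."""
    n = len(patterns)
    return [sum(p[i] for p in patterns) / n for i in range(len(patterns[0]))]


def nearest(centroids, pattern):
    """Label of the centroid closest to `pattern` (squared Euclidean distance)."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2
                                   for a, b in zip(centroids[lab], pattern)))


def leave_one_run_out(samples):
    """samples: list of (run, label, pattern). Returns mean accuracy over folds."""
    runs = sorted({run for run, _, _ in samples})
    fold_accuracies = []
    for held_out in runs:
        # Train on all other runs, test on the held-out run only.
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        labels = sorted({lab for _, lab, _ in train})
        cents = {lab: centroid([p for _, l, p in train if l == lab])
                 for lab in labels}
        correct = sum(nearest(cents, p) == lab for _, lab, p in test)
        fold_accuracies.append(correct / len(test))
    return sum(fold_accuracies) / len(fold_accuracies)


# Three runs, one sample per condition and run (invented data).
samples = [
    (1, "A", [1.0, 0.0]), (1, "B", [0.0, 1.0]),
    (2, "A", [0.9, 0.2]), (2, "B", [0.1, 0.8]),
    (3, "A", [0.8, 0.1]), (3, "B", [0.3, 1.1]),
]
accuracy = leave_one_run_out(samples)
print(accuracy)
```

If the same cross-validation is used both to pick parameters and to report accuracy, the reported value tends to be optimistic; an outer, untouched test split avoids this.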

Common problems: VOXEL SELECTION (LDA & SVM)
• To reduce the input dimensionality (number of voxels), it is common to preselect voxels:
  • ROI-based selection of voxels. But: the ROI must be defined independently of the classification.
  • Threshold-based selection of voxels. But: the threshold must be independent of the classification.
  • Searchlight approach: a fixed sphere is moved over the brain, voxel by voxel. But: multiple comparisons!
• SLR does not have this problem, owing to automatic relevance determination.
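One common way to keep voxel selection independent of the classification is to rank voxels using the training data only and then apply the same selection to the test data. The Python sketch below illustrates this with invented numbers and a simple mean-difference score standing in for a proper statistic; the function name `select_voxels` is hypothetical.

```python
def select_voxels(train_a, train_b, n_keep):
    """Indices of the n_keep voxels with the largest absolute mean difference
    between two conditions, computed from training data only."""
    n_voxels = len(train_a[0])

    def mean(patterns, v):
        return sum(p[v] for p in patterns) / len(patterns)

    scores = [abs(mean(train_a, v) - mean(train_b, v)) for v in range(n_voxels)]
    ranked = sorted(range(n_voxels), key=lambda v: scores[v], reverse=True)
    return sorted(ranked[:n_keep])


# Training patterns for two conditions over four "voxels" (invented data).
train_a = [[1.0, 0.5, 0.2, 0.9], [1.2, 0.4, 0.1, 1.1]]
train_b = [[0.1, 0.5, 0.3, 0.2], [0.3, 0.6, 0.2, 0.0]]

selected = select_voxels(train_a, train_b, 2)

# The held-out test pattern plays no part in the selection; it is only
# reduced to the voxels chosen from the training data.
test_pattern = [1.1, 0.5, 0.2, 1.0]
reduced = [test_pattern[v] for v in selected]
print(selected, reduced)
```

Within cross-validation the selection would be repeated inside every fold, so that the held-out data never influence which voxels are kept.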

Relevant introduction papers
• Mur M, Bandettini PA, Kriegeskorte N. Revealing representational content with pattern-information fMRI: an introductory guide.
• Pereira F, Mitchell T, Botvinick M. Machine learning classifiers and fMRI: a tutorial overview.
• Yamashita O, Sato MA, Yoshioka T, Tong F, Kamitani Y. Sparse estimation automatically selects voxels relevant for the decoding of fMRI activity patterns.

Thanks – enjoy this sunny afternoon!
