
Hidden Process Models for Analyzing fMRI Data

Hidden Process Models for Analyzing fMRI Data. Rebecca Hutchinson Joint work with Tom Mitchell May 11, 2007 Student Seminar Series In partial fulfillment of the Speaking Requirement Carnegie Mellon University Computer Science Department. Introduction. Hidden Process Models (HPMs):


Presentation Transcript


  1. Hidden Process Models for Analyzing fMRI Data. Rebecca Hutchinson. Joint work with Tom Mitchell. May 11, 2007. Student Seminar Series, in partial fulfillment of the Speaking Requirement. Carnegie Mellon University, Computer Science Department.

  2. Introduction • Hidden Process Models (HPMs): • A new probabilistic model for time series data. • Designed for data generated by a collection of latent processes. • Potential domains: • Biological processes (e.g. synthesizing a protein) in gene expression time series. • Human processes (e.g. walking through a room) in distributed sensor network time series. • Cognitive processes (e.g. making a decision) in functional Magnetic Resonance Imaging time series.

  3. [Figure: Process 1 through Process P, each shown over data dimensions d1 … dN against time t, above an observed time series over d1 … dN.] Prior knowledge: • There are a total of 6 processes in this window of data. • An instance of Process 1 begins in this window. • An instance of Process P begins in this window. • An instance of either Process 1 OR Process P begins in this window.

  4. [Figure: Process 1 through Process P, each shown over data dimensions d1 … dN against time t, with timelines labeled "Process 1 timings" through "Process P timings" and an observed time series over d1 … dN.] More questions: • Can we learn the parameters of these processes from the data (even when we don't know when they occur)? • Would a different set of processes model the data better?

  5. Simple Case: Known Timing • If we know which processes occur when, we can estimate their shapes with the general linear model. • The timings generate a convolution matrix X with T rows (one per time point) and one block of columns per process (p1, p2, p3):

             p1       p2       p3
     t=1   1 0 0    0 0 0    0 0 0
     t=2   0 1 0    1 0 0    1 0 0
     t=3   0 0 1    0 1 0    0 1 0
     t=4   0 0 0    0 0 1    0 0 1
     …       …        …        …
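
The construction of the convolution matrix can be sketched in Python with NumPy. This is a minimal illustration, not the authors' code: the function name `convolution_matrix` and its arguments are hypothetical, and the start times (p1 at t=1, p2 and p3 at t=2, each lasting 3 time points) are read off the matrix on the slide.

```python
import numpy as np

def convolution_matrix(start_times, duration, n_timepoints):
    """Build X with one block of `duration` columns per process.

    start_times[p] lists the 1-based time points at which process p begins
    (a hypothetical input format). X[t, p * duration + k] counts the
    instances of process p that are k steps into their response at time t.
    """
    n_processes = len(start_times)
    X = np.zeros((n_timepoints, n_processes * duration), dtype=int)
    for p, starts in enumerate(start_times):
        for s in starts:
            for k in range(duration):
                t = (s - 1) + k          # convert to a 0-based row index
                if t < n_timepoints:     # ignore responses running past the window
                    X[t, p * duration + k] += 1
    return X

# Assumed timings matching the slide: p1 starts at t=1, p2 and p3 at t=2.
X = convolution_matrix(start_times=[[1], [2], [2]], duration=3, n_timepoints=4)
print(X)
```

Each row of the result marks which step of which process response contributes to that time point, which is what lets the general linear model treat overlapping responses additively.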

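  6. Simple Case: Known Timing • The general linear model is X W = Y, where X is the convolution matrix (column blocks p1, p2, p3), W stacks the response signatures W(1), W(2), W(3) of p1, p2, p3, and Y is the observed data (the slide labels the dimensions T and D):

             p1       p2       p3
           1 0 0    0 0 0    0 0 0        W(1)
           0 1 0    1 0 0    1 0 0    ×   W(2)   =   Y
           0 0 1    0 1 0    0 1 0        W(3)
           0 0 0    0 0 1    0 0 1
             …        …        …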
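
When the timings are known, estimating W from X W = Y is an ordinary least-squares problem. The sketch below assumes noise-free synthetic data and a hypothetical design (two instances of each of three processes over 12 time points, chosen so that X has full column rank); it uses `numpy.linalg.lstsq` and is not the authors' implementation.

```python
import numpy as np

def convolution_matrix(start_times, duration, n_timepoints):
    """One block of `duration` columns per process; 1-based start times."""
    X = np.zeros((n_timepoints, len(start_times) * duration))
    for p, starts in enumerate(start_times):
        for s in starts:
            for k in range(duration):
                if s - 1 + k < n_timepoints:
                    X[s - 1 + k, p * duration + k] += 1.0
    return X

duration, T, D = 3, 12, 2   # hypothetical sizes: 3-step responses, 12 time points, 2 data columns

# Hypothetical known timings: each process occurs twice, with overlaps.
X = convolution_matrix([[1, 6], [2, 9], [4, 10]], duration, T)

# Synthetic ground-truth response signatures, stacked as W(1), W(2), W(3).
rng = np.random.default_rng(0)
W_true = rng.normal(size=(3 * duration, D))
Y = X @ W_true               # noise-free data, for illustration only

# Least-squares estimate of the stacked response signatures.
W_hat, residuals, rank, singular_values = np.linalg.lstsq(X, Y, rcond=None)
```

Note that if two processes always start at the same time, as p2 and p3 do in the slide's small example, their column blocks are identical and their responses cannot be separated by least squares alone.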

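  7. Challenge: Unknown Timing • The same model X W = Y applies, with X the convolution matrix (column blocks p1, p2, p3), W the stacked response signatures W(1), W(2), W(3), and Y the observed data (the slide labels the dimensions T and D):

             p1       p2       p3
           1 0 0    0 0 0    0 0 0        W(1)
           0 1 0    1 0 0    1 0 0    ×   W(2)   =   Y
           0 0 1    0 1 0    0 1 0        W(3)
           0 0 0    0 0 1    0 0 1
             …        …        …

     • Uncertainty about the processes essentially makes the convolution matrix a random variable.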

  8. Our Approach • Each process model contains a probability distribution over when the process occurs relative to a known event (called a timing landmark). • When predicting the underlying processes, use prior knowledge about timing to limit the hypothesis space.

  9. fMRI Data • Features: 10,000 voxels, imaged every second. • Training examples: 10-40 trials (task repetitions). [Figure: hemodynamic response following neural activity; axes: signal amplitude vs. time (seconds).]

  10. Study: Pictures and Sentences • Task: Decide whether sentence describes picture correctly, indicate with button press. • 13 normal subjects, 40 trials per subject. • Sentences and pictures describe 3 symbols: *, +, and $, using 'above', 'below', 'not above', 'not below'. • Images are acquired every 0.5 seconds. [Figure: trial timeline showing Read Sentence, Fixation, View Picture, and Rest, with marks at t=0, 4 sec., and 8 sec.; processes labeled Read Sentence, View Picture, Press Button.]

  11. Goals for fMRI • To track cognitive processes over time. • Estimate process hemodynamic responses. • Estimate process timings. • Allowing processes that do not directly correspond to the stimuli timing is a key contribution of HPMs! • To compare hypotheses of cognitive behavior.

  12. HPM Modeling Assumptions • Model latent time series at process-level. • Process instances share parameters based on their process types. • Use prior knowledge from experiment design. • Sum process responses linearly.

  13. Processes of the HPM:
      Process 1: ReadSentence • Response signature W: [plot over voxels v1, v2] • Duration d: 11 sec. • Offsets Ω: {0,1} • P(Ω): {θ0, θ1}
      Process 2: ViewPicture • Response signature W: [plot over voxels v1, v2] • Duration d: 11 sec. • Offsets Ω: {0,1} • P(Ω): {θ0, θ1}
      Input stimulus Δ: sentence, picture
      Timing landmarks λ: λ1, λ2
      Process instance π2: Process h: 2 • Timing landmark: λ2 • Offset O: 1 (Start time: λ2 + O)
      One configuration c of process instances π1, π2, …, πk (with prior φc)
      Predicted mean: [plot over voxels v1, v2], plus noise N(0, σ1) for v1 and N(0, σ2) for v2

  14. HPM Formalism
      HPM = <H, C, Φ, Σ>
      H = <h1, …, hH>, a set of processes (e.g. ReadSentence)
        h = <W, d, Ω, Θ>, a process
          W = response signature
          d = process duration
          Ω = allowable offsets
          Θ = multinomial parameters over values in Ω
      C = <c1, …, cC>, a set of configurations
        c = <π1, …, πL>, a set of process instances
          π = <h, λ, O>, a process instance (e.g. ReadSentence(S1))
            h = process ID
            λ = timing landmark (e.g. stimulus presentation of S1)
            O = offset (takes values in Ωh)
      Φ = <φ1, …, φC>, priors over C
      Σ = <σ1, …, σV>, standard deviation for each voxel
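
The formalism maps naturally onto a few small data structures. The following Python sketch is one possible encoding, not the authors' implementation: the class and field names are hypothetical, time points and process IDs are 0-based, and `predicted_mean` implements the "sum process responses linearly" assumption for a single configuration.

```python
from dataclasses import dataclass
from typing import List, Sequence

import numpy as np

@dataclass
class Process:
    name: str
    W: np.ndarray            # response signature, shape (d, V)
    offsets: Sequence[int]   # allowable offsets (Omega)
    theta: Sequence[float]   # multinomial parameters over offsets (Theta)

@dataclass
class ProcessInstance:
    process_id: int          # h, index into HPM.processes (0-based here)
    landmark: int            # lambda, time index of the timing landmark
    offset: int              # O, one of the process's allowable offsets

    @property
    def start_time(self) -> int:
        return self.landmark + self.offset

@dataclass
class HPM:
    processes: List[Process]                     # H
    configurations: List[List[ProcessInstance]]  # C
    priors: List[float]                          # Phi, one prior per configuration
    sigma: np.ndarray                            # Sigma, one std. dev. per voxel

def predicted_mean(hpm: HPM, config: List[ProcessInstance], n_timepoints: int) -> np.ndarray:
    """Sum the response signatures of all instances in one configuration."""
    mean = np.zeros((n_timepoints, hpm.sigma.shape[0]))
    for inst in config:
        W = hpm.processes[inst.process_id].W
        start = inst.start_time
        stop = min(start + W.shape[0], n_timepoints)
        if stop > start:
            mean[start:stop] += W[: stop - start]
    return mean

# Toy example: two processes, duration 2, one voxel, overlapping at time index 1.
read = Process("ReadSentence", np.array([[1.0], [2.0]]), [0, 1], [0.5, 0.5])
view = Process("ViewPicture", np.array([[10.0], [20.0]]), [0, 1], [0.5, 0.5])
config = [ProcessInstance(0, landmark=0, offset=0), ProcessInstance(1, landmark=1, offset=0)]
hpm = HPM([read, view], [config], [1.0], np.array([1.0]))
mean = predicted_mean(hpm, config, n_timepoints=4)
```

At the overlapping time point the two responses add, so the predicted mean there is the sum of the second step of ReadSentence and the first step of ViewPicture.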

  15. HPMs: the graphical model • The set C of configurations constrains the joint distribution on {h(k), o(k)} ∀k. [Figure: graphical model with nodes for configuration c and, for each process instance π1, …, πk, timing landmark λ, process type h, offset o, and start time s; Σ; and data Yt,v for t = [1,T], v = [1,V]. Legend: observed, unobserved.]

  16. Encoding Experiment Design
      Processes: ReadSentence = 1, ViewPicture = 2, Decide = 3
      Input stimulus Δ and timing landmarks λ1, λ2: [Figure: stimulus timeline.]
      Constraints encoded:
        h(π1) = {1,2}
        h(π2) = {1,2}
        h(π1) != h(π2)
        o(π1) = 0
        o(π2) = 0
        h(π3) = 3
        o(π3) = {1,2}
      [Figure: Configuration 1, Configuration 2, Configuration 3, Configuration 4.]
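
The constraints on this slide determine a small, enumerable hypothesis space. As a minimal sketch (the function name and the tuple representation are hypothetical, and the order in which configurations are listed is not claimed to match the slide's numbering), the four configurations can be generated like this:

```python
from itertools import product

READ_SENTENCE, VIEW_PICTURE, DECIDE = 1, 2, 3

def enumerate_configurations():
    """Return every (process ID, offset) assignment allowed by the constraints.

    Each configuration is a tuple of three process instances, written as
    (h, o) pairs for instances 1, 2, and 3.
    """
    configurations = []
    for h1, h2, o3 in product((1, 2), (1, 2), (1, 2)):
        if h1 == h2:              # the two stimulus processes must differ
            continue
        configurations.append((
            (h1, 0),              # first instance: offset fixed at 0
            (h2, 0),              # second instance: offset fixed at 0
            (DECIDE, o3),         # third instance is Decide, offset 1 or 2
        ))
    return configurations

configs = enumerate_configurations()
for i, c in enumerate(configs, start=1):
    print(i, c)
```

Two choices for which stimulus process comes first, times two allowable Decide offsets, gives the four configurations shown on the slide.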

  17. Inference • Over configurations. • Choose the most likely configuration given the observed data, the input stimuli, and the model, where C = configuration, Y = observed data, Δ = input stimuli, HPM = model.
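
The slide's formula did not survive in this transcript. Using only the symbols it defines, and writing Δ for the input stimuli, choosing the most likely configuration can be written as the following maximum a posteriori rule. This is a reconstruction under that assumption, with Bayes' rule applied in the second line; it is not a quotation of the slide.

```latex
\hat{c} = \arg\max_{c \in C} \; P(C = c \mid Y, \Delta, \mathrm{HPM})

P(C = c \mid Y, \Delta, \mathrm{HPM}) \;\propto\; P(Y \mid C = c, \mathrm{HPM}) \, P(C = c \mid \Delta, \mathrm{HPM})
```

The first factor is the likelihood of the data under the configuration's predicted mean, and the second is the prior over configurations.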

  18. Learning • Parameters to learn: • Response signature W for each process • Timing distribution Θ for each process • Standard deviation σ for each voxel • Expectation-Maximization (EM) algorithm to estimate W and Θ. • E step: estimate a probability distribution over configurations. • M step: update estimates of W (using reweighted least squares) and Θ (using standard MLEs) based on the E step. • After convergence, use standard MLEs for σ.
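
The E step can be illustrated with a short Python sketch. It assumes independent Gaussian noise per voxel around each configuration's predicted mean, which is consistent with the N(0, σ) noise terms on the earlier slides; the function name, argument layout, and toy numbers are hypothetical.

```python
import numpy as np

def configuration_posterior(Y, means, priors, sigma):
    """Posterior probability of each configuration given the data.

    Y:      observed data, shape (T, V)
    means:  predicted mean per configuration, shape (C, T, V)
    priors: prior probability per configuration, shape (C,)
    sigma:  noise standard deviation per voxel, shape (V,)
    """
    Y = np.asarray(Y, dtype=float)
    means = np.asarray(means, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    residuals = (Y[None, :, :] - means) / sigma
    # Gaussian log likelihood up to a constant shared by all configurations.
    log_lik = -0.5 * np.sum(residuals ** 2, axis=(1, 2))
    log_post = np.log(np.asarray(priors, dtype=float)) + log_lik
    log_post -= log_post.max()        # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Toy example: two configurations, 3 time points, 1 voxel.
means = np.array([[[0.0], [0.0], [0.0]],
                  [[1.0], [1.0], [1.0]]])
Y = means[1]                          # data generated exactly by configuration 2
post = configuration_posterior(Y, means, priors=[0.5, 0.5], sigma=[1.0])
```

Working in log space and subtracting the maximum avoids underflow when the data cover many time points and voxels.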

  19. Uncertain Timings • Convolution matrix models several choices for each time point. The matrix has T' > T rows; each row is labeled with its time point and the configurations it applies to, with column blocks for P, S, and D:

      Time   Configurations      P        S        D
      t=1    3,4               1 0 0    0 0 0    0 0 0
      t=1    1,2               0 0 0    1 0 0    0 0 0
      t=2    3,4               0 1 0    0 0 0    0 0 0
      t=2    1,2               0 0 0    0 1 0    0 0 0
      …      …                   …        …        …
      t=18   3                 0 0 1    0 0 0    0 1 0
      t=18   4                 0 0 1    0 0 0    0 0 1
      t=18   1                 0 0 0    0 0 1    0 1 0
      t=18   2                 0 0 0    0 0 1    0 0 1
      …      …                   …        …        …

  20. Uncertain Timings • Weight each row with probabilities from E-step. The model is Y = X W, with X the matrix below (column blocks P, S, D):

      Configurations   Weight      P        S        D
      3,4              e1        1 0 0    0 0 0    0 0 0
      1,2              e2        0 0 0    1 0 0    0 0 0
      3,4              e3        0 1 0    0 0 0    0 0 0
      1,2              e4        0 0 0    0 1 0    0 0 0
      …                …           …        …        …

      e1 = P(C=3 | Y, W_old, Θ_old, σ_old) + P(C=4 | Y, W_old, Θ_old, σ_old)
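
Weighting each row by its E-step probability turns the update for W into a weighted least-squares problem. A minimal sketch, assuming the rows of X and the matching rows of Y have already been stacked as on the slide (the function name and the toy numbers are hypothetical): scaling both sides by the square root of each weight reduces the problem to ordinary least squares.

```python
import numpy as np

def weighted_least_squares(X_rows, Y_rows, weights):
    """Minimise sum_r weights[r] * ||Y_rows[r] - X_rows[r] @ W||^2 over W."""
    X_rows = np.asarray(X_rows, dtype=float)
    Y_rows = np.asarray(Y_rows, dtype=float)
    root_w = np.sqrt(np.asarray(weights, dtype=float))[:, None]
    W, *_ = np.linalg.lstsq(X_rows * root_w, Y_rows * root_w, rcond=None)
    return W

# Toy example: two candidate rows explain the same parameter with different
# targets; the estimate is pulled toward the more probable row.
W = weighted_least_squares(
    X_rows=[[1.0], [1.0]],
    Y_rows=[[0.0], [10.0]],
    weights=[0.25, 0.75],
)
```

In this toy case the solution is the weighted average of the two targets, which shows how rows belonging to unlikely configurations contribute little to the estimate.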

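  21. Learned models: Learned HPM with 3 processes (S, P, D), and d = 13 sec. [Figure: predicted and observed time series with S, P, and D process instances marked; in the predicted panel, D start time chosen by program as t+18; in the observed panel, the D instances are marked "D?".]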

  22. ViewPicture in Visual Cortex
      Offset   θ = P(Offset)
      0        0.725
      1        0.275

  23. ReadSentence in Visual Cortex
      Offset   θ = P(Offset)
      0        0.625
      1        0.375

  24. Decide in Visual Cortex
      Offset   θ = P(Offset)
      0        0.075
      1        0.025
      2        0.025
      3        0.025
      4        0.225
      5        0.625

  25. ViewPicture

  26. ReadSentence

  27. Decide [Figure: axis labeled "Seconds following the second stimulus"; annotation: "Multinomial probabilities on these time points".]

  28. Comparing Models • 5-fold cross-validation, 1 subject. [Table of models and results not preserved in the transcript.]
      Process abbreviations:
        P = ViewPicture
        S = ReadSentence
        S+ = ReadAffirmativeSentence
        S- = ReadNegatedSentence
        D = Decide
        D+ = DecideAfterAffirmative
        D- = DecideAfterNegated
        Dy = DecideYes
        Dn = DecideNo
        Dc = DecideConfusion
        B = Button
      ** This HPM can also classify Dy vs. Dn with 92.0% accuracy. GNBC gets 53.9% (using the window from the second stimulus to the end of the trial).

  29. Are we learning the right number of processes? • Use synthetic data where we know ground truth. • Generate training and test sets with 2/3/4 processes. • Train HPMs with 2/3/4 processes on each. • For each test set, select the HPM with the highest data log likelihood.
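
The selection rule in the last bullet can be sketched as follows. This is a simplified illustration rather than the authors' procedure: each candidate is represented by a single predicted mean and per-voxel standard deviations, whereas a full HPM likelihood would also account for the distribution over configurations. All names and numbers are hypothetical.

```python
import numpy as np

def gaussian_log_likelihood(Y, mean, sigma):
    """Log likelihood of Y (T, V) under independent Gaussians per voxel."""
    Y = np.asarray(Y, dtype=float)
    mean = np.asarray(mean, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    z = (Y - mean) / sigma
    return float(np.sum(-0.5 * z ** 2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)))

def select_model(Y_test, candidates):
    """Return the name of the candidate with the highest test log likelihood.

    candidates maps a model name to a (predicted mean, sigma) pair.
    """
    scores = {
        name: gaussian_log_likelihood(Y_test, mean, sigma)
        for name, (mean, sigma) in candidates.items()
    }
    return max(scores, key=scores.get), scores

# Toy test set of 4 time points and 2 voxels, with two hypothetical candidates.
Y_test = np.zeros((4, 2))
candidates = {
    "2 processes": (np.zeros((4, 2)), np.ones(2)),
    "3 processes": (np.ones((4, 2)), np.ones(2)),
}
best, scores = select_model(Y_test, candidates)
```

Scoring on held-out data rather than training data is what keeps the model with more processes from winning automatically.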

  30. Related Work
      • fMRI
        • General Linear Model (Dale99): must assume timing of process onset to estimate hemodynamic response.
        • Computer models of human cognition (Just99, Anderson04): predict fMRI data rather than learning parameters of processes from the data.
      • Machine Learning
        • Classification of windows of fMRI data (Cox03, Haxby01, Mitchell04): does not typically model overlapping hemodynamic responses.
        • Dynamic Bayes Networks (Murphy02, Ghahramani97): HPM assumptions/constraints are difficult to encode in DBNs.

  31. Future Work • Incorporate spatial prior knowledge. E.g. share parameters across voxels (extending Niculescu05). • Smooth hemodynamic responses (e.g. Boynton96). • Improve algorithm complexities. • Apply to open cognitive science problems.

  32. Conclusions • Take-away messages: • HPMs are a probabilistic model for time series data generated by a collection of latent processes. • In the fMRI domain, HPMs can simultaneously estimate the hemodynamic response and localize the timing of cognitive processes.

  33. References
      John R. Anderson, Daniel Bothell, Michael D. Byrne, Scott Douglass, Christian Lebiere, and Yulin Qin. An integrated theory of the mind. Psychological Review, 111(4):1036–1060, 2004. http://act-r.psy.cmu.edu/about/.
      Geoffrey M. Boynton, Stephen A. Engel, Gary H. Glover, and David J. Heeger. Linear systems analysis of functional magnetic resonance imaging in human V1. The Journal of Neuroscience, 16(13):4207–4221, 1996.
      David D. Cox and Robert L. Savoy. Functional magnetic resonance imaging (fMRI) "brain reading": detecting and classifying distributed patterns of fMRI activity in human visual cortex. NeuroImage, 19:261–270, 2003.
      Anders M. Dale. Optimal experimental design for event-related fMRI. Human Brain Mapping, 8:109–114, 1999.
      Zoubin Ghahramani and Michael I. Jordan. Factorial hidden Markov models. Machine Learning, 29:245–275, 1997.
      James V. Haxby, M. Ida Gobbini, Maura L. Furey, Alumit Ishai, Jennifer L. Schouten, and Pietro Pietrini. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293:2425–2430, September 2001.
      Marcel Adam Just, Patricia A. Carpenter, and Sashank Varma. Computational modeling of high-level cognition and brain function. Human Brain Mapping, 8:128–136, 1999. http://www.ccbi.cmu.edu/project 10modeling4CAPS.htm.
      Tom M. Mitchell et al. Learning to decode cognitive states from brain images. Machine Learning, 57:145–175, 2004.
      Kevin P. Murphy. Dynamic bayesian networks. To appear in Probabilistic Graphical Models, M. Jordan, November 2002.
      Radu Stefan Niculescu. Exploiting Parameter Domain Knowledge for Learning in Bayesian Networks. PhD thesis, Carnegie Mellon University, July 2005. CMU-CS-05-147.
