NEUROIMAGING AND COMPUTATIONAL MODELING OF SYLLABLE SEQUENCE PRODUCTION
JASON W. BOHLAND
DEPARTMENT OF COGNITIVE & NEURAL SYSTEMS
GRADUATE SCHOOL OF ARTS AND SCIENCES
BOSTON UNIVERSITY
OCTOBER 23, 2006
FUNDED BY:
• National Institute on Deafness and other Communication Disorders [ R01 DC02852, F. Guenther PI ]
• National Science Foundation [ NSF SBE-0354378, S. Grossberg PI ]
Outline
• Background and Motivation
• An fMRI investigation of syllable sequence production
• Preliminary efforts with magnetoencephalography
• Computational neural modeling of speech sound sequencing
• Conclusions, future directions, and thanks
The brain must rapidly assemble and enact properly ordered and properly timed speech sound sequences.
• the “action syntax problem” (Lashley, 1951)
These sequences are composed from a finite alphabet of learned speech units.
• English has ~40-50 phonemes
• ~500 syllables account for ~80% of spoken English usage
Serial order in speech production
• Linguistic performance errors
  - obey structural constraints
  - imply hierarchical organization
  - frame and content theories (Shattuck-Hufnagel, MacKay, MacNeilage, …)
• Chronometric data
  - sequence length, familiarity effects
  - imply parallel planning
• Co-articulation
  - carry-over and look-ahead in motor programming
  - imply parallel planning
• Conceptual / theoretical non-biological models
[ figure from Levelt et al. (1999) ]
Controlling the serial production of speech motor programs
• “motor chunks”, but how big?
• Mental syllabary ( Crompton, 1981; Levelt, 1994 )
• Syllable frequency effect is a phonetic effect ( Laganaro and Alario, 2006 )
This project: Extend the DIVA model to include explicit parallel representations of forthcoming utterances that interface with learned speech motor programs.
[ figure from Guenther, Ghosh, and Tourville (2006) ]
Several brain regions have been implicated in sequencing and speech production
• pre-SMA / SMA [ e.g. Jonas, 1981, 1987; Ziegler et al., 1997; Pai, 1999 ]
• pre-frontal cortex / IFG / Broca’s Area [ e.g. Mohr et al., 1978; Hillis, 2005 ]
• Anterior Insula [ e.g. Dronkers, 1996; Nagao et al., 1999; Tanji et al., 2001; Nestor et al., 2003 ]
• Basal Ganglia [ e.g. Pickett et al., 1998; Ho et al., 1998 ]
• Cerebellum [ e.g. Silveri et al., 1998; Riva, 1998 ]
Previous relevant functional imaging data are sparse and difficult to interpret
[ figure panels: ta, pataka; from Dogil et al., 2002; first reported in Riecker et al., 2000; also see Shuster and Lemieux, 2004; Alario et al., 2006 ]
• No activation of SMA or insula compared to resting baseline.
• Cb activation only for “stra.”
• Inconsistent with the majority of overt production studies. Need to clarify!
• Difficult to reconcile these results with the generative needs of speech.
An fMRI investigation of syllable sequence production Bohland, J. W. and Guenther, F. H. (2006). NeuroImage.
An fMRI investigation of syllable sequence production
Bohland, J. W. and Guenther, F. H. (2006). NeuroImage.
2 x 2 stimulus design [ + baseline ]
• syl controls the structural complexity of each syllable ( CV vs. CC(C)V ); all syllables in an utterance have the same frame complexity
• seq controls the number of unique syllables that must be represented ( and maintained in order ) by the subject
The experimental task
• The task was performed within a GO / NOGO paradigm
• 2 x 2 x 2 factorial design
• “Event Triggered” imaging eliminates artifacts, makes speaking more natural
The experimental task ( a dramatization )
Analysis methods: SPM2, SnPM2, FreeSurfer, ROI Toolbox
• SPM2 ( Wellcome Department of Imaging Neuroscience, London, UK )
  - Preprocessing
    - Rigid-body realignment of functional series
    - Co-registration of functional to anatomical images
    - Spatial normalization to MNI template
    - Isotropic Gaussian smoothing (8 mm FWHM)
  - Voxel-based analysis
    - General linear model estimation for each subject
    - Second-level analysis for population inference
• SnPM2 ( University of Michigan Department of Biostatistics )
  - Non-parametric second-level inference
  - Variance pooling (4 x 4 x 4 mm³)
  - Pseudo-T statistics
  - Combined voxel-level and cluster-level inference
• FreeSurfer ( Athinoula A. Martinos Center for Biomedical Imaging )
  - Cortical surface reconstruction
  - Automatic cortical parcellation
• ROI Toolbox ( Speech Lab, Cognitive & Neural Systems, Boston University )
  - Region-level analyses
  - Tests for lateralization
  - Paired t-tests of effect sizes per hemisphere, per subject
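The lateralization test in the last bullet (a paired t-test on per-subject effect sizes, left vs. right hemisphere) can be sketched as follows. This is a minimal illustration in plain Python, not the ROI Toolbox code; the function name `paired_t` and the effect sizes are hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t(left, right):
    """Paired t-test on per-subject effect sizes (left vs. right hemisphere).

    Returns (t, degrees_of_freedom). Positive t means left > right on average.
    """
    if len(left) != len(right) or len(left) < 2:
        raise ValueError("need two equally long samples with at least 2 subjects")
    diffs = [l - r for l, r in zip(left, right)]
    sd = stdev(diffs)  # sample standard deviation of the within-subject differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Hypothetical effect sizes for one ROI in five subjects (not data from the study).
left_hemi = [1.2, 0.9, 1.5, 1.1, 1.4]
right_hemi = [0.8, 0.7, 1.0, 1.0, 0.9]
t_stat, df = paired_t(left_hemi, right_hemi)
print(f"t({df}) = {t_stat:.2f}")
```

The resulting t value would then be compared against a t distribution with `df` degrees of freedom to decide whether the region is lateralized.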
Stimulus design: Sequence Complexity ( seq ) x Syllable Complexity ( syl )

                  syl: Simple                syl: Complex
                  CV - CV - CV               CC(C)V - CC(C)V - CC(C)V
  seq: Simple     ta - ta - ta               stra - stra - stra
  seq: Complex    ka - ru - ti               kla - stri - splu

Main effects of sequence complexity
• Left inferior frontal sulcus area
• Right inferior Cb
• Medial premotor regions
• aINS / FO junction
• Anterior thalamus / caudate
• Posterior parietal
Stimulus design: Sequence Complexity ( seq ) x Syllable Complexity ( syl )

                  syl: Simple                syl: Complex
                  CV - CV - CV               CC(C)V - CC(C)V - CC(C)V
  seq: Simple     ta - ta - ta               stra - stra - stra
  seq: Complex    ka - ru - ti               kla - stri - splu

Main effects of syllable complexity
• Medial premotor regions (pre-SMA)
• Right superior paravermal cerebellum
Sequence and syllable complexity strongly interact e.g. the effect of sequence complexity is greater when the individual syllables are complex
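To make "interaction" concrete: in a 2 x 2 design, the seq x syl interaction is the difference between the seq effect for complex syllables and the seq effect for simple syllables. The sketch below computes both main effects and the interaction from four cell means. The function name and the response values are hypothetical, chosen only to illustrate the arithmetic.

```python
def factorial_effects(cell):
    """Main effects and interaction for a 2 x 2 (seq x syl) design.

    `cell` maps (seq_level, syl_level) -> mean response,
    with levels "simple" and "complex".
    """
    ss = cell[("simple", "simple")]    # e.g. ta-ta-ta
    sc = cell[("simple", "complex")]   # e.g. stra-stra-stra
    cs = cell[("complex", "simple")]   # e.g. ka-ru-ti
    cc = cell[("complex", "complex")]  # e.g. kla-stri-splu
    return {
        "main_seq": ((cs + cc) - (ss + sc)) / 2,
        "main_syl": ((sc + cc) - (ss + cs)) / 2,
        # seq effect with complex syllables minus seq effect with simple syllables
        "interaction": (cc - sc) - (cs - ss),
    }

# Hypothetical cell means (arbitrary units), not values from the study.
means = {
    ("simple", "simple"): 1.0,
    ("simple", "complex"): 1.5,
    ("complex", "simple"): 2.0,
    ("complex", "complex"): 3.5,
}
print(factorial_effects(means))
```

A positive interaction term corresponds to the pattern on this slide: the effect of sequence complexity is greater when the individual syllables are complex.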
Key imaging results
• Additional complexity led to additional activations of brain regions associated with controlling sequential behavior.
• The left IFS region responded to added sequence complexity. The effect was larger when the syllables were complex.
• The pre-SMA showed preferential responses for additional stimulus complexity, including a main effect for syllable structure.
• The SMA was more active for overt speech production.
• Two regions were identified around the anterior insula: the more anterior responded to stimulus complexity, the more posterior to overt speech. Both were bilateral.
• The right inferior Cb responded to seq but showed no seq x syl interaction.
• The anterior thalamus / caudate were additionally engaged for complex sequences. There may be a shift from caudate to anterior thalamus with increasing syllable complexity.
Magnetoencephalography and speech production
[ figure from Fishbine, B. Los Alamos Research Quarterly, Spring 2003 ]
Magnetoencephalography and speech production
Syllable sequence production
• 1 subject, 4 experimental runs
• 180 trials x 3 conditions
• All trials used overt production
• Mean ISI = 6.5 s
• 1-, 2-, and 3-syllable non-lexical sequences (all CV syllables), e.g. bah, bah-tee, bah-tee-goo
Simultaneous recording of surface electromyography (EMG)
• Temporalis, orbicularis oris (lip), orbicularis oculi (eye)
• EMG signals were filtered, rectified, and analyzed to determine movement (production) onset
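The slide does not say how movement onset was extracted from the rectified EMG. One common approach, sketched here purely as an assumption, is to rectify the trace, smooth it with a short moving average, and mark the first post-baseline sample that exceeds the baseline mean plus k standard deviations. The function name, window length, and threshold factor are all hypothetical, and the band-pass filtering step mentioned on the slide is omitted.

```python
from statistics import mean, stdev

def emg_onset(samples, baseline_n, window=5, k=3.0):
    """Return the index of the first supra-threshold sample, or None.

    samples    : raw EMG samples (already band-pass filtered in practice)
    baseline_n : number of leading samples known to contain no movement
    window     : length of the trailing moving average (assumed value)
    k          : threshold = baseline mean + k * baseline SD (assumed value)
    """
    rectified = [abs(s) for s in samples]  # full-wave rectification

    # Trailing moving average of the rectified signal.
    smoothed, running = [], 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            running -= rectified[i - window]
        smoothed.append(running / min(i + 1, window))

    baseline = smoothed[:baseline_n]
    threshold = mean(baseline) + k * stdev(baseline)
    for i in range(baseline_n, len(smoothed)):
        if smoothed[i] > threshold:
            return i
    return None

# Synthetic trace: 50 quiet samples followed by a burst (not recorded data).
quiet = [0.05, -0.1, 0.08, -0.04, 0.1] * 10
burst = [1.0, -1.0] * 25
onset_index = emg_onset(quiet + burst, baseline_n=50)
print(onset_index)
```

With a 1 kHz sampling rate, the returned sample index converts directly to milliseconds after the start of the trace.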
Magnetoencephalography and speech production
Methods:
• 157-channel axial gradiometer system
• 1 kHz sampling rate, 200 Hz HPF, 60 Hz notch filter
• Digitized head shape and “marker coil” measurements used to coregister MEG and MRI coordinate frames (surface-based)
• Simultaneous recording of MEG, EMG, vocal response, and condition “triggers”
• MEG forward model: overlapping spheres approach (Huang, Mosher, and Leahy, 1999) from Brainstorm Toolbox
• Potential sources sampled at ~1500 cortical surface locations (vertices)
Magnetoencephalography and speech production
• Time period of primary interest: from GO signal to initiation of movement
• Approach: Analyze frequency spectrum during time period of interest
Magnetoencephalography and speech production
Significant components (p<0.05)
Between the GO signal and the onset of articulation, the strength of left prefrontal activity in the ~12-14 Hz frequency range (high alpha / low beta) significantly discriminates the three speech conditions.
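As an illustration of what "strength of activity in the ~12-14 Hz range" means for a single source time course, the sketch below sums the power of the discrete Fourier transform bins that fall inside a frequency band. It is a plain periodogram written with the standard library only. The actual spectral estimator and statistics used in the study are not given on the slides, and the function name and test signal are hypothetical.

```python
import cmath
import math

def band_power(signal, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes (scaled by 1/n^2) for bins in [f_lo, f_hi] Hz.

    signal : list of samples from the window of interest (e.g. GO to movement onset)
    fs     : sampling rate in Hz
    """
    n = len(signal)
    offset = sum(signal) / n  # remove the mean so DC does not leak into the band
    total = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            coeff = sum(
                (x - offset) * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)
            )
            total += abs(coeff) ** 2 / n ** 2
    return total

# Synthetic 1 s window at 1 kHz containing a 13 Hz oscillation (not MEG data).
fs = 1000
wave = [math.sin(2 * math.pi * 13 * t / fs) for t in range(fs)]
print(band_power(wave, fs, 12, 14))   # power concentrated in this band
print(band_power(wave, fs, 20, 22))   # essentially none here
```

Computing this band power per trial and per condition gives one number per trial that could then be compared across the 1-, 2-, and 3-syllable conditions.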
Models of serial order
• associative chaining models: represent order through item-item chaining
• positional models: represent order by explicit association of item with “slot” or with a time-varying signal
• ordinal models: represent order by relative activation levels of representative units
  - “item and order”: Grossberg (1978a, 1978b)
  - “competitive queuing”: Houghton (1990); Bullock and Rhodes (2003)
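The ordinal ("competitive queuing") idea can be sketched in a few lines: all items are active in parallel, the most active item wins the competition and is produced, and its plan activity is then suppressed so the next most active item can win. This is a deliberately discrete toy, assuming a hypothetical `competitive_queue` function and made-up activation values; the models cited above use continuous dynamics.

```python
def competitive_queue(plan):
    """Read out a parallel plan in order of decreasing activation.

    plan : dict mapping item -> activation level (a primacy gradient:
           earlier items are more active than later ones).
    Returns the list of items in the order they would be produced.
    """
    activity = dict(plan)  # copy so the caller's plan is left untouched
    produced = []
    while any(level > 0 for level in activity.values()):
        winner = max(activity, key=activity.get)  # competitive choice
        produced.append(winner)
        activity[winner] = 0.0                    # suppress the chosen item
    return produced

# Hypothetical primacy gradient over the three syllables of "ka-ru-ti".
print(competitive_queue({"ka": 0.9, "ru": 0.6, "ti": 0.3}))
```

Order is carried entirely by the relative activation levels, so reordering the dictionary entries does not change the output, whereas changing the gradient does.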
Prefrontal cell assemblies realize CQ-style planning
[ figure from Averbeck, Chafee, Crowe, and Georgopoulos (2002) ]
Modeling Starting Points
Based on our fMRI results, previous clinical studies, and previous theoretical models, the following proposals are made:
- The left inferior frontal sulcus area (IFS) codes for sub-syllabic phonological content (phonemes)
- Cells in this region are tuned to specific phonemes and specific syllable positions
- The pre-SMA codes for abstract syllable frames
- A “planning loop” through the basal ganglia coordinates appropriate selection of phonemic content between these two areas
- “Best matching” motor programs, in the Speech Sound Map, are chosen for performance via learned connections between the IFS representation and Speech Sound Map.
The model explains how arbitrary utterances that fall within the speaker’s language rules can be represented in the brain, and how such utterances can be produced from a finite library of learned motor chunks, activated in the proper order.
Left inferior frontal sulcus cells code for phonemic content of forthcoming sounds
Hypothesis: Order is represented by a primacy gradient within syllable positions
[ figure axes: phoneme, position ]
• Language specific phoneme set serves as a basis set for representing arbitrary speech sequences
• Input to specific slots can be learned, or can be the result of a parsing process ( e.g. when stimuli are read or heard )
• Use of positional representation effectively reduces the # of competing items in the queue ( improves SNR )
• The model uses 53 phonemes from the CELEX database ( Baayen et al., 1995 )
• 7 syllable positions with vowel (nucleus) in position 4 ( cf. Fudge, 1969; Hartley and Houghton, 1996 )
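A small sketch of the 7-position template may help. The slide fixes only the number of positions and the vowel in position 4; how consonants are aligned around the nucleus is an assumption here (onset consonants packed against the vowel in positions 1-3, coda consonants in positions 5-7). The one-letter-per-phoneme coding, the vowel set, and the function name `to_slots` are likewise hypothetical.

```python
VOWELS = set("aeiou")  # toy vowel inventory, one letter per phoneme (assumption)

def to_slots(phonemes, n_slots=7, nucleus=4):
    """Place one syllable's phonemes into a fixed positional template.

    Returns a list of length n_slots; empty positions are None.
    The single vowel goes to position `nucleus` (1-based).
    """
    vowel_idx = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if len(vowel_idx) != 1:
        raise ValueError("expected exactly one vowel per syllable")
    v = vowel_idx[0]
    onset, coda = phonemes[:v], phonemes[v + 1:]
    if len(onset) > nucleus - 1 or len(coda) > n_slots - nucleus:
        raise ValueError("syllable does not fit the template")

    slots = [None] * n_slots
    slots[nucleus - 1] = phonemes[v]
    # Assumed alignment: onset grows leftward from the nucleus ...
    for offset, p in enumerate(reversed(onset), start=1):
        slots[nucleus - 1 - offset] = p
    # ... and coda grows rightward from it.
    for offset, p in enumerate(coda, start=1):
        slots[nucleus - 1 + offset] = p
    return slots

print(to_slots(list("stra")))
print(to_slots(list("ta")))
```

Because each phoneme only competes with others assigned to the same position, a seven-phoneme syllable never produces a seven-way competition in a single queue, which is the SNR point made in the bullets above.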
IFS specification
• The model is described by a system of shunting differential equations (Grossberg, 1973, 1978a, 1978b)
• [ figure ] columns represent the same phoneme
• this figure shows a portion of a single positional zone
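The slides do not reproduce the model's equations. For orientation, the generic shunting (membrane) form from the Grossberg references is dx/dt = -A x + (B - x) E - (x + C) I, where E and I are the total excitatory and inhibitory inputs; its key property is that activity stays bounded in [-C, B] however large the inputs become. The sketch below integrates a single such cell with forward Euler. The parameter values and function names are assumptions, not taken from the model.

```python
def shunting_step(x, excite, inhibit, decay=1.0, upper=1.0, lower=0.5, dt=0.01):
    """One forward-Euler step of dx/dt = -A*x + (B - x)*E - (x + C)*I.

    decay = A, upper = B, lower = C (illustrative values, not the model's).
    """
    dx = -decay * x + (upper - x) * excite - (x + lower) * inhibit
    return x + dt * dx

def settle(excite, inhibit, steps=2000, **params):
    """Integrate from rest (x = 0) with constant inputs and return the final activity."""
    x = 0.0
    for _ in range(steps):
        x = shunting_step(x, excite, inhibit, **params)
    return x

# Analytic equilibrium for comparison: (B*E - C*I) / (A + E + I).
print(settle(2.0, 1.0))    # moderate inputs
print(settle(50.0, 0.0))   # strong excitation saturates below B = 1
print(settle(0.0, 50.0))   # strong inhibition saturates above -C = -0.5
```

The step size is kept small relative to A + E + I so the Euler scheme remains stable; much larger inputs would need a smaller `dt`.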
pre-SMA codes for abstract syllable frames ( cf. MacNeilage, 1998 )
• Very few syllable frames are required to represent all English syllables.
  - 8 frames / 96% of all syllables
• Choice of a syllable frame initiates the production process
  - Activates a serial chain of cells that represent the individual abstract syllable positions in that frame.
• Competitive choice is gated by lack of activity in IFS choice region
  - e.g. the system is “ready” for a new syllable
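An abstract syllable frame is what remains of a syllable once phoneme identity is discarded, for example "stra" becomes CCCV. The sketch below derives frames and measures how much of a syllable list the most frequent frames cover. It assumes one letter per phoneme and a toy vowel set, and the eight-syllable list is made up from the stimuli on earlier slides, so it does not reproduce the 8 frames / 96% figure, which comes from a full English lexicon.

```python
from collections import Counter

VOWELS = set("aeiou")  # toy vowel inventory, one letter per phoneme (assumption)

def frame(syllable):
    """Abstract CV frame of a syllable, e.g. 'stra' -> 'CCCV'."""
    return "".join("V" if ch in VOWELS else "C" for ch in syllable)

def frame_coverage(syllables, top_n):
    """Fraction of syllable tokens covered by the top_n most frequent frames."""
    counts = Counter(frame(s) for s in syllables)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(syllables)

# Toy token list built from the experimental stimuli (not a frequency-weighted lexicon).
tokens = ["ta", "ka", "ru", "ti", "stra", "kla", "stri", "splu"]
print(frame("stra"))
print(frame_coverage(tokens, top_n=2))
```

Run over a frequency-weighted syllable inventory, the same function would show how quickly coverage saturates as frames are added.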
BG Planning Loop selectively enables IFS syllable position zones ( cf. Brown, Bullock, and Grossberg, 2004 )
• Position-specific competitive “channels” through BG circuitry
• Channels compete through strong feed-forward inhibition via striatal interneurons
• Activating a channel disinhibits the thalamus, and gates on competition in IFS choice field.
• cf. “Action selection” models of basal ganglia ( e.g. Mink and Thach, 1993; Redgrave, 1999 )
Speech Sound Map encodes well-learned syllable and phoneme motor programs
• Plan cells get input from constituent phonemes at corresponding positions
• Hard-wired connections obey conservation of synaptic input
  - W_in = 1 ( total input weight onto each plan cell )
• Phoneme cells receive input from all possible syllable positions
• SSM plan cells compete based on phonological match ( cf. other GODIVA model plan layers )
• Choice represents largest matching syllable program
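One way to read the bullets above: each learned program spreads a total input weight of 1 evenly over its constituent (position, phoneme) pairs, so its input equals the fraction of its constituents currently active in the IFS choice field. The sketch below scores programs that way and, as an explicit assumption, breaks ties between fully matching programs in favor of the one with more phonemes. The program inventory, names, and tie-break rule are hypothetical, and the competitive dynamics of the actual model are not represented.

```python
def match_score(program, active):
    """Normalized input to a plan cell.

    program : dict position -> phoneme for a learned motor program
    active  : dict position -> phoneme currently active in the IFS choice field
    Each constituent carries weight 1/len(program), so the weights sum to 1
    and the score equals the matched fraction.
    """
    matched = sum(1 for pos, ph in program.items() if active.get(pos) == ph)
    return matched / len(program)

def choose_program(programs, active):
    """Pick the best-matching program; ties go to the larger program (assumption)."""
    return max(
        programs,
        key=lambda name: (match_score(programs[name], active), len(programs[name])),
    )

# Hypothetical learned programs, using positions 1-7 with the vowel in position 4.
programs = {
    "stra": {1: "s", 2: "t", 3: "r", 4: "a"},
    "tra": {2: "t", 3: "r", 4: "a"},
    "ta": {3: "t", 4: "a"},
}
planned = {1: "s", 2: "t", 3: "r", 4: "a"}
print(choose_program(programs, planned))
```

If the "stra" program had never been learned, the same plan would select "tra", leaving the initial "s" to be produced by a smaller program, which is one reading of "largest matching syllable program".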
Simulation (i)
[ figure; horizontal axes: Time ]
Simulation (ii)
[ figure; horizontal axes: Time ]
Beyond the basic model
• Communication Disorders
  - McNeil on DIVA (2004): “while this model addresses phenomena that may be relevant to differential diagnosis of motor speech disorders in its current stage of development it has not been extended to make claims about the relationship between disrupted processing and speech errors in motor speech disorders.”
  - The GODIVA model makes predictions about the effects of damage to a particular component or mapping between components.
  - Apraxia of Speech could be caused by destruction or inefficiency of IFS choice buffer to SSM plan cell connections ( selection of motor programs ).
  - Phonological paraphasias could be caused by damage within the IFS plan field or corresponding BG planning loop.
  - The components described in the model, which also generally show modulation due to stimulus complexity in the fMRI study, correspond strikingly well to those found to be abnormal in a morphometric study of the “KE” family.
Beyond the basic model
• Speech Errors in normal subjects
  - With the addition of noise, constrained errors arise in the phonological choice process in IFS.
  - GODIVA accounts for the syllable position constraint, and movement distance gradient.
  - GODIVA predicts a syllable onset effect as observed in error data, but perhaps not of enough magnitude.
  - Phonological similarity effects may be (partially) achieved by a phonotopic organization within IFS ( e.g. Kohonen, 1988 ) where competition ( lateral inhibition ) falls off with physical distance.