
Decoding Speech with ECoG – Computational Challenges

Presentation Transcript


  1. Decoding Speech with ECoG – Computational Challenges Chris Holdgraf Helen Wills Neuroscience Institute, UC Berkeley

  2. Challenge in neuroscience • Neuroscience is a very broad field, covering everything from gene expression, to single neurons firing, to activity across the whole human brain. • As such, one must have a wide range of knowledge and a diverse set of techniques. • This often makes it hard to have the best domain-specific knowledge.

  3. Mapping the world onto the brain • The trick is to fit some function that links brain activity with the outside world. • However, we also want that function to be scientifically meaningful.

  4. Neuroscience/Psychology and computation • Historically, there has been a focus on tightly-controlled experiments and simple questions. • Advances in imaging and electrophysiological methods have increased the quality and quantity of data.

  5. Electrocorticography – a blend of temporal and spatial resolution • ECoG involves the application of electrodes directly to the surface of the brain. • This avoids many of the problems with EEG, while retaining the rich temporal precision of the signal.

  6. Complex and noisy data require careful methods • ECoG is only possible in patients with some sort of pathology. Moreover, recording time is short. • Data-driven methods: bad data in = bad models out.

  7. Merging ECoG and Computational Methods • It might be possible to leverage the spatial precision of ECoG to decode the nature of speech processing in the brain.

  8. Challenge 1: GLMs in Neuroscience

  9. Computational Challenge #1: How to fit a model that is both interpretable and a good fit for the electrode’s response? • The parameter space grows increasingly complex as more hypotheses are entertained. • Oftentimes this is paired with a limited dataset, especially in ECoG. • Regularization and feature selection become very important.

  10. Want it simple? Use a GLM! Linear models allow us to predict some output with a model that is both interpretable and (relatively) easy to fit.
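
A minimal sketch of this idea in Python (the data, shapes, and variable names below are invented for illustration): predict an electrode's response as a weighted sum of stimulus features via ordinary least squares, then read the fitted weights directly as feature sensitivities.

```python
# Toy GLM: electrode response as a weighted sum of stimulus features.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 500, 20

X = rng.normal(size=(n_samples, n_features))             # stimulus features
true_w = rng.normal(size=n_features)                     # "ground truth" weights (toy)
y = X @ true_w + rng.normal(scale=0.5, size=n_samples)   # noisy electrode response

# Ordinary least squares: w = argmin ||Xw - y||^2
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Each fitted weight is interpretable: it says how strongly one stimulus
# feature drives this electrode's response.
print(np.corrcoef(true_w, w_hat)[0, 1])                  # near 1 on this toy data
```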

  11. One problem with this… • The brain assuredly does not vary linearly in response to inputs from the outside world.

  12. Basis functions • Instead, we can decompose an input stimulus into a combination of “basis functions”. • Basically, this entails a non-linear transformation of your stimulus, so that fitting linear models to brain activity makes more sense.
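
To make this concrete, here is a hedged sketch (the Gaussian basis and the sinusoidal toy "response" are arbitrary choices for illustration, not the actual stimulus model): expand a 1-D stimulus into a bank of Gaussian basis functions, then fit a plain linear model in the expanded space.

```python
# Non-linear feature expansion so a linear model can capture a
# non-linear stimulus-response relationship.
import numpy as np

def gaussian_basis(x, centers, width):
    """Project each stimulus value onto a bank of Gaussian bumps."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

rng = np.random.default_rng(1)
stim = rng.uniform(-3, 3, size=400)                        # raw 1-D stimulus
response = np.sin(stim) + rng.normal(scale=0.1, size=400)  # toy non-linear response

centers = np.linspace(-3, 3, 15)                 # basis function centers
Phi = gaussian_basis(stim, centers, width=0.5)   # non-linear transformation

# A linear fit in basis space approximates the non-linear mapping.
w, *_ = np.linalg.lstsq(Phi, response, rcond=None)
pred = Phi @ w
print(np.corrcoef(response, pred)[0, 1])         # close to 1
```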

  13. Exploring the brain through basis functions (figure: example word stimuli “dog”, “hat”, “car”, “man”)

  14. Fitting weights with gradient descent • We can find the values for these weights by following the typical least-squares regression approach. • Early stopping must be tuned carefully in order to regularize. • Full gradient descent • Coordinate gradient descent • Threshold gradient descent
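
As a rough illustration of one of these variants, here is full gradient descent on the least-squares objective with validation-based early stopping acting as the regularizer (the learning rate, patience, and synthetic data are assumptions for the sketch):

```python
# Full gradient descent on least squares, regularized by early stopping.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 50))
y = X @ rng.normal(size=50) + rng.normal(scale=1.0, size=300)

# Hold out data to decide when to stop.
X_tr, y_tr = X[:200], y[:200]
X_va, y_va = X[200:], y[200:]

w = np.zeros(X.shape[1])
lr = 1e-3
best_w, best_err, patience = w.copy(), np.inf, 0

for step in range(5000):
    grad = X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)   # gradient of 0.5 * MSE
    w -= lr * grad
    val_err = np.mean((X_va @ w - y_va) ** 2)
    if val_err < best_err:
        best_w, best_err, patience = w.copy(), val_err, 0
    else:
        patience += 1
        if patience > 20:   # stop once validation error stops improving
            break

print(step, best_err)
```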

  15. An application of the GLM for neural decoding

  16. Neural Decoding • If you can map stimuli onto brain activity, then you can also map brain activity onto stimuli. • Same approach, but now our inputs are values from the electrodes, and the output is sound. • Implications for neural prostheses and brain-computer interfaces, e.g. speech decoding.

  17. Decoding with a linear model (figure): reconstructed spectrogram = decoding model × high-gamma neural signal, evaluated against the original spectrogram.
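
A sketch of this decoding direction, using a simple ridge-regression solver and invented shapes rather than the actual pipeline: predict each spectrogram frequency band from high-gamma activity across electrodes, then compare the reconstruction against the original.

```python
# Linear decoding: high-gamma activity -> spectrogram, via ridge regression.
import numpy as np

rng = np.random.default_rng(3)
n_times, n_electrodes, n_freqs = 1000, 32, 16

high_gamma = rng.normal(size=(n_times, n_electrodes))   # neural features (toy)
mixing = rng.normal(size=(n_electrodes, n_freqs))
spectrogram = high_gamma @ mixing + rng.normal(scale=0.5, size=(n_times, n_freqs))

# Ridge solution: W = (X^T X + a I)^{-1} X^T Y, one column per frequency band.
alpha = 10.0
XtX = high_gamma.T @ high_gamma + alpha * np.eye(n_electrodes)
W = np.linalg.solve(XtX, high_gamma.T @ spectrogram)

reconstructed = high_gamma @ W
# Per-band correlation between original and reconstructed spectrograms.
for f in range(0, n_freqs, 4):
    print(f, np.corrcoef(spectrogram[:, f], reconstructed[:, f])[0, 1])
```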

  18. Decoding listened speech from high-gamma (60–200 Hz) activity. Pasley et al., PLoS Biology, 2012.
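
For context, one common recipe for a high-gamma amplitude envelope (a sketch, not necessarily the preprocessing used in the paper): band-pass the raw voltage between 60 and 200 Hz, then take the magnitude of the analytic signal. The sampling rate and input trace below are stand-ins.

```python
# High-gamma (60-200 Hz) amplitude envelope from a raw voltage trace.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000.0                                   # assumed sampling rate (Hz)
t = np.arange(0, 2, 1 / fs)
raw = np.random.randn(t.size)                 # stand-in for one ECoG channel

# Zero-phase 4th-order Butterworth band-pass, 60-200 Hz.
b, a = butter(4, [60 / (fs / 2), 200 / (fs / 2)], btype="bandpass")
band = filtfilt(b, a, raw)

envelope = np.abs(hilbert(band))              # analytic amplitude over time
print(envelope.mean(), envelope.std())
```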

  19. Speech Reconstruction from ECoG

  20. Challenge 2: From model output to language

  21. Challenge #2: Turn a noisy, variable spectrogram reconstruction into linguistic output. • Simpler methods are often not powerful enough to account for these small variations. • How to take advantage of temporal correlations between words / phonemes? • How to accomplish this without a ton of data?

  22. How to classify this output? (figure: candidate words “Town”, “Doubt”, “Property”, “Pencil”)

  23. From model output to language • Borrow ideas from the speech recognition literature. • Currently using Dynamic Time Warping to match output spectrograms to words.

  24. Dynamic Time Warping • Compute a dissimilarity matrix between every pair of elements • Find the optimal path that minimizes the overall accumulated distance • Effectively warps and realigns the two signals
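
A bare-bones implementation of exactly these three steps (Euclidean frame distance and the standard three-move recursion are assumptions; practical systems often add slope constraints):

```python
# Dynamic time warping distance between two feature sequences,
# e.g. spectrogram frames of shape (time, frequency).
import numpy as np

def dtw_distance(a, b):
    """a: (n, d) sequence, b: (m, d) sequence -> accumulated DTW cost."""
    n, m = len(a), len(b)
    # Dissimilarity matrix between every pair of frames.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Accumulated-cost table; D[i, j] is the best cost aligning a[:i] with b[:j].
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Extend the cheapest of the three allowed moves; this is what
            # warps and realigns the two signals.
            D[i, j] = cost[i - 1, j - 1] + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```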

  25. Current output workflow (figure): reconstructed spectrogram → DTW → best-matching word.
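
Hypothetical usage in this workflow, reusing the dtw_distance sketch above: score a reconstructed spectrogram against per-word templates and pick the word with the lowest accumulated cost (the templates and shapes here are invented for illustration).

```python
# Assumes dtw_distance from the previous sketch is in scope.
import numpy as np

rng = np.random.default_rng(4)
# Toy per-word spectrogram templates with varying durations.
templates = {w: rng.normal(size=(int(rng.integers(40, 60)), 16))
             for w in ["town", "doubt", "property", "pencil"]}
reconstruction = rng.normal(size=(50, 16))    # stand-in decoder output

best_word = min(templates, key=lambda w: dtw_distance(reconstruction, templates[w]))
print(best_word)
```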

  26. Where to go from here?

  27. Improving the decoder fit • Clever methods of dealing with finite and noisy datasets • Finding better features (basis functions), including interactions between features • Fitting more complicated models: nonlinear models are useful for engineering, but require much more data

  28. Turning output into reconstructed language • Leverage the spectro-temporal statistics of language • Focus on classification rather than arbitrary decoding (e.g., phoneme classes /w/, /ch/, /ks/, /g/)

  29. The “Big Data” Angle • Right now, the field of ECoG is in a bit of a transition period • Excitement around using computational methods, but many labs (including my own) don’t have the infrastructure and culture to tackle “big data” problems. • That said, we do have the potential to collect increasingly large datasets, once we know what to do with them.

  30. The Long-Term Goal Create a modeling framework that allows us to use ECoG to decode linguistic information.

  31. Fellow Decoders: Brian, Stéphanie, Gerv, Eddie, Peter. Special thanks to Frederic Theunissen and co., Jack Gallant and co., and the STRFLab Team.
