680 likes | 828 Views
Brain Machine Interfaces: Modeling Strategies for Neural Signal Processing. Jose C. Principe, Ph.D. Distinguished Professor ECE, BME Computational NeuroEngineering Laboratory Electrical and Computer Engineering Department University of Florida www.cnel.ufl.edu principe@cnel.ufl.edu.
E N D
Brain Machine Interfaces: Modeling Strategies for Neural Signal Processing Jose C. Principe, Ph.D. Distinguished Professor ECE, BME Computational NeuroEngineering Laboratory Electrical and Computer Engineering Department University of Florida www.cnel.ufl.edu principe@cnel.ufl.edu
Brain Machine Interfaces (BMI) • A man made device that either substitutes a sensory input to the brain, repairs functional communication between brain regions or translates intention of movement.
Types of BMIs • Sensory (Input BMI): Providing sensory input to form percepts when natural systems are damaged. • Ex: Visual, Auditory Prosthesis • Motor (Output BMI): Converting motor intent to a command output (physical device, damaged limbs) • Ex: Prosthetic Arm Control • Cognitive BMI: Interpret internal neuronal state to deliever feedback to the neural population. • Ex: Epilepsy, DBS Prosthesis Computational Neuroscience and Technology developments are playing a larger role in the development of each of these areas.
General Architecture BCI (BMI)bypasses the brain’s normal pathways of peripheral nerves (and muscles) J.R. Wolpaw et al. 2002
The Fundamental Concept BRAIN MACHINE INTENT ACTION Decoding PERCEPT STIMULUS Coding Neural Interface Physical Interface Need to understand how brain processes information.
Levels of Abstraction for Neurotechnology • Brain is an extremely complex system • 1012 neurons • 1015 synapses • Specific interconnectivity
Tapping into the Nervous System • The choice and availability of brain signals and recording methods can greatly influence the ultimate performance of the BMI. The level of BMI performance may be attributed to selection of electrode technology, choice of model, and methods for extracting rate, frequency, or timing codes.
Coarse(mm) http://ida.first.fhg.de/projects/bci/bbci_official/
Florida Multiscale Signal Acquisition Develop a experimental paradigm with a nested hierarchy for studying neural population dynamics. Least Invasive EEG NRG IRB Approval for Human Studies ECoG NRG IACUC Approval for Animal Studies Microelectrodes Highest Resolution
Common BMI-BCI Methods • BMIs --- Invasive, work with intention of movement • Spike trains, field potentials, ECoG • Very specific, potentially better performance • BCIs --- Noninvasive, subjects must learn how to control their brain activity • EEG • Very small bandwidth
Computational NeuroScience • Integration of probabilistic models of information processing with the neurophysiological reality of brain anatomy, physiology and purpose. • Need to abstract the details of the “wetware” and ask what is the purpose of the function. Then quantify it in mathematical terms. • Difficult but very promising. One issue is that biological evolution is a legacy system! • BMI research is an example of a computational neuroscience approach.
How to put it together? • NeoCortical Brain Areas Related to Movement Posterior Parietal (PP) – Visual to motor transformation Premotor (PM) and Dorsal Premotor (PMD) - Planning and guidance (visual inputs) Primary Motor (M1) – Initiates muscle contraction
Ensemble Correlations – Local in Time – are Averaged with Global Models
Computational Models of Neural Intent • Two different levels of neurophysiology realism • Black Box models – no realism, function relation between input desired response • Generative Models – minimal realism, state space models using neuroscience elements
Signal Processing Approaches with Black Box Modeling Accessing 2 types of signals (cortical activity and behavior) leads us to a general class of I/O models. Data for these models are rate codes obtained by binning spikes on 100 msec windows. • Optimal FIR Filter – linear, feedforward • TDNN – nonlinear, feedforward • Multiple FIR filters – mixture of experts • RMLP – nonlinear, dynamic
Linear Model (Wiener-Hopf solution) Consider a set of spike counts from M neurons, and a hand position vector dC (C is the output dimension, C = 2 or 3). The spike count of each neuron is embedded by an L-tap discrete time-delay line. Then, the input vector for a linear model at a given time instance n is composed as x(n) = [x1(n), x1(n-1) … x1(n-L+1), x2(n) … xM(n-L+1)]T, xLM, where xi(n-j) denotes the spike count of neuron i at a time instance n-j. A linear model estimating hand position at time instance n from the embedded spike counts can be described as where yc is the c-coordinate of the estimated hand position by the model, wji is a weight on the connection from xi(n-j) to yc, and bc is a bias for the c-coordinate.
x1(n) yx(n) … … … z-1 z-1 z-1 z-1 yy(n) xM(n) yz(n) Linear Model (Wiener-Hopf solution) In a matrix form, we can rewrite the previous equation as where y is a C-dimensional output vector, and W is a weight matrix of dimension (LM+1)C. Each column of W consists of [w10c, w11c, w12c…, w1L-1c, w20c, w21c…, wM0c, …, wML-1c]T.
Linear Model (Wiener-Hopf solution) For the MIMO case, the weight matrix in the Wiener filter system is estimated by R is the correlation matrix of neural spike inputs with the dimension of (LM)(LM), where rij is the LL cross-correlation matrix between neurons i and j (i ≠ j), and rii is the LL autocorrelation matrix of neuron i. P is the (LM)C cross-correlation matrix between the neuronal bin count and hand position, where pic is the cross-correlation vector between neuron i and the c-coordinate of hand position. The estimated weights WWiener are optimal based on the assumption that the error is drawn from white Gaussian distribution and the data are stationary.
Linear Model (Wiener-Hopf solution) The predictor WWiener minimizes the mean square error (MSE) cost function, Each sub-block matrix rij can be further decomposed as where rij() represents the correlation between neurons i and j with time lag . Assuming that the random process xi(k) is ergodic for all i, we can utilize the time average operator to estimate the correlation function. In this case, the estimate of correlation between two neurons, rij(m-k), can be obtained by
Linear Model (Wiener-Hopf solution) The cross-correlation vector pic can be decomposed and estimated in the same way, substituting xj by the desired signal cj. From the equations, it can be seen that rij(m-k) is equal to rji(k-m). Since these two correlation estimates are positioned at the opposite side of the diagonal entries of R, the equality leads to a symmetric R. The symmetric matrix R, then, can be inverted effectively by using the Cholesky factorization. This factorization reduces the computational complexity for the inverse of R from O(N3) using Gaussian elimination to O(N2) where N is the number of parameters.
Optimal Linear Model • Normalized LMS with weight decay is a simple starting point. • Four multiplies, one divide and two adds per weight update • Ten tap embedding with 105 neurons • For 1-D topology contains 1,050 parameters (3,150) • Alternatively, the Wiener solution
Time-Delay Neural Network (TDNN) • The first layer is a bank of linear filters followed by a nonlinearity. • The number of delays to span I second • y(n)= Σ wf(Σwx(n)) • Trained with backpropagation • Topology contains a ten tap embedding and five hidden PEs– 5,255 weights (1-D) Principe, UF
Multiple Switching Local Models • Multiple adaptive filters that compete to win the modeling of a signal segment. • Structure is trained all together with normalized LMS/weight decay • Needs to be adapted for input-output modeling. • We selected 10 FIR experts of order 10 (105 input channels) d(n)
Recurrent Multilayer Perceptron (RMLP) – Nonlinear “Black Box” • Spatially recurrent dynamical systems • Memory is created by feeding back the states of the hidden PEs. • Feedback allows for continuous representations on multiple timescales. • If unfolded into a TDNN it can be shown to be a universal mapper in Rn • Trained with backpropagation through time
Motor Tasks Performed Data Task 1 • 2 Owl monkeys – Belle, Carmen • 2 Rhesus monkeys – Aurora, Ivy • 54-192 sorted cells • Cortices sampled: PP, M1, PMd, S1, SMA • Neuronal activity rate and behavior is time synchronized and downsampled to 10Hz Task 2
Model Building Techniques • Train the adaptive system with neuronal firing rates (100 msec) as the input and hand position as the desired signal. • Training - 20,000 samples (~33 minutes of neuronal firing) • Freeze weights and present novel neuronal data. • Testing - 3,000 samples – (5 minutes of neuronal firing)
Signal to error ratio (dB) Correlation Coefficient (average) (max) (average) (max) LMS 0.8706 7.5097 0.6373 0.9528 Kalman 0.8987 8.8942 0.6137 0.9442 TDNN 1.1270 3.6090 0.4723 0.8525 Local Linear 1.4489 23.0830 0.7443 0.9748 RNN 1.6101 32.3934 0.6483 0.9852 Results (Belle) • Based on 5 minutes of test data, computed over 4 sec windows (training on 30 minutes)
Physiologic Interpretation • When the fitting error is above chance, a sensitivity analysis can be performed by computing the Jacobian of the output vector with respect to each neuronal input i • This calculation indicates which inputs (neurons) are most important for modulating the output/trajectory of the model.
Computing Sensitivities Through the Models Identify the neurons that affect the output the most. Feedforward Linear Eq. Feedforward RMLP Eqs. General form of Linear Sensitivity General form of RMLP Sensitivity
Data Analysis : The Effect of Sensitive Neurons on Performance Decay trend appears in all animals and behavioral paradigms
Directional Tuning vs. Sensitivity of ranked cells Tuning Sensitivity Significance: Sensitivity analysis through trained models automatically delivers deeply tuned cells that span the space.
Reaching Movement Segmentation How does each cortical area contribute to the reconstruction of this movement?
Cortical Contributions Belle Day 2 Train 15 separate RMLPs with every combination of cortical input.
Is there enough information in spike trains for modeling movement? • Analysis is based on the time embedded model • Correlation with desired is based on a linear filter output for each neuron • Utilize a non-stationary tracking algorithm • Parameters are updated by LMS • Build a spatial filter • Adaptive in real time • Sparse structure based on regularization for enables selection
z-1 z-1 z-1 z-1 Architecture x1(n) w11 y1(n) // c1 w1L y2(n) c2 … … xM(n) cM wM1 yM(n) // wML Adapted by on-line LAR (Kim et. al., MLSP, 2004) Adapted by LMS
x1 k . . . y r= y-X = y Find argmaxi |xiTr| i=0 xj xj xk r= y-X = y-xjj Adjustj s.t. k, |xkTr|=|xiTr| r= y-(xjj+ xkk) Adjustj & k s.t. q, |xqTr|=|xkTr|=|xiTr| j j Training Algorithms • Tap weights for every time lag is updated by LMS • Then, the spatial filter coefficients are obtained by on-line version of least angle regression (LAR) (Efron et. al. 2004)
Application to BMI Data – Neuronal Subset Selection Hand Trajectory (z) Early Part Neuronal Channel Index Late Part
Generative Models for BMIs • Use partial information about the physiological system, normally in the form of states. • They can be either applied to binned data or to spike trains directly. • Here we will only cover the spike train implementations. Difficulty of spike train Analysis: Spike trains are point processes, i.e. all the information is contained in the timing of events, not in the amplitude fo the signals!
Goal Build an adaptive signal processing framework for BMI decoding in the spike domain. • Features of Spike domain analysis • Binning window size is not a concern • Preserve the randomness of the neuron behavior. • Provide more understanding of neuron physiology (tuning) and interactions at the cell assembly level • Infer kinematics online • Deal with nonstationary • More computation with millisecond time resolution
Time-series model State cont. observ. Prediction Updating P(state|observation) Recursive Bayesian Approach
Recursive Bayesian approach • State space representation First equation (system model) defines a first order Markov process. Second equation (observation model) defines the likelihood of the observations p(zt|xt) . The problem is completely defined by the prior distribution p(x0). Although the posterior distribution p(x0:t|u1:t,z1:t) constitutes the complete solution, the filtering density p(xt|u1:t, z1:t) is normally used for on-line problems. The general solution methodology is to integrate over the unknown variables (marginalization).
Recursive Bayesian approach • There are two stages to update the filtering density: • Prediction (Chapman Kolmogorov) System model p(xt|xt-1) propagates into the future the posterior density • Update Uses Bayes rule to update the filtering density. The following equations are needed in the solution.
Kalman filter for BMI decoding For Gaussian noises and linear prediction and observation models, there is an analytic solution called the Kalman Filter. Continuous Observation Linear Neuron tuning function Kinematic State Firing rate Linear Gaussian Prediction P(state|observation) Updating [Wu et al. 2006]
Particle Filter for BMI decoding In general the integrals need to be approximated by sums using Monte Carlo integration with a set of samples drawn from the posterior distribution of the model parameters. Continuous Observation Exponential Neuron tuning function Kinematic State Firing rate Linear nonGaussian Prediction P(state|observation) Updating [Brockwell et al. 2004]
v x k-1 k-1 x k F k ( ) = , = ( ) , n z H x k k k k State estimation framework for BMI decoding in spike domain Tuning function Neural Tuning function Kinematics state Multi-spike trains observation Decoding Kinematic dynamic model Key Idea: work with the probability of spike firing which is a continuous random variable
Adaptive algorithm for point processes Poisson Model nonlinear Point process Neuron tuning function Kinematic State spike train Linear Gaussian Prediction P(state|observation) Updating [Brown et al. 2001]
Monte Carlo Sequential estimation for point process nonlinear Point process Neuron tuning function Kinematic State spike train nonLinear nonGaussian Prediction P(state|observation) Updating Sequential Estimate PDF [Wang et al. 2006]