Population Coding
Alexandre Pouget
Okinawa Computational Neuroscience Course
Okinawa, Japan, November 2004
Outline
• Definition
• The encoding process
• Decoding population codes
• Quantifying information: Shannon and Fisher information
• Basis functions and optimal computation
Receptive field
[Figure: a stimulus moving through a neuron's receptive field; the response code is the number of spikes (here 10); the encoded variable s is the direction of motion]
[Figure: the same stimulus presented on four trials evokes variable spike counts (trial 1: 10, trial 2: 7, trial 3: 4, trial 4: 8)]
[Figure: the tuning curve fi(s) gives the mean activity as a function of the encoded variable s, e.g. fi(0) at s = 0; the noise variance σi(s)² can depend on the input, e.g. σi(0)² around fi(0)]
Tuning curves and noise
Examples of variables encoded with tuning curves: retinal location, orientation, depth, color, eye movements, arm movements, numbers, etc.
Population Codes
[Figure: left, bell-shaped tuning curves (activity vs direction, deg); right, the noisy pattern of activity r across the population (activity vs preferred direction, deg), from which the unknown stimulus s must be inferred]
Bayesian approach
We want to recover P(s|r). Using Bayes' rule, we have:

P(s|r) = P(r|s) P(s) / P(r)

where P(r|s) is the likelihood of s, P(s) is the prior distribution over s, P(r) is the prior distribution over r, and P(s|r) is the posterior distribution over s.
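To make the recipe concrete, here is a minimal numerical sketch of recovering the posterior on a grid. Everything in it is an assumption made for illustration: Gaussian-shaped tuning curves, independent Poisson noise, a flat prior, and the helper names (`tuning`, `s_grid`) are invented here, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: 20 neurons with Gaussian tuning over direction (deg).
prefs = np.linspace(-180, 180, 20, endpoint=False)

def tuning(s, width=40.0, gain=50.0):
    """Mean response f_i(s) of every neuron to direction s."""
    return gain * np.exp(-0.5 * ((s - prefs) / width) ** 2)

s_true = 30.0
r = rng.poisson(tuning(s_true))        # one observed pattern of activity

# Bayes' rule on a grid with a flat prior: p(s|r) ∝ p(r|s).
# Independent Poisson likelihood: log p(r|s) = Σ_i [r_i log f_i(s) − f_i(s)] + const
s_grid = np.linspace(-180, 180, 721)
logL = np.array([np.sum(r * np.log(tuning(s)) - tuning(s)) for s in s_grid])
post = np.exp(logL - logL.max())
post /= post.sum()

print("posterior mean:", np.sum(s_grid * post))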
Bayesian approach
If we are to do any type of computation with population codes, we need a probabilistic model of how the activities are generated, p(r|s), i.e., we need to model the encoding process.
Activity distribution
[Figure: conditional activity distributions P(ri|s) for different stimulus values, e.g. P(ri|s=-60) and P(ri|s=0)]
Tuning curves and noise
The activity (# of spikes per second) of a neuron can be written as:

ri = fi(s) + ni

where fi(s) is the mean activity of the neuron (the tuning curve) and ni is a noise term with zero mean. If the noise is Gaussian with variance σi², then:

P(ri|s) = (1/√(2πσi²)) exp(−(ri − fi(s))²/(2σi²))
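A short sketch of sampling one trial from this encoding model (additive Gaussian noise around the tuning curve); the tuning shape and all parameters are illustrative assumptions, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
prefs = np.linspace(-180, 180, 16, endpoint=False)   # preferred directions (deg)
sigma = 5.0                                          # fixed noise std

def f(s, width=40.0, gain=50.0):
    """Gaussian-shaped tuning curves f_i(s), one value per neuron."""
    return gain * np.exp(-0.5 * ((s - prefs) / width) ** 2)

s = 0.0
r = f(s) + rng.normal(0.0, sigma, size=prefs.size)   # r_i = f_i(s) + n_i
```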
Probability distributions and activity
• The noise is a random variable which can be characterized by a conditional probability distribution, P(ni|s).
• The distributions of the activity, P(ri|s), and of the noise differ only by their means (E[ni] = 0, E[ri] = fi(s)).
Examples of activity distributions
• Gaussian noise with fixed variance σ²: P(ri|s) = (1/√(2πσ²)) exp(−(ri − fi(s))²/(2σ²))
• Gaussian noise with variance equal to the mean: the same form with σ² replaced by fi(s)
Poisson distribution:

P(ri|s) = e^(−fi(s)) fi(s)^ri / ri!

The variance of a Poisson distribution is equal to its mean.
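A quick empirical check of the variance-equals-mean property (the rate is an arbitrary illustrative value):

```python
import numpy as np

rng = np.random.default_rng(2)
mean_rate = 12.0
counts = rng.poisson(mean_rate, size=100_000)  # simulated spike counts
print(counts.mean(), counts.var())             # both come out near 12
```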
Comparison of Poisson vs Gaussian noise with variance equal to the mean
[Figure: probability vs activity (spikes/sec) for the two noise models]
Population of neurons
Gaussian noise with fixed variance σ², independent across neurons:

P(r|s) = ∏i P(ri|s) = (1/(2πσ²)^(N/2)) exp(−∑i (ri − fi(s))²/(2σ²))
Population of neurons
Gaussian noise with arbitrary covariance matrix Σ:

P(r|s) = (1/((2π)^(N/2) |Σ|^(1/2))) exp(−(1/2) (r − f(s))ᵀ Σ⁻¹ (r − f(s)))
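A numpy-only sketch of evaluating this population likelihood (in log form) for a correlated covariance matrix; f_s, Sigma, and the drawn response r are made-up placeholders for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 8
f_s = np.full(N, 20.0)                           # mean activity f(s) for some s
Sigma = 4.0 * np.eye(N) + 0.5 * np.ones((N, N))  # correlated noise covariance

# Draw one population response r ~ N(f(s), Sigma)
r = rng.multivariate_normal(f_s, Sigma)

# log p(r|s) = -1/2 (r - f(s))^T Sigma^-1 (r - f(s)) - 1/2 log|2π Sigma|
d = r - f_s
loglik = -0.5 * d @ np.linalg.solve(Sigma, d) \
         - 0.5 * np.linalg.slogdet(2 * np.pi * Sigma)[1]
print("log-likelihood:", loglik)
```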
Outline
• Definition
• The encoding process
• Decoding population codes
• Quantifying information: Shannon and Fisher information
• Basis functions and optimal computation
Population Codes
[Figure: left, tuning curves (activity vs direction, deg); right, the pattern of activity r (activity vs preferred direction, deg) for an unknown stimulus s]
Nature of the problem
In response to a stimulus with unknown value s, you observe a pattern of activity r. What can you say about s given r?
• Bayesian approach: recover p(s|r) (the posterior distribution)
• Estimation theory: come up with a single value estimate ŝ from r
Estimation Theory
[Figure: the encoder (nervous system) turns the stimulus into an activity vector r (activity vs preferred orientation); a decoder turns r into an estimate ŝ]
[Figure: over many trials (1 to 200), the same stimulus evokes different activity patterns r1, r2, …, r200 (activity vs preferred retinal location); each is passed through the decoder, yielding a distribution of estimates]
Estimation Theory
[Figure: encoder and decoder, activity vs preferred retinal location]
• If E[ŝ] = s, the estimate is said to be unbiased.
• If the variance of ŝ is as small as possible, the estimate is said to be efficient.
Estimation theory
• A common measure of decoding performance is the mean square error between the estimate and the true value, E[(ŝ − s)²].
• This error can be decomposed as:

E[(ŝ − s)²] = (E[ŝ] − s)² + Var(ŝ) = bias² + variance
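An empirical check of this decomposition with a toy decoder whose bias and spread are set by hand (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
s_true = 30.0
# Toy decoder: estimates with bias 2 and standard deviation 5.
s_hat = s_true + rng.normal(2.0, 5.0, size=100_000)

mse = np.mean((s_hat - s_true) ** 2)
bias2 = (s_hat.mean() - s_true) ** 2
var = s_hat.var()
print(mse, bias2 + var)   # the two agree up to sampling error
```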
Efficient Estimators
The smallest achievable variance for an unbiased estimator is known as the Cramér-Rao bound, σCR². An efficient estimator is such that σŝ² = σCR². In general: σŝ² ≥ σCR².
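For reference, the bound is set by the Fisher information (the subject of a later part of the outline); this is the standard statement:

```latex
\sigma_{CR}^2 = \frac{1}{I_F(s)},
\qquad
I_F(s) = -\,E\!\left[\frac{\partial^2 \log P(\mathbf{r}\,|\,s)}{\partial s^2}\right]
```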
Estimation Theory
[Figure: encoder (nervous system), activity vector r (activity vs preferred orientation), decoder]
Examples of decoders follow.
Voting Methods
Optimal Linear Estimator:

ŝ = ∑i wi ri
Linear Estimators
For ŝ = ∑i wi ri with r and s zero mean (X and Y must be zero mean), the weights minimizing the mean square error are w = Cov(r)⁻¹ Cov(r, s). In effect: trust cells that have small variances and large covariances with the encoded variable.
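A least-squares sketch of fitting the optimal linear estimator from simulated trials; the population, noise level, and variable names are assumptions made for this example, not the lecture's:

```python
import numpy as np

rng = np.random.default_rng(5)
prefs = np.linspace(-90, 90, 12)

S = rng.uniform(-60, 60, size=5000)                        # training stimuli
F = 50 * np.exp(-0.5 * ((S[:, None] - prefs) / 40) ** 2)   # mean responses
R = F + rng.normal(0.0, 5.0, size=F.shape)                 # noisy trials

# Zero-mean the data, then solve w = Cov(R)^-1 Cov(R, S):
Rc, Sc = R - R.mean(axis=0), S - S.mean()
w = np.linalg.solve(Rc.T @ Rc, Rc.T @ Sc)                  # OLE weights

s_hat = Rc @ w + S.mean()                                  # decoded estimates
print("RMSE:", np.sqrt(np.mean((s_hat - S) ** 2)))
```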
Voting Methods
• Optimal Linear Estimator: ŝ = ∑i wi ri
• Center of Mass: ŝ = ∑i si ri / ∑j rj — linear in ri/∑j rj, with the weights set to the preferred values si
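A direct implementation of the center-of-mass readout (here `prefs` holds the hypothetical preferred values si):

```python
import numpy as np

def center_of_mass(r, prefs):
    """s_hat = sum_i s_i r_i / sum_j r_j (assumes non-negative activity)."""
    return np.dot(prefs, r) / np.sum(r)

# e.g. center_of_mass(r, prefs) with r from the encoding sketch above
```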
Center of Mass / Population Vector
• The center of mass is optimal (unbiased and efficient) iff the tuning curves are Gaussian with a zero baseline and uniformly distributed, and the noise follows a Poisson distribution.
• In general, the center of mass has a large bias and a large variance.
Population Vector

P = ∑i ri Pi

[Figure: the preferred-direction vectors Pi, weighted by the activities ri, sum to the population vector P, whose direction estimates s]
Voting Methods
• Optimal Linear Estimator
• Center of Mass
• Population Vector: linear in ri with the weights set to the preferred directions Pi, followed by a nonlinear step (extracting the direction of P)
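A sketch of the population vector for a circular variable: the linear sum of preferred-direction unit vectors weighted by activity, then the nonlinear angle extraction. Function and argument names are invented for the example.

```python
import numpy as np

def population_vector(r, prefs_deg):
    """Direction (deg) of P = sum_i r_i * P_i, with P_i unit vectors."""
    a = np.deg2rad(np.asarray(prefs_deg))
    x = np.dot(r, np.cos(a))              # linear in r_i, weights cos/sin of P_i
    y = np.dot(r, np.sin(a))
    return np.rad2deg(np.arctan2(y, x))   # nonlinear step: angle of P
```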
Population Vector
Typically, the population vector is not the optimal linear estimator.
Population Vector
• The population vector is optimal iff the tuning curves are cosine and uniformly distributed, and the noise follows a normal distribution with fixed variance.
• In most cases, the population vector is biased and has a large variance.
Maximum Likelihood
The maximum likelihood estimate is the value of s maximizing the likelihood P(r|s) (the noise distribution). Therefore, we seek ŝ such that:

ŝ = argmax_s P(r|s)

ŝML is (asymptotically) unbiased and efficient.
Maximum Likelihood
[Figure: left, tuning curves (activity vs direction, deg); right, the observed pattern of activity r (activity vs preferred direction, deg)]
Maximum Likelihood
[Figure: the template (expected population profile) is slid across the observed pattern of activity (activity vs preferred direction, deg) to find the best fit]
ML and template matching
Maximum likelihood is a template-matching procedure, BUT the metric used is not always the Euclidean distance: it depends on the noise distribution.
Maximum Likelihood
The maximum likelihood estimate is the value of s maximizing the likelihood P(r|s). Therefore, we seek ŝ such that:

ŝ = argmax_s P(r|s)
Maximum Likelihood
If the noise is Gaussian and independent with fixed variance σ²:

P(r|s) = ∏i (1/√(2πσ²)) exp(−(ri − fi(s))²/(2σ²))

Therefore maximizing P(r|s) amounts to minimizing the squared distance to the template, and the estimate is given by:

ŝ = argmin_s ∑i (ri − fi(s))²

Distance measure: template matching with the Euclidean distance.
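Under these Gaussian assumptions, ML decoding reduces to a grid search for the template with the smallest Euclidean distance to the data. In this sketch, `f` and `s_grid` stand for whatever tuning model and stimulus grid you choose:

```python
import numpy as np

def ml_template_match(r, s_grid, f):
    """ML under iid Gaussian noise: pick s whose template f(s) is closest to r."""
    sse = np.array([np.sum((r - f(s)) ** 2) for s in s_grid])
    return s_grid[int(np.argmin(sse))]
```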
Maximum Likelihood
[Figure: the best-fitting template overlaid on the pattern of activity (activity vs preferred direction, deg)]