Computational and Physiological Models Part 1 Ekaterina Lomakina Computational Psychiatry Seminar: Computational Neuropharmacology 7 March, 2014
Entropy and the theory of life
“The general struggle for existence of animate beings is not a struggle for raw materials – these, for organisms, are air, water and soil, all abundantly available – nor for energy which exists in plenty in any body in the form of heat, but a struggle for [negative] entropy, which becomes available through the transition of energy from the hot sun to the cold earth.” Ludwig Boltzmann, 1875
“[...] if I had been catering for them [physicists] alone I should have let the discussion turn on free energy instead. It is the more familiar notion in this context. But this highly technical term seemed linguistically too near to energy for making the average reader alive to the contrast between the two things.” Erwin Schrödinger, 1944
Let’s go a bit more formally – 1
• The defining characteristic of biological systems is that they maintain their states and form in the face of a constantly changing environment.
• Mathematically, this means that the probability distribution over the sensory states a biological agent can occupy must have low entropy.
• Entropy is also the average self-information, or surprise. Low average surprise means that the agent is likely to be in one of very few states, so observing any of these states causes little surprise (both emotionally and mathematically).
• To stay ‘alive’, biological agents must therefore minimize their long-term average surprise, which keeps their sensory entropy low.
Minimize entropy → minimize surprise
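The link between entropy and surprise can be checked numerically. The following is a minimal sketch, assuming a discrete set of sensory states; the two example distributions are made up for illustration.

```python
import math

def surprise(p):
    """Self-information (surprise) of an outcome with probability p, in nats."""
    return -math.log(p)

def entropy(dist):
    """Entropy is the expected surprise under the distribution itself."""
    return sum(p * surprise(p) for p in dist if p > 0)

# Hypothetical agents: one occupies almost always the same state,
# the other is equally likely to be found in any of four states.
peaked = [0.97, 0.01, 0.01, 0.01]
uniform = [0.25, 0.25, 0.25, 0.25]

h_peaked = entropy(peaked)
h_uniform = entropy(uniform)
print(f"peaked: {h_peaked:.3f} nats, uniform: {h_uniform:.3f} nats")
```

The agent that concentrates its probability on few states has the lower entropy, which is the sense in which keeping average surprise low keeps sensory entropy low.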
Let’s go a bit more formally – 2
• So the system must avoid surprise. But how? It cannot know everything that is going to happen…
• The system can only evaluate what is available to it: its sensory experience and its own properties, i.e. energy efficiency and robustness.
• That is where free energy comes into play.
• Thermodynamic (Helmholtz) free energy is the work obtainable from a closed system at constant temperature: FE = Energy − Temperature × Entropy.
• Statistical free energy is the negative sum of the predictive accuracy of the model (the energy of the model) and the entropy of the model.
• As shown next, free energy is an upper bound on surprise that, unlike surprise itself, the system can optimize.
Minimize entropy → minimize surprise → minimize free energy
And a bit more mathematically – 1
Let’s look at the log form of Bayes’ formula, where s(t) are the sensory states and ϑ are the representational parameters of the agent:
[Equation, with one term marked “Surprise!”]
If we knew the true generative model of the sensorium, we could compute the model evidence, and hence the surprise. The model evidence is sometimes also called the evidence for the agent’s own existence.
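The log form of Bayes’ formula referred to here can be written as follows. This is a sketch in the slide’s symbols s and ϑ, with the conditioning on a model left implicit; it is not necessarily the exact expression shown on the original slide.

```latex
\ln p(\vartheta \mid s) = \ln p(s \mid \vartheta) + \ln p(\vartheta) - \ln p(s),
\qquad
\text{surprise} = -\ln p(s) = -\ln \int p(s, \vartheta)\, d\vartheta
```

The integral over ϑ in the last term is what makes the evidence, and therefore the surprise, hard to compute directly.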
And a bit more mathematically – 2
• However, the true model is almost always unavailable, or at least computationally expensive to evaluate, so the brain is unlikely to be able to perform such an operation.
• Instead, one can propose a simpler model q and optimize it to be as similar to the true model as possible.
• D is the KL divergence between the proposed model q and the true model p; it is never negative.
• The smaller D becomes (the closer q gets to p), the lower F becomes and the closer F gets to the surprise (the negative log model evidence). Thus F is an upper bound on surprise.
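The bound described here can be written out as follows. This is a reconstruction, assuming that D is the KL divergence from q(ϑ) to the true posterior p(ϑ | s), which plays the role of the “true model p” on this slide.

```latex
F(q) = \int q(\vartheta)\, \ln \frac{q(\vartheta)}{p(s, \vartheta)}\, d\vartheta
     = -\ln p(s) + D\big(q(\vartheta) \,\|\, p(\vartheta \mid s)\big)
     \;\ge\; -\ln p(s),
\qquad D \ge 0
```

Equality holds exactly when q matches the posterior, in which case D vanishes and F equals the surprise.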
And a bit more mathematically – 3
• However, we cannot optimize D directly, as it requires knowledge of the true model p. But after some magic…
• The resulting expression can be optimized efficiently using variational Bayes techniques.
• These (usually) provide fast and efficient update and decision rules that are likely to be implementable in the brain, unlike sampling or numerical integration.
[Equation terms labelled: expected energy (accuracy); entropy (complexity)]
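A small numerical check may make the decomposition concrete. The sketch below assumes a toy problem with two discrete hidden causes and made-up probabilities; it computes free energy as expected energy minus entropy, which needs only the joint p(s, ϑ) and q, and compares it with surprise plus KL divergence.

```python
import math

# Toy generative model (all numbers are illustrative assumptions).
prior = [0.7, 0.3]        # p(theta) for two hidden causes
likelihood = [0.2, 0.9]   # p(s | theta) for the one observed sensation s

joint = [pr * lk for pr, lk in zip(prior, likelihood)]   # p(s, theta)
evidence = sum(joint)                                    # p(s)
posterior = [j / evidence for j in joint]                # p(theta | s)
surprise = -math.log(evidence)

def free_energy(q):
    """F = expected energy - entropy; needs only q and the joint, not p(s)."""
    expected_energy = -sum(qi * math.log(ji) for qi, ji in zip(q, joint) if qi > 0)
    entropy = -sum(qi * math.log(qi) for qi in q if qi > 0)
    return expected_energy - entropy

def kl(q, p):
    """KL divergence D(q || p) for discrete distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

q_guess = [0.5, 0.5]
f_guess = free_energy(q_guess)    # above the surprise
f_opt = free_energy(posterior)    # equals the surprise
print(f_guess, f_opt, surprise)
```

With q set to the exact posterior the bound is tight; any other q gives a larger F, and the gap is exactly the KL divergence.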
And one last bit of formulas
• Free energy can be minimized in two ways:
• By changing the mental representation (optimizing it such that it better explains data and becomes more compact).
• By performing actions which would reduce potential surprise.
Bayesian brain hypothesis
• The Bayesian brain hypothesis uses Bayesian probability theory to formulate perception as a constructive process based on internal, or generative, models; the brain is cast as an inference machine that actively predicts and explains its sensations.
• A probabilistic model generates predictions, against which sensory samples are tested to update beliefs about their causes.
• The brain is an inference engine that tries to optimize probabilistic representations of what caused its sensory input.
• This optimization can be finessed using a (variational free-energy) bound on surprise.
[Diagram labels: p(x|ϑ), generative model (likelihood); p(ϑ), prior beliefs; p(ϑ|x), posterior belief about causes; probabilistic causality; sensations x; update beliefs; accuracy; complexity]
Bayesian brain hypothesis
Two key questions in this hypothesis are:
• How should the form of the generative model and the prior beliefs be chosen? Answer: use hierarchical models in which the priors themselves are optimized.
• How should the form of the proposed model, or distribution, q be chosen? Answer: it can take any form, and this choice drives the choice of optimization procedure; the simplest assumption is that it is Gaussian (the Laplace approximation). Minimizing free energy then simply amounts to explaining away prediction error.
Bayesian brain hypothesis
• Under the Gaussian assumption, the scheme is known as predictive coding.
• It is a popular framework for understanding neuronal message passing among different levels of cortical hierarchies.
• This scheme has been used to explain many features of early visual responses and can plausibly explain repetition suppression and mismatch responses in electrophysiology.
The principle of efficient coding
• The principle of efficient coding suggests that the brain optimizes the mutual information (that is, the mutual predictability) between the sensory states and its internal representation, under constraints on the efficiency of those representations.
• The infomax principle says that neuronal activity should encode sensory information in an efficient and parsimonious fashion.
[Diagram labels: sensory states; efficient coding; representing variables]
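Mutual information between sensory states and their internal representation can be computed directly when both are discrete. The sketch below assumes two made-up joint probability tables, one for a faithful code and one for a code that ignores the input.

```python
import math

def mutual_information(joint):
    """Mutual information in nats; joint[i][j] = p(sensory state i, representation j)."""
    p_s = [sum(row) for row in joint]          # marginal over sensory states
    p_r = [sum(col) for col in zip(*joint)]    # marginal over representations
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log(p / (p_s[i] * p_r[j]))
    return mi

# Hypothetical codes for a binary sensory state.
faithful = [[0.5, 0.0], [0.0, 0.5]]          # representation mirrors the input
independent = [[0.25, 0.25], [0.25, 0.25]]   # representation ignores the input

mi_faithful = mutual_information(faithful)
mi_independent = mutual_information(independent)
print(mi_faithful, mi_independent)
```

The faithful code carries ln 2 nats about the sensory state, the most a binary variable can carry, while the independent code carries none.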
The principle of efficient coding
• The infomax principle might be presented as a special case of the free-energy principle, which arises when we ignore uncertainty in probabilistic representations.
• The infomax principle can be understood in terms of the decomposition of free energy into complexity and accuracy: mutual information is optimized when conditional expectations maximize accuracy (or minimize prediction error), and efficiency is assured by minimizing complexity.
[Equation labels: complexity; accuracy; coding length; mutual information]
The cell assembly theory
• ‘Cells that fire together wire together’.
• Conditional expectations about states of the world are encoded by synaptic activity.
• Learning under the free-energy principle is the optimization of the connection strengths in hierarchical models of the sensory states.
• It appears that a gradient descent on free energy is formally identical to Hebbian plasticity.
The cell assembly theory
• When the predictions and prediction errors are highly correlated, the connection strength increases, so that predictions can suppress prediction errors more efficiently.
• Synaptic gain of prediction error units is modulated by the precision of units.
• The most obvious candidates for controlling gain are classical neuromodulators like dopamine and acetylcholine.
Neural Darwinism
• This theory focuses on how action policies are selected and reinforced within the framework of cell assembly theory.
• Only neuronal assemblies that increase evolutionary value are reinforced through interaction with the environment (natural selection).
• Plasticity is thus modulated by value.
• Neuronal value systems reinforce connections to themselves, thereby enabling the brain to label a sensory state as valuable if, and only if, it leads to another valuable state.
• This theory has deep connections with reinforcement learning and related approaches in engineering, such as dynamic programming and temporal difference models.
[Diagram labels: epigenetic mechanisms; primary repertoire of neuronal connections; experience-dependent plasticity; secondary repertoire of neuronal connections]
Neural Darwinism
• Value is inversely proportional to surprise: the probability of an agent being in a particular state increases with the value of that state.
• The evolutionary value of an agent is the negative surprise averaged over all the states it experiences, which is simply its negative entropy.
• Prior expectations (that is, the primary repertoire) can prescribe a small number of attractive states with innate value. They can also affect the way the world is sampled, i.e. force the agent to explore until states with innate value are found.
• Neural Darwinism exploits selective processes in order to explain brain evolution.
• The free-energy formulation considers the optimization of ensemble or population dynamics in terms of entropy and surprise.
[Equation labels: complexity; accuracy; priors on a small number of innate ‘good’ states, plus priors on exploration; ‘survival’, or rate of change of value]
Optimal control theory
• Optimal control theory describes how actions should be selected to optimize expected cost.
• Free energy is an upper bound on expected cost.
• According to the principle of optimality, cost is the rate of change of value, which depends on changes in sensory states.
• An optimized policy ensures that the next state is the most valuable of the available states.
• Priors specify a small number of fixed-point attractors; when the states arrive at a fixed point, value stops changing and cost is minimized. Additional priors on motion through state space enforce exploration until an attractive state is found.
• Action under the free-energy principle is meant to suppress sensory prediction errors that depend on predicted (expected or desired) movement trajectories.
[Equation labels: complexity; accuracy; priors on a small number of fixed-point attractors where the system should arrive, plus priors on motion; rate of change of value]
Overview of free-energy principle
• Many global theories of brain function can be united under a free-energy principle.
• The commonality is that the brain optimizes a (free-energy) bound on surprise or its complement, value.
• This manifests as perception (so as to change predictions) or action (so as to change the sensations that are predicted).
• Crucially, these predictions depend on prior expectations (that furnish policies), which are optimized at different (somatic and evolutionary) timescales and define what is valuable.
Dopamine as a prediction error encoder
• Measurements show that dopamine responds sensitively to the reward itself, or to conditional stimuli that predict reward once that association has been learnt.
• However, the dopamine level decreases when a predicted reward is omitted.
• This gave rise to the hypothesis that dopamine encodes prediction error.
• Optimal control theory might provide a framework to explain this.
Temporal difference algorithm
• The computational goal of learning is to maximize the expected discounted reward V (a value function, inversely related to surprise).
• It can be computed dynamically from the rewards at previous time points.
• The prediction error δ (TD error) can be defined as a linear combination of the reward and the change in value (negative surprise).
• M1 and M2 are two cortical modalities whose input (as a derivative of the value V(t)) arrives at the VTA. The reward r(t) also converges on the VTA.
• The VTA output, taken as a simple linear sum, is the prediction error, which informs the structures constructing the prediction.
Representing a stimulus through time
• To allow prediction in time, the model treats the same event at different time points as different stimuli.
• The predicted value is then modeled as a weighted linear combination of the sensory input.
• The weights are updated using the prediction error.
Simulations
• The conditional stimuli were presented at time steps 10 and 20, followed by reward at time step 60.
[Figure: the prediction of the model]
• Absence of reward on one of the intermediate trials causes a large negative fluctuation of the prediction error.
• Overall, however, the model learns the dependency between conditional stimulus and reward well, while also blocking the secondary stimulus, which carries only redundant information.
• The behavior of the prediction error accurately mimics the dopamine response measured in monkeys in a similar situation.
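The temporal difference scheme from the last three slides can be sketched in a few lines. This is a simplified illustration, not the original simulation: it uses a single conditional stimulus at step 10 (no second stimulus, so no blocking), a one-hot "time since stimulus onset" representation, and an assumed learning rate, discount factor and TD error convention δ(t) = r(t) + γV(t) − V(t−1).

```python
import numpy as np

N_STEPS = 80       # time steps per trial (assumed)
CS_TIME = 10       # conditional stimulus onset, as on the slide
REWARD_TIME = 60   # reward delivery, as on the slide
ALPHA = 0.5        # learning rate (assumed)
GAMMA = 1.0        # discount factor (assumed: no discounting)

def features(t):
    """One-hot code: each time step after stimulus onset is its own 'stimulus'."""
    x = np.zeros(N_STEPS)
    if t >= CS_TIME:
        x[t - CS_TIME] = 1.0
    return x

def run_trial(w, rewarded=True, learn=True):
    """Run one trial and return the TD error at every time step."""
    delta = np.zeros(N_STEPS)
    for t in range(1, N_STEPS):
        r = 1.0 if (rewarded and t == REWARD_TIME) else 0.0
        x_prev, x_now = features(t - 1), features(t)
        # Predicted value is a weighted linear combination of the input.
        delta[t] = r + GAMMA * (w @ x_now) - (w @ x_prev)
        if learn:
            w += ALPHA * delta[t] * x_prev   # in-place weight update
    return delta

w = np.zeros(N_STEPS)
first = run_trial(w)                                   # naive model
for _ in range(500):
    run_trial(w)
trained = run_trial(w, learn=False)                    # after learning
omitted = run_trial(w, rewarded=False, learn=False)    # reward withheld

print(first[REWARD_TIME], trained[CS_TIME], trained[REWARD_TIME], omitted[REWARD_TIME])
```

Before learning the prediction error appears at the reward; after learning it has moved to the stimulus onset, and withholding the reward produces a negative error at the expected reward time, which is the qualitative pattern described on this slide.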
Hierarchical Gaussian filtering
• One particular model within the Bayesian brain hypothesis is hierarchical Gaussian filtering.
• It consists of a hierarchical Bayesian model of learning through perception.
• The response model can deal with states and inputs that are discrete or continuous, uni- or multivariate, as well as with deterministic and probabilistic relationships between environment and perception.
• Parameters of the model can account for individual differences between agents and can be used to simulate and explain maladaptive behavior.
[Diagram labels: volatility; probability; input; learning; perception]
Hierarchical Gaussian filtering
• This is the simplest example (with a univariate, binary, deterministic response model).
• Each layer of the model performs a Gaussian random walk, with parameters governing the coupling between layers and the width of the random walks.
• As many layers as needed can be added on top.
• Priors on the parameters χ = {κ, ω, ϑ} allow full Bayesian inference.
• Inverting this model corresponds to optimizing the posterior densities over the unknown (hidden) states x = {x1, x2, x3} and parameters χ. This corresponds to perceptual inference and learning, respectively.
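The coupled random walks can be simulated forward to get a feel for the roles of κ, ω and ϑ. The sketch below is an assumption-laden illustration: it takes the three-level binary form commonly given for this model (x3 walks with variance ϑ, x2 walks with variance exp(κ·x3 + ω), x1 is a Bernoulli draw through a sigmoid of x2), starts both walks at zero, and uses the reference parameter values quoted later in these slides.

```python
import numpy as np

def sigmoid(x):
    """Logistic function, written with tanh to avoid overflow for large |x|."""
    return 0.5 * (1.0 + np.tanh(0.5 * x))

def simulate_hgf(n_trials=100, kappa=1.4, omega=-2.2, theta=0.5, seed=0):
    """Sample hidden states from an assumed three-level binary generative model."""
    rng = np.random.default_rng(seed)
    x1 = np.zeros(n_trials, dtype=int)
    x2 = np.zeros(n_trials)
    x3 = np.zeros(n_trials)
    x2_prev, x3_prev = 0.0, 0.0   # assumed starting values
    for k in range(n_trials):
        # Level 3 (volatility): Gaussian random walk with variance theta.
        x3[k] = rng.normal(x3_prev, np.sqrt(theta))
        # Level 2 (tendency): step variance set by level 3 via kappa and omega.
        x2[k] = rng.normal(x2_prev, np.sqrt(np.exp(kappa * x3[k] + omega)))
        # Level 1 (binary outcome): Bernoulli draw with probability sigmoid(x2).
        x1[k] = int(rng.random() < sigmoid(x2[k]))
        x2_prev, x3_prev = x2[k], x3[k]
    return x1, x2, x3

x1, x2, x3 = simulate_hgf()
print(x1[:10], x2[:3], x3[:3])
```

Rerunning with a smaller ϑ, ω or κ changes how quickly the volatility and the tendency drift, which is the kind of comparison made in the parameter slides.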
Inversion under free energy principle
• Exact inference in such a model would involve expensive numerical optimization.
• Instead, a variational Bayesian approach was used (which involves minimization of free energy).
• The key assumption is a factorization of the recognition distribution q (mean-field approximation).
Update equations
The resulting update equations are not only efficient and easy to compute but also resemble results from the field of reinforcement learning.
Rescorla-Wagner model: prediction(k) − prediction(k−1) = learning rate × prediction error
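The Rescorla-Wagner rule quoted above is short enough to write out directly. This is a minimal sketch with a fixed learning rate; the function name, default values and the example outcome sequence are illustrative assumptions.

```python
def rescorla_wagner(outcomes, learning_rate=0.1, initial_prediction=0.5):
    """Return the prediction after each outcome under the Rescorla-Wagner rule."""
    prediction = initial_prediction
    trajectory = []
    for outcome in outcomes:
        prediction_error = outcome - prediction
        # prediction(k) - prediction(k-1) = learning rate x prediction error
        prediction = prediction + learning_rate * prediction_error
        trajectory.append(prediction)
    return trajectory

# Four rewarded trials in a row, starting from a prediction of zero.
trajectory = rescorla_wagner([1, 1, 1, 1], learning_rate=0.5, initial_prediction=0.0)
print(trajectory)  # [0.5, 0.75, 0.875, 0.9375]
```

Each update closes a fixed fraction of the remaining gap between prediction and outcome. The variational updates have the same overall shape, except that the learning rate is not fixed but follows from the current uncertainty.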
Precision updates
• The variances σ1, σ2 and σ3 are also updated at every step, in the form of precisions (inverse variances).
• The precision updates account for two types of uncertainty: ‘informational’ (the lack of knowledge about x2) and ‘environmental’ (the volatility on the third level).
• It has been proposed that dopamine might encode the value of prediction error, i.e., the precision-weighting of prediction errors. In the model, this weighting is governed by the parameters κ and ω.
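Precision-weighting of prediction errors can be illustrated in the simplest possible setting. The sketch below is not the hierarchical Gaussian filter update itself; it assumes a single Gaussian belief updated by one Gaussian observation, where the effective learning rate is the ratio of the observation precision to the posterior precision.

```python
def gaussian_update(mu, pi, observation, pi_obs):
    """Update a Gaussian belief (mean mu, precision pi) with one noisy observation."""
    new_pi = pi + pi_obs                  # precisions add
    learning_rate = pi_obs / new_pi       # weight given to the prediction error
    new_mu = mu + learning_rate * (observation - mu)
    return new_mu, new_pi, learning_rate

# Same observation and observation precision, different prior confidence.
mu_uncertain, pi_uncertain, lr_uncertain = gaussian_update(0.0, 1.0, 1.0, 4.0)
mu_confident, pi_confident, lr_confident = gaussian_update(0.0, 16.0, 1.0, 4.0)
print(lr_uncertain, lr_confident)
```

An agent with an imprecise prior belief moves most of the way towards the observation, while a confident agent barely moves: the same prediction error is weighted differently depending on the precisions involved.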
Meaning and the effect of the parameters Reference scenario: ϑ = 0.5, ω = −2.2, κ = 1.4
Meaning and the effect of the parameters Reduced ϑ = 0.05 (unchanged ω = −2.2, κ = 1.4)
Meaning and the effect of the parameters Reduced ω = −4 (unchanged ϑ = 0.5, κ = 1.4)
Meaning and the effect of the parameters Reduced κ = 0.2 (unchanged ϑ = 0.5, ω = −2.2)
Potential for real data
Conventional observed behavioural data fail to detect a difference between subjects. However, the computational model reveals a striking difference in the underlying mechanisms.
[Figure: panels for a healthy participant and a participant with prodromal schizophrenia, showing fraction of correct responses (running average), trial-wise reward (running average), reaction time [s], and probability and volatility over trials]
Outcome
• Using the free-energy principle, we derive fast and efficient update equations that could be implemented by the brain.
• The resulting update equations present a clear connection to the field of reinforcement learning.
• The parameterization of these equations accounts for individual differences and can model a wide variety of maladaptive behavior.
• There is evidence that some of the parameters may correspond to neuromodulators, in particular dopamine.
Conclusions
• Computational models inspired by expert knowledge about the brain can provide powerful insights into how the brain works in healthy and maladaptive ways.
• The free-energy principle provides a powerful framework that generalizes many existing theories of brain function.
• However, “all models are wrong but some are useful”. To assess the correctness of a model we have to see whether it predicts real data – more next week.
• Neuromodulators may be linked to certain parameters of the models. For example, dopamine seems to play a key role in reward-driven learning. Studies with pharmacological manipulations, together with computational models, can provide deeper mechanistic insight into its particular role.