
Computational Aspects of Emotion in Adaptive Behavior

Joost Broekens, Walter Kosters, Fons Verbeek. LIACS, Leiden University, The Netherlands.


Presentation Transcript


  1. Computational Aspects of Emotion in Adaptive Behavior. Joost Broekens, Walter Kosters, Fons Verbeek. LIACS, Leiden University, The Netherlands.

  2. Overview
      • Emotion & Information Processing.
      • Adaptive agents:
        • reactive,
        • cognitive,
        • emotion-modulated cognitive agents.
      • Experiment: Pleasure regulates information processing.
      • Future work.

  3. Emotion: communication medium, decision heuristic and modulator.
      • Common emotions: fear, anger, happiness, sadness, surprise, disgust.
      • Short episode triggered by an (internal/external) event, composed of
        • subjective feelings,
        • inclinations to act (action preparation, action tendency (Frijda)),
        • facial expressions,
        • cognitive evaluation, and
        • physiological arousal (heartbeat, alertness).
      • Emotion: communication medium.
        • Communicates internal state (biological & sociological evidence: Darwin, Ekman).
      • Emotion: decision heuristic relating events to the goals, needs, desires, beliefs of an agent.
        • Result of evaluation of personal relevance; helps decision-making (neurological & cognitive evidence: Damasio, appraisal theory).
      • Emotion: influences information processing.
        • Neurocomputational & cognitive evidence: Doya; Frijda, Manstead and Bem.

  4. Emotion & Information Processing
      • Biology → Emotion; internal drives, homeostasis, hardwired reactions.
      • Cognition → Emotion; cognitive emotion elicitation:
        • Emotions result from the interpretation of our world in relation to our goals, needs, desires, beliefs, etc. (Appraisal Theory: Frijda, Lazarus, Arnold, etc.).
      • Emotion → behavior; emotion influences adaptive behavior:
        • emotion as drive,
        • emotion as source of information,
        • emotion as modulator of cognitive processes.
      • Relates to different types of (views on aspects of) adaptive agents:
        • reactive,
        • cognitive,
        • emotion-modulated cognitive agents.

  5. Emotions and reactive agents
      • Reactive agents:
        • have predefined behaviors,
        • learn new behavior based on instrumental conditioning, and
        • select behaviors based on this learned model and on internal drives (motivations).
      • Emotion influences behavior:
        • it can be such an internal drive, and
        • it can trigger typical behaviors (fight / flight).
      • Computational models study emotion within this context (drive/motivation): Avila-Garcia and Cañamero, 2004; Cañamero, 1997; Velasquez, 1998.

  6. Emotion and cognitive agents
      • Cognitive agents are reactive agents plus:
        • internally represented knowledge used in
        • planning and reasoning, and an
        • attention mechanism guiding perception and action,
        • etc.
      • Emotion influences behavior:
        • it is a source of (explicit) information used in reasoning (knowledge), and
        • it can (implicitly) modulate information processing (systemic influence).
      • Computational models in which emotion is used as information (e.g. Botelho and Coelho).

  7. Thinking: Internal Simulation of Behavior
      • Internal simulation of behavior:
        • Covertly execute and evaluate potential interaction using sensory-motor substrates (Hesslow, 2002; Damasio; Cotterill, 2001), but see also
        • “interaction potentialities” (Bickhard), and
        • “state anticipation” (Butz, Sigaud, Gérard, 2003).
      • Existing mechanisms are the basis for simulation.
        • Evolutionary continuity!
      • Our basis for information processing.

  8. Emotion modulates information processing
      • Emotion influences thinking and behavior at multiple levels of cognitive complexity (Frijda, Manstead and Bem, 2000; Damasio, 1994; Davidson, 2000; Berridge, 2003; Rolls, 2000).
      • Emotion is integrated at multiple levels of processing, and higher levels of processing (conscious, reflective reasoning) have not always existed → an evolutionary advantage to integration of emotion at lower levels can be expected: levels close to reward systems and behavioral control.
      • If thinking is internal simulation of behavior, these low-level integration mechanisms should also teach us about the influence of emotion on higher-level cognitive mechanisms, e.g., on attention.
      • In this research we focus on the low-level influence of emotion on information processing in simulated adaptive agents.
        • We use emotion as a metalearning parameter (Doya, 2000).
        • Emotion: pleasure and arousal (Russell, 2003).

  9. Experiment: Can pleasure regulate information processing such that this provides an adaptive advantage for the agent?

  10. Pleasure regulates information processing
      [Figure: agent architecture diagram. Labels: cognitive influence, simulated reinforcement, simulated interaction, pleasure, interaction-selection, emotion process, interaction, predicted interactions, action, reactive behavior, percept, RL model, perception, action-selection, reinforcement, stimulus, ENVIRONMENT.]

  11. Learning
      • The agent learns to interact with the environment through Reinforcement Learning (instrumental conditioning).
        • The agent's actions are rewarded or punished.
        • It learns value-state predictions of potential next states.
        • It uses these predictions to determine which action to take next.
      • Basics of the model are based on (Sutton and Barto, 1998).
        • Learns through continuous interaction.
        • Learns based on perception-action pairs.

  12. Learning: reinforcement example
      • Reward: propagated back to the beginning, using a mechanism that solves the temporal credit assignment problem (i.e., finding the actions responsible for the reward).
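The backward propagation of reward can be illustrated with a minimal sketch. The slides only say that the model follows Sutton and Barto (1998), so the choice of a tabular TD(0) update, the chain environment, and the parameter names `alpha` and `gamma` are all assumptions made for illustration, not the authors' model.

```python
def td0_chain(n_states=5, episodes=200, alpha=0.1, gamma=0.9):
    """Learn state values on a deterministic chain (hypothetical environment).

    The agent walks from state 0 to the last state and is rewarded only
    on the final transition. TD(0) is an assumed stand-in for the
    credit-assignment mechanism mentioned on the slide.
    """
    values = [0.0] * n_states  # the terminal state keeps value 0
    for _ in range(episodes):
        for state in range(n_states - 1):
            next_state = state + 1
            reward = 1.0 if next_state == n_states - 1 else 0.0
            # Move the prediction toward reward plus discounted next value.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
    return values


values = td0_chain()
```

After training, states closer to the reward carry higher predicted value, which is the sense in which the reward has been propagated back toward the beginning of the chain.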

  13. Action-Selection
      [Figure: agent architecture diagram (same labels as slide 10, here with a distributed-state RL model).]

  14. Action-Selection
      • Value-state predictions are transformed into action values.
      • Action-selection is based on these action values.
        • Choose an action from the set of action-value pairs stochastically (e.g. using a Boltzmann distribution).
      • Action-selection is responsible for exploration vs. exploitation behavior.
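Stochastic selection with a Boltzmann distribution can be sketched as follows. The function names and the `temperature` parameter are illustrative assumptions; the slides do not specify how the authors parameterize the distribution.

```python
import math
import random


def boltzmann_probabilities(action_values, temperature=1.0):
    """Turn action values into selection probabilities (softmax)."""
    # Subtracting the maximum keeps the exponentials numerically stable.
    top = max(action_values.values())
    weights = {a: math.exp((v - top) / temperature)
               for a, v in action_values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def select_action(action_values, temperature=1.0, rng=random):
    """Sample one action; a low temperature favors exploitation."""
    probs = boltzmann_probabilities(action_values, temperature)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


example_values = {"up": 0.2, "down": -0.5, "right": -1.0, "left": -1.0}
warm = boltzmann_probabilities(example_values, temperature=1.0)
cold = boltzmann_probabilities(example_values, temperature=0.1)
chosen = select_action(example_values, temperature=0.5)
```

Lowering the temperature concentrates probability on the highest-valued action (exploitation), while raising it flattens the distribution (exploration).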

  15. Our agent’s cognitive part (based on internal simulation of behavior)
      [Figure: agent architecture diagram (same labels as slide 10, here with a distributed-state RL model).]

  16. Simulation: action-selection bias
      At every step, instead of action-selection, select a subset of predicted interactions from the reinforcement learning model → feed back to the RL model.
      1. Interaction-selection: select a subset of predicted interactions.
      2. Simulate-and-bias-predicted-benefit: feed back to the model as if it were a real interaction.
      3. Action-selection: select the next action using the action-selection mechanism explained earlier, based on the now biased action values.
      [Figure: agent architecture diagram (same labels as slide 10, here with a hierarchical-state RL model).]

  17. Simulation: example
      • Action list before simulation (!hypothetical example!):
        • {up=0.2, down=-0.5, right=-1, left=-1}
        • Action-selection would have selected “up”.
        • With Boltzmann, high probability for “up”.
      • Simulate all interactions.
        • Propagate back the predicted values by simulating interaction with the environment.
        • Effect is a “value look-ahead” of 1 step.
      • Action list after simulation:
        • {up=0.1, down=0.5, right=-1, left=-1}
        • Action-selection selects “down”.
      • In this example, simulating all predicted interactions helps.
      [Figure label: roadblock, r = -0.5.]
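The one-step value look-ahead can be sketched in a few lines. This is a simplification under stated assumptions: the sketch overwrites each simulated action value directly, whereas the slides describe feeding the simulated interaction back into the RL model. The `outcomes` table (predicted reinforcement and predicted next values per action) is invented here so that the numbers match the hypothetical action lists on the slide. It is not data from the presentation.

```python
def simulate_interactions(action_values, predicted_outcomes, selected, gamma=1.0):
    """Return action values biased by a one-step simulated look-ahead.

    predicted_outcomes maps an action to a pair: (predicted reinforcement,
    list of predicted values reachable from the resulting state).
    Only the actions listed in `selected` are simulated.
    """
    biased = dict(action_values)
    for action in selected:
        reinforcement, next_values = predicted_outcomes[action]
        best_next = max(next_values) if next_values else 0.0
        # Assumed simplification: replace the value instead of updating the model.
        biased[action] = reinforcement + gamma * best_next
    return biased


before = {"up": 0.2, "down": -0.5, "right": -1.0, "left": -1.0}
# Hypothetical predictions: "up" runs into a penalty of -0.5 first.
outcomes = {
    "up": (-0.5, [0.6]),
    "down": (0.0, [0.5]),
    "right": (-1.0, []),
    "left": (-1.0, []),
}
biased = simulate_interactions(before, outcomes, selected=list(before))
best_before = max(before, key=before.get)
best_after = max(biased, key=biased.get)
```

With these invented predictions the preferred action changes from "up" to "down" once all interactions are simulated, mirroring the before and after lists above.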

  18. But: Simulating Everything is not Always Best
      • Even apart from the fact that simulating everything costs mental effort.
      • Earlier experiments (Broekens, 2005) showed that
        • simulation has benefit, especially when many interactions are simulated. This is not surprising (better heuristic). However,
        • in some cases less simulation resulted in better learning.
      • Dynamic relation between environment and simulation “strategy” (i.e. simulation threshold: percentage of all predicted interactions to be simulated).
      • Emotion as metalearning to adapt the amount of internal simulation? (Doya, 2002)
        • Pleasure is an indication of the current performance of the agent (Clore and Gasper, 2000). Also,
        • high pleasure → top-down thinking, and low pleasure → bottom-up thinking (Fiedler and Bless, 2000).

  19. Pleasure Modulates Simulation
      [Figure: agent architecture diagram (same labels as slide 10, here with a distributed-state RL model).]

  20. Pleasure Modulates Simulation
      • Many theories of emotion.
      • We use the core-affect (or activation-valence) theory of emotion as basis.
        • Two fundamental factors, pleasure and arousal (Russell, 2003).
        • Pleasure relates to emotional valence, and
        • arousal relates to action-readiness, or activity.
      • In this study we model pleasure as simulation threshold.
        • We use pleasure to dynamically adapt the number of interactions that are simulated. It is thus used as a dynamic simulation threshold.
      • We study the indirect effect of emotion as a metalearning parameter affecting information processing, which in turn influences action-selection.

  21. Pleasure Modulates Simulation
      • Pleasure quantification: indication of current performance relative to what the agent is used to.
        • We tried to capture this by the normalized difference between the short-term average reinforcement signal and the long-term average reinforcement signal, giving the pleasure value e_p. [Formula not preserved in this transcript.] This is the only formula in the presentation!
      • Continuous pleasure feedback:
        • High pleasure, going well? Continue strategy, goal-directed thinking.
          • High e_p: high threshold, simulate the predicted best interactions.
        • Low pleasure? Look broader, pay more attention to all predicted interactions.
          • Low e_p: low threshold, simulate many interactions.
      [Figure: agent architecture diagram (same labels as slide 10, with the emotion process output labeled “pleasure, e_p” and a hierarchical-state RL model).]
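Since the formula itself did not survive in the transcript, the following is only a rough sketch of the idea. The normalization (centering at 0.5, dividing by an assumed `scale`, clipping to [0, 1]), the window sizes, and the rule mapping pleasure to the number of simulated interactions are all assumptions made for illustration. They are not the authors' definition of e_p.

```python
import math
from collections import deque


class PleasureSignal:
    """Pleasure from short-term vs. long-term average reinforcement."""

    def __init__(self, short_window=10, long_window=100, scale=1.0):
        self.short = deque(maxlen=short_window)
        self.long = deque(maxlen=long_window)
        self.scale = scale  # assumed normalization constant

    def update(self, reinforcement):
        self.short.append(reinforcement)
        self.long.append(reinforcement)
        short_avg = sum(self.short) / len(self.short)
        long_avg = sum(self.long) / len(self.long)
        # Assumed form: 0.5 when performance matches what the agent is used to.
        raw = 0.5 + (short_avg - long_avg) / (2.0 * self.scale)
        return min(1.0, max(0.0, raw))


def select_for_simulation(predicted_values, pleasure):
    """High pleasure: simulate only the best; low pleasure: simulate many."""
    ranked = sorted(predicted_values, key=predicted_values.get, reverse=True)
    count = max(1, math.ceil((1.0 - pleasure) * len(ranked)))
    return ranked[:count]


signal = PleasureSignal()
for _ in range(100):
    steady = signal.update(0.0)    # nothing changes: pleasure stays at 0.5
for _ in range(10):
    improving = signal.update(1.0)  # recent reinforcement beats the long-term average

predictions = {"up": 0.2, "down": -0.5, "right": -1.0, "left": -1.0}
focused = select_for_simulation(predictions, pleasure=1.0)
broad = select_for_simulation(predictions, pleasure=0.0)
```

Under these assumptions, pleasure rises above 0.5 while the agent improves and falls back to 0.5 once the long-term average catches up, which matches the qualitative behavior the slides describe.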

  22. Experimental setup
      • To measure the adaptive effect of pleasure-modulated simulation: force the agent to adapt to a new task.
        • First the agent has 128 trials to learn task 1, then
        • switch the environment to a new task, with 128 trials to learn task 2.
        • Repeat for many different parameter settings (e.g. the windows of the long-term and short-term average reinforcement signals, the learning rate, etc.).
      • Pleasure predictions:
        • Pleasure increases to a value near 1 (agent gets better at the task),
        • then slowly converges down to 0.5 (agent gets used to the task).
        • At the switch: pleasure drops (new task, drop in performance),
        • then increases to a value near 1 and converges down to 0.5 (agent gets used to the new task).

  23. Results
      • Performance of pleasure-modulated simulation is comparable to simulating ALL / the best 50% of predicted interactions (static simulation threshold), while using only 30% / 70% of the mental resources, respectively.

  24. Results
      • Some settings even have a significantly better performance at lower mental cost.
      • The predicted pleasure curve was confirmed.

  25. Some conclusions
      • Can pleasure regulate information processing such that this provides an adaptive advantage for the agent?
        • Yes.
      • Simple pleasure feedback can be used to determine how broadly an agent should internally simulate potential behavior.
        • The agent’s performance is comparable and mental effort decreases.
      • Since we introduce few new mechanisms for simulation → results are relevant to the understanding of the evolutionary plausibility of the simulation hypothesis, as increased individual adaptation at lower cost is an evolutionarily advantageous feature.
      • Our results provide clues of a relation between the simulation hypothesis and emotion theory.

  26. Future work.
      • Use emotion to modulate:
        • action-selection distribution (Doya, 2002), and
        • interaction-selection distribution (e.g. temperature of Boltzmann, threshold of our AS mechanism).
      • Interplay between covert interaction (simulation) and overt interaction (action-selection).
        • Simulate the best interaction, but choose an action stochastically, see also (Gadanho, 2003): gives extra “drive” to certain actions.
        • The inverse? Seems rational too:
          • Simulate bad actions for “mental (covert) exploration”, choose best actions for “overt exploitation”.
        • Early experiments do not (yet) show clear benefit.
      • Use arousal factor as feedback.
        • Could arousal modify the amount of energy available for information processing, and thereby provide a bound for the amount of simulation?
        • Arousal resulting from low-level evaluation of familiarity and suddenness (e.g. Scherer).

  27. Questions?
