Seeing Patterns in Randomness: Irrational Superstition or Adaptive Behavior?
Angela J. Yu, University of California, San Diego. March 9, 2010
“Irrational” Probabilistic Reasoning in Humans • “hot hand” (Gilovich, Vallone, & Tversky, 1985; Wilke & Barrett, 2009) • 2AFC: sequential effects (rep/alt) (Soetens, Boer, & Hueting, 1985). Random stimulus sequence: 1 2 2 2 2 2 1 2 2 2 2 2 1 1 2 1 2 1 …
“Superstitious” Predictions. Subjects are “superstitious” when viewing randomized stimuli • Subjects are slower & more error-prone when a local pattern (a run of repetitions or alternations) is violated • Patterns arise by chance and are not predictive of the next stimulus • Such “superstitious” behavior is apparently sub-optimal. [figure: example trial sequence; responses are fast while a local run continues, slow when it is broken]
“Graded” Superstition (Cho et al, 2002; Soetens et al, 1985). Each stimulus history over trials t-3 … t is coded as repetitions (R) and alternations (A); e.g. RARR corresponds to [o o O O O] or [O O o o o]. Both RT and error rate (ER) vary gradually with how strongly the recent history favors the observed stimulus. Hypothesis: sequential adjustments may be adaptive for changing environments.
Outline • “Ideal predictor” in a fixed vs. changing world • Exponential forgetting: normative and descriptive • Optimal Bayes or exponential filter? • Neural implementation of prediction/learning
I. Fixed Belief Model (FBM): a hidden bias γ, fixed across trials, generates the observed stimuli R (1), A (0), R (1), …
II. Dynamic Belief Model (DBM): a changing bias γt (e.g. .3, .3, .8, …) generates the observed stimuli R (1), A (0), R (1), …
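To make the two generative models concrete, here is a minimal simulation sketch in Python (NumPy). The function names, the uniform prior over the bias, and the specific parameter values are illustrative assumptions, not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_fbm(n_trials, gamma=0.3):
    """FBM: one fixed bias gamma generates the whole sequence.
    x_t = 1 codes a repetition (R), 0 an alternation (A)."""
    return (rng.random(n_trials) < gamma).astype(int)

def simulate_dbm(n_trials, alpha=0.77):
    """DBM: gamma_t stays put with probability alpha, else is
    redrawn from the prior (uniform on [0, 1] in this sketch)."""
    x = np.empty(n_trials, dtype=int)
    gamma = rng.random()              # initial draw from the prior
    for t in range(n_trials):
        if rng.random() > alpha:      # un-signaled change point
            gamma = rng.random()      # redraw the bias from the prior
        x[t] = rng.random() < gamma
    return x
```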
FBM Subject’s Response to Random Inputs. What the FBM subject should believe about the bias γ of the coin, given a sequence of observations: R R A R R R A R. [figure: posterior over the bias]
FBM Subject’s Response to Random Inputs. What the FBM subject should believe about the bias γ of the coin, given a long sequence of observations: R R A R A A R A A R A … A R. [figure: posterior over the bias]
DBM Subject’s Response to Random Inputs. What the DBM subject should believe about the bias γt of the coin, given a long sequence of observations: R R A R A A R A A R A … A R. [figure: posterior over the bias]
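The corresponding belief updates can be sketched with a simple grid discretization of the bias. This is an assumed implementation for illustration: the grid resolution, the uniform prior, and α = .77 are placeholders.

```python
import numpy as np

grid = np.linspace(0.01, 0.99, 99)       # discretized values of gamma
prior = np.ones_like(grid) / grid.size   # uniform prior over the bias

def fbm_update(posterior, x):
    """FBM: ordinary Bayesian update; evidence accumulates forever."""
    lik = grid if x == 1 else 1.0 - grid
    post = posterior * lik
    return post / post.sum()

def dbm_update(posterior, x, alpha=0.77):
    """DBM: first mix the posterior back toward the prior (the bias
    may have just changed), then apply Bayes' rule to the new datum."""
    pred = alpha * posterior + (1.0 - alpha) * prior
    lik = grid if x == 1 else 1.0 - grid
    post = pred * lik
    return post / post.sum()

# Predicted probability of a repetition on the next trial:
# P(x_t = 1 | past) = sum over gamma of gamma * P(gamma | past)
p_repeat = lambda post, alpha=0.77: (alpha * post + (1 - alpha) * prior) @ grid
```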
Randomized Stimuli: FBM > DBM. Given a sequence of truly random data (γ = .5): the FBM belief distribution over γ is driven by the long-term average, while the DBM belief distribution over γt is driven by transient patterns. [figure: probability over simulated trials for each model]
“Natural Environment”: DBM > FBM. In a changing world, where γt undergoes un-signaled changes: the DBM adapts rapidly to changes, while the FBM adapts poorly. [figure: posterior over γt across simulated trials for each model]
Persistence of Sequential Effects (data from Cho et al, 2002) • Sequential effects persist in the data • The DBM produces the R/A asymmetry • Subjects ≈ DBM (changing world). [figure: human RT alongside FBM and DBM P(stimulus)]
Outline • “Ideal predictor” in a fixed vs. changing world • Exponential forgetting: normative and descriptive • Optimal Bayes or exponential filter? • Neural implementation of prediction/learning
Bayesian Computations in Neurons? Generative model: what subjects need to know. Optimal prediction: what subjects need to compute. Too hard to represent, too hard to compute!
Simpler Alternative for Neural Computation? Inspiration: exponential forgetting in tracking true changes (Sugrue, Corrado, & Newsome, 2004).
Exponential Forgetting in Behavior. Linear regression of responses against the R/A identities of previous trials (human data; re-analysis of Cho et al): the coefficients fall off exponentially with trials into the past. Exponential discounting is a good descriptive model.
Exponential Forgetting Approximates DBM. The same linear regression applied to the DBM’s predictions also yields exponentially decaying coefficients. Exponential discounting is a good normative model.
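One way to see this decay numerically: generate DBM one-step predictions on a random sequence and regress them on the most recent stimuli. A minimal, self-contained sketch, assuming a uniform prior over γ and α = .77; the lag count k and sequence length are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
x = (rng.random(20000) < 0.5).astype(int)    # truly random R/A sequence

# One-step DBM predictions P(x_t = 1 | x_1..x_{t-1}) on a gamma grid.
grid = np.linspace(0.01, 0.99, 99)
prior = np.ones_like(grid) / grid.size
alpha, post, preds = 0.77, prior.copy(), []
for xt in x:
    pred_post = alpha * post + (1 - alpha) * prior   # bias may have changed
    preds.append(pred_post @ grid)                   # prediction before x_t
    post = pred_post * (grid if xt else 1 - grid)    # Bayes' rule
    post /= post.sum()
preds = np.array(preds)

# Regress the model's predictions on the k most recent stimuli.
k = 6
X = np.column_stack([x[k - tau - 1 : len(x) - tau - 1] for tau in range(k)])
y = preds[k:]
coefs = np.linalg.lstsq(np.column_stack([np.ones(len(y)), X]), y,
                        rcond=None)[0][1:]
print(coefs)                   # near-geometric decay with lag tau
print(coefs[1:] / coefs[:-1])  # successive ratios roughly constant (~beta)
```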
Discount Rate vs. Assumed Rate of Change. DBM simulations with two assumed change rates, α = .95 and α = .77: the lower α (faster assumed change) discounts the past more steeply. [figure: probability across simulated trials for each α]
Reverse-engineering Subjects’ Assumptions. Human regression coefficients decay at rate β ≈ .57; a DBM simulation with α = p(γt = γt-1) = .77 reproduces this decay. Subjects behave as if γ changes about once every four trials. [figure: coefficients vs. trials into the past, DBM simulation and human data]
Analytical Approximation. The nonlinear Bayesian computation (a 3-param model) is well approximated by a 1-param linear model; the quality of the approximation links β = .57 to α = .77.
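A sketch of how one might compare the full nonlinear DBM against a 1-param linear filter. The exponentially smoothed history is the one free parameter (β); the affine read-out fit by least squares is my own device for putting the two on a common scale, and the talk’s exact 3-param and 1-param models may differ.

```python
import numpy as np

rng = np.random.default_rng(2)
x = (rng.random(5000) < 0.5).astype(int)

# Full (nonlinear) DBM predictions, as in the earlier sketch.
grid = np.linspace(0.01, 0.99, 99)
prior = np.ones_like(grid) / grid.size
alpha, post, p_dbm = 0.77, prior.copy(), []
for xt in x:
    pp = alpha * post + (1 - alpha) * prior
    p_dbm.append(pp @ grid)
    post = pp * (grid if xt else 1 - grid)
    post /= post.sum()
p_dbm = np.array(p_dbm)

# 1-param linear model: exponentially smoothed stimulus history,
# s_t = beta * s_{t-1} + (1 - beta) * x_t, read out affinely.
beta, s, trace = 0.57, 0.5, []
for xt in x:
    trace.append(s)                 # prediction before observing x_t
    s = beta * s + (1 - beta) * xt
trace = np.array(trace)

A = np.column_stack([np.ones_like(trace), trace])
gain_offset = np.linalg.lstsq(A, p_dbm, rcond=None)[0]
resid = p_dbm - A @ gain_offset
print("RMS error of linear filter vs DBM:", resid.std())
```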
Outline • “Ideal predictor” in a fixed vs. changing world • Exponential forgetting: normative and descriptive • Optimal Bayes or exponential filter? • Neural implementation of prediction/learning
Subjects’ RT vs. Model Stimulus Probability: repetition vs. alternation trials. [figure: RT against the model’s predicted stimulus probability, shown separately for repetition trials (e.g. R A R R R R …) and alternation trials, then overlaid]
Multiple-Timescale Interactions. The DBM prediction (across trials) feeds into optimal discrimination within a trial (Wald, 1947): in discrete time, the SPRT; in continuous time, the DDM. (Yu, NIPS 2007; Frazier & Yu, NIPS 2008; Gold & Shadlen, Neuron 2002)
SPRT/DDM & Linear Effect of Prior on RT. [figure: RT histogram over timesteps; tanh nonlinearity; mean RT (<RT>) as a function of the bias P(s1)]
SPRT/DDM & Linear Effect of Prior on RT. [figure: empirical RT vs. stimulus probability alongside predicted mean RT (<RT>) vs. bias P(s1)]
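To illustrate the approximately linear effect of the prior on mean RT, here is a small DDM simulation in which the prior enters as a log-odds offset on the starting point, a standard way of biasing the SPRT/DDM. All parameter values (drift, threshold, time step, trial count) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def ddm_mean_rt(p_s1, drift=1.0, thresh=2.0, dt=0.005, n=500):
    """Mean decision time of a DDM whose starting point encodes the
    prior as log odds; stimulus s1 is the one actually presented."""
    z0 = np.log(p_s1 / (1 - p_s1))   # prior bias as a starting offset
    rts = []
    for _ in range(n):
        z, t = z0, 0.0
        while abs(z) < thresh:       # diffuse to either bound
            z += drift * dt + rng.normal(0.0, np.sqrt(dt))
            t += dt
        rts.append(t)
    return np.mean(rts)

for p in [0.3, 0.4, 0.5, 0.6, 0.7]:
    print(p, ddm_mean_rt(p))   # mean RT falls roughly linearly in p
```

Over this middle range of priors, the simulated mean RT declines nearly linearly with P(s1), which is the qualitative point of the slide.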
Outline • “Ideal predictor” in a fixed vs. changing world • Exponential forgetting: normative and descriptive • Optimal Bayes or exponential filter? • Neural implementation of prediction/learning
Neural Implementation of Prediction. Leaky-integrating neuron: the recurrent weight implements the discounting, the input weight weighs the new evidence, plus a constant bias. • Perceptual decision-making (Grice, 1972; Smith, 1995; Cook & Maunsell, 2002; Busemeyer & Townsend, 1993; McClelland, 1993; Bogacz et al, 2006; Yu, 2007; …) • Trial-to-trial interactions (Kim & Myung, 1995; Dayan & Yu, 2003; Simen, Cohen & Holmes, 2006; Mozer, Kinoshita, & Shettel, 2007; …) [figure: unit with input and recurrent connections; bias]
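A leaky-integrating unit that realizes the exponential filter might look like the following sketch. The weight assignments (recurrent weight β, input weight 1-β, activity initialized at the prior mean) are an assumption consistent with the discounting described above, not a circuit taken from the talk.

```python
import numpy as np

def leaky_integrator(x, beta=0.57, p0=0.5):
    """Sketch: a leaky-integrating unit whose activity tracks the
    exponentially discounted stimulus history.  The recurrent weight
    beta sets the leak; the input weight is (1 - beta)."""
    r, rates = p0, []
    for xt in x:
        rates.append(r)                    # read out before the input
        r = beta * r + (1.0 - beta) * xt   # recurrent decay + input drive
    return np.array(rates)
```

On this view, the effective leak β sets the behavioral discount rate, and modulating it trial by trial connects naturally to the neuromodulation slide that follows.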
Neuromodulation & Dynamic Filters. Leaky-integrating neuron: norepinephrine (NE) signals unexpected uncertainty (Yu & Dayan, Neuron, 2005; Hasselmo, Wyble, & Wallenstein, 1996; Kobayashi, 2000). [figure: bias across trials; unit with input and recurrent connections]
Learning the Value of α. Bayesian learning: iteratively compute the joint posterior over (γ, α), then the marginal posterior over γ and the marginal posterior over α. Humans (Behrens et al, 2007) and rats (Gallistel & Latham, 1999) may encode meta-changes in the rate of change, α. [figure: graphical model with a changing bias, e.g. .3, .3, .9, and change indicators 0, 0, 1]
Neural Parameter Learning? • Neurons don’t need to represent probabilities explicitly • They just need to estimate α • Stochastic gradient descent (Δ-rule): update proportional to learning rate × error gradient (a minimal sketch follows).
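A minimal sketch of such a stochastic Δ-rule. Here it is applied to the discount parameter β of the linear exponential filter rather than to α directly (an assumption on my part); the gradient of the prediction with respect to β is carried forward as an eligibility trace, and the learning rate ε is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)

def changing_world(n, alpha_true=0.77):
    """DBM-style environment whose bias changes without warning."""
    x, g = np.empty(n, dtype=int), rng.random()
    for t in range(n):
        if rng.random() > alpha_true:
            g = rng.random()
        x[t] = rng.random() < g
    return x

# Delta-rule on beta for the filter P_{t+1} = beta*P_t + (1-beta)*x_t.
x = changing_world(20000)
beta, P, dP, eps = 0.9, 0.5, 0.0, 1e-3
for xt in x:
    err = xt - P                      # prediction error on this trial
    beta -= eps * (-2.0 * err * dP)   # gradient step on squared error
    beta = np.clip(beta, 0.01, 0.99)
    dP = beta * dP + (P - xt)         # recurrent gradient of P wrt beta
    P = beta * P + (1 - beta) * xt
print("learned beta:", beta)  # drifts toward the best-predicting discount
```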
Learning Results. [figure: estimate of α across trials, Bayesian learning vs. stochastic gradient descent]
Summary • Hypothesis: “superstition” reflects adaptation to a changing world • Exponential “memory” is near-optimal and fits behavior; RT is linear in stimulus probability • Neurobiology: leaky integration, stochastic Δ-rule, neuromodulation • A truly random sequence and one with changing biases are hard to distinguish • Open questions: multiple outcomes? Explicit versus implicit prediction?
Unlearning Temporal Correlation is Slow (see Bialek, 2005). [figure: marginal posteriors over α across trials]
Insight from the Brain’s “Mistakes”. Example: visual illusions of lightness, depth, and context (Adelson, 1995). Neural computation is specialized for natural problems.
Discount Rate vs. Assumed Rate of Change: iterative form of the linear exponential filter. Exact inference is non-linear. [figure: linear approximation vs. empirical distribution]
Bayesian Inference. Generative model: what the subject “knows”. Optimal prediction (Bayes’ rule): the posterior. Stimulus coding: 1 = repetition, 0 = alternation.
Power-Law Decay of Memory. Human memory mirrors natural (language) statistics (Anderson & Schooler, 1991); hierarchical Chinese Restaurant Process (Teh, 2006). A stationary process!
Ties Across Time, Space, and Modality. Sequential effects in RT (time); Eriksen flanker stimuli, e.g. SSHSS (space); Stroop stimuli, e.g. GREEN (modality). (Yu, Dayan, Cohen, JEP: HPP 2008; Liu, Yu, & Holmes, Neural Comp 2008)
Sequential Effects & Perceptual Discrimination. DBM predictions over R vs. A (PFC) feed into optimal discrimination (Wald, 1947): in discrete time, the SPRT; in continuous time, the DDM. (Yu & Dayan, NIPS 2005; Yu, NIPS 2007; Frazier & Yu, NIPS 2008; Gold & Shadlen, Neuron 2002)
Exponential Discounting for Changing Rewards (Sugrue, Corrado, & Newsome, 2004). [figure: regression coefficients vs. trials into the past for Monkey F and Monkey G; discount rates β = .72 and β = .63]
Human & Monkey Share Assumptions? The monkeys’ discount rates β = .72 and .63 correspond to α ≈ .80 and .68, close to the human value. [figure: coefficients vs. trials into the past for Monkey F, Monkey G, and humans]
Simulation Results: learning α via the stochastic Δ-rule. [figure: estimate across trials]