
Seeing Patterns in Randomness: Irrational Superstition or Adaptive Behavior?

Angela J. Yu, University of California, San Diego. March 9, 2010.



Presentation Transcript


  1. Seeing Patterns in Randomness: Irrational Superstition or Adaptive Behavior? Angela J. Yu, University of California, San Diego. March 9, 2010

  2. “Irrational” Probabilistic Reasoning in Humans • “hot hand” (Gilovich, Vallone, & Tversky, 1985; Wilke & Barrett, 2009) • 2AFC: sequential effects (rep/alt) (Soetens, Boer, & Hueting, 1985) Random stimulus sequence: 1 2 2 2 2 2 1 2 2 2 2 2 1 1 2 1 2 1 … 1 2 1 2

  3. “Superstitious” Predictions. Subjects are “superstitious” when viewing randomized stimuli • Subjects are slower and more error-prone when a local pattern is violated • Patterns arise by chance and are not predictive of the next stimulus • Such “superstitious” behavior is apparently sub-optimal [Figure: a trial sequence (O o o o o o O O o O o O O …) with repetitions and alternations marked; responses labeled fast, fast, slow, slow]

  4. “Graded” Superstition (Cho et al, 2002; Soetens et al, 1985). Pattern notation over trials t-3, t-2, t-1, t: RARR = [o o O O O] or [O O o o o]. [Figure: RT and ER by recent repetition/alternation pattern] Hypothesis: Sequential adjustments may be adaptive for changing environments.

  5. Outline • “Ideal predictor” in a fixed vs. changing world • Exponential forgetting normative and descriptive • Optimal Bayes or exponential filter? • Neural implementation of prediction/learning

  6. I. Fixed Belief Model (FBM) [Diagram: hidden bias: ? (fixed, unknown); observed stimuli: R (1), A (0), R (1), ?]

  7. II. Dynamic Belief Model (DBM) [Diagram: changing bias: .3, .3, .8, ?; observed stimuli: R (1), A (0), R (1), ?]

  8. FBM Subject’s Response to Random Inputs. What the FBM subject should believe about the bias of the coin, given a sequence of observations: R R A R R R A R [Figure: belief distribution over the bias γ]

  9. FBM Subject’s Response to Random Inputs. What the FBM subject should believe about the bias of the coin, given a long sequence of observations: R R A R A A R A A R A … A R [Figure: belief distribution over the bias γ]

  10. DBM Subject’s Response to Random Inputs. What the DBM subject should believe about the bias of the coin, given a long sequence of observations: R R A R A A R A A R A … A R [Figure: belief distribution over the bias γ]

  11. Randomized Stimuli: FBM > DBM. Given a sequence of truly random data (γ = .5) … • FBM: driven by long-term average • DBM: driven by transient patterns [Figure: FBM and DBM belief distributions over γ; probability vs. simulated trials]

  12. “Natural Environment”: DBM > FBM. In a changing world, where γ undergoes un-signaled changes … • FBM: adapts poorly to changes • DBM: adapts rapidly to changes [Figure: FBM and DBM posteriors over γ; probability vs. simulated trials]
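The contrast drawn on slides 6 to 12 can be sketched numerically. The following is a minimal sketch and not the talk's own code. It assumes a uniform prior over the bias on a discrete grid, and a DBM in which the bias is kept from the previous trial with probability `alpha` and otherwise redrawn from the prior, so that `alpha = 1` recovers the FBM. The function names, the grid size, and the choice `alpha = 0.77` (a value that appears on slide 20) are illustrative assumptions.

```python
def make_grid(n=101):
    """Midpoints of n equal bins on (0, 1), used as candidate bias values."""
    return [(i + 0.5) / n for i in range(n)]


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def predict_sequence(xs, alpha, n_grid=101):
    """Return P(x_t = 1 | x_1..x_{t-1}) for each trial t.

    xs    : list of 0/1 observations (1 = repetition, 0 = alternation)
    alpha : assumed probability that the bias is unchanged between trials;
            alpha = 1.0 gives the Fixed Belief Model.
    """
    grid = make_grid(n_grid)
    p0 = normalize([1.0] * n_grid)   # uniform prior over the bias (assumption)
    post = p0[:]                     # belief before any data
    preds = []
    for x in xs:
        # Prior for this trial: keep the old bias with prob. alpha, else redraw.
        prior = [alpha * q + (1.0 - alpha) * r for q, r in zip(post, p0)]
        # Predictive probability of a 1 is the prior mean of the bias.
        preds.append(sum(g * p for g, p in zip(grid, prior)))
        # Bayes' rule with a Bernoulli likelihood.
        like = [g if x == 1 else 1.0 - g for g in grid]
        post = normalize([l * p for l, p in zip(like, prior)])
    return preds


if __name__ == "__main__":
    # A long balanced history followed by a short run of repetitions.
    xs = [1, 0] * 100 + [1] * 5 + [1]
    print("FBM prediction after the run:", round(predict_sequence(xs, 1.0)[-1], 3))
    print("DBM prediction after the run:", round(predict_sequence(xs, 0.77)[-1], 3))
```

In this sketch the FBM prediction stays close to the long-run average, while the DBM prediction moves toward the recent run, which is the qualitative difference the two slides describe.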

  13. Persistence of Sequential Effects • Sequential effects persist in data • DBM produces R/A asymmetry • Subjects = DBM (changing world) [Figure panels: FBM P(stimulus); DBM P(stimulus); human RT data (from Cho et al, 2002)]

  14. Outline • “Ideal predictor” in a fixed vs. changing world • Exponential forgetting normative and descriptive • Optimal Bayes or exponential filter? • Neural implementation of prediction/learning

  15. Bayesian Computations in Neurons? • Generative model: what subjects need to know • Optimal prediction: what subjects need to compute • Too hard to represent, too hard to compute!

  16. Simpler Alternative for Neural Computation? Inspiration: exponential forgetting in tracking true changes (Sugrue, Corrado, & Newsome, 2004)

  17. Exponential Forgetting in Behavior. Linear regression against past R/A trials, human data (re-analysis of Cho et al). [Figure: regression coefficients vs. trials into the past] Exponential discounting is a good descriptive model.

  18. Exponential Forgetting Approximates DBM. Linear regression against past R/A trials, DBM prediction. [Figure: regression coefficients vs. trials into the past] Exponential discounting is a good normative model.

  19. Discount Rate vs. Assumed Rate of Change … DBM [Figure: two panels, α = .95 and α = .77; probability vs. simulated trials]

  20. Reverse-engineering Subjects’ Assumptions. [Figure: regression coefficients vs. trials into the past; DBM simulation with α = .77 gives β = .57; human data give β = .57] α = p(γ_t = γ_{t-1}); β ≈ 2/3 α ⇒ changes once every four trials
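As a worked check of the last line of slide 20, assume the symbols read α = .77 for the per-trial probability that the bias stays the same and β = .57 for the fitted discount rate (the symbol assignment is an assumption). The number of trials between changes is then geometrically distributed:

```latex
\mathbb{E}[\text{trials between changes}]
  = \sum_{k=1}^{\infty} k\,\alpha^{k-1}(1-\alpha)
  = \frac{1}{1-\alpha}
  = \frac{1}{1-0.77} \approx 4.3,
\qquad
\tfrac{2}{3}\,\alpha = \tfrac{2}{3}(0.77) \approx 0.51 .
```

The first value agrees with "once every four trials". The second is close to, but not equal to, the fitted .57, so the 2/3 relation is best read as approximate.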

  21. Analytical Approximation: nonlinear Bayesian computations, 3-param model, 1-param linear model. [Figure: quality of approximation, β vs. α, with β = .57 at α = .77]

  22. Outline • “Ideal predictor” in a fixed vs. changing world • Exponential forgetting normative and descriptive • Optimal Bayes or exponential filter? • Neural implementation of prediction/learning

  23. Subjects’ RT vs. Model Stimulus Probability: Repetition Trials [Figure: example sequence R A R R R R …]

  24. Subjects’ RT vs. Model Stimulus Probability: Repetition Trials [Figure: example sequence R A R R R R …, with RT]

  25. Subjects’ RT vs. Model Stimulus Probability: Alternation Trials and Repetition Trials [Figure: example sequence R A R R R R …, with RT]

  26. Subjects’ RT vs. Model Stimulus Probability: Repetition vs. Alternation Trials

  27. Multiple-Timescale Interactions: DBM and optimal discrimination (Wald, 1947) • discrete time: SPRT • continuous time: DDM (Gold & Shadlen, Neuron 2002; Yu, NIPS 2007; Frazier & Yu, NIPS 2008)

  28. SPRT/DDM & Linear Effect of Prior on RT [Figure: RT histogram over timesteps; mean RT <RT> vs. bias P(s1), shown with the curves tanh x and x]

  29. SPRT/DDM & Linear Effect of Prior on RT [Figure: empirical RT vs. stimulus probability; predicted <RT> vs. bias P(s1)]
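How a prior bias changes decision time in an SPRT can be illustrated with a small simulation. This is a sketch under stated assumptions, not the model fitted in the talk: observations are assumed Gaussian with mean `+drift` or `-drift`, the prior P(s1) sets the starting log odds, and the decision is made when the posterior reaches a fixed confidence `threshold`. All names and parameter values are hypothetical.

```python
import math
import random


def sprt_trial(prior_s1, rng, drift=0.1, noise=1.0, threshold=0.95,
               max_steps=10_000):
    """Simulate one trial in which stimulus s1 is the true stimulus.

    Returns (number of time steps until a bound is hit, chose_s1).
    """
    log_odds = math.log(prior_s1 / (1.0 - prior_s1))   # prior sets the start
    bound = math.log(threshold / (1.0 - threshold))    # symmetric bounds
    for step in range(1, max_steps + 1):
        obs = rng.gauss(drift, noise)                  # evidence from s1
        # Log-likelihood ratio of N(+drift, noise) vs. N(-drift, noise).
        log_odds += 2.0 * drift * obs / noise ** 2
        if abs(log_odds) >= bound:
            return step, log_odds > 0
    return max_steps, log_odds > 0


def mean_rt(prior_s1, n_trials=2000, seed=0):
    """Average decision time over all simulated trials (correct or not)."""
    rng = random.Random(seed)
    return sum(sprt_trial(prior_s1, rng)[0] for _ in range(n_trials)) / n_trials


if __name__ == "__main__":
    for prior in (0.2, 0.35, 0.5, 0.65, 0.8):
        print(f"P(s1) = {prior:.2f}  mean steps = {mean_rt(prior):.1f}")
```

In this sketch the mean decision time falls as the prior probability of the presented stimulus rises, because the accumulator starts closer to the correct bound.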

  30. Outline • “Ideal predictor” in a fixed vs. changing world • Exponential forgetting normative and descriptive • Optimal Bayes or exponential filter? • Neural implementation of prediction/learning

  31. Neural Implementation of Prediction. Leaky-integrating neuron: bias = 1/2 (1-α), input weight = 1/3 α, recurrent weight = 2/3 α • Perceptual decision-making (Grice, 1972; Smith, 1995; Cook & Maunsell, 2002; Busemeyer & Townsend, 1993; McClelland, 1993; Bogacz et al, 2006; Yu, 2007; …) • Trial-to-trial interactions (Kim & Myung, 1995; Dayan & Yu, 2003; Simen, Cohen & Holmes, 2006; Mozer, Kinoshita, & Shettel, 2007; …)
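A minimal sketch of a leaky-integrating predictor follows. It assumes the three weights on slide 31 combine as P_t = 1/2 (1-α) + 1/3 α x_{t-1} + 2/3 α P_{t-1}, with x = 1 for a repetition and 0 for an alternation. That reading of the slide, the function name `leaky_predict`, the default `alpha = 0.77`, and the starting value 0.5 are all assumptions.

```python
def leaky_predict(xs, alpha=0.77, p_start=0.5):
    """Return the prediction held before each trial in xs.

    xs      : list of 0/1 observations (1 = repetition, 0 = alternation)
    alpha   : assumed probability that the hidden bias is unchanged
    p_start : prediction before any data (assumption)
    """
    bias = 0.5 * (1.0 - alpha)     # constant term
    w_in = alpha / 3.0             # weight on the latest observation
    w_rec = 2.0 * alpha / 3.0      # weight on the previous prediction
    p = p_start
    preds = []
    for x in xs:
        preds.append(p)                    # prediction made before seeing x
        p = bias + w_in * x + w_rec * p    # leaky update after seeing x
    return preds


if __name__ == "__main__":
    seq = [1, 1, 0, 1, 1, 1, 0, 1]
    for x, p in zip(seq, leaky_predict(seq)):
        print(f"predicted P(repetition) = {p:.3f}, observed {x}")
```

Unrolling the update shows that an observation k trials back carries a weight proportional to (2/3 α)^k, so under these assumptions the recursion is an iterative form of exponential forgetting.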

  32. Neuromodulation & Dynamic Filters. Leaky-integrating neuron (bias, input, recurrent). Norepinephrine (NE) (Hasselmo, Wyble, & Wallenstein 1996; Kobayashi, 2000). NE: unexpected uncertainty (Yu & Dayan, Neuron, 2005) [Figure: x-axis trials]

  33. Learning the Value of α. Humans (Behrens et al, 2007) and rats (Gallistel & Latham, 1999) may encode meta-changes in the rate of change, α. Bayesian learning: iteratively compute the joint posterior. [Diagram: hidden values … .3 .3 .9 … over observations … 0 0 1; figures: marginal posterior over α, marginal posterior over γ]

  34. Neural Parameter Learning? • Neurons don’t need to represent probabilities explicitly • Just need to estimate α • Stochastic gradient descent (δ-rule) [Equation with terms labeled learning rate, error, and gradient]
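One way such a rule could look is sketched below. It is not the talk's equation: it assumes a leaky predictor of the form P_t = 1/2 (1-α) + 1/3 α x_{t-1} + 2/3 α P_{t-1}, a squared prediction error, and an update of the form estimate += learning rate × error × gradient, where the gradient dP_t/dα is tracked recursively. The function name `learn_alpha`, the learning rate, and the clipping to [0, 1] are hypothetical choices.

```python
def learn_alpha(xs, alpha_start=0.5, learning_rate=0.01):
    """Track an estimate of alpha across trials with a delta-rule update.

    xs : list of 0/1 observations (1 = repetition, 0 = alternation)
    Returns the list of alpha estimates after each trial.
    """
    alpha = alpha_start
    p = 0.5          # current prediction P_t
    dp = 0.0         # current gradient dP_t / d(alpha)
    prev_x = None
    history = []
    for x in xs:
        if prev_x is not None:
            # Both right-hand sides use the previous p and dp.
            p, dp = (
                0.5 * (1.0 - alpha) + alpha * prev_x / 3.0 + 2.0 * alpha * p / 3.0,
                -0.5 + prev_x / 3.0 + 2.0 * p / 3.0 + 2.0 * alpha * dp / 3.0,
            )
        error = x - p                                  # prediction error
        alpha += learning_rate * error * dp            # delta-rule step
        alpha = min(1.0, max(0.0, alpha))              # keep a valid probability
        history.append(alpha)
        prev_x = x
    return history


if __name__ == "__main__":
    estimates = learn_alpha([1] * 50 + [0, 1] * 25)
    print("final alpha estimate:", round(estimates[-1], 3))
```

Only a prediction, an error, and a running gradient are stored, which illustrates the slide's point that explicit probability distributions need not be represented.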

  35. Learning Results [Figure: Bayesian learning and stochastic gradient descent, each plotted over trials]

  36. Summary • Hypothesis: “superstition” reflects adaptation to a changing world • Exponential “memory” is near-optimal and fits behavior; linear RT • Neurobiology: leaky integration, stochastic δ-rule, neuromodulation • Random sequences and changing biases are hard to distinguish • Questions: multiple outcomes? Explicit versus implicit prediction?

  37. Unlearning Temporal Correlation is Slow (see Bialek, 2005) [Figure: marginal posterior over α and marginal posterior over γ; probability vs. trials]

  38. Insight from Brain’s “Mistakes”. Ex: visual illusions (Adelson, 1995)

  39. Insight from Brain’s “Mistakes”. Ex: visual illusions (Adelson, 1995): lightness, depth, context. Neural computation is specialized for natural problems.

  40. Discount Rate vs. Assumed Rate of Change • Exact inference is non-linear • Linear approximation • Empirical distribution • Iterative form of linear exponential

  41. Bayesian Inference • Generative model (what subject “knows”) • Optimal prediction (Bayes’ Rule): posterior • Coding: 1 = repetition, 0 = alternation

  42. Bayesian Inference • Generative model (what subject “knows”) • Optimal prediction (Bayes’ Rule)
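As a sketch of the inference named on slides 41 and 42, one standard way to write it for a changing-bias model is given below. The notation is assumed rather than taken from the slides: γ_t is the bias on trial t, α is the probability that the bias is unchanged, p_0 is the prior from which a new bias is drawn, and x_t is 1 for a repetition and 0 for an alternation. The talk's exact formulation may differ.

```latex
\begin{aligned}
\text{Generative model:}\quad
  & p(\gamma_t \mid \gamma_{t-1}) = \alpha\,\delta(\gamma_t - \gamma_{t-1}) + (1-\alpha)\,p_0(\gamma_t),
  \qquad P(x_t \mid \gamma_t) = \gamma_t^{\,x_t}(1-\gamma_t)^{1-x_t} \\
\text{Prior for trial } t\text{:}\quad
  & p(\gamma_t \mid x_{1:t-1}) = \alpha\,p(\gamma_{t-1} = \gamma_t \mid x_{1:t-1}) + (1-\alpha)\,p_0(\gamma_t) \\
\text{Posterior (Bayes' rule):}\quad
  & p(\gamma_t \mid x_{1:t}) \propto P(x_t \mid \gamma_t)\,p(\gamma_t \mid x_{1:t-1}) \\
\text{Prediction:}\quad
  & P(x_t = 1 \mid x_{1:t-1}) = \int_0^1 \gamma_t\,p(\gamma_t \mid x_{1:t-1})\,d\gamma_t
\end{aligned}
```

Setting α = 1 removes the redraw term and leaves ordinary Bayesian updating of a single fixed bias.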

  43. Power-Law Decay of Memory • Human memory and natural (language) statistics (Anderson & Schooler, 1991) • Hierarchical Chinese Restaurant Process (Teh, 2006): a stationary process!

  44. Ties Across Time, Space, and Modality • time: sequential effects (RT) • space: Eriksen (SSHSS) • modality: Stroop (GREEN) (Yu, Dayan, Cohen, JEP: HPP 2008) (Liu, Yu, & Holmes, Neur Comp 2008)

  45. Sequential Effects → Perceptual Discrimination [Diagram: DBM, PFC, R, A] Optimal discrimination (Wald, 1947) • discrete time: SPRT • continuous time: DDM (Gold & Shadlen, Neuron 2002; Yu & Dayan, NIPS 2005; Yu, NIPS 2007; Frazier & Yu, NIPS 2008)

  46. Exponential Discounting for Changing Rewards (Sugrue, Corrado, & Newsome, 2004) [Figure: coefficients vs. trials into past for Monkey F and Monkey G; fitted values .72 and .63]

  47. Human & Monkey Share Assumptions? [Figure: coefficients vs. trials into past for human, Monkey F, and Monkey G; panel values .72, .63, .80, and .68]

  48. Simulation Results: learning via stochastic δ-rule [Figure: x-axis trials]
