Exploring cultural transmission by iterated learning
Tom Griffiths, Brown University
Mike Kalish, University of Louisiana
With thanks to: Anu Asnaani, Brian Christian, and Alana Firl
Cultural transmission
• Most knowledge is based on secondhand data
• Some things can only be learned from others
  • cultural knowledge transmitted across generations
• What are the consequences of learners learning from other learners?
Iterated learning (Kirby, 2001)
Each learner sees data, forms a hypothesis, and produces the data given to the next learner.
Objects of iterated learning
• Knowledge communicated through data
• Examples:
  • religious concepts
  • social norms
  • myths and legends
  • causal theories
  • language
Analyzing iterated learning
Learning and production alternate along the chain:
• PL(h|d): probability of inferring hypothesis h from data d
• PP(d|h): probability of generating data d from hypothesis h
Analyzing iterated learning
What are the consequences of iterated learning?
Prior analyses trade off the complexity of the learning algorithm against whether results are analytic or come from simulations (Kirby, 2001; Brighton, 2002; Komarova, Niyogi, & Nowak, 2002; Smith, Kirby, & Brighton, 2003).
Bayesian inference
• Rational procedure for updating beliefs
• Foundation of many learning algorithms
• Widely used for language learning
(Image: Reverend Thomas Bayes)
Bayes' theorem
p(h|d) = p(d|h) p(h) / Σh' p(d|h') p(h')
• h: hypothesis, d: data
• p(h|d): posterior probability; p(d|h): likelihood; p(h): prior probability
• The denominator sums over the space of hypotheses
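A tiny numerical illustration of the theorem (the two hypotheses and their probabilities are made up):

```python
# Minimal numeric illustration of Bayes' theorem (hypothetical numbers).
prior = {"h1": 0.7, "h2": 0.3}               # p(h)
likelihood = {"h1": 0.2, "h2": 0.9}          # p(d|h) for one observed d

# Denominator: sum over the space of hypotheses.
evidence = sum(prior[h] * likelihood[h] for h in prior)

# Posterior: p(h|d) = p(d|h) p(h) / sum over h' of p(d|h') p(h')
posterior = {h: prior[h] * likelihood[h] / evidence for h in prior}
print(posterior)   # {'h1': ~0.34, 'h2': ~0.66}
```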
Iterated Bayesian learning
Learners are Bayesian agents: each infers a hypothesis from the data it sees via PL(h|d), then generates data for the next learner via PP(d|h).
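A minimal simulation sketch of this loop, assuming learners who sample a hypothesis from the posterior; the two hypotheses, two data points, and all probabilities are invented for illustration:

```python
import random
random.seed(0)

# Hypothetical two-hypothesis, two-data-point model (numbers are made up).
hypotheses = ["h1", "h2"]
prior = {"h1": 0.7, "h2": 0.3}                      # p(h)
production = {"h1": {"d1": 0.8, "d2": 0.2},         # PP(d|h)
              "h2": {"d1": 0.3, "d2": 0.7}}

def learn(d):
    """PL(h|d): sample a hypothesis from the Bayesian posterior given data d."""
    weights = [prior[h] * production[h][d] for h in hypotheses]
    return random.choices(hypotheses, weights=weights)[0]

def produce(h):
    """PP(d|h): sample data from the learner's chosen hypothesis."""
    data = list(production[h])
    return random.choices(data, weights=[production[h][d] for d in data])[0]

# Run a long chain of learners, counting how often each hypothesis is chosen.
d = "d1"
counts = {"h1": 0, "h2": 0}
for _ in range(20000):
    h = learn(d)
    counts[h] += 1
    d = produce(h)

total = sum(counts.values())
print({h: round(counts[h] / total, 2) for h in hypotheses})  # close to the prior: ~0.7 / ~0.3
```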
Markov chains
• Variables x(t+1) are independent of history given x(t)
• Transition matrix: T = P(x(t+1)|x(t))
• Converges to a stationary distribution under easily checked conditions for ergodicity
Stationary distributions
• Stationary distribution: π(x) = Σx' T(x|x') π(x')
• In matrix form: π = Tπ
• π is the first eigenvector of the matrix T (eigenvalue 1)
• The second eigenvalue sets the rate of convergence
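A small numerical sketch of these points (the 2-state transition matrix is invented; assumes NumPy):

```python
import numpy as np

# Hypothetical 2-state transition matrix, T[i, j] = P(x(t+1)=i | x(t)=j).
# Columns sum to 1.
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])

# The stationary distribution is the eigenvector of T with eigenvalue 1.
eigenvalues, eigenvectors = np.linalg.eig(T)
idx = np.argmax(eigenvalues.real)          # eigenvalue 1 is the largest
pi = eigenvectors[:, idx].real
pi = pi / pi.sum()                         # normalize to a probability vector
print(pi)                                  # [0.667, 0.333] for this T

# The second-largest eigenvalue governs how fast the chain converges.
print(sorted(eigenvalues.real, reverse=True)[1])   # 0.7
```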
Analyzing iterated learning
Iterated learning defines three Markov chains:
• A Markov chain on hypotheses: h1 → h2 → h3 → …, with transition probability Σd PP(d|h) PL(h'|d)
• A Markov chain on data: d0 → d1 → d2 → …, with transition probability Σh PL(h|d) PP(d'|h)
• A Markov chain on hypothesis-data pairs: (h1,d1) → (h2,d2) → (h3,d3) → …
Stationary distributions
• The Markov chain on h converges to the prior, p(h)
• The Markov chain on d converges to the "prior predictive distribution", p(d) = Σh p(d|h) p(h)
• The Markov chain on (h,d) is a Gibbs sampler for the joint distribution p(d,h) = p(d|h) p(h)
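A sketch that checks the first claim numerically on a tiny invented model (the prior and PP values are made up; assumes NumPy): build the transition matrix of the chain on hypotheses, T(h'|h) = Σd PP(d|h) PL(h'|d), and confirm that its stationary distribution is the prior.

```python
import numpy as np

# Hypothetical model: 2 hypotheses, 2 data points.
prior = np.array([0.7, 0.3])                       # p(h)
P_P = np.array([[0.8, 0.3],                        # PP(d|h): rows index d, columns index h
                [0.2, 0.7]])

# Bayesian learner: PL(h|d) proportional to PP(d|h) p(h).
joint = P_P * prior                                # joint[d, h] = PP(d|h) p(h)
P_L = joint / joint.sum(axis=1, keepdims=True)     # normalize over h for each d

# Chain on hypotheses: T[h_new, h_old] = sum over d of PL(h_new|d) PP(d|h_old).
T = P_L.T @ P_P

# Its stationary distribution recovers the prior.
eigenvalues, eigenvectors = np.linalg.eig(T)
pi = eigenvectors[:, np.argmax(eigenvalues.real)].real
print(pi / pi.sum())                               # [0.7, 0.3] = the prior
```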
Implications
• The probability that the nth learner entertains the hypothesis h approaches p(h) as n → ∞
• Convergence to the prior occurs regardless of:
  • the properties of the hypotheses themselves
  • the amount or structure of the data transmitted
• The consequences of iterated learning are determined entirely by the biases of the learners
Identifying inductive biases
• Many problems in cognitive science can be formulated as problems of induction
  • learning languages, concepts, and causal relations
• Such problems are not solvable without bias (e.g., Goodman, 1955; Kearns & Vazirani, 1994; Vapnik, 1995)
• What biases guide human inductive inferences?
If iterated learning converges to the prior, then it may provide a method for investigating those biases.
Serial reproduction (Bartlett, 1932)
• Participants see stimuli, then reproduce them from memory
• Reproductions of one participant are stimuli for the next
• Stimuli were interesting, rather than controlled
  • e.g., "War of the Ghosts"
Iterated function learning (heavy lifting by Mike Kalish)
• Each learner sees a set of (x, y) pairs
• Makes predictions of y for new x values
• Predictions are the data for the next learner (see the sketch below)
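A rough simulation sketch of this design; the straight-line learner and Gaussian noise are illustrative assumptions, not the learners or procedure used in the experiments (assumes NumPy):

```python
import numpy as np
rng = np.random.default_rng(0)

x = np.linspace(0, 1, 20)   # stimulus values (same grid each generation, for simplicity)

def learner(y_seen):
    """Illustrative learner: fit a straight line to the (x, y) pairs it saw,
    then produce slightly noisy predictions for the next learner."""
    slope, intercept = np.polyfit(x, y_seen, deg=1)
    return slope * x + intercept + rng.normal(scale=0.05, size=x.shape)

# Initial data: a nonlinear function; each learner's predictions become
# the training data for the next learner in the chain.
y = x ** 2
for iteration in range(1, 10):
    y = learner(y)
    print(iteration, np.round(y[:4], 2))
```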
Function learning experiments
On each trial, participants see a stimulus, respond with a slider, and receive feedback.
Examine iterated learning with different initial data.
(Figure: functions produced at iterations 1 through 9 of each chain, for different sets of initial data)
Iterated concept learning (heavy lifting by Brian Christian)
• Each learner sees examples from a species
• Identifies the species of four amoebae
• Iterated learning is run within-subjects
Two positive examples
(Figure: the data d are two example amoebae; the hypotheses h are candidate sets of amoebae)
Bayesian model (Tenenbaum, 1999; Tenenbaum & Griffiths, 2001)
• d: a set of m amoebae (here m = 2); h: a set of |h| amoebae (here |h| = 4)
• Size-principle likelihood: P(d|h) = 1/|h|^m if every example in d falls in h, and 0 otherwise
• Since every hypothesis has the same size, the posterior is the renormalized prior over the hypotheses consistent with d
• What is the prior?
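A small sketch of this size-principle posterior over an invented hypothesis space (the three hypotheses and the prior values below are made up for illustration):

```python
# Each hypothesis is a set of amoeba indices; the prior values are hypothetical.
hypotheses = {"A": {0, 1, 2, 3}, "B": {0, 1, 4, 5}, "C": {2, 3, 4, 5}}
prior = {"A": 0.6, "B": 0.3, "C": 0.1}

def posterior(d):
    """P(h|d): prior renormalized over hypotheses containing all examples,
    with likelihood 1/|h|^m (here |h| = 4 for every h, so it cancels)."""
    m = len(d)
    scores = {h: prior[h] / len(members) ** m if d <= members else 0.0
              for h, members in hypotheses.items()}
    total = sum(scores.values())
    return {h: s / total for h, s in scores.items()}

print(posterior({0, 1}))   # only A and B are consistent: {'A': ~0.67, 'B': ~0.33, 'C': 0.0}
```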
Classes of concepts (Shepard, Hovland, & Jenkins, 1961)
Stimuli vary on three binary dimensions (color, size, shape); concepts fall into six classes (Class 1 through Class 6).
Experiment design (for each subject)
• 6 iterated learning chains, one per class of concept (Class 1 through Class 6)
• 6 independent learning "chains", one per class of concept (Class 1 through Class 6)
Estimating the prior
(Figure: example data d and hypotheses h used to estimate the prior)
Estimating the prior
Estimated prior probabilities: Class 1: 0.861, Class 2: 0.087, Class 3: 0.009, Class 4: 0.002, Class 5: 0.013, Class 6: 0.028
Bayesian model vs. human subjects: r = 0.952
Two positive examples (n = 20)
(Figure: probability of each class by iteration, for human learners and the Bayesian model)
Two positive examples (n = 20)
(Figure: human learners compared with the Bayesian model)
Three positive examples
(Figure: the data d are three example amoebae; the hypotheses h are candidate sets of amoebae)
Three positive examples (n = 20)
(Figure: probability of each class by iteration, for human learners and the Bayesian model)
Three positive examples (n = 20)
(Figure: human learners compared with the Bayesian model)
Conclusions
• The consequences of iterated learning with Bayesian learners are determined by the biases of the learners
• Consistent results are obtained with human learners
• Provides an explanation for cultural universals…
  • universal properties are probable under the prior
  • a direct connection between mind and culture
• …and a novel method for evaluating the inductive biases that guide human learning
Discovering the biases of models
Generic neural network: (results figure)
Discovering the biases of models
EXAM (DeLosh, Busemeyer, & McDaniel, 1997): (results figure)
Discovering the biases of models
POLE (Kalish, Lewandowsky, & Kruschke, 2004): (results figure)
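These three slides suggest a general recipe: run a model learner through the same iterated learning loop and see which functions it drifts toward, since those reveal its inductive bias. A minimal sketch with a stand-in polynomial learner (not the neural network, EXAM, or POLE models themselves; assumes NumPy):

```python
import numpy as np
rng = np.random.default_rng(1)

x = np.linspace(-1, 1, 25)

def model_learner(y_seen):
    """Stand-in model: fits a cubic polynomial, then reproduces it with noise."""
    coeffs = np.polyfit(x, y_seen, deg=3)
    return np.polyval(coeffs, x) + rng.normal(scale=0.1, size=x.shape)

# Start the chain from a hard, highly nonlinear target and iterate the model on
# its own output; the family of functions it settles into reveals its bias.
y = np.sin(3 * np.pi * x)
for step in range(30):
    y = model_learner(y)
print(np.round(y, 2))   # after one pass the function is already a cubic; later passes drift within that family
```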