Dynamical Cognition 2010: New Approach to Some Tough Old Problems • Simon D. Levy • Washington & Lee University • Lexington, Virginia, USA
Inspiration, 1995-present • [I]t turns out that we don’t think the way we think we think! ... The scientific evidence coming in all around us is clear: Symbolic conscious reasoning, which is extracted through protocol analysis from serial verbal introspection, is a myth. − J. Pollack (2005) • [W]hat kinds of things suggested by the architecture of the brain, if we modeled them mathematically, could give some properties that we associate with mind? − P. Kanerva (2009) • “... a fresh coat of paint on old rotting theories.” − B. MacLennan (1991)
The Need for New Representational Principles • Ecological affordances (Gibson 1979); exploiting the environment (Clark 1998) • Distributed/Connectionist Representations (PDP 1986) • Holographic Representations (Gabor 1971; Plate 2003) • Fractals / Attractors / Dynamical Systems (Tabor 2000; Levy & Pollack 2001)
Pitfalls to Avoid • 1. The “Short Circuit” (Localist Connectionist) Approach • Traditional models of phenomenon X (language) use entities A, B, C, ... (Noun Phrase, Phoneme, ...) • We wish to model X in a more biologically realistic way. • Therefore our model of X will have a neuron (pool) for A, one for B, one for C, etc.
E.g. Neural Blackboard Model (van der Velde & de Kamps 2006)
Benefits of Localism (Page 2000) • Transparent (one node, one concept) • Supports lateral inhibition / winner-take-all (see the sketch below)
Lateral Inhibition (WTA) • [diagram: localist units A, B, C competing via inhibitory connections between layers L1 and L2]
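A minimal sketch of winner-take-all via lateral inhibition; the inhibition strength beta, the rectification, and the unit values are illustrative assumptions, not taken from Page (2000):

```python
import numpy as np

def wta(x, beta=0.2, steps=50):
    """Iterate lateral inhibition until one unit dominates.

    Each unit is suppressed in proportion (beta) to the summed
    activity of all the other units; activations are rectified at 0.
    """
    x = np.asarray(x, dtype=float)
    for _ in range(steps):
        inhibition = beta * (x.sum() - x)   # input from every other unit
        x = np.maximum(0.0, x - inhibition)
    return x

# Units A, B, C with slightly different initial support: B wins outright.
print(wta([1.0, 1.1, 0.9]))                 # -> approximately [0, 0.3, 0]
```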
Problems with Localism • Philosophical problem: “fresh coat of paint on old rotting theories” (MacLennan 1991): what new insights does “neuro-X” provide? • Engineering problem: recruiting new hardware for each new concept/combination leads to a combinatorial explosion (Stewart & Eliasmith 2008)
The Appeal of Distributed Representations (Rumelhart & McClelland 1986)
[examples from the past-tense model: walk → walked, roar → roared, speak → spoke, go → went]
Mary won’t give John the time of day. ↔ ignores(mary, john)
I. The Binding Problem
II. The Problem of Two
III. The Problem of Variables • ignores(X, Y) ↔ X won’t give Y the time of day.
Vector Symbolic Architectures (Plate 1991; Kanerva 1994; Gayler 1998)
Bundling • [diagram: two vectors combined by element-wise addition into a single superposed vector; see the sketch below]
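Bundling is easy to demonstrate with random bipolar (±1) vectors, the encoding used throughout these slides; a minimal numpy sketch, where the dimension and the sign-based renormalization are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000                              # high dimensionality matters (see "Scaling Up")
vec = lambda: rng.choice([-1, 1], size=D)

a, b, c, unrelated = vec(), vec(), vec(), vec()
bundle = np.sign(a + b + c)             # element-wise sum; sign keeps the code bipolar

sim = lambda u, v: (u @ v) / D          # normalized dot product, in [-1, 1]
print(sim(bundle, a), sim(bundle, b))   # ~0.5 each: the members are recoverable
print(sim(bundle, unrelated))           # ~0.0: non-members look orthogonal
```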
Cleanup • Hebbian / Hopfield / Attractor Net: maps a noisy vector back to the nearest stored item (sketch below)
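A minimal stand-in for cleanup memory, using one-shot nearest-neighbor lookup where a Hopfield/attractor net would settle to the same answer; the stored items and noise level are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000
memory = {name: rng.choice([-1, 1], size=D)   # the stored "clean" item vectors
          for name in ("john", "mary", "kiss")}

def cleanup(noisy):
    """Return the name of the stored vector most similar to the input."""
    return max(memory, key=lambda name: noisy @ memory[name])

noisy_john = memory["john"] + rng.standard_normal(D)  # heavily corrupted copy
print(cleanup(noisy_john))                            # -> 'john'
```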
Variables • [diagram: the variable X bound to the filler john]
Scaling Up • With many (> 10K) dimensions, we get • An astronomically large number of nearly orthogonal vectors (symbols) • Surprising robustness to noise (demo below)
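Both claims are easy to check empirically; a quick sketch, with the dimension and sample count arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000
vecs = rng.choice([-1, 1], size=(100, D))   # 100 random bipolar "symbols"

sims = (vecs @ vecs.T) / D                  # normalized pairwise similarities
off_diag = sims[~np.eye(100, dtype=bool)]
print(off_diag.std())                       # ~0.01 = 1/sqrt(D)
print(np.abs(off_diag).max())               # tiny even over all 9,900 pairs
```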
Pitfalls to Avoid • 2. The Homunculus Problem, a.k.a. the Ghost in the Machine (Ryle 1949) • In cognitive modeling, the homunculus is the researcher: supervises learning, hand-builds representations, etc.
Step I: Automatic Variable Substitution • If A is a vector over {+1,-1}, then A*A = vector of 1’s (multiplicative identity) • Supports substitution of anything for anything: everything (names, individuals, structures, propositions) can be a variable!
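A minimal demonstration of the self-inverse property, with element-wise multiplication as the binding operator:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000
A = rng.choice([-1, 1], size=D)
X = rng.choice([-1, 1], size=D)

print(np.all(A * A == 1))      # True: every +/-1 vector is its own inverse
bound = X * A                  # bind the "variable" X to the "filler" A
print(np.all(bound * X == A))  # True: multiplying by X again recovers A exactly
```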
“What is the Dollar of Mexico?” (Kanerva 2009) • Let X = <country>, Y = <currency>, U = <USA-name>, D = <dollar>, M = <Mexico-name>, P = <peso> • Then A = <USA> = X*U + Y*D and B = <Mexico> = X*M + Y*P • D*A*B = • D*(X*U + Y*D) * (X*M + Y*P) = • (D*X*U + D*Y*D) * (X*M + Y*P) = • (D*X*U + Y) * (X*M + Y*P) = • D*X*U*X*M + D*X*U*Y*P + Y*X*M + Y*Y*P = • P + noise (since X*X = Y*Y = 1; every other term is noise)
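The derivation runs end to end with random bipolar vectors; a sketch, with all vectors random and cleanup done by normalized dot product:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 10_000
v = {n: rng.choice([-1, 1], size=dim)
     for n in ("X", "Y", "U", "D", "M", "P")}  # country, currency, USA, dollar, Mexico, peso

A = v["X"] * v["U"] + v["Y"] * v["D"]   # USA    = country*USA-name    + currency*dollar
B = v["X"] * v["M"] + v["Y"] * v["P"]   # Mexico = country*Mexico-name + currency*peso

query = v["D"] * A * B                  # unbind: "the dollar of Mexico"
sims = {n: (query @ v[n]) / dim for n in v}
print(max(sims, key=sims.get))          # -> 'P' (peso); everything else is noise
```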
Learning Grammatical Constructions from a Single Example (Levy 2010) • Given • Meaning: kiss(mary, john) • Form: Mary kissed John • Lexicon: kiss/kiss, mary/Mary, ... • What is the form for hit(bill, fred) ?
Learning Grammatical Constructions from a Single Example (Levy 2010) (ACTION*KISS + AGENT*MARY + PATIENT*JOHN) * (P1*Mary + P2*kissed + P3*John) * (KISS*kissed + MARY*Mary + JOHN*John + BILL*Bill + FRED*Fred + HIT*hit) * (ACTION*HIT + AGENT*BILL + PATIENT*FRED) = .... = (P1*Bill + P2*hit + P3*Fred) + noise
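A sketch of the same algebra in numpy. The names follow the slide; the final two cleanup steps (unbind a position, clean up to a meaning filler, then translate through the lexicon) are one plausible way to finish the retrieval summarized by the slide's last line, not necessarily Levy's (2010) exact procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000
vec = lambda: rng.choice([-1, 1], size=D)

roles   = {n: vec() for n in ("ACTION", "AGENT", "PATIENT")}
fillers = {n: vec() for n in ("KISS", "MARY", "JOHN", "HIT", "BILL", "FRED")}
slots   = {n: vec() for n in ("P1", "P2", "P3")}
tokens  = {n: vec() for n in ("kissed", "Mary", "John", "hit", "Bill", "Fred")}

pairs = [("KISS", "kissed"), ("MARY", "Mary"), ("JOHN", "John"),
         ("HIT", "hit"), ("BILL", "Bill"), ("FRED", "Fred")]
lexicon = sum(fillers[f] * tokens[t] for f, t in pairs)

m1 = (roles["ACTION"] * fillers["KISS"] + roles["AGENT"] * fillers["MARY"]
      + roles["PATIENT"] * fillers["JOHN"])            # kiss(mary, john)
f1 = (slots["P1"] * tokens["Mary"] + slots["P2"] * tokens["kissed"]
      + slots["P3"] * tokens["John"])                  # "Mary kissed John"
m2 = (roles["ACTION"] * fillers["HIT"] + roles["AGENT"] * fillers["BILL"]
      + roles["PATIENT"] * fillers["FRED"])            # hit(bill, fred)

q = m1 * f1 * lexicon * m2       # the product from the slide

best = lambda noisy, table: max(table, key=lambda n: noisy @ table[n])
for slot in ("P1", "P2", "P3"):
    meaning = best(slots[slot] * q, fillers)               # e.g. P1 -> BILL
    print(slot, best(lexicon * fillers[meaning], tokens))  # BILL -> Bill, etc.
```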
Step II: Distributed “Lateral Inhibition” • Analogical mapping as holistic graph isomorphism (Gayler & Levy 2009); cf. Pelillo (1999) • [diagram: two graphs to be mapped onto each other, vertices A–D and P–S]
[diagram: candidate correspondences between vertices A–D and P–S]
Possibilities x: A*P + A*Q + A*R + A*S + ... + D*S
Evidence w: A*B*P*Q + A*B*P*R + ... + B*C*Q*R + ... + C*D*R*S
x*w = A*Q + B*R + ... + A*P + ... + D*S
What kind of “program” could work with these “data structures” to yield a single consistent mapping?
Replicator Equations • Starting at some initial state (typically x_i = 1/N, corresponding to all candidates being equally supported as part of the solution), x can be obtained through iterative application of the following equation:

x_i(t+1) = x_i(t) π_i(t) / Σ_j x_j(t) π_j(t)

where π_i(t) = Σ_j w_ij x_j(t), and w is a linear function of the adjacency matrix of the association graph (the “evidence matrix”).
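The update is a few lines of numpy; the four-candidate evidence matrix below is a toy illustration, not Pelillo's benchmark:

```python
import numpy as np

def replicator(w, steps=100):
    """Discrete replicator dynamics: x_i <- x_i * pi_i / sum_j x_j * pi_j."""
    x = np.full(len(w), 1.0 / len(w))   # all candidates equally supported at first
    for _ in range(steps):
        payoff = w @ x                  # pi_i = sum_j w_ij * x_j
        x = x * payoff / (x @ payoff)
    return x

# Toy evidence matrix over four candidate mappings: 0, 2, and 3 are mutually
# consistent; candidate 1 is consistent with nothing.
w = np.array([[0, 0, 1, 1],
              [0, 0, 0, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]], dtype=float)
print(replicator(w))                    # -> [1/3, 0, 1/3, 1/3]
```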
Replicator Equations • Origins in Evolutionary Game Theory (Maynard Smith 1982) • x_i is a strategy (belief in a strategy) • π_i is the overall payoff from that strategy • w_ij is the utility of playing strategy i against strategy j • Can be interpreted as a continuous inference equation whose discrete-time version has a formal similarity to Bayesian inference (Harper 2009)
Localist Implementation Results (Pelillo 1999)
VSA “Lateral Inhibition” Circuit (Levy & Gayler 2009) • [circuit diagram: x_t is bound with the evidence vector w to produce π_t, which is normalized by its sum and combined with x_t; a cleanup stage yields x_t+1]
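A sketch of the idea behind the circuit, using the element-wise multiplicative binding from the earlier slides: π_t is computed in superposition by binding x_t with w, and the candidate coefficients are read out by cleanup against the candidate-mapping vectors. This illustrates the principle rather than reproducing Levy & Gayler's (2009) exact circuit:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000
vec = lambda: rng.choice([-1, 1], size=D)

# One random vector per candidate mapping; same toy consistency structure
# as the replicator example above (candidates 0, 2, and 3 mutually consistent).
m = np.array([vec() for _ in range(4)])
W = np.array([[0, 0, 1, 1], [0, 0, 0, 0], [1, 0, 0, 1], [1, 0, 1, 0]])
w = sum(m[i] * m[j] for i in range(4) for j in range(i + 1, 4) if W[i, j])

c = np.full(4, 0.25)                  # coefficients of the superposition x_t
for _ in range(10):
    x = c @ m                         # x_t: all candidate mappings, superposed
    payoff = np.maximum(m @ (x * w) / D, 0)   # bind with w, clean up: pi_i
    c = c * payoff / (c @ payoff)     # replicator update on the coefficients
print(c.round(2))                     # mass settles on the consistent set {0, 2, 3}
```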