Attendee questionnaire
• Name
• Affiliation/status
• Area of study/research
• For each of these subjects:
  • Linguistics (Optimality Theory)
  • Computation (connectionism/neural networks)
  • Philosophy (symbolic/connectionist debate)
  • Psychology (infant phonology)
  please indicate your relative level of
  • interest (for these lectures) [1 = least, 5 = most]
  • background [1 = none, 5 = expert]
Thank you
Optimality in Cognition and Grammar
Paul Smolensky
Cognitive Science Department, Johns Hopkins University

Plan of lectures
• Cognitive architecture
• Symbols and neurons
• Symbols in neural networks
• Optimization in neural networks
• Optimization in grammar I: HG → OT
• Optimization in grammar II: OT
• OT in neural networks
Cognitive architecture
• Central dogma of cognitive science: Cognition is computation
• But what type of computation?
• What exactly is computation, and what work must it do in cognitive science?
Computation
• Cognitive functions:
  • Pixels → objects, locations [low- to high-level vision]
  • Sound stream → word string [phonetics + …]
  • Word string → parse tree [syntax]
  • Underlying form → surface form [phonology]
    • petit copain: /pətit + kopɛ̃/ → [pə.ti.ko.pɛ̃]
    • petit ami: /pətit + ami/ → [pə.ti.ta.mi]
• Reduction of complex procedures for evaluating functions to combinations of primitive operations
• Computational architecture:
  • Operations: primitives + combinators
  • Data
Symbolic Computation
• Computational architecture:
  • Operations: primitives + combinators
  • Data
• The Pure Symbolic Architecture (PSA)
  • Data: strings, (binary) trees, graphs, …
  • Operations
    • Primitives
      • Concatenate(string, tree) = cons
      • First-member(string); left-subtree(tree) = ex0
    • Combinators
      • Composition: f(x) =def g(h(x))
      • IF(x = A) THEN … ELSE …
[Figure: ƒPassive maps the LF tree for admire(George, few leaders) to the surface tree for "Few leaders are admired by George" (V, A(gent), P(atient), Aux, by nodes)]
ƒ(s) = cons(ex1(ex0(ex1(s))), cons(ex1(ex1(ex1(s))), ex0(s)))
• But for cognition, we need a reduction to a very different computational architecture
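As an illustration, here is a minimal sketch of the PSA primitives and the composed function ƒ: trees are nested pairs, and the leaf labels below are placeholders for illustration, not the actual LF/surface analyses in the figure.

```python
def cons(x, y):
    """Combine two constituents into a binary tree [x y]."""
    return (x, y)

def ex0(t):
    """Extract the left subtree (first member)."""
    return t[0]

def ex1(t):
    """Extract the right subtree."""
    return t[1]

def f_passive(s):
    """f(s) = cons(ex1(ex0(ex1(s))), cons(ex1(ex1(ex1(s))), ex0(s)))"""
    return cons(ex1(ex0(ex1(s))),
                cons(ex1(ex1(ex1(s))), ex0(s)))

# Any input of the form (s0, ((s100, s101), (s110, s111))) is rearranged
# into (s101, (s111, s0)) by the composed primitives:
s = ('A', (('B', 'C'), ('D', 'E')))
print(f_passive(s))   # ('C', ('E', 'A'))
```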
The cognitive architecture: The connectionist hypothesis (PDP computation)
At the lowest computational level of the mind/brain:
• Representations: distributed activation patterns
• Primitive operations (e.g.):
  • Multiplication of activations by synaptic weights
  • Summation of weighted activation values
  • Non-linear transfer functions
• Combination: massive parallelism
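A minimal sketch of these primitive operations; the tanh transfer function and the numbers here are illustrative assumptions.

```python
import numpy as np

def unit_update(activations, weights):
    """One unit: multiply incoming activations by its synaptic weights,
    sum them, and pass the net input through a non-linear transfer (tanh)."""
    net_input = np.dot(weights, activations)   # multiply + sum
    return np.tanh(net_input)                  # non-linear transfer

def layer_update(activations, weight_matrix):
    """Massive parallelism: every unit updates at once
    (one matrix-vector product followed by the transfer function)."""
    return np.tanh(weight_matrix @ activations)

a = np.array([0.2, -0.5, 0.9])                 # a distributed activation pattern
W = np.array([[0.1, 0.4, -0.3],
              [0.7, 0.0, 0.2]])                # synaptic weights
print(unit_update(a, W[0]), layer_update(a, W))
```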
Criticism of PDP (e.g., from neuroscientists)
• "Much too simple"
• This complaint is misguided: it rests on a confusion between two questions (next slides)
• The relevant complaint is the opposite, "much too complex": the target of the computational reduction must lie within the scope of neural computation
The cognitive question for neuroscience
What is the function of each component of the nervous system?
Our question is quite different.
The neural question for cognitive science
How are complex cognitive functions computed by a mass of numerical processors like neurons—each very simple, slow, and imprecise relative to the components that have traditionally been used to construct powerful, general-purpose computational systems? How does the structure arise that enables such a medium to achieve cognitive computation?
The ICS Hypothesis
The Integrated Connectionist/Symbolic Cognitive Architecture (ICS)
• In higher cognitive domains, representations and functions are well approximated by symbolic computation
• The Connectionist Hypothesis is correct
• Thus, cognitive theory must supply a computational reduction of symbolic functions to PDP computation
[Figure: PassiveNet, a connectionist network whose input units encode the LF tree (Agent, Patient, …) and whose output units encode the passive surface tree (Aux, by, Agent, Patient), connected by a weight matrix W]
The ICS Isomorphism
[Figure: the passive mapping shown at two levels: symbolic trees (LF → surface, with V, A, P, Aux, by) and the PassiveNet weight matrix W; tensor product representations link the two levels, and tensorial networks compute the mapping]
Within-level compositionality:
ƒ(s) = cons(ex1(ex0(ex1(s))), cons(ex1(ex1(ex1(s))), ex0(s)))
Between-level reduction:
W = W_cons0·[W_ex1·W_ex0·W_ex1] + W_cons1·[W_cons0·(W_ex1·W_ex1·W_ex1) + W_cons1·(W_ex0)]
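A minimal sketch of this between-level reduction, restricted to depth-1 trees so it stays short: the binding/unbinding primitives become matrices acting on tensor product representations, and a composed symbolic function becomes a single weight matrix. The role vectors are the ones introduced later in these slides; the filler dimension and the toy function g([x y]) = [y x] are assumptions (ƒPassive itself requires depth-3 trees).

```python
import numpy as np

n = 3                                    # filler dimensionality (assumed)
I = np.eye(n)
r0, r1 = np.array([1., 1.]), np.array([1., -1.])        # role vectors
u0, u1 = np.array([.5, .5]), np.array([.5, -.5])        # dual (unbinding) vectors

# Primitives as matrices on the flattened filler ⊗ role space
W_cons0 = np.kron(I, r0.reshape(2, 1))   # x  ->  x ⊗ r0
W_cons1 = np.kron(I, r1.reshape(2, 1))   # y  ->  y ⊗ r1
W_ex0   = np.kron(I, u0.reshape(1, 2))   # x ⊗ r0 + y ⊗ r1  ->  x
W_ex1   = np.kron(I, u1.reshape(1, 2))   # x ⊗ r0 + y ⊗ r1  ->  y

# Within-level:  g([x y]) = cons(ex1(s), ex0(s))   (swap the two constituents)
# Between-level: W_g = W_cons0 · W_ex1 + W_cons1 · W_ex0
W_g = W_cons0 @ W_ex1 + W_cons1 @ W_ex0

x, y = np.array([1., 0., 2.]), np.array([0., -1., 1.])
psi = np.kron(x, r0) + np.kron(y, r1)            # realization of [x y]
psi_swapped = np.kron(y, r0) + np.kron(x, r1)    # realization of [y x]
print(np.allclose(W_g @ psi, psi_swapped))       # True
```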
The ICS Architecture
[Figure: roadmap diagram: a symbolic function ƒ (e.g., dog+s → "dogs" [dogz]) specified by a grammar G and computed by a connectionist network A over syllable-structure representations (σ, k, æ, t); this slide highlights Processing (Learning)]
Processing I: Activation
• Computational neuroscience
• Key sources:
  • Hopfield 1982, 1984
  • Cohen and Grossberg 1983
  • Hinton and Sejnowski 1983, 1986
  • Smolensky 1983, 1986
  • Geman and Geman 1984
  • Golden 1986, 1988
Processing I: Activation
Processing — spreading activation — is optimization: Harmony maximization
[Figure: a two-unit network: units a1 and a2 receive external inputs i1 (0.6) and i2 (0.5) and are connected by an inhibitory weight –λ (–0.9)]
The ICS Architecture
[Figure: the roadmap diagram again (G, ƒ, A; cat → kæt), now highlighting Processing]
Processing II: Optimization
Processing — spreading activation — is optimization: Harmony maximization
• Cognitive psychology
• Key sources:
  • Hinton & Anderson 1981
  • Rumelhart, McClelland, & the PDP Group 1986
[Figure: the same two-unit network: inputs i1 (0.6), i2 (0.5); inhibitory weight –λ (–0.9)]
Processing II: Optimization
Processing — spreading activation — is optimization: Harmony maximization
Harmony maximization is satisfaction of parallel, violable well-formedness constraints:
• a1 must be active (strength: 0.6)
• a2 must be active (strength: 0.5)
• a1 and a2 must not be simultaneously active (strength: λ = 0.9)
CONFLICT. Optimal compromise: a1 = 0.79, a2 = –0.21
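A minimal numerical sketch of this optimal compromise. The Harmony function below combines the three constraint terms with an assumed quadratic decay term −½(a1² + a2²), as in continuous Hopfield/Harmony-theory networks; under that assumption, gradient ascent ("spreading activation") settles at a1 ≈ 0.79, a2 ≈ −0.21.

```python
import numpy as np

i1, i2, lam = 0.6, 0.5, 0.9                   # constraint strengths from the slide

def harmony(a1, a2):
    # constraint terms + an assumed quadratic decay term -(a1^2 + a2^2)/2
    return i1*a1 + i2*a2 - lam*a1*a2 - 0.5*(a1**2 + a2**2)

a = np.zeros(2)
for _ in range(2000):
    grad = np.array([i1 - lam*a[1] - a[0],    # dH/da1
                     i2 - lam*a[0] - a[1]])   # dH/da2
    a += 0.05 * grad                          # spreading activation = gradient ascent on H
print(np.round(a, 2), harmony(*a))            # [ 0.79 -0.21] and the Harmony achieved
```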
Processing II: Optimization
• The search for an optimal state can employ randomness
  • Equations for units' activation values have random terms
  • pr(a) ∝ e^(H(a)/T)
  • T ('temperature') ~ randomness; T → 0 during search
• Boltzmann Machine (Hinton and Sejnowski 1983, 1986); Harmony Theory (Smolensky 1983, 1986)
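A sketch of the stochastic version for the same two-unit network, under the same assumed Harmony function as above: Metropolis-style updates whose acceptance follows e^(ΔH/T), so that at equilibrium pr(a) ∝ e^(H(a)/T), with T lowered toward 0 during the search. The annealing schedule and proposal scheme are illustrative assumptions, not the Boltzmann Machine's exact update rule.

```python
import numpy as np

rng = np.random.default_rng(0)
i1, i2, lam = 0.6, 0.5, 0.9

def harmony(a):
    # same assumed Harmony function as in the previous sketch
    return i1*a[0] + i2*a[1] - lam*a[0]*a[1] - 0.5*np.sum(a**2)

a = rng.uniform(-1, 1, size=2)
for T in np.geomspace(1.0, 1e-4, 5000):             # temperature -> 0 during search
    proposal = a + rng.normal(scale=0.5*np.sqrt(T), size=2)
    dH = harmony(proposal) - harmony(a)
    # accept Harmony-raising moves; accept Harmony-lowering moves
    # with probability e^{dH/T}  (equilibrium distribution pr(a) ∝ e^{H(a)/T})
    if dH > 0 or rng.random() < np.exp(dH / T):
        a = proposal
print(np.round(a, 2))   # ends close to the optimal compromise [0.79, -0.21]
```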
The ICS Architecture
[Figure: the roadmap diagram again (G, ƒ, A; cat → kæt)]
Two Fundamental Questions
Harmony maximization is satisfaction of parallel, violable constraints
1. Prior question: What are the activation patterns — data structures — mental representations — evaluated by these constraints?
2. What are the constraints? (Knowledge representation)
Representation
• Symbolic theory
  • Complex symbol structures
• Generative linguistics (Chomsky & Halle '68, …)
  • Particular linguistic representations
• Markedness Theory (Jakobson, Trubetzkoy, '30s, …)
  • Good (well-formed) linguistic representations
• Connectionism (PDP)
  • Distributed activation patterns
• ICS
  • Realization of (higher-level) complex symbolic structures in distributed patterns of activation over (lower-level) units ('tensor product representations' etc.)
  • Will employ 'local representations' as well
Representation
[Figure: the binary tree [σ k [æ t]] and its filler/role bindings: σ/rε, k/r0, æ/r01, t/r11]
Tensor Product Representations
• Representations:
  • Filler vectors: A, B, X, Y (i, j, k ∊ {A, B, X, Y})
  • Role vectors: rε = 1, r0 = (1 1), r1 = (1 –1)
  • A filler i at depth 0 is bound to rε; fillers j, k at the depth-1 positions are bound to r0 and r1 via the tensor product ⊗
[Figure: the network units (①–⑫) realizing the depth-0 and depth-1 bindings]
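A minimal sketch of binding and unbinding for the tree [σ k [æ t]] from the earlier slide, using these role vectors (with r01 taken as r0 ⊗ r1); the one-hot filler vectors and the depth-by-depth layout are assumptions for illustration.

```python
import numpy as np

# Filler vectors (assumed one-hot codes for σ, k, æ, t)
fillers = {name: vec for name, vec in zip(['σ', 'k', 'æ', 't'], np.eye(4))}

r0, r1 = np.array([1., 1.]), np.array([1., -1.])    # role vectors from the slide
u0, u1 = np.array([.5, .5]), np.array([.5, -.5])    # dual (unbinding) vectors

# Bindings from the figure: σ/rε, k/r0, æ/r01, t/r11.
# Each depth is kept as its own tensor (a direct sum across depths).
psi = {
    0: fillers['σ'],                                        # σ ⊗ rε  (rε = 1)
    1: np.einsum('f,r->fr', fillers['k'], r0),              # k ⊗ r0
    2: (np.einsum('f,a,b->fab', fillers['æ'], r0, r1)       # æ ⊗ r0 ⊗ r1
        + np.einsum('f,a,b->fab', fillers['t'], r1, r1)),   # t ⊗ r1 ⊗ r1
}

# Unbinding: contract with dual role vectors to extract a constituent,
# e.g. "which filler occupies role r01?"  ->  æ
extracted = np.einsum('fab,a,b->f', psi[2], u0, u1)
print(extracted)        # [0. 0. 1. 0.]  ==  the filler vector for æ
```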
Local tree realizations
• Representations: