
Optimality in Cognition and Grammar


Presentation Transcript


  1. Optimality in Cognition and Grammar Paul Smolensky Cognitive Science Department, Johns Hopkins University Plan of lectures • Cognitive architecture: Symbols & optimization in neural networks • Optimization in grammar: HG → OT. From numerical to algebraic optimization in grammar • OT and nativism: The initial state & neural/genomic encoding of UG • ?

  2. The ICS Hypothesis The Integrated Connectionist/Symbolic Cognitive Architecture (ICS) • In higher cognitive domains, representations and functions are well approximated by symbolic computation • The Connectionist Hypothesis is correct • Thus, cognitive theory must supply a computational reduction of symbolic functions to PDP computation

  3. Levels

  4. The ICS Architecture [diagram: the function ƒ maps the input kæt to the parse [σ k [æ t]], realized in the network as an activation pattern A with grammar-encoding weights G]

  5. Representation [diagram: the tree [σ k [æ t]] realized as the superposition of filler/role bindings σ/rε, k/r0, æ/r01, t/r11]

  6. Tensor Product Representations • Representations: filler ⊗ role • Filler vectors: A, B, X, Y • Role vectors: rε = (1; 0 0), r0 = (0; 1 1), r1 = (0; 1 −1) • Depth-0 binding: i ⊗ rε; depth-1 bindings: j ⊗ r0, k ⊗ r1 (i, j, k ∈ {A, B, X, Y}) [diagram: units ①–⑫ of the tensor network realizing these bindings]
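
To make the binding arithmetic concrete, here is a minimal Python sketch of tensor product binding and exact unbinding (an illustration, not code from the lecture; the one-hot fillers are stand-ins, and the sign in r1 = (0; 1 −1) is an assumption made so the three role vectors are linearly independent, which exact unbinding requires):

```python
# A minimal TPR sketch (illustration only, not the lecture's code).
import numpy as np

fillers = dict(zip("ABXY", np.eye(4)))       # one-hot filler vectors
roles = {"eps": np.array([1.0, 0.0, 0.0]),   # root
         "0":   np.array([0.0, 1.0, 1.0]),   # left child
         "1":   np.array([0.0, 1.0, -1.0])}  # right child (assumed sign)

def bind(f, r):
    # Filler/role binding is the tensor (outer) product f (x) r.
    return np.outer(fillers[f], roles[r])

# The tree [X A B]: superpose the bindings of all constituents.
s = bind("X", "eps") + bind("A", "0") + bind("B", "1")

# Exact unbinding uses the dual basis of the role vectors.
R = np.stack([roles[k] for k in roles])
duals = dict(zip(roles, np.linalg.inv(R).T))

def unbind(s, r):
    return s @ duals[r]          # recovers the filler bound to role r

print(unbind(s, "0"))            # -> [1. 0. 0. 0.], the vector for A
```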

  7. Local tree realizations • Representations:

  8. The ICS Isomorphism [diagram: tensor product representations ↔ tensorial networks; a passive sentence's LF, with Agent and Patient roles, Aux, V, and 'by', mapped from input to output through the weight matrix W]

  9. Tensor Product Representations

  10. Binding by Synchrony (Shastri & Ajjanagadde 1993) [Tesar & Smolensky 1994] • give(John, book, Mary) • Binding = Filler ⊗ Formal Role, with roles realized as time slots: s = r1 ⊗ (f_book + f_give-obj) + r2 ⊗ (f_John + f_giver) + r3 ⊗ (f_Mary + f_recipient)
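
A small sketch of how binding by synchrony can be read as a tensor product whose role vectors are one-hot time slots; the unit inventory and phase count here are illustrative choices, not from the lecture:

```python
# Binding by synchrony read through TPRs: roles r1, r2, r3 are time
# slots (one-hot "role vectors" over firing phases); a filler is bound
# to a role by firing in that role's slot.
import numpy as np

units = ["John", "Mary", "book", "giver", "recipient", "give-obj"]
f = dict(zip(units, np.eye(len(units))))

def r(i):                        # role vector = one-hot time slot
    return np.eye(3)[i - 1]

# give(John, book, Mary), as on the slide:
s = (np.outer(f["book"] + f["give-obj"], r(1))
     + np.outer(f["John"] + f["giver"], r(2))
     + np.outer(f["Mary"] + f["recipient"], r(3)))

# Whoever fires in phase 2 is bound by role r2: John and giver.
phase2 = s @ r(2)
print([u for u in units if phase2 @ f[u] > 0])   # ['John', 'giver']
```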

  11. The ICS Architecture [diagram repeated from slide 4]

  12. Two Fundamental Questions • Harmony maximization is satisfaction of parallel, violable constraints • 2. What are the constraints? (Knowledge representation) • Prior question: 1. What are the activation patterns — data structures — mental representations — evaluated by these constraints?

  13. Representation [diagram repeated from slide 5: [σ k [æ t]] as the bindings σ/rε, k/r0, æ/r01, t/r11]

  14. Two Fundamental Questions [repeated from slide 12] • 2. What are the constraints? (Knowledge representation) • Prior question: 1. What are the activation patterns evaluated by these constraints?

  15. Constraints • NOCODA: A syllable has no coda [Maori/French/English] • The parse a = [σ k [æ t]] of 'cat' violates NOCODA (*): H(a[σ k [æ t]]) = −s_NOCODA < 0, where s_NOCODA is the constraint's strength in the weight matrix W

  16. The ICS Architecture [diagram repeated from slide 4]

  17. The ICS Architecture: Constraint Interaction?? [diagram repeated, raising the question of how constraints interact]

  18. Constraint Interaction I • ICS → Grammatical theory • Harmonic Grammar • Legendre, Miyata, Smolensky 1990 et seq.

  19. Constraint Interaction I • The grammar generates the representation that maximizes H: this best-satisfies the constraints, given their differential strengths • For [σ k [æ t]]: H = H(k, σ) + H(σ, t), with ONSET (Onset/k): H(k, σ) > 0 and NOCODA (Coda/t): H(σ, t) < 0 • Any formal language can be so generated.
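
A minimal sketch of this Harmony-maximizing evaluation; the numeric strengths and the candidate set are invented for illustration, and the faithfulness term anticipates the FAITHFULNESS constraint of slide 41:

```python
# Toy HG evaluator: each candidate's H sums a positive onset term and
# a negative coda term (plus an assumed faithfulness penalty); the
# grammar outputs the H-maximal candidate.
S_ONSET, S_NOCODA, S_FAITH = 3.0, 1.0, 5.0   # assumed strengths

# Candidate realizations of /kat/ 'cat':
candidates = {
    "[k ae t]": {"onset": 1, "coda": 1, "unfaithful": 0},
    "[k ae]":   {"onset": 1, "coda": 0, "unfaithful": 1},  # /t/ deleted
    "[ae t]":   {"onset": 0, "coda": 1, "unfaithful": 1},  # /k/ deleted
}

def H(c):
    return (S_ONSET * c["onset"]          # H(k, sigma) > 0
            - S_NOCODA * c["coda"]        # H(sigma, t) < 0
            - S_FAITH * c["unfaithful"])

winner = max(candidates, key=lambda n: H(candidates[n]))
print(winner, H(candidates[winner]))      # -> [k ae t] 2.0
```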

  20. Constraint Interaction I: HG [ICS Architecture diagram repeated from slide 4]

  21. Harmonic Grammar Parser • Simple, comprehensible network • Simple grammar G: X → A B, Y → B A • Language processing: completion [diagram: bottom-up completion (A B ⇒ X, B A ⇒ Y) and top-down completion (X ⇒ A B, Y ⇒ B A)]

  22. Simple Network Parser • Fully self-connected, symmetric network with weight matrix W • Like the previously shown network, except with 12 units; representations and connections shown below

  23. Harmonic Grammar Parser • Weight matrix for Y → B A: H(Y, —A) > 0, H(Y, B—) > 0

  24. Harmonic Grammar Parser • Weight matrix for X → A B

  25. Harmonic Grammar Parser • Weight matrix for entire grammar G
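
Slides 21-27 can be sketched as the following toy completion network: 12 units (3 tree positions × 4 symbol units), one symmetric weight matrix summing the two rules' Hebbian matrices, and winner-take-all readout in the free position. The encoding and dynamics are assumptions of this sketch, not necessarily the lecture's exact network:

```python
# Toy completion network for the grammar G: X -> A B, Y -> B A.
import numpy as np

SYMS = ["A", "B", "X", "Y"]

def onehot(s):
    v = np.zeros(4); v[SYMS.index(s)] = 1.0
    return v

def state(parent=None, left=None, right=None):
    # 12-unit state = (parent, left child, right child) blocks.
    return np.concatenate([onehot(s) if s else np.zeros(4)
                           for s in (parent, left, right)])

# Weight matrix for the entire grammar = sum of the rules' matrices.
W = (np.outer(state("X", "A", "B"), state("X", "A", "B"))
     + np.outer(state("Y", "B", "A"), state("Y", "B", "A")))

def complete(x, free):
    """Fill position `free` (0=parent, 1=left, 2=right) to maximize H."""
    net = W @ x                       # input from the clamped units
    i = 4 * free
    return SYMS[int(np.argmax(net[i:i + 4]))]   # winner-take-all

print(complete(state(left="A", right="B"), free=0))    # bottom-up -> X
print(complete(state(parent="Y", right="A"), free=1))  # top-down  -> B
```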

  26. Bottom-up Processing [diagram: children clamped; the network completes the parent]

  27. Top-down Processing [diagram: parent clamped; the network completes the children]

  28. Scaling up • Not yet … • Still conceptual obstacles to surmount

  29. Explaining Productivity • Approaching full-scale parsing of formal languages by neural-network Harmony maximization • Have other networks (like PassiveNet) that provably compute recursive functions ⇒ productive competence • How to explain?

  30. 1. Structured representations

  31. + 2. Structured connections

  32. = Proof of Productivity • Productive behavior follows mathematically from combining • the combinatorial structure of the vectorial representations encoding inputs & outputs and • the combinatorial structure of the weight matrices encoding knowledge

  33. Explaining Productivity I: PSA & ICS [diagram: functions/semantics level vs. processes level] • Intra-level decomposition (PSA): [A B] ⇝ {A, B} • Inter-level decomposition (ICS): [A B] ⇝ {1, 0, 1, …, 1}

  34. Explaining Productivity II: ICS & PSA [diagram: functions/semantics level vs. processes level] • Intra-level decomposition (PSA): G ⇝ {X → A B, Y → B A} • Inter-level decomposition (ICS): W(G) ⇝ {1, 0, 1, 0; …}

  35. The ICS Architecture [diagram repeated from slide 4]

  36. The ICS Architecture: Constraint Interaction II [diagram repeated from slide 4]

  37. Constraint Interaction II: OT • ICS → Grammatical theory • Optimality Theory • Prince & Smolensky 1991, 1993/2004

  38. Constraint Interaction II: OT • Differential strength encoded in strict domination hierarchies (≫): • Every constraint has complete priority over all lower-ranked constraints (combined) • Approximate numerical encoding employs special (exponentially growing) weights • “Grammars can’t count”
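
A sketch of the "special (exponentially growing) weights" point: with closely spaced weights, many low-ranked violations can outweigh one high-ranked violation (a genuinely HG trade-off); with exponentially spaced weights they cannot, so the weighted sum reproduces strict domination. Violation profiles here are illustrative:

```python
# Strict domination vs. numerical weighting, on two candidates.
def weighted_H(viols, weights):
    return -sum(w * v for w, v in zip(weights, viols))

a = (1, 0)   # violates top-ranked C1 once
b = (0, 7)   # violates low-ranked C2 seven times; OT (C1 >> C2) prefers b

for weights in [(2.0, 1.0), (10.0, 1.0)]:
    winner = "a" if weighted_H(a, weights) > weighted_H(b, weights) else "b"
    print(weights, "->", winner)  # (2,1) -> a (trade-off); (10,1) -> b (OT-like)
```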

  39. Constraint Interaction II: OT • "Grammars can't count" • Stress is on the initial heavy syllable iff the number of light syllables n obeys [formula in slide image, not recoverable here] • "No way, man"

  40. Constraint Interaction II: OT • Differential strength encoded in strict domination hierarchies (≫) • Constraints are universal (Con) • Candidate outputs are universal (Gen) • Human grammars differ only in how these constraints are ranked • 'factorial typology' • First true contender for a formal theory of cross-linguistic typology • 1st innovation of OT: constraint ranking • 2nd innovation: 'Faithfulness'

  41. The Faithfulness/Markedness Dialectic • 'cat': /kat/ → kæt *NOCODA: why? • FAITHFULNESS requires pronunciation = lexical form • MARKEDNESS often opposes it • Markedness/Faithfulness dialectic → diversity • English: FAITH ≫ NOCODA • Polynesian: NOCODA ≫ FAITH (~French) • Another markedness constraint M: Nasal Place Agreement ['Assimilation'] (NPA): velar: ŋg ≻ ŋb, ŋd; coronal: nd ≻ md, ŋd; labial: mb ≻ nb, ŋb
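
A sketch of the dialectic as strict-domination (lexicographic) evaluation: the same two constraints, ranked differently, select the English-style or the Polynesian-style output. Candidates and violation counts are illustrative:

```python
# Factorial typology in miniature: rerank, and the winner changes.
candidates = {
    "kaet": {"FAITH": 0, "NOCODA": 1},   # faithful, keeps the coda
    "kae":  {"FAITH": 1, "NOCODA": 0},   # satisfies NOCODA by deletion
}

def optimal(ranking):
    # Strict domination = lexicographic comparison of violation tuples.
    return min(candidates,
               key=lambda c: tuple(candidates[c][k] for k in ranking))

print(optimal(["FAITH", "NOCODA"]))   # English: FAITH >> NOCODA -> kaet
print(optimal(["NOCODA", "FAITH"]))   # Polynesian: NOCODA >> FAITH -> kae
```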

  42. Constraint Interaction II: OT [ICS Architecture diagram repeated from slide 4]

  43. Optimality Theory • Diversity of contributions to theoretical linguistics • Phonology & phonetics • Syntax • Semantics & pragmatics • … e.g., following lectures. Now: • Can strict domination be explained by connectionism?

  44. Case study • Syllabification in Berber • Plan • Data, then: OT grammar → Harmonic Grammar → Network

  45. Syllabification in Berber • Dell & Elmedlaoui, 1985: Imdlawn Tashlhit Berber • Syllable nucleus can be any segment • But driven by universal preference for nuclei to be highest-sonority segments

  46. Berber syllable nuclei have maximal sonority

  47. OT Grammar: BrbrOT • HNUC: A syllable nucleus is sonorous • ONSET: A syllable has an onset • Strict domination • Prince & Smolensky '93/04

  48. Harmonic Grammar: BrbrHG • HNUC: A syllable nucleus is sonorous. Nucleus of sonority s: Harmony = 2^(s−1), s ∈ {1, 2, …, 8} ~ {t, d, f, z, n, l, i, a} • ONSET: *VV, Harmony = −2^8 • Theorem. The global Harmony maxima are the correct Berber core syllabifications [of Dell & Elmedlaoui; no sonority plateaux, as in the OT analysis, here & henceforth]
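
A brute-force sketch of BrbrHG evaluation. Assumptions: a candidate labels each segment Onset or Nucleus with every onset immediately followed by a nucleus (core CV/V syllables); the HNUC Harmony 2^(s−1) and *VV penalty −2^8 follow the reading of the slide's exponential weights above; the input /tzn/ is a toy form, not a cited Berber word:

```python
# Enumerate all legal O/N labelings and pick the Harmony maximum.
from itertools import product

SON = {"t": 1, "d": 2, "f": 3, "z": 4, "n": 5, "l": 6, "i": 7, "a": 8}

def harmony(segs, labs):
    h = sum(2 ** (SON[s] - 1) for s, l in zip(segs, labs) if l == "N")
    h -= 2 ** 8 * sum(1 for a, b in zip(labs, labs[1:])
                      if a == "N" and b == "N")     # ONSET: *VV
    return h

def best_parse(word):
    segs = list(word)
    cands = [labs for labs in product("ON", repeat=len(segs))
             # every onset must be immediately followed by a nucleus
             if all(l != "O" or (i + 1 < len(labs) and labs[i + 1] == "N")
                    for i, l in enumerate(labs))]
    return max(cands, key=lambda labs: harmony(segs, labs))

print(best_parse("tzn"))   # -> ('N', 'O', 'N'): syllabified .T.zN.
```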

  49. BrbrNet realizes BrbrHG [network diagram with ONSET and HNUC connections]

  50. BrbrNet's global Harmony maximum is the correct parse • For a given input string, a state of BrbrNet is a global Harmony maximum if and only if it realizes the syllabification produced by the serial Dell-Elmedlaoui algorithm • Contrasts with Goldsmith's Dynamic Linear Models (Goldsmith & Larson '90; Prince '93)
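
For comparison, a sketch of the serial Dell-Elmedlaoui algorithm as it is usually summarized (details simplified): repeatedly make the highest-sonority unparsed segment a nucleus, taking its free left neighbor as onset. On the toy form above it returns the same parse the Harmony maximum selects:

```python
# Serial core syllabification, greedy by sonority.
SON = {"t": 1, "d": 2, "f": 3, "z": 4, "n": 5, "l": 6, "i": 7, "a": 8}

def dell_elmedlaoui(word):
    segs = list(word)
    role = [None] * len(segs)                 # None / "O" / "N"
    while None in role:
        # Highest-sonority unparsed segment becomes the next nucleus.
        i = max((j for j in range(len(segs)) if role[j] is None),
                key=lambda j: SON[segs[j]])
        role[i] = "N"
        if i > 0 and role[i - 1] is None:     # left neighbor as onset
            role[i - 1] = "O"
    return list(zip(segs, role))

print(dell_elmedlaoui("tzn"))  # -> [('t','N'), ('z','O'), ('n','N')]
```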
