The Harmonic Mind Paul Smolensky Cognitive Science Department Johns Hopkins University with: Géraldine Legendre Alan Prince Peter Jusczyk Donald Mathis Melanie Soderstrom A Mystery ‘Co’-laborator
Personal Firsts thanks to SPP • First invited talk! (& first visit to JHU, 1986) • First public confessional: midnight thoughts of a worried connectionist (UNC, 1988) • First generative syntax talk (Memphis, 1994) • First attempt at stand-up comedy (Columbia, 2000) • First rendition of a 900-page book as a graphical synopsis in Powerpoint (1 minute from now)
Advertisement The Harmonic Mind: From neural computation to optimality-theoretic grammar Paul Smolensky & Géraldine Legendre • Blackwell 2002 (??) • Develop the Integrated Connectionist/Symbolic (ICS) Cognitive Architecture • Case study in formalist multidisciplinary cognitive science
Talk Plan • ‘Sketch’ the ICS cognitive architecture, pointing to contributions from/to traditional disciplines • Topics of direct philosophical relevance • Explanation of the productivity of cognition • Nativism • Theoretical work • Symbolic • Connectionist • Experimental work
Mystery Quote #1 “ Smolensky has recently been spending a lot of his time trying to show that, vivid first impressions to the contrary notwithstanding, some sort of connectionist cognitive architecture can indeed account for compositionality, productivity, systematicity, and the like. It turns out to be rather a long story … 185 pages … are devoted to Smolensky’s telling of it, and there appears to be no end in sight. It seems it takes a lot of squeezing to get this stone to bleed.”
Processing I: Activation
• Computational neuroscience → ICS
• Key sources:
• Hopfield 1982, 1984
• Cohen and Grossberg 1983
• Hinton and Sejnowski 1983, 1986
• Smolensky 1983, 1986
• Geman and Geman 1984
• Golden 1986, 1988
• Processing — spreading activation — is optimization: Harmony maximization
[Figure: two units a1, a2 with external inputs i1 (0.6), i2 (0.5) and an inhibitory connection –λ (–0.9)]
Processing II: Optimization
• Cognitive psychology → ICS
• Key sources:
• Hinton & Anderson 1981
• Rumelhart, McClelland, & the PDP Group 1986
• Harmony maximization is satisfaction of parallel, violable constraints:
• a1 and a2 must not be simultaneously active (strength: λ = 0.9)
• a1 must be active (strength: 0.6)
• a2 must be active (strength: 0.5)
• Optimal compromise: a1 = 0.79, a2 = –0.21
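As a sketch of Harmony maximization on this two-unit network: assuming one standard quadratic form of Harmony with a stability term, H(a) = i1·a1 + i2·a2 + w·a1·a2 − (a1² + a2²)/2 (the exact functional form is not shown on the slide), gradient ascent settles on the stated optimal compromise:

```python
# Gradient ascent on an assumed quadratic Harmony for the two-unit network:
# H(a) = i1*a1 + i2*a2 + w*a1*a2 - (a1**2 + a2**2) / 2
i1, i2, w = 0.6, 0.5, -0.9   # input strengths and inhibitory weight from the slide

a1 = a2 = 0.0
for _ in range(2000):
    a1 += 0.05 * (i1 + w * a2 - a1)   # dH/da1
    a2 += 0.05 * (i2 + w * a1 - a2)   # dH/da2

# settles on the 'optimal compromise' a1 = 0.79, a2 = -0.21
assert abs(a1 - 0.79) < 0.01 and abs(a2 + 0.21) < 0.01
```

The Hessian of this H is negative definite, so the ascent converges to the unique maximum rather than a saddle.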
Representation
• Symbolic theory → ICS: complex symbol structures
• Generative linguistics → ICS: particular linguistic representations
• PDP connectionism → ICS: distributed activation patterns
• ICS: realization of (higher-level) complex symbolic structures in distributed patterns of activation over (lower-level) units (‘tensor product representations’ etc.)
Representation
[Figure: the tree [σ k [æ t]] realized as filler/role bindings σ/rε, k/r0, æ/r01, t/r11]
Constraints
• Linguistics (markedness theory) → ICS
• ICS → Generative linguistics: Optimality Theory
• Key sources:
• Prince & Smolensky 1993 [ms.; Rutgers report]
• McCarthy & Prince 1993 [ms.]
• Texts: Archangeli & Langendoen 1997, Kager 1999, McCarthy 2001
• Electronic archive: rutgers/ruccs/roa.html
• Met in SPP Debate, 1988!
Constraints
• NOCODA: A syllable has no coda
• H(a[σ k [æ t]]) = –s_NOCODA < 0
[Figure: the syllable [σ k [æ t]] with the coda t marked *violation]
Constraint Interaction I
• ICS → Grammatical theory: Harmonic Grammar
• Legendre, Miyata, Smolensky 1990 et seq.
Constraint Interaction I
• The grammar generates the representation that maximizes H: this best-satisfies the constraints, given their differential strengths
• Any formal language can be so generated.
[Figure: the Harmony of [σ k æ t] decomposed into constraint contributions, e.g. ONSET rewarding Onset/k and NOCODA penalizing Coda/t]
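A minimal illustration of numeric constraint strengths, with made-up weights and violation counts, showing the hallmark of Harmonic Grammar: two weaker constraints can jointly outweigh a stronger one, something strict domination forbids.

```python
# Hypothetical Harmonic Grammar: constraint strengths and violation profiles
weights = {'C1': 3.0, 'C2': 2.0, 'C3': 2.0}           # illustrative strengths
cands = {'a': {'C1': 1, 'C2': 0, 'C3': 0},            # violates only strong C1
         'b': {'C1': 0, 'C2': 1, 'C3': 1}}            # violates both weaker ones

def harmony(c):
    # each violation lowers Harmony by the violated constraint's strength
    return -sum(weights[k] * v for k, v in cands[c].items())

best = max(cands, key=harmony)
# H(a) = -3 > H(b) = -4, so 'a' wins; under strict domination C1 >> C2 >> C3,
# 'b' would win instead -- weighted summation is genuinely different
assert best == 'a'
```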
Harmonic Grammar Parser
• Simple, comprehensible network
• Simple grammar G: X → A B, Y → B A
• Language parsing
[Figure: bottom-up and top-down parsing passes over strings A B / B A with parents X, Y]
Harmonic Grammar Parser
• Representations: filler/role tensor products fi ⊗ rj, with i, j, k ∊ {A, B, X, Y}
• Filler vectors: A, B, X, Y
• Role vectors (depth 0, depth 1): rε = 1, r0 = (1, 1), r1 = (1, –1)
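The binding and unbinding mechanics can be sketched with the role vectors above; the filler vectors here are one-hot, an illustrative choice not specified on the slide:

```python
def bind(f, r):
    # filler (x) role: outer product matrix
    return [[fi * rj for rj in r] for fi in f]

def superpose(m, n):
    # sum of bindings: one distributed pattern for the whole structure
    return [[a + b for a, b in zip(rm, rn)] for rm, rn in zip(m, n)]

def unbind(s, r):
    # contract with the role vector (scaled by its squared norm);
    # exact recovery because r0 and r1 are orthogonal
    nrm = sum(x * x for x in r)
    return [sum(row[j] * r[j] for j in range(len(r))) / nrm for row in s]

A, B = [1, 0, 0, 0], [0, 1, 0, 0]        # two of the fillers A, B, X, Y (one-hot)
r0, r1 = [1, 1], [1, -1]                 # depth-1 role vectors from the slide

s = superpose(bind(A, r0), bind(B, r1))  # the tree [A B] as a single pattern
assert unbind(s, r0) == A and unbind(s, r1) == B
```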
Harmonic Grammar Parser
• Weight matrix for Y → B A
• H(Y, B —) > 0, H(Y, — A) > 0
Harmonic Grammar Parser • Weight matrix for X → A B
Harmonic Grammar Parser • Weight matrix for entire grammar G
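A toy version of the idea, with hypothetical pairwise weights rewarding exactly the parent/child configurations licensed by G (the network's actual weight matrix is not reproduced here): parsing "A B" amounts to finding the root symbol with maximal Harmony.

```python
# Toy Harmony for parses of "A B" under G: X -> A B, Y -> B A.
# Assumed weights: +1 per licensed parent/child pairing, -1 otherwise.
W = {('X', 'A', 'left'): 1, ('X', 'B', 'right'): 1,
     ('Y', 'B', 'left'): 1, ('Y', 'A', 'right'): 1}

def harmony(parent, left, right):
    return (W.get((parent, left, 'left'), -1)
            + W.get((parent, right, 'right'), -1))

# the root maximizing Harmony over the string "A B" is X, as G licenses
best = max(['X', 'Y'], key=lambda p: harmony(p, 'A', 'B'))
assert best == 'X'
```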
Explaining Productivity • Full-scale parsing of formal languages by neural-network Harmony maximization: productive competence • How to explain?
Proof of Productivity
• Productive behavior follows mathematically from combining
• the combinatorial structure of the vectorial representations encoding inputs & outputs, and
• the combinatorial structure of the weight matrices encoding knowledge
Mystery Quote #2 “ Paul Smolensky has recently announced that the problem of explaining the compositionality of concepts within a connectionist framework is solved in principle. … This sounds suspiciously like the offer of a free lunch, and it turns out, upon examination, that there is nothing to it.”
Explaining Productivity I
• Intra-level decomposition: [A B] → {A, B}
• Inter-level decomposition: [A B] → {1, 0, 1, … 1}
[Figure: semantics shared by ICS & GOFAI; processes distinct in each]
Explaining Productivity II
• Intra-level decomposition: G → {X → A B, Y → B A}
• Inter-level decomposition: [A B] → {1, 0, 1, … 1}
[Figure: semantics shared by ICS & GOFAI; processes distinct in each]
Mystery Quote #3 • “ … even after all those pages, Smolensky hasn’t so much as made a start on constructing an alternative to the Classical account of the compositionality phenomena.”
Constraint Interaction II: OT
• ICS → Grammatical theory: Optimality Theory
• Prince & Smolensky 1993
Constraint Interaction II: OT
• Differential strength encoded in strict domination hierarchies:
• Every constraint has complete priority over all lower-ranked constraints (combined)
• = ‘Take-the-best’ heuristic (Hertwig, today): constraint ↔ cue, ranking ↔ cue validity
• Decision-theoretic justification for OT?
• Approximate numerical encoding employs special (exponentially growing) weights
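The exponential-weight point can be sketched concretely: if each constraint can be violated at most a bounded number of times, numeric weights growing as powers of a sufficiently large base reproduce strict domination exactly (candidates and violation counts below are illustrative).

```python
def ot_winner(cands):
    # strict domination = lexicographic comparison of violation vectors
    # (vectors listed highest-ranked constraint first)
    return min(cands, key=lambda c: cands[c])

def hg_winner(cands, base=10):
    # exponentially growing weights: base**rank, base > max violation count
    n = len(next(iter(cands.values())))
    w = [base ** (n - 1 - i) for i in range(n)]
    return min(cands, key=lambda c: sum(v * wi for v, wi in zip(cands[c], w)))

cands = {'cand1': (0, 2, 1),   # satisfies the top constraint
         'cand2': (1, 0, 0)}   # one top violation outweighs all the rest
assert ot_winner(cands) == hg_winner(cands) == 'cand1'
```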
Constraint Interaction II: OT
• “Grammars can’t count”
• “Stress is on the initial heavy syllable iff the number of light syllables n obeys …”
• No way, man
Constraint Interaction II: OT • Constraints are universal • Human grammars differ only in how these constraints are ranked • ‘factorial typology’ • First true contender for a formal theory of cross-linguistic typology
The Faithfulness / Markedness Dialectic
• ‘cat’: /kat/ → kæt violates NOCODA — why?
• FAITHFULNESS requires identity
• MARKEDNESS often opposes it
• Markedness/Faithfulness dialectic → diversity
• English: FAITH ≫ NOCODA
• Polynesian: NOCODA ≫ FAITH (~French)
• Another markedness constraint M: Nasal Place Agreement [‘Assimilation’] (NPA):
• labial: mb ≻ nb, ŋb
• coronal: nd ≻ md, ŋd
• velar: ŋg ≻ ng, mg
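Factorial typology in miniature: reranking the same two constraints selects different outputs for /kat/ (the candidate set and violation profiles below are illustrative).

```python
# Two candidates for input /kat/, scored on FAITH and NOCODA
cands = {'kat': {'FAITH': 0, 'NOCODA': 1},   # faithful, keeps the coda
         'ka':  {'FAITH': 1, 'NOCODA': 0}}   # deletes /t/, no coda

def winner(ranking):
    # strict domination via lexicographic comparison of violation tuples
    return min(cands, key=lambda c: tuple(cands[c][k] for k in ranking))

assert winner(('FAITH', 'NOCODA')) == 'kat'  # English-type ranking
assert winner(('NOCODA', 'FAITH')) == 'ka'   # Polynesian-type ranking
```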
Nativism I: Learnability • Learning algorithm • Provably correct and efficient (under strong assumptions) • Sources: • Tesar 1995 et seq. • Tesar & Smolensky 1993, …, 2000 • If you hear A when you expected to hear E, minimally demote each constraint violated by A below a constraint violated by E
Constraint Demotion Learning
• If you hear A when you expected to hear E, minimally demote each constraint violated by A below a constraint violated by E
• Correctly handles difficult case: multiple violations in E
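A simplified sketch of one demotion step, assuming violation sets rather than multisets and cancelling shared marks first (A is the heard winner, E the learner's expected loser; the full Tesar & Smolensky algorithm handles multiple violations and stratified hierarchies):

```python
def constraint_demotion(ranking, winner_viols, loser_viols):
    """One step: demote each constraint violated by the heard winner (A)
    to just below the highest-ranked constraint violated by the loser (E).
    `ranking` is a list, highest-ranked first; assumes an informative pair."""
    w = winner_viols - loser_viols                 # uncancelled winner marks
    l = loser_viols - winner_viols                 # uncancelled loser marks
    pivot = min(ranking.index(c) for c in l)       # highest loser-violated
    demoted = [c for c in ranking if c in w and ranking.index(c) <= pivot]
    kept = [c for c in ranking if c not in demoted]
    out = []
    for c in kept:
        out.append(c)
        if c == ranking[pivot]:
            out.extend(demoted)                    # minimal demotion
    return out

# hearing 'impossible' (violates FAITH) against expected *inpossible
# (violates NPA) flips an initial FAITH >> NPA ranking to NPA >> FAITH
new = constraint_demotion(['FAITH', 'NPA'], {'FAITH'}, {'NPA'})
assert new == ['NPA', 'FAITH']
```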
Nativism I: Learnability
• M ≫ F is learnable with /in+possible/ → impossible
• ‘not’ = in- except when followed by …
• “exception that proves the rule”: M = NPA
• M ≫ F is not learnable from data if there are no ‘exceptions’ (alternations) of this sort, e.g., if there are no affixes and all underlying morphemes have mp: √M and √F, no M vs. F conflict, no evidence for their ranking
• Thus must have M ≫ F in the initial state, ℌ0
Nativism II: Experimental Test
• Collaborators: Peter Jusczyk, Theresa Allocco (Elliott Moreton, Karen Arnold)
• Linking hypothesis: more harmonic phonological stimuli ⇒ longer listening time
• More harmonic:
• √M ≻ *M, when equal on F
• √F ≻ *F, when equal on M
• When one must be chosen over the other, more harmonic to satisfy M: M ≫ F
• M = Nasal Place Assimilation (NPA)
Nativism III: UGenome • Can we combine • Connectionist realization of harmonic grammar • OT’s characterization of UG to examine the biological plausibility of UG as innate knowledge? • Collaborators • Melanie Soderstrom • Donald Mathis
Nativism III: UGenome • The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device • Introduce an ‘abstract genome’ notion parallel to (and encoding) ‘abstract neural network’ • Is connectionist empiricism clearly more biologically plausible than symbolic nativism? No!
The Problem
• No concrete examples of such a LAD exist
• Even highly simplified cases pose a hard problem: how can genes, which regulate production of proteins, encode symbolic principles of grammar?
• Test preparation: Syllable Theory
Basic syllabification: Function
• ƒ: /underlying form/ → [surface form]
• Plural form of dish:
• /dɪš+s/ → [.dɪ.šɪz.]
• /CVCC/ → [.CV.CVC.]
• Basic CV Syllable Structure Theory
• Prince & Smolensky 1993: Chapter 6
• ‘Basic’ — No more than one segment per syllable position: .(C)V(C).
• Correspondence Theory
• McCarthy & Prince 1995 (‘M&P’)
• /C1V2C3C4/ → [.C1V2.C3VC4.]
Syllabification: Constraints (Con)
• PARSE: Every element in the input corresponds to an element in the output — “no deletion” [M&P: ‘MAX’]
• FILLV/C: Every output V/C segment corresponds to an input V/C segment [every syllable position in the output is filled by an input segment] — “no insertion/epenthesis” [M&P: ‘DEP’]
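These constraints can be exercised on a hand-coded tableau for input /CVCC/; the candidate set and the ranking PARSE ≫ FILL ≫ NOCODA are illustrative (not the full GEN), chosen so epenthesis beats deletion, matching the /CVCC/ → [.CV.CVC.] mapping above.

```python
# Illustrative tableau for /CVCC/ under PARSE >> FILL >> NOCODA
RANKING = ('PARSE', 'FILL', 'NOCODA')
cands = {
    '.CV.CVC.': {'PARSE': 0, 'FILL': 1, 'NOCODA': 1},  # epenthesize a V
    '.CVC.':    {'PARSE': 1, 'FILL': 0, 'NOCODA': 1},  # delete final C
    '.CV.':     {'PARSE': 2, 'FILL': 0, 'NOCODA': 0},  # delete both Cs
}

def eval_ot(cands, ranking):
    # strict domination: compare violation tuples lexicographically
    return min(cands, key=lambda c: tuple(cands[c][k] for k in ranking))

assert eval_ot(cands, RANKING) == '.CV.CVC.'
```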