1. Jakobson's Grand Unified Theory of Linguistic Cognition Paul Smolensky
Cognitive Science Department
Johns Hopkins University
2. Grammar and Cognition 1. What is the system of knowledge?
2. How does this system of knowledge arise in the mind/brain?
3. How is this knowledge put to use?
4. What are the physical mechanisms that serve as the material basis for this system of knowledge and for the use of this knowledge?
(Chomsky 88, p. 3), the Managua Lectures
Lofty goals, but it seems quite an open question whether linguistic theory will be up to answering these questions.
3. Advertisement The complete story, forthcoming (2003) from Blackwell:
The Harmonic Mind: From Neural Computation to Optimality-Theoretic Grammar
Smolensky & Legendre
4. A Grand Unified Theory for the cognitive science of language is enabled by Markedness: Avoid α
Structure
Alternations eliminate α
Typology: Inventories lack α
Acquisition
α is acquired late
Processing
α is processed poorly
Neural
Brain damage most easily disrupts α
Jakobson's Program
The question behind much of the research I'll discuss is: can this program of grand unification be formally carried out within Optimality Theory?
5. Structure Acquisition Use Neural Realization: Theoretical. OT (Prince & Smolensky 91, 93):
Construct formal grammars directly from markedness principles
General formalism/framework for grammars: phonology, syntax, semantics; GB/LFG/…
Strongly universalist: inherent typology
6. Theoretical Formal structure enables OT-general:
Learning algorithms
Constraint Demotion: Provably correct and efficient (when part of a general decomposition of the grammar learning problem)
Tesar 1995 et seq.
Tesar & Smolensky 1993, …, 2000
Gradual Learning Algorithm
Boersma 1998 et seq.
7. Structure Acquisition Use Neural Realization Theoretical
Theorems regarding the computational complexity of algorithms for processing with OT grammars
Tesar 94 et seq.
Ellison 94
Eisner 97 et seq.
Frank & Satta 98
Karttunen 98
8. Structure Acquisition Use Neural Realization Theoretical OT derives from the theory of abstract neural (connectionist) networks
via Harmonic Grammar (Legendre, Miyata, Smolensky 90)
For moderate complexity, now have general formalisms for realizing
complex symbol structures as distributed patterns of activity over abstract neurons
structure-sensitive constraints/rules as distributed patterns of strengths of abstract synaptic connections
optimization of Harmony
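As a concrete, purely illustrative sketch (mine, not the talk's formalism) of the Harmonic Grammar idea underlying this derivation: each constraint carries a numerical strength, a candidate's Harmony is the weighted sum of its (negative) violation counts, and the grammar outputs the maximum-Harmony candidate. With sufficiently separated strengths this approximates OT's strict domination. Constraint names and candidate strings are hypothetical.

```python
# Minimal Harmonic Grammar sketch (illustrative, not the talk's exact formalism):
# Harmony(candidate) = -sum_k s_k * violations_k(candidate); the grammar
# selects the maximum-Harmony candidate.

def harmony(violation_profile, strengths):
    return -sum(s * v for s, v in zip(strengths, violation_profile))

def hg_optimal(candidates, strengths):
    """candidates: dict mapping candidate -> tuple of violation counts."""
    return max(candidates, key=lambda c: harmony(candidates[c], strengths))

# Hypothetical violation profiles on (ONSET, NOCODA); widely separated
# strengths mimic strict domination (the stronger constraint cannot be outvoted).
candidates = {
    ".CV.CVC.":  (0, 1),
    ".CV.CV.V.": (1, 0),
}
print(hg_optimal(candidates, strengths=(100, 10)))  # ONSET dominant -> ".CV.CVC."
print(hg_optimal(candidates, strengths=(10, 100)))  # NOCODA dominant -> ".CV.CV.V."
```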
9. Program
Structure
OT
Constructs formal grammars directly from markedness principles
Strongly universalist: inherent typology
OT allows completely formal markedness-based explanation of highly complex data
Acquisition
Initial state predictions explored through behavioral experiments with infants
Neural Realization
Construction of a miniature, concrete LAD
10. The Great Dialectic Phonological representations serve two masters. On the one hand, they must avoid marked structure; enforcing this is the job of the Markedness constraints, the first half of the dialectic. On the other hand, phonological representations must interface to the lexicon, which contains a fixed, invariant form; this underlying form must be recoverable from the representation, and requiring that the representation match the lexical form is the job that Faithfulness constraints perform to uphold this second crucial half of the dialectic.
11. OT from Markedness Theory MARKEDNESS constraints: *α: "No α"
FAITHFULNESS constraints
Fα demands that /input/ → [output] leave α unchanged (McCarthy & Prince 95)
Fα controls when α is avoided (and how)
Interaction of violable constraints: Ranking
α is avoided when *α ≫ Fα
α is tolerated when Fα ≫ *α
M1 ≫ M2: combines multiple markedness dimensions
12. OT from Markedness Theory MARKEDNESS constraints: *α
FAITHFULNESS constraints: Fα
Interaction of violable constraints: Ranking
α is avoided when *α ≫ Fα
α is tolerated when Fα ≫ *α
M1 ≫ M2: combines multiple markedness dimensions
Typology: All cross-linguistic variation results from differences in ranking, i.e., in how the dialectic is resolved (and in how multiple markedness dimensions are combined). Whether α is avoided varies cross-linguistically, i.e., the relative ranking of *α and Fα must vary. The most restrictive assumption is that this is the only way grammars may differ; OT's theory of typology is simply that the possible languages are precisely those arising from some ranking of the universal constraints Con.
13. OT from Markedness Theory MARKEDNESS constraints
FAITHFULNESS constraints
Interaction of violable constraints: Ranking
Typology: All cross-linguistic variation results from differences in ranking in resolution of the dialectic
Harmony = MARKEDNESS + FAITHFULNESS
A formally viable successor to Minimize Markedness is OT's Maximize Harmony (among competitors). OT defines "grammatical" as optimally satisfying the ranked constraints, including both Markedness and Faithfulness in the well-formedness measure called Harmony. Maximize Harmony is a formally viable basis for grammar, filling in the crucial half of the dialectic missing in its progenitor, Minimize Markedness. (A toy illustration of optimization under ranked constraints is sketched below.)
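As a concrete sketch (purely illustrative, not from the talk) of Maximize Harmony under strict domination: candidates are compared on their violation counts constraint by constraint in ranking order, and the candidate that survives this lexicographic comparison is grammatical. The constraint definitions and candidate strings below are hypothetical stand-ins for *α and Fα.

```python
# Minimal sketch (not from the talk): OT evaluation as lexicographic
# comparison of violation profiles under a constraint ranking.

def violations(candidate, ranking):
    """Violation profile of a candidate, constraints taken in ranking order."""
    return tuple(constraint(candidate) for constraint in ranking)

def optimal(candidates, ranking):
    """Maximize Harmony under strict domination = minimize the violation
    profile lexicographically, highest-ranked constraint first."""
    return min(candidates, key=lambda cand: violations(cand, ranking))

# Hypothetical stand-ins for *a ("No [a]") and F_a (faithfulness to an /a/
# assumed to be present in the input).
star_a  = lambda cand: cand.count("a")           # one violation per surface [a]
faith_a = lambda cand: 0 if "a" in cand else 1   # violated if the input /a/ is lost

print(optimal(["pa", "pe"], [star_a, faith_a]))  # *a >> F_a : a is avoided  -> 'pe'
print(optimal(["pa", "pe"], [faith_a, star_a]))  # F_a >> *a : a is tolerated -> 'pa'
```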
14. Structure: Explanatory goals achieved by OT
Individual grammars are literally and formally constructed directly from universal markedness principles
Inherent Typology:
Within the analysis of phenomenon F in language L is inherent a typology of F across all languages
To close out this review, let me highlight two explanatory goals achieved by OT. First, … Second, OT exhibits what I call inherent typology: to provide an analysis of F in L, we must hypothesize a set of structures and constraints, at which point we have automatically hypothesized a universal typology for F: all and only the systems that arise by re-ranking our proposed constraints. This is quite a strong version of the goal of couching all linguistic explanation in universal principles.
15. Program
Structure
OT
Constructs formal grammars directly from markedness principles
Strongly universalist: inherent typology
OT allows completely formal markedness-based explanation of highly complex data (Friday)
Acquisition
Initial state predictions explored through behavioral experiments with infants
Neural Realization
Construction of a miniature, concrete LAD
16. Structure: Summary OT builds formal grammars directly from markedness: MARK, with FAITH
Friday:
Inventories consistent with markedness relations are formally the result of OT with local conjunction
Even highly complex patterns can be explained purely with simple markedness constraints: all complexity is in constraint interaction through ranking and conjunction: Lango ATR vowel harmony
17. Program
Structure
OT
Constructs formal grammars directly from markedness principles
Strongly universalist: inherent typology
OT allows completely formal markedness-based explanation of highly complex data
Acquisition
Initial state predictions explored through behavioral experiments with infants
Neural Realization
Construction of a miniature, concrete LAD
18. Nativism I: Learnability Learning algorithm
Provably correct and efficient (under strong assumptions)
Sources:
Tesar 1995 et seq.
Tesar & Smolensky 1993, …, 2000
If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E
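A minimal sketch of this demotion step (my own simplified, binary-violation rendering, not Tesar & Smolensky's full algorithm), using the in-/im- alternation of the next slide as a hypothetical data point:

```python
# Simplified sketch of Constraint Demotion (after Tesar & Smolensky), as
# described above: on hearing A where E was expected, demote each constraint
# violated by A (but not by E) to just below the highest stratum containing a
# constraint violated by E (but not by A).
# A hierarchy is a list of strata (sets of constraint names), highest first.
# Assumes at least one constraint prefers the heard form A.

def constraint_demotion(hierarchy, violated_by_A, violated_by_E):
    # Mark cancellation: ignore constraints violated by both forms.
    a_only = violated_by_A - violated_by_E
    e_only = violated_by_E - violated_by_A
    # Highest stratum containing a constraint violated only by E.
    target = next(i for i, stratum in enumerate(hierarchy) if stratum & e_only)
    if target + 1 == len(hierarchy):      # ensure a stratum exists just below
        hierarchy.append(set())
    # Minimal demotion: move offending constraints only if currently ranked
    # at or above the target stratum.
    for i, stratum in enumerate(hierarchy):
        if i <= target:
            for c in stratum & a_only:
                stratum.remove(c)
                hierarchy[target + 1].add(c)
    return [s for s in hierarchy if s]    # drop any emptied strata

# Hypothetical example: learner expected E = faithful *[inpossible],
# but hears A = [impossible]; FAITH(place) must fall below NPA.
h = [{"FAITH(place)", "NPA"}]             # initial single stratum
h = constraint_demotion(h, violated_by_A={"FAITH(place)"},
                           violated_by_E={"NPA"})
print(h)   # [{'NPA'}, {'FAITH(place)'}]
```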
20. Nativism I: Learnability M ≫ F is learnable with /in+possible/ → impossible
"not" = in-, except when followed by a labial, as in impossible
the exception that proves the rule; M = NPA
M ≫ F is not learnable from data if there are no exceptions (alternations) of this sort, e.g., if the lexicon supplies only inputs with mp, never np: then ✓M and ✓F, no M vs. F conflict, no evidence for their ranking (see the sketch below)
Thus must have M ≫ F in the initial state, H0
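To make the point concrete, here is a tiny demonstration (my own toy constraint definitions and hypothetical forms): with only mp inputs, the two rankings of M (NPA) and F produce identical outputs, so the data carry no evidence for the ranking; only np inputs, i.e. alternations, distinguish them.

```python
# Toy demonstration (hypothetical constraints/forms) of the learnability point.

NPA   = lambda inp, out: out.count("np")                         # markedness: *np
FAITH = lambda inp, out: sum(a != b for a, b in zip(inp, out))   # segmental changes

def optimal(inp, candidates, ranking):
    return min(candidates, key=lambda out: tuple(c(inp, out) for c in ranking))

for inp in ["ampa", "anpa"]:
    m_over_f = optimal(inp, ["ampa", "anpa"], [NPA, FAITH])
    f_over_m = optimal(inp, ["ampa", "anpa"], [FAITH, NPA])
    print(inp, "->", m_over_f, "(M>>F) vs", f_over_m, "(F>>M)")

# /ampa/: both rankings yield [ampa] -- no evidence for the ranking.
# /anpa/: M>>F yields [ampa] (assimilation), F>>M yields [anpa] --
# only such inputs (alternations) reveal the ranking.
```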
21. The Initial State OT-general: MARKEDNESS ≫ FAITHFULNESS
Learnability demands it (Richness of the Base)
(Alan Prince, p.c., 93; Smolensky 96a)
Child production: restricted to the unmarked
Child comprehension: not so restricted
(Smolensky 96b)
22. Nativism II: Experimental Test
23. Nativism II: Experimental Test Linking hypothesis:
More harmonic phonological stimuli → longer listening time
More harmonic:
✓M ≻ *M, when equal on F
✓F ≻ *F, when equal on M
When one must choose between them, it is more harmonic to satisfy M: M ≫ F
M = Nasal Place Assimilation (NPA)
24. Experimental Paradigm
25. 4.5 Months (NPA)
27. 4.5 Months (NPA)
28. 4.5 Months (NPA)
29. Program
Structure
OT
Constructs formal grammars directly from markedness principles
Strongly universalist: inherent typology
OT allows completely formal markedness-based explanation of highly complex data
Acquisition
Initial state predictions explored through behavioral experiments with infants
Neural Realization
Construction of a miniature, concrete LAD
30. The question The nativist hypothesis, central to generative linguistic theory:
Grammatical principles respected by all human languages are encoded in the genome.
Questions:
Evolutionary theory: How could this happen?
Empirical question: Did this happen?
Today: What concretely could it mean for a genome to encode innate knowledge of universal grammar?
31. UGenomics The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device
Proteins → Universal grammatical principles?
SUSPEND DISBELIEF FOR 15 MINUTES!
32. UGenomics The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device
Proteins → Universal grammatical principles?
SUSPEND DISBELIEF FOR 15 MINUTES!
34. UGenome for CV Theory Three levels
Abstract symbolic: Basic CV Theory
Abstract neural: CVNet
Abstract genomic: CVGenome
35. UGenomics: Symbolic Level Three levels
Abstract symbolic: Basic CV Theory
Abstract neural: CVNet
Abstract genomic: CVGenome
37. Basic syllabification: Function Basic CV Syllable Structure Theory
"Basic": no more than one segment per syllable position: .(C)V(C).
The mapping: /underlying form/ → [surface form]
/CVCC/ → [.CV.C V C.] (the medial V is epenthetic), e.g. /pæd+d/ → [pædəd]
Correspondence Theory
McCarthy & Prince 1995 (M&P)
/C1V2C3C4/ → [.C1V2.C3 V C4.]
38. Why basic CV syllabification? The mapping: underlying → surface linguistic forms
Forms simple but combinatorially productive
Well-known universals; typical typology
Mini-component of real natural language grammars
A (perhaps the) canonical model of universal grammar in OT
39. Syllabification: Constraints (Con)
PARSE: Every element in the input corresponds to an element in the output
ONSET: No V without a preceding C
etc. (A toy evaluation using these constraints is sketched below.)
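A minimal sketch (my own toy candidate set and ranking, supplementing PARSE and ONSET with the standard Basic CV constraints FILL and NOCODA, which the "etc." leaves implicit) of how ranked constraints pick out the /CVCC/ → [.CV.C V C.] mapping, and how re-ranking yields a different language:

```python
# Toy Basic CV syllabification (illustrative, not the talk's exact analysis).
# Lowercase 'v'/'c' mark unfilled (epenthetic) positions; `unparsed` counts
# input segments left out of the output.

CANDIDATES = {
    # output syllables                 unparsed input segments
    ".CV.CvC.":   (["CV", "CvC"],       0),  # epenthetic nucleus, final coda
    ".CVC.":      (["CVC"],             1),  # one input C left unparsed
    ".CV.":       (["CV"],              2),  # both input C's left unparsed
    ".CV.Cv.Cv.": (["CV", "Cv", "Cv"],  0),  # two epenthetic nuclei, no codas
}

def PARSE(sylls, unparsed):  return unparsed
def FILL(sylls, unparsed):   return sum(s.count("v") + s.count("c") for s in sylls)
def ONSET(sylls, unparsed):  return sum(not s[0].lower() == "c" for s in sylls)
def NOCODA(sylls, unparsed): return sum(s[-1].lower() == "c" for s in sylls)

def optimal(candidates, ranking):
    return min(candidates,
               key=lambda name: tuple(con(*candidates[name]) for con in ranking))

print(optimal(CANDIDATES, [PARSE, ONSET, FILL, NOCODA]))  # -> ".CV.CvC."
print(optimal(CANDIDATES, [PARSE, ONSET, NOCODA, FILL]))  # -> ".CV.Cv.Cv."
```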
40. UGenomics: Neural Level Three levels
Abstract symbolic: Basic CV Theory
Abstract neural: CVNet
Abstract genomic: CVGenome
42. CVNet Architecture /C1 C2/ → [C1 V C2]
43. Connection substructure
44. PARSE All connection coefficients are +2
45. ONSET All connection coefficients are −1
46. Crucial Open Question (Truth in Advertising): Relation between strict domination and neural networks?
47. CVNet Dynamics Boltzmann machine/Harmony network
Hinton & Sejnowski 83 et seq.; Smolensky 83 et seq.
Stochastic activation-spreading algorithm: higher Harmony → more probable (see the sketch below)
CVNet innovation: connections realize fixed symbol-level constraints with variable strengths
Learning: modification of the Boltzmann machine algorithm to the new architecture
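For concreteness, a sketch of the standard Boltzmann-machine update (generic textbook dynamics, not CVNet's specific architecture or weights): each binary unit is turned on with a probability set by the Harmony gain of turning it on, so that at equilibrium higher-Harmony states are exponentially more probable.

```python
# Stochastic activation spreading in a Harmony/Boltzmann network (generic sketch):
# at equilibrium P(state) is proportional to exp(H(state)/T).

import numpy as np

rng = np.random.default_rng(0)

def harmony(a, W, b):
    return 0.5 * a @ W @ a + b @ a

def gibbs_sweep(a, W, b, T=1.0):
    """One stochastic update of every unit (asynchronous Gibbs sampling)."""
    for i in rng.permutation(len(a)):
        # Harmony gain from setting unit i to 1 rather than 0 (others fixed)
        delta_h = W[i] @ a - W[i, i] * a[i] + b[i]
        p_on = 1.0 / (1.0 + np.exp(-delta_h / T))
        a[i] = 1.0 if rng.random() < p_on else 0.0
    return a

# Tiny example: two units with an excitatory connection prefer to be on together.
W = np.array([[0.0, 2.0],
              [2.0, 0.0]])
b = np.array([-0.5, -0.5])
a = np.zeros(2)
for _ in range(100):
    a = gibbs_sweep(a, W, b, T=0.5)
print(a, harmony(a, W, b))
```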
48. Learning Behavior A simplified system can be solved analytically
The learning algorithm turns out to yield
Δsᵢ = ε · [# violations of constraintᵢ]
49. UGenomics: Genome Level Three levels
Abstract symbolic: Basic CV Theory
Abstract neural: CVNet
Abstract genomic: CVGenome
51. Connectivity geometry Assume 3-d grid geometry
52. ONSET VO segment: N&S S VO
53. Connectivity: PARSE Input units grow south and connect
54. To be encoded How many different kinds of units are there?
What information is necessary (from the source unit's point of view) to identify the location of a target unit, and the strength of the connection with it?
How are constraints initially specified?
How are they maintained through the learning process?
55. Unit types Input units C V
Output units C V x
Correspondence units C V
7 distinct unit types
Each represented in a distinct sub-region of the abstract genome
We help ourselves to implicit machinery to spell out these sub-regions as distinct cell types, located in the grid as illustrated
56. Direction of projection growth Topographic organizations widely attested throughout neural structures
Activity-dependent growth a possible alternative
Orientation information (axes)
Chemical gradients during development
Cell age a possible alternative
57. Projection parameters Direction
Extent
Local
Non-local
Target unit type
Strength of connections encoded separately
58. Connectivity Genome Contributions from ONSET and PARSE:
59. CVGenome: Connectivity
60. Encoding connection strength For each constraint Cᵢ, need to embody:
Constraint strength sᵢ
Connection coefficients (one per source → target pair of cell types)
The product of these is the contribution of Cᵢ to that source → target connection weight (see the sketch below)
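A sketch of this encoding idea (my own illustration): the weight of a connection between two cell types is the sum, over constraints, of the constraint's strength times its coefficient for that pair. The +2 (PARSE) and −1 (ONSET) coefficients come from the earlier slides; the strengths and cell-type names below are hypothetical.

```python
# Connection weight = sum over constraints of (strength s_i) x (coefficient).
# Coefficient values for PARSE (+2) and ONSET (-1) follow earlier slides;
# cell-type names and strengths are hypothetical placeholders.

COEFFICIENTS = {                                  # constraint -> {(source, target): coeff}
    "PARSE": {("input_C", "corr_C"): +2.0},
    "ONSET": {("output_V", "output_C"): -1.0},
}
STRENGTHS = {"PARSE": 3.0, "ONSET": 2.0}          # hypothetical learned strengths s_i

def connection_weight(src, tgt):
    return sum(STRENGTHS[c] * coeffs.get((src, tgt), 0.0)
               for c, coeffs in COEFFICIENTS.items())

print(connection_weight("input_C", "corr_C"))     # 3.0 * +2 =  6.0
print(connection_weight("output_V", "output_C"))  # 2.0 * -1 = -2.0
```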
61. Processing
62. Development
63. Learning
64. CVGenome: Connection Coefficients
65. Abstract Gene Map
66. UGenomics Realization of processing and learning algorithms in abstract molecular biology, using the types of interactions known to be biologically possible and genetically encodable
67. UGenomics Host of questions to address
Will this really work?
Can it be generalized to distributed nets?
Is the number of genes [77=0.26%] plausible?
Are the mechanisms truly biologically plausible?
Is it evolvable?
68. Hopeful Conclusion Progress is possible toward a Grand Unified Theory of the cognitive science of language
addressing the structure, acquisition, use, and neural realization of knowledge of language
strongly governed by universal grammar
with markedness as the unifying principle
as formalized in Optimality Theory at the symbolic level
and realized via Harmony Theory in abstract neural nets which are potentially encodable genetically
69. Hopeful Conclusion Progress is possible toward a Grand Unified Theory of the cognitive science of language