1. Jakobson's Grand Unified Theory of Linguistic Cognition Paul Smolensky
Cognitive Science Department
Johns Hopkins University
2. Grammar and Cognition 1. What is the system of knowledge?
2. How does this system of knowledge arise in the mind/brain?
3. How is this knowledge put to use?
4. What are the physical mechanisms that serve as the material basis for this system of knowledge and for the use of this knowledge?
(Chomsky 88, p. 3), the Managua Lectures
Lofty goals, but it seems quite an open question whether linguistic theory will be up to answering these questions.
3. Advertisement The complete story, forthcoming (2003) from Blackwell:
The Harmonic Mind: From Neural Computation to Optimality-Theoretic Grammar
Smolensky & Legendre
4. A Grand Unified Theory for the cognitive science of language is enabled by Markedness: Avoid α
Structure
Alternations eliminate α
Typology: Inventories lack α
Acquisition
α is acquired late
Processing
α is processed poorly
Neural
Brain damage most easily disrupts α
Jakobson's Program
The question behind much of the research I'll discuss is: can this program of grand unification be formally carried out within Optimality Theory?
5. Structure Acquisition Use Neural Realization: Theoretical. OT (Prince & Smolensky 91, 93):
Construct formal grammars directly from markedness principles
General formalism/framework for grammars: phonology, syntax, semantics; GB/LFG/…
Strongly universalist: inherent typology
6. Theoretical Formal structure enables OT-general:
Learning algorithms
Constraint Demotion: Provably correct and efficient (when part of a general decomposition of the grammar learning problem)
Tesar 1995 et seq.
Tesar & Smolensky 1993, …, 2000
Gradual Learning Algorithm
Boersma 1998 et seq.
7. Structure Acquisition Use Neural Realization Theoretical
Theorems regarding the computational complexity of algorithms for processing with OT grammars
Tesar 94 et seq.
Ellison 94
Eisner 97 et seq.
Frank & Satta 98
Karttunen 98
8. Structure Acquisition Use Neural Realization Theoretical OT derives from the theory of abstract neural (connectionist) networks
via Harmonic Grammar (Legendre, Miyata, Smolensky 90)
For moderate complexity, now have general formalisms for realizing
complex symbol structures as distributed patterns of activity over abstract neurons
structure-sensitive constraints/rules as distributed patterns of strengths of abstract synaptic connections
optimization of Harmony
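As a concrete, purely illustrative sketch (mine, not the talk's formalism) of the Harmonic Grammar idea underlying this derivation: each constraint carries a numerical strength, a candidate's Harmony is the weighted sum of its (negative) violation counts, and the grammar outputs the maximum-Harmony candidate. With sufficiently separated strengths this approximates OT's strict domination. Constraint names and candidate strings are hypothetical.

```python
# Minimal Harmonic Grammar sketch (illustrative, not the talk's exact formalism):
# Harmony(candidate) = -sum_k s_k * violations_k(candidate); the grammar
# selects the maximum-Harmony candidate.

def harmony(violation_profile, strengths):
    return -sum(s * v for s, v in zip(strengths, violation_profile))

def hg_optimal(candidates, strengths):
    """candidates: dict mapping candidate -> tuple of violation counts."""
    return max(candidates, key=lambda c: harmony(candidates[c], strengths))

# Hypothetical violation profiles on (ONSET, NOCODA); widely separated
# strengths mimic strict domination (the stronger constraint cannot be outvoted).
candidates = {
    ".CV.CVC.":  (0, 1),
    ".CV.CV.V.": (1, 0),
}
print(hg_optimal(candidates, strengths=(100, 10)))  # ONSET dominant -> ".CV.CVC."
print(hg_optimal(candidates, strengths=(10, 100)))  # NOCODA dominant -> ".CV.CV.V."
```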
9. Program
Structure
OT
Constructs formal grammars directly from markedness principles
Strongly universalist: inherent typology
OT allows completely formal markedness-based explanation of highly complex data
Acquisition
Initial state predictions explored through behavioral experiments with infants
Neural Realization
Construction of a miniature, concrete LAD
10. The Great Dialectic Phonological representations serve two masters. On the one hand, they must avoid marked structure; enforcing this is the job of the Markedness constraints, the first half of the dialectic. On the other hand, phonological representations must interface to the lexicon, which contains a fixed, invariant form; this underlying form must be recoverable from the representation, and requiring that the representation match the lexical form is the job that Faithfulness constraints perform to uphold this second crucial half of the dialectic.
11. OT from Markedness Theory MARKEDNESS constraints: *α: "No α"
FAITHFULNESS constraints
Fα demands that /input/ → [output] leave α unchanged (McCarthy & Prince 95)
Fα controls when α is avoided (and how)
Interaction of violable constraints: Ranking
α is avoided when *α ≫ Fα
α is tolerated when Fα ≫ *α
M1 ≫ M2: combines multiple markedness dimensions
12. OT from Markedness Theory MARKEDNESS constraints: *α
FAITHFULNESS constraints: Fα
Interaction of violable constraints: Ranking
α is avoided when *α ≫ Fα
α is tolerated when Fα ≫ *α
M1 ≫ M2: combines multiple markedness dimensions
Typology: All cross-linguistic variation results from differences in ranking, i.e., in how the dialectic is resolved (and in how multiple markedness dimensions are combined). Whether α is avoided varies cross-linguistically, i.e., the relative ranking of *α and Fα must vary. The most restrictive assumption is that this is the only way grammars may differ; OT's theory of typology is simply that the possible languages are precisely those arising from some ranking of the universal constraints Con.
13. OT from Markedness Theory MARKEDNESS constraints
FAITHFULNESS constraints
Interaction of violable constraints: Ranking
Typology: All cross-linguistic variation results from differences in ranking in resolution of the dialectic
Harmony = MARKEDNESS + FAITHFULNESS
A formally viable successor to Minimize Markedness is OT's Maximize Harmony (among competitors). OT defines "grammatical" as optimally satisfying the ranked constraints, including both Markedness and Faithfulness in the well-formedness measure called Harmony. Maximize Harmony is a formally viable basis for grammar, filling in the crucial half of the dialectic missing in its progenitor, Minimize Markedness. (A toy illustration of optimization under ranked constraints is sketched below.)
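As a concrete sketch (purely illustrative, not from the talk) of Maximize Harmony under strict domination: candidates are compared on their violation counts constraint by constraint in ranking order, and the candidate that survives this lexicographic comparison is grammatical. The constraint definitions and candidate strings below are hypothetical stand-ins for *α and Fα.

```python
# Minimal sketch (not from the talk): OT evaluation as lexicographic
# comparison of violation profiles under a constraint ranking.

def violations(candidate, ranking):
    """Violation profile of a candidate, constraints taken in ranking order."""
    return tuple(constraint(candidate) for constraint in ranking)

def optimal(candidates, ranking):
    """Maximize Harmony under strict domination = minimize the violation
    profile lexicographically, highest-ranked constraint first."""
    return min(candidates, key=lambda cand: violations(cand, ranking))

# Hypothetical stand-ins for *a ("No [a]") and F_a (faithfulness to an /a/
# assumed to be present in the input).
star_a  = lambda cand: cand.count("a")           # one violation per surface [a]
faith_a = lambda cand: 0 if "a" in cand else 1   # violated if the input /a/ is lost

print(optimal(["pa", "pe"], [star_a, faith_a]))  # *a >> F_a : a is avoided  -> 'pe'
print(optimal(["pa", "pe"], [faith_a, star_a]))  # F_a >> *a : a is tolerated -> 'pa'
```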
14. Structure: Explanatory goals achieved by OT
Individual grammars are literally and formally constructed directly from universal markedness principles
Inherent Typology:
Within the analysis of phenomenon F in language L is inherent a typology of F across all languages
To close out this review, let me highlight two explanatory goals achieved by OT. First, … Second, OT exhibits what I call inherent typology: to provide an analysis of F in L, we must hypothesize a set of structures and constraints, at which point we have automatically hypothesized a universal typology for F: all and only the systems that arise by re-ranking our proposed constraints. This is quite a strong version of the goal of couching all linguistic explanation in universal principles.
15. Program
Structure
OT
Constructs formal grammars directly from markedness principles
Strongly universalist: inherent typology
OT allows completely formal markedness-based explanation of highly complex data (Friday)
Acquisition
Initial state predictions explored through behavioral experiments with infants
Neural Realization
Construction of a miniature, concrete LAD
16. Structure: Summary OT builds formal grammars directly from markedness: MARK, with FAITH
Friday:
Inventories consistent with markedness relations are formally the result of OT with local conjunction
Even highly complex patterns can be explained purely with simple markedness constraints: all complexity is in constraint interaction through ranking and conjunction: Lango ATR vowel harmony
17. Program
Structure
OT
Constructs formal grammars directly from markedness principles
Strongly universalist: inherent typology
OT allows completely formal markedness-based explanation of highly complex data
Acquisition
Initial state predictions explored through behavioral experiments with infants
Neural Realization
Construction of a miniature, concrete LAD
18. Nativism I: Learnability Learning algorithm
Provably correct and efficient (under strong assumptions)
Sources:
Tesar 1995 et seq.
Tesar & Smolensky 1993, …, 2000
If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E
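A minimal sketch of this demotion step (my own simplified, binary-violation rendering, not Tesar & Smolensky's full algorithm), using the in-/im- alternation of the next slide as a hypothetical data point:

```python
# Simplified sketch of Constraint Demotion (after Tesar & Smolensky), as
# described above: on hearing A where E was expected, demote each constraint
# violated by A (but not by E) to just below the highest stratum containing a
# constraint violated by E (but not by A).
# A hierarchy is a list of strata (sets of constraint names), highest first.
# Assumes at least one constraint prefers the heard form A.

def constraint_demotion(hierarchy, violated_by_A, violated_by_E):
    # Mark cancellation: ignore constraints violated by both forms.
    a_only = violated_by_A - violated_by_E
    e_only = violated_by_E - violated_by_A
    # Highest stratum containing a constraint violated only by E.
    target = next(i for i, stratum in enumerate(hierarchy) if stratum & e_only)
    if target + 1 == len(hierarchy):      # ensure a stratum exists just below
        hierarchy.append(set())
    # Minimal demotion: move offending constraints only if currently ranked
    # at or above the target stratum.
    for i, stratum in enumerate(hierarchy):
        if i <= target:
            for c in stratum & a_only:
                stratum.remove(c)
                hierarchy[target + 1].add(c)
    return [s for s in hierarchy if s]    # drop any emptied strata

# Hypothetical example: learner expected E = faithful *[inpossible],
# but hears A = [impossible]; FAITH(place) must fall below NPA.
h = [{"FAITH(place)", "NPA"}]             # initial single stratum
h = constraint_demotion(h, violated_by_A={"FAITH(place)"},
                           violated_by_E={"NPA"})
print(h)   # [{'NPA'}, {'FAITH(place)'}]
```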
20. Nativism I: Learnability M ≫ F is learnable with /in+possible/ → impossible
"not" = in-, except when followed by a labial, as in impossible
the exception that proves the rule; M = NPA
M ≫ F is not learnable from data if there are no exceptions (alternations) of this sort, e.g., if the lexicon supplies only inputs with mp, never np: then ✓M and ✓F, no M vs. F conflict, no evidence for their ranking (see the sketch below)
Thus must have M ≫ F in the initial state, H0
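To make the point concrete, here is a tiny demonstration (my own toy constraint definitions and hypothetical forms): with only mp inputs, the two rankings of M (NPA) and F produce identical outputs, so the data carry no evidence for the ranking; only np inputs, i.e. alternations, distinguish them.

```python
# Toy demonstration (hypothetical constraints/forms) of the learnability point.

NPA   = lambda inp, out: out.count("np")                         # markedness: *np
FAITH = lambda inp, out: sum(a != b for a, b in zip(inp, out))   # segmental changes

def optimal(inp, candidates, ranking):
    return min(candidates, key=lambda out: tuple(c(inp, out) for c in ranking))

for inp in ["ampa", "anpa"]:
    m_over_f = optimal(inp, ["ampa", "anpa"], [NPA, FAITH])
    f_over_m = optimal(inp, ["ampa", "anpa"], [FAITH, NPA])
    print(inp, "->", m_over_f, "(M>>F) vs", f_over_m, "(F>>M)")

# /ampa/: both rankings yield [ampa] -- no evidence for the ranking.
# /anpa/: M>>F yields [ampa] (assimilation), F>>M yields [anpa] --
# only such inputs (alternations) reveal the ranking.
```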
21. The Initial State OT-general: MARKEDNESS ≫ FAITHFULNESS
Learnability demands it (Richness of the Base)
(Alan Prince, p.c., 93; Smolensky 96a)
Child production: restricted to the unmarked
Child comprehension: not so restricted
(Smolensky 96b)
22. Nativism II: Experimental Test
23. Nativism II: Experimental Test Linking hypothesis:
More harmonic phonological stimuli → longer listening time
More harmonic:
✓M ≻ *M, when equal on F
✓F ≻ *F, when equal on M
When one must choose between them, it is more harmonic to satisfy M: M ≫ F
M = Nasal Place Assimilation (NPA)
24. Experimental Paradigm
25. 4.5 Months (NPA)
27. 4.5 Months (NPA)
28. 4.5 Months (NPA)
29. Program
Structure
OT
Constructs formal grammars directly from markedness principles
Strongly universalist: inherent typology
OT allows completely formal markedness-based explanation of highly complex data
Acquisition
Initial state predictions explored through behavioral experiments with infants
Neural Realization
Construction of a miniature, concrete LAD
30. The question The nativist hypothesis, central to generative linguistic theory:
Grammatical principles respected by all human languages are encoded in the genome.
Questions:
Evolutionary theory: How could this happen?
Empirical question: Did this happen?
Today: What concretely could it mean for a genome to encode innate knowledge of universal grammar?
31. UGenomics The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device
Proteins → Universal grammatical principles?
SUSPEND DISBELIEF FOR 15 MINUTES!
32. UGenomics The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device
Proteins → Universal grammatical principles?
SUSPEND DISBELIEF FOR 15 MINUTES!
34. UGenome for CV Theory Three levels
Abstract symbolic: Basic CV Theory
Abstract neural: CVNet
Abstract genomic: CVGenome
35. UGenomics: Symbolic Level Three levels
Abstract symbolic: Basic CV Theory
Abstract neural: CVNet
Abstract genomic: CVGenome
37. Basic syllabification: Function Basic CV Syllable Structure Theory
"Basic": no more than one segment per syllable position: .(C)V(C).
The mapping: /underlying form/ → [surface form]
/CVCC/ → [.CV.C V C.] (the medial V is epenthetic), e.g. /pæd+d/ → [pædəd]
Correspondence Theory
McCarthy & Prince 1995 (M&P)
/C1V2C3C4/ → [.C1V2.C3 V C4.]
38. Why basic CV syllabification? The mapping: underlying → surface linguistic forms
Forms simple but combinatorially productive
Well-known universals; typical typology
Mini-component of real natural language grammars
A (perhaps the) canonical model of universal grammar in OT
39. Syllabification: Constraints (Con)
PARSE: Every element in the input corresponds to an element in the output
ONSET: No V without a preceding C
etc. (A toy evaluation using these constraints is sketched below.)
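A minimal sketch (my own toy candidate set and ranking, supplementing PARSE and ONSET with the standard Basic CV constraints FILL and NOCODA, which the "etc." leaves implicit) of how ranked constraints pick out the /CVCC/ → [.CV.C V C.] mapping, and how re-ranking yields a different language:

```python
# Toy Basic CV syllabification (illustrative, not the talk's exact analysis).
# Lowercase 'v'/'c' mark unfilled (epenthetic) positions; `unparsed` counts
# input segments left out of the output.

CANDIDATES = {
    # output syllables                 unparsed input segments
    ".CV.CvC.":   (["CV", "CvC"],       0),  # epenthetic nucleus, final coda
    ".CVC.":      (["CVC"],             1),  # one input C left unparsed
    ".CV.":       (["CV"],              2),  # both input C's left unparsed
    ".CV.Cv.Cv.": (["CV", "Cv", "Cv"],  0),  # two epenthetic nuclei, no codas
}

def PARSE(sylls, unparsed):  return unparsed
def FILL(sylls, unparsed):   return sum(s.count("v") + s.count("c") for s in sylls)
def ONSET(sylls, unparsed):  return sum(not s[0].lower() == "c" for s in sylls)
def NOCODA(sylls, unparsed): return sum(s[-1].lower() == "c" for s in sylls)

def optimal(candidates, ranking):
    return min(candidates,
               key=lambda name: tuple(con(*candidates[name]) for con in ranking))

print(optimal(CANDIDATES, [PARSE, ONSET, FILL, NOCODA]))  # -> ".CV.CvC."
print(optimal(CANDIDATES, [PARSE, ONSET, NOCODA, FILL]))  # -> ".CV.Cv.Cv."
```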
40. UGenomics: Neural Level Three levels
Abstract symbolic: Basic CV Theory
Abstract neural: CVNet
Abstract genomic: CVGenome
42. CVNet Architecture /C1 C2/ → [C1 V C2]
43. Connection substructure
44. PARSE All connection coefficients are +2
45. ONSET All connection coefficients are −1
46. Crucial Open Question (Truth in Advertising): Relation between strict domination and neural networks?
47. CVNet Dynamics Boltzmann machine/Harmony network
Hinton & Sejnowski 83 et seq.; Smolensky 83 et seq.
Stochastic activation-spreading algorithm: higher Harmony → more probable (see the sketch below)
CVNet innovation: connections realize fixed symbol-level constraints with variable strengths
Learning: modification of the Boltzmann machine algorithm to the new architecture
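For concreteness, a sketch of the standard Boltzmann-machine update (generic textbook dynamics, not CVNet's specific architecture or weights): each binary unit is turned on with a probability set by the Harmony gain of turning it on, so that at equilibrium higher-Harmony states are exponentially more probable.

```python
# Stochastic activation spreading in a Harmony/Boltzmann network (generic sketch):
# at equilibrium P(state) is proportional to exp(H(state)/T).

import numpy as np

rng = np.random.default_rng(0)

def harmony(a, W, b):
    return 0.5 * a @ W @ a + b @ a

def gibbs_sweep(a, W, b, T=1.0):
    """One stochastic update of every unit (asynchronous Gibbs sampling)."""
    for i in rng.permutation(len(a)):
        # Harmony gain from setting unit i to 1 rather than 0 (others fixed)
        delta_h = W[i] @ a - W[i, i] * a[i] + b[i]
        p_on = 1.0 / (1.0 + np.exp(-delta_h / T))
        a[i] = 1.0 if rng.random() < p_on else 0.0
    return a

# Tiny example: two units with an excitatory connection prefer to be on together.
W = np.array([[0.0, 2.0],
              [2.0, 0.0]])
b = np.array([-0.5, -0.5])
a = np.zeros(2)
for _ in range(100):
    a = gibbs_sweep(a, W, b, T=0.5)
print(a, harmony(a, W, b))
```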
48. Learning Behavior A simplified system can be solved analytically
The learning algorithm turns out to yield
Δsᵢ = ε · [# violations of constraintᵢ]
49. UGenomics: Genome Level Three levels
Abstract symbolic: Basic CV Theory
Abstract neural: CVNet
Abstract genomic: CVGenome
51. Connectivity geometry Assume 3-d grid geometry
52. ONSET VO segment: N&S S VO
53. Connectivity: PARSE Input units grow south and connect
54. To be encoded How many different kinds of units are there?
What information is necessary (from the source unit's point of view) to identify the location of a target unit, and the strength of the connection with it?
How are constraints initially specified?
How are they maintained through the learning process?
55. Unit types Input units C V
Output units C V x
Correspondence units C V
7 distinct unit types
Each represented in a distinct sub-region of the abstract genome
We help ourselves to implicit machinery to spell out these sub-regions as distinct cell types, located in the grid as illustrated
56. Direction of projection growth Topographic organizations widely attested throughout neural structures
Activity-dependent growth a possible alternative
Orientation information (axes)
Chemical gradients during development
Cell age a possible alternative
57. Projection parameters Direction
Extent
Local
Non-local
Target unit type
Strength of connections encoded separately
58. Connectivity Genome Contributions from ONSET and PARSE:
59. CVGenome: Connectivity
60. Encoding connection strength For each constraint Cᵢ, need to embody:
Constraint strength sᵢ
Connection coefficients (one per source → target pair of cell types)
The product of these is the contribution of Cᵢ to that source → target connection weight (see the sketch below)
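A sketch of this encoding idea (my own illustration): the weight of a connection between two cell types is the sum, over constraints, of the constraint's strength times its coefficient for that pair. The +2 (PARSE) and −1 (ONSET) coefficients come from the earlier slides; the strengths and cell-type names below are hypothetical.

```python
# Connection weight = sum over constraints of (strength s_i) x (coefficient).
# Coefficient values for PARSE (+2) and ONSET (-1) follow earlier slides;
# cell-type names and strengths are hypothetical placeholders.

COEFFICIENTS = {                                  # constraint -> {(source, target): coeff}
    "PARSE": {("input_C", "corr_C"): +2.0},
    "ONSET": {("output_V", "output_C"): -1.0},
}
STRENGTHS = {"PARSE": 3.0, "ONSET": 2.0}          # hypothetical learned strengths s_i

def connection_weight(src, tgt):
    return sum(STRENGTHS[c] * coeffs.get((src, tgt), 0.0)
               for c, coeffs in COEFFICIENTS.items())

print(connection_weight("input_C", "corr_C"))     # 3.0 * +2 =  6.0
print(connection_weight("output_V", "output_C"))  # 2.0 * -1 = -2.0
```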
61. Processing
62. Development
63. Learning
64. CVGenome: Connection Coefficients
65. Abstract Gene Map
66. UGenomics Realization of processing and learning algorithms in abstract molecular biology, using the types of interactions known to be biologically possible and genetically encodable
67. UGenomics Host of questions to address
Will this really work?
Can it be generalized to distributed nets?
Is the number of genes [77=0.26%] plausible?
Are the mechanisms truly biologically plausible?
Is it evolvable?
68. Hopeful Conclusion Progress is possible toward a Grand Unified Theory of the cognitive science of language
addressing the structure, acquisition, use, and neural realization of knowledge of language
strongly governed by universal grammar
with markedness as the unifying principle
as formalized in Optimality Theory at the symbolic level
and realized via Harmony Theory in abstract neural nets which are potentially encodable genetically
69. Hopeful Conclusion Progress is possible toward a Grand Unified Theory of the cognitive science of language