
Presentation Transcript


1. Introduction to Language Acquisition Theory
Janet Dean Fodor, St. Petersburg, July 2013
Class 1. Language acquisition theory: Origins and issues

2. Today’s class – Introduction
• A broad sweep of ‘learnability’ research, as background.
• Ultimate goal of this research: How is it possible for an infant, in just 5 or 6 years, to master the intricate system that is a human language? An extraordinary feat!
• Origins and early discoveries
• Interdisciplinary methods
• Moving toward a closer relation with linguistics and psycholinguistics
• In following classes we will go back over this material and consider it more carefully.
• Feel free to ask as many questions as you like. But don’t feel pressured to understand everything today.

  3. Real children, real language acquisition

4. E-children, models of language acquisition
How do children do it?

5. Goals & methods
• Ultimate goal = a psychological model of the step-by-step processes by which all normally developing children create a rich mental grammar, based on what they hear people saying.
• For adults, language learning is a serious challenge, with very mixed outcomes. For toddlers, essentially perfect outcomes are guaranteed. How?
• Interdisciplinary methods. All of these are needed:
Linguistics – what is a mental grammar like?
Developmental psychology – what stages of knowledge?
Psycholinguistics – how do children parse & comprehend sentences they hear?
Corpus search – what sentences do they hear?
Computational linguistics – simulation experiments to test models.
Formal learning theory – what could be learned from input?

6. Early days: Mathematical theorems
• Mathematical proofs concerning the relation between properties of languages and of the grammars that generate them: the ‘Chomsky hierarchy’. (Chomsky & Miller 1963)
• The first learnability studies were also mathematical. No attempt at psychological plausibility. Just: how in principle could grammars be fitted to sets of sentences?
• Proofs based on abstractions, far from actual human languages. Grammars designated as g1, g2, g3…; languages constituted as arbitrary sets of sentences s1, s2, s3….
• Gold’s Theorem (1967): A proof that no possible learning algorithm could acquire languages in the Chomsky hierarchy on the basis of ‘text presentation’ only (i.e., a sample of the sentences in the language; no information about non-sentences). <Warning: This is an extreme oversimplification!>

7. Psychological implications: innate knowledge
• Gold’s Theorem resonated with psychologists who had observed that children receive little or no negative evidence (about ungrammatical sentences in the target language).
• Mostly anecdotes at first – examples in Class 3. But later, systematic documentation of the uninformativeness of adult responses to child errors. (Marcus 1993)
• Gold’s Theorem thus fueled the conviction among many linguists & psycholinguists that learners must have innate information about what human languages are like, to compensate for missing information in their input.
• Chomsky called it Universal Grammar (UG). UG has been a basis for much linguistic theory ever since.
• More generally: the Poverty of the Stimulus = relevant evidence of all kinds is absent. (Classes 3 and 4.)

8. The Subset Problem – still central to language acquisition theory
• Gold noted that without negative evidence: if a language Lx (a set of sentences) is a proper subset of a language Ly, then a learner who hypothesized Ly when the target was in fact Lx would have made an error not curable by any subsequent input.
• Example: Lx = English; Ly = a language like English but with scrambling.
• For target English, a child guessing Ly would make word-order errors (e.g. Jim fish likes) detectable by English listeners, but not by the learner herself, who hears only positive examples from others!
• This doesn’t happen! If it did, languages would grow larger and larger over time, as one person’s superset error would become input to later learners. But this is not the case. What prevents it?
[Diagram: Lx nested inside the larger set Ly]
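To make the logic concrete, here is a minimal, purely illustrative Python sketch. The mini-languages Lx and Ly are invented stand-ins (a few word strings), not real grammars; the point is only that positive examples drawn from the subset language can never contradict the superset hypothesis.

```python
# Toy illustration of the Subset Problem (hypothetical mini-languages, not real grammars).
# Lx: strict SVO word order; Ly: SVO plus a "scrambled" SOV order (a proper superset of Lx).

Lx = {"jim likes fish", "ann eats rice"}                # target: English-like
Ly = Lx | {"jim fish likes", "ann rice eats"}           # superset: adds scrambling

# A learner who wrongly hypothesizes Ly receives only positive examples from Lx.
# Every such example is also in Ly, so no input ever contradicts the superset guess.
for sentence in Lx:
    assert sentence in Ly   # never fails: the error is invisible to the learner

print("No positive example from Lx can falsify the superset hypothesis Ly.")
```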

9. How to solve the Subset Problem?
• Since children rarely if ever make superset errors, our models of learning must include some factor that prevents them.
• Gold’s proposal: the possible grammar hypotheses are priority-ordered such that every ‘subset grammar’ takes precedence over all of its ‘superset grammars’.
• Learners would need innate knowledge of this ordering, and an innate strategy of systematically testing grammars in order of priority. The learner can move on from Li to Li+1 only if Li fails on an input sentence.
• This ‘enumeration’ approach works! But it has since been rejected as linguistically unrealistic. It predicts that some languages are acquired vastly more slowly than others, just because they appear late in the enumeration – regardless of how distinctive their sentences are. (Pinker 1979)
• We’ll consider other possible solutions in Class 5.
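Below is a rough Python sketch of the kind of enumeration learner just described. The grammars are hypothetical toy predicates over strings; the code only shows the control structure: keep the earliest grammar in the innate ordering that is consistent with all input seen so far, and move down the list only when the current hypothesis fails on an input sentence.

```python
def enumeration_learner(grammars, text):
    """Gold-style enumeration learning, sketched for illustration.
    `grammars` is a priority-ordered list of predicates (subset grammars before
    their supersets); `text` is a stream of positive example sentences."""
    i = 0
    seen = []
    for sentence in text:
        seen.append(sentence)
        # advance past every grammar that rejects some sentence observed so far
        while not all(grammars[i](s) for s in seen):
            i += 1
    return i  # index of the current hypothesis once the text is exhausted

# Hypothetical toy grammars: g0 accepts only 2-word sentences, g1 accepts anything.
g0 = lambda s: len(s.split()) == 2
g1 = lambda s: True
print(enumeration_learner([g0, g1], ["jim sleeps", "jim likes fish"]))  # -> 1
```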

10. Moving closer to linguistics…
• Wexler & Culicover (1980) developed a learning model that incorporated:
linguistic theory (Extended Standard Theory, Chomsky 1973)
psychological resource limits (i.e., at most 3-clause sentences needed for acquisition, “degree-2 embedding”).
• In EST linguistics, surface structures of sentences were derived from deep structures by long derivations, with cyclic application of many transformational rules.
• To achieve their proof of learnability, W&C had to make some very strong assumptions: that deep structures are universal, and that a unique association with meaning provides the learner with the deep structure of each input sentence.
• Thus, only the transformational rules needed to be learned.

11. Acquiring T-rules (W&C)
• Learner hears a sentence, guesses its meaning from context, represents the DS, applies currently hypothesized T-rules to generate a surface string of words. If it matches the input string, retain the current grammar hypothesis.
• If the strings mismatch, the learner either deletes one T-rule from the grammar at random, or adds at random any one T-rule that creates a match.
• Psychologically implausible! Also, the resulting grammar might be less adequate than the one it replaced (e.g., deletion of a needed T-rule).
• Much trial-and-error is needed to arrive at the target grammar.
• W&C proved (heroically!) that such grammars could be reliably acquired in this fashion, but only if the T-rules were constrained in certain ways.
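The trial-and-error loop can be sketched roughly as follows, in Python, with invented stand-in representations: deep structures and surface strings as word lists, T-rules as functions on lists. This is not W&C’s formal system, just an illustration of the match/mismatch logic described above.

```python
import random

def derive(deep_structure, rules):
    """Apply the currently hypothesized T-rules, in order, to a deep structure."""
    s = list(deep_structure)
    for rule in rules:
        s = rule(s)
    return s

def wexler_culicover_step(grammar, deep_structure, heard, candidate_rules):
    """One trial-and-error learning step, loosely after W&C (1980).
    `grammar` is the learner's current list of T-rules; `heard` is the surface
    string of the input sentence; `candidate_rules` is a hypothetical space of
    rules the learner may add."""
    if derive(deep_structure, grammar) == heard:
        return grammar                       # match: retain the current grammar
    if grammar and random.random() < 0.5:
        g = list(grammar)
        g.pop(random.randrange(len(g)))      # delete one T-rule at random
        return g
    fixes = [r for r in candidate_rules
             if derive(deep_structure, grammar + [r]) == heard]
    return grammar + [random.choice(fixes)] if fixes else grammar  # add a rule that creates a match

# Hypothetical example: the DS is verb-final; one candidate rule reorders verb and object.
move_verb_left = lambda s: [s[0], s[2], s[1]] if len(s) == 3 else s
new_grammar = wexler_culicover_step([], ["jim", "fish", "likes"],
                                    ["jim", "likes", "fish"], [move_verb_left])
print(len(new_grammar))   # -> 1: the matching rule was added
```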

12. Close convergence with linguistic theory
• Very exciting! These constraints, proven necessary for learnability, had theoretical linguistic credentials.
• W&C’s Binary Principle disallowed transformational operations over more than two adjacent clauses. Very like the Subjacency Condition that Chomsky had argued for on purely linguistic grounds.
• W&C’s Freezing Principle incorporated several other limits on transformational applications already observed by linguists.
• To be useful, learners must innately know these constraints. So these parallels with linguistic theory were welcomed as support for highly specific innate linguistic knowledge.
• Though not very psychologically realistic (!), this work paved the way toward linguistically-informed learning models.
• But it had a short life. Published in 1980. Then in 1981… !!

13. 1981: No more transformational rules!
• Through the 1970s, linguists had been streamlining the original-style T-rules, each with its own built-in constraints. Instead, general constraints on rule application (as in W&C’s learnability proof). Encouraging for W&C, but…
• The end result was one single T-rule: Move α (later, Affect α). This means: any possible transformational operation is free to apply at any point in a derivation, except where it is blocked by the general constraints (e.g., Subjacency).
• Consequence: no way to register variation between languages in terms of different sets of transformational rules.
• The newly introduced parameters took on this role.
• The syntactic component was now held to be fully innate and universal, except for a finite number of choice points.

14. New look: Syntax acquisition as setting parameters
• The Headedness parameter: Are syntactic phrases head-initial (e.g., in VP, the verb precedes its object) or head-final (the verb follows the object)?
• The Wh-movement parameter: Does a wh-phrase move to the top (the Complementizer projection) of a clause, or does it remain in situ?
• Now, all the learner has to do is detect the correct setting of each of a finite number of parameters, based on the input.
• Parameter values were said to be ‘triggered’ by the learner’s encounter with a distinctive property of an input sentence.
• This would greatly reduce the workload of data-processing, and would help address the Poverty of the Stimulus problem.
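A minimal Python sketch of the triggering idea, using hypothetical parameter names and trigger properties (the real triggering relation is defined over linguistic analyses of sentences, not feature dictionaries; everything below is an invented stand-in).

```python
# Each parameter is a binary switch, flipped when the learner encounters a sentence
# exhibiting the distinctive ("triggering") property for one of its values.

PARAMETERS = {"head_initial": None, "wh_movement": None}   # None = not yet set

def detect_triggers(sentence_properties):
    """`sentence_properties` is an assumed analysis of one input sentence,
    e.g. {"verb_precedes_object": True, "wh_phrase_fronted": False}."""
    settings = {}
    if "verb_precedes_object" in sentence_properties:
        settings["head_initial"] = sentence_properties["verb_precedes_object"]
    if "wh_phrase_fronted" in sentence_properties:
        settings["wh_movement"] = sentence_properties["wh_phrase_fronted"]
    return settings

def set_parameters(grammar, input_stream):
    for props in input_stream:
        for param, value in detect_triggers(props).items():
            grammar[param] = value           # flip the switch on encountering a trigger
    return grammar

# English-like input: verb before object, wh-phrases fronted.
print(set_parameters(dict(PARAMETERS),
                     [{"verb_precedes_object": True, "wh_phrase_fronted": True}]))
```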

15. Parameter setting as flipping switches
Chomsky never provided a specific implementation of parametric triggering. He often employed the metaphor of setting switches. In the next class (Wednesday), we’ll see how this metaphor might be cashed out as specific psychological computations.
Today we covered the early days: Applied Math → Linguistics
We’ll move on through: Computer Science → Psycholinguistics
Focus will be on how to set syntactic parameters.
READING: Please read for Wednesday (Class 2) the 7-page encyclopedia entry “Principles and parameters theory and language acquisition” by Snyder & Lillo-Martin. More examples of syntactic parameters. Related to child performance data and learning models. Well-presented.
