
Linguistic Computation for Language Learners






Presentation Transcript


  1. Linguistic Computation for Language Learners

  2. Defining the Learning Problem • The output of learning is complex • Examples: that-t, wanna contraction, parasitic gaps, reconstruction, etc. etc. • The output of learning is hard to observe • Crucial input for learning is hard to observe • It’s noisy (on both sides of the ear) • It’s dissimilar from what must be learned • It’s rare • Yet learning is robust • We should be able to describe the learning problem at multiple grains of analysis, just like the output of learning

  3. Obvious variation • English verbs precede their objects (ate the pizza); Japanese verbs follow their objects (piza-o tabeta) • English distinguishes the vowels in sheep and ship; Spanish does not • All Russian verbs encode aspect (± completed action); English verbs do not • etc.

  4. Not-so-obvious variation [slide labels: English - ok, Russian - not ok; English - bad, Italian - ok; Impossible in all languages] • Example 1: Pronoun Interpretation • While John was reading the book, he ate an apple. • While he was reading the book, John ate an apple. • John ate an apple while he was reading the book. • He ate an apple while John was reading the book. • Example 2: Constraints on questions • What do you think Sally ate ___? • What do you think that Sally ate ___? • Who do you think ___ ate the donut? • Who do you think that ___ ate the donut?

  5. Language typology & learning • The Big Idea: identifying constraints on language variation and explaining the success of language learning are essentially the same problem • Universals: properties that are common to all human languages do not need to be learned • Co-variation: clusters of non-universal properties that consistently co-occur in a language reflect a single underlying trait (and so those properties do not need to be learned individually) • Ensuring learning success: any non-obvious property that must be learned should be part of a cluster that includes an ‘obvious’ property, thereby ensuring reliable learning • Current status …

  6. Possibly Universal • Principle C • While John was reading the book he ate an apple. • While he was reading the book John ate an apple. • John ate an apple while he was reading the book. • *He ate an apple while John was reading the book. • Sally thinks that she is the best dancer. • *She thinks that Sally is the best dancer.

  7. Universals [slide labels: English - ok, Russian - not ok; Impossible in all languages; LATE; EARLY] • Example 1: Pronoun Interpretation • While John was reading the book, he ate an apple. • While he was reading the book, John ate an apple. • John ate an apple while he was reading the book. • He ate an apple while John was reading the book. • 3-year-olds in Russian & English: Kazanina & Phillips 2001 • 2.5-year-olds: Lukyanenko et al. 2015

  8. Co-variation [slide label: English - bad, Italian - ok] • Example 2: Constraints on questions • What do you think Sally ate ___? • What do you think that Sally ate ___? • Who do you think ___ ate the donut? • Who do you think that ___ ate the donut? • This variation is linked to the possibility of post-verbal subjects, e.g., telefono Pavarotti (‘Pavarotti called’) • Languages that allow post-verbal subjects also allow the red example • Post-verbal subjects are easy for learners to observe

  9. Indirect learning argument
     A. Learning Happens
        P1. That-t effects are reliable, grammatical effects for individual speakers
        P2. Members of a speech community agree [convergence]
        C1. Need to explain consensus
        P3. There is variation between communities
        C2a. Hard-coding the surface phenomenon is not viable
        C2b. Experience must explain the consensus [learning]
     B. Not Directly
        P4. Evidence for the surface pattern could come from
            P4a. Explicit instruction
            P4b. Distribution in input that reflects the community consensus
            P4c. Input contains cues that are informative to a more ‘selective’ learner
        P5. None of the above works: a. no relevant feedback; b. relevant input is absent/misleading, …
     C. Epiphenomenon of Observables
        P6. Acceptable that-t sentences reflect an alternative structural parse
        P7. There are readily observable correlates of the alternative parse

  10. Co-variation [slide label: English - bad, Italian - ok] • Example 2: Constraints on questions • What do you think Sally ate ___? • What do you think that Sally ate ___? • Who do you think ___ ate the donut? • Who do you think that ___ ate the donut? • This variation is linked to the possibility of post-verbal subjects, e.g., telefono Pavarotti (‘Pavarotti called’) • Languages that allow post-verbal subjects also allow the red example • Post-verbal subjects are easy for learners to observe • Counts (/11308): 159, 2, 13, 0

  11. Null Subject Parameter • Cluster of related properties vary together (Rizzi, Chomsky) • Null subjects • Lack of expletive subjects (‘It is raining’, ‘It is clear that it’s icy outside’) • Post-verbal subjects • Lack of that-trace effects • Suggestion: since the members of this cluster are not independent properties of language – they are reflections of a single underlying trait – a learner need only master one of them in order to know the status of all of them.
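
The suggestion on this slide can be illustrated with a toy sketch. It assumes, purely for illustration, that the cluster really is governed by one binary trait; the class name `NullSubjectSetting` and the function `set_from_postverbal_subjects` are hypothetical and are not taken from Rizzi or Chomsky. The learner observes only one easy-to-see property (post-verbal subjects), and the status of the other three follows from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NullSubjectSetting:
    """Hypothetical single binary trait; the four surface properties derive from it."""
    positive: bool

    @property
    def null_subjects(self) -> bool:
        return self.positive

    @property
    def expletive_subjects_required(self) -> bool:
        # 'It is raining' needs an overt 'it' only under the negative setting.
        return not self.positive

    @property
    def postverbal_subjects(self) -> bool:
        return self.positive

    @property
    def that_trace_effects(self) -> bool:
        # 'Who do you think that __ ate the donut?' is bad only under the negative setting.
        return not self.positive


def set_from_postverbal_subjects(observed: bool) -> NullSubjectSetting:
    """Set the trait from one observable cue.

    Simplification: failing to observe the cue is treated as negative evidence.
    """
    return NullSubjectSetting(positive=observed)


italian_like = set_from_postverbal_subjects(True)
english_like = set_from_postverbal_subjects(False)
print(italian_like.that_trace_effects)
print(english_like.that_trace_effects)
```

The sketch only restates the logic of the proposal. Whether real languages show such a clean cluster is exactly what the following slides question.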

  12. But does it work? • Parameters that work … • Parameter learning mechanisms … • Evidence of parameter-setting in learning …

  13. So far … • Language learning & language typology: invariance & co-variation (~ principles & parameters); hard-to-observe variation must be linked to easy-to-observe variation • Universals don’t need to be learned • Clarification #1: could reflect domain-specific or domain-general properties of humans • Clarification #2: learning-as-experience/practice ≠ learning-as-choosing (cf. walking) • Co-varying properties reflect a shared trait • Clarification #1: we need to guard against accidental co-variation • Clarification #2: ‘shared trait’ usually understood as representational unit, but could involve different connections, e.g., property X makes it possible to learn Y

  14. Next … • Scope of language variation • Null subject parameter & that-trace effects • A valley in Sweden vs. The World • Some properties that are hard-to-observe, yet seem to vary • Verbs, scope • Consistent vs. inconsistent variation • Understanding hard-to-observe variation: island constraints • Reducing variation to other properties • Quantitative measures of variability

  15. Micro-variation • Greatly expanded database of language information: worldwide typological surveys, dense regional dialect projects • Reliable clusters are harder to find. Not good news for learners. • Large-scale studies are biased towards more easy-to-observe phenomena • Important challenge: does variation in ‘non-obvious’ properties show micro-variation? More rigid constraints in domains where learning is more difficult? Testing semantic variation.

  16. Null subjects • Gilligan 1987 (USC PhD): survey of 102 languages • Newmeyer: “These results are not very heartening for […] Rizzi’s theory”

  17. Roberts & Holmberg 2005

  18. In a valley in Sweden … • Mainland Scandinavian (MSc.): Danish, Swedish, Norwegian • Insular Scandinavian (ISc.): Icelandic, Faroese, Medieval MSc.; Älvdalen dialect of Swedish • Cluster of properties distinguishes these two groups • Null non-referential subjects in ISc. (‘Now have __ come many students’) • Non-nominative subjects in ISc. • ‘Stylistic fronting’ in ISc. (‘Forth has come that fished has been illegally’) • Verb-raising across negation in ISc. • Richer subject-verb agreement in ISc. • Like MSc.: modern English, modern French (~) • Like ISc.: Old French, Middle English, Yiddish

  19. Älvdalen

  20. So far … • The structure of cross-linguistic variation is the structure of the learning problem: Universals … Clusters/Parameters … Isolated facts. • Idea: clusters of surface properties reflect a shared underlying trait • Many questions about the prospects of identifying reliable clusters (microvariation). Failure to find reliable clusters does not dissolve the question of hard-to-observe properties • Evidence for microvariation & clusters may be misleading, due to: (i) data sampling bias towards obvious properties; (ii) since clusters reflect abstract traits, their surface realizations may be ambiguous [different consequences for ±obvious properties] • Status of clusters motivates contrasting learning approaches: (i) cue-based; (ii) model-fitting … but either case requires detectable evidence that leads the learner to change

  21. what is the corpus? • questions, relative clauses, etc. • reliable or noisy input data? • hopefully parsed right

  22. What PS model does well • Generalizes beyond input • Distinguishes non-occurring/difficult from non-occurring/impossible sentences ‘learns island constraints’

  23. Does it do so well? • Distinguishing long/hard vs. impossible • Data Sparseness • too many categories • too little data • limits of trigrams • Cross-language variation • Generalizing across dependency types

  24. Data Sparseness? • 9^3 = 729 • 15^3 = 3,375 • 15^4 = 50,625 • Estimated corpus size = 175,000 wh-questions (3-year period) • Approx. 5,000 per month, 160 per day
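
The arithmetic on this slide can be checked with a short sketch. It assumes that the base of each power is the number of categories and the exponent is the n-gram length (trigrams and 4-grams); the slide does not state this, so the reading is an assumption, as are the names `ngram_types`, `CORPUS_SIZE`, `MONTHS` and `DAYS`.

```python
def ngram_types(num_categories: int, n: int) -> int:
    """Number of distinct n-grams over an inventory of num_categories symbols."""
    return num_categories ** n


CORPUS_SIZE = 175_000  # estimated wh-questions over 3 years (figure from the slide)
MONTHS = 3 * 12
DAYS = 3 * 365

for k, n in [(9, 3), (15, 3), (15, 4)]:
    types = ngram_types(k, n)
    # Rough ratio only: one question can contain several n-grams.
    print(f"{k}^{n} = {types:,} possible types; "
          f"{CORPUS_SIZE / types:.1f} questions per possible type")

print(f"about {CORPUS_SIZE / MONTHS:.0f} questions per month")
print(f"about {CORPUS_SIZE / DAYS:.0f} questions per day")
```

The per-month figure comes out slightly under the slide's rounded 5,000. The ratio of questions to possible n-gram types shrinks quickly as categories or n-gram length grow, which is the sparseness worry raised on the slide.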

  25. [slide labels: 2011 Version; April 2012 Version] • a. Who do you think that John met __? 2/20923 • b. Who do you think John met __? 236/20923 • c. *Who do you think that __ left? 0/20923 • d. Who do you think __ left? 24/20923
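
To make these counts easier to compare, the sketch below converts them to rates and scales them to the 175,000-question estimate from the Data Sparseness slide. Scaling the sample rates to that estimate is an assumption made here for illustration, not a step taken on the slides, and the names `TOTAL`, `CORPUS_3_YEARS`, `counts` and `expected_in_three_years` are hypothetical. The transcript does not say which of the two labelled versions the counts belong to.

```python
TOTAL = 20923             # denominator shown on the slide
CORPUS_3_YEARS = 175_000  # estimate from the Data Sparseness slide

# Counts copied from the slide.
counts = {
    "a. object gap, with 'that'": 2,
    "b. object gap, no 'that'": 236,
    "c. subject gap, with 'that' (*)": 0,
    "d. subject gap, no 'that'": 24,
}


def expected_in_three_years(count: int) -> float:
    """Scale a sample count to the estimated 3-year input (assumes a constant rate)."""
    return count / TOTAL * CORPUS_3_YEARS


for label, count in counts.items():
    rate_per_1000 = 1000 * count / TOTAL
    print(f"{label}: {rate_per_1000:.2f} per 1,000; "
          f"about {expected_in_three_years(count):.0f} in 3 years")
```

On these numbers the acceptable sentence (a) and the unacceptable sentence (c) differ by only two tokens in the sample, so raw frequency alone gives the learner very little to separate them.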

  26. Co-variation [slide label: English - bad, Italian - ok] • Example 2: Constraints on questions • What do you think Sally ate ___? • What do you think that Sally ate ___? • Who do you think ___ ate the donut? • Who do you think that ___ ate the donut? • This variation is linked to the possibility of post-verbal subjects, e.g., telefono Pavarotti (‘Pavarotti called’) • Languages that allow post-verbal subjects also allow the red example • Post-verbal subjects are easy for learners to observe

  27. Data Noisiness • The PS corpus of wh-questions is very clean. This is surprising … and it is valuable for their learner’s success • But does this really help the learner? • Input to learners contains many errors, e.g., in agreement, verb complementation: *The plate next to your toys need to be put away. *Can you fill the milk into that cup. [many utterances from non-native speaking parents] • So doesn’t the learner need to assume that all input could be noisy?

  28. Variation in Island Constraints • Some constraints are universal (mostly), some clearly vary • Some variation seems reducible to other properties • Operationalizing “island effects” • Different types of amelioration • Many UMD contributions: Masaya Yoshida, Jon Sprouse, Lisa Pearl, Akira Omaki, Dave Kush, Dustin Chacón, Nick Huang

  29. Island Constraints • Complement clause: … John thinks [that Mary gave a book to which boy]? / Which boy does John think that Mary gave a book to __? • Relative clause: … John likes the book [that Mary gave to which boy]? / *Which boy does John like the book that Mary gave to __? • Adjunct (conditional) clause: … John will cry [if Mary gives a book to which boy]? / *Which boy will John cry if Mary gives a book to __?

  30. That-t Effects [slide label: English - bad, Italian - ok] • Example 2: Constraints on questions • What do you think Sally ate ___? • What do you think that Sally ate ___? • Who do you think ___ ate the donut? • Who do you think that ___ ate the donut? • This variation is linked to the possibility of post-verbal subjects, e.g., telefono Pavarotti (‘Pavarotti called’) • Languages that allow post-verbal subjects also allow the red example • Post-verbal subjects are easy for learners to observe • Counts (/11308): 159, 2, 13, 0

  31. Escapable Relative Clauses
      • English: *The man [who_i [the suit [RC that __i is wearing]] is dirty] arrived late.
      • Japanese: kiteiru yoohuku-ga yogoreteiru sinsi
        (is.wearing suit.nom dirty.is gentleman)
      • Major Subject Construction (Japanese, Korean, Chinese):
        [IP sono sinsi_i-ga [NP [CP pro_i __j kiteiru] [yoohuku_j]]-ga yogoreteiru]
        (that gentleman-nom pro wearing-is suit-nom dirty-is)
        ‘That gentleman is such that the suit that he is wearing is dirty.’
      • [CP Op_i [IP __i [NP [CP pro_i __j kiteiru] yoohuku_j]-ga yogoreteiru] [sinsi_i]]
        (Op pro wearing-is suit-nom dirty-is gentleman)
        ‘The gentleman who the suit that he is wearing is dirty.’

  32. Escapable Relative Clauses
      • Relative clauses: John knows a man [who believes in aliens] / John knows a man [who believes in what] / *What does John know a man [who believes in ___]
      • Swedish: Den teorin känner jag ingen [som tror på __]
        (that theory know I no one [who believes in __])
      • English: This is a theorem that I need to find somebody who understands __. / *I studied the theorem that John met the mathematician who proved __. / What did John go to the store to buy __?

  33. Cross-Language Uniformity
      • English (wh-question formation): Wh-movement cannot escape relative clauses. (Ross 1967)
        *Which boy does John like [NP the book [RC that Mary gave to __ ]]?
        OK: Which boy does John think [CP that Mary gave a book to __ ]
      • Japanese (scrambling): Scrambling cannot escape relative clauses. (Saito 1985)
        *どの男の子に太郎は[NP [RC 花子があげた] 本]が好きなの?
        OK: どの男の子に太郎は[CP 花子が本を__あげたと] 思っているの?
      • Relative clauses are islands

  34. Cross-language Variation • Adjunct clauses are islands in English, but not in Japanese.
      • Japanese: Dono-gakusee-ni Taroo-wa [Hanako-ga __ present-o ageta-ra] nakidasu-no?
        (which-student-dat T-top [H-nom __ present-acc give-cond] cry-Q)
        “Which student will Taroo cry if Hanako gives a present to”
      • English: *Which boy will John cry if Mary gives a present to __?

  35. Typological variation (Yoshida 2006: summary of previous studies and field work) Can we find any features that uniquely distinguish these languages from others? We cannot attribute adjunct (non)islandhood to just one of these features What is the combination of features that contains sufficient features?

  36. Typological variation (Yoshida 2006: summary of previous studies and field work) We cannot attribute adjunct (non)islandhood to just one of these features What is the combination of features that contains sufficient features?

  37. Parasitic Gaps Coordinate Structures

  38. Assignment #2 • Read the following: • Noam Chomsky. 1975. Reflections on Language. NY: Praeger. (chapter 1) • Steven Pinker. 1989. Learnability and Cognition. MIT Press. (chapter 1) • Takuya Goro. 2007. Language-specific constraints on scope interpretation in first language acquisition. PhD dissertation, U of Maryland. (selections) • Each of these works describes a learning problem in a different domain of grammar. To what extent do these problems present the same or different challenges for a learner? To what extent might the challenges be addressed by assuming that the child has the benefit of substantial innate knowledge, or a very powerful distributional learning mechanism? • Due Weds 2/28

  39. Phenomenon #1 • Subject-auxiliary inversion & structure-dependence • Wallace has always liked cheese. / Has Wallace always liked cheese? • Gromit is afraid of penguins. / Is Gromit afraid of penguins? • The dog that is afraid of penguins has always liked cheese. / …?

  40. Phenomenon #2 • Dative alternation • John gave a book to Mary. / John gave Mary a book. • John sent a book to Mary. / John sent Mary a book. • John bought a book for Mary. / John bought Mary a book. • *John donated the museum a painting. • *John delivered Mary a book. • *John purchased Mary a book.

  41. Phenomenon #2 • Locative alternation • John poured the water into the glass. / *John poured the glass with water. • *John filled the water into the glass. / John filled the glass with water. • John sprayed the water onto the wall. / John sprayed the wall with water.

  42. Scope Variation • Scope Flexibility: Some animal ate every piece of food. • Takuya Goro, UMd 2002-7, Assoc. Prof. Tsuda College, Japan

  43. Some animal ate every piece of food. Ambiguous between surface and inverse scope.
