
Modelling Language Evolution Lecture 4: Learning bias and linguistic structure

Modelling Language Evolution Lecture 4: Learning bias and linguistic structure. Simon Kirby University of Edinburgh Language Evolution & Computation Research Unit. Summary – the story so far. What is a model? Why do linguists need computational models?


Presentation Transcript


  1. Modelling Language Evolution. Lecture 4: Learning bias and linguistic structure. Simon Kirby, University of Edinburgh, Language Evolution & Computation Research Unit

  2. Summary – the story so far • What is a model? Why do linguists need computational models? • Modelling learning. One approach: neural nets • Nodes, activations, connection weights, hidden representations • Error-driven learning • Learning syntax: recurrent nets, starting small, critical period • Evolving network structure: genetic algorithms

  3. Learning bias • We have been talking about what learners are “good” or “bad” at – what they can and cannot learn. • We refer to the learner’s prior bias • (This can be given a simple mathematical definition – but let’s not worry about that…) • Prior bias is everything the learner brings to the problem that is independent of the data • Where does the bias come from? • It comes from biology. It is what is innate.
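One common way to make the "simple mathematical definition" concrete, assuming a Bayesian framing that the slide itself does not spell out: the prior bias is the probability P(h) that the learner assigns to each hypothesis (candidate language) h before seeing any data d. The symbols h, h' and d are introduced here for illustration only.

```latex
P(h \mid d) = \frac{P(d \mid h)\, P(h)}{\sum_{h'} P(d \mid h')\, P(h')}
```

Under this reading, P(h) is the only term on the right-hand side that does not depend on the data, which matches the slide's description of bias as everything the learner brings to the problem independently of the data.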

  4. Language universals and learning biases • Christiansen suggests that languages themselves adapt to learners. • So far we have looked at long-distance dependency and embedding… • Christiansen suggests less general targets for explanation: • Branching direction/head-order consistency • Subjacency • Typically, these are assumed to be innate (and therefore evolved by natural selection) • What if they arise naturally from sequential learning biases?

  5. Head-ordering consistency • Languages are typically head-first or head-last. • (for the linguists…) This might be explained with a parameterised version of X-bar theory

  6. Recursive consistency • Christiansen generalises head-ordering in terms of the interaction of recursive rules. • Consistent trees:

  7. Recursive consistency • Christiansen generalises head-ordering in terms of the interaction of recursive rules. • Inconsistent trees:

  8. A simple typology • Typologists construct a space of logically-possible languages and assign each a type • Christiansen’s binary typology: • English is 11100
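A minimal sketch of how such a space of logically possible languages can be enumerated, assuming five binary features (inferred from the five-digit code given for English and the 32 languages used in the experiment). What each digit position encodes is not stated in the transcript, so the positions are left unnamed.

```python
from itertools import product

# Five binary features give 2**5 = 32 logically possible language types.
# The meaning of each position is not given in the transcript, so the
# positions are deliberately left unlabelled here.
N_FEATURES = 5

language_types = ["".join(bits) for bits in product("01", repeat=N_FEATURES)]

print(len(language_types))        # size of the typological space
print("11100" in language_types)  # the code the slide gives for English
```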

  9. Which languages can SRNs learn? • If languages adapt to learning biases (as opposed to the other way round), perhaps some types will be better than others? • Will the SRN biases predict cross-linguistic distribution? • 8x8x8 SRN trained on next-category prediction • Categories: • Singular N, Plural N • Singular V, Plural V • Singular genitive, Plural genitive • Adposition • End of sentence marker
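The 8x8x8 layout can be sketched as a single forward step of a simple recurrent network: 8 input units (one per category), 8 hidden units with a copy-back context layer, and 8 output units giving a next-category prediction. This is an untrained illustration, not the network from the study. The category labels, the weight range, the initial context value of 0.5 and the softmax output are all assumptions made for the sketch.

```python
import math
import random

# Hypothetical short labels for the eight categories listed on the slide.
CATEGORIES = ["N_sg", "N_pl", "V_sg", "V_pl", "Gen_sg", "Gen_pl", "Adp", "EOS"]
N_IN = N_HID = N_OUT = len(CATEGORIES)  # the 8x8x8 layout


def rand_matrix(rows, cols, rng):
    """Small random weights; the range is an assumption."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def one_hot(category):
    return [1.0 if c == category else 0.0 for c in CATEGORIES]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def srn_step(x, context, w_in, w_ctx, w_out):
    """One time step: returns (next-category distribution, new context)."""
    pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_ctx, context))]
    hidden = [1.0 / (1.0 + math.exp(-p)) for p in pre]  # logistic hidden units
    scores = matvec(w_out, hidden)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # softmax output (assumption)
    total = sum(exps)
    # The hidden state is copied back and becomes the next step's context.
    return [e / total for e in exps], hidden


rng = random.Random(0)
w_in = rand_matrix(N_HID, N_IN, rng)
w_ctx = rand_matrix(N_HID, N_HID, rng)
w_out = rand_matrix(N_OUT, N_HID, rng)

context = [0.5] * N_HID  # neutral starting context (assumption)
for category in ["N_sg", "V_sg", "EOS"]:
    prediction, context = srn_step(one_hot(category), context, w_in, w_ctx, w_out)
```

After each step, `prediction` holds one probability per category. Training would adjust the three weight matrices so that this distribution approaches the true distribution over next categories.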

  10. Experimental setup • Nets were trained on each of the 32 languages • 25 nets were trained per language • These came from 5 different initial weight settings and 5 different random training sets • Each training set contained 1000 words • Each net was trained on 7 passes through the data • So: 800 simulations of 7000 words each • Output in terms of mean standard error of predicting the correct probability distribution for the next word
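The arithmetic behind the 800 simulations can be checked with a short sketch. It assumes the 5 weight settings and 5 training sets are fully crossed, which is the natural reading of 25 nets per language. The constant names are illustrative.

```python
from itertools import product

N_LANGUAGES = 32       # all types in the binary typology
N_WEIGHT_INITS = 5     # initial weight settings per language
N_TRAINING_SETS = 5    # random training sets per language
WORDS_PER_SET = 1000
PASSES = 7

# One run per (language, weight setting, training set) combination,
# assuming the two factors are fully crossed.
runs = list(product(range(N_LANGUAGES),
                    range(N_WEIGHT_INITS),
                    range(N_TRAINING_SETS)))

nets_per_language = N_WEIGHT_INITS * N_TRAINING_SETS
words_per_run = WORDS_PER_SET * PASSES

print(nets_per_language, len(runs), words_per_run)  # 25 800 7000
```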

  11. Results 1: Net error v. recursive inconsistency • Net error correlates very well with number of inconsistencies (r=.83, p<.0001)
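For readers unfamiliar with the statistic, here is a minimal sketch of how a Pearson correlation coefficient r is computed. The numbers below are placeholders chosen only to show the calculation. They are not the data behind the reported r=.83.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only: hypothetical inconsistency counts and net errors.
inconsistencies = [0, 1, 2, 3, 4]
net_error = [0.10, 0.14, 0.13, 0.19, 0.22]

r = pearson_r(inconsistencies, net_error)
print(round(r, 2))
```

A value near +1 means error rises steadily with the number of inconsistencies, and a value near 0 means no linear relationship.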

  12. Typological data • 625 languages have been characterised in terms of: • Verb-object order • Adposition order (i.e., prepositions or postpositions) • Genitive order • Grouped according to historical relatedness into 252 genera. (Why?) • This controls for imbalances in the sample that are due to historical epiphenomena.
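One way to see why grouping into genera matters is to compare counts by language with counts by genus on a toy sample. Everything below is hypothetical: the language and genus names are placeholders, the three-digit type codes are an illustrative encoding of the three orders listed on the slide, and the counting rule (a genus counts towards a type if any of its languages has that type) is an assumption rather than the procedure used in the study.

```python
from collections import Counter, defaultdict

# Hypothetical records: (language, genus, type code). Placeholder names only.
sample = [
    ("lang_a", "genus_1", "111"),
    ("lang_b", "genus_1", "111"),
    ("lang_c", "genus_2", "111"),
    ("lang_d", "genus_3", "000"),
    ("lang_e", "genus_3", "000"),
    ("lang_f", "genus_3", "100"),
]


def language_proportions(records):
    """Share of individual languages showing each type."""
    counts = Counter(type_code for _lang, _genus, type_code in records)
    return {t: c / len(records) for t, c in counts.items()}


def genera_proportions(records):
    """Share of genera containing at least one language of each type."""
    genera_by_type = defaultdict(set)
    all_genera = set()
    for _lang, genus, type_code in records:
        genera_by_type[type_code].add(genus)
        all_genera.add(genus)
    return {t: len(g) / len(all_genera) for t, g in genera_by_type.items()}


by_language = language_proportions(sample)
by_genus = genera_proportions(sample)
print(by_language)
print(by_genus)
```

In this toy sample, two closely related languages in the same genus no longer count twice towards their type, so a family that happens to have many surviving members cannot inflate the apparent popularity of its word order.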

  13. Results 2: Net error v. cross-linguistic distribution • Net error correlates significantly, if more modestly, with proportion of genera (r=.35, p<.05)

  14. Conclusions, and potential problems • We have moved from: • Learners adapt to be good at language (via natural selection) • To: • Language adapts to us • Concerns: • What do Christiansen’s results say about Elman’s and Batali’s? • Are the neural nets modelling learning, or processing? • What about other universals (e.g., subjacency)? • Is equating learning difficulty and universal distribution valid? • Where do the languages come from, and what do the errors mean?
