
Modeling Language Acquisition with Neural Networks


Presentation Transcript


  1. Modeling Language Acquisition with Neural Networks A preliminary research plan Steve R. Howell

  2. Presentation Overview • Goals & challenges • Previous & related research • Model Overview • Implementation details

  3. Project Goals • Model two aspects of human language (grammar and semantics) • Use single neural network performing word prediction • Use orthographic representation • Use small but functional word corpus • e.g. child’s basic functional vocabulary?

  4. Challenges • Need a network architecture capable of modeling both grammar and semantics • What if phonology is required? • Computational limitations

  5. Previous Research

  6. Previous Research • Elman (1990) • Mozer (1987) • Seidenberg & McClelland (1989) • Landauer et al. (LSA) • Rao & Ballard (1997)

  7. Elman (1990) • Simple recurrent network (context units) • No built-in representational constraints • Predicts next input from current input plus context • Discovers word boundaries in a continuous stream of phonemes FOR MORE INFO... Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
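
For concreteness, here is a minimal numpy sketch of one step of an Elman-style simple recurrent network; the layer sizes, tanh/softmax choices, and variable names are illustrative assumptions, not details taken from the slides.

```python
import numpy as np

def srn_step(x, context, W_in, W_ctx, W_out):
    """One step of an Elman-style simple recurrent network: the hidden
    layer sees the current input plus a copy of its own previous
    activation (the context units), and the output predicts the next input."""
    hidden = np.tanh(x @ W_in + context @ W_ctx)        # new hidden state
    logits = hidden @ W_out
    prediction = np.exp(logits) / np.exp(logits).sum()  # softmax over possible next inputs
    return prediction, hidden                           # hidden becomes the next step's context

# Illustrative sizes: 26 input units, 50 hidden/context units.
rng = np.random.default_rng(0)
W_in  = rng.normal(scale=0.1, size=(26, 50))
W_ctx = rng.normal(scale=0.1, size=(50, 50))
W_out = rng.normal(scale=0.1, size=(50, 26))
context = np.zeros(50)
x = np.eye(26)[7]                                       # one-hot current input
prediction, context = srn_step(x, context, W_in, W_ctx, W_out)
```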

  8. Mozer (1987) - BLIRNET • Interesting model of spatially parallel processing of independent words • Possible input representation: letter triples • e.g. MONEY = **M, *MO, MON, …, EY*, Y** (* = word boundary) • Encodes beginnings and ends of words well, as well as relative letter position, which is important for fitting human relative-position priming data FOR MORE INFO... Mozer, M. C. (1987). Early parallel processing in reading: A connectionist approach. In M. Coltheart (Ed.), Attention and Performance XII: The psychology of reading.
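
A small sketch of the letter-triple idea, assuming '*' marks word boundaries as in the slide's example; the function name and exact padding convention are illustrative.

```python
def letter_triples(word, boundary="*"):
    """Return the overlapping letter triples of a word, padded with
    boundary markers so word beginnings and endings are encoded
    explicitly (e.g. '**M' and 'Y**' for 'MONEY')."""
    padded = boundary * 2 + word.upper() + boundary * 2
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

print(letter_triples("money"))
# ['**M', '*MO', 'MON', 'ONE', 'NEY', 'EY*', 'Y**']
```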

  9. Siedelberg & McLelland (1989) • Model of word pronunciation • Again, relevant for the input representations used in its orthographic input - triples: • MAKE = **MA, MAK, AKE, KE** (**= WS) • Distributed coding scheme for triples = distributed internal lexicon Seidelberg, M.S. & McClelland J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568 FOR MORE INFO...

  10. Landauer et al. • “LSA” - a statistical model of semantic learning • Very large word corpus • Significant computation required • Good performance • Data set apparently proprietary FOR MORE INFO... Don’t call them, they’ll call you.
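
A toy illustration of the LSA pipeline (term-by-document counts, truncated SVD, cosine similarity between word vectors); the counts and dimensionality are made up, and the log/entropy weighting steps of full LSA are omitted.

```python
import numpy as np

# Toy term-by-document co-occurrence counts (rows = words, columns = passages).
words = ["doctor", "nurse", "hospital", "apple", "banana"]
counts = np.array([
    [4, 3, 0, 0],   # doctor
    [3, 2, 1, 0],   # nurse
    [2, 4, 0, 1],   # hospital
    [0, 0, 5, 2],   # apple
    [0, 1, 3, 4],   # banana
], dtype=float)

# LSA core step: factor the matrix and keep the top k singular dimensions.
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
word_vectors = U[:, :k] * s[:k]          # each row is a word's semantic vector

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(word_vectors[0], word_vectors[1]))  # doctor vs nurse: should be relatively high
print(cosine(word_vectors[0], word_vectors[3]))  # doctor vs apple: should be low
```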

  11. Rao & Ballard (1997) • Basis for present network • Algorithms based on extended Kalman Filter • Internal state variable is output of input*weights • (Internal state)*(transpose of feedforward weights) feeds back to predict next input Rao, R.P.N. & Ballard, D.H. (1997). Dynamic model of visual recognition predicts neural response properties in the visual cortex. Neural Computation, 9, 721-763. FOR MORE INFO...

  12. Model Overview

  13. Model Overview • Architecture as in Rao & Ballard • Recurrent structure excellent for temporal variability • Starting with single layer network • Moving to multi-layer Rao & Ballard net

  14. Model Overview (cont’d) • High-level input representations • First layer of net performs word prediction from letters • Second layer adds word prediction from previous words • Words predict next words - Simple grammar?
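
A rough sketch of how the two levels might combine, assuming softmax word scores; the weight matrices, sizes, and function names are placeholders, not the planned implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
n_letter_units, vocab = 400, 50                                   # illustrative sizes
W_letters = rng.normal(scale=0.1, size=(n_letter_units, vocab))   # level 1: letters -> word
W_prev    = rng.normal(scale=0.1, size=(vocab, vocab))            # level 2: previous word -> next word

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict_word(letter_vector, prev_word_probs):
    """Level 1 scores the current word from its orthographic (letter-triple)
    input; level 2 adds a score from the previous word, so sequential
    context can sharpen what the letters alone leave ambiguous."""
    bottom_up = letter_vector @ W_letters
    top_down  = prev_word_probs @ W_prev
    return softmax(bottom_up + top_down)

word_probs = predict_word(rng.random(n_letter_units), np.full(vocab, 1.0 / vocab))
```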

  15. Model Overview (cont’d) • Additional higher levels should add larger temporal range of context • Words in a large temporal range of the current word help to predict it • Implies semantic linkage? • Analogous to LSA “Bag of words” approach at these levels
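
A sketch of one way a higher level could maintain an order-free, bag-of-words context over a temporal window; the window length, plain counting, and class name are assumptions for illustration only.

```python
import numpy as np
from collections import deque

class BagOfWordsContext:
    """Order-free context over a larger temporal range: a sliding window
    of recently seen word indices, summarized as normalized counts and
    usable as an extra input when predicting the current word."""
    def __init__(self, vocab_size, window=20):
        self.vocab_size = vocab_size
        self.window = deque(maxlen=window)

    def update(self, word_index):
        self.window.append(word_index)

    def vector(self):
        v = np.zeros(self.vocab_size)
        for idx in self.window:
            v[idx] += 1.0
        return v / max(len(self.window), 1)

context = BagOfWordsContext(vocab_size=50)
for w in [3, 17, 3, 42]:        # indices of recently seen words (illustrative)
    context.update(w)
print(context.vector()[3])      # 0.5: word 3 makes up half of the current window
```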

  16. Possible Advantages • Lower level units learn grammar • Higher level units learn semantics • Combines grammar-learning methods with “bag of words” approach • Possible modification to language generation

  17. Disadvantages • Complex mathematical implementation • Unclear how well higher levels will actually perform • As yet unclear how to modify the net for language generation

  18. Implementation Details

  19. Implementation Challenges • Locating basic functional vocabulary of English language • 600-800 words? • Compare to children’s language learning/usage, not adults • Locating child data?

  20. Model Evaluation (basic) • Test grammar learning as per Elman • Test semantic regularities as for LSA
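
Two hedged helper sketches for these checks: an Elman-style comparison of predicted versus empirical next-word distributions, and an LSA-style nearest-neighbor lookup over learned word vectors. Both function names are hypothetical.

```python
import numpy as np

def grammar_score(predicted, empirical):
    """Elman-style grammar check: compare the network's next-word
    distribution with the empirical distribution of what actually follows
    in the corpus (cosine here; Elman also reported RMS error)."""
    return predicted @ empirical / (np.linalg.norm(predicted) * np.linalg.norm(empirical))

def nearest_neighbors(word_vectors, i, k=3):
    """LSA-style semantics check: the k words most similar to word i by
    cosine similarity over the learned word vectors."""
    norms = np.linalg.norm(word_vectors, axis=1)
    sims = (word_vectors @ word_vectors[i]) / (norms * norms[i])
    return np.argsort(-sims)[1:k + 1]
```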

  21. Model Evaluation (optimistic) If generative modifications possible: • Ability to output words/phrases semantically linked to input? • ElizaNet? • Child Turing Test? • Human judges compare model output to real children's output for same input?

  22. Current Status • Continue reviewing previous research • Working through implementation details of Rao & Ballard algorithm • Considering different types of high-level input representations • Need to develop/acquire basic English vocabulary/grammar data set

  23. Thank you. Questions and comments are sincerely welcomed. Thoughts on any of the questions raised herein will be extremely valuable. FOR MORE INFO... Please see my web page at: http://www.the-wire.com/~showell/
