Modelling Language Acquisition with Neural Networks Steve R. Howell A preliminary research plan
Presentation Overview • Goals & challenges of this modelling project • Examination of previous & related research • Overall plan of this project • Implementation and Evaluation details, where available
Project Goals • Model two aspects of human language acquisition in a single neural network through word prediction mechanisms: grammar and semantics • Use only orthographic representations, not phonological • Use a small but functional word corpus (e.g. a child’s basic functional vocabulary?)
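As an aside on what a purely orthographic input might look like, here is a minimal sketch in Python/NumPy. The fixed word length, padding scheme, and function name are illustrative assumptions, not details fixed by the plan:

    import numpy as np

    ALPHABET = "abcdefghijklmnopqrstuvwxyz"
    MAX_LEN = 8  # assumed maximum word length; shorter words are zero-padded

    def encode_word(word):
        """Concatenate one-hot letter vectors into one orthographic input vector."""
        vec = np.zeros(MAX_LEN * len(ALPHABET))
        for i, ch in enumerate(word[:MAX_LEN]):
            vec[i * len(ALPHABET) + ALPHABET.index(ch)] = 1.0
        return vec

    print(encode_word("cat").shape)  # (208,)

Any letter-slot encoding along these lines avoids phonological features entirely, which is the point of the orthographic constraint.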
Challenges • Need a network architecture capable of modelling both grammar and semantics • Most humans learn language phonologically first, reading later. What if phonology is required? • Computational constraints restrict us to a small word corpus; can we achieve functional communication with it?
Previous Research • Elman (1990) • Mozer (1987) • Seidenberg & McClelland (1989) • Landauer et al. (LSA) • Rao & Ballard (1997)
Elman (1990) • Competitors • Strengths • Weaknesses FOR MORE INFO: Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
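For readers unfamiliar with Elman’s model, the following is a minimal sketch of a simple recurrent network (SRN) forward pass of the kind used in Elman (1990). The layer sizes, tanh nonlinearity, and weight initialization are standard assumptions for illustration, not values from the paper:

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hid, n_out = 26, 50, 26
    W_ih = rng.normal(0, 0.1, (n_hid, n_in))   # input -> hidden
    W_hh = rng.normal(0, 0.1, (n_hid, n_hid))  # context (copy of previous hidden) -> hidden
    W_ho = rng.normal(0, 0.1, (n_out, n_hid))  # hidden -> output (next-element scores)

    def step(x, h_prev):
        """One time step: the hidden state mixes current input with saved context."""
        h = np.tanh(W_ih @ x + W_hh @ h_prev)
        return W_ho @ h, h

    h = np.zeros(n_hid)
    for x in np.eye(n_in)[:5]:  # toy sequence of one-hot inputs
        y, h = step(x, h)       # y scores the predicted next element

The key idea is the context copy: the previous hidden state feeds back in as extra input, letting the network find structure in time.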
Mozer (1987) • Competitors • Strengths • Weaknesses FOR MORE INFO: Mozer, M. C. (1987). Early parallel processing in reading: A connectionist approach. In M. Coltheart (Ed.), Attention and Performance XII: The psychology of reading.
Seidenberg & McClelland (1989) • Competitors • Strengths • Weaknesses FOR MORE INFO: Seidenberg, M. S., & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568.
Landauer et al. • “LSA” - a model of semantic learning by statistical regularity detection using Principal Components Analysis • Very large word corpus, significant processing resources required, but good performance • Data set apparently proprietary FOR MORE INFO... Don’t call them, they’ll call you.
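LSA’s core computation is usually described as a truncated singular value decomposition of a word-by-context count matrix (SVD is closely related to the PCA mentioned above). A toy sketch, with an invented 4-word-by-3-document matrix:

    import numpy as np

    X = np.array([[2., 0., 1.],   # rows: words, columns: documents/contexts
                  [1., 1., 0.],
                  [0., 3., 1.],
                  [0., 1., 2.]])

    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = 2                             # keep the k largest components
    word_vectors = U[:, :k] * s[:k]   # low-rank "semantic space" for words

    # Semantic similarity as the cosine between word vectors:
    a, b = word_vectors[0], word_vectors[1]
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

At realistic scale X has tens of thousands of rows and columns, which is where the heavy processing requirements come from.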
Rao & Ballard (1997) • Competitors • Strengths • Weaknesses FOR MORE INFO: Rao, R. P. N., & Ballard, D. H. (1997). Dynamic model of visual recognition predicts neural response properties in the visual cortex. Neural Computation, 9, 721-763.
Overall View of this Project • Architecture as in Rao & Ballard • Recurrent structure is at least as applicable to temporal variability as to spatial variability • Starting with a single-layer network, moving to a multi-layer Rao & Ballard net (sketched below)
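A hedged sketch of the estimation loop in a Rao & Ballard-style hierarchy, simplified to linear generative maps and plain gradient settling (the full model uses a Kalman-filter formulation; the sizes and learning rate here are arbitrary):

    import numpy as np

    rng = np.random.default_rng(1)
    U1 = rng.normal(0, 0.1, (208, 40))  # level-1 generative weights: r1 -> input
    U2 = rng.normal(0, 0.1, (40, 20))   # level-2 generative weights: r2 -> r1
    r1, r2 = np.zeros(40), np.zeros(20)
    lr = 0.05

    def settle(x, steps=50):
        """Iteratively settle the state estimates r1, r2 for one input vector x."""
        global r1, r2
        for _ in range(steps):
            e0 = x - U1 @ r1             # bottom-up residual at the input
            e1 = r1 - U2 @ r2            # residual between levels 1 and 2
            r1 += lr * (U1.T @ e0 - e1)  # driven from below, constrained from above
            r2 += lr * (U2.T @ e1)
        return r1, r2

Each level predicts the one below and is corrected by the residual error; nothing in the mathematics cares whether the lowest dimension is spatial (images) or temporal (word sequences), which is the applicability claim above.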
Overall View (ctd.) • Input representations very high-level • First layer of net is thus a word-prediction (from letters) level • Second layer adds word prediction from previous words (Simple grammar? Words predict next words? See the toy bigram sketch below)
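The toy bigram table below illustrates what “words predict next words” means at its simplest; the tiny corpus is invented:

    from collections import Counter, defaultdict

    corpus = "the dog sees the cat the cat sees the dog".split()
    bigrams = defaultdict(Counter)
    for w, nxt in zip(corpus, corpus[1:]):
        bigrams[w][nxt] += 1

    print(bigrams["the"])  # Counter({'dog': 2, 'cat': 2}): equally likely next words

A second network layer that learned these conditional regularities would, in effect, be learning a first approximation to grammar.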
Overall View (ctd.) • Additional higher levels should add a larger temporal range of influence • Words within a large temporal range of the current word help to predict it • Implies semantic linkage? • Analogous to the LSA “bag of words” approach at these levels (see the co-occurrence sketch below)
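As a concrete picture of the bag-of-words idea, the sketch below counts co-occurrences within a symmetric window around each word; the window size and corpus are illustrative assumptions:

    from collections import Counter, defaultdict

    corpus = "the hungry cat sees the small dog".split()
    window = 2  # assumed temporal range on each side of the current word
    cooc = defaultdict(Counter)
    for i, w in enumerate(corpus):
        for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
            if j != i:
                cooc[w][corpus[j]] += 1

    print(cooc["cat"])  # neighbours within the window, ignoring their order

Order is ignored inside the window, exactly as in LSA’s bag-of-words treatment, so whatever these levels learn should look semantic rather than syntactic.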
Possible Advantages • Lower level units learn grammar, higher level units learn semantics • Effectively combines grammar-learning methods with LSA-like statistical bag-of-words approach • Top-down prediction route allows for possible modification to language generation, unlike LSA
Disadvantages • Relatively complex mathematical implementation (Kalman filter), especially compared to Elman nets (a minimal filter step is sketched below) • Unclear how well higher levels will actually perform semantic learning • While better suited than LSA, it is as yet unclear how to modify the net for language generation
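For reference, a minimal Kalman filter predict/update step, the estimator underlying the Rao & Ballard formulation. All matrices are toy values, not the model’s actual parameters:

    import numpy as np

    A = np.eye(2)               # state transition model
    C = np.array([[1.0, 0.0]])  # measurement model
    Q = 0.01 * np.eye(2)        # process noise covariance
    R = np.array([[0.1]])       # measurement noise covariance

    def kalman_step(x, P, z):
        """One predict/update cycle for state estimate x with covariance P."""
        x_pred = A @ x                      # predict
        P_pred = A @ P @ A.T + Q
        K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)  # Kalman gain
        x_new = x_pred + K @ (z - C @ x_pred)                   # update
        P_new = (np.eye(2) - K @ C) @ P_pred
        return x_new, P_new

    x, P = kalman_step(np.zeros(2), np.eye(2), np.array([0.5]))

Even this bare version shows why the implementation is heavier than an Elman net: matrix inversions and covariance bookkeeping at every step.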
Resources Required • Hope to find information about the basic functional vocabulary of the English language (600-800 words?) • Models of language acquisition of course imply comparison to children’s language learning/usage, not adults’: child data?
Model Evaluation (basic) • If the lower levels learn to predict words from the previous word or two, then we can test as Elman did (sketched below) • If the higher levels learn semantic regularities as in LSA, then we can test as for LSA
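A sketch of the Elman-style test: compare the network’s predicted next-word distribution against the empirical next-word frequencies in the corpus. The softmax and toy numbers are illustrative assumptions:

    import numpy as np

    scores = np.array([2.0, 0.5, -1.0])                # network outputs over 3 candidate words
    predicted = np.exp(scores) / np.exp(scores).sum()  # softmax -> predicted distribution
    empirical = np.array([0.7, 0.3, 0.0])              # observed next-word frequencies
    rmse = np.sqrt(np.mean((predicted - empirical) ** 2))

Low error against the corpus-derived distribution, rather than exact word matches, is the appropriate criterion, since many continuations are legal at any point.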
Model Evaluation (optimistic) • If generative modifications can be made, we might be able to output words/phrases semantically linked to input words/phrases (ElizaNET?) • Child Turing test? (human judges compare model output to real children’s output for the same input?)
Current Status/Next Steps • Still reviewing previous research • Working through implementation details of Rao & Ballard algorithm • Must consider different types of high-level input representations • Need to develop/acquire basic English vocabulary/grammar data set
Thank you. Questions and comments are expressly welcomed. Thoughts on any of the questions raised herein will be extremely valuable.