The Emergentist Approach to Language, as Embodied in Connectionist Networks
James L. McClelland, Stanford University
Some Simple Sentences
• The man liked the book.
• The boy loved the sun.
• The woman hated the rain.
The Standard Approach: Units and Rules
• Units, from largest to smallest: sentences; clauses and phrases; words; morphemes; phonemes.
• Rules:
S -> NP VP
VP -> Tense V (NP)
Tense V -> V + {past}
V -> like, love, hate…
N -> man, boy, sun…
man -> /m/ /ae/ /n/
{past} -> ed
ed -> /t/ or /d/ or /^d/‡
‡ depends on context
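A tiny sketch of what “units and rules” means computationally: a toy generator that expands rewrite rules like the ones above. The rule table is a simplified, hypothetical fragment (agreement, tense spell-out, and phonology are ignored), not the grammar the slides assume.

```python
import random

# Toy fragment of the rewrite rules from the slide; terminals are
# plain words, nonterminals are keys in the table. Simplified:
# {past} is spelled out directly in the V+past expansions.
GRAMMAR = {
    "S":      [["NP", "VP"]],
    "NP":     [["the", "N"]],
    "VP":     [["V+past", "NP"]],
    "V+past": [["liked"], ["loved"], ["hated"]],
    "N":      [["man"], ["boy"], ["sun"], ["book"], ["rain"]],
}

def expand(symbol):
    """Recursively rewrite a symbol until only words remain."""
    if symbol not in GRAMMAR:
        return [symbol]                       # terminal word
    expansion = random.choice(GRAMMAR[symbol])
    return [word for part in expansion for word in expand(part)]

random.seed(3)
print(" ".join(expand("S")))  # e.g. "the man liked the book"
```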
Standard Approach to the Past Tense
• We form the past tense by using a (simple) rule.
• If an item is an exception, the rule is blocked.
  • So we say ‘took’ instead of ‘taked’.
• If you’ve never seen an item before, you use the rule.
• If an item is an exception but you forget the exceptional past tense, you apply the rule.
• Predictions:
  • Regular inflection of ‘nonce forms’:
    • This man is blinging. Yesterday he …
    • This girl is tupping. Yesterday she …
  • Over-regularization errors:
    • goed, taked, bringed
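To make the rule-plus-blocking story concrete, here is a minimal Python sketch. The exception list and the spelling-based suffix cases are illustrative assumptions standing in for the real phonological rule (/t/, /d/, or /^d/ depending on context).

```python
# Minimal sketch of the "units and rules" account of the past tense.
# Exceptions block the rule: if a verb is listed here, the stored
# form is retrieved and the regular rule never applies.
EXCEPTIONS = {"take": "took", "go": "went", "bring": "brought"}

def add_regular_suffix(verb: str) -> str:
    """Simplified, spelling-based stand-in for the -ed rule."""
    if verb.endswith("e"):
        return verb + "d"            # love -> loved
    if verb.endswith("y") and verb[-2] not in "aeiou":
        return verb[:-1] + "ied"     # carry -> carried
    return verb + "ed"               # bling -> blinged

def past_tense(verb: str) -> str:
    # Exception lookup blocks the rule; otherwise the rule applies,
    # even to novel ("nonce") verbs. If the exception were forgotten,
    # over-regularizations like "taked" would result.
    return EXCEPTIONS.get(verb, add_regular_suffix(verb))

print(past_tense("take"))   # took   (exception blocks the rule)
print(past_tense("bling"))  # blinged (nonce form gets the rule)
```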
The Emergentist Approach
• Language (like perception, etc.) arises from the interactions of neurons, each of which operates according to a common set of simple principles of processing, representation, and learning.
• Units and rules are useful for approximately describing what emerges from these interactions, but they have no mechanistic or explanatory role in language processing, language change, or language learning.
An Emergentist Theory: Natural Selection
• No grand design.
• Organisms produce offspring with random differences (mating helps with this).
• Forces of nature favor those best suited to survive.
• Survivors leave more offspring, so their traits are passed on.
• The full range of the animal kingdom, including all the capabilities of the human mind, emerges from these very basic principles.
An Emergentist/Connectionist Approach to the Past Tense
• Knowledge is in connections.
• Experience causes connections to change.
• Sensitivity to regularities emerges.
The RM Model
• The Rumelhart and McClelland (1986) model learns from verb [root, past tense] pairs:
  • [like, liked]; [love, loved]; [carry, carried]; [take, took]
• Present and past are represented as patterns of activation over units that stand for phonological features.
Examples of ‘wickelfeatures’ in the verb ‘baked’
• Starts with a stop followed by a vowel.
• Has a long vowel preceded by a stop and followed by a stop.
• Ends with a dental stop preceded by a velar stop.
• Ends with an unvoiced sound preceded by another unvoiced sound.
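The actual wickelfeature coding is coarser than raw phoneme triples, but the context-sensitivity is easy to illustrate. A minimal sketch, using raw ‘wickelphone’ triples over a simplified phoneme inventory rather than the feature-based encoding the model actually uses:

```python
# Context-sensitive encoding in the spirit of Wickelphones: each
# phoneme is represented together with its left and right neighbor,
# with '#' marking the word boundary.
def wickelphones(phonemes):
    padded = ["#"] + list(phonemes) + ["#"]
    return [tuple(padded[i - 1:i + 2]) for i in range(1, len(padded) - 1)]

# 'baked' as a rough phoneme sequence: /b/ /ey/ /k/ /t/
print(wickelphones(["b", "ey", "k", "t"]))
# [('#','b','ey'), ('b','ey','k'), ('ey','k','t'), ('k','t','#')]
```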
A Pattern Associator Network
[Figure: a pattern representing the sound of the verb root feeds through a matrix of connections into a pattern representing the sound of the verb’s past tense; each output unit’s summed input determines its probability of becoming active, p(a=1).]
Learning rule for the Pattern Associator network
• For each output unit:
  • Determine the activity of the unit based on its input.
  • If the unit is active when the target is not:
    • Reduce each weight coming into the unit from each active input unit.
  • If the unit is inactive when the target is active:
    • Increase each weight coming into the unit from each active input unit.
• Each connection-weight adjustment is very small.
• Learning is gradual and cumulative.
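A minimal sketch of this learning rule in Python (essentially the perceptron rule). The layer sizes, the binary coding, and the learning rate are illustrative assumptions, and the probabilistic activation p(a=1) in the figure is replaced here by a deterministic threshold for simplicity.

```python
import numpy as np

def step(W, x):
    """Each output unit is active iff its summed input exceeds zero."""
    return (W @ x > 0).astype(float)

def train_pair(W, x, t, lr=0.05):
    """The slide's rule: lower weights from active inputs into units
    that fire when the target is off; raise them into units that are
    silent when the target is on. (t - a) is nonzero only on errors."""
    a = step(W, x)
    W += lr * np.outer(t - a, x)   # small, gradual adjustments

rng = np.random.default_rng(0)
n_in, n_out = 16, 16
W = np.zeros((n_out, n_in))                      # matrix of connections
root = rng.integers(0, 2, n_in).astype(float)    # verb-root pattern
past = rng.integers(0, 2, n_out).astype(float)   # past-tense pattern
for _ in range(100):
    train_pair(W, root, past)
print(bool((step(W, root) == past).all()))       # True once learned
```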
Over-regularization errors in the RM network
• The network was first trained on only the ten most frequent past tenses in English: felt, had, made, got, gave, took, came, went, looked, needed.
• 400 more words were then introduced; over-regularization errors appeared after this point.
[Figure: performance over training, annotated at the point where the 400 additional words were introduced.]
Some features of the model
• Regulars co-exist with exceptions.
• The model produces the regular past for most unfamiliar test items.
• The model captures the different subtypes among the regulars: like-liked, love-loved, hate-hated.
• The model is sensitive to the no-change pattern among irregulars (hit-hit, cut-cut) and to near-no-change items like hide-hid.
Additional characteristics
• The model exploits gangs of related exceptions: dig-dug, cling-clung, swing-swung.
• The ‘regular pattern’ infuses exceptions as well as regulars: say-said, do-did, have-had, keep-kept, sleep-slept, burn-burnt, teach-taught.
Extensions
• As noted several times this quarter, networks that map fixed inputs directly to outputs, with no hidden units in between, are limited in what they can learn (a sketch follows this list).
• Subsequent models have therefore relied on intermediate layers of units between inputs and outputs.
• The principles of the model apply to other forms of information, not only phonological information.
• If semantic as well as phonological information about an item is available as part of the input, then the model will exploit semantic as well as phonological similarity (as we argue humans do).
• One can re-formulate the learning problem as one involving a speaker and a hearer. The learning problem for the speaker then becomes:
  • Produce an output that allows the listener to reconstruct the message you wish to transmit (e.g. hit + past; read + present).
  • Keep the output as short as possible, as long as it can be understood.
• This causes networks to create a special kind of irregular form, which we will call reduced regulars: did, made, had, said.
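As promised above, a toy illustration of why hidden units matter; this is my own construction, not a model from the slides. XOR cannot be learned by a single layer of modifiable connections, but a small backpropagation network with one hidden layer handles it. All sizes and rates are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets
Xb = np.hstack([X, np.ones((4, 1))])             # input plus bias unit

W1 = rng.normal(0, 1, (3, 8))   # input+bias -> 8 hidden units
W2 = rng.normal(0, 1, (9, 1))   # hidden+bias -> output unit

for _ in range(10_000):
    H = sig(Xb @ W1)                          # hidden activations
    Hb = np.hstack([H, np.ones((4, 1))])      # plus bias unit
    Y = sig(Hb @ W2)                          # output activations
    dY = (Y - T) * Y * (1 - Y)                # error signal at output
    dH = (dY @ W2[:-1].T) * H * (1 - H)       # backpropagated error
    W2 -= 1.0 * Hb.T @ dY
    W1 -= 1.0 * Xb.T @ dH

print(np.round(Y.ravel(), 2))  # should approach [0, 1, 1, 0]
```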
Lupyan & McClelland (2003) Model of Past Tense Learning
• Weights in the network are adjusted using an error-correcting rule, so that the network learns to map phonological input onto specified target outputs.
• Activations of input units are also adjusted: to ensure that the input contains the information needed to produce the output, and to allow the input to be as simple as possible.
• Initial input patterns are fully regular.
• High-frequency items are presented more often during training.
• Because the network’s weights are stronger for these items, the network can afford to reduce the input for these high-frequency items.
[Figure: phonological input, subject to reduction, is mapped to a semantic target pattern corresponding to the intended meaning.]
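A schematic sketch of the two pressures just described. This is my own toy construction, not the Lupyan & McClelland architecture: weights follow an error-correcting rule, while input activations are nudged both toward supporting the target and toward being as small (reduced) as possible via an assumed L1-style simplicity penalty. All sizes and rates are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
sig = lambda z: 1 / (1 + np.exp(-z))

n_in, n_out = 12, 12
W = rng.normal(0, 0.1, (n_out, n_in))    # connection weights
x = rng.uniform(0.6, 1.0, n_in)          # initially fully specified input
t = (rng.uniform(size=n_out) > 0.5).astype(float)  # semantic target

for _ in range(2000):
    y = sig(W @ x)
    err = t - y
    W += 0.1 * np.outer(err, x)          # error-correcting weight change
    # Nudge the input toward reducing error, plus a constant
    # simplicity pressure that shrinks input activations.
    x += 0.05 * (W.T @ err) - 0.01 * np.sign(x)
    x = np.clip(x, 0.0, 1.0)

print(np.round(x, 2))  # many input units end up reduced or off
```

As the weights strengthen with repeated training, a smaller input suffices to produce the target, so the simplicity pressure wins out on more input units: the analogue of frequent items affording reduction.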
Simulation of the Emergence of Reduced Regulars
• In English, frequent items are less likely to be regular.
• Items ending in /d/ or /t/ are also less likely to be regular.
• Both tendencies emerged in our simulation.
• While the past tense is usually one phoneme longer than the present, this is less true for high-frequency past-tense items.
• Reduction of high-frequency past tenses affects a phoneme other than the word-final /d/ or /t/.
Key Features of the Past Tense Model
• No lexical entries and no rules.
• No problem of rule induction or grammar selection.