A computational study of cross-situational techniques for learning word-to-meaning mappings Jeffrey Mark Siskind Presented by David Goss-Grubbs March 5, 2006
The Problem: Mapping Words to Concepts • Child hears John went to school • Child sees GO(John, TO(school)) • Child must learn • John → John • went → GO(x, y) • to → TO(x) • school → school
Two Problems • Referential uncertainty: the scene also supports hypotheses such as MOVE(John, feet) and WEAR(John, RED(shirt)) • Determining the correct alignment: a misalignment would pair John → TO(x), walked → school, to → John, school → GO(x, y)
Helpful Constraints • Partial Knowledge • Cross-situational inference • Covering constraints • Exclusivity
Partial Knowledge • Child hears Mary lifted the block • Child sees • CAUSE(Mary, GO(block, UP)) • WANT(Mary, block) • BE(block, ON(table)) • If the child already knows that lift contains CAUSE, the latter two hypotheses can be ruled out.
Cross-situational inference • John lifted the ball → CAUSE(John, GO(ball, UP)) • Mary lifted the block → CAUSE(Mary, GO(block, UP)) • Thus, lifted → {UP, GO(x, y), GO(x, UP), CAUSE(x, y), CAUSE(x, GO(y, z)), CAUSE(x, GO(y, UP))}
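To make the intersection concrete, here is a minimal Python sketch at the level of bare conceptual symbols (Siskind's full technique also generalizes over sub-expressions with variables, as in the set above). The nested-tuple encoding and the `symbols` helper are illustrative assumptions, not the paper's representation.

```python
# Symbol-level sketch of cross-situational intersection.

def symbols(expr):
    """Collect every conceptual symbol in a nested-tuple expression,
    e.g. ('CAUSE', 'John', ('GO', 'ball', 'UP'))."""
    if isinstance(expr, tuple):
        out = set()
        for part in expr:
            out |= symbols(part)
        return out
    return {expr}

# Two situations in which "lifted" is heard, each paired with a meaning:
s1 = ('CAUSE', 'John', ('GO', 'ball', 'UP'))
s2 = ('CAUSE', 'Mary', ('GO', 'block', 'UP'))

possible = symbols(s1) & symbols(s2)
print(possible)   # {'CAUSE', 'GO', 'UP'} -- candidate symbols for "lifted"
```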
Covering constraints • Assume: all components of an utterance’s meaning come from the meanings of words in that utterance. • If it is known that CAUSE is not part of the meaning of John, the, or ball, it must be part of the meaning of lifted. • (But what about constructional meaning?)
Exclusivity • Assume: any portion of the meaning of an utterance comes from no more than one of its words. • If John walked → WALK(John) and John → John, then walked can be no more than walked → WALK(x)
Three more problems • Bootstrapping • Noisy Input • Homonymy
Bootstrapping • Lexical acquisition is much easier if some of the language is already known • Some of Siskind’s strategies (e.g. cross-situational inference) work without such knowledge • Others (e.g. exclusivity) require it • The algorithm therefore starts off slowly, then speeds up
Noise • Only a subset of all possible meanings will be available to the algorithm • If none of them contains the correct meaning, cross-situational inference would cause the words in that utterance never to be acquired • Some portion of the input must therefore be ignored • (A statistical approach is rejected; it is not clear why)
Homonymy • As with noisy input, cross-situational techniques fail to find a consistent mapping for homonymous words • When an inconsistency is found, a split is made • If the split is later corroborated, a new sense is created; otherwise it is treated as noise
The problem, formally stated • From: a sequence of utterances • Each utterance is an unordered collection of words • Each utterance is paired with a set of conceptual expressions • To: a lexicon • The lexicon maps each word to a set of conceptual expressions, one for each sense of the word
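One hypothetical way to render this statement as Python types; the names and the nested-tuple encoding are illustrative, not from the paper.

```python
from typing import Dict, FrozenSet, Set, Tuple, Union

# A conceptual expression is either a bare symbol or a nested application,
# e.g. ('GO', 'John', ('TO', 'school')).
ConceptualExpr = Union[str, Tuple]

Utterance = FrozenSet[str]                            # unordered collection of words
Observation = Tuple[Utterance, Set[ConceptualExpr]]   # utterance + hypothesized meanings
Lexicon = Dict[str, Set[ConceptualExpr]]              # word -> one expression per sense
```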
Composition • Select one sense for each word • Find all ways of combining these conceptual expressions • The meaning of an utterance is derived only from the meanings of its component words • Every conceptual expression in the meanings of the words must appear in the final conceptual expression (copies are possible)
The simplified algorithm: no noise or homonymy • Two learning stages • Stage 1: The set of conceptual symbols • E.g. {CAUSE, GO, UP} • Stage 2: The conceptual expression • CAUSE(x, GO(y, UP))
Stage 1: Conceptual symbol set • Maintain sets of necessary and possible conceptual symbols for each word • Initialize the former to the empty set and the latter to the universal set • Utterances will increase the necessary set and decrease the possible set, until they converge on the actual conceptual symbol set
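A minimal sketch of the Stage-1 bookkeeping, assuming utterances arrive as (words, symbol-set) pairs. The names `observe` and `UNIVERSAL` and the tiny symbol inventory are illustrative; only cross-situational intersection and the covering constraint are implemented here (exclusivity is omitted).

```python
UNIVERSAL = frozenset({'CAUSE', 'GO', 'UP', 'WANT', 'WALK',
                       'John', 'Mary', 'ball', 'block'})

necessary = {}   # word -> symbols the word's meaning must contain
possible  = {}   # word -> symbols the word's meaning may contain

def observe(words, meaning_symbols):
    for w in words:
        necessary.setdefault(w, set())
        possible.setdefault(w, set(UNIVERSAL))
        # Cross-situational inference: a word can only mean symbols that
        # occur in every situation in which the word is heard.
        possible[w] &= meaning_symbols
    # Covering constraint: a meaning symbol that only one word in the
    # utterance could still contribute must come from that word.
    for sym in meaning_symbols:
        carriers = [w for w in words if sym in possible[w]]
        if len(carriers) == 1:
            necessary[carriers[0]].add(sym)

observe(['John', 'lifted', 'the', 'ball'],
        {'CAUSE', 'GO', 'UP', 'John', 'ball'})
observe(['Mary', 'lifted', 'the', 'block'],
        {'CAUSE', 'GO', 'UP', 'Mary', 'block'})
observe(['John', 'walked'], {'WALK', 'John'})
print(possible['lifted'])    # {'CAUSE', 'GO', 'UP'}
print(necessary['walked'])   # {'WALK'}, via the covering constraint
```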
Stage 2: Conceptual expression • Maintain a set of possible conceptual expressions for each word • Initialize to the set of all expressions that can be composed from the actual conceptual symbol set • New utterances will decrease the possible conceptual expression set until only one remains
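A toy version of the Stage-2 filtering step, assuming candidates are patterns with '?'-prefixed variables and a candidate survives an utterance only if it matches somewhere inside an observed meaning. The matcher, the sub-expression walk, and the candidate set are illustrative, not Siskind's actual machinery.

```python
def match(p, e, env):
    """Match pattern p against expression e, extending the variable
    bindings in env; returns the new bindings or None on failure."""
    if isinstance(p, str) and p.startswith('?'):
        if p in env:
            return env if env[p] == e else None
        return {**env, p: e}
    if isinstance(p, tuple) and isinstance(e, tuple) and len(p) == len(e):
        for pi, ei in zip(p, e):
            env = match(pi, ei, env)
            if env is None:
                return None
        return env
    return env if p == e else None

def subexprs(e):
    """Yield e and all of its sub-expressions."""
    yield e
    if isinstance(e, tuple):
        for part in e[1:]:          # skip the head symbol
            yield from subexprs(part)

def occurs_in(pattern, expr):
    return any(match(pattern, s, {}) is not None for s in subexprs(expr))

candidates = {
    'UP',
    ('GO', '?y', 'UP'),
    ('GO', '?y', 'DOWN'),                     # will be filtered out
    ('CAUSE', '?x', ('GO', '?y', 'UP')),
}
observed = ('CAUSE', 'Mary', ('GO', 'block', 'UP'))
candidates = {c for c in candidates if occurs_in(c, observed)}
print(candidates)   # the 'DOWN' pattern has been eliminated
```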
Selecting the meaning: John took the ball • CAUSE(John, GO(ball, TO(John))) • WANT(John, ball) • CAUSE(John, GO(PART-OF(LEFT(arm), John), TO(ball))) • The second is eliminated because it lacks CAUSE, which took is known to contain • The third is eliminated because no word in the utterance can contribute LEFT or PART-OF
Noise and Homonymy • Noisy or homonymous data can corrupt the lexicon • by adding an incorrect symbol to a word's necessary set • or by removing a correct symbol from its possible set • This may or may not create an inconsistent entry
Extended algorithm • The necessary and possible conceptual-symbol sets are maintained for each sense rather than for each word • Words map to sets of senses • Each sense carries a confidence factor
Sense assignment • For each utterance, consider the cross-product of the words' senses • Choose the “best” consistent sense assignment • Update the entries for those senses as before • Increment a sense’s confidence factor each time it is used in a preferred assignment
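A schematic Python sketch of this step, with each sense reduced to a frozenset of conceptual symbols and exact symbol coverage standing in for the full compositional consistency test; the names, the scoring rule, and the toy lexicon are all illustrative assumptions.

```python
from itertools import product

def assign_senses(lexicon, confidence, words, meaning_syms):
    """Pick the highest-confidence sense combination whose symbols
    exactly cover the observed meaning; None if no combination works."""
    best = None
    for combo in product(*(lexicon[w] for w in words)):
        if frozenset().union(*combo) == meaning_syms:
            score = sum(confidence[s] for s in combo)
            if best is None or score > best[0]:
                best = (score, combo)
    if best is None:
        return None                     # inconsistent utterance (next slide)
    for s in best[1]:
        confidence[s] += 1              # reward senses in the preferred assignment
    return best[1]

# Toy run ('the' omitted to keep the lexicon small): "ball" has two
# candidate senses, and only the consistent one is selected.
lexicon = {
    'John': [frozenset({'John'})],
    'took': [frozenset({'CAUSE', 'GO', 'TO'})],
    'ball': [frozenset({'ball'}), frozenset({'ball', 'WANT'})],
}
confidence = {s: 0 for senses in lexicon.values() for s in senses}
print(assign_senses(lexicon, confidence, ['John', 'took', 'ball'],
                    frozenset({'CAUSE', 'John', 'GO', 'ball', 'TO'})))
```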
Inconsistent utterances • Add the minimal number of new senses needed to make the utterance consistent; three possibilities • The current utterance is noise, and the new senses are bad (they will eventually be ignored) • There really are new senses • The original senses were bad, and the right senses are only now being added • On occasion, remove senses with low confidence factors
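A rough sketch of this split-and-corroborate policy; the function names, the threshold, and the bookkeeping are assumptions for illustration, not details from the paper.

```python
def split_sense(lexicon, confidence, word, new_symbols):
    """On an inconsistent utterance, add a provisional new sense for
    `word`. Later utterances that reuse it will raise its confidence."""
    sense = frozenset(new_symbols)
    lexicon.setdefault(word, []).append(sense)
    confidence.setdefault(sense, 0)      # starts uncorroborated

def prune_noise(lexicon, confidence, min_conf=1):
    """Occasionally drop senses that were never corroborated, keeping at
    least one sense per word."""
    for word, senses in lexicon.items():
        kept = [s for s in senses if confidence[s] >= min_conf]
        lexicon[word] = kept or senses[:1]
```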
Four simulations • The task is varied along five parameters (next two slides) • Measured: vocabulary growth rate by corpus size • Measured: number of exposures to a word required to learn it, by corpus size • Asked: how high can it scale?
Method (1 of 2) • Construct a random lexicon • Vary it by three parameters • Vocabulary size • Homonymy rate • Conceptual-symbol inventory size
Method (2 of 2) • Construct a series of utterances, each paired with a set of meaning hypotheses • Vary this by the following parameters • Noise rate • Degree of referential uncertainty • Two further quantities are held fixed: cluster size (5) and similarity probability (.75)
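A hypothetical parameter bundle tying the two Method slides together; the field names mirror the slides, and the defaults are the two fixed values just mentioned.

```python
from dataclasses import dataclass

@dataclass
class SimulationConfig:
    # Lexicon parameters (Method 1 of 2)
    vocabulary_size: int
    homonymy_rate: float
    symbol_inventory_size: int
    # Corpus parameters (Method 2 of 2)
    noise_rate: float
    referential_uncertainty: int
    # Held fixed across the simulations
    cluster_size: int = 5
    similarity_probability: float = 0.75
```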