A computational study of cross-situational techniques for learning word-to-meaning mappings Jeffrey Mark Siskind Presented by David Goss-Grubbs March 5, 2006
The Problem: Mapping Words to Concepts • Child hears John went to school • Child sees GO(John, TO(school)) • Child must learn • John → John • went → GO(x, y) • to → TO(x) • school → school
Two Problems • Referential uncertainty: the scene also supports hypotheses such as MOVE(John, feet) and WEAR(John, RED(shirt)) • Determining the correct alignment: a misalignment would pair John → TO(x), walked → school, to → John, school → GO(x, y)
Helpful Constraints • Partial Knowledge • Cross-situational inference • Covering constraints • Exclusivity
Partial Knowledge • Child hears Mary lifted the block • Child sees • CAUSE(Mary, GO(block, UP)) • WANT(Mary, block) • BE(block, ON(table)) • If the child already knows that lift contains CAUSE, the latter two hypotheses can be ruled out.
Cross-situational inference • John lifted the ball → CAUSE(John, GO(ball, UP)) • Mary lifted the block → CAUSE(Mary, GO(block, UP)) • Thus, lifted → {UP, GO(x, y), GO(x, UP), CAUSE(x, y), CAUSE(x, GO(y, z)), CAUSE(x, GO(y, UP))}
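To make the intersection concrete, here is a minimal Python sketch at the level of bare conceptual symbols (Siskind's full technique also generalizes over sub-expressions with variables, as in the set above). The nested-tuple encoding and the `symbols` helper are illustrative assumptions, not the paper's representation.

```python
# Symbol-level sketch of cross-situational intersection.

def symbols(expr):
    """Collect every conceptual symbol in a nested-tuple expression,
    e.g. ('CAUSE', 'John', ('GO', 'ball', 'UP'))."""
    if isinstance(expr, tuple):
        out = set()
        for part in expr:
            out |= symbols(part)
        return out
    return {expr}

# Two situations in which "lifted" is heard, each paired with a meaning:
s1 = ('CAUSE', 'John', ('GO', 'ball', 'UP'))
s2 = ('CAUSE', 'Mary', ('GO', 'block', 'UP'))

possible = symbols(s1) & symbols(s2)
print(possible)   # {'CAUSE', 'GO', 'UP'} -- candidate symbols for "lifted"
```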
Covering constraints • Assume: all components of an utterance’s meaning come from the meanings of words in that utterance. • If it is known that CAUSE is not part of the meaning of John, the, or ball, it must be part of the meaning of lifted. • (But what about constructional meaning?)
Exclusivity • Assume: any portion of the meaning of an utterance comes from no more than one of its words. • If John walked → WALK(John) and John → John, then walked can be no more than walked → WALK(x)
Three more problems • Bootstrapping • Noisy Input • Homonymy
Bootstrapping • Lexical acquisition is much easier if some of the language is already known • Some of Siskind’s strategies (e.g. cross-situational inference) work without such knowledge • Others (e.g. exclusivity) require it • The algorithm therefore starts off slowly, then speeds up
Noise • Only a subset of all possible meanings will be available to the algorithm • If none of them contains the correct meaning, cross-situational inference would cause the words in that utterance never to be acquired • Some portion of the input must therefore be ignored • (A statistical approach is rejected; it is not clear why)
Homonymy • As with noisy input, cross-situational techniques fail to find a consistent mapping for homonymous words • When an inconsistency is found, a split is made • If the split is later corroborated, a new sense is created; otherwise it is treated as noise
The problem, formally stated • From: a sequence of utterances • Each utterance is an unordered collection of words • Each utterance is paired with a set of conceptual expressions • To: a lexicon • The lexicon maps each word to a set of conceptual expressions, one for each sense of the word
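One hypothetical way to render this statement as Python types; the names and the nested-tuple encoding are illustrative, not from the paper.

```python
from typing import Dict, FrozenSet, Set, Tuple, Union

# A conceptual expression is either a bare symbol or a nested application,
# e.g. ('GO', 'John', ('TO', 'school')).
ConceptualExpr = Union[str, Tuple]

Utterance = FrozenSet[str]                            # unordered collection of words
Observation = Tuple[Utterance, Set[ConceptualExpr]]   # utterance + hypothesized meanings
Lexicon = Dict[str, Set[ConceptualExpr]]              # word -> one expression per sense
```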
Composition • Select one sense for each word • Find all ways of combining these conceptual expressions • The meaning of an utterance is derived only from the meanings of its component words • Every conceptual expression in the meanings of the words must appear in the final conceptual expression (copies are possible)
The simplified algorithm: no noise or homonymy • Two learning stages • Stage 1: The set of conceptual symbols • E.g. {CAUSE, GO, UP} • Stage 2: The conceptual expression • CAUSE(x, GO(y, UP))
Stage 1: Conceptual symbol set • Maintain sets of necessary and possible conceptual symbols for each word • Initialize the former to the empty set and the latter to the universal set • Utterances will increase the necessary set and decrease the possible set, until they converge on the actual conceptual symbol set
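A minimal sketch of the Stage-1 bookkeeping, assuming utterances arrive as (words, symbol-set) pairs. The names `observe` and `UNIVERSAL` and the tiny symbol inventory are illustrative; only cross-situational intersection and the covering constraint are implemented here (exclusivity is omitted).

```python
UNIVERSAL = frozenset({'CAUSE', 'GO', 'UP', 'WANT', 'WALK',
                       'John', 'Mary', 'ball', 'block'})

necessary = {}   # word -> symbols the word's meaning must contain
possible  = {}   # word -> symbols the word's meaning may contain

def observe(words, meaning_symbols):
    for w in words:
        necessary.setdefault(w, set())
        possible.setdefault(w, set(UNIVERSAL))
        # Cross-situational inference: a word can only mean symbols that
        # occur in every situation in which the word is heard.
        possible[w] &= meaning_symbols
    # Covering constraint: a meaning symbol that only one word in the
    # utterance could still contribute must come from that word.
    for sym in meaning_symbols:
        carriers = [w for w in words if sym in possible[w]]
        if len(carriers) == 1:
            necessary[carriers[0]].add(sym)

observe(['John', 'lifted', 'the', 'ball'],
        {'CAUSE', 'GO', 'UP', 'John', 'ball'})
observe(['Mary', 'lifted', 'the', 'block'],
        {'CAUSE', 'GO', 'UP', 'Mary', 'block'})
observe(['John', 'walked'], {'WALK', 'John'})
print(possible['lifted'])    # {'CAUSE', 'GO', 'UP'}
print(necessary['walked'])   # {'WALK'}, via the covering constraint
```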
Stage 2: Conceptual expression • Maintain a set of possible conceptual expressions for each word • Initialize to the set of all expressions that can be composed from the actual conceptual symbol set • New utterances will decrease the possible conceptual expression set until only one remains
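A toy version of the Stage-2 filtering step, assuming candidates are patterns with '?'-prefixed variables and a candidate survives an utterance only if it matches somewhere inside an observed meaning. The matcher, the sub-expression walk, and the candidate set are illustrative, not Siskind's actual machinery.

```python
def match(p, e, env):
    """Match pattern p against expression e, extending the variable
    bindings in env; returns the new bindings or None on failure."""
    if isinstance(p, str) and p.startswith('?'):
        if p in env:
            return env if env[p] == e else None
        return {**env, p: e}
    if isinstance(p, tuple) and isinstance(e, tuple) and len(p) == len(e):
        for pi, ei in zip(p, e):
            env = match(pi, ei, env)
            if env is None:
                return None
        return env
    return env if p == e else None

def subexprs(e):
    """Yield e and all of its sub-expressions."""
    yield e
    if isinstance(e, tuple):
        for part in e[1:]:          # skip the head symbol
            yield from subexprs(part)

def occurs_in(pattern, expr):
    return any(match(pattern, s, {}) is not None for s in subexprs(expr))

candidates = {
    'UP',
    ('GO', '?y', 'UP'),
    ('GO', '?y', 'DOWN'),                     # will be filtered out
    ('CAUSE', '?x', ('GO', '?y', 'UP')),
}
observed = ('CAUSE', 'Mary', ('GO', 'block', 'UP'))
candidates = {c for c in candidates if occurs_in(c, observed)}
print(candidates)   # the 'DOWN' pattern has been eliminated
```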
Selecting the meaning: John took the ball • CAUSE(John, GO(ball, TO(John))) • WANT(John, ball) • CAUSE(John, GO(PART-OF(LEFT(arm), John), TO(ball))) • The second is eliminated because it lacks CAUSE, which took is known to contain • The third is eliminated because no word in the utterance can contribute LEFT or PART-OF
Noise and Homonymy • Noisy or homonymous data can corrupt the lexicon • by adding an incorrect symbol to a word's necessary set • or by removing a correct symbol from its possible set • This may or may not create an inconsistent entry
Extended algorithm • The necessary and possible conceptual-symbol sets are maintained for each sense rather than for each word • Words map to sets of senses • Each sense carries a confidence factor
Sense assignment • For each utterance, consider the cross-product of the words' senses • Choose the “best” consistent sense assignment • Update the entries for those senses as before • Increment a sense’s confidence factor each time it is used in a preferred assignment
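A schematic Python sketch of this step, with each sense reduced to a frozenset of conceptual symbols and exact symbol coverage standing in for the full compositional consistency test; the names, the scoring rule, and the toy lexicon are all illustrative assumptions.

```python
from itertools import product

def assign_senses(lexicon, confidence, words, meaning_syms):
    """Pick the highest-confidence sense combination whose symbols
    exactly cover the observed meaning; None if no combination works."""
    best = None
    for combo in product(*(lexicon[w] for w in words)):
        if frozenset().union(*combo) == meaning_syms:
            score = sum(confidence[s] for s in combo)
            if best is None or score > best[0]:
                best = (score, combo)
    if best is None:
        return None                     # inconsistent utterance (next slide)
    for s in best[1]:
        confidence[s] += 1              # reward senses in the preferred assignment
    return best[1]

# Toy run ('the' omitted to keep the lexicon small): "ball" has two
# candidate senses, and only the consistent one is selected.
lexicon = {
    'John': [frozenset({'John'})],
    'took': [frozenset({'CAUSE', 'GO', 'TO'})],
    'ball': [frozenset({'ball'}), frozenset({'ball', 'WANT'})],
}
confidence = {s: 0 for senses in lexicon.values() for s in senses}
print(assign_senses(lexicon, confidence, ['John', 'took', 'ball'],
                    frozenset({'CAUSE', 'John', 'GO', 'ball', 'TO'})))
```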
Inconsistent utterances • Add the minimal number of new senses needed to make the utterance consistent; three possibilities • The current utterance is noise, and the new senses are bad (they will eventually be ignored) • There really are new senses • The original senses were bad, and the right senses are only now being added • On occasion, remove senses with low confidence factors
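A rough sketch of this split-and-corroborate policy; the function names, the threshold, and the bookkeeping are assumptions for illustration, not details from the paper.

```python
def split_sense(lexicon, confidence, word, new_symbols):
    """On an inconsistent utterance, add a provisional new sense for
    `word`. Later utterances that reuse it will raise its confidence."""
    sense = frozenset(new_symbols)
    lexicon.setdefault(word, []).append(sense)
    confidence.setdefault(sense, 0)      # starts uncorroborated

def prune_noise(lexicon, confidence, min_conf=1):
    """Occasionally drop senses that were never corroborated, keeping at
    least one sense per word."""
    for word, senses in lexicon.items():
        kept = [s for s in senses if confidence[s] >= min_conf]
        lexicon[word] = kept or senses[:1]
```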
Four simulations • The task is varied along five parameters (next two slides) • Measured: vocabulary growth rate by corpus size • Measured: number of exposures to a word required to learn it, by corpus size • Asked: how high can it scale?
Method (1 of 2) • Construct a random lexicon • Vary it by three parameters • Vocabulary size • Homonymy rate • Conceptual-symbol inventory size
Method (2 of 2) • Construct a series of utterances, each paired with a set of meaning hypotheses • Vary this by the following parameters • Noise rate • Degree of referential uncertainty • Two further quantities are held fixed: cluster size (5) and similarity probability (.75)
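A hypothetical parameter bundle tying the two Method slides together; the field names mirror the slides, and the defaults are the two fixed values just mentioned.

```python
from dataclasses import dataclass

@dataclass
class SimulationConfig:
    # Lexicon parameters (Method 1 of 2)
    vocabulary_size: int
    homonymy_rate: float
    symbol_inventory_size: int
    # Corpus parameters (Method 2 of 2)
    noise_rate: float
    referential_uncertainty: int
    # Held fixed across the simulations
    cluster_size: int = 5
    similarity_probability: float = 0.75
```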