Using Analogy to Discover the Meaning of Pictures Melanie Mitchell Computer Science Department Portland State University and External Professor Santa Fe Institute
An image-understanding task:

[Figure: a processing hierarchy, from low-level vision (color, shape, texture; simple segmentation) through object recognition and pattern recognition up to high-level perception ("meaning"). The unbridged step at the top, marked "?", is the "semantic gap".]
The HMAX model for object recognition (Serre, Wolf, Bileschi, Riesenhuber, and Poggio, 2006)
Gabor Filters

A Gabor filter is essentially a localized Fourier transform in the image. Each filter has an associated frequency ω, scale s, and orientation θ. Its response measures the extent to which frequency ω is present at orientation θ and scale s, centered about pixel (x, y).
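The idea above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the parameter names (`theta` = orientation, `scale` = width of the Gaussian envelope, `freq` = spatial frequency) follow the slide, not any particular library's API, and the filter size is an assumption.

```python
import numpy as np

def gabor_response(image, x, y, theta, scale, freq, size=11):
    """Response of one Gabor filter centered at pixel (x, y)."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate the coordinate frame to the filter's orientation.
    xr = xs * np.cos(theta) + ys * np.sin(theta)
    # Gaussian envelope times an oriented sinusoid: a "localized
    # Fourier transform" probing frequency `freq` at angle `theta`.
    envelope = np.exp(-(xs**2 + ys**2) / (2 * scale**2))
    carrier = np.cos(2 * np.pi * freq * xr)
    patch = image[y - half:y + half + 1, x - half:x + half + 1]
    return float(np.sum(patch * envelope * carrier))
```

A filter whose orientation and frequency match the local image structure (say, vertical stripes probed with θ = 0 at the stripes' frequency) gives a large-magnitude response; a mismatched or structureless patch gives a small one.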
S1 units: Gabor filters (one per pixel) 16 scales / frequencies, 4 orientations
C1 unit: maximum value of a group of S1 units, pooled over slightly different positions and scales; 8 scales / frequencies, 4 orientations
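The pooling step can be sketched as follows: take the elementwise maximum over a pair of adjacent-scale S1 maps (which is how 16 scale bands become 8), then max-pool over small spatial neighborhoods. The window size here is an illustrative assumption.

```python
import numpy as np

def c1_pool(s1_pair, pool=4):
    """Max-pool two adjacent-scale S1 response maps into one C1 map.

    s1_pair: two arrays of identical shape (adjacent scales).
    """
    stacked = np.maximum(s1_pair[0], s1_pair[1])   # pool over scale
    h, w = stacked.shape
    h, w = h - h % pool, w - w % pool               # trim to a multiple
    blocks = stacked[:h, :w].reshape(h // pool, pool, w // pool, pool)
    return blocks.max(axis=(1, 3))                  # pool over position
```

This max pooling is what gives C1 units their modest invariance to position and scale: a strong S1 response anywhere inside the window survives to the C1 layer.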
S2 units: Radial Basis Functions over "Natural Image Patches" • Idea: natural images contain universal, low-level features that are useful in classifying objects. • Randomly sample small "crops" from natural images and feed them through the S1 and C1 layers. • Collect a set of N patches, {P_i | i = 1, ..., N}, of the C1 layer from this random sample. • Now, given a new image, the unit S2_i corresponding to P_i gets input X from the C1 layer and computes a radial basis function of the distance between X and P_i. • This gives the "degree" to which feature P_i is present in the input X.
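A minimal sketch of that tuning function, assuming the usual Gaussian radial basis form (the sharpness parameter `beta` is an illustrative assumption):

```python
import numpy as np

def s2_response(X, Pi, beta=1.0):
    """RBF tuning of an S2 unit to its stored prototype patch Pi.

    Response is 1 when the C1 input X exactly matches Pi and decays
    exponentially with the squared distance between them.
    """
    return float(np.exp(-beta * np.sum((np.asarray(X) - np.asarray(Pi)) ** 2)))
```

So each S2 unit acts as a soft template matcher: its output is the "degree" to which its natural-image feature is present in the input.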
The resulting feature vector representing the image is fed to a Support Vector Machine for classification.
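The deck's model uses a support vector machine on these feature vectors; as a dependency-free stand-in, here is a plain perceptron-trained linear classifier (not the actual SVM, but it shows the same final step: a linear decision over the feature vector).

```python
import numpy as np

def train_linear_classifier(features, labels, epochs=20, lr=0.1):
    """Perceptron stand-in for the SVM stage. labels are +1 / -1."""
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            if y * (w @ x + b) <= 0:   # misclassified: nudge the boundary
                w += lr * y * x
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if w @ x + b > 0 else -1
```

An actual SVM would instead maximize the margin of this linear boundary, but the interface is the same: feature vector in, class label out.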
Object detection (here, “car”) with HMAX model (Bileschi, 2006)
Sample of results from Poggio model (Serre et al., 2006) (Bileschi, 2006)
Can we use a simple ontology to answer this question?

[Diagram, built up over four slides: a "dog walking" ontology in which a Person holds a leash attached to a Dog, with action "walking" for the person and "running" for the dogs. Successive slides stretch the concepts: multiple Dogs; a Cat or an Iguana in place of the dog; a Helicopter, Bicycle, or Car in place of the walking person.]
Why is image-understanding hard for computers? • It is vastly open-ended.
[Slide: a scatter of concepts that might be relevant to such images: dog grooming, fanny pack, dog walking, gasoline, lawn mower, sidewalk, beach, stick, inside, runway, sky, helicopter, leash, army, grass, airplane, dog, outside, person, ground, holding, attached to, tree, backpack, car, far from, close to, standing, running, above, left of, walking, track.]
Why is image-understanding hard for computers? • It is vastly open-ended. • Can't solve it by feeding the image's feature vector to all known "object classifiers"; in general there are too many such classifiers, and they are too imperfect! (Compare with the StreetScenes system.) • In general we can't even construct a high-level "feature vector" ahead of time, since there are too many possible features and we don't know which features are relevant. • Need dynamics! Need to construct a "probable", coherent, consistent representation of the picture at "recognition time". The construction process must allow different parts of the representation to influence one another dynamically.
• In constructing the representation, we need to limit exploration of features to the most promising possibilities, but how do we know which ones are promising without exploring them? • Need prior, higher-level knowledge to interact with lower-level vision in both directions (bottom-up and top-down). • Need to allow prior knowledge to be "fluid": allow concepts to "slip". Need to perceive essential similarity in the face of superficial differences (analogy-making). • In short, need "active symbols": concepts with a dynamic activation (relevance) that can be activated by other active symbols, spread activation to their conceptual neighbors, and push to be instantiated in the perception of a situation.
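The "active symbols" idea can be sketched as spreading activation over a tiny concept network. Everything here (the decay and spreading rates, the example network) is an illustrative assumption, not Copycat's actual mechanism.

```python
def spread_activation(graph, activation, decay=0.9, rate=0.2, steps=5):
    """Each step, every concept's activation decays, and a fraction of
    it spreads evenly to its conceptual neighbors.

    graph: {concept: [neighbor concepts]}; activation: {concept: float}.
    """
    for _ in range(steps):
        incoming = {c: 0.0 for c in graph}
        for c, neighbors in graph.items():
            share = rate * activation[c]
            for n in neighbors:
                incoming[n] += share / max(len(neighbors), 1)
        for c in graph:
            activation[c] = min(1.0, decay * activation[c] + incoming[c])
    return activation

# Noticing a leash activates "leash"; activation then spreads to "dog"
# and "dog walking", making those concepts more likely to be explored.
net = {"leash": ["dog", "dog walking"],
       "dog": ["leash", "dog walking"],
       "dog walking": ["leash", "dog"]}
act = spread_activation(net, {"leash": 1.0, "dog": 0.0, "dog walking": 0.0})
```

The point of the sketch: relevance is dynamic. Concepts become active because related concepts are active, which is what lets higher-level knowledge steer which low-level features get explored.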
Active Symbol Architectures (Hofstadter et al.)

[Diagram: a concept network, a workspace acted on by "top-down" and "bottom-up" perceptual agents (codelets), and a temperature variable.]
Architecture of Copycat

[Diagram: a concept network (the Slipnet); a workspace containing the letter strings "a b c ---> a b d" and "i i j j k k --> ?"; perceptual and structure-building agents (codelets); and a temperature variable.]
Idealizing analogy-making

abc ---> abd
ijk ---> ?

Possible answers:
ijl (replace rightmost letter by successor)
ijd (replace rightmost letter by 'd')
ijk (replace all 'c's by 'd's)
abd (replace any string by 'abd')
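The contrast between an abstract rule and a literal one can be made concrete in code. This is a hypothetical sketch of two candidate rules, not Copycat's implementation (and it ignores edge cases such as the successor of 'z'):

```python
def successor(ch):
    # Next letter in the alphabet; 'z' has no successor in Copycat,
    # which this sketch deliberately ignores.
    return chr(ord(ch) + 1)

def rule_successor(s):
    """Abstract rule: replace the rightmost letter by its successor."""
    return s[:-1] + successor(s[-1])

def rule_literal_d(s):
    """Literal rule: replace the rightmost letter by 'd'."""
    return s[:-1] + "d"
```

Both rules account for abc ---> abd, yet on ijk they diverge: rule_successor gives ijl while rule_literal_d gives ijd. Which answer is "better" depends on which description of the original change you find more fitting, which is exactly the judgment analogy-making requires.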
Idealizing analogy-making

abc ---> abd
iijjkk ---> ?

Possible answers:
iijjkl (replace rightmost letter by successor)
iijjll (replace rightmost "letter" by successor)
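The iijjkl/iijjll split hinges on letting the concept "letter" slip to "group of letters". A hypothetical sketch of the slipped rule, which treats runs of identical letters as single "letters":

```python
from itertools import groupby

def rule_successor_group(s):
    """Replace the rightmost group of identical letters by its
    successor group of the same length ("letter" slipped to "group")."""
    groups = ["".join(g) for _, g in groupby(s)]
    last = groups[-1]
    groups[-1] = chr(ord(last[0]) + 1) * len(last)
    return "".join(groups)
```

On a string with no repeats, such as abc, the slipped rule coincides with the plain one (abc ---> abd); on iijjkk the slippage is what yields iijjll rather than iijjkl.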
Idealizing analogy-making

abc ---> abd
kji ---> ?

Possible answers:
kjj (replace rightmost letter by successor)
lji (replace "rightmost" letter by successor)
kjh (replace rightmost letter by "successor")