Fluidly representing the world: Way, way harder than you think

Fluidly representing the world:Way, way harder than you think

The One Thing to Remembera year from now Rules and statistics must DYNAMICALLY INTERACT to represent the world appropriately in computer models of intelligence. Rules are not enough Statistics is not enough.

Let’s start things off with a question that you will never forget...

What do cows drink? FIRST ANSWER Bottom-up, Statistical (e.g., Connectionism) conscious COW MILK DRINK unconscious Statistical, bottom-up representations

What do cows drink? FIRST ANSWER Bottom-up, Statistical (e.g., Connectionism) conscious COW MILK DRINK unconscious These neurons are activated without ever have heard the word “milk”

What do cows drink? RIGHT ANSWER Top-Down, Rule-Based (e.g., Symbolic AI) Facts about the world: ISA(cow, mammal) ISA(mammal, animal) Rule 1: IF animal(X) AND thirsty(X) THEN lack_water(X) Rule 2: IF lack_water(X) then drink_water(X) Conclusion:COWS DRINK WATER

What do cows drink? Context 1: Milk Context 2: Water Bottom-up + Top-Down Rule-based representations conscious MILK COW DRINK unconscious Statistical, bottom-up representations

Artificial Intelligence: When a computer will be able to represent the world in such a way that, when it is asked, “What do cows drink?” it will be able to answer either “milk” or “water,” depending on the context of the question.

Rules are not enough

Rules to characterize an “A” • Two oblique, approximately vertical lines, slanting inwards and coming together at the top of the figure • An approximately horizontal crossbar • An enclosed area above an open area • Vertical lines longer than the crossbar

Characterize this word

It says “Gutenberg” • It is written in a Gothic font. • It looks like very old-fashioned script. • Germans would have an easier time reading it than Italians. • It has a very German feel to it. • It is nine letters long. • It contains the letter “e” twice, and the letter “u” once. • It means “good mountain” in English • It makes one think of the Vulgate Bible. • The “t” is written in mirror-image. • ……..

And can also be read perfectly upside-down!… Question: But is that really a property of “Gutenberg”? Answer: Yes, in some contexts. But does that mean that the representation of “Gutenberg” must always include: “Can be read upside-down when written in a VERY special way in pseudo-Gothic script”?

Moral of the StoryIn the real world, a fixed set of rules, however long, is never sufficient to cover all possible contexts in which an object or a situation occurs.

But the statistics of the environment are not enough, either.

Without top-down conceptual knowledge we have no hope of understanding the following image

Statistics is not enough

Statistics is not enough “A dark spot. Hmmm…. Doesn’t look like anything.”

Statistics is not enough “Pictures often have faces in them. Is that a face in the lower-left hand corner?”

Statistics is not enough “Nah, doesn’t join up with anything else.”

Statistics is not enough

Statistics is not enough “Oh, THERE’s a face.”

Statistics is not enough “But, again, it doesn’t make sense, just an isolated face…”

Statistics is not enough “Let’s look at that dark spot again. A shadow?”

Statistics is not enough “Hey, TREES produce shadows. Is there a tree around? THAT could be a tree!”

Statistics is not enough “If that’s a tree, that could be a metal grating.”

Statistics is not enough “But trees with metal gratings like that are on sidewalks. So where’s the kerb?”

Statistics is not enough “If this is a kerb, it should go on straight. Does it? Yes, sort of. Now what’s this thing in the middle?”

Statistics is not enough “That sort of looks like a dog’s head. That could make sense.”

Statistics is not enough “But heads are attached to bodies, so there should be a body. Hey, that looks like it could be a front leg.”

Statistics is not enough “The rest fits pretty well with that interpretation, too. Why so spotty? A dalmatian, yeah, sure, drinking from a road gutter under a tree.”

Dynamically converging on the appropriate representation Top-down Context-dependent representation Bottom-up

N.B. The representation must allow a machine to understand the object in essentially the same way we do. Consider representing an ordinary, everyday object

The object: A cobblestone

A “cobblestone” is like…. • a brick • asphalt • the hand-brake of a car • a paper-weight • a sharpening steel • a ruler • a weapon • a brain • a weight in balance pans • a nutcracker • Carrera marble • a bottle opener • a turnip in French • an anchor for a (little) boat • a tent peg • a hammer • etc…. • a voter’s ballot...

UNDER 21, this is your ballot. May 1968, Paris

So, how could a machine ever produce context-dependent representations?

Searching for an analogy • Context • - I was late in reviewing a paper. • - The editor, a friend, said, “Please get it done now.” • I put the paper on my cluttered desk, in plain view, so that it simply COULD NOT ESCAPE MY ATTENTION…it would keep making me feel guilty till I got it done. • I wanted to find a nice way of saying I had devised a way so that the paper would continue to bug me till I had gotten it done

Top-down: semantic, rule-based, conscious part of network “Something that won’t go away until you’ve taken care of it” No, too harsh to relate a paper to review to a hungry child No, too easy to just scratch it; papers are hard to review Until you get up and swat it it NEVER stops buzzing! You can’t make them go away, ever Swat mosquito  Do the review a mosquito dandelions an itch hungry child Bottom-up: statistics-based “sub-cognitive”, unconscious par of network

Representations of an object/situation must always be tailored to the context in which they are found. A machine must be able to do this automatically.

Fluidly representing the world: Way, way harder than you think