Parsing Images with Context/Content Sensitive Grammars
Eran Borenstein, Stuart Geman, Ya Jin, Wei Zhang
Structured Representation in Neural Systems • Vision is Hard • Why is Vision Hard? • Hierarchies of Reusable Parts • Demonstration System: Reading License Plates • Generalization: Face Detection
Artificial Intelligence • Knowledge Engineering: engineer everything, learn nothing • Learning Theory: engineer nothing, learn everything • Both lack a model
Natural Intelligence • Strong Representation: simulation and semantics • Hierarchy and Reusability: ventral visual pathway, linguistics, compositionality
Structured Representation in Neural Systems • Vision is Hard • Why is Vision Hard? • Hierarchies of Reusable Parts • Demonstration System: Reading License Plates • Generalization: Face Detection
License plate images from Logan Airport
Machines still can’t reliably read license plates
Wafer IDs
Machines can’t read fixed-font, fixed-scale characters as well as humans
Super Bowl Machines can’t find the bad guys at the Super Bowl
Structured Representation in Neural Systems • Vision is Hard • Why is Vision Hard? • Hierarchies of Reusable Parts • Demonstration System: Reading License Plates • Generalization: Face Detection
Instantiation • same Empire-style table • twins
Vision is content sensitive
Human Interactive Proofs, “Clutter”
Background is structured, and made of the same stuff!
Structured Representation in Neural Systems • Vision is Hard • Why is Vision Hard? • Hierarchies of Reusable Parts • Demonstration System: Reading License Plates • Generalization: Face Detection
Hierarchy of Reusable Parts (“Bricks”) • e.g. animals, trees, rocks • e.g. contours, intermediate objects • e.g. linelets, curvelets, T-junctions • e.g. discontinuities, gradients
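A minimal sketch (not the authors' code) of what a hierarchy of reusable "bricks" looks like as a data structure: each brick is composed of bricks from the level below, and a low-level brick (here, `linelet`) can be reused by several parents. All names are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Brick:
    """One reusable part in the compositional hierarchy."""
    name: str
    parts: list = field(default_factory=list)  # children from the level below

    def leaves(self):
        """Collect the lowest-level bricks this brick is built from."""
        if not self.parts:
            return [self.name]
        out = []
        for p in self.parts:
            out.extend(p.leaves())
        return out

# Illustrative bricks, bottom level up; note linelet is reused twice.
edge = Brick("discontinuity")
linelet = Brick("linelet", [edge])
curvelet = Brick("curvelet", [edge])
t_junction = Brick("T-junction", [linelet, linelet])
contour = Brick("contour", [linelet, curvelet])
tree = Brick("tree", [contour, t_junction])

print(tree.leaves())  # four "discontinuity" leaves, reached via shared parts
```

Reuse is the point: `linelet` appears under both `T-junction` and `contour`, so the same low-level evidence supports many high-level hypotheses.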
Interpretations and Probabilities • Interpretation: a selected subgraph
Interpretations and Probabilities
Probability = Graphical Model (Markov) × Likelihood Ratio (non-Markov)
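The factorization on the slide can be sketched numerically: the score of an interpretation is a Markov term (a product of local parent-child compatibilities over the selected subgraph) times a non-Markov data term (a likelihood ratio of the object model against the background model). The weights and probabilities below are illustrative, not from the system.

```python
import math

def markov_score(edges):
    """Markov part: product of local parent-child compatibilities."""
    return math.prod(w for _parent, _child, w in edges)

def likelihood_ratio(p_data_given_object, p_data_given_background):
    """Non-Markov data term: object model tested against background model."""
    return p_data_given_object / p_data_given_background

# Illustrative selected subgraph and data probabilities.
edges = [("plate", "string", 0.9), ("string", "digit-4", 0.8)]
score = markov_score(edges) * likelihood_ratio(0.6, 0.1)
print(round(score, 2))  # 0.9 * 0.8 * 6.0 = 4.32
```

The likelihood ratio is what makes the model content sensitive: the same pixels are scored against an object-specific background alternative, which a purely Markov graphical model cannot express.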
Structured Representation in Neural Systems • Vision is Hard • Why is Vision Hard? • Hierarchies of Reusable Parts • Demonstration System: Reading License Plates • Generalization: Face Detection
Test set: 385 images, mostly from Logan Airport Courtesy of Visics Corporation
Architecture (top level down) • license plates • license numbers (3 digits + 3 letters, 4 digits + 2 letters) • plate boundaries, strings (2 letters, 3 digits, 3 letters, 4 digits) • generic letter, generic number, L-junctions of sides • characters, plate sides • parts of characters, parts of plate sides
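The levels above read naturally as composition rules: each symbol expands into parts from the level below, with alternatives (e.g. the two license-number formats) as separate productions. A hedged sketch, with assumed symbol names that only mirror the slide:

```python
# Illustrative grammar fragment; symbol names are assumptions, not the
# system's actual vocabulary. Each symbol maps to a list of alternative
# productions; symbols absent from the grammar are terminal bricks.
GRAMMAR = {
    "license plate": [["plate boundary", "license number"]],
    "license number": [["string:3-digits", "string:3-letters"],
                       ["string:4-digits", "string:2-letters"]],
    "plate boundary": [["plate side", "plate side", "L-junction"]],
    "string:3-digits": [["generic number"] * 3],
    "plate side": [["side part", "side part"]],
}

def expand(symbol, grammar):
    """Expand a symbol to terminal bricks, always taking the first production."""
    if symbol not in grammar:
        return [symbol]
    out = []
    for child in grammar[symbol][0]:
        out.extend(expand(child, grammar))
    return out

print(expand("license plate", GRAMMAR))
```

Expanding the top symbol yields four side parts, an L-junction, three generic numbers, and the (here terminal) three-letter string, tracing one path through the hierarchy.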
Image interpretation • Original Image • Top object • Top 10 objects • Top 25 objects
Image interpretation • Test image • Top objects
Performance • 385 test images • Six plates read with mistakes (>98% of plates read correctly) • Approx. 99.5% of characters read correctly • Zero false positives
Efficient discrimination: Markov versus content-sensitive distributions • Original image • Zoomed license region • Top object under Markov distribution • Top object under content-sensitive distribution
Efficient discrimination: testing objects against their parts • Test image • 9 active “8” bricks under whole model • 1 active “8” brick under parts model
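The pruning idea suggested by the slide can be sketched as follows: a composite candidate (an "8") stays active only if each of its constituent parts is itself well supported, which discards spurious activations that the whole-object model alone admits. The part names and scores below are illustrative assumptions.

```python
def active_under_parts_model(candidate, part_scores, threshold=0.5):
    """A candidate survives only if every one of its parts clears the threshold."""
    return all(part_scores[p] >= threshold for p in candidate["parts"])

# Two hypothetical "8" candidates: one built from genuine loops, one
# leaning on a stray background edge.
candidates = [
    {"id": 1, "parts": ["upper-loop", "lower-loop"]},
    {"id": 2, "parts": ["upper-loop", "stray-edge"]},
]
part_scores = {"upper-loop": 0.9, "lower-loop": 0.8, "stray-edge": 0.1}

survivors = [c["id"] for c in candidates
             if active_under_parts_model(c, part_scores)]
print(survivors)  # [1]
```

Testing an object against its own parts is cheap relative to re-scoring whole images, which is why it makes discrimination efficient.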
Summary • Vision is content sensitive → non-Markovian probability models • Background is structured, and made of the same stuff → objects come equipped with their own background models
Structured Representation in Neural Systems • Vision is Hard • Why is Vision Hard? • Hierarchies of Reusable Parts • Demonstration System: Reading License Plates • Generalization: Face Detection
Plates → Face Detection • Rigid → Deformable • “Black/White” data model → Intensity model • Hand-crafted probabilities → Learned probabilities
Sampling from Data Model
PATTERN SYNTHESIS = PATTERN RECOGNITION Ulf Grenander