(Infinitely) Deep Learning in Vision Max Welling (UCI) collaborators: Ian Porteous (UCI), Evgeniy Bart (UCI/Caltech), Pietro Perona (Caltech)
Outline • Nonparametric Bayesian Taxonomy models for object categorization • Hierarchical representations from networks of HDPs
Motivation
• Building systems that learn for a lifetime, from "construction to destruction"
• E.g. unsupervised learning of object category taxonomies (with E. Bart, I. Porteous and P. Perona)
• Hierarchical models can help to:
  • act as a prior to transfer information to new categories
  • enable fast recognition
  • classify at the appropriate level of abstraction (Fido → dog → mammal)
  • define a similarity measure (kernel)
• The nonparametric Bayesian framework allows models to grow their complexity without bound (with growing dataset size)
Nonparametric Model for Visual Taxonomy
• Each image/scene is assigned a path through a taxonomy tree; each node on the path carries a topic, i.e. a word distribution for topic k over visual word detections.
• The prior over trees is the nested CRP (Blei et al. '04): a path is more popular if it has been traveled a lot.
[Figure: taxonomy tree with topics 1, 2, ..., k at the nodes; an image's visual word detections are generated along its path with proportions such as 0.7 / 0.26 / 0.04.]
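To make the nested-CRP prior concrete, here is a minimal Python sketch, not the authors' code: the tree is assumed to be a nested dict of branch counts, and `gamma` is a hypothetical concentration parameter controlling how often a new branch is opened. It illustrates the "more traveled paths are more likely" behaviour stated above.

```python
import numpy as np

def ncrp_sample_path(tree, depth, gamma=1.0, rng=np.random):
    """Sample one root-to-leaf path of length `depth` from a nested CRP prior.

    `tree` is a nested dict: child_id -> [count, subtree]. Branches that
    earlier images traveled have higher counts and are therefore more likely
    to be chosen again; with weight `gamma` a brand-new branch is opened.
    """
    path, node = [], tree
    for _ in range(depth):
        children = list(node.keys())
        weights = np.array([node[c][0] for c in children] + [gamma], dtype=float)
        choice = rng.choice(len(weights), p=weights / weights.sum())
        if choice == len(children):              # open a new branch
            child = (max(children) + 1) if children else 0
            node[child] = [0, {}]
        else:
            child = children[choice]
        node[child][0] += 1                      # this path is now more traveled
        path.append(child)
        node = node[child][1]
    return path

# Example: three images sharing (parts of) a 3-level taxonomy.
tree = {}
paths = [ncrp_sample_path(tree, depth=3) for _ in range(3)]
```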
300 images from the Corel database (experiments and figures by E. Bart).
Beyond Trees? • Deep belief nets are more powerful alternatives to taxonomies (in a modeling sense). • Nodes in the hierarchy represent overlapping and increasingly abstract categories • More sharing of statistical strength • Proposal: stack LDA models
LDA (Blei, Ng, Jordan '02)
• w_ij (observed): token i in image j was assigned to visual word w.
• z_ij (hidden): token i in image j was assigned to topic k.
• θ_j: image-specific distribution over topics.
• φ_k: topic-specific distribution over words.
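As a reference point for this notation, a minimal generative sketch of LDA in Python (the function name and the hyperparameters `alpha`, `beta` are illustrative choices, not taken from the slides); images play the role of documents and visual words the role of tokens.

```python
import numpy as np

def lda_generate(n_images, n_tokens, n_topics, vocab_size,
                 alpha=0.5, beta=0.1, rng=np.random):
    """Toy LDA generative process: phi[k] is the topic-specific word
    distribution, theta_j the image-specific topic distribution, z the hidden
    topic of each token and w the observed visual word."""
    phi = rng.dirichlet(beta * np.ones(vocab_size), size=n_topics)
    corpus = []
    for _ in range(n_images):
        theta_j = rng.dirichlet(alpha * np.ones(n_topics))
        z = rng.choice(n_topics, size=n_tokens, p=theta_j)     # hidden topics
        w = np.array([rng.choice(vocab_size, p=phi[k]) for k in z])  # observed words
        corpus.append((z, w))
    return phi, corpus

phi, corpus = lda_generate(n_images=5, n_tokens=50, n_topics=3, vocab_size=20)
```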
Stage-wise LDA
• Use Z1, the topic assignments of the first LDA layer, as pseudo-data for the next layer.
• After the second LDA model is fit, we have two distributions over Z1.
• We combine these distributions by taking their mixture.
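A hedged sketch of the stage-wise stacking idea: fit one LDA layer, then feed its topic assignments Z1 to the next layer as pseudo-data. `stack_lda` and `fit_lda` are placeholder names for illustration (any LDA inference routine could be plugged in), and the sketch omits the mixture-combination step mentioned above.

```python
def stack_lda(words, topics_per_layer, fit_lda):
    """Fit a stack of LDA layers: the topic assignments Z of one layer become
    the pseudo-data (the "words") of the layer above.

    `fit_lda(data, n_topics)` stands in for any LDA inference routine
    (e.g. collapsed Gibbs), returning (model, z) with z[j][i] the topic
    assigned to token i in image j.
    """
    layers, data = [], words        # layer 0 input: observed visual words
    for n_topics in topics_per_layer:
        model, z = fit_lda(data, n_topics)
        layers.append(model)
        data = z                    # Z of this layer is pseudo-data for the next
    return layers
```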
Special Words Layer
• At the bottom layer we have an image-specific distribution over words.
• It filters out image idiosyncrasies that are not modeled well by topics.
• Special-words topic model (Chemudugunta, Steyvers, Smyth '06).
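A minimal sketch of the special-words mechanism, under assumed names (`generate_token`, `lam_j`, `pi_special` are illustrative, and the original special-words model also includes a background distribution, which this two-way simplification drops): per token, a switch decides whether the word comes from the topic mixture or from an image-specific distribution.

```python
import numpy as np

def generate_token(theta_j, phi, lam_j, pi_special=0.2, rng=np.random):
    """Draw one visual word for image j: with probability `pi_special` it
    comes from the image-specific distribution lam_j (absorbing image
    idiosyncrasies), otherwise from the topic mixture theta_j / phi."""
    if rng.random() < pi_special:
        return rng.choice(len(lam_j), p=lam_j)    # image-specific "special" word
    k = rng.choice(len(theta_j), p=theta_j)       # pick a topic
    return rng.choice(phi.shape[1], p=phi[k])     # emit a word from that topic
```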
Model
• At every level a switching variable picks either that level's own distribution or defers to the level above.
• The lowest level at which the switch picks its own level disconnects the upstream variables.
[Figure: the stacked model, annotated with the last layer that has any data assigned to it, and with a level at which a switching variable has fired so that all layers above are disconnected.]
Collapsed Gibbs Sampling
• Marginalize out the parameters (θ, φ).
• Given X, perform an upward pass to compute posterior probabilities for each level.
• Sample a level.
• From that level, sample all downstream Z-variables (ignore upstream Z-variables).
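The control flow of one sampling step can be sketched as follows. The `counts` object and its accessors `level_weight` / `topic_weights` are hypothetical stand-ins for the collapsed count statistics; the point is the structure: an upward pass over levels, sampling one level, then resampling only the Z-variables downstream of it.

```python
import numpy as np

def resample_token(j, i, counts, n_levels, rng=np.random):
    """One collapsed-Gibbs step for token i of image j (hypothetical interface)."""
    # Upward pass: unnormalized posterior weight of switching at each level.
    w = np.array([counts.level_weight(j, i, l) for l in range(n_levels)], dtype=float)
    level = rng.choice(n_levels, p=w / w.sum())

    # Downward pass: resample Z at the chosen level and every level below it;
    # Z-variables upstream of `level` are disconnected and left untouched.
    z = {}
    for l in range(level, -1, -1):
        p = np.asarray(counts.topic_weights(j, i, l), dtype=float)
        z[l] = rng.choice(len(p), p=p / p.sum())
    return level, z
```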
The Digits ... (I deeply believe in) All experiments done by I. Porteous (and finished 2 hours ago).
This level filters out image idiosyncrasies. No information from this level is "transferred" to test data.
[Figures: level-1 topic distributions and level-2 topic distributions.]
Assignment to Levels [Figure: brightness = average level assignment.]
Properties
• Properties that are specific to an image/document are explained at the lower levels of the hierarchy; they act as data filters for the higher layers.
• Higher levels become increasingly abstract, with larger "receptive fields" and higher variance (complex cell property). Limitation?
• Higher levels therefore "own" less data, and hence have larger plasticity.
• The more data, the more levels become populated; we infer the number of layers.
• By marginalizing out the parameters (θ, φ), all variables become coupled.
Conclusion
• Nonparametric Bayesian models are good candidates for "lifelong learning"
  • need to improve computational efficiency & memory requirements
• Algorithm for growing object taxonomies as a function of observed data
• Proposal for a deep belief net based on stacking LDA modules
  • more flexible representation & more sharing of statistical strength than a taxonomy
• Infinite extension:
  • LDA → HDP
  • mixture over levels → Dirichlet process
  • number of hidden variables per layer and number of layers inferred
• demo?