
a factor model for learning higher-order features in natural images
yan karklin (nyu + hhmi)


Presentation Transcript


  1. a factor model for learning higher-order features in natural images yan karklin (nyu + hhmi) joint work with mike lewicki (case western reserve u) icml workshop on learning feature hierarchies - june 18, 2009

  2. [diagram of the early visual pathway: retina, LGN, V1, V2]

  3. non-linear processing in the early visual system [figures: estimated linear receptive field, cat LGN (DeAngelis, Ohzawa, Freeman 1995); contrast/size response function, cat LGN (Carandini 2004)]

  4. normalized response [figure: divisive normalization of a V1 cell's response, compared with the model of Schwartz and Simoncelli, 2001]
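The divisive-normalization computation referenced on this slide is commonly written as below. The symbols are my notation, not the slide's: L_i is the linear (filter) response of unit i, the w_ij are normalization weights over a pool of neighboring units, and σ is a saturation constant:

```latex
R_i = \frac{L_i^2}{\sigma^2 + \sum_j w_{ij}\, L_j^2}
```

Each unit's squared response is divided by a weighted sum of its neighbors' squared responses, which is the non-linearity the later slides aim to explain statistically.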

  5. non-linear statistical dependencies in natural images [figure]

  6. a generative model for covariance matrices: a multivariate Gaussian model in which the latent variables specify the covariance
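In symbols (my notation, not the slide's): each image patch x is zero-mean Gaussian, but its covariance is a function of the latent variables y rather than a fixed matrix,

```latex
p(\mathbf{x} \mid \mathbf{y}) = \mathcal{N}\!\left(\mathbf{x};\ \mathbf{0},\ \mathbf{C}(\mathbf{y})\right)
```

so the latents modulate local variance and correlation structure (context) instead of shifting the mean.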

  7. a distributed code: if only we could find a basis for covariance matrices... [equation: C = y1·A1 + y2·A2 + y3·A3 + y4·A4 + ...]

  8. a distributed code: if only we could find a basis for covariance matrices... use the matrix-logarithm transform [equation: log C = y1·A1 + y2·A2 + y3·A3 + y4·A4 + ...]
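The matrix-logarithm trick can be sketched in a few lines of numpy. Everything below is illustrative: the sizes, the random symmetric basis matrices A_j, and the variable names are hypothetical stand-ins, not the paper's learned parameters. The point it demonstrates is that exponentiating any symmetric matrix yields a valid (symmetric positive-definite) covariance, so unconstrained linear combinations in log-space always produce a legal C:

```python
import numpy as np

rng = np.random.default_rng(0)
D, J = 6, 4                       # toy patch dimension and number of latents

# hypothetical symmetric "log-covariance" basis matrices A_j
A = np.stack([(M + M.T) / 2 for M in rng.standard_normal((J, D, D))])
y = rng.standard_normal(J)        # latent variables, any real values

logC = np.einsum('j,jkl->kl', y, A)   # log C = sum_j y_j A_j (symmetric)

# matrix exponential of a symmetric matrix via its eigendecomposition
w, V = np.linalg.eigh(logC)
C = (V * np.exp(w)) @ V.T         # C = V diag(exp(w)) V^T, always SPD

assert np.allclose(C, C.T)                  # symmetric
assert np.all(np.linalg.eigvalsh(C) > 0)    # positive definite
```

Working in log-space is what makes the distributed code unconstrained: the y_j may be positive or negative and C(y) still stays positive definite.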

  9. a compact parameterization for cov-components: find common directions of change in variation/correlation [images: bark and foliage textures]

  10. a compact parameterization for cov-components: find common directions of change in variation/correlation [images: bark and foliage textures]
  • vectors bk allow a more compact description of the basis functions
  • bk's are unit norm (wjk can absorb scaling)
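The parameterization described here, each basis matrix built as a weighted sum of outer products of shared unit-norm vectors (A_j = Σ_k w_jk b_k b_kᵀ), can be sketched as follows. Sizes and random values are toy placeholders and the variable names are mine:

```python
import numpy as np

rng = np.random.default_rng(1)
D, J, K = 6, 3, 10                # toy dimension, latents, shared vectors

b = rng.standard_normal((K, D))
b /= np.linalg.norm(b, axis=1, keepdims=True)   # b_k are unit norm
W = rng.standard_normal((J, K))                  # w_jk absorbs any scaling

# A_j = sum_k w_jk b_k b_k^T : each basis matrix reuses the shared b_k's
A = np.einsum('jk,kd,ke->jde', W, b, b)

assert np.allclose(A, A.transpose(0, 2, 1))      # symmetric by construction
```

Sharing the b_k's across all A_j is what makes the description compact: J full D×D matrices are replaced by K vectors plus a J×K weight matrix.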

  11–14. the full model [model diagram, built up over four slides]

  15. normalization by the inferred covariance matrix [figure: normalized response obtained by dividing by the inferred covariance]
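A minimal sketch of normalization as local whitening: multiply the patch by the inverse square root of its inferred covariance. The covariance below is a toy SPD matrix standing in for the model's inferred C(y); the function name and sizes are mine:

```python
import numpy as np

def whiten(x, C):
    """Normalize data by a covariance: x -> C^{-1/2} x."""
    w, V = np.linalg.eigh(C)
    C_inv_sqrt = (V / np.sqrt(w)) @ V.T     # V diag(1/sqrt(w)) V^T
    return C_inv_sqrt @ x

rng = np.random.default_rng(2)
D = 5
L = rng.standard_normal((D, D))
C = L @ L.T + np.eye(D)                      # toy SPD "inferred" covariance
x = rng.multivariate_normal(np.zeros(D), C, size=2000).T   # (D, N) samples

x_white = whiten(x, C)
# after whitening, the empirical covariance is close to the identity
emp = x_white @ x_white.T / x_white.shape[1]
assert np.allclose(emp, np.eye(D), atol=0.2)
```

This is the sense in which normalization by the inferred covariance "removes most structure" in the summary slide: correctly whitened responses are decorrelated with unit variance.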

  16. training the model on natural images [model diagram]
  learning details:
  • gradient ascent on data likelihood
  • 20x20 image patches from ~100 natural images
  • 1000 bk's
  • 150 yj's
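The per-patch objective being ascended is the standard multivariate-Gaussian log-likelihood under the inferred covariance (my notation; D is the patch dimensionality, here 400 for 20x20 patches):

```latex
\log p(\mathbf{x} \mid \mathbf{y}) \;=\; -\tfrac{1}{2}\, \mathbf{x}^{\top} \mathbf{C}(\mathbf{y})^{-1} \mathbf{x} \;-\; \tfrac{1}{2} \log \det \mathbf{C}(\mathbf{y}) \;-\; \tfrac{D}{2} \log 2\pi
```

Gradients with respect to the wjk and bk flow through C(y) via the matrix-logarithm parameterization of the earlier slides.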

  17–18. learned vectors [figure: black – 25 example features; grey – all 1000 features in the model]

  19. one unit encodes global image contrast [figure: weights w105,:]

  20. original patches

  21. normalization by inferred contrast

  22–24. [figures: learned weight vectors w105,:, w85,:, w57,:, w11,:, w12,:, w125,:, w90,:, w144,:, w27,:, w37,:, w78,:, w66,:]

  25. original patches

  26. normalization by inferred contrast

  27. normalization by inverse covariance

  28. Restricted Boltzmann Machines [diagram: hidden units h, visible units v]

  29. gated Restricted Boltzmann Machines (Memisevic and Hinton, CVPR 07) [diagram: x, y – Gaussian (pixel) layers; h – binary; trained on image sequences]
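What distinguishes the gated RBM from a standard RBM is its three-way multiplicative energy. Omitting bias terms and assuming unit-variance Gaussian pixel layers x and y (a simplified form, my notation), it can be written:

```latex
E(\mathbf{x}, \mathbf{y}, \mathbf{h}) \;=\; -\sum_{i,j,k} W_{ijk}\, x_i\, y_j\, h_k \;+\; \tfrac{1}{2}\|\mathbf{x}\|^2 \;+\; \tfrac{1}{2}\|\mathbf{y}\|^2
```

Each binary unit h_k gates pairwise interactions between the two pixel layers, i.e. it modulates second-order structure, which is the basis of the comparison with the log-covariance factor model on the next slide.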

  30. gated Restricted Boltzmann Machines (Memisevic and Hinton, CVPR 07) [diagram: the gated RBM compared with the log-covariance factor model]

  31. summary
  • modeling non-linear dependencies motivated by statistical patterns
  • higher-order models can capture context
  • normalization by local “whitening” removes most structure
  questions
  • joint model for normalized response and normalization (context) signal?
  • how to extend to entire image?
  • where are these computations found in the brain?
