
Outline

Explore statistical modeling and conceptualization of visual patterns in natural images, discussing the underlying stochastic processes and the representation of visual knowledge. Learn about parsing images, mathematical definitions of patterns, and models for object recognition.



Presentation Transcript


  1. Outline • Statistical Modeling and Conceptualization of Visual Patterns • S. C. Zhu, “Statistical modeling and conceptualization of visual patterns,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 6, 1-22, 2003

  2. A Common Framework of Visual Knowledge Representation • Visual patterns in natural images • Natural images consist of an overwhelming number of visual patterns • Generated by very diverse stochastic processes • Comments • Any single image normally consists of a few recognizable/segmentable visual patterns • Scientifically, given that visual patterns are generated by stochastic processes, shall we model the underlying stochastic processes or model visual patterns presented in the observations from the stochastic processes? Computer Vision

  3. A Common Framework of Visual Knowledge Representation – cont. Computer Vision

  4. A Common Framework of Visual Knowledge Representation – cont. • The image analysis as an image parsing problem • Parse generic images into their constituent patterns (according to the underlying stochastic processes) • Perceptual grouping when applied to points, lines, and curves processes • Image segmentation when applied to region processes • Object recognition when applied to high level objects Computer Vision

  5. A Common Framework of Visual Knowledge Representation – cont. Computer Vision

  6. A Common Framework of Visual Knowledge Representation – cont. • Required components for parsing • Mathematical definitions and models of various visual patterns • Definitions and models are intrinsically recursive • Grammars (also called rules) • Which specify the relationships among various patterns • Grammars should be stochastic in nature • A parsing algorithm Computer Vision

  7. Syntactical Pattern Recognition Computer Vision

  8. A Common Framework of Visual Knowledge Representation – cont. • Conceptualization of visual patterns • The concept of a pattern is an abstraction of some properties decided by certain “visual purposes” • They are feature statistics computed from • Raw signals • Some hidden descriptions inferred from raw signals • Mathematically, each pattern is equivalent to a set of observable signals governed by a statistical model Computer Vision

  9. A Common Framework of Visual Knowledge Representation – cont. • Statistical modeling of visual patterns • Statistical models are intrinsic representations of visual knowledge and image regularities • Due to noise and distortion in imaging process? • Due to noise and distortion in the underlying generative process? • Due to transformations in the underlying stochastic process? • Pattern theory Computer Vision

  10. A Common Framework of Visual Knowledge Representation – cont. • Statistical modeling of visual patterns - continued • Mathematical space for patterns and spaces • Depends on the forms • Parametric • Non-parametric • Attributed graphs • Different models • Descriptive models • Bottom-up, feature-based models • Generative models • Hidden variables for generating images in a top-down manner Computer Vision

  11. A Common Framework of Visual Knowledge Representation – cont. • Learning a visual vocabulary • Hierarchy of visual descriptions for general visual patterns • Vocabulary of visual description • Learning from an ensemble of natural images • Vocabulary is far from enough • Rich structures in physics • Large vocabulary in speech and language Computer Vision

  12. A Common Framework of Visual Knowledge Representation – cont. • Computational tractability • Computational heuristics for effective inference of visual patterns • Discriminative models • A framework • Discriminative probabilities are used as proposal probabilities that drive the Markov chain search for fast convergence and mixing • Generative models are top-down probabilities and the hidden variables to be inferred from posterior probabilities Computer Vision

  13. A Common Framework of Visual Knowledge Representation – cont. • Discussion • Images are generated by rendering 3D objects under some external conditions • All the images from one object form a low dimensional manifold in a high dimensional image space • Rendering can be modeled fairly accurately • Describing a 3D object requires a huge amount of data • Under this setting • A visual pattern simply corresponds to the manifold • Descriptive model attempts to characterize the manifold • Generative model attempts to learn the 3D objects and the rendering Computer Vision

  14. 3D Model-Based Recognition Computer Vision

  15. Literature Survey • To develop a generic vision system, regularities in images must be modeled • The study of natural image statistics • Ecological influence on visual perception • Natural images have high-order (i.e., non-Gaussian) structures • The histograms of Gabor-type filter responses on natural images have high kurtosis • Histograms of gradient filters are consistent over a range of scales Computer Vision
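
The high-kurtosis claim above can be checked numerically. The sketch below is only an illustration under stated assumptions: no natural image is available here, so a synthetic piecewise-constant image stands in for one, and the simplest gradient filter [-1, 1] stands in for a Gabor-type filter. The function names are hypothetical.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis: about 0 for Gaussian data, > 0 for heavy tails."""
    x = np.asarray(x, dtype=float).ravel()
    x = x - x.mean()
    var = np.mean(x ** 2)
    return np.mean(x ** 4) / var ** 2 - 3.0

def horizontal_gradient(image):
    """Response of the gradient filter [-1, 1] applied along each row."""
    return np.diff(image, axis=1)

rng = np.random.default_rng(0)

# Assumed stand-in for a natural image: flat 16x16 blocks plus mild noise,
# so most gradients are near zero and a few (block edges) are large.
blocks = rng.normal(size=(8, 8))
piecewise = np.kron(blocks, np.ones((16, 16))) + 0.05 * rng.normal(size=(128, 128))

# Baseline: white Gaussian noise, whose gradient responses are Gaussian.
noise = rng.normal(size=(128, 128))

k_piecewise = excess_kurtosis(horizontal_gradient(piecewise))
k_noise = excess_kurtosis(horizontal_gradient(noise))
print(f"piecewise image: {k_piecewise:.2f}, Gaussian noise: {k_noise:.2f}")
```

The piecewise image should give a strongly positive excess kurtosis, while the noise image stays near zero. This is the sense in which the filter-response histogram is non-Gaussian.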

  16. Natural Image Statistics Example Computer Vision

  17. Analytical Probability Models for Spectral Representation • Transported generator model (Grenander and Srivastava, 2000) where • the generators g_i are selected randomly from some generator space G • the weights a_i are i.i.d. standard normal • the scales ρ_i are i.i.d. uniform on the interval [0, L] • the locations z_i are samples from a 2D homogeneous Poisson process with uniform intensity λ, and • the parameters are assumed to be independent of each other Computer Vision
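
The slide's equation did not survive in the transcript. A minimal sketch consistent with the bullet points above is a weighted superposition of scaled and translated generators, roughly I(z) = sum_i a_i g_i((z - z_i) / scale_i). That form, the isotropic Gaussian bump used as the only generator, and all names and default values below are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def gaussian_generator(u, v):
    """Assumed generator g: an isotropic Gaussian bump centered at the origin."""
    return np.exp(-0.5 * (u ** 2 + v ** 2))

def sample_transported_generator_image(size=64, intensity=0.01, max_scale=8.0, seed=0):
    """Draw one image as a weighted sum of scaled, translated generators."""
    rng = np.random.default_rng(seed)
    # Homogeneous Poisson process: Poisson count, then uniform locations.
    n = rng.poisson(intensity * size * size)
    locations = rng.uniform(0, size, size=(n, 2))
    weights = rng.standard_normal(n)                 # i.i.d. standard normal
    scales = rng.uniform(0, max_scale, size=n)       # i.i.d. uniform on [0, L]
    scales = np.maximum(scales, 1e-6)                # guard against division by zero
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    image = np.zeros((size, size))
    for (zy, zx), a, rho in zip(locations, weights, scales):
        image += a * gaussian_generator((ys - zy) / rho, (xs - zx) / rho)
    return image

image = sample_transported_generator_image()
print(image.shape, float(image.min()), float(image.max()))
```

All random quantities are drawn independently, matching the independence assumption in the last bullet.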

  18. Analytical Probability Models - continued • Define • Model u by a scaled Γ-density Computer Vision

  19. Analytical Probability Models - continued Computer Vision

  20. Analytical Probability Models - continued Computer Vision

  21. Analytical Probability Models - continued Computer Vision

  22. Analysis of Natural Image Components • Harmonic analysis • Decomposing various classes of functions by different bases • Including Fourier transform, wavelet transforms, edgelets, curvelets, and so on Computer Vision

  23. Sparse Coding From S. C. Zhu Computer Vision

  24. Grouping of Natural Image Elements • Gestalt laws • Gestalt grouping laws • Should be interpreted as heuristics rather than deterministic laws • Nonaccidental property Computer Vision

  25. Illusion Computer Vision

  26. Illusion – cont. Computer Vision

  27. Ambiguous Figure Computer Vision

  28. Statistical Modeling of Natural Image Patterns • Synthesis-by-analysis Computer Vision

  29. Analogy from Speech Recognition Computer Vision

  30. Modeling of Natural Image Patterns • Shape-from-X problems are fundamentally ill-posed • Markov random field models • Deformable templates for objects • Inhomogeneous MRF models on graphs Computer Vision

  31. Four Categories of Statistical Models • Descriptive models • Constructed based on statistical descriptions of the image ensembles • Homogeneous models • Statistics are assumed to be the same for all elements in the graph • Inhomogeneous models • The elements of the underlying graph are labeled and different features and statistics are used at different sites Computer Vision

  32. Variants of Descriptive Models • Causal Markov models • By imposing a partial ordering among the vertices of the graph, the joint probability can be factorized as a product of conditional probabilities • Belief propagation networks • Pseudo-descriptive models Computer Vision
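
The factorization mentioned for causal Markov models can be written out as follows. The notation is assumed for illustration: x_1, ..., x_n are the vertex variables listed in an order compatible with the partial ordering, and pa(i) is the set of parents of vertex i, all of which precede i.

```latex
p(x_1, \dots, x_n) \;=\; \prod_{i=1}^{n} p\bigl(x_i \mid x_{\mathrm{pa}(i)}\bigr),
\qquad \mathrm{pa}(i) \subseteq \{\, j : j \prec i \,\}

% Special case: a first-order Markov chain, where pa(i) = \{ i-1 \}
p(x_1, \dots, x_n) \;=\; p(x_1) \prod_{i=2}^{n} p(x_i \mid x_{i-1})
```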

  33. Generative Models • Use of hidden variables that can “explain away” the strong dependency in observed images • This requires a vocabulary • Grammars to generate images from hidden variables • Note that generative models cannot be separated from descriptive models • The description of hidden variables requires descriptive models Computer Vision

  34. Discriminative Models • Approximation of posterior probabilities of hidden variables based on local features • Can be seen as importance proposal probabilities Computer Vision

  35. An Example Computer Vision

  36. Problem formulation • Input: a set of images • Output: a probability model • Here, f(I) represents the ensemble of images in a given domain; the relationship between ensemble and probability is discussed later. Computer Vision

  37. The Kullback-Leibler Divergence • Problem formulation • The model p approaches the true density Computer Vision

  38. Maximum Likelihood Estimate Computer Vision
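
The equations on the two slides above were lost in the transcript. The standard link between minimizing the Kullback-Leibler divergence and maximum likelihood is sketched below as a reconstruction, not a copy of the slides. Assumed notation: f is the true density, p a candidate model, Ω the family of candidate models, and I_1, ..., I_M the observed images, treated as samples from f.

```latex
\begin{aligned}
KL(f \,\|\, p) &= \int f(I) \log \frac{f(I)}{p(I)} \, dI
               \;=\; E_f[\log f(I)] - E_f[\log p(I)], \\
p^{*} &= \arg\min_{p \in \Omega} KL(f \,\|\, p)
       \;=\; \arg\max_{p \in \Omega} E_f[\log p(I)]
       \;\approx\; \arg\max_{p \in \Omega} \frac{1}{M} \sum_{m=1}^{M} \log p(I_m).
\end{aligned}
```

The first term, E_f[log f(I)], does not depend on p, so minimizing the divergence is the same as maximizing the expected log-likelihood. Replacing the expectation with the sample average gives the maximum likelihood estimate.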

  39. Model Pursuit 1. What is Ω, the family of models? 2. How do we augment the space Ω? Computer Vision

  40. Two Choices of Models • 1. The exponential family (descriptive models): characterize images by features and statistics. Features are deterministic mathematical transforms of an image. • 2. The mixture family (generative models): characterize images by hidden variables. Hidden variables are stochastic and are inferred from an image. Computer Vision

  41. I: Descriptive Models • Step 1: extracting image features/statistics as transforms For example: histograms of Gabor filter responses. Other features/statistics: Gabors, geometry, Gestalt laws, faces. Computer Vision

  42. I.I: Descriptive Models • Step 2: using features/statistics to constrain the model • Two cases: • On the infinite lattice Z^2: an equivalence class • On any finite lattice: a conditional probability model • [Figure: image space on Z^2; image space on lattice L] Computer Vision

  43. I.I Descriptive Model on Finite Lattice Modeling by maximum entropy: Subject to: Remark: p and f have the same projected marginal statistics. Computer Vision

  44. Minimax Entropy Learning For a Gibbs (max. entropy) model p, this leads to the minimax entropy principle (Zhu, Wu, and Mumford, 1996, 1997) Computer Vision

  45. FRAME Model • FRAME model • Filtering, random field, and maximum entropy • A well-defined mathematical model for textures by combining filtering and random field models Computer Vision

  46. I.I Descriptive Model on Finite Lattice • The FRAME model (Zhu, Wu, and Mumford, 1996) • This includes all Markov random field models. • Remark: all known exponential models can be derived from maximum entropy, a principle proposed in physics (Jaynes, 1957). Its advantage is that it provides a parametric model that integrates features. Computer Vision

  47. I.I Descriptive Model on Finite Lattice Two learning phases: 1. Choose information-bearing features, augmenting the probability family. 2. Compute the parameter Λ by MLE, learning within a family. Computer Vision
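
The second learning phase can be illustrated on a toy problem. The sketch below is not the FRAME learning algorithm: it assumes a six-state discrete variable, a single feature (the state value), and a target expectation of 4.5, then fits the parameter of a Gibbs model p(x) ∝ exp(-λ·φ(x)) by gradient ascent on the log-likelihood. All names and values are hypothetical.

```python
import numpy as np

def gibbs_distribution(features, lam):
    """p(x) proportional to exp(-features[x] @ lam) over a finite state set."""
    logits = -(features @ lam)
    logits -= logits.max()          # stabilize the exponentials
    p = np.exp(logits)
    return p / p.sum()

def fit_lambda(features, target_stats, lr=0.1, steps=5000):
    """Gradient ascent on the average log-likelihood.

    With log p(x) = -lam . phi(x) - log Z(lam), the gradient with respect
    to lam is E_p[phi] - target_stats, which vanishes exactly when the
    model reproduces the observed statistics.
    """
    lam = np.zeros(features.shape[1])
    for _ in range(steps):
        p = gibbs_distribution(features, lam)
        model_stats = p @ features
        lam += lr * (model_stats - target_stats)
    return lam

states = np.arange(1, 7, dtype=float)   # assumed toy state space: 1..6
features = states[:, None]              # one feature: phi(x) = x
target_stats = np.array([4.5])          # assumed observed mean

lam = fit_lambda(features, target_stats)
p = gibbs_distribution(features, lam)
model_mean = float(p @ states)
print(lam, p.round(4), model_mean)
```

At convergence the model expectation matches the target statistic, which is the moment-matching condition shared by maximum likelihood and maximum entropy. Real image models need sampling to estimate the model statistics, since the state space cannot be enumerated.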

  48. Maximum Entropy • Maximum entropy • Is an important principle in statistics for constructing a probability distribution on a set of random variables • Suppose the available information is the expectations of some known functions φ_n(x), that is • Let Ω be the set of all probability distributions p(x) which satisfy the constraints Computer Vision

  49. Maximum Entropy – cont. • Maximum Entropy – continued • According to the maximum entropy principle, a good choice of the probability distribution is the one that has the maximum entropy subject to Computer Vision

  50. Maximum Entropy – cont. • Maximum Entropy – continued • By Lagrange multipliers, the solution for p(x) is • where Computer Vision
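
The formulas on the three maximum entropy slides above are missing from the transcript. The standard form of the argument is sketched below as a reconstruction. Assumed notation: φ_n(x) are the known functions, μ_n their given expectations, Ω the set of distributions satisfying the constraints, and Λ = (λ_1, ..., λ_N) the Lagrange multipliers.

```latex
% Constraints defining \Omega
E_p[\phi_n(x)] = \int \phi_n(x)\, p(x)\, dx = \mu_n, \qquad n = 1, \dots, N

% Maximum entropy principle
p^{*}(x) = \arg\max_{p \in \Omega} \Bigl\{ -\int p(x) \log p(x)\, dx \Bigr\}

% Solution by Lagrange multipliers
p(x; \Lambda) = \frac{1}{Z(\Lambda)} \exp\Bigl\{ -\sum_{n=1}^{N} \lambda_n \phi_n(x) \Bigr\},
\qquad
Z(\Lambda) = \int \exp\Bigl\{ -\sum_{n=1}^{N} \lambda_n \phi_n(x) \Bigr\}\, dx
```

The multipliers λ_n are chosen so that the constraints hold. Z(Λ) is the normalizing constant, also called the partition function.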
