Automatic Contextual Pattern Modeling Pengyu Hong Beckman Institute for Advanced Science and Technology University of Illinois at Urbana Champaign hong@ifp.uiuc.edu http://www.ifp.uiuc.edu/~hong
Overview • Motivations • Define the problem • Formulate the problem • Design the algorithm • Experimental results • Conclusions and discussions
Motivations The global features: color histogram + edge histogram.
Motivations A simple example. What kind of visual pattern is shared by the following histograms? The color histograms of six images
Motivations The global texture information is also given … The normalized wavelet texture histograms of those six images
Motivations The images
Motivations The global features of an object are mixtures of the local features of its primitives. • The global features alone are not enough to distinguish different objects/scenes in many cases.
Motivations An object consists of several primitives among which various contextual relations are defined. It is therefore very important to model both the primitives and the relations.
Motivations • Examples of primitives • Regions • Edges • … In terms of images • Examples of relations • Relative distance between two primitives • Relative orientation between two primitives • The size ratio between two primitives • …
The representation First, we need to choose a representation for the information in order to compute with it. The attributed relational graph (ARG) [Tsai1979] has been used extensively to represent objects/scenes. (Figure: an example of an ARG.)
The representation – ARG • The nodes of an ARG represent the object primitives. The attributes (color histogram, shapes, texture, etc.) of the nodes represent the appearance features of the object primitives. • The lines represent the relations between the object primitives. ARG
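The node/edge structure described above can be sketched as a small data structure. This is a minimal illustration, not the paper's implementation; the attribute names (e.g. "mean_color", "distance") are hypothetical examples.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """An object primitive (e.g. an image region) with appearance attributes."""
    attributes: dict  # e.g. {"mean_color": (r, g, b), "texture": [...]}

@dataclass
class ARG:
    """Attributed relational graph: nodes are primitives, edges carry relations."""
    nodes: list = field(default_factory=list)
    relations: dict = field(default_factory=dict)  # (i, j) -> relation attributes

    def add_node(self, attributes):
        """Add a primitive; return its index."""
        self.nodes.append(Node(attributes))
        return len(self.nodes) - 1

    def add_relation(self, i, j, attributes):
        """Attach relation attributes (distance, orientation, size ratio, ...) to a node pair."""
        self.relations[(i, j)] = attributes
```

A segmented image then maps to one ARG: one node per region, one relation per adjacent region pair.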
The representation – ARG An example: The image is segmented and represented as an ARG. The nodes represent the regions; the color of each node denotes the mean color of its region. The lines represent the adjacency relations among the regions.
The representation – ARG The advantage of the ARG representation. • Separate the local features and allow the user to examine the objects/scenes on a finer scale.
The representation – ARG The advantages of the ARG representation. • Separate the local features and allow the user to examine the objects on a finer scale. • Separate the local spatial transformations from the global spatial transformations of the object. (Figure: Scene 1 vs. Scene 2 under global translation and rotation plus local deformation.)
Problem definition Summarize a set of sample ARGs into a pattern model, which is then used for detection, recognition, and synthesis.
Problem definition How to build this Pattern model ? • Manually design • Learn from multiple observations
Related work • Maron and Lozano-Pérez 1998 Develop a Bayesian learning framework to learn visual patterns from multiple labeled images. • Frey and Jojic 1999 Use a generative model to jointly estimate the transformations and the appearance of the image pattern. • Guo, Zhu, & Wu 2001 Integrate a descriptive model and a generative model to learn visual patterns from multiple labeled images. • Hong & Huang 2000, Hong, Wang & Huang 2000
The contribution Develop the methodology and theory for automatically learning a parametric probabilistic pattern model that summarizes a set of observed samples. This model is called the pattern ARG model. It models both the appearance and the structure of the objects/scenes.
Formulate the problem • Assume the observed sample ARGs {Gi} are realizations of an underlying stochastic process governed by a probability distribution f(G). • The objective of learning is to estimate a model p(G) that approximates f(G) by minimizing the Kullback-Leibler divergence KL(f || p) [Cover & Thomas 1991].
Formulate the problem Minimizing the divergence is equivalent to maximizing the expected log-likelihood; therefore, we have a maximum likelihood estimator (MLE).
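Spelling out the standard derivation (following Cover & Thomas 1991):

```latex
\mathrm{KL}(f \,\|\, p)
  = \int f(G)\,\log\frac{f(G)}{p(G)}\,dG
  = -H(f) \;-\; \mathrm{E}_{f}\!\left[\log p(G)\right].
```

Since the entropy H(f) is fixed by the data-generating process, minimizing KL(f || p) over the model parameters is equivalent to maximizing E_f[log p(G)], which the sample average approximates:

```latex
\Theta^{*} \;=\; \arg\max_{\Theta} \sum_{i=1}^{S} \log p(G_i \mid \Theta).
```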
How to calculate p(o)? In practice, it is often necessary to impose structure on the distribution, for example a linear family. • Simplicity: the model uses a set of parameters to represent f(G). • Generality: use a set of components (mixtures) to approximate the true distribution f(G).
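The linear (mixture) family above can be evaluated numerically as follows; a minimal sketch, assuming each component supplies a log-density and the weights sum to one:

```python
import numpy as np

def mixture_log_likelihood(sample, weights, component_logpdfs):
    """log p(o) for a linear mixture p(o) = sum_h w_h * p_h(o).

    weights: array of mixture weights summing to 1.
    component_logpdfs: list of callables, each returning log p_h(o).
    """
    weights = np.asarray(weights, dtype=float)
    logps = np.array([lp(sample) for lp in component_logpdfs])
    # log-sum-exp trick for numerical stability
    m = logps.max()
    return m + np.log(np.sum(weights * np.exp(logps - m)))
```

The log-sum-exp form avoids underflow when individual component densities are tiny, which is common once the model has many components.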
Illustration of modeling A set of sample images {Ii}, i = 1, …, S, is converted into a set of sample ARGs {Gi}, i = 1, …, S. (Figure: images and their ARGs G1, G2, …, GS; node and relation labels omitted.)
Illustration of modeling Summarize the set of sample ARGs {Gi}, i = 1, …, S, into a pattern ARG model of M components, where M << S. (Figure: ARGs G1, …, GS summarized into the M model components; node and relation labels omitted.)
Hierarchically linear modeling On the macro scale, a sample ARG is modeled as a weighted sum of the M components of the pattern ARG model. (Figure: a sample ARG expressed as a combination of components 1, …, M.)
Hierarchically linear modeling On the micro scale, a sample node is modeled as a weighted sum of the corresponding nodes of the model components. (Figure: sample node o12 expressed as a combination of the component nodes.)
The underlying distributions Each component of the pattern ARG model is a parametric model: attributed distributions on the nodes and relational distributions on the relations.
The task is to… Learn the parameters of the pattern ARG model given the sample ARGs: • The parameters of the distribution functions. • The weight parameters that describe the contributions of the model components.
Sometimes … The instances of the pattern appear in various backgrounds. (Figure: sample ARGs containing both pattern and background nodes.)
Sometimes … It is labor intensive to manually extract each instance out of its background. The learning procedure should automatically extract the instances of the pattern ARG out of the sample ARGs.
Modified version of modeling Summarize the sample ARGs G1, …, GS into a pattern ARG model with M components, where M << S, while allowing for background nodes in the samples. (Figure: node and relation labels omitted.)
Learning via the EM algorithm The EM algorithm [Dempster1977] is a technique for finding the maximum likelihood estimate of the parameters of the underlying distributions from a training set. The EM algorithm defines a likelihood function, which in this case is:
Learning via the EM algorithm The likelihood function involves: the parameters to be estimated, the sample ARG set, and the correspondences between the sample ARGs and the pattern ARG model. The underlying distribution is a function of the parameters, evaluated under the assumption that they equal the current estimate at iteration t.
Learning via the EM algorithm Analogy
Learning via the EM algorithm The EM algorithm works iteratively in two steps, Expectation and Maximization: • Expectation step: the expected log-likelihood is calculated under the current estimate, where t is the number of iterations. • Maximization step: the parameter estimate is updated by maximizing this expectation. • The structure of the pattern ARG model is then modified.
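The two steps above can be illustrated with a toy EM loop on scalar attributes with Gaussian components. This is a simplified sketch, not the full pattern ARG learning procedure: the E-step computes soft correspondences (responsibilities) and the M-step re-estimates weights, means, and variances from them.

```python
import numpy as np

def em_gaussian_mixture(x, n_components, n_iters=50):
    """Minimal EM for a 1-D Gaussian mixture (illustration only)."""
    w = np.full(n_components, 1.0 / n_components)             # mixture weights
    mu = np.quantile(x, np.linspace(0.1, 0.9, n_components))  # spread initial means
    var = np.full(n_components, np.var(x) + 1e-6)             # initial variances
    for _ in range(n_iters):
        # E-step: responsibility of each component for each sample
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted re-estimation of the parameters
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return w, mu, var
```

In the actual algorithm the "samples" are node and relation attributes, and the responsibilities play the role of the correspondences between sample ARGs and the model.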
A sample ARG Initialize the pattern ARG model For example if the pattern ARG model has 3 components.
The Expectation step Calculate the likelihood of the data It is not so complicated as it appears! Please refer to the paper for the details.
The Maximization step Derive the expressions for the updated parameter estimate at iteration t + 1. • The expressions for the weight parameters, which describe the contributions of the model components, can be derived without knowing the forms of the attributed distributions or the relational distributions. • For Gaussian attributed distributions and Gaussian relational distributions, we can obtain analytical expressions for estimating the distribution parameters. Please refer to the paper for the details.
The Maximization step The update expressions for the weight parameters of the model components.
The Maximization step The parameters of the Gaussian attributed distributions: the mean and the covariance matrix.
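The Gaussian M-step updates have the usual responsibility-weighted form; a minimal sketch for one attributed distribution, assuming the E-step has produced soft correspondences:

```python
import numpy as np

def weighted_gaussian_update(attrs, resp):
    """M-step update for one Gaussian attributed distribution.

    attrs: (n_samples, d) attribute vectors of the corresponding sample nodes.
    resp:  (n_samples,) responsibilities (soft correspondences) from the E-step.
    Returns the responsibility-weighted mean and covariance matrix.
    """
    total = resp.sum()
    mean = (resp[:, None] * attrs).sum(axis=0) / total
    centered = attrs - mean
    # weighted sum of outer products, normalized by total responsibility
    cov = (resp[:, None, None] * (centered[:, :, None] * centered[:, None, :])).sum(axis=0) / total
    return mean, cov
```

The same form applies to the Gaussian relational distributions, with relation attribute vectors in place of node attribute vectors.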
The Maximization step The parameters of the Gaussian relational distributions: the mean and the covariance matrix.
Modify the structure Initialize the components of the pattern ARG model Null node
Modify the structure • Modify the structure of the pattern ARG model. It is likely that the model components are initialized so that they contain some nodes representing backgrounds. During the iterations of the algorithm, we examine the weight parameters and decide which model nodes should be marked as background nodes.
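One simple way to act on those weight parameters is a relative threshold; this is an illustrative criterion, not the paper's exact rule:

```python
import numpy as np

def mark_background_nodes(node_weights, threshold=0.1):
    """Mark model nodes whose estimated contribution weight stays low.

    node_weights: per-node weight parameters estimated by EM; nodes that
    rarely find correspondences in the samples receive small weights.
    Returns a boolean mask: True = background node. The relative
    threshold (10% of the largest weight) is a hypothetical choice.
    """
    w = np.asarray(node_weights, dtype=float)
    return w < threshold * w.max()
```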
Detect the pattern Use the learned pattern ARG model to detect the pattern. Given a new graph Gnew, we calculate the following likelihood:
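Once the model can score a new graph, detection reduces to a threshold test on that score; a minimal sketch, where the model's log-likelihood function and the threshold are assumed inputs (the threshold would be chosen on validation data):

```python
def detect_pattern(log_likelihood_fn, g_new, threshold):
    """Declare a detection when log p(Gnew | model) reaches the threshold.

    log_likelihood_fn: callable scoring a graph under the learned model
    (hypothetical interface, standing in for the learned pattern ARG model).
    """
    return log_likelihood_fn(g_new) >= threshold
```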
Experimental results I • Automatic image pattern extraction In this experiment, the images are segmented. The color features (mean RGB and its variances) of each segment are used. However, our theory can be applied directly to image pixels (see Discussions) or to other image primitives (e.g. edges). Segmentation is used only to reduce the computational complexity.
Experimental results I The images
Experimental results I The segmentation results
Experimental results I The ARGs
Experimental Results I The learning results, shown as subgraphs in the sample ARGs