Machine Learning Techniques for Computer Vision (ECCV 2004)
Christopher M. Bishop
1. Part 2: Unsupervised Learning
2. Overview of Part 2
Mixture models
EM
Variational Inference
Bayesian model complexity
Continuous latent variables
3. The Gaussian Distribution
Multivariate Gaussian
Maximum likelihood
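The maximum likelihood fit of a multivariate Gaussian can be sketched as follows. The function name is a hypothetical helper, and the data are assumed to be stored as an (N, D) NumPy array; the estimates are the sample mean and the sample covariance with divisor N.

```python
import numpy as np

def fit_gaussian_ml(X):
    """Maximum likelihood mean and covariance for data X of shape (N, D)."""
    mu = X.mean(axis=0)
    centered = X - mu
    # The ML covariance divides by N (not N - 1), so it is slightly biased.
    sigma = centered.T @ centered / X.shape[0]
    return mu, sigma
```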
4. Gaussian Mixtures
Linear super-position of Gaussians
Normalization and positivity require
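The equations on this slide did not survive in the transcript. In the usual notation (the symbols below are an assumption, not taken from the slide), the mixture density and the constraints on the mixing coefficients are:

```latex
\begin{align}
p(\mathbf{x}) &= \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \\
0 \le \pi_k &\le 1, \qquad \sum_{k=1}^{K} \pi_k = 1
\end{align}
```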
5. Example: Mixture of 3 Gaussians
6. Maximum Likelihood for the GMM
Log likelihood function
Sum over components appears inside the log
no closed form ML solution
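A minimal sketch of the log likelihood, assuming full covariance matrices and hypothetical function names. Because the sum over components sits inside the logarithm, the code evaluates it with a log-sum-exp rather than splitting it into per-component terms.

```python
import numpy as np

def log_gaussian(X, mu, sigma):
    """Log density of N(x | mu, sigma) for each row of X (shape (N, D))."""
    D = X.shape[1]
    diff = X - mu
    _, logdet = np.linalg.slogdet(sigma)
    maha = np.einsum("nd,nd->n", diff @ np.linalg.inv(sigma), diff)
    return -0.5 * (D * np.log(2 * np.pi) + logdet + maha)

def gmm_log_likelihood(X, pis, mus, sigmas):
    """ln p(X) = sum_n ln sum_k pi_k N(x_n | mu_k, Sigma_k)."""
    # log_terms[n, k] = ln pi_k + ln N(x_n | mu_k, Sigma_k)
    log_terms = np.stack(
        [np.log(p) + log_gaussian(X, m, s) for p, m, s in zip(pis, mus, sigmas)],
        axis=1,
    )
    # Stable log-sum-exp over components, then sum over data points.
    top = log_terms.max(axis=1, keepdims=True)
    per_point = top[:, 0] + np.log(np.exp(log_terms - top).sum(axis=1))
    return float(per_point.sum())
```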
7. EM Algorithm – Informal Derivation
8. EM Algorithm – Informal Derivation
M step equations
9. EM Algorithm – Informal Derivation
E step equation
10. EM Algorithm – Informal Derivation
Can interpret the mixing coefficients as prior probabilities
Corresponding posterior probabilities (responsibilities)
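The E and M steps can be sketched in NumPy as below. This is an illustration under stated assumptions, not the code behind the slides: the function names are hypothetical, the covariances start as identity matrices (so the data are assumed to be on roughly unit scale), and nothing guards against a component collapsing onto a single point.

```python
import numpy as np

def gaussian_pdf(X, mu, sigma):
    """Density of N(x | mu, sigma) for each row of X (shape (N, D))."""
    D = X.shape[1]
    diff = X - mu
    maha = np.einsum("nd,nd->n", diff @ np.linalg.inv(sigma), diff)
    norm = np.sqrt((2 * np.pi) ** D * np.linalg.det(sigma))
    return np.exp(-0.5 * maha) / norm

def e_step(X, pis, mus, sigmas):
    """Responsibilities gamma[n, k]: posterior probability of component k for x_n."""
    weighted = np.stack(
        [p * gaussian_pdf(X, m, s) for p, m, s in zip(pis, mus, sigmas)], axis=1
    )
    return weighted / weighted.sum(axis=1, keepdims=True)

def m_step(X, gamma):
    """Re-estimate the parameters, using responsibilities as soft counts."""
    N_k = gamma.sum(axis=0)               # effective number of points per component
    pis = N_k / X.shape[0]
    mus = (gamma.T @ X) / N_k[:, None]
    sigmas = []
    for k in range(gamma.shape[1]):
        diff = X - mus[k]
        sigmas.append((gamma[:, k, None] * diff).T @ diff / N_k[k])
    return pis, mus, np.array(sigmas)

def fit_gmm(X, init_mus, n_iter=50):
    """Alternate E and M steps, starting from the given initial means."""
    K, D = init_mus.shape
    pis = np.full(K, 1.0 / K)
    mus = np.asarray(init_mus, dtype=float)
    sigmas = np.array([np.eye(D)] * K)    # assumption: data on roughly unit scale
    for _ in range(n_iter):
        gamma = e_step(X, pis, mus, sigmas)
        pis, mus, sigmas = m_step(X, gamma)
    return pis, mus, sigmas, gamma
```

The mixing coefficients act as priors over components, and each row of `gamma` is the corresponding posterior for one data point.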
11. Old Faithful Data Set
18. Latent Variable View of EM
To sample from a Gaussian mixture:
first pick one of the components, with probability given by its mixing coefficient
then draw a sample from that component
repeat these two steps for each new data point
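The two-step sampling procedure can be sketched as follows. The function name and argument layout are assumptions made for the illustration.

```python
import numpy as np

def sample_gmm(n, pis, mus, sigmas, seed=0):
    """Draw n points from a Gaussian mixture; also return the component labels."""
    rng = np.random.default_rng(seed)
    # Step 1: pick a component for every point, using the mixing coefficients.
    z = rng.choice(len(pis), size=n, p=pis)
    # Step 2: draw each point from the Gaussian chosen in step 1.
    X = np.array([rng.multivariate_normal(mus[k], sigmas[k]) for k in z])
    return X, z
```

The labels `z` are exactly the quantities that become hidden once only `X` is observed.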
19. Latent Variable View of EM
Goal: given a data set, find the parameters of the mixture
Suppose we knew the colours
maximum likelihood would involve fitting each component to the corresponding cluster
Problem: the colours are latent (hidden) variables
20. Incomplete and Complete Data
21. Latent Variable Viewpoint
22. Latent Variable Viewpoint
Binary latent variables describing which component generated each data point
Conditional distribution of observed variable
Prior distribution of latent variables
Marginalizing over the latent variables we obtain
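The three expressions on this slide were lost in extraction. A standard way to write them, assuming a 1-of-K binary vector z for each data point (the notation is an assumption), is:

```latex
\begin{align}
p(\mathbf{x} \mid \mathbf{z}) &= \prod_{k=1}^{K} \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)^{z_k} \\
p(\mathbf{z}) &= \prod_{k=1}^{K} \pi_k^{z_k} \\
p(\mathbf{x}) &= \sum_{\mathbf{z}} p(\mathbf{z}) \, p(\mathbf{x} \mid \mathbf{z})
             = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)
\end{align}
```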
23. Graphical Representation of GMM
24. Latent Variable View of EM
Suppose we knew the values for the latent variables
maximize the complete-data log likelihood
trivial closed-form solution: fit each component to the corresponding set of data points
We don’t know the values of the latent variables
however, for given parameter values we can compute the expected values of the latent variables
25. Posterior Probabilities (colour coded)
26. Over-fitting in Gaussian Mixture Models
Infinities in the likelihood function when a component ‘collapses’ onto a data point: its mean sits on the point and its variance shrinks to zero
Also, maximum likelihood cannot determine the number K of components
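The singularity is easy to reproduce numerically. The sketch below uses a made-up one-dimensional data set and a two-component mixture with equal weights (both are assumptions for the illustration): one component stays broad, while the other is centred exactly on the first data point and made progressively narrower.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Univariate Gaussian density with standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def collapsed_log_likelihood(x, sigma_small):
    """Log likelihood when one component sits on x[0] with width sigma_small."""
    broad = normal_pdf(x, x.mean(), x.std())   # covers the whole data set
    narrow = normal_pdf(x, x[0], sigma_small)  # collapsing component
    return float(np.sum(np.log(0.5 * broad + 0.5 * narrow)))

x = np.array([-1.2, 0.3, 0.9, 2.1, 3.4])
# The log likelihood keeps growing as the narrow component shrinks.
log_liks = [collapsed_log_likelihood(x, s) for s in (1.0, 1e-2, 1e-4, 1e-6)]
```

The broad component keeps every other point at a finite density, so nothing stops the term for `x[0]` from growing without bound.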
27. Cross Validation
Can select model complexity using an independent validation data set
If data is scarce use cross-validation:
partition data into S subsets
train on S-1 subsets
test on remainder
repeat and average
Disadvantages
computationally expensive
can only determine one or two complexity parameters
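The S-fold procedure can be sketched generically. All names here are hypothetical; `fit` and `score` stand for whatever model and held-out measure are being assessed, and a one-dimensional Gaussian is used only as a small stand-in model.

```python
import numpy as np

def s_fold_indices(n, S, seed=0):
    """Split indices 0..n-1 into S roughly equal, disjoint subsets."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), S)

def cross_validate(X, S, fit, score):
    """Average held-out score over S folds."""
    folds = s_fold_indices(len(X), S)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[train_idx])                  # train on S-1 subsets
        scores.append(score(model, X[test_idx]))   # test on the remainder
    return float(np.mean(scores))                  # repeat and average

# Stand-in model: a univariate Gaussian scored by mean held-out log likelihood.
def fit_gaussian(x):
    return x.mean(), x.std()

def mean_log_lik(model, x):
    mu, sigma = model
    return float(np.mean(-0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2))
```

Each candidate complexity setting needs its own S training runs, which is where the computational cost listed above comes from.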
28. Bayesian Mixture of Gaussians
Parameters and latent variables appear on equal footing
Conjugate priors
29. Data Set Size
Problem 1: learn a function from 100 (slightly) noisy examples
data set is computationally small but statistically large
Problem 2: learn to recognize 1,000 everyday objects from 5,000,000 natural images
data set is computationally large but statistically small
Bayesian inference
computationally more demanding than ML or MAP (but see discussion of Gaussian mixtures later)
significant benefit for statistically small data sets
30. Variational Inference
Exact Bayesian inference intractable
Markov chain Monte Carlo
computationally expensive
issues of convergence
Variational Inference
broadly applicable deterministic approximation
collect all latent variables and parameters into a single set of unknowns
approximate true posterior using a simpler distribution
minimize Kullback-Leibler divergence
31. General View of Variational Inference
For an arbitrary distribution over the unknowns, the log marginal likelihood splits into a lower bound plus a Kullback-Leibler divergence
Maximizing the bound over all possible distributions would give the true posterior
this is intractable by definition
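The decomposition referred to on this slide was lost in extraction. In a common notation (an assumption: X for the observed data, Z for all latent variables and parameters, q for the approximating distribution), it reads:

```latex
\begin{align}
\ln p(X) &= \mathcal{L}(q) + \mathrm{KL}(q \,\|\, p) \\
\mathcal{L}(q) &= \int q(Z) \ln \frac{p(X, Z)}{q(Z)} \, dZ \\
\mathrm{KL}(q \,\|\, p) &= - \int q(Z) \ln \frac{p(Z \mid X)}{q(Z)} \, dZ \;\ge\; 0
\end{align}
```

Since the divergence is non-negative, the first term is a lower bound on the log marginal likelihood, and raising the bound is the same as lowering the divergence to the true posterior.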
32. Variational Lower Bound
33. Factorized Approximation
Goal: choose a family of q distributions which are:
sufficiently flexible to give good approximation
sufficiently simple to remain tractable
Here we consider factorized distributions
No further assumptions are required!
Optimal solution for one factor, keeping the remainder fixed
coupled solutions so initialize then cyclically update
message passing view (Winn and Bishop, 2004)
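The factorized family and the optimal update for a single factor can be written as follows. The notation is an assumption (X for the data, Z for all unknowns, split into groups Z_i), since the slide's own equations were not preserved.

```latex
\begin{align}
q(Z) &= \prod_{i} q_i(Z_i) \\
\ln q_j^{\star}(Z_j) &= \mathbb{E}_{i \neq j}\left[ \ln p(X, Z) \right] + \text{const}
\end{align}
```

The expectation is taken over every factor except the j-th. Each update therefore depends on the current state of the others, which is why the factors are initialized and then refreshed in turn.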
35. Lower Bound
Can also be evaluated
Useful for maths/code verification
Also useful for model comparison:
36. Illustration: Univariate Gaussian
Likelihood function
Conjugate prior
Factorized variational distribution
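A minimal sketch of the cyclic updates for this example. It assumes the conjugate prior is a Gaussian-gamma prior on the mean and precision, with hypothetical hyperparameter names `mu0`, `lam0`, `a0`, `b0`; the slide's own notation was not preserved.

```python
import numpy as np

def vb_univariate_gaussian(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0, n_iter=20):
    """Factorized q(mu) q(tau) for Gaussian data with unknown mean and precision.

    Assumed prior: p(mu | tau) = N(mu | mu0, 1/(lam0*tau)), p(tau) = Gam(tau | a0, b0).
    Returns parameters of q(mu) = N(mu | mu_N, 1/lam_N) and q(tau) = Gam(tau | a_N, b_N).
    """
    N, xbar = len(x), np.mean(x)
    # These two quantities do not depend on the other factor, so they are fixed.
    mu_N = (lam0 * mu0 + N * xbar) / (lam0 + N)
    a_N = a0 + (N + 1) / 2
    E_tau = a0 / b0                          # initial guess for E[tau]
    for _ in range(n_iter):
        # Update q(mu) using the current expected precision.
        lam_N = (lam0 + N) * E_tau
        # Update q(tau) using the current mean and variance of q(mu).
        E_data = np.sum((x - mu_N) ** 2) + N / lam_N
        E_prior = lam0 * ((mu_N - mu0) ** 2 + 1 / lam_N)
        b_N = b0 + 0.5 * (E_data + E_prior)
        E_tau = a_N / b_N
    return mu_N, lam_N, a_N, b_N
```

Each pass refreshes one factor while holding the other fixed, and a handful of iterations is usually enough for the two to stop changing.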
37. Initial Configuration
38. After Updating
39. After Updating
40. Converged Solution
41. Variational Mixture of Gaussians
Assume factorized posterior distribution
No other approximations needed!
42. Variational Equations for GMM
43. Lower Bound for GMM
44. VIBES
Bishop, Spiegelhalter and Winn (2002)
45. ML Limit
If instead we choose [expression not preserved] we recover the maximum likelihood EM algorithm
46. Bound vs. K for Old Faithful Data
47. Bayesian Model Complexity
48. Sparse Bayes for Gaussian Mixture
Corduneanu and Bishop (2001)
Start with large value of K
treat mixing coefficients as parameters
maximize marginal likelihood
prunes out excess components
51. Summary: Variational Gaussian Mixtures
Simple modification of maximum likelihood EM code
Small computational overhead compared to EM
No singularities
Automatic model order selection
52. Continuous Latent Variables
Conventional PCA
data covariance matrix
eigenvector decomposition
Minimizes sum-of-squares projection error
not a probabilistic model
how should we choose L?
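Conventional PCA as outlined above can be sketched in a few lines. The function name is hypothetical and the data are assumed to be an (N, D) array with D of at least 2.

```python
import numpy as np

def pca(X, L):
    """Project X (N, D) onto the L leading eigenvectors of its covariance."""
    mean = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)    # data covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)      # eigenvector decomposition
    order = np.argsort(eigvals)[::-1][:L]     # indices of the L largest eigenvalues
    U = eigvecs[:, order]                     # (D, L) principal directions
    Z = (X - mean) @ U                        # (N, L) projected coordinates
    return Z, U, eigvals[order], mean
```

The mean squared reconstruction error equals the sum of the discarded eigenvalues, but nothing in the procedure itself says how many to discard.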
53. Probabilistic PCA
Tipping and Bishop (1998)
L dimensional continuous latent space
D dimensional data space
54. Probabilistic PCA
Marginal distribution
Advantages
exact ML solution
computationally efficient EM algorithm
captures dominant correlations with few parameters
mixtures of PPCA
Bayesian PCA
building block for more complex models
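The exact maximum likelihood solution can be sketched as below. The symbols are assumptions, since the slide's equations were lost: `W` is the D-by-L matrix mapping latent space to data space, `sigma2` is the isotropic noise variance, and the marginal is taken to be Gaussian with covariance `W W^T + sigma2 * I`. The arbitrary rotation in the solution is set to the identity, and L is assumed smaller than D.

```python
import numpy as np

def ppca_ml(X, L):
    """Closed-form ML fit of probabilistic PCA (rotation taken as identity)."""
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]               # sort eigenvalues, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[L:].mean()                     # average discarded variance
    W = eigvecs[:, :L] * np.sqrt(eigvals[:L] - sigma2)  # scaled leading eigenvectors
    C = W @ W.T + sigma2 * np.eye(X.shape[1])       # covariance of the marginal
    return mu, W, sigma2, C
```

The model covariance `C` uses D*L + 1 numbers instead of the D*(D+1)/2 of a full covariance matrix, which is the sense in which it captures the dominant correlations with few parameters.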
55. EM for PCA
62. Bayesian PCA
Bishop (1998)
Gaussian prior over the columns of the matrix that maps latent space to data space
Automatic relevance determination (ARD)
63. Non-linear Manifolds
Example: images of a rigid object
64. Bayesian Mixture of BPCA Models
66. Flexible Sprites
Jojic and Frey (2001)
Automatic decomposition of video sequence into
background model
ordered set of masks (one per object per frame)
foreground model (one per object per frame)
68. Transformed Component Analysis
Generative model
Now include transformations (translations)
Extend to L layers
Inference intractable so use variational framework
70. Bayesian Constellation Model
Li, Fergus and Perona (2003)
Object recognition from small training sets
Variational treatment of fully Bayesian model
71. Bayesian Constellation Model
72. Summary of Part 2
Discrete and continuous latent variables
EM algorithm
Build complex models from simple components
represented graphically
incorporates prior knowledge
Variational inference
Bayesian model comparison