1 / 71

Part 2: Unsupervised Learning

Machine Learning Techniques for Computer Vision (ECCV 2004). Christopher M. Bishop. Overview of Part 2. Mixture modelsEMVariational InferenceBayesian model complexityContinuous latent variables. Machine Learning Techniques for Computer Vision (ECCV 2004). Christopher M. Bishop. The Gaussian Dist

micheal
Download Presentation

Part 2: Unsupervised Learning

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


    1. Part 2: Unsupervised Learning

    2. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Overview of Part 2 Mixture models EM Variational Inference Bayesian model complexity Continuous latent variables

    3. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop The Gaussian Distribution Multivariate Gaussian Maximum likelihood

    4. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Gaussian Mixtures Linear super-position of Gaussians Normalization and positivity require

    5. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Example: Mixture of 3 Gaussians

    6. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Maximum Likelihood for the GMM Log likelihood function Sum over components appears inside the log no closed form ML solution

    7. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop EM Algorithm – Informal Derivation

    8. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop EM Algorithm – Informal Derivation M step equations

    9. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop EM Algorithm – Informal Derivation E step equation

    10. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop EM Algorithm – Informal Derivation Can interpret the mixing coefficients as prior probabilities Corresponding posterior probabilities (responsibilities)

    11. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Old Faithful Data Set

    12. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    13. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    14. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    15. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    16. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    17. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    18. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Latent Variable View of EM To sample from a Gaussian mixture: first pick one of the components with probability then draw a sample from that component repeat these two steps for each new data point

    19. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Latent Variable View of EM Goal: given a data set, find Suppose we knew the colours maximum likelihood would involve fitting each component to the corresponding cluster Problem: the colours are latent (hidden) variables

    20. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Incomplete and Complete Data

    21. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Latent Variable Viewpoint

    22. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Latent Variable Viewpoint Binary latent variables describing which component generated each data point Conditional distribution of observed variable Prior distribution of latent variables Marginalizing over the latent variables we obtain

    23. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Graphical Representation of GMM

    24. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Latent Variable View of EM Suppose we knew the values for the latent variables maximize the complete-data log likelihood trivial closed-form solution: fit each component to the corresponding set of data points We don’t know the values of the latent variables however, for given parameter values we can compute the expected values of the latent variables

    25. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Posterior Probabilities (colour coded)

    26. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Over-fitting in Gaussian Mixture Models Infinities in likelihood function when a component ‘collapses’ onto a data point: with Also, maximum likelihood cannot determine the number K of components

    27. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Cross Validation Can select model complexity using an independent validation data set If data is scarce use cross-validation: partition data into S subsets train on S-1 subsets test on remainder repeat and average Disadvantages computationally expensive can only determine one or two complexity parameters

    28. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Bayesian Mixture of Gaussians Parameters and latent variables appear on equal footing Conjugate priors

    29. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Data Set Size Problem 1: learn the function for from 100 (slightly) noisy examples data set is computationally small but statistically large Problem 2: learn to recognize 1,000 everyday objects from 5,000,000 natural images data set is computationally large but statistically small Bayesian inference computationally more demanding than ML or MAP (but see discussion of Gaussian mixtures later) significant benefit for statistically small data sets

    30. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Variational Inference Exact Bayesian inference intractable Markov chain Monte Carlo computationally expensive issues of convergence Variational Inference broadly applicable deterministic approximation let denote all latent variables and parameters approximate true posterior using a simpler distribution minimize Kullback-Leibler divergence

    31. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop General View of Variational Inference For arbitrary where Maximizing over would give the true posterior this is intractable by definition

    32. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Variational Lower Bound

    33. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Factorized Approximation Goal: choose a family of q distributions which are: sufficiently flexible to give good approximation sufficiently simple to remain tractable Here we consider factorized distributions No further assumptions are required! Optimal solution for one factor, keeping the remainder fixed coupled solutions so initialize then cyclically update message passing view (Winn and Bishop, 2004)

    34. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    35. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Lower Bound Can also be evaluated Useful for maths/code verification Also useful for model comparison:

    36. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Illustration: Univariate Gaussian Likelihood function Conjugate prior Factorized variational distribution

    37. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Initial Configuration

    38. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop After Updating

    39. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop After Updating

    40. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Converged Solution

    41. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Variational Mixture of Gaussians Assume factorized posterior distribution No other approximations needed!

    42. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Variational Equations for GMM

    43. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Lower Bound for GMM

    44. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop VIBES Bishop, Spiegelhalter and Winn (2002)

    45. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop ML Limit If instead we choose we recover the maximum likelihood EM algorithm

    46. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Bound vs. K for Old Faithful Data

    47. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Bayesian Model Complexity

    48. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Sparse Bayes for Gaussian Mixture Corduneanu and Bishop (2001) Start with large value of K treat mixing coefficients as parameters maximize marginal likelihood prunes out excess components

    49. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    50. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    51. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Summary: Variational Gaussian Mixtures Simple modification of maximum likelihood EM code Small computational overhead compared to EM No singularities Automatic model order selection

    52. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Continuous Latent Variables Conventional PCA data covariance matrix eigenvector decomposition Minimizes sum-of-squares projection not a probabilistic model how should we choose L ?

    53. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Probabilistic PCA Tipping and Bishop (1998) L dimensional continuous latent space D dimensional data space

    54. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Probabilistic PCA Marginal distribution Advantages exact ML solution computationally efficient EM algorithm captures dominant correlations with few parameters mixtures of PPCA Bayesian PCA building block for more complex models

    55. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop EM for PCA

    56. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop EM for PCA

    57. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop EM for PCA

    58. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop EM for PCA

    59. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop EM for PCA

    60. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop EM for PCA

    61. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop EM for PCA

    62. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Bayesian PCA Bishop (1998) Gaussian prior over columns of Automatic relevance determination (ARD)

    63. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Non-linear Manifolds Example: images of a rigid object

    64. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Bayesian Mixture of BPCA Models

    65. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    66. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Flexible Sprites Jojic and Frey (2001) Automatic decomposition of video sequence into background model ordered set of masks (one per object per frame) foreground model (one per object per frame)

    67. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    68. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Transformed Component Analysis Generative model Now include transformations (translations) Extend to L layers Inference intractable so use variational framework

    69. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop

    70. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Bayesian Constellation Model Li, Fergus and Perona (2003) Object recognition from small training sets Variational treatment of fully Bayesian model

    71. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Bayesian Constellation Model

    72. Machine Learning Techniques for Computer Vision (ECCV 2004) Christopher M. Bishop Summary of Part 2 Discrete and continuous latent variables EM algorithm Build complex models from simple components represented graphically incorporates prior knowledge Variational inference Bayesian model comparison

More Related