
EM Algorithm and Mixture of Gaussians


Presentation Transcript


  1. EM Algorithm and Mixture of Gaussians Collard Fabien - 20046056, 김진식 (Kim Jinsik) - 20043152, 주찬혜 (Joo Chanhye) - 20043595

  2. Summary • Hidden Factors • EM Algorithm • Principles • Formalization • Mixture of Gaussians • Generalities • Processing • Formalization • Other Issues • Bayesian Network with hidden variables • Hidden Markov models • Bayes net structures with hidden variables 2

  3. Hidden factors The Problem : Hidden Factors • Unobservable / Latent / Hidden • Model them as variables • Simplicity of the model 3

  4. Hidden factors Simplicity details (graph 1) • [Diagram: network without the hidden variable. Smoking, Diet and Exercise (2 prior parameters each) are connected directly to Symptom 1, Symptom 2 and Symptom 3 (conditional tables of 54, 162 and 486 entries)] • 708 priors ! 4

  5. Hidden factors Simplicity details (Graph 2) • [Diagram: the same network with a hidden Heart Disease node. Smoking, Diet and Exercise (2 prior parameters each) feed Heart Disease (table of 54 entries), which feeds Symptom 1, Symptom 2 and Symptom 3 (6 entries each)] • 78 priors 5

  6. EM Algorithm A Solution : EM Algorithm • Expectation • Maximization 6

  7. EM Algorithm Principles : Generalities • Given : • Cause (or Factor / Component) • Evidence • Compute : • The probabilities in the tables connecting causes and evidence 7

  8. EM Algorithm Principles : The Two Steps • Parameters : P(effects|causes), P(causes) • E Step : for each evidence E, use the parameters to compute the probability distribution over causes, i.e. the weighted evidence P(causes|evidence) • M Step : update the estimates of the parameters, based on the weighted evidence 8

  9. EM Algorithm Principles : the E-Step • Perception Step • For each evidence and cause • Compute probabilities • Find probable relationships 9

  10. EM Algorithm Principles : the M-Step • Learning Step • Recompute the probability of each cause event given each evidence event • Sum over all evidence events • Maximize the log likelihood • Modify the model parameters 10
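
A minimal sketch of this two-step loop for a toy discrete model, in the spirit of the hidden-cause / symptom network above. The data, the variable names and the choice of a naive-Bayes-style model are mine, not the presentation's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical evidence: 6 cases x 3 binary symptoms (made up for illustration).
X = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 0, 0],
              [1, 0, 1],
              [0, 0, 1],
              [1, 1, 1]], dtype=float)

# Parameters: P(cause) and P(symptom_k = 1 | cause), randomly initialised.
p_cause = np.array([0.5, 0.5])               # P(C = 0), P(C = 1)
p_sym = rng.uniform(0.2, 0.8, size=(2, 3))   # row = cause value, column = symptom

for _ in range(50):
    # E step: weighted evidence P(cause | evidence) for every case.
    lik = np.zeros((X.shape[0], 2))
    for c in range(2):
        lik[:, c] = p_cause[c] * np.prod(
            np.where(X == 1, p_sym[c], 1.0 - p_sym[c]), axis=1)
    resp = lik / lik.sum(axis=1, keepdims=True)       # normalise per case

    # M step: re-estimate the parameters from the weighted evidence.
    p_cause = resp.mean(axis=0)                       # new P(cause)
    p_sym = (resp.T @ X) / resp.sum(axis=0)[:, None]  # new P(symptom = 1 | cause)

print("P(cause):", p_cause)
print("P(symptom = 1 | cause):", p_sym)
```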

  11. EM Algorithm Formulae : Notations • Terms • θ : underlying probability distribution • x : observed data • z : unobserved data • h : current hypothesis of θ • h' : revised hypothesis • q : a hidden variable distribution • Task : estimate θ from x • E-step: optimize A(q,h) over the hidden-variable distribution q • M-step: maximize A(q,h) over the parameters h 11

  12. EM Algorithm Formulae : the Log Likelihood • L(h) measures how well the parameters h fit the data x in the presence of the hidden variables z • Jensen's inequality holds for any distribution q(z) of the hidden states • This defines the auxiliary function A(q,h) • Lower bound on the log likelihood • What we want to optimize 12
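
In symbols (a standard reconstruction of these relations, using the notation of slide 11):

```latex
\begin{align*}
L(h) &= \ln p(x \mid h)
      = \ln \sum_z p(x, z \mid h)
      = \ln \sum_z q(z)\,\frac{p(x, z \mid h)}{q(z)} \\
     &\ge \sum_z q(z) \ln \frac{p(x, z \mid h)}{q(z)}
      \;=\; A(q, h)
      \qquad \text{(Jensen's inequality, since $\ln$ is concave)}
\end{align*}
```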

  13. EM Algorithm Formulae : the E-step • Lower bound on the log likelihood : A(q,h) = Σz q(z) ln p(x,z|h) + H(q) • H(q) is the entropy of q(z) • Optimize A(q,h) over q • By distributing the data over the hidden variables, i.e. setting q(z) = p(z|x,h) 13

  14. EM Algorithm Formulae : the M-step • Maximize A(q,h) over the parameters h • By choosing the optimal parameters • Equivalent to optimizing the likelihood 14

  15. EM Algorithm Formulae : Convergence (1/2) • EM increases the log likelihood of the data at every iteration • Kullback-Leibler (KL) divergence • Non-negative • Equals 0 iff q(z) = p(z|x,h) 15
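
The decomposition behind this argument, written out (a standard identity, in the notation of slides 11 and 12):

```latex
\begin{equation*}
L(h) - A(q, h)
  = \sum_z q(z) \ln \frac{q(z)}{p(z \mid x, h)}
  = \mathrm{KL}\!\left(q(z) \,\middle\|\, p(z \mid x, h)\right) \;\ge\; 0 .
\end{equation*}
```

So the E-step choice q(z) = p(z|x,h) makes the bound tight, and the following M-step can only move the likelihood upward.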

  16. Formulae : Convergence (2/2) • Likelihood increases at each iteration • Usually, EM converges to a local optimum of L 16

  17. Problem of likelihood • Can be a high-dimensional integral • Latent variables add additional dimensions • The likelihood term can be complicated 17

  18. Mixture of Gaussians The Issue : Mixture of Gaussians • Unsupervised clustering • Set of data points (the evidence) • Data generated from a mixture distribution • Continuous data : Mixture of Gaussians • Not easy to handle : • The number of parameters grows with the square of the data dimension 18

  19. Mixture of Gaussians Gaussian Mixture Model (2/2) • Distribution • Likelihood of a single Gaussian : N(x | μ, Σ) = exp(−½ (x−μ)ᵀ Σ⁻¹ (x−μ)) / √((2π)^d |Σ|) • Likelihood given a GMM : p(x) = Σi wi N(x | μi, Σi), i = 1..N • N the number of Gaussians • wi the weight of Gaussian i • All weights positive • Total weight = 1 19
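
A small sketch of evaluating that mixture likelihood; the component parameters and data points below are placeholders chosen only for illustration:

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_log_likelihood(X, weights, means, covs):
    """log of prod_j ( sum_i w_i * N(x_j | mu_i, Sigma_i) )."""
    dens = np.zeros(len(X))
    for w, mu, cov in zip(weights, means, covs):
        dens += w * multivariate_normal.pdf(X, mean=mu, cov=cov)
    return np.sum(np.log(dens))

# Hypothetical 2-component, 2-D mixture (weights positive, summing to 1).
weights = np.array([0.4, 0.6])
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), 2.0 * np.eye(2)]

X = np.array([[0.1, -0.2], [2.9, 3.1], [3.2, 2.8]])
print(gmm_log_likelihood(X, weights, means, covs))
```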

  20. EM for Gaussian Mixture Model • What for ? • Find the parameters: • Weights: wi = P(C=i) • Means: μi • Covariances: Σi • How ? • Guess the prior distribution • Guess the components (classes, or causes) • Guess the distribution function 20

  21. Mixture of Gaussians Processing : EM Initialization • Initialization : • Assign random values to the parameters 21
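
One possible form of this random initialisation (my own convention, not the presentation's: uniform weights, means drawn from the data, identity covariances):

```python
import numpy as np

def init_gmm(X, n_components, seed=0):
    """Starting parameters: uniform weights, means picked from the data,
    identity covariances."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    weights = np.full(n_components, 1.0 / n_components)
    means = X[rng.choice(n, size=n_components, replace=False)]
    covs = np.array([np.eye(d)] * n_components)
    return weights, means, covs

X = np.random.default_rng(1).normal(size=(100, 2))   # dummy data
weights, means, covs = init_gmm(X, n_components=3)
```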

  22. Mixture of Gaussians Processing : the E-Step (1/2) • Expectation : • Pretend to know the parameters • Assign each data point to a component 22

  23. Mixture of Gaussians Processing : the E-Step (2/2) • Competition of hypotheses • Compute the expected values pij of the hidden indicator variables • Each component gives a membership weight to each data point • Normalization • Weight = relative likelihood of class membership 23

  24. Mixture of Gaussians Processing : the M-Step (1/2) • Maximization : • Fit each component's parameters to its weighted set of points 24

  25. Mixture of Gaussians Processing : the M-Step (2/2) • For each hypothesis • Find the new values of the parameters that maximize the log likelihood • Based on • The weight of the points in the class • The location of the points • Hypotheses are pulled toward the data 25

  26. Mixture of Gaussians Applied formulae : the E-Step • For every data point, find how likely each Gaussian is to have generated it • Use Bayes' rule : pij = P(C=i | xj) = wi N(xj | μi, Σi) / Σk wk N(xj | μk, Σk) 26
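
That rule, sketched in code (pij is the responsibility of Gaussian i for point xj; the parameter values are placeholders for illustration):

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, weights, means, covs):
    """p[i, j] = w_i N(x_j | mu_i, Sigma_i) / sum_k w_k N(x_j | mu_k, Sigma_k)."""
    p = np.zeros((len(weights), len(X)))
    for i in range(len(weights)):
        p[i] = weights[i] * multivariate_normal.pdf(X, mean=means[i], cov=covs[i])
    return p / p.sum(axis=0, keepdims=True)   # normalise over components

# Placeholder data and parameters (illustration only).
X = np.array([[0.0, 0.1], [3.0, 2.9], [1.5, 1.5]])
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [3.0, 3.0]])
covs = np.array([np.eye(2), np.eye(2)])
print(e_step(X, weights, means, covs))   # each column sums to 1
```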

  27. Mixture of Gaussians Applied formulae : the M-Step • Maximize A : for each parameter of h, search for the value that sets the derivative to zero • Results : μi* = Σj pij xj / Σj pij , σi²* = Σj pij (xj − μi*)² / Σj pij , wi* = (1/n) Σj pij 27
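
The same updates in code form; `p` is the responsibility matrix from the E-step, and all names and toy numbers here are my own:

```python
import numpy as np

def m_step(X, p):
    """Weighted maximum-likelihood updates for a Gaussian mixture.
    p[i, j] is the responsibility of component i for data point x_j."""
    n = X.shape[0]
    nk = p.sum(axis=1)                 # effective number of points per component
    weights = nk / n                   # w_i* = (1/n) sum_j p_ij
    means = (p @ X) / nk[:, None]      # mu_i* = sum_j p_ij x_j / sum_j p_ij
    covs = []
    for i in range(len(nk)):           # Sigma_i* (sigma_i^2 in the 1-D case)
        diff = X - means[i]
        covs.append((p[i, :, None] * diff).T @ diff / nk[i])
    return weights, means, np.array(covs)

# Toy inputs: 3 points in 2-D, responsibilities for 2 components.
X = np.array([[0.0, 0.1], [3.0, 2.9], [1.5, 1.5]])
p = np.array([[0.9, 0.1, 0.5],
              [0.1, 0.9, 0.5]])
print(m_step(X, p))
```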

  28. Mixture of Gaussians Possible problems • A Gaussian component shrinks • Variance → 0 • Likelihood → infinity • Gaussian components merge • Same parameter values • They share the data points • A solution : reasonable prior values 28
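
One simple reading of "reasonable prior values" is to keep every covariance away from zero, for instance by adding a small ridge after each M-step (the 1e-6 below is an arbitrary choice of mine):

```python
import numpy as np

def regularize_covs(covs, eps=1e-6):
    """Add a small ridge so no component's variance can collapse to zero."""
    d = covs.shape[-1]
    return covs + eps * np.eye(d)

covs = np.array([1e-12 * np.eye(2), np.eye(2)])   # made-up example: one nearly collapsed component
print(regularize_covs(covs))
```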

  29. Other Issues Bayesian Networks 29

  30. Other Issues Hidden Markov models • Forward-Backward Algorithm • Smooth rather than filter 30

  31. Other Issues Bayes net with hidden variables • Pretend that the data is complete • Or invent a new hidden variable • It has no label or predefined meaning 31

  32. Conclusion • Widely applicable • Diagnosis • Classification • Distribution discovery • Does not work for complex models • High dimension • → Structural EM 32
