Expectation-Maximization (EM) Algorithm Original slides from Tatung University (Taiwan) Edited by: Muneem S.
Contents • Introduction • Main Body • Mixture Model • EM-Algorithm on GMM • Appendix: Missing Data
EM Algorithm Introduction
Introduction • EM is typically used to compute maximum likelihood estimates given incomplete samples. • The EM algorithm estimates the parameters of a model iteratively. • Starting from some initial guess, each iteration consists of • an E step (Expectation step) • an M step (Maximization step)
Applications • Discovering the value of latent variables • Estimating the parameters of HMMs • Estimating parameters of finite mixtures • Unsupervised learning of clusters • Filling in missing data in samples • …
EM Algorithm Main Body
Maximum Likelihood Given observed data $\mathcal{X} = \{x_1, \dots, x_N\}$ drawn i.i.d. from a density $p(x \mid \Theta)$, the likelihood is $\mathcal{L}(\Theta \mid \mathcal{X}) = \prod_{i=1}^{N} p(x_i \mid \Theta)$, and the maximum-likelihood estimate is $\Theta^* = \arg\max_{\Theta} \mathcal{L}(\Theta \mid \mathcal{X})$.
Latent Variables When some variables $\mathcal{Y}$ of the model are hidden, the observed data $\mathcal{X}$ alone are called incomplete data; the pair $(\mathcal{X}, \mathcal{Y})$ is called complete data.
Complete Data The complete-data likelihood is $\mathcal{L}(\Theta \mid \mathcal{X}, \mathcal{Y}) = p(\mathcal{X}, \mathcal{Y} \mid \Theta)$.
Complete Data The complete-data likelihood is a function of both the latent variable $\mathcal{Y}$ and the parameter $\Theta$. If we are given $\mathcal{Y}$, it is a computable function of the parameter $\Theta$ alone; if instead $\Theta$ is given, it is a function of the random variable $\mathcal{Y}$, so the result is expressed in terms of $\mathcal{Y}$.
Expectation For a discrete random variable $Y$ with distribution $p(y)$ and a function $g$: Expectation: $E[g(Y)] = \sum_{y} g(y)\, p(y)$. Conditional Expectation: $E[g(Y) \mid X = x] = \sum_{y} g(y)\, p(y \mid x)$.
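A tiny numeric illustration of the two definitions may help; the joint probability table and the function $g$ below are made up for the example.

```python
import numpy as np

# Toy joint distribution p(x, y) with x in {0, 1}, y in {0, 1, 2}.
p_xy = np.array([[0.10, 0.20, 0.10],   # row x = 0
                 [0.05, 0.25, 0.30]])  # row x = 1

g = np.array([1.0, 4.0, 9.0])          # g(y) = (y + 1)^2

# Expectation: E[g(Y)] = sum_y g(y) p(y), with p(y) the marginal over x.
p_y = p_xy.sum(axis=0)
print(g @ p_y)                         # 5.55

# Conditional expectation: E[g(Y) | X = 1] = sum_y g(y) p(y | x = 1).
p_y_given_x1 = p_xy[1] / p_xy[1].sum()
print(g @ p_y_given_x1)                # 6.25
```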
Expectation Step Let $\Theta^{(i-1)}$ be the parameter vector obtained at the $(i-1)$-th step. Define $Q(\Theta, \Theta^{(i-1)}) = E\big[\log p(\mathcal{X}, \mathcal{Y} \mid \Theta) \mid \mathcal{X}, \Theta^{(i-1)}\big]$ (the conditional expectation of the log-likelihood of the complete data).
Maximization Step Let $\Theta^{(i-1)}$ be the parameter vector obtained at the $(i-1)$-th step. Define $\Theta^{(i)} = \arg\max_{\Theta} Q(\Theta, \Theta^{(i-1)})$.
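Read as code, the two steps form a simple loop. Below is a minimal Python sketch of that loop; the `e_step` and `m_step` callables are hypothetical placeholders to be supplied for a concrete model (the GMM case is worked out later in these slides).

```python
import numpy as np

def em(data, theta0, e_step, m_step, tol=1e-6, max_iter=100):
    """Generic EM iteration (a sketch; e_step and m_step are
    hypothetical model-specific callables, not part of the slides).

    e_step(data, theta) -> (sufficient statistics, log-likelihood)
    m_step(data, stats) -> updated parameters
    """
    theta, prev_ll = theta0, -np.inf
    for _ in range(max_iter):
        stats, ll = e_step(data, theta)  # E step: form Q(theta, theta^(i-1))
        theta = m_step(data, stats)      # M step: maximize Q over theta
        if abs(ll - prev_ll) < tol:      # stop when the likelihood stalls
            break
        prev_ll = ll
    return theta
```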
EM Algorithm Mixture Model
Mixture Models • If there is reason to believe that a data set comprises several distinct populations, a mixture model can be used. • It has the following form: $p(x \mid \Theta) = \sum_{l=1}^{M} \alpha_l\, p_l(x \mid \theta_l)$, with $\sum_{l=1}^{M} \alpha_l = 1$ and $\alpha_l \geq 0$.
Mixture Models Let $y_i \in \{1, \dots, M\}$ represent the source that generates data point $x_i$. Then $P(y_i = l) = \alpha_l$ and $p(x_i \mid y_i = l, \Theta) = p_l(x_i \mid \theta_l)$.
Mixture Models The joint density of a data point and its source is therefore $p(x_i, y_i \mid \Theta) = \alpha_{y_i}\, p_{y_i}(x_i \mid \theta_{y_i})$.
Mixture Models Given $x$ and $\Theta$, the conditional density of $y$ can be computed by Bayes' rule: $p(y_i = l \mid x_i, \Theta) = \dfrac{\alpha_l\, p_l(x_i \mid \theta_l)}{\sum_{k=1}^{M} \alpha_k\, p_k(x_i \mid \theta_k)}$.
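This posterior is often called the responsibility of component $l$ for point $x_i$. A minimal NumPy sketch, assuming (for concreteness, not from the slides) that each component density $p_l$ is a univariate Gaussian:

```python
import numpy as np
from scipy.stats import norm

def responsibilities(x, alphas, mus, sigmas):
    """p(y_i = l | x_i, Theta) for a univariate Gaussian mixture.

    x: (N,) data points; alphas: (M,) mixing weights summing to 1;
    mus, sigmas: (M,) per-component means and standard deviations.
    """
    # Weighted component densities alpha_l * p_l(x_i | theta_l), shape (N, M).
    weighted = alphas * norm.pdf(x[:, None], loc=mus, scale=sigmas)
    # Normalize each row: Bayes' rule with the mixture density as denominator.
    return weighted / weighted.sum(axis=1, keepdims=True)
```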
Complete-Data Likelihood Function $\log \mathcal{L}(\Theta \mid \mathcal{X}, \mathcal{Y}) = \sum_{i=1}^{N} \log p(x_i, y_i \mid \Theta) = \sum_{i=1}^{N} \log\big(\alpha_{y_i}\, p_{y_i}(x_i \mid \theta_{y_i})\big)$
Expectation Let $\Theta^g$ denote the current guess of the parameters. The E-step computes the expected complete-data log-likelihood $Q(\Theta, \Theta^g) = \sum_{\mathbf{y}} \log \mathcal{L}(\Theta \mid \mathcal{X}, \mathbf{y})\; p(\mathbf{y} \mid \mathcal{X}, \Theta^g)$, where the sum runs over all configurations of the latent variables.
Expectation Writing the log-likelihood with indicator terms, every term is zero when $y_i \neq l$, so the sum over all configurations collapses to $Q(\Theta, \Theta^g) = \sum_{l=1}^{M} \sum_{i=1}^{N} \log\big(\alpha_l\, p_l(x_i \mid \theta_l)\big)\, p(l \mid x_i, \Theta^g)$.
Maximization Given the current guess $\Theta^g$, we want to find $\Theta$ that maximizes the above expectation. In fact, this is done iteratively: each maximizer becomes the next guess $\Theta^g$.
EM Algorithm EM-Algorithm on GMM
The GMM (Gaussian Mixture Model) Gaussian model of a $d$-dimensional source, say $j$: $p_j(x \mid \mu_j, \Sigma_j) = (2\pi)^{-d/2}\, |\Sigma_j|^{-1/2} \exp\big(-\tfrac{1}{2}(x - \mu_j)^T \Sigma_j^{-1} (x - \mu_j)\big)$. GMM with $M$ sources: $p(x \mid \Theta) = \sum_{j=1}^{M} \alpha_j\, p_j(x \mid \mu_j, \Sigma_j)$.
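A direct NumPy transcription of these two densities (a sketch; the function and variable names are mine):

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    """Density of a d-dimensional Gaussian N(mu, cov) at a point x."""
    d = mu.shape[0]
    diff = x - mu
    norm_const = (2 * np.pi) ** (-d / 2) * np.linalg.det(cov) ** (-0.5)
    # Quadratic form (x - mu)^T cov^{-1} (x - mu) via a linear solve.
    return norm_const * np.exp(-0.5 * diff @ np.linalg.solve(cov, diff))

def gmm_pdf(x, alphas, mus, covs):
    """Mixture density sum_j alpha_j * N(x; mu_j, cov_j)."""
    return sum(a * gaussian_pdf(x, m, c) for a, m, c in zip(alphas, mus, covs))
```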
Goal To maximize $Q(\Theta, \Theta^g) = \sum_{l=1}^{M} \sum_{i=1}^{N} \log\big(\alpha_l\, p_l(x_i \mid \theta_l)\big)\, p(l \mid x_i, \Theta^g)$, subject to $\sum_{l=1}^{M} \alpha_l = 1$.
Goal The objective splits into two parts: $Q(\Theta, \Theta^g) = \sum_{l=1}^{M} \sum_{i=1}^{N} \log \alpha_l\; p(l \mid x_i, \Theta^g) + \sum_{l=1}^{M} \sum_{i=1}^{N} \log p_l(x_i \mid \theta_l)\; p(l \mid x_i, \Theta^g)$. The first term is correlated with the $\alpha_l$ only, the second with the $\theta_l$ only, so the two can be maximized independently, still subject to $\sum_{l=1}^{M} \alpha_l = 1$.
Finding $\alpha_l$ Due to the constraint on the $\alpha_l$'s, we introduce a Lagrange multiplier $\lambda$ and solve the following equation: $\frac{\partial}{\partial \alpha_l}\Big[\sum_{l=1}^{M} \sum_{i=1}^{N} \log \alpha_l\; p(l \mid x_i, \Theta^g) + \lambda\big(\sum_{l=1}^{M} \alpha_l - 1\big)\Big] = 0$.
Finding $\alpha_l$ This gives $\sum_{i=1}^{N} \frac{1}{\alpha_l}\, p(l \mid x_i, \Theta^g) + \lambda = 0$. Multiplying by $\alpha_l$ and summing over $l$, the constraint $\sum_{l} \alpha_l = 1$ yields $\lambda = -N$, so $\alpha_l = \frac{1}{N} \sum_{i=1}^{N} p(l \mid x_i, \Theta^g)$.
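In code, this update is just the column average of the responsibility matrix. A self-contained toy example (the numbers are made up):

```python
import numpy as np

# resp[i, l] = p(l | x_i, Theta^g), e.g. computed as in the
# responsibilities() sketch above; here a toy 3-point, 2-component table.
resp = np.array([[0.9, 0.1],
                 [0.2, 0.8],
                 [0.6, 0.4]])

# alpha_l = (1/N) * sum_i p(l | x_i, Theta^g): the average responsibility.
alphas_new = resp.mean(axis=0)
print(alphas_new)  # [0.56666667 0.43333333]
```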
Finding $\mu_l$ and $\Sigma_l$ Only the second term of the split depends on $\theta_l = (\mu_l, \Sigma_l)$, so we only need to maximize that term; the $\alpha$-part is unrelated. For the GMM, substituting the Gaussian density and dropping constants unrelated to $\theta_l$, we therefore want to maximize: $\sum_{l=1}^{M} \sum_{i=1}^{N} \big[-\tfrac{1}{2} \log |\Sigma_l| - \tfrac{1}{2}(x_i - \mu_l)^T \Sigma_l^{-1} (x_i - \mu_l)\big]\, p(l \mid x_i, \Theta^g)$. How? Some knowledge of matrix algebra is needed (the derivatives of $\log|\Sigma|$ and of quadratic forms with respect to $\Sigma$). Setting the derivative with respect to $\mu_l$ to zero gives $\mu_l = \frac{\sum_{i=1}^{N} x_i\, p(l \mid x_i, \Theta^g)}{\sum_{i=1}^{N} p(l \mid x_i, \Theta^g)}$; setting the derivative with respect to $\Sigma_l$ to zero gives $\Sigma_l = \frac{\sum_{i=1}^{N} p(l \mid x_i, \Theta^g)\,(x_i - \mu_l)(x_i - \mu_l)^T}{\sum_{i=1}^{N} p(l \mid x_i, \Theta^g)}$.
Summary: EM algorithm for GMM Given an initial guess $\Theta^g$, find the new parameters as follows, and repeat with $\Theta^g \leftarrow \Theta^{\text{new}}$ while not converged: $\alpha_l^{\text{new}} = \frac{1}{N} \sum_{i=1}^{N} p(l \mid x_i, \Theta^g)$, $\mu_l^{\text{new}} = \frac{\sum_{i=1}^{N} x_i\, p(l \mid x_i, \Theta^g)}{\sum_{i=1}^{N} p(l \mid x_i, \Theta^g)}$, $\Sigma_l^{\text{new}} = \frac{\sum_{i=1}^{N} p(l \mid x_i, \Theta^g)\,(x_i - \mu_l^{\text{new}})(x_i - \mu_l^{\text{new}})^T}{\sum_{i=1}^{N} p(l \mid x_i, \Theta^g)}$.
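Putting the E and M steps together, here is a compact NumPy sketch of the whole loop. The initialization (random data points as means, a shared sample covariance) and the fixed iteration count are my choices, not part of the slides; SciPy is assumed for the Gaussian density.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, M, n_iter=100, seed=0):
    """EM for a d-dimensional Gaussian mixture with M components.

    A minimal sketch of the update equations summarized above;
    initialization and convergence handling are deliberately naive.
    X: (N, d) data matrix.
    """
    rng = np.random.default_rng(seed)
    N, d = X.shape
    alphas = np.full(M, 1.0 / M)                 # uniform mixing weights
    mus = X[rng.choice(N, M, replace=False)]     # means: random data points
    base_cov = np.atleast_2d(np.cov(X.T)) + 1e-6 * np.eye(d)
    covs = np.stack([base_cov for _ in range(M)])

    for _ in range(n_iter):
        # E step: responsibilities p(l | x_i, Theta^g), shape (N, M).
        dens = np.column_stack([
            alphas[l] * multivariate_normal.pdf(X, mus[l], covs[l])
            for l in range(M)])
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M step: the closed-form updates derived above.
        Nl = resp.sum(axis=0)                    # effective counts per source
        alphas = Nl / N
        mus = (resp.T @ X) / Nl[:, None]
        for l in range(M):
            diff = X - mus[l]
            covs[l] = (resp[:, l, None] * diff).T @ diff / Nl[l]
            covs[l] += 1e-6 * np.eye(d)          # keep covariance invertible
    return alphas, mus, covs
```

Calling `alphas, mus, covs = em_gmm(X, M=3)` on an `(N, d)` data matrix returns the fitted mixture; in practice one would also monitor the log-likelihood for convergence and restart from several initializations.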
EM Algorithm Example: Missing Data
Univariate Normal Sample Sampling: draw $x_1, \dots, x_N$ i.i.d. from a univariate normal $N(\mu, \sigma^2)$.
Maximum Likelihood The likelihood of the sample is $\mathcal{L}(\mu, \sigma^2 \mid \mathbf{x}) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\big(-\frac{(x_i - \mu)^2}{2\sigma^2}\big)$. Given $\mathbf{x}$, it is a function of $\mu$ and $\sigma^2$, and we want to maximize it.
Log-Likelihood Function Maximize this instead: $\ell(\mu, \sigma^2) = -\frac{N}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (x_i - \mu)^2$. By setting $\partial \ell / \partial \mu = 0$ and $\partial \ell / \partial \sigma^2 = 0$, we obtain $\hat\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$ and $\hat\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat\mu)^2$.
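A quick numeric sanity check of these closed forms (the true parameters below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=10_000)  # sample from N(5, 4)

mu_hat = x.mean()                      # MLE of mu: the sample mean
var_hat = ((x - mu_hat) ** 2).mean()   # MLE of sigma^2 (1/N, not 1/(N-1))
print(mu_hat, var_hat)                 # close to 5 and 4
```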
Missing Data Sampling again from $N(\mu, \sigma^2)$, suppose now that some of the drawn values are missing: only $x_1, \dots, x_m$ are observed, while $x_{m+1}, \dots, x_N$ are missing.
E-Step Let $\mu^{(t)}$ and $(\sigma^2)^{(t)}$ be the estimated parameters at the start of the $t$-th iteration. The E-step replaces the sufficient statistics of each missing value by their conditional expectations: $E[x_i] = \mu^{(t)}$ and $E[x_i^2] = (\sigma^2)^{(t)} + (\mu^{(t)})^2$ for every missing $x_i$.
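A minimal sketch of the resulting EM loop for this appendix example (the function name and initialization are mine): the E step fills in the expected sufficient statistics above, and the M step recomputes the usual maximum-likelihood formulas from the completed sums.

```python
import numpy as np

def em_missing_normal(x_obs, n_missing, n_iter=50):
    """EM for N(mu, sigma^2) when n_missing of the N draws are unobserved."""
    N = len(x_obs) + n_missing
    mu, var = x_obs.mean(), x_obs.var()  # initial guess from the observed part
    for _ in range(n_iter):
        # E step: expected sufficient statistics of the complete sample;
        # each missing value contributes E[x] = mu and E[x^2] = var + mu^2.
        s1 = x_obs.sum() + n_missing * mu
        s2 = (x_obs ** 2).sum() + n_missing * (var + mu ** 2)
        # M step: maximum-likelihood estimates from the expected statistics.
        mu = s1 / N
        var = s2 / N - mu ** 2
    return mu, var
```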