This presentation introduces a Newton method for the ICA mixture model, used to model sensor array data with multiple independent sources. The method is computationally efficient and handles non-stationary source activity.
Newton Method for the ICA Mixture Model
Jason A. Palmer¹, Scott Makeig¹, Ken Kreutz-Delgado², Bhaskar D. Rao²
¹ Swartz Center for Computational Neuroscience, ² Dept. of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA
Introduction • Want to model sensor array data with multiple independent sources — ICA • Non-stationary source activity — mixture model • Want the adaptation to be computationally efficient — Newton method
Outline • ICA mixture model • Basic Newton method • Positive definiteness of Hessian when model source densities are true source densities • Newton for ICA mixture model • Example applications to analysis of EEG
ICA Mixture Model—toy example • 3 models in two dimensions, 500 points per model (a data-generation sketch follows below) • Newton method converges in fewer than 200 iterations; natural gradient fails to converge and has difficulty on poorly conditioned models
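For readers who want to reproduce the toy setup, here is a minimal data-generation sketch in Python. This is our illustration, not the authors' code (their Matlab implementation, linked on the final slide, is the reference); the specific parameter values are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    M, n, N = 3, 2, 500                      # 3 models, 2 dimensions, 500 points each
    X = []
    for h in range(M):
        A = rng.standard_normal((n, n))      # random mixing matrix for model h
        c = 5.0 * rng.standard_normal(n)     # model center
        S = rng.laplace(size=(n, N))         # super-Gaussian (Laplacian) sources
        X.append(A @ S + c[:, None])
    X = np.concatenate(X, axis=1)            # (n, M*N) sensor array data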
ICA Mixture Model • Want to model observations x(t), t = 1, …, N, with different models "active" at different times • Bayesian linear mixture model over models h = 1, …, M: p(x(t)) = Σ_h γ_h p_h(x(t)), with model priors γ_h • Conditionally linear given the model h: x(t) = A_h s(t) + c_h, with mixing matrix A_h, independent sources s(t), and model center c_h • Samples are modeled as independent in time: p(x(1), …, x(N)) = Π_t p(x(t))
Source Density Mixture Model • Each source density mixture component has unknown location, scale, and shape: q_i(s) = Σ_j α_ij (1/β_ij) q_ij((s − μ_ij)/β_ij) (a sketch of one such density follows below) • Generalizes the Gaussian mixture model: components can be more peaked, with heavier tails
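To make the location/scale/shape parameterization concrete, here is a sketch of a mixture of generalized Gaussians in Python. The exact density family used in the paper may differ; this particular form is an assumption for illustration.

    import numpy as np
    from scipy.special import gamma

    def gg_pdf(s, mu, beta, rho):
        # generalized Gaussian: location mu, scale beta, shape rho
        # (rho < 2 gives densities more peaked and heavier-tailed than Gaussian)
        z = np.abs((s - mu) / beta)
        return rho / (2 * beta * gamma(1.0 / rho)) * np.exp(-z**rho)

    def source_density(s, alpha, mu, beta, rho):
        # mixture: sum_j alpha_j * gg_pdf(s; mu_j, beta_j, rho_j)
        return sum(a * gg_pdf(s, m, b, r)
                   for a, m, b, r in zip(alpha, mu, beta, rho))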
ICA Mixture Model—Invariances • The complete set of parameters to be estimated is: γ_h, W_h, c_h, α_hij, μ_hij, β_hij for h = 1, …, M, i = 1, …, n, j = 1, …, m • Invariances: scaling a row of W_h can be absorbed into the source density scales and locations, and shifting a model center c_h can be absorbed into the source density locations, so the W row norms and model centers can be fixed without loss of generality
Basic ICA Newton Method • Transform the gradient (1st derivative) of the cost function using the inverse Hessian (2nd derivative) • Cost function is the data log likelihood: L(W) = log |det W| + (1/N) Σ_t Σ_i log q_i(s_i(t)), with source estimates s(t) = W x(t) • Gradient: ∇L = W^(−T) − E[f(s) x^T], where f_i = −(log q_i)′ is the score function • Natural gradient (positive definite transform): ∇L W^T W = (I − E[f(s) s^T]) W (a minimal update step is sketched below)
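For contrast with the Newton step derived on the following slides, here is a minimal natural-gradient update in Python. It assumes a fixed tanh score function f(s) = tanh(s) and zero-mean data, rather than the adaptive source density mixture of the paper.

    import numpy as np

    def natural_gradient_step(W, X, lrate=0.5):
        # one update: W += lrate * (I - E[f(s) s^T]) W, with f = tanh
        n, N = X.shape
        S = W @ X                              # source estimates s(t) = W x(t)
        G = np.eye(n) - (np.tanh(S) @ S.T) / N # natural gradient in W-coordinates
        return W + lrate * G @ W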
Newton Method – Hessian • Take the derivative of the (i,j)th element of the gradient with respect to the (k,l)th element of W: ∂²L / ∂W_ij ∂W_kl = −(W^(−1))_jk (W^(−1))_li − δ_ik E[f_i′(s_i) x_j x_l] • This defines a linear transform H acting on matrices B • In matrix form, this is: H(B) = −W^(−T) B^T W^(−T) − E[diag(f′(s)) B x x^T]
Newton Method – Hessian • To invert: rewrite the Hessian transformation in terms of the source estimates s = W x, writing B = C W so that B x = C s: H(B) = −(C^T + E[diag(f′(s)) C s s^T]) W^(−T) • Define η_i = E[f_i′(s_i)], σ_i² = E[s_i²], κ_i = E[f_i′(s_i) s_i²] • Want to solve the linear equation H(B) = −∇L for the Newton direction B
Newton Method – Hessian • The Hessian transformation can be simplified using source independence and zero mean: for i ≠ j, only the elements C_ij and C_ji interact, since E[f_i′(s_i) s_k s_j] vanishes unless j = k • This leads to a 2×2 block diagonal form: η_i σ_j² C_ij + C_ji = Ĝ_ij and C_ij + η_j σ_i² C_ji = Ĝ_ji for i ≠ j, with (1 + κ_i) C_ii = Ĝ_ii on the diagonal, where Ĝ = I − E[f(s) s^T] is the gradient in source coordinates
Newton Direction • Invert the Hessian transformation and evaluate at the gradient • Solving each 2×2 block leads to the following equations: C_ij = (η_j σ_i² Ĝ_ij − Ĝ_ji) / (η_i η_j σ_i² σ_j² − 1), with C_ji given by the symmetric expression, and C_ii = Ĝ_ii / (1 + κ_i) • Calculate the Newton direction: ΔW = C W (a sketch of the full computation follows below)
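Putting the block solve together, here is a sketch in Python, under the same assumptions as the earlier snippet (fixed tanh score, zero-mean data); the paper's actual updates use the adaptive source density mixture and per-model statistics.

    import numpy as np

    def newton_direction(W, X):
        # solve the 2x2 block-diagonal Hessian system and return dW = C @ W
        n, N = X.shape
        S = W @ X
        fS, fpS = np.tanh(S), 1.0 - np.tanh(S)**2     # f(s) and f'(s)
        Ghat = np.eye(n) - (fS @ S.T) / N             # gradient in source coordinates
        eta = fpS.mean(axis=1)                        # eta_i   = E[f_i'(s_i)]
        sig2 = (S**2).mean(axis=1)                    # sigma_i^2 = E[s_i^2]
        kappa = (fpS * S**2).mean(axis=1)             # kappa_i = E[f_i'(s_i) s_i^2]
        C = np.zeros((n, n))
        for i in range(n):
            C[i, i] = Ghat[i, i] / (1.0 + kappa[i])
            for j in range(i + 1, n):
                d = eta[i] * eta[j] * sig2[i] * sig2[j] - 1.0
                C[i, j] = (eta[j] * sig2[i] * Ghat[i, j] - Ghat[j, i]) / d
                C[j, i] = (eta[i] * sig2[j] * Ghat[j, i] - Ghat[i, j]) / d
        return C @ W

    # usage: W = W + newton_direction(W, X)   (step size 1 near convergence)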
Positive Definiteness of Hessian • Conditions for positive definiteness (the standard ICA stability conditions): 1) η_i > 0 2) η_i η_j σ_i² σ_j² > 1 for i ≠ j 3) κ_i + 1 > 0 • Always true when the model source densities match the true source densities (see the argument sketched below)
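The matched-density claim follows from integration by parts plus Cauchy-Schwarz; here is a reconstruction of the argument (our sketch, not verbatim from the slides):

    % assuming q_i is the true density of s_i and f_i = -(log q_i)',
    % integration by parts gives:
    \begin{align*}
      \mathbb{E}[f_i(s_i)\, s_i] &= 1, &
      \mathbb{E}[f_i'(s_i)] &= \mathbb{E}[f_i(s_i)^2], &
      \mathbb{E}[f_i'(s_i)\, s_i^2] &= \mathbb{E}[f_i(s_i)^2 s_i^2] - 2.
    \end{align*}
    % then by Cauchy-Schwarz:
    %   eta_i = E[f_i^2] > 0                                      (condition 1)
    %   eta_i sigma_i^2 = E[f_i^2] E[s_i^2] >= E[f_i s_i]^2 = 1   (condition 2)
    %   kappa_i + 1 = E[f_i^2 s_i^2] - 1 >= E[f_i s_i]^2 - 1 = 0  (condition 3)
    % with strict inequalities for non-Gaussian sources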
Newton for ICA Mixture Model • A similar derivation applies to the ICA mixture model: each model h gets the same 2×2 block solve, with the expectations replaced by posterior-weighted averages over the samples (sketched below)
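Concretely, the per-model expectations can be computed with posterior "responsibilities". Here is a minimal sketch of the standard mixture-model E-step; the weighting scheme is a standard assumption, not copied from the paper.

    import numpy as np

    def model_posteriors(logp, log_gamma):
        # logp: (M, N) per-model log-likelihoods log p_h(x(t))
        # log_gamma: (M,) log model priors
        a = log_gamma[:, None] + logp
        a -= a.max(axis=0)                 # stabilize the log-sum-exp
        r = np.exp(a)
        return r / r.sum(axis=0)           # responsibilities r_h(t)

    # each model h then runs the Newton update with expectations replaced by
    # weighted averages, e.g. E_h[g(s)] = sum_t r_h(t) g(s_h(t)) / sum_t r_h(t)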
Convergence Rates • Convergence is substantially faster than natural gradient, and works with step size 1 • Requires a correct source density model • [Plots: log likelihood vs. iteration for Newton and natural gradient]
Segmentation of EEG experiment trials • [Figures: model activity over trials and time for 3-model and 4-model fits; log likelihood vs. iteration]
Applications to EEG—Epilepsy • [Figures: 1-model vs. 5-model fits; log likelihood over time, and log likelihood difference from the single model over time]
Conclusion • We applied the method of Amari, Cardoso, and Laheld to formulate a Newton method for the ICA mixture model • Arbitrary source densities are modeled with a non-Gaussian source mixture model • Non-stationarity is modeled with the ICA mixture model (multiple mixing matrices learned) • It works! The Newton method is substantially faster (superlinear convergence), and can converge when natural gradient fails
Code • Matlab code is available • Generates toy mixture model data for testing • Full method implemented: mixture sources, mixture ICA, Newton • An extended version of the paper, with the derivation of the mixture model Newton updates, is in preparation • Download from: http://sccn.ucsd.edu/~jason
Acknowledgements • Thanks to Scott Makeig, Howard Poizner, Julie Onton, Ruey-Song Hwang, Rey Ramirez, Diane Whitmer, and Allen Gruber for collecting and consulting on EEG data • Thanks to Jerry Swartz for founding and providing ongoing support to the Swartz Center for Computational Neuroscience • Thanks for your attention!