
by Nizar Bouguila and Djemel Ziou


Presentation Transcript


  1. High-Dimensional Unsupervised Selection and Estimation of a Finite Generalized Dirichlet Mixture Model Based on Minimum Message Length by Nizar Bouguila and Djemel Ziou Discussion led by Qi An Duke University Machine Learning Group

  2. Outline
  • Introduction
  • The generalized Dirichlet mixture
  • The minimum message length (MML) criterion
  • Fisher information matrix and priors
  • Density estimation and model selection
  • Experimental results
  • Conclusions

  3. Introduction
  • How can we determine the number of components in a mixture model for high-dimensional data?
  • Stochastic and resampling methods (slow)
    • Implementation of model selection criteria in a fully Bayesian way
  • Deterministic methods (fast)
    • Approximate Bayesian criteria
    • Information/coding theory concepts
      • Minimum message length (MML)
      • Akaike's information criterion (AIC)

  4. The generalized Dirichlet distribution
  • A d-dimensional generalized Dirichlet distribution is defined by

  $p(\mathbf{X} \mid \vec\alpha, \vec\beta) = \prod_{i=1}^{d} \frac{\Gamma(\alpha_i + \beta_i)}{\Gamma(\alpha_i)\Gamma(\beta_i)}\, x_i^{\alpha_i - 1} \Bigl(1 - \sum_{j=1}^{i} x_j\Bigr)^{\gamma_i}$

  where $x_i > 0$, $\sum_{i=1}^{d} x_i < 1$, $\alpha_i > 0$, $\beta_i > 0$, $\gamma_i = \beta_i - \alpha_{i+1} - \beta_{i+1}$ for $i = 1, \dots, d-1$, and $\gamma_d = \beta_d - 1$. It reduces to the Dirichlet distribution when $\beta_i = \alpha_{i+1} + \beta_{i+1}$.
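The density above can be evaluated directly in log space. Below is a minimal sketch (the function name `gdd_logpdf` is mine, not from the paper) using SciPy; as a sanity check, with the reduction condition $\beta_i = \alpha_{i+1} + \beta_{i+1}$ it agrees with an ordinary Dirichlet density.

```python
import numpy as np
from scipy.special import betaln
from scipy.stats import dirichlet

def gdd_logpdf(x, alpha, beta):
    """Log-density of the generalized Dirichlet (Connor-Mosimann) distribution.

    x: length-d vector with x_i > 0 and sum(x) < 1.
    alpha, beta: length-d positive shape parameters.
    """
    x, alpha, beta = (np.asarray(a, dtype=float) for a in (x, alpha, beta))
    gamma = beta.copy()
    gamma[:-1] -= alpha[1:] + beta[1:]   # gamma_i = beta_i - alpha_{i+1} - beta_{i+1}
    gamma[-1] -= 1.0                     # gamma_d = beta_d - 1
    tail = 1.0 - np.cumsum(x)            # 1 - sum_{j<=i} x_j
    return np.sum(-betaln(alpha, beta)
                  + (alpha - 1.0) * np.log(x)
                  + gamma * np.log(tail))

# Sanity check: with beta_i = alpha_{i+1} + beta_{i+1} the GDD collapses
# to a Dirichlet(alpha_1, ..., alpha_d, beta_d) on the simplex.
alpha = np.array([2.0, 3.0])
beta = np.array([3.0 + 1.5, 1.5])        # beta_1 = alpha_2 + beta_2
lp = gdd_logpdf([0.2, 0.3], alpha, beta)
ref = dirichlet.logpdf([0.2, 0.3, 0.5], [2.0, 3.0, 1.5])
print(lp, ref)
```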

  5. The generalized Dirichlet distribution
  • For the generalized Dirichlet distribution, the mean is $E(X_i) = \frac{\alpha_i}{\alpha_i + \beta_i} \prod_{j=1}^{i-1} \frac{\beta_j}{\alpha_j + \beta_j}$.
  • The GDD has a more general covariance structure than the DD, and it is conjugate to the multinomial distribution.
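The mean can be checked by Monte Carlo using the GDD's stick-breaking construction ($Z_i \sim \mathrm{Beta}(\alpha_i, \beta_i)$ independently, $X_i = Z_i \prod_{j<i}(1-Z_j)$); the helper name `gdd_sample` is mine, a sketch rather than code from the paper.

```python
import numpy as np

def gdd_sample(alpha, beta, n, rng):
    """Sample from the generalized Dirichlet via stick-breaking:
    Z_i ~ Beta(alpha_i, beta_i) independently, X_i = Z_i * prod_{j<i} (1 - Z_j)."""
    z = rng.beta(alpha, beta, size=(n, len(alpha)))
    stick = np.cumprod(1.0 - z, axis=1)
    x = z.copy()
    x[:, 1:] *= stick[:, :-1]
    return x

rng = np.random.default_rng(0)
alpha = np.array([2.0, 5.0, 1.5])
beta = np.array([4.0, 3.0, 2.0])
# closed-form mean: E[X_i] = alpha_i/(alpha_i+beta_i) * prod_{j<i} beta_j/(alpha_j+beta_j)
frac = beta / (alpha + beta)
mean = alpha / (alpha + beta) * np.concatenate(([1.0], np.cumprod(frac)[:-1]))
samples = gdd_sample(alpha, beta, 200_000, rng)
err = np.abs(samples.mean(axis=0) - mean).max()
print(mean, err)   # err should be small Monte Carlo noise
```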

  6. GDD vs. Gaussian
  • The GDD has fewer parameters to estimate ($2d$ versus $d + d(d+1)/2$ for a full-covariance Gaussian), so the estimation can be more accurate.
  • The GDD is defined on the support $[0,1]$ and can be extended to a compact support $[A,B]$. This is more appropriate for the nature of the data.
  • Beta distribution: $p(u) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} u^{\alpha-1} (1-u)^{\beta-1}$
  • Beta type-II distribution: $p(v) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} v^{\alpha-1} (1+v)^{-(\alpha+\beta)}$
  They are equal if we set $u = v/(1+v)$.
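The Beta / Beta type-II correspondence can be verified numerically with the change-of-variables formula $f_U(u) = f_V(v)\,|dv/du|$ where $v = u/(1-u)$; SciPy ships the Beta type-II as `betaprime`.

```python
import numpy as np
from scipy.stats import beta, betaprime

# If V ~ Beta type-II (beta prime) with shapes (a, b), then U = V/(1+V)
# is Beta(a, b).  Check pointwise via the change-of-variables formula.
a, b = 2.5, 4.0
u = np.linspace(0.05, 0.95, 19)
v = u / (1.0 - u)
lhs = beta.pdf(u, a, b)
rhs = betaprime.pdf(v, a, b) / (1.0 - u) ** 2   # |dv/du| = 1/(1-u)^2
print(np.abs(lhs - rhs).max())   # agreement up to floating-point error
```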

  7. A GDD mixture model
  • A generalized Dirichlet mixture model with M components:

  $p(\mathbf{X} \mid \Theta) = \sum_{j=1}^{M} p(j)\, p(\mathbf{X} \mid \vec\alpha_j)$

  where $p(\mathbf{X} \mid \vec\alpha_j)$ takes the form of the GDD and the mixing weights $p(j)$ are nonnegative and sum to one.
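The mixture density is just a log-sum-exp over weighted component log-densities. A minimal sketch follows; to stay short it uses one-dimensional Beta components as a stand-in for the GDD (my simplification, not the paper's model), but `mixture_logpdf` is generic in the component density.

```python
import numpy as np
from scipy.stats import beta
from scipy.special import logsumexp

def mixture_logpdf(x, weights, params, comp_logpdf):
    """log p(x | Theta) = logsumexp_j [ log p(j) + log p(x | theta_j) ]."""
    per_comp = np.stack([np.log(w) + comp_logpdf(x, th)
                         for w, th in zip(weights, params)])
    return logsumexp(per_comp, axis=0)

# Illustrative stand-in: a two-component Beta mixture on (0, 1).
weights = np.array([0.6, 0.4])
params = [(2.0, 5.0), (8.0, 2.0)]
x = np.linspace(0.01, 0.99, 99)
ll = mixture_logpdf(x, weights, params,
                    lambda x, th: beta.logpdf(x, *th))
mass = np.exp(ll).sum() * (x[1] - x[0])   # crude Riemann sum, ~1
print(mass)
```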

  8. The MML criterion
  • The message length is defined as minus the logarithm of the posterior probability.
  • After placing an explicit prior $h(\Theta)$ over the parameters, the message length for a mixture of distributions is given by

  $\mathrm{MessLen} \approx -\log h(\Theta) - \log p(\mathcal{X} \mid \Theta) + \frac{1}{2}\log |F(\Theta)| + \frac{N_p}{2}\bigl(1 + \log \kappa_{N_p}\bigr)$

  where the terms correspond to the prior, the likelihood, the Fisher information, and the optimal quantization constant ($N_p$ is the number of free parameters).
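The four terms can be assembled mechanically once the pieces are known. Below is a hedged sketch on a toy one-parameter model (a Bernoulli proportion with a uniform prior, my choice for brevity, not the paper's mixture), where $F(p) = n/(p(1-p))$ and $\kappa_1 = 1/12$.

```python
import numpy as np

def mml87_message_length(neg_log_prior, neg_log_lik, log_det_fisher, k):
    """MML87 two-part message length (in nats):
    -log h(theta) - log p(x|theta) + 0.5*log|F(theta)| + k/2*(1 + log kappa_k)."""
    kappa = {1: 1.0 / 12.0}[k]   # optimal quantization constant for k = 1
    return (neg_log_prior + neg_log_lik
            + 0.5 * log_det_fisher + 0.5 * k * (1.0 + np.log(kappa)))

# Toy example: n Bernoulli trials with s successes, uniform prior on p.
n, s = 100, 37
p = s / n                                    # ML estimate
neg_log_lik = -(s * np.log(p) + (n - s) * np.log(1.0 - p))
log_det_fisher = np.log(n / (p * (1.0 - p)))   # F(p) = n / (p(1-p))
msg_len = mml87_message_length(0.0, neg_log_lik, log_det_fisher, k=1)
print(msg_len)
```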

  9. Fisher information matrix
  • The Fisher information matrix is the expected value of the Hessian of minus the logarithm of the likelihood:

  $F(\Theta) = -E\!\left[\frac{\partial^2 \log p(\mathcal{X} \mid \Theta)}{\partial \Theta\, \partial \Theta^T}\right]$
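When the analytic Hessian is tedious, it can be checked by finite differences of the negative log-likelihood. A small sketch on the same Bernoulli toy model (my illustration, not the paper's GDD derivation), where the observed information at the ML estimate coincides with $F(\hat p) = n/(\hat p(1-\hat p))$:

```python
import numpy as np

def neg_log_lik(p, s, n):
    """Minus log-likelihood of s successes in n Bernoulli trials."""
    return -(s * np.log(p) + (n - s) * np.log(1.0 - p))

def observed_info(p, s, n, h=1e-4):
    """Second derivative of the negative log-likelihood by central differences."""
    f = lambda q: neg_log_lik(q, s, n)
    return (f(p + h) - 2.0 * f(p) + f(p - h)) / h**2

n, s = 200, 80
p_hat = s / n
expected = n / (p_hat * (1.0 - p_hat))   # analytic Fisher information at p_hat
obs = observed_info(p_hat, s, n)
print(obs, expected)
```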

  10. Prior distribution
  • Assume independence between the different components: $h(\Theta) = h(\vec P)\,\prod_{j=1}^{M} h(\vec\alpha_j)$
  • Mixture weights $\vec P$, GDD parameters $\vec\alpha_j$: place a Dirichlet distribution and a generalized Dirichlet distribution on $\vec P$ and $\vec\alpha_j$, respectively, with parameters set to 1.
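Setting all Dirichlet parameters to 1 makes the prior over the mixing weights uniform on the simplex, which is easy to see by sampling:

```python
import numpy as np

rng = np.random.default_rng(1)
M = 4
# Dirichlet(1, ..., 1) over the mixing weights is uniform on the simplex:
# every draw is a valid weight vector, with no region favored.
P = rng.dirichlet(np.ones(M), size=5)
sums = P.sum(axis=1)
print(P)
print(sums)   # each row sums to 1
```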

  11. Message length
  • After obtaining the Fisher information and specifying the prior distribution, the message length is obtained by substituting both into the general MML expression.

  12. Estimation and selection algorithm
  • The authors use an EM algorithm to estimate the mixture parameters.
  • To reduce the computational cost and the risk of local maxima, they implement a fairly sophisticated initialization algorithm.
  • The whole algorithm is summarized on the next slide.
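The E-step/M-step structure can be sketched generically. Below is a minimal EM skeleton, assuming (my simplification) one-dimensional Beta components in place of the GDD and a numerical M-step via Nelder-Mead, since the weighted-likelihood maximization has no closed form; none of the names come from the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp
from scipy.stats import beta

def em_beta_mixture(x, M, n_iter=20, seed=0):
    """Generic EM skeleton for a mixture whose M-step has no closed form.
    Beta components stand in for the GDD; shapes are optimized in log space
    to keep them positive."""
    rng = np.random.default_rng(seed)
    w = np.full(M, 1.0 / M)
    theta = rng.uniform(0.5, 5.0, size=(M, 2))   # (a_j, b_j) per component
    for _ in range(n_iter):
        # E-step: responsibilities r_ij ∝ p(j) p(x_i | theta_j)
        log_r = np.log(w) + np.stack(
            [beta.logpdf(x, a, b) for a, b in theta], axis=1)
        log_r -= logsumexp(log_r, axis=1, keepdims=True)
        r = np.exp(log_r)
        # M-step: weights in closed form, shapes by numerical optimization
        w = r.mean(axis=0)
        for j in range(M):
            obj = lambda t: -np.sum(r[:, j] * beta.logpdf(x, np.exp(t[0]), np.exp(t[1])))
            res = minimize(obj, np.log(theta[j]), method="Nelder-Mead")
            theta[j] = np.exp(res.x)
    return w, theta

rng = np.random.default_rng(2)
x = np.concatenate([rng.beta(2, 8, 300), rng.beta(9, 2, 200)])  # two clear modes
w, theta = em_beta_mixture(x, M=2)
print(np.sort(w), theta)
```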

  13. Experimental results The correct numbers of components are 5, 6, and 7, respectively

  14. Experimental results

  15. Experimental results
  • Web mining:
    • Train with multiple classes of labels
    • Use the model to predict the labels of test samples
    • Use the frequencies of the top 200 words as features

  16. Conclusions
  • An MML-based criterion is proposed to select the number of components in generalized Dirichlet mixtures.
  • The full dimensionality of the data is used.
  • Generalized Dirichlet mixtures allow more modeling flexibility than mixtures of Gaussians.
  • The results indicate clearly that the MML and LEC model selection methods outperform the other methods.
