Shuang-Hong Yang, Hongyuan Zha, Bao-Gang Hu NIPS2009

Dirichlet-Bernoulli Alignment: A Generative Model for Multi-Class Multi-Label Multi-Instance Corpora • Shuang-Hong Yang, Hongyuan Zha, Bao-Gang Hu • NIPS 2009 • Presented by Haojun Chen • Some content is taken from the authors' paper and poster

Outline • Introduction • Dirichlet-Bernoulli Alignment (DBA) Model • Model Inference and Prediction • Experiments • Conclusion

Introduction • This paper considers the multi-class, multi-label, multi-instance classification (M3C) problem. • Goal: infer class labels for both a pattern and its instances. • Figure (adopted from the authors' poster): Pattern (e.g. document), Class (e.g. topic), Instance (e.g. paragraph), Feature (e.g. word)

Problem Formalization • For a multi-class, multi-label, multi-instance corpus, we define: • the set of input patterns • the corresponding labels • the dictionary of features • the set of instances in pattern n • each instance: a bag of discrete features • the class label
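The definitions above can be pictured as a small data structure. The following is a minimal sketch, not the authors' notation: the class names `Instance` and `Pattern` and the integer encoding of features and classes are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

@dataclass
class Instance:
    # A bag of discrete features, stored as indices into the feature dictionary.
    features: List[int]
    # Single class label of the instance; None while it is unobserved.
    label: Optional[int] = None

@dataclass
class Pattern:
    # A pattern is a bag of instances (e.g. a document is a bag of paragraphs).
    instances: List[Instance]
    # A pattern may carry several class labels at once.
    labels: Set[int] = field(default_factory=set)

# Hypothetical toy corpus: one pattern with two instances and two labels.
corpus = [
    Pattern(
        instances=[Instance(features=[0, 3, 3]), Instance(features=[1, 2])],
        labels={0, 2},
    )
]
```

In this layout the pattern-level labels are observed for training data, while the instance-level labels stay empty until they are inferred.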

Basic Assumptions • Assumption 1 [Exchangeability]: A corpus is a bag of patterns, and each pattern is a bag of instances. • Assumption 2 [Distinguishability]: Each pattern can belong to several classes, but each instance belongs to a single class. Tree Structure Assumption

Dirichlet-Bernoulli Alignment (DBA) Model (1) DBA generative process: • Sample pattern-level class mixture

Dirichlet-Bernoulli Alignment (DBA) Model (2) • For each of the M instances in X • Choose instance-level class label • Generate the instance according to observation model

Dirichlet-Bernoulli Alignment (DBA) Model (3) • Generate pattern-level label where
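The three generative steps can be sketched in code. This is a sketch under stated assumptions rather than the authors' exact model, since the slide formulas are not preserved here. It assumes a Dirichlet prior on the class mixture, a categorical draw for each instance label, a multinomial bag-of-features observation model, and Bernoulli pattern labels whose probability is a logistic function of the empirical average of the instance labels. All names (`alpha`, `beta`, `weights`, `bias`) are hypothetical.

```python
import math
import random

def sample_dirichlet(alpha, rng):
    # Draw from a Dirichlet by normalizing independent Gamma variates.
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_pattern(alpha, beta, weights, bias, num_instances, instance_len, rng):
    num_classes = len(alpha)
    # Step 1: pattern-level class mixture (assumed Dirichlet).
    theta = sample_dirichlet(alpha, rng)
    z, instances = [], []
    for _ in range(num_instances):
        # Step 2a: instance-level class label, one class per instance.
        c = rng.choices(range(num_classes), weights=theta)[0]
        z.append(c)
        # Step 2b: observation model (assumed multinomial over the dictionary).
        words = rng.choices(range(len(beta[c])), weights=beta[c], k=instance_len)
        instances.append(words)
    # Step 3: pattern-level labels from the empirical class frequencies
    # (assumed logistic link; each class label is an independent Bernoulli).
    z_bar = [z.count(c) / num_instances for c in range(num_classes)]
    labels = []
    for c in range(num_classes):
        p = 1.0 / (1.0 + math.exp(-(weights[c] * z_bar[c] + bias[c])))
        labels.append(1 if rng.random() < p else 0)
    return theta, z, instances, labels

rng = random.Random(0)
alpha = [0.5, 0.5, 0.5]                      # 3 classes
beta = [[0.7, 0.1, 0.1, 0.1],                # per-class feature distributions
        [0.1, 0.7, 0.1, 0.1],
        [0.1, 0.1, 0.4, 0.4]]
theta, z, instances, labels = sample_pattern(
    alpha, beta, weights=[8.0] * 3, bias=[-2.0] * 3,
    num_instances=5, instance_len=6, rng=rng)
```

Because the pattern labels depend on the instance labels rather than directly on the mixture, a class that no instance is assigned to receives a low label probability in this sketch.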

Model Inference and Prediction • Parameter Estimation (MLE) • Variational Approximation • Prediction • Pattern Classification: • Instance Disambiguation:
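The two prediction tasks can be illustrated as follows. This is a hedged sketch, not the formulas from the slide: it assumes that variational inference yields a matrix `phi` of approximate posterior probabilities over classes for each instance, and that pattern labels are obtained by thresholding a logistic function of the averaged posteriors. The parameters `weights`, `bias`, and `threshold` are hypothetical.

```python
import math

def disambiguate_instances(phi):
    # Instance disambiguation: pick the most probable class for each instance.
    return [max(range(len(row)), key=row.__getitem__) for row in phi]

def classify_pattern(phi, weights, bias, threshold=0.5):
    # Pattern classification: average the instance posteriors, then
    # threshold a per-class Bernoulli probability (assumed logistic link).
    num_classes = len(phi[0])
    phi_bar = [sum(row[c] for row in phi) / len(phi) for c in range(num_classes)]
    probs = [1.0 / (1.0 + math.exp(-(weights[c] * phi_bar[c] + bias[c])))
             for c in range(num_classes)]
    return [c for c, p in enumerate(probs) if p >= threshold]

# Toy posterior for a pattern with three instances and two classes.
phi = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
instance_labels = disambiguate_instances(phi)
pattern_labels = classify_pattern(phi, weights=[10.0, 10.0], bias=[-5.0, -5.0])
```

A pattern can receive several labels here because every class is thresholded independently, while each instance still receives exactly one class.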

Why The Name? • Lower bound Fourth term

Experiments 1 • Text classification • ModApte split of the Reuters-21578 text collection: 10788 documents, 10 classes • Each paragraph of a document is represented with the vector space model • Documents with empty label sets or length < 20 are eliminated, leaving 1879 documents, of which 721 (38.4%) have multiple labels • Compared with multinomial-event-model-based Naive Bayes (MNB) and two state-of-the-art multi-instance multi-label classifiers (MIMLSVM and MIMLBOOST)
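The document filter described above can be sketched as a simple predicate. This is an illustration only: the slide does not say in which unit length is measured, so counting tokens is an assumption, and the function names are hypothetical.

```python
def keep_document(labels, tokens, min_length=20):
    # Keep a document only if it has at least one label and is long enough.
    # Assumption: "length" is counted in tokens.
    return len(labels) > 0 and len(tokens) >= min_length

def multi_label_percentage(label_sets):
    # Percentage of documents that carry more than one label.
    multi = sum(1 for labels in label_sets if len(labels) > 1)
    return 100.0 * multi / len(label_sets)

# Consistency check on the reported counts: 721 of 1879 documents.
reported = round(100.0 * 721 / 1879, 1)
```

The value of `reported` agrees with the 38.4% stated on the slide.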

Experiments 2 • Named entity disambiguation • Yahoo! Answers query log crawled in 2008: 101 classes, 216563 questions • 300 entities for training and 100 for testing • Compared with Multinomial Naive Bayes with TF (MNBTF) or TFIDF (MNBTFIDF) attributes, as well as a linear SVM classifier with TF (SVMTF) or TFIDF (SVMTFIDF) attributes

Conclusion • A Dirichlet-Bernoulli Alignment model is proposed and shown experimentally to be useful for both pattern classification and instance disambiguation
