Hierarchical Topic Models and the Nested Chinese Restaurant Process

Blei, Griffiths, Jordan, Tenenbaum. Presented by Rodrigo de Salvo Braz.

Document classification
• One-class approach: one topic per document, with words generated according to the topic.
• For example, a Naive Bayes model.

Document classification
• It is more realistic to assume more than one topic per document.
• Generative model: pick a mixture distribution over K topics and generate words from it.

Document classification
• Even more realistic: topics may be organized in a hierarchy (not independent).
• Pick a path from root to leaf in a tree; each node is a topic; sample from the mixture of topics on that path.

Dirichlet distribution (DD)
• Distribution over distribution vectors p of dimension K: $P(p; u) = \frac{1}{Z(u)} \prod_{i=1}^{K} p_i^{u_i - 1}$
• The parameters u act as a prior distribution (“previous observations”);
• A symmetric Dirichlet distribution assumes a uniform prior distribution ($u_i = u_j$ for any i, j).

Latent Dirichlet Allocation (LDA)
• Generative model of multiple-topic documents;
• Generate a mixture distribution over topics using a Dirichlet distribution;
• For each word, pick a topic according to this mixture and generate the word from that topic's word distribution.
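The LDA steps above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the names `sample_dirichlet`, `generate_document`, `topic_word` and `alpha` are made up for the example, and the topic word distributions are assumed to be given rather than learned.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topic_word, alpha, n_words, rng):
    """Generate one document under an LDA-style process.

    topic_word: list of K word distributions, each a list over the vocabulary
                (assumed given for this sketch).
    alpha:      Dirichlet parameters for the per-document topic mixture.
    """
    k = len(topic_word)
    vocab_size = len(topic_word[0])
    theta = sample_dirichlet(alpha, rng)  # per-document mixture over topics
    topics, words = [], []
    for _ in range(n_words):
        z = rng.choices(range(k), weights=theta)[0]                  # pick a topic
        w = rng.choices(range(vocab_size), weights=topic_word[z])[0]  # pick a word from it
        topics.append(z)
        words.append(w)
    return theta, topics, words

# Example: two topics over a 4-word vocabulary with disjoint support.
rng = random.Random(0)
topic_word = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
theta, topics, words = generate_document(topic_word, [1.0, 1.0], 20, rng)
```

Each word gets its own topic draw, which is what lets a single document mix several topics.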

Latent Dirichlet Allocation (LDA)
[Figure: graphical model. Labels: DD hyperparameter, topics, K, words w, topic distribution, W.]

Chinese Restaurant Process (CRP)
[Figure: 1 out of 9 customers.]

Chinese Restaurant Process (CRP)
[Figure: 9 out of 9 customers. Data point (a distribution itself) sampled.]
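For reference, the standard CRP seating rule is: a new customer joins an existing table with probability proportional to the number of customers already there, or opens a new table with probability proportional to a concentration parameter. A minimal sketch, where the function name `crp_seating` and the parameter name `gamma` are choices made for this example:

```python
import random

def crp_seating(n_customers, gamma, rng):
    """Seat customers one at a time; return each customer's table and the table sizes."""
    counts = []        # counts[k] = number of customers at table k
    assignments = []
    for _ in range(n_customers):
        # Existing table k has weight counts[k]; a new table has weight gamma.
        weights = counts + [gamma]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(counts):
            counts.append(1)       # open a new table
        else:
            counts[table] += 1
        assignments.append(table)
    return assignments, counts

# Seat 9 customers, as in the slides.
assignments, counts = crp_seating(9, gamma=1.0, rng=random.Random(0))
```

The number of tables is not fixed in advance; it grows with the number of customers, more quickly for larger `gamma`.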

Species Sampling Mixture
• Generative model of multiple-topic documents;
• Generate a mixture distribution over topics using a CRP prior;
• For each word, pick a topic according to this mixture and generate the word from that topic's word distribution.

Species Sampling Mixture
[Figure: graphical model. Labels: CRP hyperparameter, topics, K, words w, topic distribution, W.]

Nested CRP
[Figure: customers 1 to 6 seated at tables, shown in three rows.]
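In a nested CRP, each table points to another restaurant, so a customer's sequence of table choices traces a path down a tree. A minimal sketch under stated assumptions: the tree has a fixed depth, the same concentration `gamma` is used at every level, and the names `new_node` and `sample_path` are made up for this example.

```python
import random

def new_node():
    """A restaurant: how many customers passed through it, and its tables (children)."""
    return {"count": 0, "children": []}

def sample_path(root, depth, gamma, rng):
    """Draw one root-to-leaf path with `depth` nodes, growing the tree as needed."""
    node = root
    node["count"] += 1
    path = []   # child index chosen at each level below the root
    for _ in range(depth - 1):
        # Run one CRP step among the children of the current node.
        weights = [child["count"] for child in node["children"]] + [gamma]
        idx = rng.choices(range(len(weights)), weights=weights)[0]
        if idx == len(node["children"]):
            node["children"].append(new_node())   # open a new branch
        node = node["children"][idx]
        node["count"] += 1
        path.append(idx)
    return path

# Six customers, three levels, matching the size of the example in the figure.
rng = random.Random(0)
root = new_node()
paths = [sample_path(root, depth=3, gamma=1.0, rng=rng) for _ in range(6)]
```

Customers who share a prefix of their path share the restaurants along that prefix.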

Hierarchical LDA (hLDA)
• Generative model of multiple-topic documents;
• Pick a root-to-leaf path of topics using a Nested CRP prior, and generate a mixture distribution over the topics on that path;
• For each word, pick a topic according to this mixture and generate the word from that topic's word distribution.
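Putting the pieces together, one document could be generated as sketched below. This is an illustration only: it assumes a fixed tree depth, a symmetric Dirichlet over the levels of the path, and a symmetric Dirichlet prior on each topic's word distribution. The names `dirichlet`, `new_topic`, `generate_hlda_document`, `alpha`, `gamma` and `eta` are chosen for the example.

```python
import random

def dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def new_topic(vocab_size, eta, rng):
    """A tree node carrying its own word distribution (a topic)."""
    return {"count": 0, "children": [], "words": dirichlet([eta] * vocab_size, rng)}

def generate_hlda_document(root, depth, n_words, alpha, gamma, eta, rng):
    vocab_size = len(root["words"])
    # 1. Choose a root-to-leaf path with the nested CRP.
    path = [root]
    root["count"] += 1
    for _ in range(depth - 1):
        node = path[-1]
        weights = [c["count"] for c in node["children"]] + [gamma]
        idx = rng.choices(range(len(weights)), weights=weights)[0]
        if idx == len(node["children"]):
            node["children"].append(new_topic(vocab_size, eta, rng))
        child = node["children"][idx]
        child["count"] += 1
        path.append(child)
    # 2. Draw mixing proportions over the levels of the path.
    theta = dirichlet([alpha] * depth, rng)
    # 3. For each word, pick a level, then a word from that level's topic.
    levels, words = [], []
    for _ in range(n_words):
        level = rng.choices(range(depth), weights=theta)[0]
        word = rng.choices(range(vocab_size), weights=path[level]["words"])[0]
        levels.append(level)
        words.append(word)
    return levels, words

# Five 50-word documents over a 25-term vocabulary, sharing one growing tree.
rng = random.Random(1)
root = new_topic(25, 1.0, rng)
docs = [generate_hlda_document(root, 3, 50, 1.0, 1.0, 1.0, rng) for _ in range(5)]
```

Every document passes through the root, so the root topic is shared by all documents, while deeper topics are shared only by documents whose paths coincide.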

hLDA graphical model

Artificial data experiment
• 100 documents of 1000 words each, on a 25-term vocabulary.
[Figure: each vertical bar is a topic.]

CRP prior vs. Bayes Factors

Predicting the structure

NIPS abstracts

Comments
• Accommodates growing collections of data;
• Hierarchical organization makes sense, but it is not clear to me why the CRP prior is the best prior for it;
• No mention of running time; it may take a very long time.
