Hierarchical Topic Models and the Nested Chinese Restaurant Process. Blei, Griffiths, Jordan, Tenenbaum. Presented by Rodrigo de Salvo Braz.
Document classification • One-class approach: one topic per document, with words generated according to the topic. • For example, a Naive Bayes model.
Document classification • It is more realistic to assume more than one topic per document. • Generative model: pick a mixture distribution over K topics and generate words from it.
Document classification • Even more realistic: topics may be organized in a hierarchy (not independent); • Pick a path from root to leaf in a tree, where each node is a topic; sample words from a mixture over the topics on that path.
Dirichlet distribution (DD) • Distribution over distribution vectors of dimension K: $P(p; u) = \frac{1}{Z(u)} \prod_{i=1}^{K} p_i^{u_i}$ • The parameters u act as a prior distribution (“previous observations”); • A symmetric Dirichlet distribution assumes a uniform prior distribution ($u_i = u_j$ for all i, j).
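To make "a distribution over distribution vectors" concrete, here is a minimal sketch that draws probability vectors from a Dirichlet by normalizing independent Gamma draws. It assumes the standard concentration-vector parameterization, which may be offset from the slide's `u`; the function name `sample_dirichlet` and the example parameter values are illustrative choices, not from the slides.

```python
import random

def sample_dirichlet(u, rng):
    """Draw one probability vector from Dirichlet(u).

    Uses the standard construction: draw g_i ~ Gamma(u_i, 1) independently
    and normalize so the components sum to 1.
    """
    draws = [rng.gammavariate(u_i, 1.0) for u_i in u]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(0)
K = 4  # dimension of the distribution vector (assumed for illustration)

# Symmetric case: all parameters equal, so no component is favoured a priori.
symmetric = sample_dirichlet([1.0] * K, rng)

# Asymmetric case: more "previous observations" for component 0.
skewed = sample_dirichlet([10.0, 1.0, 1.0, 1.0], rng)

print(symmetric)
print(skewed)
```

Each call returns a different point on the probability simplex; larger parameter values pull the draws toward the corresponding components.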
Latent Dirichlet Allocation (LDA) • Generative model of multiple-topic documents; • For each document, generate a mixture distribution over topics using a Dirichlet distribution; • For each word, pick a topic according to that mixture and generate the word from the topic's word distribution.
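The generative steps listed for LDA can be sketched as follows. This is a toy illustration under stated assumptions: the topic-word table `topic_word`, the hyperparameter `alpha`, the vocabulary size, and the document length are all made-up values, and the code only generates data (it does not perform inference).

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw a probability vector from Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topic_word, alpha, n_words, rng):
    """Generate one document: a topic mixture, then a topic and a word per position."""
    n_topics = len(topic_word)
    vocab_size = len(topic_word[0])
    theta = sample_dirichlet(alpha, rng)  # per-document mixture over topics
    assignments, words = [], []
    for _ in range(n_words):
        z = rng.choices(range(n_topics), weights=theta)[0]            # pick a topic
        w = rng.choices(range(vocab_size), weights=topic_word[z])[0]  # pick a word from it
        assignments.append(z)
        words.append(w)
    return theta, assignments, words

# Hypothetical word distributions for 3 topics over a 6-word vocabulary.
topic_word = [
    [0.4, 0.4, 0.1, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.4, 0.4, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.1, 0.4, 0.4],
]

rng = random.Random(1)
theta, assignments, words = generate_document(topic_word, [0.5, 0.5, 0.5], 20, rng)
print(theta)
print(words)
```

Words are represented as integer vocabulary indices; a small `alpha` tends to produce documents dominated by one or two topics.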
Latent Dirichlet Allocation (LDA) • [Figure: graphical model. Labels: DD hyperparameter; topics (K); words w; topic distribution; W.]
Chinese Restaurant Process (CRP) • [Figure sequence, 9 frames: customers 1 to 9 of 9 arrive one at a time.] • After the last customer: data point (a distribution itself) sampled.
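The seating sequence illustrated in the CRP slides can be simulated with a few lines. This is a minimal sketch of the standard CRP rule: a customer joins an existing table with probability proportional to its occupancy, or opens a new table with probability proportional to a concentration parameter. The name `gamma` and its value are assumptions; the slides do not name the parameter.

```python
import random

def crp_seating(n_customers, gamma, rng):
    """Seat customers one at a time according to the Chinese Restaurant Process.

    Returns (assignments, tables), where assignments[i] is the table index of
    customer i and tables[k] is the number of customers at table k.
    """
    tables = []
    assignments = []
    for _ in range(n_customers):
        # Existing table k has weight tables[k]; a new table has weight gamma.
        weights = tables + [gamma]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append(1)   # open a new table
        else:
            tables[k] += 1     # join an existing table
        assignments.append(k)
    return assignments, tables

rng = random.Random(0)
assignments, tables = crp_seating(9, gamma=1.0, rng=rng)  # 9 customers, as in the slides
print(assignments)
print(tables)
```

The number of tables is not fixed in advance, which is what lets the prior accommodate a growing number of clusters as more customers arrive.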
Species Sampling Mixture • Generative model of multiple-topic documents; • Generate a mixture distribution over topics using a CRP prior; • For each word, pick a topic according to that mixture and generate the word from the topic's word distribution.
Species Sampling Mixture • [Figure: graphical model. Labels: CRP hyperparameter; topics (K); words w; topic distribution; W.]
Nested CRP • [Figure: nested CRP seating diagram with customers numbered 1 to 6.]
Hierarchical LDA (hLDA) • Generative model of multiple-topic documents; • Generate a mixture distribution over topics using a Nested CRP prior; • For each word, pick a topic according to that mixture and generate the word from the topic's word distribution.
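As a rough sketch of how a nested CRP prior assigns each document a root-to-leaf path of topics, the code below runs one CRP at every tree node, with child "tables" weighted by how many earlier documents passed through them. The `Node` class, the fixed depth of 3, the six documents, and the parameter `gamma` are illustrative assumptions; word generation along the path and inference are omitted.

```python
import random

class Node:
    """A topic node; count is the number of documents whose path passes through it."""
    def __init__(self):
        self.count = 0
        self.children = []

def sample_path(root, depth, gamma, rng):
    """Sample one root-to-leaf path of the given depth from a nested CRP."""
    root.count += 1
    path = [root]
    node = root
    for _ in range(depth - 1):
        # CRP at this node: existing children weighted by count, new child by gamma.
        weights = [child.count for child in node.children] + [gamma]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(node.children):
            node.children.append(Node())  # create a new branch (new topic)
        node = node.children[k]
        node.count += 1
        path.append(node)
    return path

rng = random.Random(0)
root = Node()
paths = [sample_path(root, depth=3, gamma=1.0, rng=rng) for _ in range(6)]

print("branches under the root:", len(root.children))
print("documents per branch:", [child.count for child in root.children])
```

Documents that share a prefix of their paths share the more general topics near the root, while the tree can grow new branches as documents are added.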
Artificial data experiment • 100 documents of 1000 words each, over a 25-term vocabulary. • [Figure: each vertical bar is a topic.]
Comments • Accommodates growing collections of data; • Hierarchical organization makes sense, but it is not clear to me why the CRP prior is the best prior for it; • No mention of running time; inference may take a very long time.