Semantic History Embedding in Online Generative Topic Models Pu Wang (presenter) Authors: Loulwah AlSumait (lalsumai@gmu.edu) Daniel Barbará (dbarbara@gmu.edu) Carlotta Domeniconi (carlotta@cs.gmu.edu) Department of Computer Science George Mason University SDM 2009
Outline • Introduction and related work • Online LDA (OLDA) • Parameter Generation • Sliding history window • Contribution weights • Experiments • Conclusion and future work
Introduction • When a topic is observed at a certain time, it is more likely to appear in the future • Previously discovered topics hold important information about the underlying structure of the data • Incorporating this information into future knowledge discovery can enhance the inferred topics
Related Work • Q. Sun, R. Li et al., ACL 2008 • LDA-based Fisher kernel to measure the semantic similarity between blocks of text • X. Wang et al., ICDM 2007 • Topical N-Gram model that automatically identifies feasible N-grams based on the context that surrounds them • X. Phan et al., IW3C2 2008 • A classifier trained on a small set of labeled documents combined with an LDA topic model estimated from Wikipedia
Online LDA (OLDA) [Slide figure: the OLDA graphical model unrolled over time (time between t and t+1 = ε). At each stream time t, topic assignments z and words w are drawn over the Nd word positions of the Mt documents and K topics; the inferred topic structures St feed three components: priors construction, topic evolution tracking (tracking topics), and emerging topic detection (emerging topic list).]
Inference Process • Parameter generation: priors for the current stream are derived from historic observations • Given these priors, inference reduces to a simple (standard LDA) inference problem • Solved by Gibbs sampling over the current stream, with history entering only through the priors
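Given the history-derived priors, OLDA's per-stream inference is ordinary collapsed Gibbs sampling for LDA. A minimal sketch (not the authors' Matlab code; the function name, symmetric α, and iteration count are illustrative assumptions) in which history enters only through the topic-specific prior matrix:

```python
import numpy as np

def gibbs_lda(docs, K, beta_prior, alpha=0.5, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA with topic-specific word priors.

    docs: list of lists of word ids; beta_prior: K x V array of Dirichlet
    priors (in OLDA, built from the sliding history window)."""
    rng = np.random.default_rng(seed)
    V = beta_prior.shape[1]
    nkw = np.zeros((K, V))          # topic-word counts
    ndk = np.zeros((len(docs), K))  # document-topic counts
    nk = np.zeros(K)                # words assigned to each topic
    z = []
    for d, doc in enumerate(docs):  # random initial assignments
        zd = rng.integers(0, K, size=len(doc))
        z.append(zd)
        for w, k in zip(doc, zd):
            nkw[k, w] += 1; ndk[d, k] += 1; nk[k] += 1
    beta_sum = beta_prior.sum(axis=1)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]          # remove current assignment
                nkw[k, w] -= 1; ndk[d, k] -= 1; nk[k] -= 1
                # full conditional: doc-topic term times topic-word term
                p = (ndk[d] + alpha) * (nkw[:, w] + beta_prior[:, w]) / (nk + beta_sum)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                nkw[k, w] += 1; ndk[d, k] += 1; nk[k] += 1
    # posterior estimate of the topic-word distributions
    return (nkw + beta_prior) / (nk + beta_sum)[:, None]
```

With a flat `beta_prior` this reduces to standard LDA; in OLDA the prior rows would instead come from the weighted history window.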
Topic Evolution Tracking • Topic alignment over time • Handles changes in the lexicon and topic drift [Slide figure: topics at time t aligned with topics at time t+1; P(topic) and P(word|topic) tracked for each aligned topic over time.]
Sliding History Window • Consider all topic-word distributions within a “sliding history window” of size δ • Alternatives for keeping track of history at time t: • Full memory, δ = t • Short memory, δ = 1 • Intermediate memory, δ = c [Slide figure: the evolution matrix of a topic — each column is the topic's distribution over the dictionary at one time slice within the window.]
Contribution Control • Evolution tuning parameters ω • Individual weights of the models • Decaying history: ω1 < ω2 < … < ωδ • Equal contributions: ω1 = ω2 = … = ωδ • Total weight of history (vs. weight of new observations) • Balanced weights (sum = 1) • Biased toward the past (sum > 1) • Biased toward the future (sum < 1)
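The weighting schemes above can be sketched as a small helper that builds ω for a window of size δ. The function name, the geometric form of the decay, and the `decay` parameter are illustrative assumptions, not the paper's:

```python
import numpy as np

def history_weights(delta, scheme="equal", total=1.0, decay=0.9):
    """Construct the evolution tuning vector omega for a window of size delta.

    scheme: 'equal' gives uniform contributions; 'decay' down-weights older
    models (omega_1 < ... < omega_delta, newest last). total sets sum(omega):
    1 balances history against new observations, >1 biases toward the past,
    <1 toward the future."""
    if scheme == "equal":
        w = np.ones(delta)
    else:  # geometric decay toward the past (oldest gets the smallest weight)
        w = decay ** np.arange(delta - 1, -1, -1)
    return total * w / w.sum()
```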
Parameter Generation • Priors of the topic distribution over words at time t+1: for each topic k, the prior is the ω-weighted combination of the columns of its evolution matrix, βk(t+1) = Bk(t) ω • Generate the topic distribution from the prior: φk(t+1) ~ Dirichlet(βk(t+1))
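A sketch of the prior construction, assuming (as on the slides) that each topic's prior at t+1 is the ω-weighted combination of its own past distributions; the function name and array layout are illustrative:

```python
import numpy as np

def generate_priors(history, omega):
    """Build Dirichlet priors for time t+1 from the sliding history window.

    history: list of delta arrays, each K x V — the topic-word distributions
    inferred at the last delta time slices (the evolution matrices, stacked
    by topic). omega: length-delta weight vector. Returns a K x V prior
    matrix: each topic's prior is the omega-weighted sum of its own past
    distributions."""
    B = np.stack(history)  # delta x K x V
    return np.einsum('d,dkv->kv', np.asarray(omega), B)
```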
Experimental Design • “Matlab Topic Modeling Toolbox” by Mark Steyvers and Tom Griffiths • Datasets: • NIPS • Proceedings from 1988–2000 • 1,740 papers; 13,649 unique words; 2,301,375 word tokens • 13 streams, from 90 to 250 documents per stream • Reuters-21578 • News from 26-FEB-1987 to 19-OCT-1987 • 10,337 documents; 12,112 unique words; 793,936 word tokens • 30 streams (29 of 340 documents, 1 of 517) • Baselines: • OLDAfixed: no memory • OLDA(ω(1)): short memory • Performance evaluation • Measure: perplexity • Test set: the documents of the next year or next stream
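The evaluation measure above, per-word perplexity of held-out documents, can be sketched as follows (illustrative helper; it assumes the test documents' topic proportions θ have already been estimated):

```python
import numpy as np

def perplexity(docs, phi, theta):
    """Per-word perplexity of held-out documents under an LDA model.

    docs: list of word-id lists; phi: K x V topic-word matrix;
    theta: D x K document-topic proportions for the test documents.
    Lower perplexity means better prediction of unseen text."""
    log_lik, n_words = 0.0, 0
    for d, doc in enumerate(docs):
        p_w = theta[d] @ phi  # mixture probability of each word type
        log_lik += np.sum(np.log(p_w[doc]))
        n_words += len(doc)
    return np.exp(-log_lik / n_words)
```

As a sanity check, a uniform model over V word types yields perplexity V.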
Reuters: OLDA with Different Window Sizes and Weights • Increasing the window size enhanced prediction • Incremental history information (δ > 1, sum > 1) did not improve topic estimation at all [Plot: perplexity over streams for short memory, equal contributions, increasing window sizes, and incremental history information.]
NIPS: OLDA with Different Window Sizes • Increasing the window size enhanced prediction w.r.t. short memory • Window sizes greater than 3 enhanced prediction • Effect of the total weight [Plot: perplexity per year for no memory, short memory, and larger window sizes.]
NIPS: OLDA with Different Total Weights • Models with a lower total weight resulted in better prediction [Plot: perplexity for no memory, sum of weights = 1, and decreasing sums of weights.]
NIPS & Reuters: OLDA with Different Total Weights • Variable sum(ω) • δ = 2 [Plots: perplexity on NIPS and Reuters as the total sum of weights is decreased or increased.]
Conclusions • Studied the effect of embedding semantic information in LDA topic modeling of text streams • Parameter generation based on topical structures inferred in the past • Semantic embedding enhances OLDA prediction • Studied the effect of: • Total influence of history • History window size • Equal vs. decaying contributions • Future work • Use of prior knowledge • Effect of embedded historic semantics on detecting emerging and/or periodic topics