Online Learning of Semantic Relations Nir Grinberg and William M. Pottenger, Ph.D. Rutgers University
Introduction • What are semantic relations? • “Barack H. Obama is the 44th President of the United States” • “Barack Obama takes the oath of office as President of the United States” • “Barack Obama, in full Barack Hussein Obama II (born August 4, 1961, Honolulu, Hawaii, U.S.), 44th president of the United States (2009– ) and the first African…” • “X was born in Y” or “X is from Y”, etc.
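Surface patterns like "X was born in Y" can be matched directly against text. A minimal sketch of pattern-based triple extraction for the examples above (the regexes and relation labels are illustrative assumptions, not the system described in these slides):

```python
import re

# Hypothetical lexical patterns of the kind shown on the slide
# ("X was born in Y", "X is the ... President of Y").
PATTERNS = [
    (re.compile(r"(?P<x>[A-Z][\w. ]+?) was born in (?P<y>[A-Z][\w ]+)"), "born_in"),
    (re.compile(r"(?P<x>[A-Z][\w. ]+?) is the \d+\w* [Pp]resident of (?:the )?(?P<y>[A-Z][\w ]+)"), "president_of"),
]

def extract(sentence):
    """Return (arg1, relation, arg2) triples matched by the surface patterns."""
    triples = []
    for pattern, rel in PATTERNS:
        for m in pattern.finditer(sentence):
            triples.append((m.group("x").strip(), rel, m.group("y").strip()))
    return triples

print(extract("Barack H. Obama is the 44th President of the United States"))
# -> [('Barack H. Obama', 'president_of', 'United States')]
```

Such hand-written patterns are brittle, which is exactly what motivates learning relations from data instead.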
Introduction • Why are we interested in Semantic Relations? • Information Extraction, Information Retrieval and Question Answering • Building blocks for IDEAs • Interpretability and Generalization of Topic Models
Related Work • Early work: DIPRE (Brin '98), Snowball (Agichtein et al. 2000) • With the appearance of the ACE and MUC-7 datasets, supervised methods emerged • Using features like extracted entities, POS tags, parse trees • Kernel functions • Unsupervised: DIRT (Lin et al. '01) and USP (Poon et al. '09)
Related Work • Topic Modeling: • Nubbi (Chang et al. 2009) • Rel-LDA and Type-LDA (Yao et al. 2011) [Figure: Rel-LDA and Type-LDA plate diagrams]
What is missing? • Interpretability? • Parallelizable but not O(N) • Interaction with other features? • Higher-Order learning?
Further Related Work • Pachinko Allocation Model (PAM) by Li et al. 2007 • Captures arbitrary: • Topic-Topic correlations • Topic-Word correlations • Better than LDA and CTM [Figure: PAM plate diagram]
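PAM's defining idea is a hierarchy of Dirichlet-distributed topic distributions, so each word is generated along a root → super-topic → sub-topic → word path. A minimal sketch of a four-level PAM generative process (the sizes S, K, V and all hyperparameters are illustrative assumptions):

```python
import random

random.seed(0)

def dirichlet(alpha):
    """Draw from a Dirichlet by normalizing independent Gamma samples."""
    draws = [random.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

# Hypothetical sizes for illustration: 3 super-topics, 5 sub-topics, 8 word types.
S, K, V = 3, 5, 8
phi = [dirichlet([0.1] * V) for _ in range(K)]  # sub-topic word distributions

def sample_document(n_words):
    """Generate one document: draw per-document distributions over super- and
    sub-topics, then a root -> super-topic -> sub-topic -> word path per token."""
    theta_root = dirichlet([1.0] * S)                       # super-topic weights
    theta_super = [dirichlet([1.0] * K) for _ in range(S)]  # sub-topic weights per super-topic
    words = []
    for _ in range(n_words):
        s = random.choices(range(S), weights=theta_root)[0]
        k = random.choices(range(K), weights=theta_super[s])[0]
        words.append(random.choices(range(V), weights=phi[k])[0])
    return words
```

The super-topic layer is what lets PAM represent topic-topic correlations that flat LDA cannot.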
Our Approach • SemRel: based on Type-LDA and PAM • Adds a layer of abstraction • Improves interpretability • Allows feature interactions • Variational Inference: • Stochastic natural gradient [Figure: SemRel plate diagram]
Preprocessing • Tokenization, Lemmatization, POS tagging, NER • Using the StanfordNLP toolbox • Dependency path parsing • Using MaltParser • Filtering out long paths and syntactically irrelevant ones • Filtering out infrequent features and entities
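The two filtering steps can be sketched as follows. The path strings and the MAX_LEN / MIN_FREQ thresholds are illustrative assumptions, not the actual settings used in this work:

```python
from collections import Counter

# Hypothetical dependency-path features; the real pipeline produces these
# from raw text via the Stanford toolbox and MaltParser.
paths = [
    "nsubj->born->prep_in", "nsubj->born->prep_in",
    "nsubjpass->made->agent", "nsubjpass->made->agent", "nsubjpass->made->agent",
    "rare->path->feature",
]

MAX_LEN = 4   # drop paths longer than this many nodes (assumed threshold)
MIN_FREQ = 2  # drop features seen fewer than this many times (assumed threshold)

def filter_paths(paths, max_len=MAX_LEN, min_freq=MIN_FREQ):
    """Keep only short paths, then only paths frequent enough in the corpus."""
    kept = [p for p in paths if len(p.split("->")) <= max_len]
    freq = Counter(kept)
    return [p for p in kept if freq[p] >= min_freq]

print(filter_paths(paths))
```

Pruning long and rare paths shrinks the feature space before topic-model inference, which matters given that preprocessing, not learning, turns out to be the bottleneck (see Results).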
Example • “Gamma Knife, made by the Swedish medical technology firm Elekta, focuses low dosage gamma radiation ...”
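The dependency-path feature for a sentence like this is read off the path between the two entity mentions in the parse graph. A minimal sketch using breadth-first search, with hand-built arcs for the example (illustrative, not actual MaltParser output):

```python
from collections import deque

# Simplified dependency arcs for the example sentence, as (head, dependent, relation).
arcs = [
    ("made", "Knife", "nsubjpass"),
    ("made", "firm", "agent"),
    ("firm", "Elekta", "appos"),
    ("firm", "Swedish", "amod"),
]

def shortest_path(arcs, src, dst):
    """BFS over the undirected dependency graph: entity-to-entity path."""
    graph = {}
    for head, dep, _rel in arcs:
        graph.setdefault(head, []).append(dep)
        graph.setdefault(dep, []).append(head)
    queue = deque([(src, [src])])
    seen = {src}
    while queue:
        node, path = queue.popleft()
        if node == dst:
            return path
        for nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, path + [nbr]))
    return None

print(shortest_path(arcs, "Knife", "Elekta"))
# -> ['Knife', 'made', 'firm', 'Elekta']
```

The resulting path between "Knife" and "Elekta" is the kind of relation feature the model consumes, alongside the entity mentions themselves.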
The Algorithm • We derived similar online learning algorithms for Rel-LDA, Type-LDA and PAM
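The core of such online algorithms is a stochastic natural-gradient step on the variational parameters, in the style of online LDA: blend the current parameters with a noisy estimate computed from a mini-batch, under a decaying step size. A minimal sketch (the parameter shapes, step-size schedule, and hyperparameter values are generic assumptions, not the exact derivation in these slides):

```python
def online_update(lam, suff_stats, n_docs, t, tau=1.0, kappa=0.7, eta=0.01):
    """One stochastic natural-gradient step on variational parameters lam.

    lam:        current variational parameters (topics x vocab, list of lists)
    suff_stats: expected sufficient statistics from the current mini-batch,
                scaled to a single document (same shape as lam)
    n_docs:     corpus size, used to rescale the mini-batch estimate
    t:          iteration counter
    """
    rho = (tau + t) ** (-kappa)  # decaying step size; kappa in (0.5, 1] for convergence
    return [[(1 - rho) * l + rho * (eta + n_docs * s)
             for l, s in zip(lam_k, stats_k)]
            for lam_k, stats_k in zip(lam, suff_stats)]
```

Each update touches only one mini-batch, which is what makes the algorithm O(N) in the number of documents, in contrast to batch variational inference.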
Results • SemRel outperforms Type-LDA: • two-tailed paired t-test across # topics: t(4) = -6.01, p < 0.002 • two-tailed paired t-test across folds: p < 0.001 • Preprocessing is more of a bottleneck than the learning algorithm!
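The reported t(4) = -6.01 is a standard paired t statistic over matched scores. A minimal pure-Python version of the statistic (the two-tailed p-value would then come from the t distribution with n - 1 degrees of freedom):

```python
import math

def paired_t(a, b):
    """Paired t statistic and degrees of freedom (n - 1) for matched samples a, b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # unbiased sample variance
    return mean / math.sqrt(var / n), n - 1
```

With five paired measurements (e.g. five settings of the number of topics), the test has 4 degrees of freedom, matching the t(4) on the slide.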
Future Work • We’re currently investigating convergence • Complementary qualitative evaluation • Other datasets • Extensions with more features • Word, Entities, Higher-Order features, etc.
Conclusions • Yet another topic model, but: • Moved beyond the Bag-of-Words assumption without breaking the framework • Devised an online learning algorithm • Hopefully, improved interpretability