A Confidence Model for Syntactically-Motivated Entailment Proofs Asher Stern & Ido Dagan ISCOL June 2011, Israel
Recognizing Textual Entailment (RTE)
• Given a text, T, and a hypothesis, H
• Does T entail H?

Example
• T: An explosion caused by gas took place at a Taba hotel
• H: A blast occurred at a hotel in Taba.
Proof Over Parse Trees
• T = T0 → T1 → T2 → … → Tn = H
• Each Ti+1 is obtained from Ti by a single transformation of the parse tree
Bar Ilan Proof System - Entailment Rules
• Generic syntactic
• Lexical-syntactic
• Lexical (e.g., explosion → blast)
Bar Ilan Proof System
H: A blast occurred at a hotel in Taba.
• An explosion caused by gas took place at a Taba hotel
• → A blast caused by gas took place at a Taba hotel (lexical: explosion → blast)
• → A blast took place at a Taba hotel (syntactic)
• → A blast occurred at a Taba hotel (lexical-syntactic)
• → A blast occurred at a hotel in Taba. (syntactic) = H
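A minimal sketch of this derivation chain in Python (ours, for illustration: sentence strings stand in for the parse trees the system actually operates on, and the per-step rule labels are our reading of the slide):

```python
# Illustrative only: strings stand in for parse trees.
H = "A blast occurred at a hotel in Taba."

proof = [
    ("An explosion caused by gas took place at a Taba hotel", "T (start)"),
    ("A blast caused by gas took place at a Taba hotel", "lexical: explosion -> blast"),
    ("A blast took place at a Taba hotel", "syntactic: drop modifier"),
    ("A blast occurred at a Taba hotel", "lexical-syntactic: take place -> occur"),
    ("A blast occurred at a hotel in Taba.", "syntactic: noun compound -> PP"),
]

assert proof[-1][0] == H  # a proof is complete when the last step matches H
for sentence, rule in proof:
    print(f"{rule:40s} | {sentence}")
```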
Tree-Edit-Distance
• Example: Insurgents attacked soldiers → Soldiers were attacked by insurgents
• Any such transformation can be expressed as a sequence of node insertions, deletions, and substitutions on the parse tree
Proof over Parse Trees
• Which steps?
  • Tree edits (regular or custom)
  • Entailment rules
• How to classify?
  • Decide "yes" if and only if a proof was found
    • Almost always "no": cannot handle knowledge inaccuracies
  • Instead: estimate a confidence in the proof's correctness
Proof Systems
TED-based
• Estimate the cost of a proof
• Complete proofs
• Arbitrary operations
• Limited knowledge

Entailment-rules-based
• Linguistically motivated
• Rich knowledge
• No estimation of proof correctness
• Incomplete proofs → mixed systems with ad-hoc approximate-match criteria

Our System
• The benefits of both worlds, and more!
• Linguistically motivated complete proofs
• Confidence model
Our Method
• Complete proofs
• On-the-fly operations
• Cost model
• Learning of model parameters
On-the-fly Operations
• Operations performed "on the fly":
  • Insert node on the fly
  • Move node / move sub-tree on the fly
  • Flip part of speech
  • Etc.
• More syntactically motivated than tree edits
• Not justified by knowledge, but their impact on proof correctness can be estimated by the cost model (see the sketch below)
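A minimal sketch of this operation inventory together with a count-based feature extraction (the Op names follow the bullets above; representing F(P) as operation counts is our illustrative assumption, not the system's actual feature set):

```python
from enum import Enum

class Op(Enum):
    # Names follow the slides; the real system's inventory is richer.
    ENTAILMENT_RULE = "apply entailment rule"
    INSERT_NODE = "insert node on the fly"
    MOVE_SUBTREE = "move node / sub-tree on the fly"
    FLIP_POS = "flip part of speech"

def feature_vector(steps):
    """Summarize a proof as counts of each operation type: F(P) = (F1, ..., FD)."""
    counts = {op: 0 for op in Op}
    for op in steps:
        counts[op] += 1
    return [counts[op] for op in Op]

# Example: three rule applications and one on-the-fly sub-tree move.
print(feature_vector([Op.ENTAILMENT_RULE] * 3 + [Op.MOVE_SUBTREE]))  # [3, 0, 1, 0]
```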
Cost Model: The Idea
• Represent the proof as a feature vector
• Use the vector in a learning algorithm
Cost Model
• Represent a proof as a feature vector F(P) = (F1, F2, …, FD)
• Define a weight vector w = (w1, w2, …, wD)
• Define the proof cost: cost(P) = w · F(P) = Σi wi Fi
• Classify a proof: "entailing" iff cost(P) ≤ b
  • b is a threshold
• Learn the parameters (w, b) (see the sketch below)
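A sketch of the cost and the classification rule (the threshold direction, cost ≤ b for "entailing", is one natural convention; the toy numbers are ours):

```python
import numpy as np

def proof_cost(w, F):
    """cost(P) = w . F(P) = sum_i w_i * F_i"""
    return float(np.dot(w, F))

def classify(w, b, F):
    """Decide 'entailing' iff the best proof is cheap enough."""
    return proof_cost(w, F) <= b

# Toy numbers: four features = counts of the four operation types above.
w = np.array([0.1, 1.5, 1.0, 0.8])  # unjustified on-the-fly ops cost more
F = np.array([3, 0, 1, 0])          # 3 rule applications, 1 moved sub-tree
print(proof_cost(w, F))             # 1.3
print(classify(w, b=2.0, F=F))      # True -> classified as entailing
```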
Search Algorithm
• Need to find the "best" proof
• "Best proof" = proof with lowest cost
  • Assuming a weight vector is given
• Search space is exponential, so pruning is required (e.g., beam search; see the sketch below)
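One standard way to prune such a space is beam search. The sketch below is our illustration, not the system's actual algorithm; successors(tree) is a hypothetical function yielding (next_tree, feature_delta) pairs for every applicable operation:

```python
import heapq

def best_proof(t0, h, successors, w, beam=50, max_steps=20):
    """Beam search for a lowest-cost proof T -> ... -> H (a sketch).
    feature_delta is the feature-vector increment of applying one operation."""
    frontier = [(0.0, t0, [])]  # (cost so far, current tree, steps taken)
    for _ in range(max_steps):
        candidates = []
        for cost, tree, steps in frontier:
            if tree == h:  # goal reached; frontier is cost-sorted, so this is the cheapest in the beam
                return cost, steps
            for nxt, delta in successors(tree):
                step_cost = sum(wi * di for wi, di in zip(w, delta))
                candidates.append((cost + step_cost, nxt, steps + [nxt]))
        # Pruning: keep only the 'beam' cheapest partial proofs.
        frontier = heapq.nsmallest(beam, candidates, key=lambda c: c[0])
        if not frontier:
            break
    return None  # no proof found within the step budget
```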
Parameter Estimation
• Goal: find a good weight vector and threshold (w, b)
• Use a standard machine-learning algorithm (logistic regression or linear SVM)
• But: training samples are not given as feature vectors
  • The learning algorithm requires training samples (feature vectors)
  • Constructing the feature vectors (finding the best proofs) requires a weight vector
  • The weight vector is what the learning algorithm produces
• Solution: iterative learning
Parameter Estimation
• Start with w0, a reasonable guess for the weight vector
• i = 0
• Repeat until convergence (a code sketch follows):
  • Find the best proofs and construct feature vectors, using wi
  • Use a linear ML algorithm to find a new weight vector, wi+1
  • i = i + 1
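A sketch of this loop with scikit-learn; find_best_proof_features (running the proof search with the current weights and returning the best proof's feature vector) and the feature dimensionality are assumptions for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimate_parameters(pairs, labels, find_best_proof_features,
                        num_features=4, n_iters=10):
    """Iterative scheme from the slides (convergence test simplified)."""
    w, b = np.ones(num_features), 0.0  # w0: a reasonable initial guess
    for _ in range(n_iters):
        # Find the best proof per (T, H) pair under the current weights.
        X = np.array([find_best_proof_features(t, h, w) for t, h in pairs])
        # Fit a linear model on those vectors to obtain new parameters.
        clf = LogisticRegression().fit(X, labels)
        w_new, b = clf.coef_[0], float(clf.intercept_[0])
        if np.allclose(w_new, w):  # converged: weights stopped changing
            break
        w = w_new
    return w, b
```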
Conclusions
• Linguistically motivated, complete proofs
• Cost model
  • Estimation of proof correctness
• Search for the best proof
• Learning of parameters
• Results
  • Reasonable behavior of the learning scheme
Thank you Q & A