Towards a Probabilistic Model for Lexical Entailment
Eyal Shnarch, Jacob Goldberger, Ido Dagan
Entailment at the lexical level
Hypothesis: Obama gave a speech last night in the Israeli lobby conference
• Highlighted terms: Obama, speech, Israeli lobby
Texts:
• In his speech at the American Israel Public Affairs Committee yesterday, the president challenged …
• Barack Obama’s AIPAC address ...
• Highlighted terms: Barack Obama, the president, American Israel Public Affairs Committee, AIPAC, address
Lexical-level systems are very handy
• Important component within a full inference system
• Pose hard-to-beat baselines (Mirkin et al., 2009; Majumdar and Bhattacharyya, 2010)
• Can be used in cases where there are no deep analysis tools for the target language, e.g. no parser
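Modeling entailment at the lexical level
Text: Obama’s Cadillac got stuck in Dublin in a large Irish crowd
Hypothesis: The president’s car got stuck in Ireland, surrounded by many people
[Figure: hypothesis terms linked to text terms by rules from resource 1 and resource 2; one link passes through the intermediate term "social group"; one hypothesis term is marked as uncovered]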
Lexical entailment scores
Mostly heuristic:
• Percent covered/uncovered (Majumdar and Bhattacharyya, 2010; Clark and Harrison, 2010)
• Similarity estimation (Corley and Mihalcea, 2005; Zanzotto and Moschitti, 2006)
• Vector space (MacKinlay and Baldwin, 2009)
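To make the first heuristic concrete, here is a minimal sketch of a percent-covered score. It is only an illustration of the general idea, not the scoring used in any of the cited systems; the function name `percent_covered`, the term lists, and the rule pairs are all assumptions made up for this example.

```python
def percent_covered(hypothesis_terms, text_terms, rules):
    """Fraction of hypothesis terms that appear in the text or are
    reachable from a text term through a single rule.

    rules: set of (text_term, hypothesis_term) pairs; in practice these
    would come from lexical resources, here they are hypothetical.
    """
    if not hypothesis_terms:
        return 0.0
    text = set(text_terms)
    covered = [
        h for h in hypothesis_terms
        if h in text or any((t, h) in rules for t in text)
    ]
    return len(covered) / len(hypothesis_terms)


# Hypothetical terms and rules, loosely based on the running example.
text_terms = ["Obama", "Cadillac", "stuck", "Dublin", "crowd"]
hypothesis_terms = ["president", "car", "stuck", "Ireland", "surrounded", "people"]
rules = {
    ("Obama", "president"),
    ("Cadillac", "car"),
    ("Dublin", "Ireland"),
    ("crowd", "people"),
}

score = percent_covered(hypothesis_terms, text_terms, rules)
```

Such a score treats every rule as equally trustworthy and ignores how many rules support a term, which is the gap a probabilistic model is meant to address.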
Terminology
T: Obama’s Cadillac got stuck in Dublin in a large Irish crowd
H: The president’s car got stuck in Ireland, surrounded by many people
[Figure: the same alignment between T and H, annotated with the labels: lexical resource (resource 1, resource 2), rule, chain (rule1, rule2, through "social group"), uncovered]
Goal: a probabilistic model
Text: Obama’s Cadillac got stuck in Dublin in a large Irish crowd
Hypothesis: The president’s car got stuck in Ireland, surrounded by many people
[Figure: alignment between Text and Hypothesis, including a link through "social group"]
Addressing:
• Distinguish the reliability levels of resources
• Consider the length of transitive chains
• Consider multiple pieces of evidence
Entailment validation process
[Figure: T = t1 … ti … tm at the top and H = h1 … hj … hn at the bottom, connected by rules from resource 1 and resource 2, with a chain through an intermediate term t’]
• The validity of a rule depends on the reliability of the resource that provided it
• A chain is valid if all its rule steps are valid
• A single term is entailed if at least one of its pieces of evidence is a valid entailment chain
• A hypothesis is entailed if all its terms are entailed
Probabilistic model for Lexical Entailment
• The validity probability of a rule step r is the reliability of the resource R(r) that suggested it
• The parameter set is estimated with EM
[Figure: the validation diagram (T = t1 … ti … tm, intermediate term t’, H = h1 … hj … hn) with gates labeled MATCH, OR, and AND; the final AND gate decides whether entailment holds]
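The validation process can be turned into a probability computation along the following lines. This is a sketch under stated assumptions rather than the authors' implementation: it assumes rule steps and chains are independent, reads the OR gate as a noisy-OR over chains, gives an empty chain (an exact match) probability 1, and uses made-up reliability values and evidence. All names (`theta`, `chain_prob`, `term_prob`, `hypothesis_prob`) are hypothetical, and EM estimation of the parameters is not shown.

```python
from math import prod

# Illustrative reliabilities, one per lexical resource (NOT values from the talk).
theta = {"wordnet": 0.8, "wikipedia": 0.5}


def chain_prob(chain, theta):
    """A chain is valid only if every rule step is valid.

    chain: list of resource names, one per rule step.
    An empty chain (exact match) gets probability 1 (an assumption).
    """
    return prod(theta[resource] for resource in chain)


def term_prob(chains, theta):
    """OR gate: a term is entailed if at least one of its chains is valid."""
    return 1.0 - prod(1.0 - chain_prob(c, theta) for c in chains)


def hypothesis_prob(chains_per_term, theta):
    """AND gate: the hypothesis is entailed only if all its terms are entailed."""
    return prod(term_prob(chains, theta) for chains in chains_per_term.values())


# Hypothetical evidence: for each hypothesis term, the chains that reach it.
evidence = {
    "president": [["wikipedia"], ["wordnet", "wordnet"]],  # two pieces of evidence
    "car": [["wordnet"]],
}
p_covered = hypothesis_prob(evidence, theta)

# An uncovered term has no chain, so the strict AND drives the score to 0.
evidence_with_gap = dict(evidence, surrounded=[])
p_with_gap = hypothesis_prob(evidence_with_gap, theta)
```

In this sketch a longer chain multiplies more reliabilities and so scores lower, while additional chains for the same term raise its probability, which matches the three goals of distinguishing resource reliability, considering chain length, and considering multiple evidence.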
Let’s try a concrete example
T: Obama’s Cadillac got stuck in Dublin in a large Irish crowd
H: The president’s car got stuck in Ireland, surrounded by many people
[Figure: rule chains from text terms to hypothesis terms, with links labeled by resource (WordNet, Wikipedia, CatVar, co-occurrence) and annotated with the values 0.7, 0.55, 0.45, 0.8, 0.65, and 0.17; one chain passes through "social group"; one hypothesis term is uncovered]
* numbers in blue are parameter values found by our model
Extension 1: relaxing with noisy-AND
• The final AND gate demands the entailment of all hypothesis terms
• Sentence-level entailment is possible even if not all terms are entailed
• This strict demand is especially unfair for longer hypotheses
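The effect of relaxing the final gate can be illustrated with a textbook-style noisy-AND. The slide does not give the authors' parameterization, so the single `leak` parameter below (the probability that an unentailed term still does not block entailment) is an assumption of this sketch, as are the function names and the term probabilities.

```python
from math import prod


def strict_and(term_probs):
    """Strict AND: every hypothesis term must be entailed."""
    return prod(term_probs)


def noisy_and(term_probs, leak):
    """Noisy-AND over independent terms.

    A term passes if it is entailed (probability p), or if it is not
    entailed but is let through anyway (probability leak).
    With leak = 0 this reduces to the strict AND.
    """
    return prod(p + (1.0 - p) * leak for p in term_probs)


# Hypothetical term-level probabilities; the last term is uncovered.
term_probs = [0.82, 0.8, 0.0]

strict = strict_and(term_probs)
relaxed = noisy_and(term_probs, leak=0.3)
```

With the strict gate a single uncovered term forces the score to zero no matter how well the other terms are supported; with the noisy gate the score stays positive, and the penalty per unentailed term is controlled by one parameter.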
Better results with extension 1
[Chart: F1 scores; some results are marked with *]
* significant improvement over the base probabilistic model according to McNemar’s test with p<0.01
Extension 2: term independence assumption
[Figure: two hypotheses H, with each term marked as a covered term or an uncovered term]
• As T covers more terms of H, our belief in each rule application increases
Putting it all together is best
[Chart: F1 scores; some results are marked with *]
• Negative result: F1 usually decreases when allowing chains
Summary
A probabilistic model:
• Learns an individual reliability value for each lexical resource
• Considers multiple evidence and chain length
• Two extensions, which bring us to…
• Performance in line with the best entailment systems
Future work
• Better model for transitivity: noisy-AND for chains too
• Verify rule application in a specific context (next talk by Shachar Mirkin)
• Test with other application data sets, e.g. passage retrieval for QA
• Integrate into a full entailment system
Thank you!