When logical inference helps in determining textual entailment (and when it doesn’t). Johan Bos & Katja Markert. Linguistic Computing Laboratory, Dipartimento di Informatica, Università di Roma “La Sapienza”. Natural Language Processing Group, Computer Science Department, University of Leeds.
Aristotle’s Syllogisms ARISTOTLE 1 (TRUE) All men are mortal. Socrates is a man. ------------------------------- Socrates is mortal.
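As a small illustration of what "logical inference" means here, the syllogism can be stated and checked in Lean 4. This is only a sketch: the names `Person`, `man`, `mortal` and `socrates` are chosen for this example and do not come from the system described in the talk.

```lean
-- "All men are mortal. Socrates is a man. Therefore Socrates is mortal."
example (Person : Type) (man mortal : Person → Prop) (socrates : Person)
    (h1 : ∀ x, man x → mortal x)   -- all men are mortal
    (h2 : man socrates)            -- Socrates is a man
    : mortal socrates :=           -- conclusion
  h1 socrates h2
```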
Talk Outline • Hybrid system combining: • Shallow semantic approach • Deep semantic approach • Machine Learning • Features of both approaches are combined in one classifier
Shallow Semantic Analysis • Primarily based on word overlap • Using weighted lemmas • Weights correspond to inverse doc. freq. • Web as corpus • Wordnet for synonyms • Additional features • Number of words in T and H • Type of dataset
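The overlap feature could look roughly like the following Python sketch. It assumes lemmatised input and a precomputed document-frequency table; the function names, the synonym table and the default weight for unseen lemmas are all hypothetical choices for this illustration, not the authors' implementation.

```python
import math


def idf_weights(doc_freq, n_docs):
    """Inverse document frequency per lemma: rare lemmas get larger weights."""
    return {lemma: math.log(n_docs / df) for lemma, df in doc_freq.items()}


def weighted_overlap(text_lemmas, hyp_lemmas, weights, synonyms=None,
                     default_weight=1.0):
    """Fraction of the hypothesis' total lemma weight that the text covers.

    `synonyms` maps a hypothesis lemma to a set of lemmas accepted as
    equivalent (a stand-in for a WordNet lookup; assumption of this sketch).
    """
    synonyms = synonyms or {}
    text_set = set(text_lemmas)
    covered = total = 0.0
    for lemma in set(hyp_lemmas):
        w = weights.get(lemma, default_weight)
        total += w
        # A hypothesis lemma counts as covered if it or a synonym is in T.
        if ({lemma} | synonyms.get(lemma, set())) & text_set:
            covered += w
    return covered / total if total else 0.0


if __name__ == "__main__":
    weights = idf_weights({"bomb": 10, "a": 1000}, n_docs=1000)
    print(weighted_overlap(["a", "car", "bomb"], ["a", "bomb"], weights))
```

A score near 1 means almost all of the (weighted) hypothesis vocabulary already occurs in the text, which is the signal a shallow classifier feature would pick up.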
Deep Semantic Analysis • Compositional Semantics • How to build semantic representations for the text and hypothesis • Do this in a systematic way • Logical Inference • FOL theorem proving • FOL model building
Compositional Semantics • The Problem: Given a natural language expression, how do we convert it into a logical formula? • Frege’s principle: The meaning of a compound expression is a function of the meaning of its parts.
Compositional Semantics • We need a theory of syntax, to determine the parts of a natural language expression • We will use CCG • We need a theory of semantics, to determine the meaning of the parts • We will use DRT • We need a technique to combine the parts • We will use the Lambda-calculus
Combinatory Categorial Grammar • CCG is a lexicalised theory of grammar (Steedman 2001) • Deals with complex cases of coordination and long-distance dependencies • Lexicalised, hence easy to implement • English wide-coverage grammar • Fast robust parser available
Discourse Representation Theory • Well understood semantic formalism • Scope, anaphora, presupposition, tense, etc. • Kamp ’81, Kamp & Reyle ’93, Van der Sandt ’92 • Semantic representations (DRSs) can be built using traditional tools • Lambda calculus • Underspecification • Model-theoretic interpretation • Inference possible • Translation to first-order logic
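The translation from DRSs to first-order logic can be sketched in Python as follows. The sketch covers only atomic conditions, negation and implication, and uses the standard translation (discourse referents become existential quantifiers, or universal ones in the antecedent of an implication). The class names and the output string syntax are assumptions of this example, not the input format of any particular prover.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    pred: str
    args: tuple  # tuple of variable names (strings)


@dataclass
class Not:
    drs: "DRS"


@dataclass
class Imp:
    antecedent: "DRS"
    consequent: "DRS"


@dataclass
class DRS:
    referents: list   # discourse referents, e.g. ["x"]
    conditions: list  # Atom, Not or Imp instances (assumed non-empty)


def cond_to_fol(c):
    """Translate a single DRS condition to a formula string."""
    if isinstance(c, Atom):
        return f"{c.pred}({','.join(c.args)})"
    if isinstance(c, Not):
        return f"-({drs_to_fol(c.drs)})"
    if isinstance(c, Imp):
        ante = " & ".join(cond_to_fol(x) for x in c.antecedent.conditions)
        body = f"(({ante}) -> ({drs_to_fol(c.consequent)}))"
        # Referents of the antecedent are universally quantified.
        for ref in reversed(c.antecedent.referents):
            body = f"all {ref} {body}"
        return body
    raise TypeError(f"unknown condition: {c!r}")


def drs_to_fol(drs):
    """Referents become existential quantifiers over the conjoined conditions."""
    body = " & ".join(cond_to_fol(c) for c in drs.conditions)
    for ref in reversed(drs.referents):
        body = f"exists {ref} ({body})"
    return body


if __name__ == "__main__":
    box = DRS(["x"], [Atom("spokesman", ("x",)), Atom("lie", ("x",))])
    print(drs_to_fol(box))
```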
CCG/DRT example (DRSs written linearly as [referents | conditions])
NP/N: a   λp.λq.([x | ];p(x);q(x))
N: spokesman   λz.[ | spokesman(z)]
S\NP: lied   λx.x(λy.[ | lie(y)])
-------------------------------------------------------- (FA)
NP: a spokesman
λp.λq.([x | ];p(x);q(x))(λz.[ | spokesman(z)])
= λq.([x | ];[ | spokesman(x)];q(x))
= λq.([x | spokesman(x)];q(x))
-------------------------------------------------------------------------------- (BA)
S: a spokesman lied
λx.x(λy.[ | lie(y)])(λq.([x | spokesman(x)];q(x)))
= [x | spokesman(x)];[ | lie(x)]
= [x | spokesman(x), lie(x)]
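The derivation of "a spokesman lied" can be mimicked with ordinary Python closures standing in for lambda terms. This is a sketch under simplifying assumptions: a DRS is a pair of tuples (referents, conditions), the merge operator `;` is plain concatenation, and the determiner always introduces the referent `x` with no fresh-variable renaming. All function names are hypothetical.

```python
def drs(referents=(), conditions=()):
    """A DRS as a pair: (discourse referents, conditions)."""
    return (tuple(referents), tuple(conditions))


def merge(*boxes):
    """DRS merge (the ';' operator): pool referents and conditions."""
    refs, conds = [], []
    for r, c in boxes:
        refs.extend(r)
        conds.extend(c)
    return drs(refs, conds)


# Lexical entries, mirroring the lambda terms of the example.
a = lambda p: lambda q: merge(drs(["x"]), p("x"), q("x"))     # NP/N
spokesman = lambda z: drs(conditions=[f"spokesman({z})"])     # N
lied = lambda np: np(lambda y: drs(conditions=[f"lie({y})"])) # S\NP


def forward_application(functor, argument):
    """CCG forward application (FA): X/Y  Y  =>  X."""
    return functor(argument)


def backward_application(argument, functor):
    """CCG backward application (BA): Y  X\\Y  =>  X."""
    return functor(argument)


a_spokesman = forward_application(a, spokesman)         # NP
sentence = backward_application(a_spokesman, lied)      # S

if __name__ == "__main__":
    print(sentence)
```

Because Python evaluates the applications eagerly, the beta-reduction steps of the derivation happen implicitly when the closures are called.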
The Clark & Curran Parser • Use standard statistical techniques • Robust wide-coverage parser • Clark & Curran (ACL 2004) • Grammar derived from CCGbank • 409 different categories • Hockenmaier & Steedman (ACL 2002) • Results: 96% coverage WSJ • Bos et al. (COLING 2004)
Logical Inference • How do we perform inference with DRSs? • Translate DRS into first-order logic • Use off-the-shelf inference engines • What kind of inference engines? • Theorem Prover: Vampire(Riazanov & Voronkov 2002) • Model Builder: Paradox
Using Theorem Proving • Given a textual entailment pair T/H: • Produce DRSs for T and H • Translate these DRSs into FOL • Give to the theorem prover: T’ → H’ • If a proof is found, then T entails H • Good results for examples with: • apposition, relative clauses, coordination • intersective adjectives, noun-noun compounds • passive/active alternations
Example (Vampire: proof) RTE-2 112 (TRUE) On Friday evening, a car bomb exploded outside a Shiite mosque in Iskandariyah, 30 miles south of the capital. ----------------------------------------------------- A bomb exploded outside a mosque.
Example (Vampire: proof) RTE-2 489 (TRUE) Initially, the Bundesbank opposed the introduction of the euro but was compelled to accept it in light of the political pressure of the capitalist politicians who supported its introduction. ----------------------------------------------------- The introduction of the euro has been opposed.
Background Knowledge • Many examples in the RTE dataset require additional knowledge • Lexical knowledge • Linguistic knowledge • World knowledge • Generate Background Knowledge (BK) for T & H in first-order logic • Give this to the theorem prover: (BK & T’) → H’
Lexical Knowledge • We use WordNet as a starting point for additional knowledge • All of WordNet is too much, so we create MiniWordNets • Based on hyponym relations • Remove redundant information • Conversion into first-order logic
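One way to picture a MiniWordNet is the following Python sketch. It assumes a hand-written `hypernym_of` table (child to parent) in place of a real WordNet lookup, keeps only the hypernym chains that start at lemmas occurring in the T/H pair, and emits one axiom per remaining edge. Both the pruning strategy and the axiom string syntax are assumptions of this example; the slide does not spell out how redundancy is removed.

```python
def mini_wordnet(lemmas, hypernym_of):
    """Collect only the hypernym edges reachable from the given lemmas."""
    edges = set()  # a set, so edges shared between chains are stored once
    for lemma in lemmas:
        node, seen = lemma, set()
        while node in hypernym_of and node not in seen:
            seen.add(node)  # guards against cycles in the toy table
            parent = hypernym_of[node]
            edges.add((node, parent))
            node = parent
    return sorted(edges)


def to_fol(edges):
    """One universally quantified implication per hyponym edge."""
    return [f"all x ({child}(x) -> {parent}(x))" for child, parent in edges]


if __name__ == "__main__":
    # Toy table, not actual WordNet content.
    toy = {"soar": "rise", "rise": "change", "bomb": "weapon", "dog": "animal"}
    for axiom in to_fol(mini_wordnet(["soar", "bomb"], toy)):
        print(axiom)
```

Entries unrelated to the pair (here `dog`) never reach the prover, which keeps the background theory small.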
Linguistic Knowledge • Manually coded rules • Possessives • Active/passive alternation • Noun-noun compound interpretation
Linguistic & World Knowledge • 115 manually coded rules • Spatial knowledge • Causes of death • Winning prizes or awards • Family relations • Diseases • Producers • Employment • Ownership
Knowledge at work • Background Knowledge: ∀x(soar(x) → rise(x)) RTE 1952 (TRUE) Crude oil prices soared to record levels. ----------------------------------------------------- Crude oil prices rise.
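The effect of a single background axiom can be shown with a toy forward chainer in Python. This is not how Vampire works; it is a minimal sketch restricted to unary predicates over named constants and to rules of the form ∀x(p(x) → q(x)), with hypothetical function names, meant only to show that the entailment holds with the axiom and fails without it.

```python
def closure(facts, rules):
    """Apply rules (premise_pred, conclusion_pred) until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for pred, arg in list(derived):
                if pred == premise and (conclusion, arg) not in derived:
                    derived.add((conclusion, arg))
                    changed = True
    return derived


def entails(text_facts, hyp_facts, rules=()):
    """True if every hypothesis fact follows from the text plus the rules."""
    return set(hyp_facts) <= closure(text_facts, rules)


if __name__ == "__main__":
    text = {("soar", "prices")}          # "Crude oil prices soared ..."
    hyp = {("rise", "prices")}           # "Crude oil prices rise."
    print(entails(text, hyp))                            # no background knowledge
    print(entails(text, hyp, rules=[("soar", "rise")]))  # with the axiom
```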
Troubles with theorem proving • Theorem provers are extremely precise • They won’t tell you when there is “almost” a proof • Even if there is a little background knowledge missing, Vampire will say: NO
Vampire: no proof RTE 1049 (TRUE) Four Venezuelan firefighters who were traveling to a training course in Texas were killed when their sport utility vehicle drifted onto the shoulder of a Highway and struck a parked truck. ---------------------------------------------------------------- Four firefighters were killed in a car accident.
Using Model Building • Need a robust method of inference • Use model builders • Paradox (Claessen & Sörensson 2003) • Mace (McCune) • Produce a minimal model by iterating over domain size • Use the size of models to determine entailment • Compare the size of the models of T and T & H • If the difference is small, then it is likely that T entails H
Using Model Building • Given a textual entailment pair T/H with text T and hypothesis H: • Produce DRSs for T and H • Translate these DRSs into FOL • Generate Background Knowledge • Give this to the Model Builder: i) BK & T’ ii) BK & T’ & H’ • If the models for i) and ii) are similar in size, then T likely entails H
Features for Classifier • Features from deep analysis: • proof (yes/no) • inconsistent (yes/no) • domain size, model size • domain size difference, abs and relative • model size difference, abs and relative • Combine this with features from shallow approach • Machine learning tool: WEKA
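The deep features listed above could be assembled along the following lines. This Python sketch makes several assumptions that the slides do not state: the sizes are taken as given numbers reported by the model builder, the reported domain and model size are those of the T & H model, and "relative" means the difference divided by the size of the model for T alone. The function and key names are hypothetical.

```python
def deep_features(proof_found, inconsistent,
                  dom_t, dom_th, model_t, model_th):
    """Build a feature dict from prover and model-builder results.

    dom_t / model_t:   domain and model size for BK & T'
    dom_th / model_th: domain and model size for BK & T' & H'
    """
    dom_diff = dom_th - dom_t
    model_diff = model_th - model_t
    return {
        "proof": proof_found,
        "inconsistent": inconsistent,
        "domain_size": dom_th,
        "model_size": model_th,
        "domain_diff_abs": dom_diff,
        # Relative to the T-only model (an assumption of this sketch).
        "domain_diff_rel": dom_diff / dom_t if dom_t else 0.0,
        "model_diff_abs": model_diff,
        "model_diff_rel": model_diff / model_t if model_t else 0.0,
    }


if __name__ == "__main__":
    print(deep_features(False, False, dom_t=10, dom_th=12,
                        model_t=40, model_th=50))
```

A small difference means the hypothesis adds little new information to the text, so these values can serve as a graded signal where the proof feature is only yes or no.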
Conclusions • Why relatively low results? • Recall for the proof feature is low • Most proofs are also found by word overlap • Same for small domain size differences • Not only bad news • Deep analysis is more consistent across different datasets
Future Work • Error analysis! • Difficult, dataset not focussed • Many different sources of errors • Prepare more focussed datasets for system development? • Use better techniques for using numeric features • Improve linguistic analysis • More background knowledge!