Approximating Textual Entailment with LFG and FrameNet Frames • Aljoscha Burchardt, Anette Frank • Computational Linguistics Department, Saarland University, Saarbrücken • Second Pascal Challenge Workshop, Venice, April 2006
Outline of this Talk • Frame Semantics • A baseline system for approximating Textual Entailment • LFG syntactical analyses with • Frame semantics • Statistical decision: entailed? • Walk-through example from RTE 2006 • RTE 2006 results / brief conclusions
Frame Semantics (Fillmore 1976, Fillmore et al. 2003) • Lexical semantic classification of predicates and their argument structure • A frame represents a prototypical situation (e.g. Commercial_transaction, Theft, Awareness) • A set of roles identifies the participants or propositions involved • Frames are organized in a hierarchy • Berkeley FrameNet Project db: 600 frames, 9,000 lexical units, 135,000 annotated sentences
Linguistic Normalizations (Frame: Commerce_buy) • Voice: active / passive • Lexicalization • POS: verb / noun
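The normalizations on this slide can be illustrated with a minimal sketch. It assumes a tiny hand-written lexicon and invented example fillers ("Kim", "a car"); `TOY_LEXICON` and `normalize` are hypothetical names, not part of FrameNet or of the system described in the talk.

```python
# Toy lexicon (an assumption for illustration): (lemma, POS) -> frame name.
TOY_LEXICON = {
    ("buy", "v"): "Commerce_buy",
    ("purchase", "v"): "Commerce_buy",
    ("purchase", "n"): "Commerce_buy",
}


def normalize(lemma, pos, roles):
    """Return a frame instance that ignores voice, lexicalization, and POS.

    `roles` maps role names to fillers. Role order is irrelevant, which is
    what abstracts over active vs. passive realization.
    """
    frame = TOY_LEXICON.get((lemma, pos))
    if frame is None:
        return None  # predicate not covered by the toy lexicon
    return (frame, frozenset(roles.items()))


# "Kim bought a car."            (active verb)
active = normalize("buy", "v", {"Buyer": "Kim", "Goods": "a car"})
# "A car was purchased by Kim."  (passive, different verb)
passive = normalize("purchase", "v", {"Goods": "a car", "Buyer": "Kim"})
# "Kim's purchase of a car"      (noun)
nominal = normalize("purchase", "n", {"Buyer": "Kim", "Goods": "a car"})
```

All three surface forms reduce to the same frame instance, which is the sense in which the frame level normalizes over voice, lexical choice, and part of speech.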
Frame Semantics for RTE • Focusing on lexical semantic classes and role-based argument structure • Built-in normalizations help to determine semantic similarity at a high level of abstraction • Disregarding aspects of “deep” semantics: negation, modality, quantification, ... • Open for deeper modeling on demand (e.g. our treatment of modality)
A Baseline System for Approximating Textual Entailment • Fine-grained LFG-based syntactic analysis • English LFG grammar (Riezler et al. 2002) • Wide-coverage with high-quality probabilistic disambiguation • Frame Semantics • Shallow lexical-semantic classification of predicate-argument structure • Extensions: WordNet senses, SUMO concepts • Computing structural and semantic overlap of text (t) and hypothesis (h) • Hypothesis: large overlap ≈ entailment
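The working hypothesis "large overlap ≈ entailment" can be made concrete with a rough sketch. It assumes that t and h are each reduced to a set of (predicate, relation, argument) triples and that overlap is measured as the fraction of hypothesis triples found in the text; both the representation and the name `hypothesis_coverage` are illustrative assumptions, not the system's actual data structures.

```python
def hypothesis_coverage(text_facts, hyp_facts):
    """Fraction of hypothesis triples that also occur in the text (strict identity)."""
    if not hyp_facts:
        return 0.0
    return len(hyp_facts & text_facts) / len(hyp_facts)


# Hand-built triples for RTE 2006 pair 716 (an illustrative encoding).
text_facts = {
    ("direct", "subj", "Aki Kaurismäki"),
    ("direct", "obj", "feature"),
    ("direct", "adjunct", "1983"),
}
hyp_facts = {
    ("direct", "subj", "Aki Kaurismäki"),
    ("direct", "obj", "film"),
}

score = hypothesis_coverage(text_facts, hyp_facts)
```

Under strict identity only one of the two hypothesis triples is covered, because "film" and "feature" differ as strings. This is the kind of gap that frame, WordNet, and SUMO information is meant to close.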
A Baseline System for Approximating Textual Entailment (architecture diagram) • Linguistic Analyses: text and hypothesis are each analyzed into an LFG f-structure graph w/ frames & concepts • Computing Semantic Overlap: text-hypothesis match graph with different types of matches (aspects of similarity) • Statistical Decision (Entailment?): feature extraction (lexical, syntactic, semantic structure & overlap measures), model training & classification
Linguistic Components • XLE parsing: LFG f-structure • Fred / Detour / Rosy: frames & roles • WordNet-based WSD: WordNet & SUMO • F-structure w/ semantics projection, using XLE term rewriting system (Crouch 2005) • Rule-based: extend & refine sem. proj. • NEs, Locations • Co-reference • Modality, etc.
Example from RTE 2006: Pair 716 • Text: In 1983, Aki Kaurismäki directed his first full-time feature. • Hypothesis: Aki Kaurismäki directed a film.
Automatic Frame Annotation for Text (SALTO Viewer) • Collins parse • Fred & Rosy: frames & roles (statistical) • Detour System: frames (via WordNet)
Automatic Frame Annotation for Hypothesis • 716_h: Aki Kaurismäki directed a film.
LFG + Frames for Hypothesis (FEFViewer) • Aki Kaurismäki directed a film. • Rule-based (LFG-NER)
Hypothesis-Text-Match Graphs: Computing Structural and Semantic Overlap • Match graph bundles overlapping partial graphs marked by match types • Aspects of similarity • Syntax-based (i.e. lexical and structural): Identical predicates (attributes) trigger node (edge) matches. • Semantics-based: Identical frames/concepts (roles) trigger node (edge) matches. • Degrees of similarity • Strict matching • Weak matching conditions for non-identical predicates: • “Structurally related” e.g. via coreference (relative clauses, appositives, pronominals) • “Semantically related” via WordNet, Frame-Relations
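The distinction between strict and weak node matching can be sketched as follows. This is a simplified illustration over bare predicate strings, assuming a hand-written relatedness table in place of WordNet or frame relations; `TOY_RELATED`, `match_type`, and `match_nodes` are hypothetical names, and edge matches are left out.

```python
# Stand-in for WordNet / frame relations (an assumption for illustration).
TOY_RELATED = {frozenset({"feature", "film"})}


def match_type(h_pred, t_pred):
    """Classify a hypothesis/text predicate pair by match type."""
    if h_pred == t_pred:
        return "strict"
    if frozenset({h_pred, t_pred}) in TOY_RELATED:
        return "weak:semantically_related"
    return None


def match_nodes(h_preds, t_preds):
    """Map each hypothesis predicate to its best text match, preferring strict."""
    matches = {}
    for h in h_preds:
        candidates = [(t, match_type(h, t)) for t in t_preds]
        candidates = [(t, kind) for t, kind in candidates if kind is not None]
        # Stable sort: strict matches (key False) come before weak ones.
        candidates.sort(key=lambda c: c[1] != "strict")
        if candidates:
            matches[h] = candidates[0]
    return matches


h_preds = ["direct", "Aki Kaurismäki", "film"]
t_preds = ["direct", "Aki Kaurismäki", "feature", "first", "full-time", "1983"]
node_matches = match_nodes(h_preds, t_preds)
```

For pair 716, the two identical predicates match strictly, while "film" is linked to "feature" only through the weak, semantically related condition.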
[Figure: match graph for h and t] • h: Aki Kaurismäki directed a film. • t: In 1983, Aki Kaurismäki directed his first full-time feature. • Match types shown: grammatically related, WordNet related
Statistical Modeling • Feature extraction on the basis of • Syntactic, Semantic matches (of different types) • Matching clusters’ sizes • Ratio (matched vs. hypothesis) • (Non-)matching modality • RTE-task, fragmentary (parse), … • Training/classification with WEKA tool • Feature selection • Feature 1: Predicate matches • Feature 2: Frame overlap • Feature 3: Matching cluster size • Model 1: Conjunctive rule (Feat. 1, 2) • Model 2: LogitBoost (Feat. 1, 2, 3)
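A conjunctive rule over features 1 and 2 might look like the sketch below. The slides give neither the feature scales nor the learned thresholds, so treating both features as ratios in [0, 1] and using cut-offs of 0.5 are assumptions made purely for illustration. The talk's models were trained with WEKA; this plain-Python version only shows the shape of the decision.

```python
from dataclasses import dataclass


@dataclass
class PairFeatures:
    predicate_matches: float     # feature 1 (assumed to be a ratio in [0, 1])
    frame_overlap: float         # feature 2 (assumed to be a ratio in [0, 1])
    matching_cluster_size: int   # feature 3, not used by the conjunctive rule


def conjunctive_rule(f, min_pred=0.5, min_frame=0.5):
    """Predict entailment only if both features clear their thresholds.

    The threshold values are hypothetical placeholders, not the talk's.
    """
    return f.predicate_matches >= min_pred and f.frame_overlap >= min_frame


high_overlap = PairFeatures(predicate_matches=0.9, frame_overlap=0.8,
                            matching_cluster_size=4)
low_frames = PairFeatures(predicate_matches=0.9, frame_overlap=0.2,
                          matching_cluster_size=2)

decisions = [conjunctive_rule(high_overlap), conjunctive_rule(low_frames)]
```

A rule of this form says "entailed" whenever similarity is high on both features and has no way to represent evidence against entailment.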
RTE 2006 Results • SUM (and IR) are natural tasks for Frame Semantics; IE and QA need deeper modeling (aboutness vs. factivity) • Error analysis • True positives: high semantic overlap • True negatives: 27% involve modality mismatches • False examples: poor modeling of dissimilarity • Many high-frequency features measuring similarity • Few low-frequency features measuring dissimilarity
Brief Conclusions • Good approximation of semantic similarity • Deep LFG syntactical analyses integrated with • Shallow lexical Frame Semantics (plus other lex. resources) • Match graph measuring overlap • Need better model for semantic dissimilarity • Too few rejections (false positives >> false negatives) • Towards deeper modeling • Treatment of modal contexts • Integration of lexical inferences • Open for collaborations