This research focuses on adding structure to Open Information Extraction (IE) by constructing proposition entailment graphs, which organize information hierarchically and enable investigating inference phenomena. The study presents an algorithm for constructing these graphs and analyzes predicate entailment in the context of propositions. The dataset includes 30 gold-standard graphs and 1.5 million entailment annotations.
Focused Entailment Graphs for Open IE Propositions • Omer Levy, Ido Dagan, Jacob Goldberger • Bar-Ilan University, Israel
Open IE • Extracts propositions from text “…which makes aspirin relieve headaches.” • No supervision • No pre-defined schema
What’s missing in Open IE? • Structure • Open IE does not consolidate natural language expressions (e.g. “relieve headache” and “treat headache” remain unrelated)
Adding Structure to Open IE Which structure? • Build a graph of Open IE propositions and their semantic relations
Adding Structure to Open IE Which structure? • Build a graph of Open IE propositions and their entailment relations Why entailment? • Merges paraphrases into mutual entailment cliques: aspirin relieves headache ⇔ aspirin treats headache • Organizes information hierarchically from specific to general: aspirin relieves headache ⇒ painkiller relieves headache
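The idea of merging paraphrases into mutual entailment cliques can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the edge set is hand-written, the function names are hypothetical, and it assumes entailment is transitive, so that two propositions are paraphrases when each can reach the other along entailment edges.

```python
def reachable(graph, start):
    """Return every node reachable from `start` by following entailment edges."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def mutual_entailment_groups(graph):
    """Group propositions that entail each other (assumes transitivity)."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    reach = {n: reachable(graph, n) for n in nodes}
    groups, assigned = [], set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        # m belongs with n when n reaches m and m reaches n.
        group = {m for m in reach[n] if n in reach[m]}
        groups.append(group)
        assigned |= group
    return groups


# Illustrative edges only (proposition -> propositions it entails).
RELIEVE = ("aspirin", "relieve", "headache")
TREAT = ("aspirin", "treat", "headache")
GENERAL = ("painkiller", "relieve", "headache")
edges = {RELIEVE: [TREAT, GENERAL], TREAT: [RELIEVE]}

groups = mutual_entailment_groups(edges)
```

Here the two aspirin propositions end up in one group, while the more general painkiller proposition stays on its own, one level up the hierarchy.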
Original Open IE Output • (aspirin, eliminate, headache) • (aspirin, cure, headache) • (coffee, help, headache) • (drug, relieve, headache) • (headache, control with, aspirin) • (drug, treat, headache) • (tea, soothe, headache) • (analgesic, banish, headache) • (headache, respond to, painkiller) • (headache, treat with, caffeine)
Consolidated Open IE Output • (drug, relieve, headache) • (drug, treat, headache) • (headache, respond to, painkiller) • (headache, treat with, caffeine) • (analgesic, banish, headache) • (tea, soothe, headache) • (headache, control with, aspirin) • (aspirin, cure, headache) • (aspirin, eliminate, headache) • (coffee, help, headache)
Semantic Applications • Example: Structured Queries • “What relieves headaches?”
Structured Query: • (drug, relieve, headache) • (drug, treat, headache) • (headache, respond to, painkiller) • (headache, treat with, caffeine) • (analgesic, banish, headache) • (tea, soothe, headache) • (headache, control with, aspirin) • (aspirin, cure, headache) • (aspirin, eliminate, headache) • (coffee, help, headache)
Structured Query: • drug • painkiller • caffeine • analgesic • tea • aspirin • coffee
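A structured query such as “What relieves headaches?” can be sketched as a walk over the entailment graph: collect every proposition that entails the query proposition, then read off the argument that is not the topic. The edges below are invented for illustration (the slide text does not list the graph's edges), and the function names are hypothetical.

```python
def entailing_nodes(entails, target):
    """Return `target` plus every proposition with an entailment path to it."""
    # Invert the edges so we can walk from the target back to its entailers.
    reverse = {}
    for src, dsts in entails.items():
        for dst in dsts:
            reverse.setdefault(dst, set()).add(src)
    seen, stack = {target}, [target]
    while stack:
        node = stack.pop()
        for prev in reverse.get(node, ()):
            if prev not in seen:
                seen.add(prev)
                stack.append(prev)
    return seen


def answers(entails, target, topic):
    """Collect the non-topic argument of every proposition entailing `target`."""
    result = set()
    for arg1, _predicate, arg2 in entailing_nodes(entails, target):
        result.add(arg2 if arg1 == topic else arg1)
    return sorted(result)


# Hypothetical edges: proposition -> set of propositions it entails.
QUERY = ("drug", "relieve", "headache")
entails = {
    ("aspirin", "cure", "headache"): {("drug", "treat", "headache")},
    ("headache", "respond to", "painkiller"): {QUERY},
    ("drug", "treat", "headache"): {QUERY},
}

result = answers(entails, QUERY, topic="headache")
```

Because the topic can sit in either argument slot (as in “headache, respond to, painkiller”), the sketch picks whichever argument differs from the topic.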
Our Contributions • Structuring Open IE with Proposition Entailment Graphs • Dataset: 30 gold-standard graphs, 1.5 million entailment annotations • Algorithm for constructing Focused Proposition Entailment Graphs • Analysis: Predicate entailment is not quite what we thought
Related Work: Predicate Entailment Graphs • Berant et al. (2010, 2011, 2012) • We extend Berant et al.’s work from predicates to propositions
Focused Proposition Entailment Graphs • Nodes: Open IE propositions • Edges: Textual Entailment
Focused Proposition Entailment Graphs • Assumptions: • Propositions are binary (two arguments) • Each graph is focused on a common topic
(drug, relieve, headache) • (drug, treat, headache) • (headache, respond to, painkiller) • (headache, treat with, caffeine) • (analgesic, banish, headache) • (tea, soothe, headache) • (headache, control with, aspirin) • (aspirin, cure, headache) • (aspirin, eliminate, headache) • (coffee, help, headache)
Focused Proposition Entailment Graphs • Edges: Textual Entailment, specifically Proposition Entailment • Simpler than sentence-level entailment • More complicated than lexical entailment • Enables investigation of inference phenomena in an isolated manner
Constructing Proposition Entailment Graphs • Task Definition: Given a set of propositions, find all their entailment edges.
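Read literally, the task definition asks for a decision on every ordered pair of propositions. The sketch below shows only that outer loop; the `toy_entails` rule and its predicate table are hand-written stand-ins for a real entailment classifier, and all names are hypothetical.

```python
from itertools import permutations

# Hypothetical predicate knowledge, for illustration only.
PREDICATE_ENTAILS = {("cure", "eliminate")}


def toy_entails(p, q):
    """Stand-in classifier: same arguments, and the predicate pair is listed."""
    (subj1, pred1, obj1), (subj2, pred2, obj2) = p, q
    return (subj1, obj1) == (subj2, obj2) and (pred1, pred2) in PREDICATE_ENTAILS


def entailment_edges(propositions, entails):
    """Return every ordered pair (p, q) for which `entails(p, q)` holds."""
    return {(p, q) for p, q in permutations(propositions, 2) if entails(p, q)}


props = [
    ("aspirin", "cure", "headache"),
    ("aspirin", "eliminate", "headache"),
    ("drug", "relieve", "headache"),
]
edges = entailment_edges(props, toy_entails)
```

The number of candidate pairs grows quadratically: a graph with 200 propositions has 200 × 199 = 39,800 ordered pairs to judge.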
Dataset: High-Quality Open IE Propositions • Google’s Syntactic N-grams • Based on millions of books • Filter for subject-verb-object • Including prepositional objects and passive • Result: 68 million high-quality propositions
Dataset: Annotating Entailment Graphs • Select 30 healthcare topics • antibiotic, caffeine, insomnia, scurvy, … • Collect a set of propositions focused on each topic • Manually clean noisy extractions • Retaining 200 propositions per graph (average) • Efficiently annotate entailment • 1.5 million entailment judgments
How do we recognize proposition entailment? • Observation: propositions entail when their lexical components entail
How do we recognize proposition entailment? • Proposition entailment is reduced to lexical entailment in context
Lexical Entailment • Lexical Entailment Features: • WordNet Relations • UMLS • Distributional Similarity • String Edit Distance • Classifier: Lexical Entailment (Logistic), trained with supervision
From Lexical to Proposition Entailment • The single lexical entailment classifier is split into two supervised components: • Predicate Entailment Features feed Predicate Entailment (Logistic) • Argument Entailment Features feed Argument Entailment (Logistic) • Proposition Entailment (Conjunction) combines the two component decisions
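The component-then-conjunction design can be sketched as two logistic classifiers whose decisions are combined with a logical AND. This is a minimal sketch under stated assumptions: the weights, biases, feature values and the 0.5 threshold are made up for illustration, and a single argument pair is scored because the other argument is assumed to be the shared topic.

```python
import math


def logistic(weights, bias, features):
    """Probability from a logistic model: sigmoid(bias + w . x)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def proposition_entails(pred_features, arg_features, pred_model, arg_model,
                        threshold=0.5):
    """Conjunction: both the predicate and the argument must entail."""
    pred_ok = logistic(*pred_model, pred_features) >= threshold
    arg_ok = logistic(*arg_model, arg_features) >= threshold
    return pred_ok and arg_ok


# Hypothetical (weights, bias) pairs; a real system would learn these.
pred_model = ([2.0, 1.0], -1.5)
arg_model = ([3.0], -1.0)

both_entail = proposition_entails([1.0, 0.8], [1.0], pred_model, arg_model)
arg_fails = proposition_entails([1.0, 0.8], [0.0], pred_model, arg_model)
```

With a conjunction, one failing component is enough to reject the edge, which is why `arg_fails` is rejected even though the predicate score is high.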
Distant Supervision (WordNet)? • Following Snow (2005), Berant (2012) • Predicate Entailment Features feed Predicate Entailment (Logistic), supervised by WordNet • Argument Entailment Features feed Argument Entailment (Logistic), supervised by WordNet • Proposition Entailment (Conjunction)
Direct Supervision (30 Annotated Graphs) • Predicate Entailment Features feed Predicate Entailment (Logistic) • Argument Entailment Features feed Argument Entailment (Logistic) • The two component classifiers act as a Hidden Layer • Proposition Entailment (Conjunction) is supervised by the Annotated Graphs
Flat Model • Predicate Entailment Features and Argument Entailment Features feed a single Proposition Entailment (Logistic) classifier • Supervised by the Annotated Graphs
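One way to write down the contrast between the two directly supervised models is sketched here. The notation is introduced for illustration and does not come from the slides: $x_p$ and $x_a$ are the predicate and argument feature vectors, $w$ and $b$ are weights and biases, and $\sigma$ is the logistic function. The product form for the conjunction is an assumption (it treats the two hidden components as independent); the slides only name the components.

```latex
\begin{align*}
\text{Hierarchical:}\quad
  & P(\text{entail}) = \sigma\!\left(w_p^\top x_p + b_p\right)\cdot
                       \sigma\!\left(w_a^\top x_a + b_a\right) \\
\text{Flat:}\quad
  & P(\text{entail}) = \sigma\!\left(w^\top [\,x_p ; x_a\,] + b\right)
\end{align*}
```

Under this reading, the hierarchical model can reject an edge because one component scores low, while the flat model lets a strong argument signal compensate for a weak predicate signal.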
Compared Methods • Component-Level Distant Supervision (WordNet) • Predicates & Arguments • Predicates Only • Arguments Only • Proposition-Level Direct Supervision (30 Annotated Graphs) • Hierarchical (our method) • Flat • All methods used Berant et al.’s Global Optimization method
Direct Supervision: Flat vs Hierarchical • The hierarchical model performs better than the flat model • It is better to model predicate and argument entailment separately
Distant vs Direct Supervision • Direct supervision is better • Although WordNet provides more training examples
Predicate Entailment with Distant Supervision • Ignoring predicates improves distant supervision baselines
Are WordNet relations capturing real-world predicate entailments?
Predicate Entailment vs WordNet Relations • Over a predicate inference subset, how many predicate entailments are covered by WordNet? • Positive indicators: synonyms, hypernyms, entailment • Negative indicators: antonyms, hyponyms, cohyponyms • Why isn’t WordNet capturing predicate entailment?
Predicate Entailment is Context-Sensitive The words do not necessarily entail, but the situations do.
Investigating Context-Sensitive Entailment • Recent work on context-sensitive lexical inference • e.g. Melamud et al. (2013) • Previous datasets • Lexical substitution (McCarthy and Navigli, 2007) • Predicate inference (Zeichner et al., 2012) • We offer a new dataset of real-world lexical entailments in context! • Sample: synthetic vs naturally occurring • Size: several thousand vs 1.5 million