Unsupervised Acquisition of Axioms to Paraphrase Noun Compounds and Genitives
CICLING 2012, New Delhi
Anselmo Peñas, NLP & IR Group, UNED, Spain
Ekaterina Ovchinnikova, USC Information Sciences Institute, USA
Texts omit information
• Humans optimize language generation effort
• We omit information that we know the recipient is able to predict and recover
• Our research goal is to make explicit the information omitted from texts
Implicit predicates
• In particular, some noun compounds and genitives are used in this way
• In these cases, we want to recover the implicit predicates
• For example:
• Morning coffee -> coffee drunk in the morning
• Malaria mosquito -> mosquito that carries malaria
How to find the candidates?
• Nakov & Hearst 2006: search the web
• N1 N2 -> N2 THAT * N1
• Malaria mosquito -> mosquito THAT * malaria
• Here we use Proposition Stores instead
• Harvest a text collection that will serve as context
• Parse documents
• Count N-V-N, N-V-P-N, N-P-N, … structures
• Build Proposition Stores (Peñas & Hovy, 2010)
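The harvesting step lends itself to a short illustration. Below is a minimal sketch of extracting one of these structures (n-v-n) from dependency parses with spaCy; the library, model name, and pattern details are assumptions made for illustration, not the authors' actual pipeline, which covers more structures (n-v-p-n, n-p-n, …) over a large background collection.

```python
# Sketch: count n-v-n propositions (subject-verb-object triples) over a
# collection of sentences.  Counts over a background corpus become the
# proposition frequencies shown on the next slide.
from collections import Counter
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed model name

def harvest_nvn(texts):
    counts = Counter()
    for doc in nlp.pipe(texts):
        for tok in doc:
            if tok.pos_ != "VERB":
                continue
            subjects = [c for c in tok.children
                        if c.dep_ == "nsubj" and c.pos_ in ("NOUN", "PROPN")]
            objects = [c for c in tok.children
                       if c.dep_ in ("dobj", "obj") and c.pos_ in ("NOUN", "PROPN")]
            for s in subjects:
                for o in objects:
                    counts[("nvn", s.lemma_.lower(), tok.lemma_.lower(), o.lemma_.lower())] += 1
    return counts

# harvest_nvn(["The bomb killed ten people in the attack."])
# -> Counter({('nvn', 'bomb', 'kill', 'people'): 1})
```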
Proposition Stores
Example: propositions that relate bomb, attack
• npn:[bomb:n, in:in, attack:n]:13.
• nvpn:[bomb:n, explode:v, in:in, attack:n]:11.
• nvnpn:[bomb:n, kill:v, people:n, in:in, attack:n]:8.
• npn:[attack:n, with:in, bomb:n]:8.
• …
All of them could be paraphrases of the noun compound "bomb attack"
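Given such a store, looking up paraphrase candidates for a compound N1 N2 amounts to retrieving the propositions whose arguments contain both nouns, ranked by frequency. The dictionary format and function below are a toy illustration, not the authors' exact representation.

```python
# Toy store mimicking the entries above: (pattern, argument tuple) -> count.
store = {
    ("npn",   ("bomb", "in", "attack")): 13,
    ("nvpn",  ("bomb", "explode", "in", "attack")): 11,
    ("nvnpn", ("bomb", "kill", "people", "in", "attack")): 8,
    ("npn",   ("attack", "with", "bomb")): 8,
}

def candidates(n1, n2):
    # Propositions mentioning both nouns, most frequent first.
    hits = [(args, freq) for (_, args), freq in store.items()
            if n1 in args and n2 in args]
    return sorted(hits, key=lambda x: -x[1])

print(candidates("bomb", "attack"))
```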
NE Semantic Classes
Now, what happens if we have a Named Entity?
• Shakespeare's tragedy -> write
• Why? Consider:
• John's tragedy
• Airbus' tragedy
NE Semantic Classes
We consider the "semantic classes" of the NE:
• Shakespeare -> writer
• writer, tragedy -> write
Class-Instance relations
• Fortunately, relevant semantic classes are pointed out in texts through well-known structures
• appositions, copulative verbs, "such as", …
• Here we take advantage of dependency parsing to get class-instance relations
[Figure: dependency patterns linking proper nouns (NNP) to class nouns (NN) via nn, appos, and copular (be) relations]
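As an illustration of this extraction step, here is a rough sketch that mines class-instance pairs from two of the structures mentioned above (appositions and copular sentences) using spaCy dependency labels; the coverage and normalization are simplified assumptions rather than the authors' exact rules.

```python
# Sketch: class-instance pairs from appositions ("the leader, Yasir Arafat")
# and copular sentences ("Arafat is the leader ...").
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed model name

def class_instance_pairs(text):
    pairs = []
    doc = nlp(text)
    for tok in doc:
        # Apposition: a proper noun attached to a common noun, or vice versa.
        if tok.dep_ == "appos":
            head = tok.head
            if tok.pos_ == "PROPN" and head.pos_ == "NOUN":
                pairs.append((head.lemma_.lower(), tok.text))
            elif tok.pos_ == "NOUN" and head.pos_ == "PROPN":
                pairs.append((tok.lemma_.lower(), head.text))
        # Copula: "<PROPN> is a <NOUN>" (predicative complement).
        if tok.dep_ == "attr" and tok.pos_ == "NOUN":
            subj = [c for c in tok.head.children
                    if c.dep_ == "nsubj" and c.pos_ == "PROPN"]
            if subj:
                pairs.append((tok.lemma_.lower(), subj[0].text))
    return pairs

# class_instance_pairs("Arafat is the leader of the PLO.")
# -> [('leader', 'Arafat')]
```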
Class-Instance relations
Example from World News:
• has_instance(leader,'Yasir':'Arafat'):1491.
• has_instance(spokesman,'Marlin':'Fitzwater'):1001.
• has_instance(leader,'Mikhail':'S.':'Gorbachev'):980.
• has_instance(chairman,'Yasir':'Arafat'):756.
• has_instance(agency,'Tass'):637.
• has_instance(leader,'Radovan':'Karadzic'):611.
• has_instance(adviser,'Condoleezza':'Rice'):590.
• …
So far
• Propositions: <p,a> | P(p,a)
• p: predicate
• a: list of arguments <a1 … an>
• P(p,a): joint probability
• Class-instance relations: <c,i> | P(c,i)
• c: class
• i: instance
• P(c,i): joint probability
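One plausible reading of these joint probabilities, assumed here only for illustration since the slide does not spell it out, is a relative-frequency estimate over the counts stored in the two resources:

```latex
% Maximum-likelihood (relative-frequency) estimates from the stored counts;
% this particular normalization is an illustrative assumption.
P(p,a) \approx \frac{\mathrm{count}(p,a)}{\sum_{p',a'} \mathrm{count}(p',a')}
\qquad
P(c,i) \approx \frac{\mathrm{count}(c,i)}{\sum_{c',i'} \mathrm{count}(c',i')}
```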
Probability of a predicate
• Let's consider the following example: Favre pass
• Assume the text has pointed out that he is a quarterback
• What is Favre doing with the pass? The same as other quarterbacks
• The quarterbacks we observed before in the background collection (the Proposition Store)
Probability of a predicate
• We want: Favre pass -> p | P(p|i)
• We already have: Favre -> quarterback | P(c|i)
• We need to estimate: quarterback, pass -> throw | P(p|c) (what other quarterbacks do with passes)
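A natural way to combine these pieces, given here only as a sketch since the slide does not state the exact formula, is to marginalize over the semantic classes of the instance:

```latex
% Sketch of the combination; the paper may use a different estimator.
P(p \mid i) \;\approx\; \sum_{c} P(p \mid c)\, P(c \mid i)
```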
Probability of a predicate
• quarterback pass -> p | P(p|c)
• Steve:Young pass -> throw | P(p|i)
• Culpepper pass -> complete | P(p|i)
• …
• We already have the class-instance probabilities from the previous step, and P(p|i) comes from previous observations in the Proposition Store
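The estimation step on this slide can be illustrated with a short Python sketch: what a class does with a noun is read off from what its known instances do with that noun in the Proposition Store. The instance weighting P(i|c) and the toy counts below are assumptions made for illustration, not the authors' exact estimator.

```python
# Sketch: estimate P(p | c, n) by pooling what known instances i of class c
# do with the noun n in the Proposition Store.
from collections import defaultdict

def predicate_given_class(c, n, instances_of, pred_counts):
    """
    instances_of[c]     -> {instance: P(i|c)}   (from the class-instance store)
    pred_counts[(i, n)] -> {predicate: count}   (from the Proposition Store)
    Returns a normalized distribution over predicates p.
    """
    scores = defaultdict(float)
    for i, p_i_given_c in instances_of.get(c, {}).items():
        counts = pred_counts.get((i, n), {})
        if not counts:
            continue
        total = sum(counts.values())
        for p, cnt in counts.items():
            scores[p] += p_i_given_c * cnt / total
    z = sum(scores.values()) or 1.0
    return {p: s / z for p, s in scores.items()}

# Toy data in the spirit of the slide:
instances_of = {"quarterback": {"Steve Young": 0.5, "Culpepper": 0.5}}
pred_counts = {("Steve Young", "pass"): {"throw": 8, "complete": 2},
               ("Culpepper", "pass"):   {"complete": 6, "throw": 4}}
print(predicate_given_class("quarterback", "pass", instances_of, pred_counts))
# -> roughly {'throw': 0.6, 'complete': 0.4}
```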
Evaluation
We want to address the following questions:
• Do we find the paraphrases required to enable Textual Entailment?
• Do all the noun-noun dependencies need to be paraphrased?
• How frequently do NEs appear in them?
Experimental setting
• Proposition Store built from 216,303 World News documents (7,800,000 sentences parsed)
• RTE-2 (Recognizing Textual Entailment):
• 83 entailment decisions depend on noun-noun paraphrases
• 77 different noun-noun paraphrases
Results
How frequently do NEs appear in these pairs?
• 82% of paraphrases contain at least one NE
• 62% are paraphrasing NE-N pairs (e.g. Vikings quarterback)
Results
Do all the noun-noun dependencies need to be paraphrased?
• No, only 54% in our test set
• Some compounds encode semantic relations such as:
• 12% are locative relations (e.g. New York club)
• Temporal relations (e.g. April 23rd strike, Friday semi-final)
• Class-instance relations (e.g. quarterback Favre)
• Measure, …
• Some are trivial:
• 27% are paraphrased with "of"
Results
• Do we find the paraphrases required to enable Textual Entailment?
• Yes, in 63% of non-trivial cases
Results
RTE-2 pair 485: paraphrase not found
• United Nations vehicle ↔ United Nations produces vehicles
• United Nations doesn't share any class with the instances that "produce vehicles"
• Toyota vehicle -> develop, build, sell, produce, make, export, recall, assemble, …
Conclusions
• A significant proportion of noun-noun dependencies includes Named Entities
• Some noun-noun dependencies don't require the retrieval of implicit predicates
• The method proposed is sensitive to different NEs
• Different NEs retrieve different predicates
• Current work: select the most relevant paraphrase according to the text
• We are exploring weighted abduction
Unsupervised Acquisition of Axioms to Paraphrase Noun Compounds and Genitives
CICLING 2012, New Delhi
Thanks!