Unsupervised Acquisition of Axioms to Paraphrase Noun Compounds and Genitives
CICLING 2012, New Delhi
Anselmo Peñas, NLP & IR Group, UNED, Spain
Ekaterina Ovchinnikova, USC Information Sciences Institute, USA
Texts omit information
• Humans optimize language generation effort
• We omit information that we know the recipient is able to predict and recover
• Our research goal is to make explicit the information omitted from texts
Implicit predicates
• In particular, some noun compounds and genitives are used in this way
• In these cases, we want to recover the implicit predicates
• For example:
• Morning coffee -> coffee drunk in the morning
• Malaria mosquito -> mosquito that carries malaria
How to find the candidates?
• Nakov & Hearst 2006: search the web
• N1 N2 -> N2 THAT * N1
• Malaria mosquito -> mosquito THAT * malaria
• Here we use Proposition Stores instead
• Harvest a text collection that will serve as context
• Parse documents
• Count N-V-N, N-V-P-N, N-P-N, … structures
• Build Proposition Stores (Peñas & Hovy, 2010)
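The harvesting step lends itself to a short illustration. Below is a minimal sketch of extracting one of these structures (n-v-n) from dependency parses with spaCy; the library, model name, and pattern details are assumptions made for illustration, not the authors' actual pipeline, which covers more structures (n-v-p-n, n-p-n, …) over a large background collection.

```python
# Sketch: count n-v-n propositions (subject-verb-object triples) over a
# collection of sentences.  Counts over a background corpus become the
# proposition frequencies shown on the next slide.
from collections import Counter
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed model name

def harvest_nvn(texts):
    counts = Counter()
    for doc in nlp.pipe(texts):
        for tok in doc:
            if tok.pos_ != "VERB":
                continue
            subjects = [c for c in tok.children
                        if c.dep_ == "nsubj" and c.pos_ in ("NOUN", "PROPN")]
            objects = [c for c in tok.children
                       if c.dep_ in ("dobj", "obj") and c.pos_ in ("NOUN", "PROPN")]
            for s in subjects:
                for o in objects:
                    counts[("nvn", s.lemma_.lower(), tok.lemma_.lower(), o.lemma_.lower())] += 1
    return counts

# harvest_nvn(["The bomb killed ten people in the attack."])
# -> Counter({('nvn', 'bomb', 'kill', 'people'): 1})
```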
Proposition Stores
Example: propositions that relate bomb, attack
• npn:[bomb:n, in:in, attack:n]:13.
• nvpn:[bomb:n, explode:v, in:in, attack:n]:11.
• nvnpn:[bomb:n, kill:v, people:n, in:in, attack:n]:8.
• npn:[attack:n, with:in, bomb:n]:8.
• …
All of them could be paraphrases of the noun compound "bomb attack"
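Given such a store, looking up paraphrase candidates for a compound N1 N2 amounts to retrieving the propositions whose arguments contain both nouns, ranked by frequency. The dictionary format and function below are a toy illustration, not the authors' exact representation.

```python
# Toy store mimicking the entries above: (pattern, argument tuple) -> count.
store = {
    ("npn",   ("bomb", "in", "attack")): 13,
    ("nvpn",  ("bomb", "explode", "in", "attack")): 11,
    ("nvnpn", ("bomb", "kill", "people", "in", "attack")): 8,
    ("npn",   ("attack", "with", "bomb")): 8,
}

def candidates(n1, n2):
    # Propositions mentioning both nouns, most frequent first.
    hits = [(args, freq) for (_, args), freq in store.items()
            if n1 in args and n2 in args]
    return sorted(hits, key=lambda x: -x[1])

print(candidates("bomb", "attack"))
```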
NE Semantic Classes
Now, what happens if we have a Named Entity?
• Shakespeare's tragedy -> write
• Why? Consider:
• John's tragedy
• Airbus' tragedy
NE Semantic Classes
We consider the "semantic classes" of the NE:
• Shakespeare -> writer
• writer, tragedy -> write
Class-Instance relations
• Fortunately, relevant semantic classes are pointed out in texts through well-known structures
• appositions, copulative verbs, "such as", …
• Here we take advantage of dependency parsing to get class-instance relations
[Figure: dependency patterns linking proper nouns (NNP) to class nouns (NN) via nn, appos, and copular (be) relations]
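As an illustration of this extraction step, here is a rough sketch that mines class-instance pairs from two of the structures mentioned above (appositions and copular sentences) using spaCy dependency labels; the coverage and normalization are simplified assumptions rather than the authors' exact rules.

```python
# Sketch: class-instance pairs from appositions ("the leader, Yasir Arafat")
# and copular sentences ("Arafat is the leader ...").
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed model name

def class_instance_pairs(text):
    pairs = []
    doc = nlp(text)
    for tok in doc:
        # Apposition: a proper noun attached to a common noun, or vice versa.
        if tok.dep_ == "appos":
            head = tok.head
            if tok.pos_ == "PROPN" and head.pos_ == "NOUN":
                pairs.append((head.lemma_.lower(), tok.text))
            elif tok.pos_ == "NOUN" and head.pos_ == "PROPN":
                pairs.append((tok.lemma_.lower(), head.text))
        # Copula: "<PROPN> is a <NOUN>" (predicative complement).
        if tok.dep_ == "attr" and tok.pos_ == "NOUN":
            subj = [c for c in tok.head.children
                    if c.dep_ == "nsubj" and c.pos_ == "PROPN"]
            if subj:
                pairs.append((tok.lemma_.lower(), subj[0].text))
    return pairs

# class_instance_pairs("Arafat is the leader of the PLO.")
# -> [('leader', 'Arafat')]
```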
Class-Instance relations
Example from World News:
• has_instance(leader,'Yasir':'Arafat'):1491.
• has_instance(spokesman,'Marlin':'Fitzwater'):1001.
• has_instance(leader,'Mikhail':'S.':'Gorbachev'):980.
• has_instance(chairman,'Yasir':'Arafat'):756.
• has_instance(agency,'Tass'):637.
• has_instance(leader,'Radovan':'Karadzic'):611.
• has_instance(adviser,'Condoleezza':'Rice'):590.
• …
So far
• Propositions: <p,a> | P(p,a)
• p: predicate
• a: list of arguments <a1 … an>
• P(p,a): joint probability
• Class-instance relations: <c,i> | P(c,i)
• c: class
• i: instance
• P(c,i): joint probability
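One plausible reading of these joint probabilities, assumed here only for illustration since the slide does not spell it out, is a relative-frequency estimate over the counts stored in the two resources:

```latex
% Maximum-likelihood (relative-frequency) estimates from the stored counts;
% this particular normalization is an illustrative assumption.
P(p,a) \approx \frac{\mathrm{count}(p,a)}{\sum_{p',a'} \mathrm{count}(p',a')}
\qquad
P(c,i) \approx \frac{\mathrm{count}(c,i)}{\sum_{c',i'} \mathrm{count}(c',i')}
```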
Probability of a predicate
• Let's consider the following example: Favre pass
• Assume the text has pointed out that he is a quarterback
• What is Favre doing with the pass? The same as other quarterbacks
• The quarterbacks we observed before in the background collection (the Proposition Store)
Probability of a predicate
• We want: Favre pass -> p | P(p|i)
• We already have: Favre -> quarterback | P(c|i)
• We need to estimate: quarterback, pass -> throw | P(p|c) (what other quarterbacks do with passes)
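A natural way to combine these pieces, given here only as a sketch since the slide does not state the exact formula, is to marginalize over the semantic classes of the instance:

```latex
% Sketch of the combination; the paper may use a different estimator.
P(p \mid i) \;\approx\; \sum_{c} P(p \mid c)\, P(c \mid i)
```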
Probability of a predicate
• quarterback pass -> p | P(p|c)
• Steve:Young pass -> throw | P(p|i)
• Culpepper pass -> complete | P(p|i)
• …
• We already have the class-instance probabilities from the previous step, and P(p|i) comes from previous observations in the Proposition Store
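The estimation step on this slide can be illustrated with a short Python sketch: what a class does with a noun is read off from what its known instances do with that noun in the Proposition Store. The instance weighting P(i|c) and the toy counts below are assumptions made for illustration, not the authors' exact estimator.

```python
# Sketch: estimate P(p | c, n) by pooling what known instances i of class c
# do with the noun n in the Proposition Store.
from collections import defaultdict

def predicate_given_class(c, n, instances_of, pred_counts):
    """
    instances_of[c]     -> {instance: P(i|c)}   (from the class-instance store)
    pred_counts[(i, n)] -> {predicate: count}   (from the Proposition Store)
    Returns a normalized distribution over predicates p.
    """
    scores = defaultdict(float)
    for i, p_i_given_c in instances_of.get(c, {}).items():
        counts = pred_counts.get((i, n), {})
        if not counts:
            continue
        total = sum(counts.values())
        for p, cnt in counts.items():
            scores[p] += p_i_given_c * cnt / total
    z = sum(scores.values()) or 1.0
    return {p: s / z for p, s in scores.items()}

# Toy data in the spirit of the slide:
instances_of = {"quarterback": {"Steve Young": 0.5, "Culpepper": 0.5}}
pred_counts = {("Steve Young", "pass"): {"throw": 8, "complete": 2},
               ("Culpepper", "pass"):   {"complete": 6, "throw": 4}}
print(predicate_given_class("quarterback", "pass", instances_of, pred_counts))
# -> roughly {'throw': 0.6, 'complete': 0.4}
```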
Evaluation
We want to address the following questions:
• Do we find the paraphrases required to enable Textual Entailment?
• Do all the noun-noun dependencies need to be paraphrased?
• How frequently do NEs appear in them?
Experimental setting
• Proposition Store built from 216,303 World News documents (7,800,000 sentences parsed)
• RTE-2 (Recognizing Textual Entailment):
• 83 entailment decisions depend on noun-noun paraphrases
• 77 different noun-noun paraphrases
Results
How frequently do NEs appear in these pairs?
• 82% of paraphrases contain at least one NE
• 62% are paraphrasing NE-N pairs (e.g. Vikings quarterback)
Results
Do all the noun-noun dependencies need to be paraphrased?
• No, only 54% in our test set
• Some compounds encode semantic relations such as:
• 12% are locative relations (e.g. New York club)
• Temporal relations (e.g. April 23rd strike, Friday semi-final)
• Class-instance relations (e.g. quarterback Favre)
• Measure, …
• Some are trivial:
• 27% are paraphrased with "of"
Results
• Do we find the paraphrases required to enable Textual Entailment?
• Yes, in 63% of non-trivial cases
Results
RTE-2 pair 485: paraphrase not found
• United Nations vehicle ↔ United Nations produces vehicles
• United Nations doesn't share any class with the instances that "produce vehicles"
• Toyota vehicle -> develop, build, sell, produce, make, export, recall, assemble, …
Conclusions
• A significant proportion of noun-noun dependencies includes Named Entities
• Some noun-noun dependencies don't require the retrieval of implicit predicates
• The method proposed is sensitive to different NEs
• Different NEs retrieve different predicates
• Current work: select the most relevant paraphrase according to the text
• We are exploring weighted abduction
Unsupervised Acquisition of Axioms to Paraphrase Noun Compounds and Genitives
CICLING 2012, New Delhi
Thanks!