Components for a semantic textual similarity system. Focus on word and sentence similarity. Formal side: define similarity in principle. Characterizing word meaning in context.
Components for a semantic textual similarity system • Focus on word and sentence similarity • Formal side: define similarity in principle
Characterizing word meaning in context • Given a word in a particular sentence context: Can we characterize its meaning without reference to dictionary senses? • Why? • For many lemmas, it is hard to draw sense boundaries (see Kilgarriff, Hanks in lexicography; Kintsch in cognition; Cruse, Tuggy in cognitive linguistics)
Characterizing word meaning in context • How? • Compute vector space representation for word in particular sentence context • Read off: contextually appropriate paraphrases
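The two steps above can be sketched in a few lines. This is a minimal illustration, not the method of any particular paper: it assumes component-wise multiplication as the way to combine a target vector with its context, and cosine similarity for ranking paraphrase candidates. The vocabulary, the four dimensions, and all numbers in `VECTORS` are invented for the example, and the function names are hypothetical.

```python
import math

# Toy vectors over four hypothetical dimensions (money, water, loan, fish).
# All values are invented for illustration; they come from no corpus.
VECTORS = {
    "bank":    [0.80, 0.70, 0.60, 0.50],  # mixes financial and river uses
    "river":   [0.05, 0.90, 0.05, 0.80],
    "deposit": [0.90, 0.10, 0.80, 0.05],
    "lender":  [0.90, 0.05, 0.90, 0.05],
    "shore":   [0.10, 0.90, 0.05, 0.70],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contextualize(target, context_words):
    """Adapt the target vector by multiplying in each context word's vector."""
    vec = list(VECTORS[target])
    for word in context_words:
        vec = [a * b for a, b in zip(vec, VECTORS[word])]
    return vec


def rank_paraphrases(target, context_words, candidates):
    """Return (candidate, score) pairs, best paraphrase first."""
    in_context = contextualize(target, context_words)
    scored = [(c, cosine(in_context, VECTORS[c])) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(rank_paraphrases("bank", ["river"], ["lender", "shore"]))
    print(rank_paraphrases("bank", ["deposit"], ["lender", "shore"]))
```

With these toy numbers, "shore" ranks first for "bank" next to "river", and "lender" ranks first next to "deposit". The ranking depends entirely on the invented vectors, so the sketch only shows the mechanics.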
Approaches • Make clusters that correspond to senses. In a given context, compute a weighting over clusters / choose a cluster (Reisinger & Mooney, Dinu & Lapata) • One vector per word: mixes senses • Use context to “bend” the word vector, adapting it to the given context (Mitchell & Lapata, Erk & Padó, Thater et al.) • Language modeling (Washtell, Moon & Erk)
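The cluster-based approach can be illustrated as follows. This is a rough sketch under stated assumptions, not the procedure of the cited papers: it assumes sense clusters are already available as centroid vectors, scores each centroid by cosine similarity to a context vector, and turns the scores into weights with a softmax. The `temperature` parameter, the function names, and the toy numbers are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cluster_weights(context_vec, centroids, temperature=0.1):
    """Weight each sense cluster by its similarity to the context.

    The temperature is an illustrative assumption: lower values push the
    weighting towards choosing a single cluster.
    """
    return softmax([cosine(context_vec, c) / temperature for c in centroids])


def contextualized_vector(context_vec, centroids, temperature=0.1):
    """Weighted sum of the cluster centroids for this context."""
    weights = cluster_weights(context_vec, centroids, temperature)
    dim = len(centroids[0])
    return [sum(w * c[i] for w, c in zip(weights, centroids)) for i in range(dim)]


# Two invented sense centroids for "bank": financial sense, river sense.
BANK_CENTROIDS = [
    [0.90, 0.05, 0.80, 0.05],
    [0.05, 0.90, 0.05, 0.80],
]

if __name__ == "__main__":
    river_context = [0.10, 0.80, 0.00, 0.90]  # invented context vector
    print(cluster_weights(river_context, BANK_CENTROIDS))
    print(contextualized_vector(river_context, BANK_CENTROIDS))
```

Choosing a single cluster corresponds to taking the highest-weighted centroid, while the weighted sum keeps a graded mixture of senses.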
Using contextualized word vectors • Part of sentence similarity approach (Reddy et al.) • Paraphrases • Determine inference rule applicability
Viewpoint from vector space approaches to sentence similarity • Mitchell & Lapata; Clark, Coecke, Sadrzadeh, Grefenstette; Baroni & Zamparelli; Socher et al. • Mostly applied to phrase pairs / sentence pairs with the same structure • Even Socher et al. seem to focus on cases with mostly parallel sentence structure
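To see why sentence structure matters for vector-based sentence similarity, consider the simplest composition function, vector addition. The sketch below is an illustration only, with invented word vectors and hypothetical function names; the approaches cited above use richer composition functions. Because addition is commutative, two sentences that contain the same words in different roles receive the same sentence vector.

```python
import math

# Invented three-dimensional word vectors, for illustration only.
WORD_VECTORS = {
    "dog":   [0.9, 0.1, 0.3],
    "bites": [0.2, 0.8, 0.4],
    "man":   [0.4, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def additive_sentence_vector(tokens):
    """Compose a sentence vector by summing its word vectors."""
    dim = len(next(iter(WORD_VECTORS.values())))
    total = [0.0] * dim
    for token in tokens:
        total = [a + b for a, b in zip(total, WORD_VECTORS[token])]
    return total


def sentence_similarity(tokens_a, tokens_b):
    """Cosine similarity of two additively composed sentence vectors."""
    return cosine(additive_sentence_vector(tokens_a),
                  additive_sentence_vector(tokens_b))


if __name__ == "__main__":
    # Same words, opposite roles: the additive model cannot tell them apart.
    print(sentence_similarity(["dog", "bites", "man"], ["man", "bites", "dog"]))
```

The two sentences describe different events, yet their similarity is 1 up to floating-point rounding, because the model has no access to who did what to whom.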
Similarity between sentences of dissimilar structure • Central: MWE and alternation detection • Lemma-specific paraphrases and MWEs: covered by automatically induced inference rules • Alternations: • passivization • John broke the vase / the vase broke • Principled approach: graph rewriting system to transform sentence structure (Bar-Haim et al.)
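As a toy illustration of rewriting sentence structure, the sketch below normalizes a passive clause to its active form. It is not Bar-Haim et al.'s system: it assumes a sentence is given as a set of (head lemma, relation, dependent) triples, uses relation labels that loosely follow Universal Dependencies (`nsubj:pass`, `obl:agent`, `aux:pass`), and applies one hand-written rule. The function name and the example triples are hypothetical.

```python
def passive_to_active(edges):
    """Rewrite passive clauses into active form.

    `edges` is an iterable of (head, relation, dependent) triples. A head is
    rewritten only if it has both a passive subject and an explicit agent.
    """
    edges = set(edges)
    has_passive_subject = {h for h, rel, _ in edges if rel == "nsubj:pass"}
    has_agent = {h for h, rel, _ in edges if rel == "obl:agent"}
    rewritable = has_passive_subject & has_agent

    rewritten = set()
    for head, rel, dep in edges:
        if head not in rewritable:
            rewritten.add((head, rel, dep))     # untouched clause
        elif rel == "nsubj:pass":
            rewritten.add((head, "obj", dep))   # passive subject becomes object
        elif rel == "obl:agent":
            rewritten.add((head, "nsubj", dep)) # agent becomes subject
        elif rel == "aux:pass":
            continue                            # drop the passive auxiliary
        else:
            rewritten.add((head, rel, dep))
    return rewritten


# "The vase was broken by John" (lemmatized, function words mostly omitted).
PASSIVE = {
    ("break", "nsubj:pass", "vase"),
    ("break", "obl:agent", "John"),
    ("break", "aux:pass", "be"),
}

# "John broke the vase"
ACTIVE = {
    ("break", "nsubj", "John"),
    ("break", "obj", "vase"),
}

if __name__ == "__main__":
    print(passive_to_active(PASSIVE) == ACTIVE)
```

After the rewrite, the passive and active sentences have identical edge sets, so a structure-sensitive similarity measure can treat them as equivalent. The alternation "John broke the vase / the vase broke" would need a separate, lemma-specific rule.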
A plug for events and SRL • Central: identifying events & participants about which the sentences speak • Semantically equivalent sentences talk about the same events • Hence, SRL, coreference
A plug for events and SRL • Once events have been identified: • time and date expressions • modals • negation • embedded propositions