Align, Disambiguate, and Walk: A Unified Approach for Measuring Semantic Similarity
Mohammad Taher Pilehvar, David Jurgens, Roberto Navigli
Semantic Similarity: how similar are a pair of lexical items?
Three levels: Sense Level, Word Level, Sentence Level
Semantic Similarity, Sentence Level
Applications:
• Paraphrase recognition (Tsatsaronis et al., 2010)
• MT evaluation (Kauchak and Barzilay, 2006)
• Question Answering (Surdeanu et al., 2011)
• Textual Entailment (Dagan et al., 2006)
Example pair: "The worker was terminated" / "The boss fired him"
Semantic Similarity, Word Level
Applications:
• Lexical simplification (Biran et al., 2011): Loquacious → Talkative
• Lexical substitution (McCarthy and Navigli, 2009): heater → fireplace
Semantic Similarity, Sense Level
Applications:
• Coarsening sense inventories (Snow et al., 2007)
• Semantic priming (Neely et al., 1989)
Example pair: fire sense #1 / fire sense #8
Existing Similarity Measures (sentence, word, and sense level)
Allison and Dix (1986); Gusfield (1997); Wise (1996); Patwardhan (2003); Keselj et al. (2003); Banerjee and Pedersen (2003); Salton and McGill (1983); Hirst and St-Onge (1998); Gabrilovich and Markovitch (2007); Lin (1998); Radinsky et al. (2011); Jiang and Conrath (1997); Ramage et al. (2009); Resnik (1995); Yeh et al. (2009); Sussna (1993, 1997); Turney (2007); Wu and Palmer (1994); Landauer et al. (1998); Leacock and Chodorow (1998)
Existing Similarity Measures: but
• None directly covers all levels at the same time
• Different output scales
• Different internal representations which are not comparable to each other
Contribution (sense, word, sentence)
• A unified representation for any lexical item
• State-of-the-art performance at each level
• Using only WordNet
Advantage 1: Unified representation
All lexical items (senses, words, sentences/texts) are represented in the same way.
Advantage 2: Cross-level semantic similarity
Items at different levels can be compared directly, e.g., the sentence "A large and imposing house", the word "Mansion", and the sense Residence#3.
Advantage 3: Sense-level operation
A word is mapped to a set of senses, so ambiguity is handled explicitly, e.g., "Worker was fired." / "He was terminated." → worker#1, fire#4, terminate#4.
Outline
• Introduction
• Methodology
• Experiments
How does it work?
Each lexical item (lexical item 1, lexical item 2) is transformed into a semantic signature; the two signatures are then compared.
Semantic Signature
Any lexical item (sense, word, sentence) is mapped to a unified semantic signature.
Semantic Signature (pipeline)
sense / word / sentence → alignment-based disambiguation → sense or set of senses → unified semantic signature
Semantic Signature: example input "A woman is frying food"
Semantic Signature
A distributional representation over all synsets in WordNet; each dimension gives the importance of that synset (e.g., syn_4) for our lexical item.
Semantic Signature for "a woman is frying food"
Related synsets such as cooking#1, frying_pan#1, kitchen#3, oil#4, sugar#2, and natural_gas#2 receive high weight; unrelated synsets such as ship#1, table#3, physics#1, and carpet#2 receive low weight.
Personalized PageRank
"a woman is frying food" → seed senses { woman_n^1, fry_v^2, food_n^1 }
Running Personalized PageRank from these seeds assigns a weight to every WordNet synset, e.g., food_n^1, food_n^2, cook_v^3, beverage_n^1, dish_n^2, fat_n^1, cooking_n^1, french_fries_n^1, nutriment_n^1, fry_v^2. These weights form a semantic signature.
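As an illustration of this step, the sketch below runs Personalized PageRank over a tiny toy graph standing in for the WordNet semantic network. The synset names, edges, and seed weights are made up for the example; the actual system walks the full WordNet graph.

import networkx as nx

# Toy stand-in for the WordNet semantic network: nodes are synsets, edges are
# semantic relations. (Illustrative only; the real method uses the full WordNet graph.)
G = nx.Graph()
G.add_edges_from([
    ("fry.v.02", "cook.v.03"), ("cook.v.03", "cooking.n.01"),
    ("cooking.n.01", "food.n.01"), ("food.n.01", "nutriment.n.01"),
    ("food.n.01", "dish.n.02"), ("fry.v.02", "french_fries.n.01"),
    ("woman.n.01", "person.n.01"), ("physics.n.01", "science.n.01"),
])

# Restart (personalization) vector: probability mass teleports only to the
# seed senses obtained for "a woman is frying food".
seeds = {"woman.n.01", "fry.v.02", "food.n.01"}
personalization = {node: (1.0 if node in seeds else 0.0) for node in G}

# The stationary distribution of the personalized random walk is the
# unified semantic signature: a weight for every synset in the graph.
signature = nx.pagerank(G, alpha=0.85, personalization=personalization)

for synset, weight in sorted(signature.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{synset}\t{weight:.3f}")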
Comparing Semantic Signatures
• Parametric: Cosine
• Non-parametric: Weighted Overlap, Top-k Jaccard
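The following is a minimal sketch of these three comparisons over sparse signatures (Python dicts mapping synsets to weights). The Weighted Overlap code reflects our reading of the measure (harmonically weighting the ranks of shared dimensions, normalized by the best attainable score); the Top-k cut-off k is a tunable parameter, not a value taken from the paper.

import math

def cosine(sig1, sig2):
    # Parametric comparison: cosine between two sparse signatures (dict synset -> weight).
    dot = sum(w * sig2.get(s, 0.0) for s, w in sig1.items())
    n1 = math.sqrt(sum(w * w for w in sig1.values()))
    n2 = math.sqrt(sum(w * w for w in sig2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def weighted_overlap(sig1, sig2):
    # Non-parametric comparison: harmonically weight the ranks of the synsets
    # present in both signatures, normalized by the best possible (identical-rank) score.
    rank1 = {s: r for r, (s, _) in enumerate(sorted(sig1.items(), key=lambda kv: -kv[1]), start=1)}
    rank2 = {s: r for r, (s, _) in enumerate(sorted(sig2.items(), key=lambda kv: -kv[1]), start=1)}
    shared = set(rank1) & set(rank2)
    if not shared:
        return 0.0
    numerator = sum(1.0 / (rank1[s] + rank2[s]) for s in shared)
    denominator = sum(1.0 / (2 * i) for i in range(1, len(shared) + 1))
    return numerator / denominator

def topk_jaccard(sig1, sig2, k):
    # Non-parametric comparison: Jaccard overlap of the top-k synsets of each signature.
    top1 = {s for s, _ in sorted(sig1.items(), key=lambda kv: -kv[1])[:k]}
    top2 = {s for s, _ in sorted(sig2.items(), key=lambda kv: -kv[1])[:k]}
    return len(top1 & top2) / len(top1 | top2) if top1 | top2 else 0.0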
Alignment-based Disambiguation (pipeline)
sense / word / sentence → alignment-based disambiguation → sense or set of senses → unified semantic signature
Why is disambiguation needed?
"The worker was fired" / "He was terminated": fire and terminate are similar only under their dismissal senses, so the intended senses must be identified before the two items are compared.
Alignment-based disambiguation An employeewasterminated from work by his boss. A manager fired the worker.
Alignment-based disambiguation An employeewasterminatedfrom workby hisboss. A managerfiredthe worker.
Alignment-based disambiguation employeen bossn managern firev workn workern terminatev
Alignment-based disambiguation
Each content word is expanded into its candidate WordNet senses:
• Sentence 1: employee_n^1; terminate_v^1 ... terminate_v^4; work_n^1 ... work_n^3; boss_n^1, boss_n^2
• Sentence 2: manager_n^1, manager_n^2; fire_v^1, fire_v^2, fire_v^3, ...; worker_n^1, worker_n^2
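One way to enumerate these candidate senses is through NLTK's WordNet interface; the snippet below is only an illustration of that lookup (the word/POS lists come from the running example).

from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

sentence1 = [("employee", wn.NOUN), ("terminate", wn.VERB),
             ("work", wn.NOUN), ("boss", wn.NOUN)]
sentence2 = [("manager", wn.NOUN), ("fire", wn.VERB), ("worker", wn.NOUN)]

# Enumerate the candidate WordNet senses for every content word.
for lemma, pos in sentence1 + sentence2:
    senses = wn.synsets(lemma, pos=pos)
    print(lemma, [s.name() for s in senses])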
Alignment-based disambiguation
Each candidate sense of a word in one sentence (e.g., manager_n^1, manager_n^2) is scored against the candidate senses of the content words in the other sentence (sense-to-sense similarity scores such as 0.5 and 0.3 in the example), and the word is assigned the sense that achieves the maximal cross-sentence similarity (cf. Tversky, 1977; Markman and Gentner, 1993).
Alignment-based disambiguation
Repeating the alignment for every content word in both sentences selects the maximizing senses, e.g., fire_v^4 and terminate_v^4 (the dismissal senses of fire and terminate); the resulting disambiguated senses feed into the unified semantic signature.
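A hedged sketch of this alignment step: assuming a signature(sense) function (e.g., the Personalized PageRank signature above) and a compare function (e.g., weighted_overlap), each word is assigned the candidate sense that is most similar to any candidate sense of any word in the paired sentence. Function names and structure are illustrative, not the authors' code.

from nltk.corpus import wordnet as wn

def disambiguate_by_alignment(words1, words2, signature, compare):
    # For each (lemma, pos) in sentence 1, choose the candidate sense that is most
    # similar to ANY candidate sense of ANY word in sentence 2.
    # signature(sense) -> dict synset -> weight; compare(sig_a, sig_b) -> similarity score.
    other_sigs = [signature(s) for lemma, pos in words2 for s in wn.synsets(lemma, pos=pos)]
    chosen = {}
    for lemma, pos in words1:
        best_sense, best_score = None, float("-inf")
        for sense in wn.synsets(lemma, pos=pos):
            sig = signature(sense)
            score = max((compare(sig, other) for other in other_sigs), default=float("-inf"))
            if score > best_score:
                best_sense, best_score = sense, score
        chosen[(lemma, pos)] = best_sense
    return chosen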
Outline
• Introduction
• Methodology
• Experiments
Experiments
• Sentence level: Semantic Textual Similarity (SemEval-2012)
• Word level: Synonymy recognition (TOEFL dataset); Correlation-based (RG-65 dataset)
• Sense level: Coarsening the WordNet sense inventory
Experiment 1: Similarity at the Sentence Level
• Semantic Textual Similarity (STS-12): 5 datasets
• Three evaluation measures: ALL, ALLnrm, and Mean
• Top-ranking systems: UKP2 (Bär et al., 2012), TLSim and TLSyn (Šarić et al., 2012)
Experiment 1: Similarity at the Sentence Level (Features)
• Main features: Cosine, Weighted Overlap, Top-k Jaccard
• String-based features (see the sketch below): Longest common substring, Longest common subsequence, Greedy string tiling, Character/word n-grams
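For reference, standard versions of a few of these string-based features can be sketched as follows. These are textbook definitions, not the authors' exact feature extractors; greedy string tiling is omitted for brevity.

from difflib import SequenceMatcher

def longest_common_substring(a: str, b: str) -> int:
    # Length of the longest contiguous block shared by a and b.
    m = SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))
    return m.size

def longest_common_subsequence(a: str, b: str) -> int:
    # Length of the longest (not necessarily contiguous) common subsequence,
    # computed with the usual dynamic-programming recurrence.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if ca == cb else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def char_ngram_jaccard(a: str, b: str, n: int = 3) -> float:
    # Jaccard overlap of character n-gram sets.
    ngrams = lambda s: {s[i:i + n] for i in range(len(s) - n + 1)}
    na, nb = ngrams(a), ngrams(b)
    return len(na & nb) / len(na | nb) if na | nb else 0.0

print(longest_common_substring("the worker was terminated", "the boss fired him"))
print(longest_common_subsequence("the worker was terminated", "the boss fired him"))
print(round(char_ngram_jaccard("the worker was terminated", "the boss fired him"), 3))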