SPred: Large-scale Harvesting of Semantic Predicates. Tiziano Flati and Roberto Navigli
“Over 2.25 billion cups of coffee are consumed in the world every day”
Objective: cup of *
Challenge #1: discovering representative arguments
Challenge #2: inferring semantic classes (cup of *)
LEXICAL PATTERNS, e.g. "X such as Y" [Hearst ‘92, Kozareva & Hovy ‘10, Wu & Weld ‘10]. SELECTIONAL PREFERENCES, e.g. EAT with MEAT, FISH, ICE CREAM, GAS [Resnik ‘96, Erk ‘07, Chambers & Jurafsky ‘10]
SPred. Challenge #1: discovering representative arguments. Challenge #2: inferring semantic classes
SPred. CONTRIBUTION #1: Capturing concepts for long-tail arguments using a novel wikification procedure. CONTRIBUTION #2: Inferring WordNet semantic classes from a distribution of Wikipedia pages
METHODOLOGY
HARVESTING ARGUMENTS FROM WIKIPEDIA: … cup of *, * was designed by, the biggest * in 1987, a very big * …
LINKING ARGUMENTS TO WIKIPEDIA AND WORDNET
LINKING ARGUMENTS FROM WORDNET TO SEMANTIC CLASSES: … cup of [Beverage], [Structure] was designed by, the biggest [Event] in 1987, a very big [Phenomenon] …
LEXICAL PREDICATE: cup of *
LEXICAL PREDICATES: cup of *, * was designed by, the biggest * in 1987, a very big * …
FILLING ARGUMENT: cup of coffee
FILLING ARGUMENTS for the lexical predicates cup of *, * was designed by, a very big * …: red wine, Italy, artist, hotel, dress, bridge …
Example output. SEMANTIC PREDICATE: cup of [Beverage]. Classes shown: [Liquid], [Coffee], [Alcohol], [Milk], [Irish coffee]
SEMANTIC PREDICATES for the lexical predicates cup of *, * was designed by, a very big * …: [Beverage], [Country], [Artist], [Building], [Clothing], [Platform] …, e.g. cup of [Beverage]
HARVESTING ARGUMENTS FROM WIKIPEDIA (lexical predicates): … cup of *, * was designed by, the biggest * in 1987, a very big * …
LINKING ARGUMENTS TO WIKIPEDIA AND WORDNET
LINKING ARGUMENTS FROM WORDNET TO SEMANTIC CLASSES (lexical predicate + CLASS): … cup of [Beverage], [Structure] was designed by, the biggest [Event] in 1987, a very big [Phenomenon] …
cup of *: coffee, tea, Italy, milk, yeast …
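The harvesting step can be illustrated with a minimal sketch. It assumes that a lexical predicate contains exactly one `*` and that an argument is a single word; the slides do not state how arguments are actually delimited, and the function names `predicate_to_regex` and `harvest_arguments` are hypothetical.

```python
import re
from collections import Counter

def predicate_to_regex(predicate: str) -> re.Pattern:
    """Turn a lexical predicate such as 'cup of *' into a regex.

    Assumption: the predicate has exactly one '*', which becomes a
    capture group matching a single word.
    """
    parts = [r"(\w+)" if tok == "*" else re.escape(tok)
             for tok in predicate.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

def harvest_arguments(predicate: str, sentences: list[str]) -> Counter:
    """Count the words that fill the '*' slot across all sentences."""
    pattern = predicate_to_regex(predicate)
    counts = Counter()
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            counts[match.group(1).lower()] += 1
    return counts

# Toy sentences, invented for illustration only.
toy_sentences = [
    "He drank a cup of coffee.",
    "A cup of tea, then another cup of coffee.",
]
print(harvest_arguments("cup of *", toy_sentences))
```

This toy matcher does not handle inflected forms such as "cups of" or multi-word arguments such as "Earl Grey tea"; a real harvester would need both.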
HARVESTING ARGUMENTS FROM WIKIPEDIA (lexical predicates): … cup of *, * was designed by, the biggest * in 1987, a very big * …
LINKING ARGUMENTS TO WIKIPEDIA AND WORDNET
LINKING ARGUMENTS FROM WORDNET TO SEMANTIC CLASSES (lexical predicate + [CLASS]): … cup of [Beverage], [Structure] was designed by, the biggest [Event] in 1987, a very big [Phenomenon] …
cup of Earl Grey tea
Research question #1: How to determine which Wikipedia page best corresponds to an argument? “… and drank over twenty cups of coffee each day …”
Wikipedians will occasionally link the arguments for us. For free! Example from the page William G. McGowan: “He was also a three-pack-a-day smoker and drank over twenty cups of coffee each day until his first heart attack. As leader of MCI, he labored for several years to gain the financing and …”
Problem #1: Not many arguments are linked. How to link these instances?
[Figure: all instances of ‘coffee’; values shown: 113, 4 linked]
1st heuristic: One sense per page. If the argument text has been linked somewhere else in the article, use that link’s page. Example from the page Health effects of caffeine: “the greatest benefits were observed in those who drank coffee [manually linked] for a long period in their lifetime. […] roughly 80 to 100 cups of coffee for an average adult taken within a limited time…”
2nd heuristic: Trust the inventory. If there’s only one page for that argument text, link to that page. Example: “his days in the library with a cup of Earl Grey tea. The main character of the…” (1 sense only!)
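The two heuristics can be sketched as a small lookup function. This is only an illustration: the slides name the heuristics but do not say in which order they are applied or how the data is stored, so the order used here, the dictionary layouts, and the name `link_argument` are all assumptions.

```python
from typing import Optional

def link_argument(argument: str,
                  page_links: dict[str, str],
                  inventory: dict[str, list[str]]) -> Optional[str]:
    """Pick a Wikipedia page title for an unlinked argument, or None.

    page_links: anchor text -> page title, for links found elsewhere in
                the same article (hypothetical layout).
    inventory:  argument text -> all candidate page titles
                (hypothetical layout).
    """
    key = argument.lower()
    # 1st heuristic: one sense per page. Reuse a link that the article
    # already contains for the same text.
    if key in page_links:
        return page_links[key]
    # 2nd heuristic: trust the inventory. Link only when the text has
    # exactly one candidate page.
    candidates = inventory.get(key, [])
    if len(candidates) == 1:
        return candidates[0]
    # Neither heuristic applies: leave the argument unlinked.
    return None

# Toy data, invented for illustration only.
toy_links = {"coffee": "Coffee"}
toy_inventory = {
    "earl grey tea": ["Earl Grey tea"],
    "water": ["Water", "Water (classical element)"],
}
print(link_argument("coffee", toy_links, toy_inventory))
print(link_argument("Earl Grey tea", toy_links, toy_inventory))
print(link_argument("water", toy_links, toy_inventory))
```

Returning `None` for ambiguous text keeps the sketch conservative: an argument with several candidate pages and no in-article link is simply left unlinked.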
Problem #2: Same argument linked to multiple pages
[Figure: all instances of ‘water’; values shown: 78, 4, 2; labels: linked, 100%]
Research question #2: How to determine which WordNet concepts best represent Wikipedia pages?
BabelNet: a mapping from Wikipedia pages to concepts [Navigli & Ponzetto, 2012]. NEs and specialized concepts from Wikipedia; concepts from WordNet; concepts integrated from both resources
Argument mapping. Coffee: “Coffee is a brewed beverage with a distinct aroma and flavor, prepared from the roasted seeds…”
Argument mapping. The vast majority of Wikipedia pages [4M+] do not have a corresponding concept in WordNet [117K+]
Argument mapping: hypernym extraction with WCL [Navigli & Velardi, 2010] + link. Target lemma: Earl Grey tea. Definitional sentence: “Earl Grey tea is a tea with a distinctive flavour and aroma derived from the addition of oil extracted from the rind of the bergamot orange, a fragrant citrus fruit. Traditionally, the term "Earl Grey" …” Hypernym extracted by WCL: Tea (“Tea is an aromatic beverage commonly prepared by pouring hot or boiling water…”)
Argument mapping: an example. Text: “In literature, the main character in Haruki Murakami's Kafka on the Shore starts his days in the library with a cup of Earl Grey tea. The main character of the…” → Earl Grey tea (Trust the inventory): “Earl Grey tea is a tea with a distinctive flavour and aroma derived from…” → Tea (WCL): “Tea is an aromatic beverage commonly prepared by pouring hot or boiling water…” → WordNet (BabelNet). We can thus synergistically map to WordNet more than 500K pages!
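The mapping chain in this example can be sketched as a lookup with one fallback. The two dictionaries stand in for BabelNet and for the output of WCL; they are toy data, not calls to either resource, and the sense labels and the name `map_page_to_concept` are hypothetical. The single fallback hop mirrors the Earl Grey tea example only; whether the actual procedure follows longer chains is not stated on the slides.

```python
from typing import Optional

def map_page_to_concept(page: str,
                        page_to_concept: dict[str, str],
                        hypernym_page: dict[str, str]) -> Optional[str]:
    """Map a Wikipedia page title to a WordNet concept label, or None.

    page_to_concept: stand-in for a BabelNet-style page -> concept map.
    hypernym_page:   stand-in for WCL output, page -> hypernym page
                     extracted from the page's definitional sentence.
    """
    # Direct case: the page already has a WordNet concept.
    if page in page_to_concept:
        return page_to_concept[page]
    # Fallback: move to the hypernym page and try that one instead.
    hypernym = hypernym_page.get(page)
    if hypernym is not None:
        return page_to_concept.get(hypernym)
    return None

# Toy data, invented for illustration; the sense labels are made up.
toy_concepts = {"Tea": "tea#n#1", "Coffee": "coffee#n#1"}
toy_hypernyms = {"Earl Grey tea": "Tea"}
print(map_page_to_concept("Coffee", toy_concepts, toy_hypernyms))
print(map_page_to_concept("Earl Grey tea", toy_concepts, toy_hypernyms))
```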
HARVESTING ARGUMENTS FROM WIKIPEDIA (lexical predicates): … cup of *, * was designed by, the biggest * in 1987, a very big * …
LINKING ARGUMENTS TO WIKIPEDIA AND WORDNET
LINKING ARGUMENTS FROM WORDNET TO SEMANTIC CLASSES (SEMANTIC PREDICATES): … cup of [Beverage], [Structure] was designed by, the biggest [Event] in 1987, a very big [Phenomenon] …
Research question #3: How to generalize WordNet concepts associated with arguments?
Generalization to semantic classes. CORE CONCEPTS: 3K+ most frequent concepts, freely downloadable
[Figure: the core concepts of a concept and the semantic class of a concept]
Generalization to semantic classes
• By repeating the same procedure for all the arguments of a lexical predicate we discover clusters of arguments for each semantic class
Example: “… a cup of Earl Grey tea …” → Earl Grey tea (Trust the inventory) → Tea (WCL) → WordNet concept (BabelNet) → Semantic class
cup of *: wine, white wine …; coffee, cappuccino …; Earl Grey tea, tea …; water, seawater … Classes sorted by frequency!
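The clustering and frequency ranking can be sketched as follows. The slides do not spell out how a concept's semantic class is chosen among the core concepts, so this sketch assumes the nearest core concept found by walking up a hypernym chain; that rule, the dictionary layouts, and the names `semantic_class` and `rank_classes` are assumptions, and the counts are toy data.

```python
from collections import Counter, defaultdict
from typing import Optional

def semantic_class(concept: Optional[str],
                   hypernym_of: dict[str, str],
                   core_concepts: set[str]) -> Optional[str]:
    """Walk up the hypernym chain until a core concept is reached.

    Assumption: the semantic class is the nearest core concept.
    """
    seen = set()
    while concept is not None and concept not in seen:
        if concept in core_concepts:
            return concept
        seen.add(concept)  # guard against cycles in toy data
        concept = hypernym_of.get(concept)
    return None

def rank_classes(argument_counts: dict[str, int],
                 concept_of: dict[str, str],
                 hypernym_of: dict[str, str],
                 core_concepts: set[str]) -> list[tuple[str, int, list[str]]]:
    """Cluster arguments by semantic class and sort classes by frequency."""
    freq = Counter()
    clusters = defaultdict(list)
    for argument, count in argument_counts.items():
        cls = semantic_class(concept_of.get(argument), hypernym_of, core_concepts)
        if cls is None:
            continue  # argument could not be generalized
        freq[cls] += count
        clusters[cls].append(argument)
    return [(cls, n, sorted(clusters[cls])) for cls, n in freq.most_common()]

# Toy data, invented for illustration only.
toy_counts = {"coffee": 5, "cappuccino": 2, "wine": 3, "white wine": 1}
toy_concept_of = {a: a for a in toy_counts}
toy_hypernym_of = {"cappuccino": "coffee", "white wine": "wine"}
toy_core = {"coffee", "wine"}
for row in rank_classes(toy_counts, toy_concept_of, toy_hypernym_of, toy_core):
    print(row)
```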
EVALUATION
1st Evaluation: Semantic class ranking quality
Experimental Setup. DATASET 1: 50 random lexical predicates from Oxford Advanced Learner's Dictionary
Precision @ K. Top K semantic classes, ordered by importance: [Wine], [Feeling], [Coffee], [Water], [Dairy product], [Country] …
P@K = (# correct among the top K semantic classes) / K
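A short sketch of the P@K computation defined on this slide. The ranked list reuses the classes shown above; the set of classes marked correct is hypothetical, since the slide does not say which ones were judged correct.

```python
def precision_at_k(ranked_classes: list[str], correct: set[str], k: int) -> float:
    """P@K: fraction of the top K ranked classes that are judged correct."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_classes[:k]
    return sum(1 for cls in top_k if cls in correct) / k

# Ranking taken from the slide; the judgments below are hypothetical.
ranked = ["Wine", "Feeling", "Coffee", "Water", "Dairy product", "Country"]
judged_correct = {"Wine", "Coffee", "Water", "Dairy product"}

for k in (1, 2, 4):
    print(k, precision_at_k(ranked, judged_correct, k))
```

The denominator is always K, so a ranking with fewer than K classes is penalized for the missing positions.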
Results for dataset 1
[Plot: Precision@K against K (semantic classes)]