Outline • Motivation • Information overload in a scientific congress scenario • Conference Participant Advisor Service • Profile-driven paper recommending • User Profiles as Bayesian Text Classifiers • User Profiles learned from documents semantically indexed through a WSD procedure [*] • Empirical Evaluation • Conclusions and Future Work [*] Combining Learning and Word Sense Disambiguation for Intelligent User Profiling - IJCAI 2007
Motivation • Information overload in the scientific congress scenario
Web Personalization • Personalized systems adapt their behavior to individual users by learning user profiles • A user profile is a structured model of the user's interests • Exploitable for providing personalized content and services • Personalization is usually done automatically, based on the user profile and possibly the profiles of other users with similar interests (collaborative approach) • How can personalization be used in the scientific congress scenario?
Web Personalization in the scientific congress scenario • Learn research interests of participants from papers they rated • Store research interests in personal profiles • Used to build personalized programs delivered to participants
Learning User Profiles as a Text Categorization problem • OUR STRATEGY: content-based recommendations by learning from TEXT and USER FEEDBACK on items
Keyword-based profiles: problems • doc1: AI is a branch of computer science • doc2: the 2007 International Joint Conference on Artificial Intelligence will be held in India • doc3: apple launches a new product… • USER PROFILE: artificial 0.02, intelligence 0.01, apple 0.13, AI 0.15, … • Problems highlighted on the slide: MULTI-WORD CONCEPTS, SYNONYMY, POLYSEMY
ITem Recommender (ITR) • Advanced NLP techniques used to represent documents • Naïve Bayes text classification to assign a score (level of interest) to items according to the user preferences • Result: semantic user profile - as a binary text classifier (user-likes and user-dislikes) - containing the probabilistic model of user preferences
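A binary Naïve Bayes profile of this kind can be sketched as follows. This is a minimal illustration, not the ITR implementation: it assumes a multinomial model with Laplace smoothing, and the function names and the toy training data are hypothetical.

```python
import math
from collections import Counter

LABELS = ("likes", "dislikes")

def train(docs):
    """docs: list of (features, label) pairs; features are words or synset IDs."""
    class_docs = Counter()
    feature_counts = {label: Counter() for label in LABELS}
    vocab = set()
    for features, label in docs:
        class_docs[label] += 1
        feature_counts[label].update(features)
        vocab.update(features)
    return class_docs, feature_counts, vocab

def prob_likes(model, features):
    """Posterior probability of the class 'likes' for a new item."""
    class_docs, feature_counts, vocab = model
    total_docs = sum(class_docs.values())
    log_scores = {}
    for label in LABELS:
        score = math.log(class_docs[label] / total_docs)  # class prior
        denom = sum(feature_counts[label].values()) + len(vocab)
        for f in features:
            if f in vocab:  # features never seen in training are ignored
                score += math.log((feature_counts[label][f] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())  # subtract the max for numerical stability
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    return exp["likes"] / (exp["likes"] + exp["dislikes"])

# Toy, made-up training data: two liked and two disliked items.
model = train([
    (["semantic", "web", "ontology"], "likes"),
    (["ontology", "reasoning"], "likes"),
    (["database", "index"], "dislikes"),
    (["database", "query", "index"], "dislikes"),
])
score = prob_likes(model, ["ontology", "web"])
```

The returned probability plays the role of the level of interest assigned to an item; the same code works unchanged whether the features are keywords or synset identifiers.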
Word Sense Disambiguation (WSD) • Process of deciding which sense of a word is used in a specific context • WordNet as sense inventory • nouns, verbs, adverbs and adjectives organized into SYNonym SETs (synsets), each one representing an underlying lexical concept • change of text representation from vectors (bag) of words (BOW) into vectors (bag) of synsets (BOS)
JIGSAW WSD algorithm • Three different strategies to disambiguate nouns, verbs, adjectives and adverbs • Effectiveness of WSD strongly influenced by the POS tag of the target word • Input: document d = {w1, w2, …, wh} • Output: X = {s1, s2, …, sk} (k ≤ h) • Each si obtained by disambiguating wi based on the context of each word • k can be smaller than h because some words are not recognized by WordNet and some groups of words are recognized as a single concept
JIGSAWnouns: The idea • Adaptation of the Resnik algorithm • Semantic similarity between synsets is inversely proportional to their distance in the WordNet IS-A hierarchy • Path length similarity between synsets is used to assign scores to the candidate synsets of a polysemous word
Synset Semantic Similarity • Leacock-Chodorow similarity • [Figure: path in the WordNet IS-A hierarchy from Cat (feline mammal) through Feline, felid; Carnivore; Placental mammal; and Rodent to Mouse (rodent), with path length 5] • SinSim(cat, mouse) = -log(5/32) = 0.806
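The figure's arithmetic can be checked with a short sketch. It assumes the standard Leacock-Chodorow form, -log(N / 2D), where N is the path length and D the maximum depth of the hierarchy. Reading the denominator 32 as 2·D with D = 16, and using a base-10 logarithm, are assumptions chosen because they reproduce the value 0.806 shown on the slide.

```python
import math

def lc_similarity(path_length, max_depth=16):
    """Leacock-Chodorow similarity: -log10(N / (2 * D)).

    path_length (N): length of the path between two synsets in the IS-A hierarchy.
    max_depth (D): maximum depth of the hierarchy (16 is assumed here).
    """
    return -math.log10(path_length / (2 * max_depth))

# Path of length 5 between cat (feline mammal) and mouse (rodent), as in the figure.
sim_cat_mouse = lc_similarity(5)
```

Shorter paths give higher scores, which is how the measure turns distance in the hierarchy into similarity.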
JIGSAWnouns: Example • Sentence: “The white cat is hunting the mouse” (content words: white, cat, hunt, mouse) • Target word w = cat, context C = {mouse} • Candidate synsets Wcat = {02037721, 00847815} • 02037721: feline mammal… • 00847815: computerized axial tomography… • Synsets of the context words T = {02244530, 03651364} • 02244530: any of numerous small rodents… • 03651364: a hand-operated electronic device… • [Figure: pairwise similarity scores between the candidate synsets of cat and the synsets of mouse; the values shown are 0.806, 0.0 and 0.107]
JIGSAWverbs: synset description • Description of synset si = gloss + example phrases in WordNet for si
JIGSAWverbs: The idea • Tries to establish a relation between verbs and nouns, which are not directly linked in WordNet • Verb w disambiguated using: • nouns in the context of w • nouns in the description of each candidate synset for w
JIGSAWverbs: Example (1/4) • Sentence: I play basketball and soccer • w = play, N = {basketball, soccer} • Candidate synsets of play: • (70) play -- (participate in games or sport; "We played hockey all afternoon"; "play cards"; "Pele played for the Brazilian teams in many important matches") • (29) play -- (play on an instrument; "The band played all night long") • … • Nouns found in each description: • nouns(play,1): game, sport, hockey, afternoon, card, team, match • nouns(play,2): instrument, band, night • … • nouns(play,35): …
JIGSAWverbs: Example (2/4) • w = play, N = {basketball, soccer} • nouns(play,1): game, sport, hockey, afternoon, card, team, match • The senses of each noun wi in nouns(play,1) (game1 … gamek, sport1 … sportk, …) are compared with the senses of basketball (basketball1 … basketballh) • MAXbasketball = MAXi SinSim(wi, basketball), wi ∈ nouns(play,1)
JIGSAWverbs: Example (3/4) • w = play, N = {basketball, soccer} • nouns(play,1): game, sport, hockey, afternoon, card, team, match • The senses of each noun wi in nouns(play,1) (game1 … gamek, sport1 … sportk, …) are compared with the senses of soccer (soccer1 … soccerh) • MAXsoccer = MAXi SinSim(wi, soccer), wi ∈ nouns(play,1)
JIGSAWverbs: Example (4/4) • Φ(play,1) = weighted average of the MAX values (MAXbasketball, MAXsoccer, …) computed on nouns(play,1), taking into account the position of each context word with respect to the verb • Φ(play,i) is computed in the same way on nouns(play,i) • Synset assigned to “play” = argmaxi Φ(play,i)
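The scoring step can be sketched as below. The slides do not give the weighting function, so the 1/distance weight used here is a hypothetical stand-in, and the MAX values are made-up numbers used only to show the mechanics.

```python
def phi(max_values, distances):
    """Weighted average of the per-noun MAX similarities for one candidate synset.

    max_values: context noun -> MAX similarity with the nouns of the synset description.
    distances: context noun -> distance (in words) from the verb.
    The weight 1/distance is an assumption; the slides only say that position matters.
    """
    weights = {noun: 1.0 / distances[noun] for noun in max_values}
    total = sum(weights.values())
    return sum(weights[noun] * max_values[noun] for noun in max_values) / total

# "I play basketball and soccer": basketball is 1 word from the verb, soccer 3 words.
distances = {"basketball": 1, "soccer": 3}

# Hypothetical MAX values for two candidate synsets of "play".
max_by_sense = {
    1: {"basketball": 0.9, "soccer": 0.8},  # participate in games or sport
    2: {"basketball": 0.3, "soccer": 0.2},  # play on an instrument
}

scores = {sense: phi(values, distances) for sense, values in max_by_sense.items()}
best_sense = max(scores, key=scores.get)  # argmax over the candidate synsets
```

With these toy numbers the sport sense wins, because both context nouns are close to the nouns of its description.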
JIGSAWothers • Based on the Lesk algorithm • Similarity between the glosses of each candidate sense of the target word and the glosses of the words in the context
JIGSAWothers: Example (1/5) • Sentence: I bought a bottle of aged wine • w = aged, N = {bottle, wine} • Candidate synsets for the target word: • 1. {01703749} aged, elderly, older, senior -- (advanced in years; "aged members of the society"; "elderly residents could remember the construction of the first skyscraper"; "senior citizen") • 2. {01546830} aged, ripened -- (of wines, fruit, cheeses; having reached a desired or final condition; "mature well-aged cheeses") • …
JIGSAWothers: Example (2/5) • Sentence: I bought a bottle of aged wine • w = aged, N = {bottle, wine} • Keep the glosses of the candidate synsets 01703749 and 01546830 of aged • Keep the glosses of each word in the context: • bottle: 1. {02848798} bottle -- (a glass or plastic vessel used for storing drinks or other liquids; typically cylindrical without handles and with a narrow neck that can be plugged or capped) • 2. {13584548} bottle, bottleful -- (the quantity contained in a bottle) • … • wine: 1. {07784932} wine, vino -- (fermented juice (of grapes especially)) • 2. {04907195} wine, wine-colored -- (a red as dark as red wine)
JIGSAWothers: Example (3/5) • Sentence: I bought a bottle of aged wine • w = aged, N = {bottle, wine} • Gloss of the whole context = glosses of bottle + glosses of wine: • a glass or plastic vessel used for storing drinks or other liquids typically cylindrical without handles and with a narrow neck that can be plugged or capped the quantity contained in a bottle fermented juice (of grapes especially) a red as dark as red wine
JIGSAWothers: Example (4/5) • Sentence: I bought a bottle of aged wine • w = aged, N = {bottle, wine} • Overlap between the gloss of each candidate synset and the gloss of the whole context: • 1. {01703749} aged, elderly, older, senior -- (advanced in years; …): no overlap • 2. {01546830} aged, ripened -- (of wines, fruit, cheeses; having reached a desired or final condition; "mature well-aged cheeses"): overlap
JIGSAWothers: Example (5/5) • Sentence: I bought a bottle of aged wine • w = aged, N = {bottle, wine} • Selected synset: 01546830, aged, ripened -- (of wines, fruit, cheeses; having reached a desired or final condition; "mature well-aged cheeses")
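The aged wine example can be reproduced with a small gloss-overlap sketch in the spirit of the Lesk algorithm. The glosses are the ones quoted in the slides; the stopword list and the plural-stripping step are simplifying assumptions made here (without some stemming, "wines" and "wine" would not match), not the preprocessing used by JIGSAW.

```python
import re

# Small illustrative stopword list (an assumption).
STOPWORDS = {"a", "an", "the", "of", "in", "or", "and", "for", "with", "as",
             "that", "can", "be", "other", "could"}

def content_stems(text):
    """Lowercase, tokenize, drop stopwords, and strip a trailing plural 's'."""
    stems = set()
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in STOPWORDS:
            continue
        if len(token) > 3 and token.endswith("s"):
            token = token[:-1]  # crude stemming: wines -> wine
        stems.add(token)
    return stems

def lesk_select(candidate_glosses, context_gloss):
    """Return the candidate synset whose gloss shares most stems with the context gloss."""
    context = content_stems(context_gloss)
    overlaps = {sid: len(content_stems(gloss) & context)
                for sid, gloss in candidate_glosses.items()}
    return max(overlaps, key=overlaps.get), overlaps

candidates = {
    "01703749": 'advanced in years; "aged members of the society"; '
                '"elderly residents could remember the construction of the '
                'first skyscraper"; "senior citizen"',
    "01546830": 'of wines, fruit, cheeses; having reached a desired or final '
                'condition; "mature well-aged cheeses"',
}

# Gloss of the whole context: glosses of "bottle" plus glosses of "wine".
context_gloss = (
    "a glass or plastic vessel used for storing drinks or other liquids; "
    "typically cylindrical without handles and with a narrow neck that can be "
    "plugged or capped. the quantity contained in a bottle. "
    "fermented juice (of grapes especially). a red as dark as red wine"
)

selected, overlaps = lesk_select(candidates, context_gloss)
```

Under these assumptions the only shared content stem is "wine", so synset 01546830 is selected, matching the outcome shown in the slides.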
Paper Recommending • Content-based recommendations by learning from TEXT and USER RATINGS (1-5) on papers • Instance (paper): Title, Authors, Abstract • Keyword-based representation (BOW): tokenization + stopword removal + stemming • Sense-based representation (BOS): tokenization + stopword removal + POS tagging + disambiguation
Conference Participant Advisor: Login Conference Participant Advisor service
Conference Participant Advisor: Selecting Papers to train the system
Conference Participant Advisor: Getting the Personalized Program
Personalized Program delivered by mail • 1 - personalized conference program • 2 - details about recommended papers
Conference Participant Advisor: Personalized Program + Paper details
Experimental Evaluation • Experiments: BOW-generated profiles vs. BOS-generated profiles • ISWC dataset • 100 papers accepted at ISWC 02-03 • 288 ratings collected by 11 users • 5-fold stratified cross-validation • Precision, Recall, F-measure, NDPM • Paper relevant if rating >3 • Probability of class “likes” >0.5 • Wilcoxon signed rank test • Classification for each user is a trial • Low number of independent trials • Significance level p < 0.05
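The classification metrics can be sketched as follows, under one reading of the setup: a paper counts as relevant when its rating is above 3 and as recommended when the probability of the class "likes" is above 0.5. The balanced F1 form of the F-measure is an assumption, and the ratings and probabilities below are made up for illustration; NDPM and the Wilcoxon test are not covered.

```python
def evaluate(ratings, likes_probs, rating_threshold=3, prob_threshold=0.5):
    """Precision, recall and F1 for one user (one trial)."""
    relevant = [r > rating_threshold for r in ratings]
    recommended = [p > prob_threshold for p in likes_probs]
    tp = sum(1 for rel, rec in zip(relevant, recommended) if rel and rec)
    fp = sum(1 for rel, rec in zip(relevant, recommended) if rec and not rel)
    fn = sum(1 for rel, rec in zip(relevant, recommended) if rel and not rec)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Made-up example: five rated papers and the classifier's P(likes) for each.
ratings = [5, 4, 2, 1, 4]
likes_probs = [0.9, 0.4, 0.7, 0.2, 0.8]
precision, recall, f1 = evaluate(ratings, likes_probs)
```

In a cross-validated run, these values would be computed per fold and averaged for each user before comparing BOW and BOS profiles.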
Results of Semantic Profiles Evaluation • [Results table not recovered; the differences annotated on the slide are: +2%, =, +2%, +1%]
Conclusions & Future Work • Conference Participant Advisor • Intelligent service relying on concept-based profiles • WSD based on a linguistic ontology • Future work: integration of • domain-specific ontologies in the process of semantic representation and indexing of documents • social networks of conference participants as an additional source of information
Service details • Service deployed in VIKEF project at:
Bag of Synsets • Reduction of features • Recognition of bigrams • Synonyms represented by the same synsets • [Figure: Bag of Words vs. Bag of Synsets]
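The three properties can be illustrated with a toy sketch. The lexicon below is hypothetical: the synset identifiers are invented placeholders, not WordNet IDs, and the lookup performs no disambiguation. It only shows how a bigram and its synonym collapse into one feature.

```python
from collections import Counter

# Hypothetical lexicon: surface forms (one of them a bigram) -> made-up synset IDs.
LEXICON = {
    ("artificial", "intelligence"): "syn:artificial_intelligence",
    ("ai",): "syn:artificial_intelligence",
    ("conference",): "syn:conference",
}

def bag_of_words(tokens):
    """Keyword-based representation: one feature per distinct token."""
    return Counter(tokens)

def bag_of_synsets(tokens):
    """Sense-based representation: bigrams are tried first, then single words."""
    bag = Counter()
    i = 0
    while i < len(tokens):
        bigram = tuple(tokens[i:i + 2])
        if len(bigram) == 2 and bigram in LEXICON:
            bag[LEXICON[bigram]] += 1
            i += 2
        elif (tokens[i],) in LEXICON:
            bag[LEXICON[(tokens[i],)]] += 1
            i += 1
        else:
            i += 1  # words missing from the lexicon are skipped
    return bag

tokens = ["ai", "artificial", "intelligence", "conference"]
bow = bag_of_words(tokens)    # 4 distinct features
bos = bag_of_synsets(tokens)  # 2 distinct features
```

Here "AI" and "artificial intelligence" end up as two occurrences of the same synset feature, so the synset bag is smaller than the word bag.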