Exploring Folksonomy for Personalized Search
Shengliang Xu, Shenghua Bao, Shanghai Jiao Tong University
Ben Fei, Zhong Su, Yong Yu, IBM China Research Lab.
SIGIR 2008
Summarized and presented by Dongjoo Lee, IDS Lab., Center for E-Business Technology, Seoul National University, Seoul, Korea
Contents
• Introduction
• Using Folksonomy for Personalized Search
• Analysis of Folksonomy
• A Personalized Search Framework
• Automatic evaluation framework
• Experiments
IDS Lab. 2009 Winter Seminar
Personalized Search
• People differ significantly in the search results they consider relevant for the same query
• Personal data
  • Interests manually selected by the user
  • Web browser bookmarks
  • User’s personal document corpus
  • Search engine click-through history
• Folksonomy
  • Social annotations or tags
  • A user who has used a given annotation may be interested in the web pages that have the same annotation
• The Open Directory Project (ODP) was used to process search results (re-rank the results)
Analysis of Folksonomy
• Social annotations as category names
• Social annotations as keywords
• Collaborative link structure
  • Matrix W: w_{i,j} is the number of annotations that user u_i gives to page p_j
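A minimal sketch of the collaborative link structure, assuming bookmarks are available as (user, page, tag) triples. The function name `build_link_matrix` and the toy data are hypothetical; the slide only defines w_{i,j} as a count of annotations.

```python
from collections import defaultdict


def build_link_matrix(annotations):
    """Return a sparse W: (user, page) -> number of tags the user gave the page."""
    weights = defaultdict(int)
    for user, page, _tag in annotations:
        weights[(user, page)] += 1
    return dict(weights)


# Hypothetical toy bookmarks: (user, page, tag).
annotations = [
    ("u1", "p1", "java"),
    ("u1", "p1", "programming"),
    ("u1", "p2", "news"),
    ("u2", "p1", "java"),
]

W = build_link_matrix(annotations)
for (user, page), count in sorted(W.items()):
    print(user, page, count)
```

Storing only the nonzero entries keeps the structure small, since most users annotate only a tiny fraction of all pages.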
A Personalized Search Framework
• Ranking with the Vector Space Model
• Ranking aggregation with Weighted Borda-Fuse (WBF)
• Term matching: the query and the page are term vectors
• Topic matching: the user and the page are topic vectors
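The two rankings and their aggregation can be sketched as follows. This is an illustration under assumptions, not the paper's exact formulation: cosine similarity stands in for the vector space matching, and the fusion is a weighted sum of rank positions with a weight `gamma` on the term ranking. All function names and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ranks(scores):
    """Map each page to its 1-based rank position (highest score first)."""
    ordered = sorted(scores, key=lambda p: (-scores[p], p))
    return {page: i + 1 for i, page in enumerate(ordered)}


def weighted_borda_fuse(term_scores, topic_scores, gamma):
    """Order pages by gamma * term rank + (1 - gamma) * topic rank (lower is better)."""
    term_rank, topic_rank = ranks(term_scores), ranks(topic_scores)
    fused = {p: gamma * term_rank[p] + (1 - gamma) * topic_rank[p] for p in term_scores}
    return sorted(fused, key=lambda p: (fused[p], p))


# Hypothetical toy data.
query = {"java": 1.0}
page_terms = {"p1": {"java": 1.0, "news": 1.0}, "p2": {"java": 1.0}}
user_topics = {"programming": 1.0}
page_topics = {"p1": {"programming": 1.0}, "p2": {"news": 1.0}}

term_scores = {p: cosine(query, v) for p, v in page_terms.items()}
topic_scores = {p: cosine(user_topics, v) for p, v in page_topics.items()}

print(weighted_borda_fuse(term_scores, topic_scores, gamma=0.3))
print(weighted_borda_fuse(term_scores, topic_scores, gamma=0.8))
```

With a small `gamma` the user's topic preferences dominate the final order; with a large `gamma` the plain text match dominates.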
Topic Space Selection
• Folksonomy: social annotations as topics
  • Weighting schemes: tf-idf, BM25
  • Interests: the tags a user has annotated
  • Topics: the tags a web page has
• Taxonomy: ODP categories as topics
  • ODP categories have descriptions
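As an illustration of the tf-idf weighting scheme in the folksonomy topic space, the sketch below builds one weighted tag vector per owner (a user or a page). It assumes the textbook tf × log(N / df) form, which may differ from the exact variant used in the paper; `tfidf_profiles` and the toy data are hypothetical.

```python
import math
from collections import Counter


def tfidf_profiles(tags_by_owner):
    """Return {owner: {tag: tf * log(N / df)}} for users or pages."""
    n_owners = len(tags_by_owner)
    doc_freq = Counter()
    for tags in tags_by_owner.values():
        doc_freq.update(set(tags))  # count each tag once per owner
    profiles = {}
    for owner, tags in tags_by_owner.items():
        term_freq = Counter(tags)
        profiles[owner] = {
            tag: count * math.log(n_owners / doc_freq[tag])
            for tag, count in term_freq.items()
        }
    return profiles


# Hypothetical toy data: the tags each user has assigned.
user_tags = {
    "u1": ["java", "java", "news"],
    "u2": ["news"],
}

profiles = tfidf_profiles(user_tags)
print(profiles["u1"])
```

A tag used by every owner, such as "news" here, gets weight zero, so it does not help distinguish one user's interests from another's.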
Interest and Topic Adjusting
• Matrices (T, W, R on the slide): the users’ interests matrix, the pages’ topic matrix, and W, the adjacency matrix of users and pages
• 1) User interest adjusting by related web pages
• 2) Web page topic adjusting by related users
• Matrix notation [equations not preserved]
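The two adjusting steps can be sketched as an alternating update. The slide's equations are not preserved, so the update rule below is an assumption: each user's interest vector is blended with the weighted average of the topic vectors of the pages that user annotated, and each page's topic vector is blended with the weighted average of its annotators' interests. The weights `alpha` and `beta`, the iteration count, and all function names are hypothetical.

```python
def normalize_rows(matrix):
    """Scale each row to sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def blend(base, update, keep):
    """Element-wise keep * base + (1 - keep) * update."""
    return [
        [keep * x + (1 - keep) * y for x, y in zip(r0, r1)]
        for r0, r1 in zip(base, update)
    ]


def adjust(interests0, topics0, W, alpha=0.5, beta=0.5, iterations=10):
    """Alternate the two adjusting steps, always blending with the initial vectors."""
    by_user = normalize_rows(W)             # users x pages
    by_page = normalize_rows(transpose(W))  # pages x users
    interests, topics = interests0, topics0
    for _ in range(iterations):
        # 1) user interest adjusting by related web pages
        interests = blend(interests0, matmul(by_user, topics), alpha)
        # 2) web page topic adjusting by related users
        topics = blend(topics0, matmul(by_page, interests), beta)
    return interests, topics


# Hypothetical toy data: 2 users, 2 pages, 2 topics.
W = [[1, 0], [0, 1]]                 # user i annotated page i once
interests0 = [[1.0, 0.0], [0.0, 1.0]]
topics0 = [[0.0, 1.0], [1.0, 0.0]]

interests, topics = adjust(interests0, topics0, W, iterations=1)
print(interests)
print(topics)
```

Blending with the initial vectors on every pass keeps the original annotations from being washed out as information propagates between users and pages.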
Evaluation Framework for Personalized Search
• Assumption: users’ bookmarking and tagging actions reflect their personal relevance judgments.
• Three considerations
  • The keyword query is the most popular query representation.
  • Different users may consider a web page to be relevant to different queries.
  • Different users may choose different terms as social annotations for the same web page.
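One way to turn the assumption above into test data is sketched here: each tag a user has assigned is treated as a keyword query issued by that user, and the pages that user bookmarked with that tag are treated as the relevant results. This is an illustrative reading of the framework; `build_judgments` and the toy data are hypothetical.

```python
from collections import defaultdict


def build_judgments(annotations):
    """Return {(user, query_tag): set of pages that user tagged with query_tag}."""
    judgments = defaultdict(set)
    for user, page, tag in annotations:
        judgments[(user, tag)].add(page)
    return dict(judgments)


# Hypothetical toy bookmarks: (user, page, tag).
annotations = [
    ("u1", "p1", "java"),
    ("u1", "p3", "java"),
    ("u2", "p1", "coffee"),
    ("u2", "p2", "java"),
]

judgments = build_judgments(annotations)
for (user, query), pages in sorted(judgments.items()):
    print(user, query, sorted(pages))
```

In the toy data the query "java" has different relevant pages for u1 and u2, and page p1 is reached through different tags, which mirrors the second and third considerations.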
Experiment Setup
• Dataset
  • Del.icio.us (web bookmarking), Dogear (enterprise bookmarking)
• Manual preprocessing
  • Removed overly personal tags (toread, system:imported)
  • Split concatenated tags (javaprogramming)
• Randomly split into an 80% training part and a 20% test part
Experiment Setup
• Personalized Search Framework Implementation
  • First, retrieve documents with a text matching model
  • Re-rank using two IR models (BM25 and LMIR)
• Parameter setting
  • α: 0.5, β: 0.5, γ: use procedure 1.
• Baseline Models
  • Text: non-personalized text matching
  • ODP1: 16 ODP categories
  • ODP2: 1171 ODP categories
  • AC: the topic space is the folksonomy; topic matching is done by counting the matched annotations
• Evaluation Metric
  • Mean MAP (MMAP), where MAP is Mean Average Precision
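The evaluation metric can be sketched as below. Average precision is the standard definition; reading MMAP as the mean over users of each user's MAP is an assumption based on the name "Mean MAP". The function names and toy results are hypothetical.

```python
def average_precision(ranked_pages, relevant):
    """Average of precision@k over the positions k where a relevant page appears."""
    hits, total = 0, 0.0
    for position, page in enumerate(ranked_pages, start=1):
        if page in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def mean_map(results_by_user):
    """results_by_user: {user: [(ranked_pages, relevant_set), ...]}, one pair per query."""
    per_user_map = [
        mean(average_precision(ranked, relevant) for ranked, relevant in queries)
        for queries in results_by_user.values()
    ]
    return mean(per_user_map)


# Hypothetical toy results: one query per user.
results_by_user = {
    "u1": [(["p1", "p2", "p3"], {"p1", "p3"})],
    "u2": [(["p2", "p1"], {"p1"})],
}

print(mean_map(results_by_user))
```

Averaging per user first gives every user equal weight, regardless of how many queries each one contributes.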
Performance (BM25)
Performance (LMIR)
Performance Analysis
• Text < ODP1 < ODP2 < AC < tfidf, bm25
• The topic adjusting algorithm is effective
• Search effectiveness appears to decrease as the amount of data increases
  • Social annotations owned by users with few annotations in total are semantically much richer than those owned by users with a relatively large number of annotations.
  • Most of the users who have many bookmarks exported their desktop bookmarks directly into the folksonomy systems.
Discussions
• Integrating folksonomy systems with search engines
  • Yahoo purchased Del.icio.us and Flickr
• Sparseness of social annotations
  • Collect tagging data automatically from user click-through histories
  • Automatically generate personalized annotations based on users’ personal document corpus
• Folksonomy topic dimension reduction
  • Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis (PLSA)
My Opinions
• Pros
  • The idea is simple, but the paper is well written
  • Reasonable assumptions
  • Varied analyses using several statistical measures
    • MMAP, t-test, …
• Cons
  • Not many new ideas
  • Too many abbreviations without references or expansions
  • Needs more graphs