Toward the Next Generation of Recommender Systems. 2008.11.05. IEEE Transactions on Knowledge and Data Engineering, Volume 17, Issue 6 (June 2005). Written by Gediminas Adomavicius and Alexander Tuzhilin. Summarized by Gihyun Gong.
About the paper • This paper gives an overview of recommender systems • It focuses on rating-based recommendation, which is the most popular approach • Content-based • Collaborative filtering • Hybrid methods • It also discusses ways to extend the capabilities of recommender systems
Outline • About recommendation • Recommendation methods • Demographic filtering • Content-based methods • Collaborative methods • Hybrid methods • Current research issues in recommendation systems
Recommendation • Recommendation is a type of information filtering technique that attempts to present information items (movies, music, books, news, images, web pages) that are likely to be of interest to the user • Recommendation can be formulated as: for each user $c \in C$, choose the item $s'_c = \arg\max_{s \in S} u(c, s)$, where C: set of all users, S: set of all possible items, u: utility function that measures the usefulness of item s to user c, $u : C \times S \rightarrow R$, with R a totally ordered set of ratings • Recommendation is reduced to the problem of estimating ratings for the items that have not been seen by a user • How to rate? • How to estimate?
Recommendation (cont’d) • The central problem of a recommender system • The utility u is usually not defined on the whole C × S space, but only on some subset of it • The recommendation engine should be able to estimate the ratings of the non-rated movie/user combinations
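The formulation above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's method: the names `recommend` and `item_mean_estimate`, the sample ratings, and the choice of the item mean as the estimator are all assumptions made for the example. The known ratings cover only part of the C × S space, the estimator fills in the rest, and the recommendation is the unseen item with the highest estimated utility.

```python
from typing import Callable, Dict, List, Optional, Tuple

Ratings = Dict[Tuple[str, str], float]  # (user, item) -> rating; a partial u(c, s)


def item_mean_estimate(known: Ratings) -> Callable[[str, str], float]:
    """Hypothetical estimator: predict an item's mean observed rating."""
    totals: Dict[str, Tuple[float, int]] = {}
    for (_, item), rating in known.items():
        total, count = totals.get(item, (0.0, 0))
        totals[item] = (total + rating, count + 1)
    global_mean = sum(known.values()) / len(known)

    def estimate(user: str, item: str) -> float:
        if item in totals:
            total, count = totals[item]
            return total / count
        return global_mean  # fallback for items nobody has rated

    return estimate


def recommend(user: str, items: List[str], known: Ratings,
              estimate: Callable[[str, str], float]) -> Optional[str]:
    """Return the unseen item with the highest estimated utility, or None."""
    unseen = [s for s in items if (user, s) not in known]
    if not unseen:
        return None
    return max(unseen, key=lambda s: estimate(user, s))


# Made-up sample data: u is defined only on five of the nine (user, item) pairs.
known = {
    ("alice", "Matrix"): 5.0,
    ("bob", "Matrix"): 4.0,
    ("bob", "Titanic"): 2.0,
    ("carol", "Titanic"): 3.0,
    ("carol", "Memento"): 5.0,
}
items = ["Matrix", "Titanic", "Memento"]
estimate = item_mean_estimate(known)
best_for_alice = recommend("alice", items, known, estimate)
best_for_carol = recommend("carol", items, known, estimate)
```

Any of the estimation techniques covered in the following slides could replace `item_mean_estimate` without changing `recommend`.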
Recommendation system • A recommendation system guides the user in a personalized way to interesting or useful objects in a large space of possible options • Recommender systems are usually classified into the following categories, based on how recommendations are made: • Demographic filtering: The user will be recommended items based on demographic information • Content-based recommendations: The user will be recommended items similar to the ones the user preferred in the past • Collaborative recommendations: The user will be recommended items that are preferred by other people with similar tastes and preferences • Hybrid approaches: These methods combine collaborative and content-based methods.
Demographic filtering • Uses demographic information • Age, job, location, … • Advantages • No feedback is needed • No cold start problem • Disadvantages • Cannot provide personalization • Low accuracy • Too general
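As a rough sketch of the idea, the snippet below recommends the most popular items within a user's demographic segment. The function name `top_items_by_segment`, the segment labels, and the purchase log are hypothetical. It shows why no feedback from the target user is needed, and also why every member of a segment receives the same list.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def top_items_by_segment(segments: Dict[str, str],
                         purchases: List[Tuple[str, str]],
                         k: int = 2) -> Dict[str, List[str]]:
    """Map each demographic segment to its k most frequently chosen items."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for user, item in purchases:
        counts[segments[user]][item] += 1
    return {seg: [item for item, _ in c.most_common(k)]
            for seg, c in counts.items()}


# Made-up data: user -> segment, and a log of (user, item) choices.
segments = {"u1": "student", "u2": "student", "u3": "retiree"}
purchases = [("u1", "A"), ("u2", "A"), ("u2", "B"), ("u3", "C")]
by_segment = top_items_by_segment(segments, purchases)

# A brand-new student with no history still gets a recommendation list.
new_student_list = by_segment["student"]
```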
Content-based recommendation • Recommends items similar to those the user preferred in the past • The user preference profile is the key • Matching “user preferences” with “item characteristics” • Designed mostly to recommend text-based items • The content in these systems is usually described with keywords • Similarity measure • TF-IDF • Cosine similarity
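Similarity function • TF-IDF • N is the total number of documents • n_i is the number of documents in which keyword k_i appears • f_{i,j} is the number of times keyword k_i appears in document d_j • Term frequency: $TF_{i,j} = f_{i,j} / \max_z f_{z,j}$ • Inverse document frequency: $IDF_i = \log(N / n_i)$ • Keyword weight: $w_{i,j} = TF_{i,j} \times IDF_i$ • Cosine similarity • $\cos(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|}$ • For text matching, the attribute vectors A and B are usually the TF-IDF vectors of the documents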
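A small Python sketch of these two measures follows. It assumes the normalized term frequency (count divided by the count of the most frequent keyword in the document) and a natural-log inverse document frequency; the function names and the toy documents are made up for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Turn tokenized documents into sparse {keyword: weight} TF-IDF vectors."""
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each keyword once per document
    vectors = []
    for tokens in docs:
        if not tokens:
            vectors.append({})
            continue
        freq = Counter(tokens)
        max_freq = max(freq.values())
        vectors.append({
            k: (f / max_freq) * math.log(n_docs / doc_freq[k])
            for k, f in freq.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is zero)."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus: the first document stands in for a user profile.
docs = [
    ["space", "alien", "space"],
    ["space", "robot"],
    ["love", "paris"],
]
profile, sci_fi, romance = tfidf_vectors(docs)
sim_sci_fi = cosine(profile, sci_fi)
sim_romance = cosine(profile, romance)
```

In this toy corpus the profile shares the keyword "space" with the second document and nothing with the third, so the second document scores higher.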
Limitations of content-based methods • Limited content analysis • This method is based on text, but not all content is well represented by keywords • Pictures, taste, … • Overspecialization • The user is limited to being recommended items similar to those already rated • Items unlike anything the user has rated are not shown • Can be addressed by introducing randomness, e.g., mutation in genetic algorithms • New user problem • This method relies on the user preference profile • New users have very few ratings (or no history available) • The system needs the new user’s ratings of sample items • However, people usually do not want to rate sample items
Collaborative Filtering • Uses trend information (“word of mouth”) • Basic idea of CF • Build a ratings table from user ratings. • Compare users’ ratings and calculate the similarity between users. The group of users with high similarity to a given user is called the ‘nearest neighborhood’ • Predict the user’s preference based on the ratings of the nearest neighborhood.
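The three steps above can be sketched as follows. This is one simple variant under stated assumptions: cosine similarity over co-rated items and a similarity-weighted average of the k nearest neighbors' ratings. The function names and the ratings table are illustrative only.

```python
import math
from typing import Dict, Optional

RatingsTable = Dict[str, Dict[str, float]]  # user -> {item: rating}


def cosine_sim(ra: Dict[str, float], rb: Dict[str, float]) -> float:
    """Cosine similarity computed over the items both users have rated."""
    common = set(ra) & set(rb)
    if not common:
        return 0.0
    dot = sum(ra[i] * rb[i] for i in common)
    norm_a = math.sqrt(sum(ra[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(rb[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(ratings: RatingsTable, user: str, item: str,
            k: int = 2) -> Optional[float]:
    """Similarity-weighted average over the k most similar users who rated item."""
    neighbors = [
        (cosine_sim(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbors = sorted(neighbors, key=lambda pair: pair[0], reverse=True)[:k]
    total_sim = sum(sim for sim, _ in neighbors)
    if total_sim <= 0:
        return None  # no usable neighbor
    return sum(sim * rating for sim, rating in neighbors) / total_sim


# Step 1: a made-up ratings table. Alice has not rated item "C".
ratings = {
    "alice": {"A": 5, "B": 1},
    "bob": {"A": 5, "B": 1, "C": 4},
    "carol": {"A": 1, "B": 5, "C": 2},
}
# Steps 2 and 3: find the neighborhood and predict.
predicted = predict(ratings, "alice", "C")
predicted_nearest_only = predict(ratings, "alice", "C", k=1)
```

Because Bob's ratings match Alice's on the co-rated items, his rating of "C" dominates the prediction, while Carol's contributes less.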
Collaborative Filtering methods • Memory-based (or nearest-neighborhood) • Similarity-based model • Uses the entire collection of items previously rated by users • Stores all user information in a database • Model-based • Probabilistic model • Uses the collection of ratings to learn a model, which is then used to make rating predictions • Based on machine learning • Bayesian networks, clustering, neural networks, …
Advantages of Collaborative Filtering • Can deal with multimedia content • Can recommend based on user preference and quality of item • Can recommend serendipitous items
Limitations of collaborative methods • New user problem • The system must first learn the user’s preferences from the ratings that the user gives • New item problem • Until the new item is rated by a substantial number of users, the recommender system would not be able to recommend it • User rating problem • Different users might use different scales • Sparsity • The number of ratings already obtained is usually very small compared to the number of ratings that need to be predicted • Scalability • Computing cost grows with the C × S space • Systems typically have to search millions of users and items, which causes a serious scalability problem • Correlations between users can be computed in advance; however, these correlations will change when new users are added • Adaptability • A user’s requirements may change over time
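One common way to handle the different-scales problem is to aggregate each neighbor's deviation from that neighbor's own mean rating rather than the raw rating. The sketch below assumes the user-to-user similarities are already available and passed in as a dictionary; the function name, the users, and the numbers are made up.

```python
from typing import Dict

RatingsTable = Dict[str, Dict[str, float]]  # user -> {item: rating}


def mean(values) -> float:
    values = list(values)
    return sum(values) / len(values)


def predict_mean_centered(ratings: RatingsTable, sims: Dict[str, float],
                          user: str, item: str) -> float:
    """User's own mean plus the similarity-weighted deviation of neighbors."""
    base = mean(ratings[user].values())
    numerator = 0.0
    denominator = 0.0
    for other, sim in sims.items():
        other_ratings = ratings[other]
        if item in other_ratings:
            numerator += sim * (other_ratings[item] - mean(other_ratings.values()))
            denominator += abs(sim)
    if denominator == 0.0:
        return base  # no neighbor rated the item
    return base + numerator / denominator


# "harsh" rates everything low, "generous" rates everything high.
ratings = {
    "harsh": {"A": 1, "B": 2},
    "generous": {"A": 4, "B": 5, "C": 5},
}
sims = {"generous": 1.0}  # assumed similarity of "generous" to "harsh"
predicted = predict_mean_centered(ratings, sims, "harsh", "C")
```

The prediction stays on the harsh user's own scale (slightly above that user's mean of 1.5) instead of copying the generous user's raw rating of 5.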
Survey of hybrid methods • Combining separate recommenders • Linear combination of the two outputs • Voting scheme • Adding content-based characteristics to a collaborative model • Add a content-based profile for each user • Use filterbots (virtual users) • Adding collaborative characteristics to a content-based model • Add user profiles represented as term vectors for each item • Single unifying model • Knowledge-based techniques • Entrée uses some domain knowledge • The Quickstep and Foxtrot systems use a topic ontology
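The simplest of these options, a linear combination of two separate recommenders, might look like the sketch below. The weight `alpha` and the fallback to the content-based score when no collaborative score exists (for example, for a newly added item) are assumptions of this example, not something the slides specify.

```python
from typing import Optional


def hybrid_score(content_score: float, collab_score: Optional[float],
                 alpha: float = 0.5) -> float:
    """Blend two predicted ratings; alpha is the weight of the content-based one."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if collab_score is None:
        # Assumed fallback: nobody has rated the item yet, so only the
        # content-based prediction is available.
        return content_score
    return alpha * content_score + (1.0 - alpha) * collab_score


blended = hybrid_score(4.0, 2.0, alpha=0.25)
new_item = hybrid_score(4.0, None)
```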
Extending capabilities • Comprehensive understanding of users and items • Profiles in pure content-based and collaborative systems still tend to be quite simple and do not utilize some of the more advanced profiling techniques • In addition to traditional profile features, such as keywords and simple user demographics, more advanced profiling techniques based on data mining rules, sequences, and signatures that describe a user’s interests can be used to build user profiles
Extending capabilities (cont’d) • Multidimensionality of recommendations • Current recommendation systems use only two dimensions • User × Item • We can extend the dimensions of recommendation • Context(TPOK), demographic information, …
Extending capabilities (cont’d) • Example of multidimensionality: movies • Traditional recommendation considers just two dimensions • Who is the user? • What movie? • We can consider other information • What are the characteristics of the movie? • Which person wants to see the movie? • Where and how will the movie be seen? • With whom will the movie be seen? • When will the movie be seen?
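One way to reuse an ordinary two-dimensional recommender with contextual data is to keep only the ratings given in the context of interest and then work on the resulting User × Item table. The sketch below illustrates that reduction; the record layout, the context keys `companion` and `place`, and the data are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def reduce_to_2d(rows: List[Dict[str, Any]],
                 **context: Any) -> Dict[str, Dict[str, float]]:
    """Keep ratings whose context matches, as a user -> {item: rating} table."""
    table: Dict[str, Dict[str, float]] = defaultdict(dict)
    for row in rows:
        if all(row.get(key) == value for key, value in context.items()):
            table[row["user"]][row["item"]] = row["rating"]
    return dict(table)


# Made-up multidimensional ratings: user, item, plus contextual dimensions.
rows = [
    {"user": "alice", "item": "Matrix", "rating": 5,
     "companion": "friends", "place": "theater"},
    {"user": "alice", "item": "Titanic", "rating": 2,
     "companion": "partner", "place": "home"},
    {"user": "bob", "item": "Matrix", "rating": 4,
     "companion": "friends", "place": "home"},
]
with_friends = reduce_to_2d(rows, companion="friends")
friends_at_home = reduce_to_2d(rows, companion="friends", place="home")
```

The trade-off is that each added context condition leaves fewer ratings to learn from, which makes the sparsity problem worse.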
Extending capabilities (cont’d) • Multicriteria rating • Expands the rating from a single value to multiple criteria • Possible approaches: • Taking a linear combination of multiple criteria and reducing the problem to a single-criterion optimization problem • Optimizing the most important criterion and converting the other criteria to constraints
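Both approaches can be sketched briefly. The criteria names (food, decor, service), the weights, the thresholds, and the scores below are invented for illustration and are not taken from the slides.

```python
from typing import Dict, Optional

Scores = Dict[str, float]  # criterion -> rating


def linear_combination(scores: Scores, weights: Dict[str, float]) -> float:
    """Approach 1: collapse several criteria into one weighted score."""
    return sum(weight * scores[criterion] for criterion, weight in weights.items())


def best_with_constraints(candidates: Dict[str, Scores], primary: str,
                          minimums: Dict[str, float]) -> Optional[str]:
    """Approach 2: maximize one criterion; the others become lower bounds."""
    feasible = {
        name: scores for name, scores in candidates.items()
        if all(scores[c] >= bound for c, bound in minimums.items())
    }
    if not feasible:
        return None
    return max(feasible, key=lambda name: feasible[name][primary])


# Hypothetical restaurant ratings on three criteria.
restaurants = {
    "Bistro": {"food": 9, "decor": 4, "service": 6},
    "Diner": {"food": 7, "decor": 7, "service": 8},
}
weights = {"food": 0.5, "decor": 0.25, "service": 0.25}
combined = {name: linear_combination(s, weights) for name, s in restaurants.items()}
best_food_nice_decor = best_with_constraints(restaurants, "food", {"decor": 5})
best_food_any_decor = best_with_constraints(restaurants, "food", {})
```

The two approaches can disagree: without the decor constraint the best food wins, while the constraint rules that restaurant out.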
Extending capabilities (cont’d) • Restaurant example :
Extending capabilities (cont’d) • Nonintrusiveness • The problem of normalizing feedback • One way to explore the intrusiveness problem is to determine the optimal number of ratings the system should request from a new user • This topic is related to opinion mining
Extending capabilities (cont’d) • Flexibility • Most recommendation methods are “hard-wired” into the systems • Therefore, the end user cannot customize recommendations according to his or her needs in real time. • Also, most recommender systems recommend only individual items to individual users and do not deal with aggregation. • However, it is important to be able to provide aggregated recommendations in a number of applications, such as recommending brands or categories of products to certain segments of users (e.g., vacations in Florida to students). • One way to support aggregated recommendations is by utilizing an OLAP-based approach. • Recommendation Query Language (RQL)
Extending capabilities (cont’d) • RQL is an SQL-like language for expressing flexible user-specified recommendation requests • For example, the request “recommend to each user from New York the best three movies that are longer than two hours” can be expressed in RQL.
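The slide does not show the RQL text itself, so the snippet below is not RQL. It is a plain Python sketch of what the quoted request asks for, to make the filtering and ranking steps explicit. The function name, the data layout, and the stand-in rating estimator are all hypothetical.

```python
from typing import Callable, Dict, List


def top_movies_for_city(users: Dict[str, str], movies: Dict[str, int],
                        estimate: Callable[[str, str], float],
                        city: str, min_minutes: int,
                        k: int = 3) -> Dict[str, List[str]]:
    """For each user in `city`, the k best movies longer than `min_minutes`."""
    eligible = [title for title, minutes in movies.items() if minutes > min_minutes]
    result: Dict[str, List[str]] = {}
    for user, user_city in users.items():
        if user_city != city:
            continue
        ranked = sorted(eligible, key=lambda title: estimate(user, title),
                        reverse=True)
        result[user] = ranked[:k]
    return result


# Made-up data: user -> city, movie -> length in minutes.
users = {"alice": "New York", "bob": "Boston"}
movies = {"Epic": 180, "Saga": 150, "Short": 90, "Long": 130, "Opus": 200}
scores = {
    ("alice", "Epic"): 4.5, ("alice", "Saga"): 3.0, ("alice", "Short"): 5.0,
    ("alice", "Long"): 5.0, ("alice", "Opus"): 2.0,
}


def estimate(user: str, title: str) -> float:
    """Stand-in for a real rating estimator: look up a precomputed score."""
    return scores.get((user, title), 0.0)


recommendations = top_movies_for_city(users, movies, estimate,
                                      city="New York", min_minutes=120)
```

A declarative query language lets the end user state such conditions directly instead of having them hard-wired into code like this.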