IDEAS 2011 Lisbon 21-23 September. Mining Semantic Data for Solving First-rater and Cold-start Problems in Recommender Systems. Data Mining Research Group http://mida.usal.es. María N. Moreno, Saddys Segrera, Vivian F. López, M. Dolores Muñoz and Ángel Luis Sánchez. Department of
IDEAS 2011 Lisbon 21-23 September
Mining Semantic Data for Solving First-rater and Cold-start Problems in Recommender Systems
Data Mining Research Group http://mida.usal.es
María N. Moreno, Saddys Segrera, Vivian F. López, M. Dolores Muñoz and Ángel Luis Sánchez
Department of Computing and Automatic
Contents
• Introduction
• Recommender Systems
• Recommendation framework
• Case Study
• Conclusions
Introduction
• Recommender systems
  • Recommender systems provide users with intelligent mechanisms to find products to purchase
  • Applications: e-commerce, e-learning, tourism, news pages…
  • Drawbacks: low performance, low reliability of recommendations…
[Figure labels: commerce Server, Catalog, Information, Client]
Introduction
• Proposal
  • Objective: overcome critical drawbacks in recommender systems
  • Methodology: semantic-based Web mining
    • Associative classification (Web Mining)
      • Machine learning technique that combines concepts from classification and association
    • Domain-specific ontology (Semantic Web)
      • Enrichment of the data to be mined with semantic annotations
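A rough way to picture associative classification: mine association rules whose consequent is restricted to a class label, keep the ones that meet minimum support and confidence, and classify a new record with the best rule that matches it. The sketch below is a simplified illustration only, not CBA, CMAR or CPAR; the records, thresholds and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def mine_class_rules(records, min_support=0.3, min_confidence=0.6, max_len=2):
    """Return rules (antecedent, label, support, confidence), best first."""
    n = len(records)
    antecedent_counts = Counter()
    rule_counts = Counter()
    for attrs, label in records:
        items = sorted(attrs.items())
        # Enumerate every attribute-value combination up to max_len items.
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                antecedent_counts[combo] += 1
                rule_counts[(combo, label)] += 1
    rules = []
    for (combo, label), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((frozenset(combo), label, support, confidence))
    # Rank by confidence, then support, then shorter antecedent.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules

def classify(rules, attrs, default=None):
    """Label a record with the first (highest-ranked) rule that matches it."""
    items = set(attrs.items())
    for antecedent, label, _, _ in rules:
        if antecedent <= items:
            return label
    return default

# Hypothetical toy data: (user/movie attributes, class label).
records = [
    ({"age": "18-24", "genre": "Action"}, "like"),
    ({"age": "18-24", "genre": "Action"}, "like"),
    ({"age": "18-24", "genre": "Drama"}, "dislike"),
    ({"age": "35-44", "genre": "Drama"}, "like"),
    ({"age": "35-44", "genre": "Drama"}, "like"),
    ({"age": "35-44", "genre": "Action"}, "dislike"),
]
rules = mine_class_rules(records)
prediction = classify(rules, {"age": "18-24", "genre": "Action"})
```

Because the rules are induced once and then only matched at prediction time, the expensive step can run off-line.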
Recommender Systems
• Classification of recommendation methods
  • Content-based: compare text documents to user profiles
  • Collaborative filtering: based on the opinions of other users (ratings)
    • Memory based (User-based): find users with similar preferences (neighbors) by means of statistical techniques
    • Model based (Item-based): use data mining techniques to develop a model of user ratings
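As an illustration of the memory-based (user-based) approach, the sketch below finds neighbors with cosine similarity over co-rated items and predicts a rating as a similarity-weighted average. Cosine similarity is just one possible statistical measure, chosen here as an assumption; the ratings and names are hypothetical.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict_rating(ratings, user, item, k=2):
    """Weighted average of the k most similar neighbors' ratings for item."""
    neighbors = []
    for other, other_ratings in ratings.items():
        if other != user and item in other_ratings:
            sim = cosine_similarity(ratings[user], other_ratings)
            if sim > 0:
                neighbors.append((sim, other_ratings[item]))
    neighbors.sort(reverse=True)
    top = neighbors[:k]
    if not top:
        return None  # no usable neighbor: nothing can be predicted
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

# Hypothetical ratings on a 1-5 scale: user -> {movie: rating}.
ratings = {
    "ann": {"m1": 5, "m2": 3},
    "bob": {"m1": 5, "m2": 3, "m3": 4},
    "cat": {"m1": 1, "m2": 5, "m3": 2},
    "dan": {"m4": 4},
}
predicted = predict_rating(ratings, "ann", "m3")
```

Note that every prediction scans all other users, which is where the scalability cost of memory-based methods comes from.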
Recommender Systems
• Critical drawbacks
  • Sparsity: the number of ratings needed for prediction is greater than the number of ratings obtained from users
  • Scalability: performance problems, mainly in memory-based methods, where computation time grows linearly with both the number of customers and the number of products on the site
  • First-rater problem: new products have never been rated, therefore they cannot be recommended
  • Cold-start problem: new users cannot receive recommendations since they have not yet rated any products
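These drawbacks can be made concrete on a small rating table. The sketch below measures sparsity as the fraction of empty user-item cells (one common way to quantify it, assumed here) and lists the items affected by the first-rater problem and the users affected by the cold-start problem. All data and names are hypothetical.

```python
def rating_matrix_stats(ratings, all_users, all_items):
    """Return (sparsity, never-rated items, users without ratings)."""
    rated_pairs = {(user, item) for user, item, _ in ratings}
    # Fraction of user-item cells that hold no rating.
    sparsity = 1 - len(rated_pairs) / (len(all_users) * len(all_items))
    # First-rater problem: items nobody has rated yet.
    unrated_items = sorted(set(all_items) - {item for _, item in rated_pairs})
    # Cold-start problem: users who have not rated anything yet.
    new_users = sorted(set(all_users) - {user for user, _ in rated_pairs})
    return sparsity, unrated_items, new_users

# Hypothetical data: (user, item, rating) triples.
users = ["u1", "u2", "u3"]
items = ["m1", "m2", "m3", "m4"]
ratings = [("u1", "m1", 5), ("u1", "m2", 3), ("u2", "m1", 4)]

sparsity, unrated_items, new_users = rating_matrix_stats(ratings, users, items)
```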
Recommendation framework
• Associative classification (Web Mining)
  • Sparsity: only slightly sensitive to sparse data
  • Scalability: model-based approach
• Domain-specific ontology (Semantic Web)
  • First-rater problem:
    • Use of taxonomies to classify products
    • Induction of abstract patterns which relate user profiles to categories of products
  • Cold-start problem:
    • Recommendations based on user profiles
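One way to read this framework is as a fallback: use a pattern about the specific product when one exists, otherwise map the product to its category through the taxonomy and apply a pattern that relates the user profile to that category. The sketch below is a minimal illustration under that assumption; the taxonomy, rule tables and function name are hypothetical and not taken from the presentation.

```python
# Hypothetical taxonomy: movie title -> category (genre).
TAXONOMY = {"Movie A": "Comedy", "Movie B": "Drama", "New Movie": "Comedy"}

# Hypothetical product-level patterns: (age group, gender, title) -> recommend?
LOW_LEVEL_RULES = {
    ("18-24", "F", "Movie A"): True,
    ("18-24", "F", "Movie B"): False,
}

# Hypothetical abstract patterns: (age group, gender) -> preferred categories.
HIGH_LEVEL_RULES = {
    ("18-24", "F"): {"Comedy"},
    ("35-44", "M"): {"Drama"},
}

def recommend(profile, movie):
    """Return (decision, level of the pattern that produced it)."""
    key = (profile["age"], profile["gender"], movie)
    if key in LOW_LEVEL_RULES:
        return LOW_LEVEL_RULES[key], "low-level"
    # No product-level pattern (unrated movie or new user):
    # fall back to the category assigned by the taxonomy.
    genre = TAXONOMY.get(movie)
    preferred = HIGH_LEVEL_RULES.get((profile["age"], profile["gender"]), set())
    return genre in preferred, "high-level"

known = recommend({"age": "18-24", "gender": "F"}, "Movie A")
first_rater = recommend({"age": "18-24", "gender": "F"}, "New Movie")
cold_start = recommend({"age": "35-44", "gender": "M"}, "Movie B")
```

The fallback only needs profile attributes and a product category, so it works even when no rating exists for the movie or from the user.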
Case Study
• MovieLens Data
  • User Data: ID (Num.), Age (Num.), Gender (Binary), Occupation (String), Zip (Num.)
  • Movies Data: ID (Num.), Title (String), Genre (19 attributes, Binary)
  • Ratings Data: ID (Num.), UserID (Num.), MovieID (Num.), Rating (Num., 1-5)
Case Study
• MovieLens Data
  • ID: Num.
  • *User Age: < 18, [18, 24], [25, 34], [35, 44], [45, 49], [50, 55], > 55
  • User Gender: Binary
  • User Occupation: String
  • Movie Title: String
  • *Movie Genre: String
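The two transformed attributes can be sketched as follows: a numeric age is mapped to one of the seven intervals listed above, and a row of binary genre flags is collapsed into a single string. The slide does not say how movies with several genres were encoded, so joining the names with "|" is an assumption, and the three genre names are hypothetical placeholders for the 19 attributes.

```python
from bisect import bisect_left

# Upper bound of each interval except the last, for integer ages.
AGE_UPPER_BOUNDS = [17, 24, 34, 44, 49, 55]
AGE_LABELS = ["< 18", "[18, 24]", "[25, 34]", "[35, 44]",
              "[45, 49]", "[50, 55]", "> 55"]

def age_group(age):
    """Map an integer age to its interval label."""
    return AGE_LABELS[bisect_left(AGE_UPPER_BOUNDS, age)]

def genre_string(flags, genre_names):
    """Collapse binary genre flags into one string (assumed '|' separator)."""
    return "|".join(name for name, flag in zip(genre_names, flags) if flag)

# Hypothetical subset of genre attributes, for illustration only.
GENRES = ["Action", "Comedy", "Drama"]

groups = [age_group(a) for a in (17, 18, 24, 25, 55, 56)]
genre = genre_string([1, 0, 1], GENRES)
```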
Case Study
• Ontology definition
Case Study
• Results
  • Associative classification methods (CBA, CMAR, FOIL and CPAR) were compared to non-associative classification algorithms
Conclusions
• A framework for recommender systems is proposed in order to overcome some critical drawbacks
• The proposal combines web mining methods and domain-specific ontologies in order to induce models at two abstraction levels:
  • The low-level model relates users, movies and ratings for making the recommendations
  • The high-level model is used for recommending unrated movies or for making recommendations to new users, overcoming the first-rater and cold-start problems
• The off-line model induction avoids scalability problems at recommendation time
• Associative classification methods provide a way to deal with the sparsity problem
IDEAS 2011 Lisbon 21-23 September
THANKS FOR YOUR ATTENTION!
Mining Semantic Data for Solving First-rater and Cold-start Problems in Recommender Systems
María N. Moreno*, Saddys Segrera, Vivian F. López, M. Dolores Muñoz & Ángel Luis Sánchez
*mmg@usal.es
Department of Computing and Automatic