“Qualitative Analysis of User-based and Item-based Prediction Algorithms for Recommendation Agents”
Manos Papagelis 1,2 & Dimitris Plexousakis 1,2
1 ICS-FORTH, 2 Computer Science Department, University of Crete
8th International Workshop on Cooperative Information Agents (CIA’04) – Net.ObjectDays 2004, Erfurt, Germany, September 27–29, 2004
Outline • Problem Statement • Methodology • Similarity Measures • Prediction Algorithms • Experimental Evaluation and Results • Ongoing Research & Conclusions
Recommendation Agents
• Approaches
 • Content-based: string comparison, categorization, etc.
 • Collaborative Filtering (CF) based: user similarities derived from user models, rating behavior, etc.
 • Hybrid
• Challenges
 • Sparsity: even active users rate only a fraction of the items in the database
 • Scalability: maintaining accuracy while the numbers of users and items scale up to millions
 • Cold-start: new and obscure items are rarely recommended
Unfolding the Recommendation Process
• Which items to recommend? The list of N items with the top N predictions
• How can predictions be obtained? By exploiting other users’ activity
• Which users’ activity should be drawn on? That of users who share the same or relevant interests
“it may be of benefit to one’s search for information to consult the behavior of other users who share the same or relevant interests” → Collaborative Filtering
Collaborative Filtering (CF)
“CF algorithms aim to identify users that have relevant interests by calculating similarities between user profiles”
[Figure: motivating illustration of user models u1–u6 plotted by rating activity along axes x, y, z]
How similar are two users over their co-rated items? Common measures [Breese et al., Herlocker et al.]:
• Pearson Correlation Coefficient
• Cosine Vector Similarity
• Spearman Correlation
• Mean-squared Difference
• Entropy-based Uncertainty
Our Framework
We define:
• A set of m users U = {u_x : x = 1, 2, …, m}
• A set of n items I = {i_x : x = 1, 2, …, n}
• A set of p item categories C = {c_x : x = 1, 2, …, p}
• A set of q explicit ratings R = {r_x : x = 1, 2, …, q}
• A set of t implicit ratings R′ = {r′_x : x = 1, 2, …, t}
Explicit rating: a rating that expresses the preference of a user for a specific item.
Implicit rating: each explicit rating of a user for a specific item “implicitly” identifies the user’s preference for the categories that this item belongs to.
[Figure: example — an explicit rating r by a user on an item implicitly rates the item’s categories CatA, CatB, CatC]
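As a concrete illustration of this explicit-to-implicit mapping, the sketch below derives a user-category matrix from the user-item ratings. Averaging a user's ratings per category is an assumption made for illustration, and the names (`implicit_category_ratings`, `item_categories`) are hypothetical:

```python
from collections import defaultdict

def implicit_category_ratings(ratings, item_categories):
    """Derive implicit user-category ratings from explicit user-item ratings.

    ratings:         {user: {item: explicit_rating}}
    item_categories: {item: set of categories the item belongs to}
    Aggregating by averaging is an illustrative assumption.
    """
    collected = defaultdict(lambda: defaultdict(list))
    for user, user_ratings in ratings.items():
        for item, r in user_ratings.items():
            for cat in item_categories.get(item, ()):
                collected[user][cat].append(r)
    # Average the collected ratings per (user, category) pair
    return {u: {c: sum(v) / len(v) for c, v in cats.items()}
            for u, cats in collected.items()}
```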
Similarity Measures (1/3)
• Distinctions
 • User-based vs. Item-based similarity
 • Explicit ratings vs. Implicit ratings
• Definition of three matrices: User-Item, User-Category, and Item-Category
Similarity Measures (2/3)
• User-based similarity derived from
 • Explicit ratings (over the User-Item matrix)
 • Implicit ratings (over the User-Category matrix)
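A minimal sketch of user-based similarity using the Pearson correlation coefficient over co-rated items. The paper's exact formulation may differ, e.g. in whether user means are taken over co-rated items or over all of a user's ratings; the co-rated variant is assumed here:

```python
from math import sqrt

def pearson_user_similarity(ratings, u, v):
    """Pearson correlation between users u and v over their co-rated items.

    ratings: {user: {item: rating}}
    """
    co_rated = set(ratings[u]) & set(ratings[v])
    if not co_rated:
        return 0.0  # no common items, no basis for comparison
    mean_u = sum(ratings[u][i] for i in co_rated) / len(co_rated)
    mean_v = sum(ratings[v][i] for i in co_rated) / len(co_rated)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v)
              for i in co_rated)
    den_u = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in co_rated))
    den_v = sqrt(sum((ratings[v][i] - mean_v) ** 2 for i in co_rated))
    if den_u == 0 or den_v == 0:
        return 0.0  # one user rated all co-rated items identically
    return num / (den_u * den_v)
```

Applied to the User-Item matrix this yields the explicit-rating similarity; applied to a User-Category matrix such as the one derived above, the same routine yields the implicit-rating similarity.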
Similarity Measures (3/3)
• Item-based similarity derived from
 • Explicit ratings (over the User-Item matrix)
 • The item-category bitmap matrix
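For the bitmap variant, one plausible reading (an assumption, not necessarily the paper's exact measure) is a cosine similarity over the items' 0/1 category vectors, where the dot product reduces to the size of the category overlap:

```python
def bitmap_item_similarity(item_categories, i, j):
    """Cosine similarity between the category bitmaps of items i and j.

    item_categories: {item: set of categories}; cosine over 0/1 vectors
    is an illustrative choice, not confirmed by the paper.
    """
    ci, cj = item_categories[i], item_categories[j]
    if not ci or not cj:
        return 0.0
    return len(ci & cj) / (len(ci) ** 0.5 * len(cj) ** 0.5)
```

The explicit-rating item similarity can be computed with the same Pearson routine as above, with the roles of users and items transposed (correlating two items over the users who rated both).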
Prediction Algorithms (1/2)
• User-based prediction algorithms, built on the corresponding similarity measures:
 • Explicit ratings (CFUB-ER)
 • Explicit ratings, content-boosted (CFUB-ER-CB)
 • Implicit ratings (CFUB-IR)
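A hedged sketch of a classic user-based prediction in the CFUB-ER style: a similarity-weighted deviation-from-mean over the other users. The precomputed `means` and `sims` structures and the absence of neighborhood pruning are illustrative assumptions:

```python
def predict_user_based(ratings, means, sims, a, i):
    """Predict user a's rating for item i from similar users' deviations.

    ratings: {user: {item: rating}}
    means:   {user: mean of that user's ratings}
    sims:    {(a, u): user-user similarity}
    """
    num = den = 0.0
    for u in ratings:
        if u == a or i not in ratings[u]:
            continue  # skip the target user and non-raters of i
        s = sims.get((a, u), 0.0)
        num += s * (ratings[u][i] - means[u])
        den += abs(s)
    if den == 0:
        return means[a]  # no usable neighbors: fall back to a's own mean
    return means[a] + num / den
```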
Prediction Algorithms (2/2)
• Item-based prediction algorithms, built on the corresponding similarity measures:
 • Explicit ratings (CFIB-ER)
 • Implicit ratings (CFIB-IR)
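Correspondingly, a minimal item-based prediction in the CFIB-ER style: a similarity-weighted average over the items most similar to the target item that the user has already rated. The neighborhood size `k` is an illustrative assumption:

```python
def predict_item_based(ratings, item_sims, a, i, k=20):
    """Predict user a's rating for item i from a's ratings on similar items.

    item_sims: {(i, j): item-item similarity}; k is illustrative.
    """
    rated = [(item_sims.get((i, j), 0.0), r)
             for j, r in ratings[a].items() if j != i]
    neighbors = sorted(rated, reverse=True)[:k]  # k most similar rated items
    den = sum(abs(s) for s, _ in neighbors)
    if den == 0:
        return None  # no similar rated items to draw on
    return sum(s * r for s, r in neighbors) / den
```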
Experimental Evaluation & Results
• Data set
 • 2100 ratings (range 1 to 10), 115 users, 650 items, 20 item categories
 • Sparsity ≈ 0.972 (i.e. 1 − 2100 / (115 × 650): about 97% of the user-item matrix is empty)
 • 300-item sample sets
• Metrics
 • Coverage ~99%
 • Accuracy
  • Statistical accuracy: Mean Absolute Error (MAE)
  • Decision-support accuracy: Receiver Operating Curve (ROC)
Mean Absolute Error (MAE)
Q: What is the deviation of predicted ratings from their actual ratings?
We plot MAE vs. sparsity. Remarks:
• Item-based predictions result in higher accuracy than user-based predictions
• Predictions based on explicit ratings result in higher accuracy than predictions based on implicit ratings
• CFIB-ER is very sensitive to sparsity levels
• CFIB-ER performs as much as 39.5% better than classic CF at sparsity 0.972
• Content-boosted algorithms slightly improve accuracy but require expensive computations
[Figure: example at sparsity 0.972 — MAE values 1.703, 1.385, 1.35, 1.34, and 0.838 across the five algorithms]
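For reference, MAE is simply the average absolute deviation between predicted ratings and the withheld actual ratings:

```python
def mean_absolute_error(predicted, actual):
    """MAE over paired lists of predicted and withheld actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```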
Receiver Operating Curve (ROC) (1/2)
Notation: PR predicted rating, AR actual rating, QT quality threshold.
Four cases defined by the prediction process for one item:
• True Positive (TP) when PR ≥ QT and AR ≥ QT
• False Positive (FP) when PR ≥ QT and AR < QT
• True Negative (TN) when PR < QT and AR < QT
• False Negative (FN) when PR < QT and AR ≥ QT
We plot sensitivity vs. 1 − specificity for quality threshold values between 1 and 9, where sensitivity = tp / (tp + fn) and specificity = tn / (tn + fp), with tp, fn, fp, tn the total numbers of occurrences of TP, FN, FP, TN over the set of items respectively.
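Computing one point of the ROC curve from these four counts is mechanical; a minimal sketch:

```python
def roc_point(pairs, qt):
    """One ROC point (1 - specificity, sensitivity) at quality threshold qt.

    pairs: iterable of (predicted_rating, actual_rating) for the item set.
    """
    tp = fp = tn = fn = 0
    for pr, ar in pairs:
        if pr >= qt and ar >= qt:
            tp += 1
        elif pr >= qt and ar < qt:
            fp += 1
        elif pr < qt and ar < qt:
            tn += 1
        else:
            fn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return 1 - specificity, sensitivity
```

Sweeping `qt` over the threshold values 1 to 9 traces out the full curve.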
Receiver Operating Curve (ROC) (2/2)
Q: How effectively do predictions help a user select high-quality items?
We plot sensitivity vs. 1 − specificity for quality threshold values between 1 and 9. Remarks:
• Item-based predictions provide better accuracy than user-based predictions
• Predictions based on explicit ratings perform better than those based on implicit ratings
• CFIB-ER outperforms classic CF by 95%, 89%, and 29% for ROC-9, ROC-8, and ROC-7 respectively
• CFIB-ER outperforms all other algorithms when comparing the area under the ROC curve
[Figure: example at quality threshold 7 — ROC-7 values 0.71, 0.59, 0.55, 0.53, and 0.39 across the five algorithms]
Ongoing Research (1/2)
• Incremental Collaborative Filtering (ICF)
 • Incremental computation of the cached e, f, g factors of the similarity formula after each single rating
 • Cases to be examined: submission of a new rating vs. update of an old rating; co-rated item or not
 • Caching scheme to support Incremental Collaborative Filtering (ICF)
 • ICF addresses the scalability problem (orders-of-magnitude better performance)
 • ICF retains the highest accuracy, as it is NOT based on any dimensionality-reduction technique (no clustering, Singular Value Decomposition, or other approximation techniques)
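A hedged reconstruction of the idea behind the cached factors (the exact e, f, g decomposition in the ICF work may differ from the sums used here): keep per-user-pair running sums over co-rated items, so a newly submitted rating updates only the pairs involving that item's previous raters, rather than recomputing all similarities:

```python
from collections import defaultdict
from math import sqrt

class IncrementalPearson:
    """Cache per-user-pair sums over co-rated items so that each new rating
    is an O(#raters of that item) update instead of a full recomputation.
    The six accumulators stand in for the slide's cached e, f, g factors;
    user ids must be orderable (e.g. ints). Illustrative sketch only."""

    def __init__(self):
        self.ratings = defaultdict(dict)     # user -> {item: rating}
        self.item_raters = defaultdict(set)  # item -> users who rated it
        # (n, sum_x, sum_y, sum_xy, sum_xx, sum_yy) per ordered user pair
        self.acc = defaultdict(lambda: [0, 0.0, 0.0, 0.0, 0.0, 0.0])

    def add_rating(self, u, i, r):
        """Handle a NEW rating only (an update of an old rating would first
        need to subtract the old contribution, per the slide's case split)."""
        for v in self.item_raters[i]:
            rv = self.ratings[v][i]
            key = (min(u, v), max(u, v))
            x, y = (r, rv) if u < v else (rv, r)
            a = self.acc[key]
            a[0] += 1
            a[1] += x; a[2] += y
            a[3] += x * y; a[4] += x * x; a[5] += y * y
        self.ratings[u][i] = r
        self.item_raters[i].add(u)

    def similarity(self, u, v):
        """Pearson correlation from the cached sums (computational form)."""
        n, sx, sy, sxy, sxx, syy = self.acc[(min(u, v), max(u, v))]
        den = sqrt(max(n * sxx - sx * sx, 0.0)) * \
              sqrt(max(n * syy - sy * sy, 0.0))
        return (n * sxy - sx * sy) / den if den else 0.0
```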
Ongoing Research (2/2)
• Trust Propagation in Small Worlds
 • Identification of self-organized, virtual online communities
 • Exploitation of “trust chains” to deal with
  • the sparsity problem: where there are no co-rated items CF cannot be applied, but trust can still be propagated
  • the cold-start problem (trust propagation)
 • Trust computation based on Collaborative Filtering: where there are co-rated items, CF can be applied
Conclusions and Discussion
• Evaluation of user-based and item-based prediction algorithms
 • Item-based algorithms perform better
 • Category-boosted predictions slightly improve prediction accuracy
 • Explicit ratings are more effective than implicit ratings
• Incremental Collaborative Filtering (ICF) to deal with scalability
• Trust propagation to deal with sparsity and cold-start
Questions?
Thanks! For further details contact: papaggel AT ics.forth.gr dp AT ics.forth.gr