Using an Ontological A-priori Score to Infer User's Preferences
W17: Workshop on Recommender Systems – ECAI 2006
Advisor: Prof. Boi Faltings – EPFL
Presentation Layout
• Introduction
  • Introduce the problem and existing techniques
• Transferring User's Preferences
  • Introduce the assumptions behind our model
  • Explain the transfer of preferences
• Validation of the model
  • Experiment on MovieLens
• Conclusion
  • Remarks & future work
Problem Definition
• Recommendation Problem (RP): recommend a set of items I to the user, out of the set of all items O, based on their preferences P
• Use a Recommender System (RS) to find the best items
• Examples:
  • NotebookReview.com (O = notebooks, P = criteria such as processor type and screen size)
  • Amazon.com (O = books, DVDs, …; P = grading)
  • Google (O = web documents, P = keywords)
Recommendation Systems
• Three approaches to building a RS: [1][2][3][4][5]
• Case-Based Filtering: uses previous cases, e.g. Collaborative Filtering (cases = user ratings)
  • good performance – low cognitive requirements
  • sparsity, latency, shilling attacks, and the cold-start problem
• Content-Based Filtering: uses item descriptions, e.g. Multi-Attribute Utility Theory (descriptions = attributes)
  • matches user preferences – very good precision
  • requires elicitation of weights and value functions
• Rule-Based Filtering: uses associations between items, e.g. Data Mining (associations = rules)
  • finds hidden relationships – good domain discovery
  • expensive and time-consuming
A Major Problem in RS: The Elicitation Problem
• ⇒ Incomplete user model, affecting both:
  • Collaborative Filtering
  • Multi-Attribute Utility Theory
(Figure: sparsely filled item–rating matrix — most of the user's grades are missing)
• This is the central problem of RS
Presentation Layout
• Introduction
  • Introduce the problem and existing techniques
• Transferring User's Preferences
  • Introduce the assumptions behind our model
  • Explain the transfer of preferences
• Validation of the model
  • Experiment on MovieLens
• Conclusion
  • Remarks & future work
Ontology D1
(Figure: example ontology — Transport splits into On-land and On-sea; On-land leads to Vehicle, with sub-concepts Car and Bus, and On-sea leads to Boat; Car splits into City and All_terrain, with SUV and Compact below; edge labels such as "<7" and ">6" mark distinguishing features)
An ontology λ is a graph (DAG) where:
• nodes model concepts, the instances being the items
• edges represent the relations (features)
• sub-concepts are distinguished by certain features
• features are usually not made explicit
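Such an ontology can be encoded as a plain adjacency map. The sketch below is a hypothetical reconstruction of the figure above — the node names come from the slide, but the exact edge set is an assumption:

```python
# Concept -> direct sub-concepts; reconstructed from the slide's figure (assumed edges).
CHILDREN = {
    "transport":   ["on_land", "on_sea"],
    "on_land":     ["vehicle"],
    "on_sea":      ["boat"],
    "vehicle":     ["car", "bus"],
    "car":         ["city", "all_terrain"],
    "all_terrain": ["suv", "compact"],
    "boat": [], "bus": [], "city": [], "suv": [], "compact": [],
}

def descendants(concept, children=CHILDREN):
    """All strict descendants (sub-concepts, recursively) of a concept."""
    seen = set()
    stack = list(children[concept])
    while stack:
        c = stack.pop()
        if c not in seen:               # in a DAG a node may be reachable twice
            seen.add(c)
            stack.extend(children[c])
    return seen

print(len(descendants("vehicle")))      # 6
```

The descendant count is all that the a-priori score on the next slide needs.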
The Score of a Concept – S
• The RP is viewed as predicting the score S assigned to a concept (group of items)
• The score can be seen as a lower-bound function that models how much a user likes an item
• S is a function that satisfies the assumptions:
  • A1: S depends on the features of the item — items are modeled by a set of features
  • A2: each feature contributes independently to S — eliminates inter-dependence between features
  • A3: unknown or disliked features make no contribution — reflects the fact that users are risk-averse; liking a concept ⇏ liking a sub-concept
A-priori Score – APS
• The structure of the ontology contains information
• Use APS(c) to capture the knowledge of concept c
• If no information, assume S(c) uniform on [0..1]: P(S(c) > x) = 1 − x
• A concept can have n descendants; assumption A3 ⇒ P(S(c) > x) = (1 − x)^(n+1)
• Expected score: E(S(c)) = ∫ x·f_c(x) dx = 1/(n+2)
• Hence APS(c) = 1/(n_c + 2): 0.5 at the leaves, decreasing toward the root
• APS uses no user information
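Since P(S(c) > x) = (1−x)^(n+1) gives the density f_c(x) = (n+1)(1−x)^n, the expected score integrates to 1/(n+2), so APS only needs the descendant count. A minimal sketch, with a numeric check of the integral:

```python
def aps(n_descendants):
    """A-priori score of a concept with n descendants: E[S(c)] = 1/(n+2)."""
    return 1.0 / (n_descendants + 2)

# Numeric check of E[S(c)] = integral of x * f_c(x) with f_c(x) = (n+1)(1-x)^n
n, steps = 6, 10_000
expected = sum(
    ((i + 0.5) / steps) * (n + 1) * (1 - (i + 0.5) / steps) ** n
    for i in range(steps)
) / steps

print(aps(0))   # leaf: 0.5
print(aps(6))   # 0.125
```

The midpoint-rule sum agrees with 1/(n+2) to several decimals, confirming the closed form.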
Inference Idea
• Select the best Lowest Common Ancestor: lca(SUV, Bus) – AAAI'06
(Figure: ontology fragment — Vehicle with sub-concepts Car and Bus, SUV below Car; S(SUV) = 0.8 is known, S(Bus) = ??? is to be inferred)
Upward Inference
• Going up k levels ⇒ removing k known features
  • A1: the score depends on the features of the item
(Figure: SUV lies k levels below Vehicle)
• Removing features ⇒ S decreases or stays equal (A2: S = Σ feature contributions)
• S(Vehicle | SUV) = α(Vehicle, SUV) · S(SUV)
• α ∈ [0..1] is the ratio of liked features in common
• How to compute α?
  • α = #features(Vehicle) / #features(SUV) — does not take the feature distribution into account
  • α = APS(Vehicle) / APS(SUV)
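The upward step alone can be sketched as follows; the descendant counts (SUV a leaf, Vehicle with 6 descendants) are illustrative assumptions, not values from the slide:

```python
def aps(n):
    """A-priori score: 1/(n+2) for a concept with n descendants."""
    return 1.0 / (n + 2)

s_suv = 0.8                    # elicited score, as in the slide's example
alpha = aps(6) / aps(0)        # APS(Vehicle)/APS(SUV) = 0.125/0.5 = 0.25
s_vehicle = alpha * s_suv      # upward transfer: 0.25 * 0.8 = 0.2
```

Because Vehicle has more descendants than SUV, its APS is smaller, so α < 1 and the transferred score shrinks, as the slide requires.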
Downward Inference
• Going down l levels ⇒ adding l unknown features
  • A2: features contribute independently to the score
  • A3: users are pessimistic — liking some features ⇏ liking others
(Figure: Bus lies l levels below Vehicle)
• Adding features ⇒ S increases or stays equal (S = Σ feature contributions)
• A multiplicative transfer S(Bus|Vehicle) = α·S(Vehicle) with α ≥ 1 does not hold; instead:
• S(Bus | Vehicle) = S(Vehicle) + β(Vehicle, Bus)
• β ∈ [0..1] is the sum of contributions of the features in Bus not present in Vehicle
• How to compute β? β = APS(Bus) − APS(Vehicle)
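The downward step alone, under the same illustrative descendant counts as before (Bus a leaf, Vehicle with 6 descendants — assumptions, not slide values):

```python
def aps(n):
    """A-priori score: 1/(n+2) for a concept with n descendants."""
    return 1.0 / (n + 2)

s_vehicle = 0.2                # score previously inferred at the Vehicle concept
beta = aps(0) - aps(6)         # APS(Bus) - APS(Vehicle) = 0.5 - 0.125 = 0.375
s_bus = s_vehicle + beta       # downward transfer: 0.2 + 0.375 = 0.575
```

β is positive because a leaf always has a larger APS than its ancestors, so adding unknown features can only raise the lower-bound score.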
Overall Inference
• There exists a chain between "City" and "Vehicle", but not a path
(Figure: Vehicle with sub-concepts Car and Bus, SUV below Car; S(SUV) is elicited from the user, APS is used for the other concepts)
• As for Bayesian Networks, we assume independence
• S(Bus | SUV) = α·S(SUV) + β
• The score of a concept y knowing x is defined as:
  S(y|x) = α(x, lca(x,y))·S(x) + β(y, lca(x,y))
• The score function is asymmetric
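Putting the two steps together, the full transfer climbs from x to the lowest common ancestor and then descends to y. A sketch with hypothetical descendant counts for the slide's example (the counts are assumptions):

```python
def aps(n):
    """A-priori score: 1/(n+2) for a concept with n descendants."""
    return 1.0 / (n + 2)

# Hypothetical descendant counts for the example ontology (not from the slide).
N_DESC = {"suv": 0, "bus": 0, "car": 4, "vehicle": 6}

def score(y, x, s_x, lca):
    """S(y|x): climb from x up to lca (alpha), then descend to y (beta)."""
    alpha = aps(N_DESC[lca]) / aps(N_DESC[x])   # fraction of x's score kept
    beta = aps(N_DESC[y]) - aps(N_DESC[lca])    # contribution of y's new features
    return alpha * s_x + beta

s_bus = score("bus", "suv", 0.8, lca="vehicle")  # 0.25*0.8 + 0.375 ≈ 0.575
```

Since α depends on x and β on y, S(y|x) and S(x|y) generally differ, which is the asymmetry the slide notes.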
Presentation Layout
• Introduction
  • Introduce the problem and existing techniques
• Transferring User's Preferences
  • Introduce the assumptions behind our model
  • Explain the transfer of preferences
• Validation of the model
  • WordNet (built best similarity metric – see paper)
  • Experiment on MovieLens
• Conclusion
  • Remarks & future work
Validation – Transfer – I
• MovieLens database used by the CF community:
  • 100,000 ratings on 1,682 movies by 943 users
• MovieLens movies are modeled by 22 attributes:
  • 19 themes, MPAA rating, duration, and release date
  • extracted from IMDB.com
• Built an ontology modeling the 22 attributes of a movie
  • used definitions found in various online dictionaries
Validation – Transfer – II
• Experiment setup – for each of the 943 users:
  1. Filter out users with fewer than 65 ratings
  2. Split the user's data into a learning set and a test set
  3. Compute utility functions from the learning set
    • frequency-count algorithm for 10 of the attributes
    • our inference approach for the other 12 attributes
  4. Predict the grades of 15 movies from the test set with:
    • our approach – HAPPL (LNAI 4198 – WebKDD'05)
    • item–item based CF (using adjusted cosine)
    • popularity ranking
  5. Compute the accuracy of the Top-5 predictions
    • using the Mean Absolute Error (MAE)
  6. Go back to step 3 with a bigger training set {5, 10, 20, …, 50}
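The accuracy measure in step 5 is the Mean Absolute Error over the predicted grades; a minimal sketch (the grade values below are made up for illustration):

```python
def mae(predicted, actual):
    """Mean Absolute Error between predicted and true grades."""
    assert len(predicted) == len(actual) and predicted
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

print(mae([4.0, 3.5, 5.0], [4, 3, 4]))   # (0 + 0.5 + 1) / 3 = 0.5
```

Lower MAE means more accurate grade predictions, so the comparison between HAPPL, item–item CF, and popularity ranking reads directly off this number.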
Conclusions
• We have introduced the idea that an ontology can be used to transfer missing preferences
• The ontology can be used to compute an a-priori score, APS(c) = 1/(n+2)
• Inference model with an asymmetric score function
• Outperforms CF without using other people's information
• Requirements & conditions:
  • A2 – features contribute to preferences independently
  • need an ontology modeling the whole domain
• Next steps: try to learn the ontology
  • preliminary results show that we still outperform CF
  • a learned ontology gives a more restricted search space
Questions? Thank you! Slides: http://people.epfl.ch/vincent.schickel-zuber
References – I
[1] Survey of Solving Multi-Attribute Decision Problems. Jiyong Zhang and Pearl Pu, EPFL Technical Report, 2004.
[2] Improving Case-Based Recommendation: A Collaborative Filtering Approach. Derry O'Sullivan, David Wilson, and Barry Smyth, Lecture Notes in Computer Science, 2002.
[3] An Improved Collaborative Filtering Approach for Predicting Cross-Category Purchases Based on Binary Market Data. Andreas Mild and Thomas Reutterer, Journal of Retailing and Consumer Services, Special Issue on Model Building in Retailing & Consumer Services, 2002.
[4] Using Content-Based Filtering for Recommendation. Robin van Meteren and Maarten van Someren, ECML 2000 Workshop, 2000.
[5] Content-Based Filtering and Personalization Using Structured Metadata. A. Mufit Ferman, James H. Errico, Peter van Beek, and M. Ibrahim Sezan, JCDL '02, 2002.
References – II
[AAAI'06] Inferring User's Preferences Using Ontologies. Vincent Schickel and Boi Faltings, In Proc. AAAI'06, pp. 1413–1419, 2006.
[LNAI 4198] Overcoming Incomplete User Models in Recommendation Systems via an Ontology. Vincent Schickel and Boi Faltings, LNAI 4198, pp. 39–57, 2006.