User Profiling based on Folksonomy Information in Web 2.0 for Personalized Recommender Systems

Huizhi (Elly) Liang. Supervisors: Yue Xu, Yuefeng Li, Richi Nayak. Queensland University of Technology, Australia

Agenda • 1 Introduction • 2 Literature Review • 3 The Proposed Approaches • 4 Experiments • 5 Conclusion

1 Introduction

Information overload • Personalization: “Personalization is the ability to provide content and services tailored to individuals based on knowledge about their preferences and behaviours” (Hagen, 1999) • Recommender systems • User profiling • Explicit user profiles • Explicit ratings • Implicit user profiling • Web log • Other information sources

Web 2.0 • Web 2.0: Read and Write web (O’Reilly Media, 2004) • A platform for users to conduct online participation, collaboration and interaction • Expressing opinions, sharing information, building networks • Wikipedia, Facebook, Delicious, Twitter • Plenty of new user information • Folksonomy (tags), reviews, networks, blogs, micro-blogs etc. • Opportunities • Providing possible new solutions to profile users

Folksonomy • Folksonomy= folk + taxonomy • Tags: Typical Web 2.0 information • Keywords given by users to organize and classify items • The wisdom of crowds • Multiple functions • Item organizing and sharing • Building networks • Expressing users’ explicit topic interests and opinions

Tag Cloud

Folksonomy (tags) • Given by users explicitly and proactively • Reflects users’ personal viewpoints and topic preferences • Less intrusive, with multiple functions • Lightweight textual information • Contains a lot of noise • Taxonomy (categories) • Given by experts • Standard vocabulary and structural relationships • Well recognized as common knowledge • Independent of user communities • No information about users’ personal viewpoints or preferences

2 Literature Review

User Profiling • Web User profiling • Web content & structure • Web log & Web usage • Taxonomy & Ontology • User Profiling in Web 2.0 • New user information sources • Folksonomy, blogs, reviews, micro-blogs • Videos, audios, images • Friends, trust network, followers, following

User Profiling 2 • User Profiling based on folksonomy • Approaches • Users’ own tags • Associated tags • Latent topics of tags • Popular tags • Challenges • Distinctive features of tags • Tag quality problem • Semantic ambiguity and synonyms • About 60% of tags are personal tags

Recommender system • Recommendation tasks • Top N Recommendation (Precision, Recall, F1) • Rating Prediction (Mean Absolute Error, Root Mean Squared Error) • Recommendation approaches • Content based • Term vector model • Latent Dirichlet Allocation (LDA) • Collaborative Filtering (CF) • Memory based CF: User-KNN & Item-KNN • Model based CF: Matrix Factorization techniques • Hybrid

Recommender system 2 • Recommender systems based on Taxonomy • Ziegler’s approach (CIKM, 2004) • Recommender systems based on Folksonomy • Tag recommendations • Tensor based approach (KDD, 2009) • Graph based approach (SIGIR, 2009) • Item recommendations • Tso-Sutter’s approach (SAC, 2008) • Clustering (RecSys, 2009) • LDA approach (HT, 2009) • Graph Rank (2010) • Special tag rating function (WWW, 2009)

Research Problem • Research Gap • Features of folksonomy • Noise of folksonomy • Combining with taxonomy • Research Problem • Profiling users based on folksonomy information in Web 2.0 to enhance recommender systems

3 The Proposed Approaches

The Proposed Approaches • User Profiling Models • User Profiling based on Folksonomy • User Profiling based on Taxonomy • Hybrid User Profiling • Recommender System • Top N item recommendation • [Figure: User Profiling (Folksonomy, Taxonomy, Hybrid) and Recommendation Making]

The Relationship Modelling • The multiple relationships of tagging • Two dimensional relationships • User-Item relationship • User-Tag relationship • Item-Tag relationship • Three dimensional relationship • Personal tagging behavior: User-Tag-Item relationship • (User×Tag)-Item mapping • Item-(User×Tag) mapping
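
The relationships listed on this slide can be derived from raw tagging records. The sketch below is only an illustration, assuming each record is a (user, tag, item) triple; the function name `build_relationships`, the dictionary keys, and the sample records are hypothetical and not taken from the talk.

```python
from collections import defaultdict

def build_relationships(triples):
    """Index (user, tag, item) records into the five tagging relationships."""
    rel = {
        "user_item": defaultdict(set),     # User-Item relationship
        "user_tag": defaultdict(set),      # User-Tag relationship
        "item_tag": defaultdict(set),      # Item-Tag relationship
        "usertag_item": defaultdict(set),  # (User x Tag) -> Item mapping
        "item_usertag": defaultdict(set),  # Item -> (User x Tag) mapping
    }
    for user, tag, item in triples:
        rel["user_item"][user].add(item)
        rel["user_tag"][user].add(tag)
        rel["item_tag"][item].add(tag)
        rel["usertag_item"][(user, tag)].add(item)
        rel["item_usertag"][item].add((user, tag))
    return rel

# Hypothetical sample records, for illustration only.
records = [("u1", "apple", "i1"), ("u1", "garden", "i1"), ("u2", "apple", "i2")]
rel = build_relationships(records)
print(sorted(rel["item_usertag"]["i1"]))
```

Keeping the (user, tag) pair as a single key is what preserves the personal tagging behaviour: the same tag used by two different users stays distinguishable.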

Part 1: User Profiling Approaches based on Folksonomy • Tag representation-Folksonomy • Item representation-Folksonomy • User representation-Folksonomy • User Profiling-Folksonomy

Tag representation-Folksonomy • Reduce the noise of tags • Find the personally related tags of each tag • Determine the relevance weight • Relevance weight of two tags with respect to a user • The collected items of a tag • The expectation of the probability of a tag being used for the collected items • Probability of a tag being used for an item: (number of users who used the tag for the item) / (number of users who collected the item) • [Figure: example tags “garden”, “apple”, “globalization”, “internet” with relevance weights 0.34 and 0.16]
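
One possible reading of the relevance weight on this slide is sketched below. It assumes that, for a user and one of that user's tags, the weight of any other tag is the average, over the items the user collected with the first tag, of the fraction of an item's collectors who used the other tag. The function name and the sample records are hypothetical, and the exact formula in the talk may differ.

```python
from collections import defaultdict

def personal_tag_relevance(triples, user, tag):
    """Relevance of every tag to `tag` with respect to `user` (assumed form)."""
    users_of_item = defaultdict(set)      # item -> users who collected it
    users_of_tag_item = defaultdict(set)  # (tag, item) -> users who used tag on item
    collected = set()                     # items `user` collected with `tag`
    for u, t, i in triples:
        users_of_item[i].add(u)
        users_of_tag_item[(t, i)].add(u)
        if u == user and t == tag:
            collected.add(i)
    if not collected:
        return {}
    totals = defaultdict(float)
    for (t, i), users in users_of_tag_item.items():
        if i in collected:
            # Probability that tag t is used for item i.
            totals[t] += len(users) / len(users_of_item[i])
    # Expectation over the collected items.
    return {t: s / len(collected) for t, s in totals.items()}

# Hypothetical sample records, for illustration only.
records = [
    ("u1", "apple", "i1"), ("u2", "fruit", "i1"),
    ("u1", "apple", "i2"), ("u2", "apple", "i2"), ("u3", "mac", "i2"),
]
weights = personal_tag_relevance(records, "u1", "apple")
print(weights)
```

Under this reading, a tag that only one user ever applies to an item still receives a small non-zero weight, so personal tags are down-weighted rather than discarded.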

Item representation-Folksonomy • Expand the tags of each item • Find the relevant tags of each item • Determine the relevance weight • The relevance of an item to a tag • User-tag pairs • The relevance of two tags with respect to a user • Inverse item frequency • [Figure: example tags “garden”, “apple”, “globalization”, “internet”, “0403”]

User Representation-Folksonomy • Find users’ preferences for tags • The preference weight of a user for a tag • Preference for one tag: (number of items collected with the tag by the user) / (number of items collected by the user) • The relevance of two tags with respect to a user • Inverse user frequency • [Figure: example tags “garden”, “apple”, “globalization”, “internet”, “0403”]
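
A simplified sketch of the user preference weight follows. It combines the two quantities named on the slide, the share of a user's items collected with a tag and an inverse user frequency, in a TF-IDF style product. The logarithmic form of the inverse user frequency and the plain product are assumptions, and the sketch leaves out the tag-to-tag relevance term; the function name and sample records are hypothetical.

```python
import math
from collections import defaultdict

def user_tag_preferences(triples):
    """Preference weight of each user for each of their own tags (assumed form)."""
    items_of_user = defaultdict(set)      # user -> items collected
    items_of_user_tag = defaultdict(set)  # (user, tag) -> items collected with tag
    users_of_tag = defaultdict(set)       # tag -> users who used it
    for u, t, i in triples:
        items_of_user[u].add(i)
        items_of_user_tag[(u, t)].add(i)
        users_of_tag[t].add(u)
    n_users = len(items_of_user)
    prefs = {}
    for (u, t), items in items_of_user_tag.items():
        share = len(items) / len(items_of_user[u])      # preference for one tag
        iuf = math.log(n_users / len(users_of_tag[t]))  # assumed log-scaled IUF
        prefs[(u, t)] = share * iuf
    return prefs

# Hypothetical sample records, for illustration only.
records = [
    ("u1", "apple", "i1"), ("u1", "apple", "i2"),
    ("u1", "web", "i3"), ("u2", "web", "i4"),
]
prefs = user_tag_preferences(records)
print(prefs[("u1", "apple")])
```

With this form, a tag used by every user gets an inverse user frequency of zero, so it contributes nothing to distinguishing one user from another.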

User Profiling-Folksonomy • User • Item preferences • Implicit ratings • Topic preferences • Tag vocabulary • Item • Tag vocabulary • [Figure: user and item profiles over the example tags “garden”, “apple”, “globalization”, “internet”, “0403”]

Part 2: User Profiling based on Taxonomy • Advantages of Taxonomy • Standard vocabulary • Well recognized • Independent of user communities • Experts’ viewpoints • Representations • Item representation-Taxonomy • Tag representation-Taxonomy • User representation-Taxonomy • User Profiling-Taxonomy

Item Representation-Taxonomy • Find the relevant taxonomic topics of each item • The relevance of an item to a taxonomic topic • The average weight of a taxonomic topic in all descriptors • The weight of a taxonomic topic in an item descriptor • Deploy weight from leaf topic to root topic • Inverse item frequency • [Figure: example taxonomic topics “book”, “computers”, “programming”, “networks”]
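
The slide does not say how weight is deployed from a leaf topic to the root. The sketch below shows one common scheme as an assumption: each ancestor receives a fixed fraction of its child's weight, and the weights along the descriptor are then normalized to sum to 1. The function name, the `decay` parameter, and its value of 0.5 are hypothetical.

```python
def propagate_leaf_to_root(path, leaf_weight=1.0, decay=0.5):
    """Spread a descriptor's weight from its leaf topic up to the root.

    `path` lists the topics of one item descriptor from root to leaf.
    Each ancestor gets `decay` times the weight of its child (assumed rule).
    """
    weights = {}
    w = leaf_weight
    for topic in reversed(path):  # walk from the leaf up to the root
        weights[topic] = w
        w *= decay
    total = sum(weights.values())
    return {topic: value / total for topic, value in weights.items()}

# Example descriptor using topic names that appear on the slide.
descriptor = ["book", "computers", "programming"]
print(propagate_leaf_to_root(descriptor))
```

The most specific topic keeps the largest share, while broader ancestors still receive some weight so that items sharing only a general category remain comparable.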

Tag Representation-Taxonomy • Reduce the noise of tags • Find the personal semantic meaning of each tag • The relevance of a tag to a taxonomic topic with respect to a user • The collected items of a tag • Average relevance weight of a taxonomic topic to the collected items • [Figure: example tags “garden” and “apple” with taxonomic topics “flowers”, “fruit”, “computers”, “programming”, “networks”, “databases”]

User Representation-Taxonomy • Find users’ preferences for taxonomic topics • The preference weight of a user for a taxonomic topic • Preference for a tag • Relevance of a tag to a taxonomic topic with respect to the user • Inverse user frequency • [Figure: example tag “0403” with taxonomic topics “book”, “computers”, “programming”, “databases”]

User Profiling-Taxonomy • User • Item preferences • Implicit ratings • Topic preferences • Taxonomy vocabulary • Item • Taxonomy vocabulary • [Figure: user and item profiles over the example taxonomic topics “book”, “computers”, “programming”, “networks”, “databases”]

Part 3: Hybrid User Profiling • Combine Part 1 and Part 2 • Wisdom of crowds • Tag vocabulary & Users’ viewpoints • Wisdom of experts • Taxonomy vocabulary & Experts’ viewpoints • Tag representation-Hybrid • Item representation-Hybrid • User representation-Hybrid

Personalized Recommendation Making • Top N item recommendation • [Figure: User Profiling (Folksonomy, Taxonomy, Hybrid); Recommendation Making (Neighbourhood Formation, Recommendation Generation)]

Neighbourhood Formation • K-Nearest Neighbourhood • User-KNN • Similarity of item preferences • Similarity of topic preferences • Tags • Taxonomic topics • Linear combination • [Figure: user similarity combining item preferences and topic preferences (tags, taxonomic topics)]
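
The User-KNN step can be sketched as follows. The example assumes cosine similarity for both the item-preference and topic-preference vectors and a single mixing weight `alpha` for the linear combination; the slide names only the linear combination, so the similarity measure, `alpha`, the function names, and the sample profiles are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def user_similarity(u, v, alpha=0.5):
    """Linear combination of item-preference and topic-preference similarity."""
    return (alpha * cosine(u["items"], v["items"])
            + (1 - alpha) * cosine(u["topics"], v["topics"]))

def nearest_neighbours(target, profiles, k=2, alpha=0.5):
    """Return the names of the k users most similar to `target`."""
    scored = [
        (user_similarity(profiles[target], profile, alpha), name)
        for name, profile in profiles.items() if name != target
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Hypothetical user profiles, for illustration only.
profiles = {
    "a": {"items": {"i1": 1.0, "i2": 1.0}, "topics": {"apple": 1.0}},
    "b": {"items": {"i1": 1.0, "i2": 1.0}, "topics": {"apple": 1.0}},
    "c": {"items": {"i3": 1.0}, "topics": {"garden": 1.0}},
}
print(nearest_neighbours("a", profiles, k=1))
```

The topic vector can hold tag weights, taxonomic topic weights, or both, which is how the same routine would serve the folksonomy, taxonomy, and hybrid profiles.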

Neighbourhood Formation 2 • K-Nearest Neighbourhood • Item-KNN • Similarity of tags • Similarity of taxonomic topics • Linear combination • [Figure: item similarity combining tags and taxonomic topics]

Recommendation Generation • Candidate items • Neighbour items not tagged by the target user • User based recommendation • Item based recommendation • [Figure: prediction scores computed from user similarity, item similarity, and content matching over tags and taxonomic topics]
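
A minimal sketch of user based recommendation generation is given below. It takes candidate items from the neighbours' collections, drops anything the target user already has, and scores each candidate by summing the similarities of the neighbours who collected it. The summed-similarity score is an assumption that leaves out the content matching term; the function name and sample data are hypothetical.

```python
def recommend_user_based(target_items, neighbours, n=10):
    """Rank candidate items for a target user.

    `target_items` is the set of items the target user already tagged.
    `neighbours` is a list of (similarity, items) pairs for the nearest users.
    """
    scores = {}
    for similarity, items in neighbours:
        for item in items:
            if item in target_items:
                continue  # candidates must not be tagged by the target user
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken by item id for a stable order.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:n]

# Hypothetical sample data, for illustration only.
top = recommend_user_based({"i1"}, [(0.9, {"i1", "i2"}), (0.5, {"i2", "i3"})], n=2)
print(top)
```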

4 Experiments

Datasets • D1: Amazon.com • 4112 users, 34201 tags, 30467 items, 9919 taxonomic topics • D2: CiteULike “Who-posted-what” dataset • 7103 users, 78414 tags, 117279 items • [Figure: power law distributions of tags and items]

Experiment setup • Top N item recommendation • 5-fold cross-validation • 80% training & 20% testing • Evaluation Metrics • Precision, Recall, F1 Measure • Comparisons • Proposed Models • Folksonomy Model: FM-User, FM-Item • Taxonomy Model: TM-User, TM-Item • Hybrid Model: FTM-User, FTM-Item • Baseline Models
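
The evaluation metrics named on this slide follow their standard definitions for a top N list. The sketch below computes them for a single user; the function name and the sample lists are hypothetical, and how the talk averages the values across users and folds is not stated here.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall and F1 of one top N recommendation list.

    `recommended` is the top N list; `relevant` holds the user's test items.
    """
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical top 4 list and test set, for illustration only.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], {"a", "c", "e"})
print(p, r, f1)
```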

Results-I Folksonomy Model • Tag Noise Removing Approaches (Dataset D1) • Parameter setting • FM-User: • : 0.8-1.0 , 1: 0.4-0.5 • FM-Item: •  1: 0.4-0.5

Results-I • The Comparison of the State-of-the-art approaches (Dataset D1)

Results-I • Comparison results of Dataset D2

Results-2 Taxonomy Model • Parameter setting (Dataset D1) • TM-User: •  : 0.8-1.0 , 1: 0.4-0.5 • TM-Item: •  1: 0.4-0.5

Results-3 Hybrid Models • Parameter setting (Dataset D1) • FTM-User: • FTM-Item:1=0.3, • Hybrid Models v.s. Single Models • Folksonomy Model v.s. Taxonomy Model

Results-3 • The influence of personal tags • D1 personal tags: 67%,   10: 4.8% • D2 personal tags: 70% ,  10: 5.2% • Findings • Personal tags can improve the precision results • Precision values decreased dramatically when large number (i.e., 90%) of tags (i.e.,  5) was removed. TM-User, D1 (9919, 0.24)

Discussions • The proposed approaches outperformed other related work • The Hybrid Model performed the best • Each tag counts • Folksonomy can be used as a quality information source (rich personalization information)

5 Conclusions

Conclusions • Web 2.0 • New user information • Modelling the relationships of tagging behaviour • Tag quality problem • The wisdom of crowds & experts • Proposed three user profiling models • User profiling based on folksonomy • User profiling based on taxonomy • Hybrid user profiling • Utilized the proposed user profiles to improve recommender systems • User based • Item based • Evaluation Experiments

Contributions • Advantages • Domain free • Language free • Information overload • User profiling and web personalization • Recommender systems • Web 2.0

Future Work • Time factor • Cross folksonomy recommendations • Mobile platform application • Integrate with other user information • Explicit ratings • Tweets • Friendship network

Published Work • Liang, H. et al. (2010). Personalized Recommender System Based on Item Taxonomy and Folksonomy. CIKM • Liang, H. et al. (2010). Connecting Users and Items with Weighted Tags for Personalized Item Recommendations. Hypertext • Liang, H. et al. (2010). A Hybrid Recommender System based on Weighted Tags. SDM Workshop • Liang, H. et al. (2010). Mining Users’ Opinions based on Item Folksonomy and Taxonomy for Personalized Recommender Systems. ICDM Workshop • Liang, H. et al. (2010). Parallel User profiling based on folksonomy for Large Scaled Recommender Systems-An implementation of Cascading MapReduce. ICDM Workshop • Liang, H. et al. (2009). Collaborative Filtering Recommender Systems based on Popular Tags. ADCS • Liang, H. et al. (2009). Tag Based Collaborative Filtering for Recommender Systems. RSKT • Liang, H. et al. (2009). Personalized Recommender Systems Integrating Social tags and Item Taxonomy. WI • Liang, H. et al. (2008). Collaborative Filtering Recommender Systems Using Tag Information. WI Workshop • Bhuiyan, T., Xu, Y., Jøsang, A., & Liang, H. (2010). Developing Trust Networks Based on User Tagging Information for Recommendation Making. WISE

Acknowledgements • Time • Supervisor Team • HPC group • Panel Members • ISS • Anonymous Reviewers • Papers • Staff • Colleagues • Friends • Google • Books • Sunshine • CSC • Trees • Stars • Music • Trips • Blogs • Beaches • Family …

Questions & Answers • oklianghuizi@gmail.com
