
Mining Tag Semantics for Social Tag Recommendation

Hsin-Chang Yang, Department of Information Management, National University of Kaohsiung


Presentation Transcript


  1. Mining Tag Semantics for Social Tag Recommendation Hsin-Chang Yang Department of Information Management National University of Kaohsiung

  2. Outline • Introduction • Text Mining by SOM • Tag Recommendation Process • Experimental Results • Conclusions

  3. Social Bookmarking: Why? • Social bookmarking services (also known as folksonomies) are gaining popularity because they offer the following benefits: • Reduced effort in Web page annotation • Improved retrieval precision • Simpler Web page classification

  4. How does a folksonomy work? • Simple • A user (ui) annotates a Web page (oj) with a set of tags, or post (Tij). • Generally represented as a set of tuples (ui, oj, Tij) • [Figure: a user ui finds the page "GrC2011 program" (oj) interesting and adds the tag "Granular Computing" (Tij).]

  5. Characteristics of Folksonomy • Collaboration • Semantic relatedness • Helps improve retrieval precision • Social tagging is not a trivial task

  6. Tag Recommendation • The mechanism of suggesting proper tags to ordinary users when they try to add tags to a Web page • Saves users the effort of choosing tags from scratch • Constrains the formulation of tags • An automatic tag recommendation process is thus beneficial for social bookmarking services as well as search engines.

  7. Outline • Introduction • Text Mining by SOM • Tag Recommendation Process • Experimental Results • Conclusions

  8. Text Mining by SOM • Inputs: training Web pages and training posts • Preprocessing → Web page vectors, post vectors • SOM training → synaptic weight vectors • Labeling → page associations, tag associations → page clusters, tag clusters • Association discovery → page/tag associations

  9. Preprocessing • Bag-of-words approach for describing pages and posts • Post: the collection of tags assigned to a page in a single annotation • Each Web page Pi is transformed into a binary vector Pi. • Ti, the post of Pi, is transformed into a binary vector Ti.
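
The binary bag-of-words transformation on this slide can be sketched as follows. This is a minimal illustration, not the author's code: the function names `build_vocabulary` and `to_binary_vector` and the toy pages and posts are assumptions, and the slides do not say how pages were tokenized.

```python
def build_vocabulary(documents):
    """Map each distinct token to a column index, in first-seen order."""
    vocab = {}
    for tokens in documents:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab)
    return vocab


def to_binary_vector(tokens, vocab):
    """Return a 0/1 list with a 1 wherever a vocabulary term occurs in tokens."""
    vec = [0] * len(vocab)
    for tok in tokens:
        idx = vocab.get(tok)
        if idx is not None:
            vec[idx] = 1
    return vec


# Toy data (hypothetical): two tokenized pages and the post attached to each.
pages = [["granular", "computing", "conference"], ["computing", "program"]]
posts = [["grc2011", "granular"], ["program"]]

# Pages and posts get separate vocabularies because they feed separate maps.
page_vocab = build_vocabulary(pages)
tag_vocab = build_vocabulary(posts)
page_vectors = [to_binary_vector(p, page_vocab) for p in pages]
post_vectors = [to_binary_vector(t, tag_vocab) for t in posts]
```

Each vector records only the presence of a term, not its frequency, which matches the "binary vector" wording of the slide.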

  10. SOM Training • The page vectors Pi and the post vectors Ti were trained separately with the self-organizing map algorithm. • Two maps, MP and MT, were obtained after training.
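
For readers unfamiliar with self-organizing maps, the following is a small pure-Python sketch of the standard SOM training loop. It is an illustration under stated assumptions, not the configuration used in the talk: the grid size, the linear decay of the learning rate and neighborhood radius, the Gaussian neighborhood, and the names `train_som` and `best_matching_unit` are all chosen here for the example.

```python
import math
import random


def best_matching_unit(x, weights):
    """Index of the neuron whose weight vector is closest to x (squared Euclidean)."""
    def dist2(w):
        return sum((a - b) ** 2 for a, b in zip(x, w))
    return min(range(len(weights)), key=lambda n: dist2(weights[n]))


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map; returns one synaptic weight vector per neuron."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # Neurons are stored row-major: neuron n sits at grid cell divmod(n, cols).
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # assumed linear decay
        radius = max(radius0 * (1.0 - frac), 0.5)    # assumed linear shrink
        for x in rng.sample(vectors, len(vectors)):  # shuffled pass over the data
            br, bc = divmod(best_matching_unit(x, weights), cols)
            for n, w in enumerate(weights):
                r, c = divmod(n, cols)
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Two maps are trained separately, mirroring MP (pages) and MT (posts).
page_vectors = [[1, 1, 1, 0], [0, 1, 0, 1]]   # toy data, hypothetical
post_vectors = [[1, 1, 0], [0, 0, 1]]         # toy data, hypothetical
weights_mp = train_som(page_vectors, rows=2, cols=2)
weights_mt = train_som(post_vectors, rows=2, cols=2)
```

Training the two maps independently keeps page content and tag vocabulary in separate feature spaces; the link between them is recovered later in the association discovery step.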

  11. Labeling • We labeled each Web page on MP by finding its most similar neuron. A page cluster map (PCM) was obtained after all pages were labeled. • The same approach was applied to all posts on MT to obtain the tag cluster map (TCM). • [Figure: an example PCM neuron labeled with P1, P5, P65 and an example TCM neuron labeled with T1, T8.]
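
The labeling step can be sketched as assigning every vector to its nearest neuron and grouping the results. The slide says "most similar neuron" without naming the measure, so Euclidean distance is an assumption here, as are the names `nearest_neuron` and `label_map` and the hand-set weights.

```python
from collections import defaultdict


def nearest_neuron(x, weights):
    """Index of the neuron with the smallest squared Euclidean distance to x."""
    def dist2(w):
        return sum((a - b) ** 2 for a, b in zip(x, w))
    return min(range(len(weights)), key=lambda n: dist2(weights[n]))


def label_map(vectors, weights):
    """Group item indices by the neuron they are labeled to: {neuron: [item, ...]}."""
    clusters = defaultdict(list)
    for idx, x in enumerate(vectors):
        clusters[nearest_neuron(x, weights)].append(idx)
    return dict(clusters)


# Hand-set weights for a two-neuron map (hypothetical, stands in for a trained MP).
weights = [[1.0, 0.0], [0.0, 1.0]]
page_vectors = [[1, 0], [0.9, 0.1], [0, 1]]
pcm = label_map(page_vectors, weights)
```

Applying `label_map` to the post vectors and the second map's weights would give the TCM in the same form. Neurons that attract no items simply do not appear in the result.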

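  12. Association Discovery • Finding associations between page clusters and post clusters. • We used a voting scheme to find the associations. • [Figure: pages Pi and Pj in page cluster Px on the PCM each cast a +1 vote for the post cluster Ty on the TCM that holds their posts Ti and Tj.]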

  13. Association Discovery • Similarity between a page cluster Px and a post cluster Ty: [equation not preserved in the transcript] • I: index set operator • Ck,l = 1 if Pk is annotated by Tl; Ck,l = 0 otherwise • Px is associated with the post cluster Ty that has the maximum similarity
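
Since the similarity formula itself did not survive in the transcript, the sketch below shows one plausible reading of the voting scheme: the score of a (page cluster, post cluster) pair is the number of (k, l) pairs with Ck,l = 1, where k ranges over the pages in Px and l over the posts in Ty. Whether the original formula also normalizes this count is unknown, so treat this as an assumption. The names `vote_similarity` and `discover_associations` are hypothetical.

```python
def vote_similarity(page_ids, post_ids, annotated):
    """Count +1 for every page k in the page cluster annotated by a post l in the post cluster."""
    return sum(1 for k in page_ids for l in post_ids if (k, l) in annotated)


def discover_associations(page_clusters, tag_clusters, annotated):
    """Associate each page cluster with the post cluster that receives the most votes."""
    associations = {}
    for x, page_ids in page_clusters.items():
        scores = {
            y: vote_similarity(page_ids, post_ids, annotated)
            for y, post_ids in tag_clusters.items()
        }
        best = max(scores, key=scores.get)
        if scores[best] > 0:  # leave clusters with no votes unassociated
            associations[x] = best
    return associations


# Toy clusters: keys are neuron indices, values are item indices (hypothetical).
page_clusters = {0: [0, 1], 1: [2]}
tag_clusters = {5: [0, 1], 7: [2]}
# annotated holds the pairs (k, l) for which C[k, l] = 1; here post i annotates page i.
annotated = {(0, 0), (1, 1), (2, 2)}
associations = discover_associations(page_clusters, tag_clusters, annotated)
```

Storing C as a set of pairs instead of a dense matrix keeps the lookup cheap when most entries are zero, which is the typical case for tagging data.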

  14. Outline • Introduction • Text Mining by SOM • Tag Recommendation Process • Experimental Results • Conclusions

  15. Architecture of Tag Recommendation • Incoming Web page → Preprocessing → incoming page vector • Labeling (using the PCM) → labeled page cluster • Tag recommendation (using the page/tag associations) → recommended tags

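  16. Tag Recommendation • Px: the incoming Web page • Let Px be labeled to the page cluster Px on the PCM. • Let Tx be the tag cluster most related to Px; all tags in Tx will be recommended. • [Figure: the incoming page is labeled to Px on the PCM, and the tags of the associated cluster Tx on the TCM are recommended.]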
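
Putting the pieces together, recommendation for an incoming page reduces to two lookups: find the page's neuron on the PCM, then return the tags of the associated tag cluster. The sketch below assumes Euclidean distance for labeling and uses hypothetical names (`recommend_tags`, `tag_cluster_tags`) and toy data.

```python
def nearest_neuron(x, weights):
    """Index of the neuron with the smallest squared Euclidean distance to x."""
    def dist2(w):
        return sum((a - b) ** 2 for a, b in zip(x, w))
    return min(range(len(weights)), key=lambda n: dist2(weights[n]))


def recommend_tags(page_vector, page_weights, associations, tag_cluster_tags):
    """Label the incoming page to a page cluster and return the tags of its associated cluster."""
    px = nearest_neuron(page_vector, page_weights)
    tx = associations.get(px)
    if tx is None:
        return []  # the page cluster has no associated tag cluster
    # Sorted only to make the output deterministic; the slide does not define an order here.
    return sorted(tag_cluster_tags[tx])


# Hypothetical trained state: PCM weights, cluster associations, and tags per tag cluster.
page_weights = [[1.0, 0.0], [0.0, 1.0]]
associations = {0: 5}
tag_cluster_tags = {5: {"som", "clustering"}, 7: {"folksonomy"}}

tags = recommend_tags([0.8, 0.2], page_weights, associations, tag_cluster_tags)
```

Note that only the page content is needed at recommendation time; no information about the user is consulted.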

  17. Outline • Introduction • Text Mining by SOM • Tag Recommendation Process • Experimental Results • Conclusions

  18. Experimental Results • Dataset • ECML/PKDD Discovery Challenge 2008 (RSDC 2008) tag recommendation dataset • Over 132K tags posted by 468 users • 16,235 bookmarked items, either Web pages or BibTeX entries • Contains some noisy data • Items with little content • Items without tags

  19. Experimental Results • Preprocessing • Discard tags that contain non-English characters • Remove numeric tags • Remove tags that are stop words such as 'for' and 'the' • Transform all tags to lowercase • Ignore extremely short tags • Ignore extremely long tags • Stem the remaining tags
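
The tag cleaning rules above can be sketched as a single filter. Several details are not given in the slides and are assumptions here: the ASCII check as a stand-in for "non-English characters", the stop-word list, the length limits `MIN_LEN` and `MAX_LEN`, and `naive_stem`, which is a crude suffix stripper used in place of whatever stemmer the author applied. Lowercasing is done first so that the stop-word comparison is case-insensitive.

```python
STOP_WORDS = {"for", "the", "a", "an", "of", "and", "to", "in"}  # illustrative list
MIN_LEN, MAX_LEN = 2, 20  # hypothetical thresholds for "extremely short/long"


def naive_stem(tag):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if tag.endswith(suffix) and len(tag) - len(suffix) >= 3:
            return tag[: -len(suffix)]
    return tag


def clean_tags(tags):
    """Apply the filtering rules in turn and stem whatever survives."""
    cleaned = []
    for tag in tags:
        tag = tag.lower()
        if not tag.isascii():       # assumed test for non-English characters
            continue
        if tag.isdigit():           # purely numeric tag
            continue
        if tag in STOP_WORDS:
            continue
        if not (MIN_LEN <= len(tag) <= MAX_LEN):
            continue
        cleaned.append(naive_stem(tag))
    return cleaned


result = clean_tags(["Clustering", "2008", "the", "x", "数据", "Tags", "SOM"])
```

In this toy run, "2008", "the", "x", and "数据" are dropped by the numeric, stop-word, length, and character rules respectively, and the remaining tags are lowercased and stemmed.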

  20. Experimental Results • Parameters for SOM training

  21. Experimental Results • Summary of PCM and TCM

  22. Experimental Results • For each page, we recommended a ranked set of 10 tags. • These recommended tags were then compared with the original tags. • We used the F1-measure to compare our results with those reported in RSDC 2008.
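
As a reminder of how the F1-measure is computed for one page, the sketch below compares a recommended tag list against the original tags. Precision is the fraction of recommended tags that are correct, recall is the fraction of original tags that were recommended, and F1 is their harmonic mean, 2PR / (P + R). The function name and the toy tags are assumptions, and how scores were averaged across pages is not stated in the slides.

```python
def precision_recall_f1(recommended, original):
    """Return (precision, recall, F1) for one page, treating both lists as sets."""
    rec, orig = set(recommended), set(original)
    hits = len(rec & orig)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(orig) if orig else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Toy example (hypothetical tags): 2 of 4 recommendations match 2 of 3 original tags.
p, r, f1 = precision_recall_f1(
    ["som", "clustering", "tagging", "web"],
    ["som", "tagging", "folksonomy"],
)
```

Here precision is 2/4, recall is 2/3, and F1 is 4/7.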

  23. Experimental Results • Evaluation result

  24. Conclusions • A novel scheme for tag recommendation based on text mining. • Relatedness between Web pages and tags was discovered from the clustering results of self-organizing maps. • Uses only the content of Web pages, not user behavior.

  25. Thank you!
