Explore the importance of tagging in organizing unstructured web content and discover how tag distribution affects categorization. Analyze tag relations discovery methods, including cooccurrence, cosine similarity, FolkRank, and tag frequency.
Tomas Michalek BP/IIT.SRC Tag relations discovery
Tag relations discovery • Why do we tag? • Tag distribution • Tag categories • Tag relations discovery systems • Which will work?
Why do we tag content? • Unstructured text • Most of the web content • Delicious.com • Structured text • Better automatic text processing • Better results for text mining methods • Pictures, audio, video • Automatic processing is very hard, if not impossible • All information comes from users • flickr.com
Tag distribution • Graph of tag distribution for 3 systems – flickr, delicious, citeULike • Does the purpose of tag collecting affect its distribution?
Delicious.com • 1 - 43,83% • 1-2 - 66,08% • 1-3 - 79,03% • 1-4 - 86,3% • 1-5 - 90,53%
citeULike • 1 - 43,53% • 1-2 - 58,99% • 1-3 - 71,28% • 1-4 - 79,3% • 1-5 - 84,27% • 1-6 - 87,5% • 1-7 - 90,34%
Relations discovery • Methods • Cooccurrence • Cosine similarity • I found two versions • FolkRank • Tag Frequency • Based on • Tag cooccurrence on resources • User history
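A minimal sketch of the first two methods, counting tag cooccurrence on resources and deriving a cosine similarity from those counts. The slides mention two versions of cosine similarity without spelling them out; the one below is only one common formulation, where each tag is a binary vector over resources. The `posts` data and all function names are hypothetical and serve only as illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy data: resource id -> set of tags assigned to it.
posts = {
    "r1": {"apple", "mac", "osx"},
    "r2": {"apple", "fruit"},
    "r3": {"apple", "mac"},
    "r4": {"fruit", "recipe"},
}

def tag_frequency(posts):
    """Number of resources each tag appears on."""
    freq = defaultdict(int)
    for tags in posts.values():
        for tag in tags:
            freq[tag] += 1
    return dict(freq)

def cooccurrence(posts):
    """Number of resources on which each (alphabetically ordered) tag pair appears together."""
    counts = defaultdict(int)
    for tags in posts.values():
        for a, b in combinations(sorted(tags), 2):
            counts[(a, b)] += 1
    return dict(counts)

def cosine(a, b, posts):
    """Cosine similarity of two tags seen as binary vectors over resources
    (an assumed formulation, not necessarily either version from the slides)."""
    freq = tag_frequency(posts)
    if a not in freq or b not in freq:
        return 0.0
    shared = sum(1 for tags in posts.values() if a in tags and b in tags)
    return shared / sqrt(freq[a] * freq[b])

# Rank the tags most related to "apple" by cosine similarity.
related = sorted(
    (t for t in tag_frequency(posts) if t != "apple"),
    key=lambda t: cosine("apple", t, posts),
    reverse=True,
)
```

Raw cooccurrence counts favor frequent tags; dividing by the square root of both tag frequencies, as the cosine variant does, is one way to keep very popular tags from dominating the list of related tags.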
http://217.67.16.40:800 • Apple