Flickr Tag Recommendation based on Collective Knowledge • Börkur Sigurbjörnsson, Roelof van Zwol, Yahoo! Research, WWW 2008 • 2009. 03. 13. Summarized and presented by Hwang Inbeom, IDS Lab., Seoul National University
Overview • Recommending tags for an image • More tags, more semantic meaning • Addresses two questions • How effective can tag recommendation be? • Analyzing tagging behaviors • How can we recommend tags? • Presenting several recommendation strategies
Tagging • Tagging • The act of adding keywords to objects • Popular means to annotate various web resources • Web page bookmarks • Academic publications • Multimedia objects • …
Advantages of Tagging Images [Figure: example photo tagged Sagrada Familia, Barcelona; extended tag set: Sagrada Familia, Barcelona, Gaudi, Spain, Catalunya, architecture, church] • Content-based image retrieval is progressing, but it has not yet succeeded in bridging the semantic gap • Tagging is essential for large-scale image retrieval systems to work in practice • Extension of tags • Richer semantic description • The photo can be retrieved for a larger range of keyword queries
Analysis of Tagging Behaviors • How do users tag photos? • Distribution of tag frequency • Distribution of the number of tags per photo • What kind of tags do they provide? • Tag categorization with WordNet
Tag Frequency [Figure: tag frequency distribution with head and tail regions marked] • The distribution of tag frequency can be modeled by a power law • Tags in the head of the power law • Too generic • 2006, 2005, wedding • Tags in the tail of the power law • Incidentally occurring words • ambrosetompkins, ambient vector
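A rough way to check the power-law claim on one's own data is to sort tag frequencies by rank and fit a straight line in log-log space. The sketch below is only a quick diagnostic under that assumption, not the analysis used in the paper; the function names `rank_frequency` and `fit_power_law_exponent` are hypothetical.

```python
import math
from collections import Counter

def rank_frequency(tag_lists):
    """Return tag frequencies (photos per tag), most frequent first."""
    # set(tags) so a tag repeated on one photo is counted once.
    counts = Counter(tag for tags in tag_lists for tag in set(tags))
    return sorted(counts.values(), reverse=True)

def fit_power_law_exponent(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -a suggests frequency ~ rank ** -a. This is a crude
    estimate; it needs at least two frequencies, all greater than zero.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic frequencies that follow 1000 / rank exactly.
synthetic = [1000.0 / rank for rank in range(1, 51)]
slope = fit_power_law_exponent(synthetic)
```

On the synthetic data the fitted slope comes out very close to -1, as expected; real tag data will be noisier, especially in the tail.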
Number of Tags per Photo [Figure: distribution of the number of tags per photo with head and tail regions marked] • This distribution can be modeled by a power law too • Photos in the head of the power law • Exhaustively annotated • Photos in the tail of the power law • A tag recommendation system could be useful • Covers 64% of the photos
Number of Tags per Photo (contd.) • Photos are classified by the number of tags annotated • The classes are used to analyze recommendation performance at different annotation levels
Tag Categorization • 52% of tags could be mapped to WordNet categories • Users provide broader context through tags, not only the visual content of the photo • Where / when the photo was taken • Actions people in the photo are doing • …
Tag Recommendation System [Figure: system overview for an example photo tagged Sagrada Familia, Barcelona; other tags appearing in the figure: Gaudi, Spain, Catalunya, architecture, church, 2006, Europe, travel]
Tag Recommendation Strategies • Finding candidate tags based on tag co-occurrence • Symmetric measures • Asymmetric measures • Aggregation and ranking of candidate tags • Voting strategy • Summing strategy • Promotion
Tag Co-occurrence • Finding tags that co-occur with a specific tag • Co-occurring tags with higher scores become candidate tags • Co-occurrence can be measured in two ways • Symmetric measures • Asymmetric measures
Tag Co-occurrence (contd.) • Symmetric measures • Jaccard’s coefficient • Statistic used for computing the similarity and diversity of sample sets • Useful for identifying equivalent tags • Example: Eiffel Tower • Tour Eiffel, Eiffel, Seine, La tour Eiffel, Paris
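As a minimal sketch of the symmetric measure, Jaccard's coefficient for two tags can be computed as the number of photos carrying both tags divided by the number of photos carrying either. The toy photo collection and the helper names `photos_by_tag` and `jaccard` are illustrative assumptions, not data or code from the paper.

```python
from collections import defaultdict

def photos_by_tag(photo_tags):
    """Map each tag to the set of photo ids annotated with it."""
    index = defaultdict(set)
    for photo_id, tags in photo_tags.items():
        for tag in tags:
            index[tag].add(photo_id)
    return index

def jaccard(index, tag_i, tag_j):
    """|photos(i) & photos(j)| / |photos(i) | photos(j)|, or 0.0 if both are unused."""
    photos_i = index.get(tag_i, set())
    photos_j = index.get(tag_j, set())
    union = photos_i | photos_j
    return len(photos_i & photos_j) / len(union) if union else 0.0

# Hypothetical toy collection: photo id -> tags.
photos = {
    1: {"eiffel tower", "tour eiffel", "paris"},
    2: {"eiffel tower", "tour eiffel"},
    3: {"paris", "france"},
    4: {"paris", "louvre"},
}
index = photos_by_tag(photos)
same_thing = jaccard(index, "eiffel tower", "tour eiffel")  # always used together
broader = jaccard(index, "eiffel tower", "paris")           # "paris" is used more widely
```

In this toy data the near-synonym pair scores higher than the pair involving the broader tag, which is the behavior the slide describes: the symmetric measure favors equivalent tags.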
Tag Co-occurrence (contd.) • Asymmetric measures • Tag co-occurrence can be normalized using the frequency of one of the tags • Can provide more diverse candidates than symmetric measures • Example: Eiffel Tower • Paris, France, Tour Eiffel, Eiffel, Europe • Asymmetric tag co-occurrence provides more suitable diversity
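A minimal sketch of the asymmetric measure, assuming normalization by the frequency of the first tag: the score is the share of photos tagged with `tag_i` that are also tagged with `tag_j`. The toy data and the names `photos_by_tag` and `cooccurrence_given` are hypothetical.

```python
from collections import defaultdict

def photos_by_tag(photo_tags):
    """Map each tag to the set of photo ids annotated with it."""
    index = defaultdict(set)
    for photo_id, tags in photo_tags.items():
        for tag in tags:
            index[tag].add(photo_id)
    return index

def cooccurrence_given(index, tag_i, tag_j):
    """|photos(i) & photos(j)| / |photos(i)|: how often tag_j appears when tag_i does."""
    photos_i = index.get(tag_i, set())
    if not photos_i:
        return 0.0
    return len(photos_i & index.get(tag_j, set())) / len(photos_i)

# Hypothetical toy collection: photo id -> tags.
photos = {
    1: {"eiffel tower", "tour eiffel", "paris"},
    2: {"eiffel tower", "tour eiffel"},
    3: {"paris", "france"},
    4: {"paris", "louvre"},
}
index = photos_by_tag(photos)
paris_given_eiffel = cooccurrence_given(index, "eiffel tower", "paris")
eiffel_given_paris = cooccurrence_given(index, "paris", "eiffel tower")
```

The two directions give different values, which is what makes the measure asymmetric: a broad tag such as "paris" can score well as a candidate for "eiffel tower" without the reverse being true.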
Tag Aggregation [Figure: aggregation of candidate tags for an example photo tagged Sagrada Familia, Barcelona; other tags appearing in the figure: Gaudi, Spain, Catalunya, architecture, church, 2006, Europe, travel] • Definitions • U is the set of user-defined tags • Cu is the list of the top-m most co-occurring tags of a tag u in U • C is the union of the candidate tags Cu over all user-defined tags u • R is the ranked list of recommended tags
Tag Aggregation (contd.) [Figure: candidate tag lists for the example photo; tags shown: Sagrada Familia, Barcelona, Gaudi, Spain, architecture, Catalunya, church, 2006, Europe, travel] • Vote • For each candidate tag c in C, a vote is cast whenever c is in Cu for a user-defined tag u • R is obtained by sorting the candidate tags by number of votes
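The voting strategy can be sketched as follows. The per-tag candidate lists are illustrative values loosely based on the tags in the slide's figure, and two details are assumptions of this sketch rather than statements from the slides: tags the user already supplied are dropped from the result, and ties are broken alphabetically.

```python
from collections import Counter

def vote_ranking(user_tags, candidates_by_user_tag):
    """Rank candidate tags by how many user-defined tags list them."""
    votes = Counter()
    for user_tag in user_tags:
        # set() so that one candidate list casts at most one vote per tag.
        votes.update(set(candidates_by_user_tag.get(user_tag, [])))
    # Assumption: do not recommend tags the user already supplied.
    for user_tag in user_tags:
        votes.pop(user_tag, None)
    # Assumption: ties are broken alphabetically to keep the order deterministic.
    return sorted(votes, key=lambda tag: (-votes[tag], tag))

# Illustrative candidate lists (hypothetical values).
user_tags = ["sagrada familia", "barcelona"]
candidates = {
    "sagrada familia": ["barcelona", "gaudi", "spain", "architecture", "catalunya", "church"],
    "barcelona": ["spain", "gaudi", "2006", "catalunya", "europe", "travel"],
}
recommended = vote_ranking(user_tags, candidates)
```

Candidates that appear in both lists (here Gaudi, Spain and Catalunya) collect two votes and rise to the top; the co-occurrence values themselves play no role in this strategy.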
Tag Aggregation (contd.) • Sum • The score of a candidate tag c is the sum of its co-occurrence values over every Cu that contains c
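A minimal sketch of the summing strategy, assuming each user-defined tag comes with a mapping from candidate tag to co-occurrence value. The numbers are made up for illustration, and, as an assumption of this sketch, tags the user already supplied are excluded and ties are broken alphabetically.

```python
from collections import defaultdict

def sum_ranking(user_tags, scores_by_user_tag):
    """Rank candidates by the sum of their co-occurrence values over all user tags."""
    totals = defaultdict(float)
    for user_tag in user_tags:
        for candidate, value in scores_by_user_tag.get(user_tag, {}).items():
            totals[candidate] += value
    # Assumption: do not recommend tags the user already supplied.
    for user_tag in user_tags:
        totals.pop(user_tag, None)
    # Highest total first; alphabetical order breaks ties.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical co-occurrence values for each user-defined tag.
user_tags = ["sagrada familia", "barcelona"]
scores = {
    "sagrada familia": {"barcelona": 0.75, "gaudi": 0.5, "church": 0.25},
    "barcelona": {"gaudi": 0.25, "spain": 0.5},
}
ranked = sum_ranking(user_tags, scores)
```

Unlike voting, a candidate listed by a single user-defined tag can still rank highly here if its co-occurrence value is large.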
Promotion [Figure: frequency distribution with head and tail regions marked] • Stability-promotion • Treats user-defined tags with very low frequency as less reliable • Descriptiveness-promotion • Prevents overly general tags from being ranked too highly
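Promotion (contd.) • Rank-promotion • Co-occurrence values used in the summing strategy decline too fast • Promotes candidates by their rank position in Cu rather than by the raw co-occurrence value • Applying promotion • Promotion is applied on top of both the voting and the summing strategy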
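The slides do not reproduce the promotion formulas, so the sketch below is only one plausible shape for them: each promotion is a damping factor in (0, 1], and the three factors are multiplied. The functional forms, the use of the natural logarithm, and the constants `ks`, `kd` and `kr` are all assumptions that should be checked against the original paper before being relied on.

```python
import math

def stability_promotion(user_tag_freq, ks=9.0):
    """Damp user-defined tags whose log-frequency is far from ks (assumed form).

    user_tag_freq is the collection frequency of the tag and must be >= 1.
    """
    return ks / (ks + abs(ks - math.log(user_tag_freq)))

def descriptiveness_promotion(candidate_freq, kd=11.0):
    """Damp candidate tags whose log-frequency is far from kd (assumed form)."""
    return kd / (kd + abs(kd - math.log(candidate_freq)))

def rank_promotion(rank, kr=4.0):
    """Decay slowly with the 1-based position of the candidate in Cu (assumed form)."""
    return kr / (kr + (rank - 1))

def promotion(user_tag_freq, candidate_freq, rank, ks=9.0, kd=11.0, kr=4.0):
    """Combined weight for one (user-defined tag, candidate tag) pair."""
    return (rank_promotion(rank, kr)
            * stability_promotion(user_tag_freq, ks)
            * descriptiveness_promotion(candidate_freq, kd))

# A candidate at rank 1 keeps its full rank weight; one at rank 5 keeps half.
first = rank_promotion(1)
fifth = rank_promotion(5)
```

Under these assumptions, a promoted variant of voting or summing would multiply each vote or co-occurrence value by `promotion(...)` for the corresponding pair before aggregating.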
Experimental Setup • Four different strategies (voting and summing, each with and without promotion) • Assessments • The top 10 recommendations from each of the four strategies form a pool • Assessors were asked to assess the descriptiveness of each tag • Assessed as very good, good, not good, don’t know • Assessors could access and view the photo directly on Flickr to find additional context
Experimental Setup (contd.) • Evaluation metrics • Mean Reciprocal Rank (MRR) • Measures how close to the top of the ranking the first “relevant” tag appears • A tag is relevant if its relevance score is above the average relevance score • Success at rank k (S@k) • Probability of finding a good descriptive tag among the top k recommended tags • Precision at rank k (P@k) • Proportion of the top k recommended tags that are relevant, averaged over all photos
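The three metrics can be sketched as below, assuming each photo's recommendations have already been reduced to a ranked list of booleans (relevant or not). How relevance is decided from the assessor labels is left to the caller, and the judgment lists are made-up examples.

```python
def reciprocal_rank(relevance):
    """1 / position of the first relevant tag, or 0.0 if none is relevant."""
    for position, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / position
    return 0.0

def success_at_k(relevance, k):
    """1.0 if at least one of the top k tags is relevant, else 0.0."""
    return 1.0 if any(relevance[:k]) else 0.0

def precision_at_k(relevance, k):
    """Fraction of the top k positions holding a relevant tag."""
    return sum(relevance[:k]) / k

def mean_over_photos(metric, judgments, *args):
    """Average a per-photo metric over all photos."""
    return sum(metric(relevance, *args) for relevance in judgments) / len(judgments)

# Hypothetical judgments for three photos, five recommendations each.
judgments = [
    [False, True, True, False, False],
    [True, False, False, False, True],
    [False, False, False, False, False],
]
mrr = mean_over_photos(reciprocal_rank, judgments)
s_at_1 = mean_over_photos(success_at_k, judgments, 1)
s_at_5 = mean_over_photos(success_at_k, judgments, 5)
p_at_5 = mean_over_photos(precision_at_k, judgments, 5)
```

Dividing by k in `precision_at_k` assumes every photo receives at least k recommendations; with shorter lists the missing positions count as not relevant.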
Experiment Results • Promotion worked well • Without promotion, summing is better • With promotion, voting is better
Experiment Results (contd.) • Promotion was more effective for photos with more user-defined tags
Experiment Results (contd.) • Semantic analysis • Tags related to the visual content of the photo are more likely to be accepted • Higher acceptance ratio for more physical categories
Conclusions • Tagging behavior in Flickr • Tag frequency follows a power law • The majority of photos are not annotated well enough • Users annotate their photos using tags that cover a broad spectrum of the semantic space • Extending Flickr annotations • The co-occurrence model with aggregation and promotion was effective • Can be incrementally updated • Future work • This model could be implemented as a recommendation system
Discussion • Pros • The analysis can be useful for other work • Easy to understand and implement • Reasonable evaluation strategy • Cons • There should be a comparison with other recommendation models • Results are not so impressive • Not much technical contribution