Flickr Tag Recommendation based on Collective Knowledge

Flickr Tag Recommendation based on Collective Knowledge • Börkur Sigurbjörnsson, Roelof van Zwol (Yahoo! Research) • WWW 2008 • 2009. 03. 13. • Summarized and presented by Hwang Inbeom, IDS Lab., Seoul National University

Overview • Recommending tags for an image • More tags, more semantic meanings • Addresses two questions • How effective would tag recommendation be? • Analyzing tagging behaviors • How can we recommend tags? • Presenting some recommendation strategies

Tagging • Tagging • The act of adding keywords to objects • Popular means to annotate various web resources • Web page bookmarks • Academic publications • Multimedia objects • …

Advantages of Tagging Images • [Figure: example photo with the tags Sagrada Familia, Barcelona, Gaudi, Spain, Catalunya, architecture, church] • Content-based image retrieval is progressing, but it has not yet succeeded in bridging the semantic gap • Tagging is essential for large-scale image retrieval systems to work in practice • Extension of tags • Richer semantic description • Can be used to retrieve the photo for a larger range of keyword queries

Analysis of Tagging Behaviors • How do users tag photos? • Distribution of tag frequency • Distribution of the number of tags per photo • What kind of tags do they provide? • Tag categorization with WordNet

Tag Frequency • [Figure: tag frequency distribution, with head and tail marked] • The distribution of tag frequency can be modeled by a power law • Tags in the head of the power law • Too generic • 2006, 2005, wedding • Tags in the tail of the power law • Incidentally occurring words • ambrosetompkins, ambient vector

Number of Tags per Photo • [Figure: distribution of the number of tags per photo, with head and tail marked] • The distribution can also be modeled by a power law • Photos in the head of the power law • Exhaustively annotated • Photos in the tail of the power law • A tag recommendation system could be useful • Covers 64% of the photos

Number of Tags per Photo (contd.) • Photos are classified by the number of tags annotated • The classes are used to analyze recommendation performance at different annotation levels

Tag Categorization • 52% of tags could be categorized by WordNet categories • Users provide a broader context through tags, not only the visual content of the photo • Where / when the photo was taken • Actions that people in the photo are performing • …

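Tag Recommendation System • [Figure: system overview. User-defined tags: Sagrada Familia, Barcelona. Co-occurring tags for Sagrada Familia: Barcelona, Gaudi, Spain, architecture, Catalunya, church. Co-occurring tags for Barcelona: Spain, Gaudi, 2006, Catalunya, Europe, travel. Recommended tags: Gaudi, Spain, Catalunya, architecture, church.]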

Tag Recommendation Strategies • Finding candidate tags based on tag co-occurrence • Symmetric measures • Asymmetric measures • Aggregation and ranking of candidate tags • Voting strategy • Summing strategy • Promotion

Tag Co-occurrence • Finding tags that co-occur with a specific tag • Co-occurring tags with higher scores become candidate tags • Can be measured in two ways • Symmetric measures • Asymmetric measures

Tag Co-occurrence (contd.) • Symmetric measures • Jaccard’s coefficient • Statistic used for computing the similarity and diversity of sample sets • Useful to identify equivalent tags • Example – Eiffel tower • Tour Eiffel, Eiffel, Seine, La tour Eiffel, Paris
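The symmetric measure can be sketched in a few lines of Python. This is a minimal illustration, assuming each tag is represented by the set of IDs of the photos it annotates; the function name `jaccard` and the toy photo IDs are hypothetical and not taken from the paper.

```python
def jaccard(photos_a, photos_b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two sets of photo IDs."""
    union = photos_a | photos_b
    if not union:
        return 0.0  # two unused tags share nothing
    return len(photos_a & photos_b) / len(union)


# Toy data (made up for illustration): photo IDs annotated with each tag.
photos_by_tag = {
    "eiffel tower": {1, 2, 3, 4},
    "tour eiffel": {2, 3, 4},
    "paris": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
}

near_synonym = jaccard(photos_by_tag["eiffel tower"], photos_by_tag["tour eiffel"])
broad_tag = jaccard(photos_by_tag["eiffel tower"], photos_by_tag["paris"])
```

In this toy data the near-synonym scores higher than the broad tag, because the large union for "paris" pulls its coefficient down. That is consistent with the slide's point that the symmetric measure tends to surface equivalent tags.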

Tag Co-occurrence (contd.) • Asymmetric measures • Tag co-occurrence can be normalized using the frequency of one of the tags • Can provide more diverse candidates than symmetric method • Example – Eiffel Tower • Paris, France, Tour Eiffel, Eiffel, Europe • Asymmetric tag co-occurrence will provide a more suitable diversity
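A sketch of the asymmetric measure, assuming the normalization divides the co-occurrence count by the frequency of the first tag (so the score reads as "how often is B used when A is used"). The function name and toy photo IDs are hypothetical.

```python
def asymmetric_cooccurrence(photos_a, photos_b):
    """Share of photos tagged with A that are also tagged with B: |A ∩ B| / |A|."""
    if not photos_a:
        return 0.0
    return len(photos_a & photos_b) / len(photos_a)


# Toy data (made up): every "eiffel tower" photo is also tagged "paris",
# but most "paris" photos are not tagged "eiffel tower".
eiffel = {1, 2, 3, 4}
paris = set(range(1, 11))

eiffel_to_paris = asymmetric_cooccurrence(eiffel, paris)
paris_to_eiffel = asymmetric_cooccurrence(paris, eiffel)
```

Unlike the symmetric coefficient, the score depends on direction: a broad tag such as "paris" can rank highly as a candidate for a specific tag without the reverse being true.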

Tag Aggregation • [Figure: the Sagrada Familia / Barcelona example] • Definitions • U is the set of user-defined tags • Cu is the list of the top-m most co-occurring tags of a tag u in U • C is the union of the candidate tags over all user-defined tags u • R is the list of recommended tags

Tag Aggregation (contd.) • [Figure: candidate lists for Sagrada Familia and Barcelona] • Vote • For each candidate tag c in C, a vote is cast whenever c is in Cu • R is obtained by sorting the candidate tags by number of votes
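The voting strategy can be sketched as follows. The candidate lists are read from the slide's Sagrada Familia / Barcelona example (their grouping per user-defined tag is an assumption); the function name `vote`, the `top_n` parameter, and the choice to skip candidates the user has already entered are also assumptions for illustration.

```python
def vote(candidates_by_user_tag, top_n=5):
    """Rank candidate tags by the number of lists Cu they appear in."""
    user_tags = set(candidates_by_user_tag)
    votes = {}
    for candidates in candidates_by_user_tag.values():
        # dict.fromkeys removes duplicates while keeping list order.
        for c in dict.fromkeys(candidates):
            if c in user_tags:
                continue  # assumption: do not recommend tags already given
            votes[c] = votes.get(c, 0) + 1
    # sorted() is stable, so ties keep first-seen order.
    return sorted(votes, key=votes.get, reverse=True)[:top_n]


candidates_by_user_tag = {
    "Sagrada Familia": ["Barcelona", "Gaudi", "Spain", "architecture", "Catalunya", "church"],
    "Barcelona": ["Spain", "Gaudi", "2006", "Catalunya", "Europe", "travel"],
}
recommended = vote(candidates_by_user_tag)
```

Tags that appear in both lists (Gaudi, Spain, Catalunya) collect two votes and move ahead of tags that appear in only one.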

Tag Aggregation (contd.) • Sum • Sums the co-occurrence values of each candidate tag c over the lists Cu in which it appears
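A minimal sketch of the summing strategy, assuming each user-defined tag maps to its candidates together with a co-occurrence value. The function name, the `top_n` parameter, the skipping of already-entered tags, and all numeric values are made up for illustration.

```python
from collections import defaultdict


def sum_scores(cooccurrence_by_user_tag, top_n=5):
    """Rank candidate tags by their summed co-occurrence values."""
    user_tags = set(cooccurrence_by_user_tag)
    scores = defaultdict(float)
    for candidates in cooccurrence_by_user_tag.values():
        for c, value in candidates.items():
            if c not in user_tags:  # assumption: skip tags already given
                scores[c] += value
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Toy co-occurrence values (not real Flickr statistics).
cooccurrence_by_user_tag = {
    "Sagrada Familia": {"Barcelona": 0.5, "Gaudi": 0.5, "Spain": 0.25},
    "Barcelona": {"Spain": 0.125, "Gaudi": 0.125, "travel": 0.25},
}
ranked = sum_scores(cooccurrence_by_user_tag)
```

Compared with voting, a single strong co-occurrence can outweigh several weak ones, since the score depends on the values and not only on how many lists contain the candidate.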

Promotion • [Figure: head and tail of the tag frequency distribution] • Stability-promotion • Treats user-defined tags with low frequency as less reliable • Descriptiveness-promotion • Prevents general tags from being ranked too highly
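One way to express both promotions is a weight that depends on the logarithm of a tag's frequency. The sketch below assumes a damping function of the form k / (k + |k - log(frequency)|); this form, the function names, and the default constants are assumptions for illustration, since the slides do not state them.

```python
import math


def frequency_weight(frequency, k):
    """Weight in (0, 1] that is largest when log(frequency) is close to k.

    `frequency` must be a positive count.
    """
    return k / (k + abs(k - math.log(frequency)))


def stability(user_tag_frequency, k_s=9.0):
    """Down-weights user-defined tags whose collection frequency is low."""
    return frequency_weight(user_tag_frequency, k_s)


def descriptiveness(candidate_frequency, k_d=11.0):
    """Down-weights very frequent (too general) candidate tags."""
    return frequency_weight(candidate_frequency, k_d)


rare_user_tag = stability(10)
common_user_tag = stability(10_000)
general_candidate = descriptiveness(10**9)
specific_candidate = descriptiveness(50_000)
```

With these assumed constants, a user-defined tag seen only 10 times contributes less than one seen 10,000 times, and an extremely frequent candidate is damped relative to a moderately frequent one.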

Promotion (contd.) • Rank-promotion • Co-occurrence values used in the summing strategy decline too fast • Uses the rank position of a candidate in Cu instead, so that its contribution declines more gradually • Applying promotion
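A sketch of rank-promotion, assuming the weight depends only on the 1-based position r of a candidate in Cu and takes the form k_r / (k_r + (r - 1)). The functional form, the function name, and the default constant are assumptions for illustration; the slides do not give them.

```python
def rank_promotion(rank, k_r=4.0):
    """Weight for a candidate at 1-based position `rank` in Cu."""
    return k_r / (k_r + (rank - 1))


# Weights for the first five positions of a candidate list.
weights = [rank_promotion(r) for r in range(1, 6)]
```

The top candidate gets weight 1 and the weight halves only at position k_r + 1, which is a much gentler decline than raw co-occurrence values that drop sharply after the first few candidates.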

Experimental Setup • For different strategies • Assessments • The top 10 recommendations from each of the four strategies form a pool • Assessors were asked to assess the descriptiveness of each tag • Assessed as very good, good, not good, or don’t know • Assessors could access and view the photo directly on Flickr to find additional context

Experimental Setup (contd.) • Evaluation metrics • Mean Reciprocal Rank (MRR) • Measures how close to the top of the ranking the system returns the first “relevant” tag • A tag is relevant if its relevance score is greater than the average relevance score • Success at rank k (S@k) • Probability of finding a good descriptive tag among the top k recommended tags • Precision at rank k (P@k) • Proportion of retrieved tags that are relevant, averaged over all photos
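The three metrics can be sketched as below, assuming each photo contributes a ranked list of recommended tags and a set of tags judged relevant. The function names and the two toy photos are hypothetical; the definitions follow the standard forms of these metrics.

```python
def reciprocal_rank(ranked_tags, relevant):
    """1 / position of the first relevant tag, or 0 if none is returned."""
    for position, tag in enumerate(ranked_tags, start=1):
        if tag in relevant:
            return 1.0 / position
    return 0.0


def success_at_k(ranked_tags, relevant, k):
    """1 if at least one relevant tag is among the top k, else 0."""
    return 1.0 if any(tag in relevant for tag in ranked_tags[:k]) else 0.0


def precision_at_k(ranked_tags, relevant, k):
    """Fraction of the top k recommended tags that are relevant."""
    return sum(tag in relevant for tag in ranked_tags[:k]) / k


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Toy assessments (made up): (ranked recommendations, relevant tags) per photo.
photos = [
    (["Gaudi", "2006", "Spain", "travel", "church"], {"Gaudi", "Spain", "church"}),
    (["2006", "Paris", "France", "travel", "Europe"], {"Paris", "France"}),
]

mrr = mean(reciprocal_rank(r, rel) for r, rel in photos)
s_at_1 = mean(success_at_k(r, rel, 1) for r, rel in photos)
p_at_5 = mean(precision_at_k(r, rel, 5) for r, rel in photos)
```

Each per-photo value is averaged over all photos, which is why S@k can be read as a probability over the test collection.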

Experiment Results • Promotion worked well • Without promotion, summing is better • With promotion, voting is better

Experiment Results (contd.) • Promotion worked better with more user-defined tags

Experiment Results (contd.) • Semantic analysis • Tags related to the visual content of the photo are more likely to be accepted • Higher acceptance ratio for more physical categories

Conclusions • Tagging behavior in Flickr • Tag frequency follows a power law • The majority of photos are not annotated well enough • Users annotate their photos using tags that span a broad spectrum of the semantic space • Extending Flickr annotations • Co-occurrence model with aggregation and promotion was effective • Can be incrementally updated • Future work • This model could be implemented as a recommendation system

Discussion • Pros • Analysis can be useful for other work • Easy to understand and implement • Reasonable evaluation strategy • Cons • Should include a comparison with other recommendation models • Results are not so impressive • Not much technical contribution
