Today's topic: Social Tagging, by Christoffer Hirsimaa

Stop thinking, start tagging: Tag Semantics arise from Collaborative Verbosity
Christian Körner, Dominik Benz, Andreas Hotho, Markus Strohmaier, Gerd Stumme
From WWW 2010

Where do Semantics come from?
• Semantically annotated content is the "fuel" of the next-generation World Wide Web, but where is the petrol station?
• Expert-built → expensive
• Evidence for emergent semantics in Web 2.0 data → built by the crowd!
• Which factors influence the emergence of semantics?
• Do certain users contribute more than others?

Overview
• Pragmatics of tagging
• Emergent Tag Semantics
• Semantic Implications of Tagging Pragmatics
• Conclusions

Emergent Tag Semantics
• Tagging is a simple and intuitive way to organize all kinds of resources
• Formal model: folksonomy $F = (U, T, R, Y)$
  • Users $U$, tags $T$, resources $R$
  • Tag assignments $Y \subseteq U \times T \times R$
• Evidence of emergent semantics
  • Tag similarity measures can identify e.g. synonym tags (web2.0, web_two)
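The formal model above can be sketched in a few lines of Python. This is only an illustration: the names `TagAssignment`, `build_folksonomy`, and `posts`, as well as the toy data, are assumptions made for this sketch and do not come from the paper.

```python
from collections import defaultdict
from typing import NamedTuple


class TagAssignment(NamedTuple):
    """One element of Y: a (user, tag, resource) triple."""
    user: str
    tag: str
    resource: str


def build_folksonomy(assignments):
    """Return (U, T, R, Y) derived from an iterable of (user, tag, resource) triples."""
    Y = {TagAssignment(*a) for a in assignments}
    U = {y.user for y in Y}
    T = {y.tag for y in Y}
    R = {y.resource for y in Y}
    return U, T, R, Y


def posts(Y):
    """Group tag assignments into posts: (user, resource) -> set of tags."""
    grouped = defaultdict(set)
    for y in Y:
        grouped[(y.user, y.resource)].add(y.tag)
    return dict(grouped)


# Toy data, invented for illustration only.
toy = [
    ("alice", "web2.0", "r1"),
    ("alice", "blog", "r1"),
    ("bob", "web_two", "r1"),
    ("bob", "java", "r2"),
]
U, T, R, Y = build_folksonomy(toy)
print(sorted(U), sorted(T), sorted(R), len(Y))
```

Storing Y as a set of triples keeps the sketch close to the definition; a real dataset would more likely live in a table or database.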

Tag Similarity Measures: Tag Context Similarity
Tag Context Similarity is a scalable and precise tag similarity measure [Cattuto2008, Markines2009]:
• Describe each tag as a context vector
• Each dimension of the vector space corresponds to another tag; each entry denotes a co-occurrence count
• Compute similar tags by cosine similarity
• Will be used as an indicator of emergent semantics!
[Figure: example context vector; labels: JAVA, design, software, blog, web, programming]
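A minimal sketch of the three steps on this slide (context vectors, co-occurrence counts, cosine similarity) might look as follows. It assumes that co-occurrence is counted per post, i.e. per set of tags one user gave to one resource; the function names and the toy posts are invented for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def context_vectors(posts):
    """Map each tag to a Counter of co-occurrence counts with other tags.

    `posts` is an iterable of tag sets, one set per (user, resource) post.
    A tag never counts as co-occurring with itself.
    """
    vectors = defaultdict(Counter)
    for tags in posts:
        for a, b in permutations(set(tags), 2):
            vectors[a][b] += 1
    return dict(vectors)


def cosine(v, w):
    """Cosine similarity of two sparse vectors stored as Counters."""
    dot = sum(count * w[tag] for tag, count in v.items())
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    norm_w = math.sqrt(sum(c * c for c in w.values()))
    if norm_v == 0 or norm_w == 0:
        return 0.0
    return dot / (norm_v * norm_w)


def most_similar(tag, vectors):
    """Return (similarity, other_tag) for the tag closest to `tag`, or None."""
    v = vectors.get(tag, Counter())
    candidates = [(cosine(v, w), other) for other, w in vectors.items() if other != tag]
    return max(candidates) if candidates else None


# Toy posts, invented for illustration only.
toy_posts = [
    {"web2.0", "blog", "design"},
    {"web_two", "blog", "design"},
    {"java", "programming", "software"},
    {"java", "programming"},
]
vectors = context_vectors(toy_posts)
print(most_similar("web2.0", vectors))
```

In the toy data, web2.0 and web_two never appear together, yet they share the same context (blog, design), which is why the measure can surface synonym tags.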

Assessing the Quality of Tag Semantics
• Mapping: folksonomy tags → WordNet hierarchy (figure legend: tag, synset)
• Example pair: TagCont(t, t_sim) = 0.74, JCN(t, t_sim) = 3.68
• Average JCN(t, t_sim) over all tags t: "quality of semantics"
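The slide does not spell out JCN. Assuming it refers to the Jiang-Conrath distance between WordNet synsets as it is usually defined, the quantities could be written as below. Here $p(c)$ is the corpus probability of concept $c$, $\mathrm{lcs}$ is the least common subsumer in the WordNet hierarchy, and $T'$ (a symbol introduced only for this sketch) is the set of tags for which the distance can be computed.

```latex
\[
\mathrm{IC}(c) = -\log p(c), \qquad
\mathrm{JCN}(c_1, c_2) = \mathrm{IC}(c_1) + \mathrm{IC}(c_2)
  - 2\,\mathrm{IC}\bigl(\mathrm{lcs}(c_1, c_2)\bigr)
\]
\[
\text{quality of semantics} = \frac{1}{|T'|} \sum_{t \in T'} \mathrm{JCN}(t, t_{sim})
\]
```

Under this reading JCN is a distance, so a lower average means that the tags judged similar by Tag Context Similarity are also close to each other in WordNet.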

Tagging motivation
• Evidence of different ways HOW users tag (Tagging Pragmatics)
• Broad distinction by tagging motivation [Strohmaier2009]:
• "Categorizers" …
  • use a small controlled tag vocabulary
  • goal: "ontology-like" categorization by tags, for later browsing
  • tags as a replacement for folders
• "Describers" …
  • tag "verbosely" with freely chosen words
  • vocabulary not necessarily consistent (synonyms, spelling variants, …)
  • goal: describe content, ease retrieval
[Figure: example tags of the two user types; labels: bev, donuts, duff, alc, nalc, bart, Duff-beer, wine, beer, marge, beer, barty]

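Tagging Pragmatics: Measures
• How to distinguish between the two types of taggers?
• Vocabulary size (vocab): number of distinct tags a user has assigned
• Tag / resource ratio (trr): vocabulary size divided by the number of resources the user has tagged
• Average # tags per post (tpp): number of a user's tag assignments divided by the number of that user's posts
[Figure: each measure shown on a scale from low to high]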
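The three measures named on this slide can be sketched as follows, assuming that a post is the set of tags one user assigned to one resource. The helper names and the two toy users ("cat", "des") are hypothetical and serve only as an illustration.

```python
def user_posts(assignments, user):
    """Return {resource: set of tags} for one user's (user, tag, resource) triples."""
    posts = {}
    for u, tag, resource in assignments:
        if u == user:
            posts.setdefault(resource, set()).add(tag)
    return posts


def vocab(posts):
    """Vocabulary size: number of distinct tags the user has assigned."""
    return len(set().union(*posts.values()))


def trr(posts):
    """Tag / resource ratio: vocabulary size per tagged resource."""
    return vocab(posts) / len(posts) if posts else 0.0


def tpp(posts):
    """Average number of tags per post."""
    return sum(len(tags) for tags in posts.values()) / len(posts) if posts else 0.0


# Toy data, invented for illustration: "cat" reuses two tags, "des" tags freely.
toy = [
    ("cat", "work", "r1"), ("cat", "work", "r2"),
    ("cat", "fun", "r3"), ("cat", "fun", "r4"),
    ("des", "beer", "r1"), ("des", "duff", "r1"), ("des", "drink", "r1"),
    ("des", "wine", "r5"), ("des", "drink", "r5"), ("des", "red", "r5"),
]
for user in ("cat", "des"):
    p = user_posts(toy, user)
    print(user, vocab(p), trr(p), tpp(p))
```

Since Categorizers are described as using a small controlled vocabulary, lower values on all three measures suggest categorizer-like behaviour and higher values describer-like behaviour.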

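Tagging Pragmatics: Measures
• Orphan ratio (orphan): share of a user's tags that are attached to only a few resources
• R(t): set of resources tagged by user u with tag t
[Figure: measure shown on a scale from low to high]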
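The formula for the orphan ratio is not preserved in this transcript. The sketch below assumes one plausible definition: a tag counts as an orphan when the user applied it to at most n resources, where n is a fixed fraction of |R(t)| for the user's most frequently used tag, rounded up. The default `fraction=0.01` is an assumption of this sketch, not a value taken from the slide.

```python
import math
from collections import defaultdict


def orphan_ratio(assignments, user, fraction=0.01):
    """Share of the user's tags that are used on at most n resources.

    n = ceil(fraction * |R(t_max)|), where t_max is the user's most used tag.
    The threshold rule and the default fraction are assumptions of this sketch.
    """
    resources_by_tag = defaultdict(set)  # R(t) for the given user
    for u, tag, resource in assignments:
        if u == user:
            resources_by_tag[tag].add(resource)
    if not resources_by_tag:
        return 0.0
    most_used = max(len(resources) for resources in resources_by_tag.values())
    n = math.ceil(fraction * most_used)
    orphans = [t for t, resources in resources_by_tag.items() if len(resources) <= n]
    return len(orphans) / len(resources_by_tag)


# Toy data, invented for illustration: one heavily used tag and two rare ones.
toy = [("u", "work", f"r{i}") for i in range(150)]
toy += [("u", "todo", "r0"), ("u", "todo", "r1"), ("u", "misc", "r2")]
print(orphan_ratio(toy, "u"))
```

With this rule, a user whose tags are all used equally often gets a ratio of 1.0, so the measure is only informative for users with a reasonably large number of posts.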

Tagging pragmatics: Limitations of measures
• Real users: no "perfect" Categorizers / Describers, but "mixed" behaviour
• Possibly influenced by user interfaces / recommenders
• Measures are correlated
• But: independent of semantics; measures capture usage patterns

Influence of Tagging Pragmatics on Emergent Semantics
• Idea: Can we learn the same (or even better) semantics from the folksonomy induced by a subset of describers / categorizers?
[Figure: users arranged between Extreme Categorizers and Extreme Describers; the complete folksonomy next to a subset of 30% categorizers]

Experimental setup
• Apply the pragmatic measures vocab, trr, tpp, orphan to each user
• Systematically create "sub-folksonomies" CFi / DFi by successively adding i % of Categorizers / Describers (i = 1, 2, …, 25, 30, …, 100)
• Compute similar tags based on each subset (Tag Context Similarity)
• Assess the (semantic) quality of similar tags by avg. JCN distance
[Figure: example sub-folksonomies CF5 and DF20, each evaluated with TagCont(t, t_sim) = … and JCN(t, t_sim) = …]
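The construction of the sub-folksonomies can be sketched as follows, assuming every user already has one score from a pragmatic measure and that low scores indicate Categorizers while high scores indicate Describers. The function names, the `kind` parameter, and the toy users are hypothetical.

```python
import math


def percent_steps():
    """Percentages used on the slide: 1, 2, ..., 25, then 30, 35, ..., 100."""
    return list(range(1, 26)) + list(range(30, 101, 5))


def sub_folksonomy(assignments, scores, percent, kind):
    """Keep the tag assignments of the top `percent` % of users of one type.

    kind="C": users with the lowest scores first (assumed Categorizers).
    kind="D": users with the highest scores first (assumed Describers).
    """
    ranked = sorted(scores, key=lambda u: (scores[u], u), reverse=(kind == "D"))
    keep = set(ranked[: math.ceil(len(ranked) * percent / 100)])
    return [a for a in assignments if a[0] in keep]


# Toy data, invented for illustration: ten users, each with one tag assignment.
scores = {f"u{i}": float(i) for i in range(10)}
assignments = [(f"u{i}", f"tag{i}", f"r{i}") for i in range(10)]

cf30 = sub_folksonomy(assignments, scores, 30, "C")
df20 = sub_folksonomy(assignments, scores, 20, "D")
print([a[0] for a in cf30], [a[0] for a in df20])
```

Each resulting subset would then be passed to the tag similarity computation and scored by the average JCN distance, so that the CFi and DFi curves can be compared step by step.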

Dataset
• From the social bookmarking site Delicious in 2006
• Two filtering steps (to make measures more meaningful):
  • Restrict to top 10,000 tags (dataset FULL)
  • Keep only users with > 100 resources (dataset MIN100RES)

Results – adding Describers (DFi)

Results – adding Categorizers (CFi)

Summary & Conclusions
• Introduction of measures of users' tagging motivation (Categorizers vs. Describers)
• Evidence for a causal link between tagging pragmatics (HOW people use tags) and tag semantics (WHAT tags mean)
• "Mass matters" for the "wisdom of the crowd", but the composition of the crowd makes a difference ("verbosity" of describers is in general better, but with a limitation)
• Relevant for tag recommendation and ontology learning algorithms

My thoughts and remarks
• Confirms once again that removing spammers is useful, but how useful?
• Try to recursively combine the sets of describers / categorizers

Q&A and discussion!

Thank you for your attention!

Extras:

References
• [Cattuto2008] Ciro Cattuto, Dominik Benz, Andreas Hotho, Gerd Stumme: Semantic Grounding of Tag Relatedness in Social Bookmarking Systems. In: Proc. 7th Intl. Semantic Web Conference (2008), pp. 615-631
• [Markines2009] Benjamin Markines, Ciro Cattuto, Filippo Menczer, Dominik Benz, Andreas Hotho, Gerd Stumme: Evaluating Similarity Measures for Emergent Semantics of Social Tagging. In: Proc. 18th Intl. World Wide Web Conference (2009), pp. 641-650
• [Strohmaier2009] Markus Strohmaier, Christian Körner, Roman Kern: Why do users tag? Detecting users' motivation for tagging in social tagging systems. Technical Report, Knowledge Management Institute, Graz University of Technology (2009)
