Thoughts on Social Tagging

Marti Hearst, UC Berkeley. Taxonomy Bootcamp ’07 Keynote Talk.


  1. Thoughts on Social Tagging Marti Hearst UC Berkeley Taxonomy Bootcamp ’07 Keynote Talk

  2. Outline • What are Tags? • Organizing Tags for Navigation • Facets and faceted navigation • How to (semi)automatically create facet hierarchies • What’s up with Tag Clouds?

  3. Social Tagging • Metadata assignment without all the bother • Spontaneous, easy, and tends towards single terms • Usually used in the context of social media

  4. The Tagging Opportunity • At last! Content-oriented metadata in the large! • Attempts at metadata standardization always end up with something like the Dublin Core • author, date, publisher, … • I’ve always thought the action was in the subject metadata, and have focused on how to navigate collections given such data.

  5. The Tagging Opportunity • Tags are inherently faceted! • It is assumed that multiple labels will be assigned to each item • Rather than placing them into a folder • Rather than placing them into a hierarchy • Concepts are assigned from many different content categories • Helps alleviate the metadata wars: • Allows for both splitters and lumpers • Is this a bird or a robin? • Doesn’t matter, you can do both! • Allows for differing organizational views • Does NASCAR go under sports or entertainment? • Doesn’t matter, you can do both!

  6. Tagging Problems • Tags aren’t organized • Tags don’t attempt exhaustive coverage • Different tags for the same meanings • Morphological variants (airplane, airplanes) • Lexical variants (sf, sanfrancisco, san francisco) • Synonyms (boat, ship) • See how this author attempts to compensate:
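Separately from the example the slide points to, here is a minimal sketch of how the three kinds of variants listed above might be folded together. The lookup tables and the plural rule are assumptions made for illustration, not something described in the talk; a real system would need far larger tables and a proper stemmer.

```python
import re

# Hypothetical lookup tables, seeded with the examples from the slide.
LEXICAL_VARIANTS = {"sf": "san francisco", "sanfrancisco": "san francisco"}
SYNONYMS = {"ship": "boat"}


def normalize_tag(tag: str) -> str:
    """Map a raw tag to a canonical form using very rough heuristics."""
    # Lowercase and collapse whitespace.
    t = re.sub(r"\s+", " ", tag.strip().lower())
    # Lexical variants: spelling and spacing differences.
    t = LEXICAL_VARIANTS.get(t, t)
    # Morphological variants: naive plural stripping.
    # This is wrong for words such as "series" or "lens".
    if len(t) > 3 and t.endswith("s") and not t.endswith("ss"):
        t = t[:-1]
    # Synonyms: map to one preferred term.
    return SYNONYMS.get(t, t)


if __name__ == "__main__":
    for raw in ["Airplanes", "airplane", "sf", "SanFrancisco", "ship"]:
        print(raw, "->", normalize_tag(raw))
```

Whether boat and ship should really be merged is a judgment call; the sketch only shows where such a decision would live.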

  7. Tagging Problems / Opportunities • Some tags are fleeting in meaning or too personal • toread, todo • Tags are not “professional” • (I personally don’t think this matters) • Great example from Trant: • “Anecdotal evidence also shows that ‘professional’ cataloguers find the basic description of visual elements surprisingly difficult: a curator exhibited significant discomfort during this description task. When asked what was wrong, he blurted out ‘everything I know isn’t in the picture’.” • “Investigating social tagging and folksonomy in the art museum with steve.museum”, J. Trant, B. Wyman, WWW 2006 Collaborative Tagging Workshop

  8. “Investigating social tagging and folksonomy in the art museum with steve.museum”, J. Trant, B. Wyman, WWW 2006 Collaborative Tagging Workshop

  9. What about Browsing? • I think tags need some organization • Currently most tags are used as a direct index into items • Click on tag, see items assigned to it, end of story • Co-occurring tags are not shown • Grouping into small hierarchies is not usually done • del.icio.us now has bundles, but navigation isn’t good • IBM’s dogear and RawSugar come the closest • I think the solution is to organize tags into faceted hierarchies and do browsing in the standard way
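To make the "co-occurring tags are not shown" point concrete, the sketch below counts which tags appear together on the same item, so that clicking one tag could also surface related tags. The bookmark data and function names are hypothetical, and this is only one simple way to do it.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tagged bookmarks: item -> tags assigned to it.
BOOKMARKS = {
    "url1": ["python", "tutorial"],
    "url2": ["python", "tutorial", "web"],
    "url3": ["python", "web"],
    "url4": ["cooking"],
}


def cooccurrence(items):
    """Count how often each unordered pair of tags shares an item."""
    counts = Counter()
    for tags in items.values():
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1
    return counts


def related_tags(tag, counts):
    """Tags that co-occur with `tag`, most frequent first."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == tag:
            related[b] += n
        elif b == tag:
            related[a] += n
    return related.most_common()


if __name__ == "__main__":
    print(related_tags("python", cooccurrence(BOOKMARKS)))
```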

  10. Faceted Metadata and Navigation

  11. The Idea of Facets • Facets are a way of labeling data • A kind of Metadata (data about data) • Can be thought of as properties of items • Facets vs. Categories • Items are placed INTO a category system • Multiple facet labels are ASSIGNED TO items

  12. The Idea of Facets • Create INDEPENDENT categories (facets) • Each facet has labels (sometimes arranged in a hierarchy) • Assign labels from the facets to every item • Example: recipe collection • Ingredient: Chicken, Bell Pepper • Cooking Method: Stir-fry, Curry • Course: Main Course • Cuisine: Thai

  13. The Idea of Facets • Break out all the important concepts into their own facets • Sometimes the facets are hierarchical • Assign labels to items from any level of the hierarchy • Preparation Method: Fry, Saute, Boil, Bake, Broil, Freeze • Desserts: Cakes, Cookies, Dairy (Ice Cream, Sorbet, Flan) • Fruits: Cherries, Berries (Blueberries, Strawberries), Bananas, Pineapple

  14. Using Facets • Now there are multiple ways to get to each item • Preparation Method: Fry, Saute, Boil, Bake, Broil, Freeze • Desserts: Cakes, Cookies, Dairy (Ice Cream, Sherbet, Flan) • Fruits: Cherries, Berries (Blueberries, Strawberries), Bananas, Pineapple • One item: Fruit > Pineapple; Dessert > Cake; Preparation > Bake • Another item: Dessert > Dairy > Sherbet; Fruit > Berries > Strawberries; Preparation > Freeze
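The "multiple ways to get to each item" idea can be sketched as follows. The facet paths are the ones shown on the slide; the item names and the function names are made up for illustration. Each item carries several facet paths, and selecting any prefix of any path reaches it.

```python
# Facet paths from the slide; item names are hypothetical.
ITEMS = {
    "pineapple cake": [
        ("Fruit", "Pineapple"),
        ("Dessert", "Cake"),
        ("Preparation", "Bake"),
    ],
    "strawberry sherbet": [
        ("Dessert", "Dairy", "Sherbet"),
        ("Fruit", "Berries", "Strawberries"),
        ("Preparation", "Freeze"),
    ],
}


def matches(item_paths, selection):
    """True if one of the item's paths starts with the selected path."""
    return any(path[:len(selection)] == selection for path in item_paths)


def browse(items, *selections):
    """Names of items that match every selected facet path."""
    return sorted(
        name
        for name, paths in items.items()
        if all(matches(paths, s) for s in selections)
    )


if __name__ == "__main__":
    print(browse(ITEMS, ("Dessert",)))
    print(browse(ITEMS, ("Fruit", "Berries")))
    print(browse(ITEMS, ("Dessert",), ("Preparation", "Bake")))
```

Because a selection is matched as a path prefix, choosing a label high in a hierarchy (Dessert) also returns items labeled deeper down (Dessert > Dairy > Sherbet).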

  15. Advantages of Faceted Navigation • Systematically integrates search results: • reflect the structure of the info architecture • retain the context of previous interactions • Gives users control and flexibility • Over order of metadata use • Over when to navigate vs. when to search • Allows integration with advanced methods • Collaborative filtering, predicting users’ preferences

  16. Advantages of Faceted Navigation • Can’t end up with empty result sets • (except with keyword search) • Helps avoid feelings of being lost. • Easier to explore the collection. • Helps users infer what kinds of things are in the collection. • Evokes a feeling of “browsing the shelves” • Is preferred over standard search for collection browsing in usability studies. • (Interface must be designed properly)
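One common way to get the "no empty result sets" property is to offer only those refinements that occur in the current result set, each with its count. The sketch below assumes a flat record layout with one label per facet; the records and function names are hypothetical and this is not a description of any particular system.

```python
from collections import Counter

# Hypothetical recipe records, one label per facet for brevity.
RECIPES = [
    {"Ingredient": "Chicken", "Cooking Method": "Stir-fry", "Cuisine": "Thai"},
    {"Ingredient": "Chicken", "Cooking Method": "Curry", "Cuisine": "Thai"},
    {"Ingredient": "Bell Pepper", "Cooking Method": "Stir-fry", "Cuisine": "Thai"},
]


def select(records, chosen):
    """Records that carry every chosen facet label."""
    return [r for r in records if all(r.get(f) == v for f, v in chosen.items())]


def refinements(records, chosen):
    """Label counts within the current results, for facets not yet chosen."""
    counts = {}
    for record in select(records, chosen):
        for facet, label in record.items():
            if facet not in chosen:
                counts.setdefault(facet, Counter())[label] += 1
    return counts


if __name__ == "__main__":
    chosen = {"Ingredient": "Chicken"}
    for facet, labels in refinements(RECIPES, chosen).items():
        print(facet, dict(labels))
```

Every label offered comes from a record already in the result set, so following any offered refinement returns at least one item.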

  17. Incorporating Tags into Library Catalogs • I think this is where semi-automated techniques for tag conversion will be most helpful. • Some libraries are already going this route: • Michigan State University Library: • http://discover.lib.msu.edu/iii/encore/app • Scottsdale Public Library • http://libcat.scottsdaleaz.gov/ • (search within the Encore box)

  18. Can Tags be Converted to Faceted Metadata?

  19. One attempt: RawSugar • A company/website that organizes bookmark tags into facet hierarchies • Current demo is sparse

  20. CastaNet: Creating Facet Hierarchies from Text (Stoica & Hearst, HLT-NAACL ’07)

  21. Example: Recipes (3500 docs)

  22. Castanet Output (shown in Flamenco)

  23. Castanet Output (shown in Flamenco)

  24. Castanet Output (shown in Flamenco)

  25. Example: Biology Journal Titles • Castanet Output (shown in Flamenco)

  26. Castanet Algorithm • Leverage the structure of WordNet • Pipeline: Documents -> Select terms -> Get hypernym paths (from WordNet) -> Build tree -> Compress tree -> Divide into facets

  27. 1. Select Terms • Select well-distributed terms from the collection • Example terms: red, blue
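The slides do not say how "well distributed" is measured. As one plausible reading, the sketch below keeps words whose document frequency falls inside a band, so that words in nearly every document and words in only one or two are both dropped. The thresholds, the tokenizer, and the sample documents are all assumptions.

```python
import re
from collections import Counter


def select_terms(docs, min_frac, max_frac):
    """Keep words whose share of documents lies in [min_frac, max_frac]."""
    doc_freq = Counter()
    for text in docs:
        # Count each word once per document.
        doc_freq.update(set(re.findall(r"[a-z]+", text.lower())))
    total = len(docs)
    return sorted(
        word
        for word, count in doc_freq.items()
        if min_frac <= count / total <= max_frac
    )


if __name__ == "__main__":
    docs = ["a red car", "a blue car", "a red bike", "a blue boat", "a green car"]
    print(select_terms(docs, min_frac=0.3, max_frac=0.8))
```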

  28. 2. Get Hypernym Path • red: abstraction > property > visual property > color > chromatic color > red, redness > red • blue: abstraction > property > visual property > color > chromatic color > blue, blueness > blue

  29. 3. Build Tree • The two hypernym paths for red and blue are combined into one tree • abstraction > property > visual property > color > chromatic color, which has two children: red, redness > red and blue, blueness > blue
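A minimal sketch of the tree-building step, using the two hypernym paths shown on the slides. The nested-dictionary representation and the function names are assumptions chosen for brevity. Paths that share a prefix end up sharing the same nodes.

```python
# Hypernym paths for "red" and "blue", copied from the slides.
RED = ["abstraction", "property", "visual property", "color",
       "chromatic color", "red, redness", "red"]
BLUE = ["abstraction", "property", "visual property", "color",
        "chromatic color", "blue, blueness", "blue"]


def build_tree(paths):
    """Merge root-to-term paths into one nested dict."""
    root = {}
    for path in paths:
        node = root
        for label in path:
            # Reuse the node if an earlier path already created it.
            node = node.setdefault(label, {})
    return root


def show(tree, depth=0):
    """Return the tree as indented lines, one node per line."""
    lines = []
    for label, children in tree.items():
        lines.append("  " * depth + label)
        lines.extend(show(children, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(show(build_tree([RED, BLUE]))))
```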

  30. 4. Compress Tree • Before: color > chromatic color, with children red, redness > red; blue, blueness > blue; green, greenness > green • After: color > chromatic color, with children red, blue, green

  31. 4. Compress Tree (cont.) • Before: color > chromatic color, with children red, blue, green • After: color, with children red, blue, green
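The slides show the compression only as before/after pictures for the color example. The sketch below reproduces those two pictures with two simplified rules that are my own inference from them; the rules and thresholds in the actual Castanet paper may differ. Rule A replaces a non-root node that has a single child by that child (red, redness > red becomes red). Rule B drops a child whose name contains its parent's name as a word (chromatic color under color) and moves its children up.

```python
# The "before" tree from the first Compress Tree slide, as a nested dict.
TREE = {
    "chromatic color": {
        "red, redness": {"red": {}},
        "blue, blueness": {"blue": {}},
        "green, greenness": {"green": {}},
    }
}


def compress(name, children, is_root=True):
    """Compress the subtree rooted at `name`.

    Returns a list of (name, children) pairs that replace the node, so a
    removed node can hand its place to its child.
    """
    kept = {}
    for child, grandchildren in children.items():
        for new_name, new_children in compress(child, grandchildren, False):
            # Rule B (assumed): drop a child named after its parent and
            # attach that child's children directly to the parent.
            if new_children and name in new_name.split():
                kept.update(new_children)
            else:
                kept[new_name] = new_children
    # Rule A (assumed): a non-root node with one child is replaced by it.
    if not is_root and len(kept) == 1:
        return list(kept.items())
    return [(name, kept)]


if __name__ == "__main__":
    print(compress("color", TREE))
```

Applied to the tree above, both rules fire in one bottom-up pass, which yields color with the three children red, blue, and green, the end state shown on the second Compress Tree slide.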

  32. 5. Divide into Facets

  33. Disambiguation • Ambiguity in: • Word senses • Paths up the hypernym tree • 2 paths for same word: • Sense 1 for word “tuna”: organism, being => plant, flora => vascular plant => succulent => cactus => tuna • Sense 2 for word “tuna” (2 paths for same sense): • organism, being => fish => food fish => tuna • organism, being => fish => bony fish => spiny-finned fish => percoid fish => tuna

  34. How to Select the Right Senses and Paths? • First: build core tree • (1) Create paths for words with only one sense • (2) Use Domains • WordNet has 212 Domains • medicine, mathematics, biology, chemistry, linguistics, soccer, etc. • Automatically scan the collection to see which domains apply • The user selects which of the suggested domains to use, or may add their own • Paths for terms that match the selected domains are added to the core tree • Then: add remaining terms to the core tree.
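The first two steps of building the core tree can be sketched roughly as follows, using the tuna example from the Disambiguation slide. The domain labels attached to each sense are hypothetical (the slides do not say which WordNet domain each sense carries), "cactus" is treated as single-sense purely for illustration, and the final step of adding the remaining terms is left out because the slides do not describe how it is done.

```python
# Candidate senses per term. Paths follow the Disambiguation slide;
# the "domain" values are hypothetical labels for illustration.
CANDIDATES = {
    "tuna": [
        {"domain": "botany",
         "path": ["organism, being", "plant, flora", "vascular plant",
                  "succulent", "cactus", "tuna"]},
        {"domain": "food",
         "path": ["organism, being", "fish", "food fish", "tuna"]},
    ],
    "cactus": [
        {"domain": "botany",
         "path": ["organism, being", "plant, flora", "vascular plant",
                  "succulent", "cactus"]},
    ],
}


def core_paths(candidates, selected_domains):
    """Pick one path per term for the core tree, where that is clear-cut."""
    chosen = {}
    for term, senses in candidates.items():
        if len(senses) == 1:
            # (1) Words with only one sense go straight in.
            chosen[term] = senses[0]["path"]
            continue
        # (2) Otherwise keep the sense that matches a selected domain.
        hits = [s for s in senses if s["domain"] in selected_domains]
        if len(hits) == 1:
            chosen[term] = hits[0]["path"]
        # Terms that are still ambiguous are left for the later step.
    return chosen


if __name__ == "__main__":
    print(core_paths(CANDIDATES, {"food"})["tuna"])
```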

  35. Castanet Evaluation Method • Information architects assessed the category systems • For each of 2 systems’ output: • Examined and commented on top-level • Examined and commented on two sub-levels • Also compared to a baseline system • Then comment on overall properties • Meaningful? • Systematic? • Likely to use in your work?

  36. CastaNet Evaluation Results • Results on recipes collection for “Would you use this system in your work?” • # “Yes in some cases” or “yes, definitely”: • Castanet: 29/34 • LDA: 0/18 • Subsumption: 6/16 • Baseline: 25/34 • Average response to questions about quality (4 = “strongly agree”)

  37. Will Castanet Work on Tags? • Class project by Simon King and Jeff Towle, 2004 • 1650 captions captured from mobile phones • “Blocks with Grandpa”, “Weezer”, “A veterans day tour of berkeley in front of south hall.”, “Bad photo”, “Kitchen”, “Jgj” • Wanted to organize them. • Use the CastaNet WordNet-based facet-hierarchy creation algorithm • by Stoica & Hearst, to appear at HLT-NAACL ’07 • Had to first remove proper names

  38. Example Photos & Captions (King & Towle) very scary x-mas tree Hp presentation chasing a cat in the dark My cat

  39. instrumentality, (112) vehicle (26) car (9) bike (8) vessel, watercraft (4) mayflower (2) ferry (1) gig (1) truck (3) airplane (2) device (20) machine (7) computer (4) laptop (1) sander (1) game (8) auction (1) skittles (1) diversion, recreation (6) athletic game (4) baseball (1) basketball (1) football (1) soccer (1) playing (2) frolic (1) container (16) vessel (7) bottle (5) water_bottle (2) jug (1) pill_bottle (1) bath (2) bowl (1) can (2) backpack (1) bumper (1) empty (1) salt_shaker (1) furniture, piece of furniture, article of furniture (12) seat (8) bench (2) chair (2) couch (2) lounge (1) bed (4) desk (1)

  40. Next Steps • We need more analysis of: • What characterizes tags? • What makes for useful tags? • This will support automatic tag assignment and organization • Other Research Questions: • How can the interface encourage consistency, coherence, and coverage? • How to get tag expertise? • Right now, in many cases it is least-common-denominator

  41. What’s up with Tag Clouds? What does a typical tag cloud look like?

  42. Definition Tag Cloud: A visual representation of social tags, organized into paragraph-style layout, usually in alphabetical order, where the relative size and weight of the font for each tag corresponds to the relative frequency of its use.

  43. Definition Tag Cloud: A visual representation of social tags, organized into paragraph-style layout, usually in alphabetical order, where the relative size and weight of the font for each tag corresponds to the relative frequency of its use.
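The definition above can be turned into a small sketch: tags in alphabetical order, with font size scaled by relative frequency. Linear scaling, the point-size range, and the sample counts are assumptions; the definition only says that size corresponds to frequency, and real tag clouds may use other mappings (logarithmic, for example).

```python
def tag_cloud(counts, min_pt=10, max_pt=32):
    """Return (tag, font size in points) pairs in alphabetical order.

    `counts` maps each tag to how often it was used and must not be empty.
    """
    lowest, highest = min(counts.values()), max(counts.values())
    # Avoid dividing by zero when every tag has the same count.
    span = (highest - lowest) or 1
    cloud = []
    for tag in sorted(counts):
        share = (counts[tag] - lowest) / span
        cloud.append((tag, round(min_pt + share * (max_pt - min_pt))))
    return cloud


if __name__ == "__main__":
    # Hypothetical usage counts.
    for tag, size in tag_cloud({"web": 50, "design": 10, "python": 30}):
        print(f"{tag}: {size}pt")
```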

  44. flickr’s tag cloud

  45. del.icio.us

  46. del.icio.us

  47. blogs

  48. ma.gnolia.com
