
Searching and Browsing Using Tags

Searching and Browsing Using Tags. Nikos Sarkas Social Information Systems Seminar DCS, University of Toronto, Winter 2007. Social Resource Sharing. The del.icio.us paradigm. Users store links to web pages of interest along with arbitrary, user-specified tags in a server.


Presentation Transcript


  1. Searching and Browsing Using Tags Nikos Sarkas Social Information Systems Seminar DCS, University of Toronto, Winter 2007

  2. Social Resource Sharing • The del.icio.us paradigm. • Users store links to web pages of interest along with arbitrary, user-specified tags in a server. • The model is independent of the resource being shared. • Music (Last.fm) • Photos (Flickr) • Publications (CiteULike) • …

  3. Part I: Searching

  4. Ranking Web Search Results • Two prevalent models. • Ranking based on query-document similarity. • TF/IDF • Metadata extraction • Link analysis • Query independent static ranking. • PageRank • “Quality” based

  5. Similarity Ranking, Take I • Query q = {q_1, q_2, …, q_n}. • Tags of URL p, T(p) = {t_1, t_2, …, t_m}. • Define similarity as |q ∩ T(p)| / |T(p)|. • Problems • Synonymy (according to the authors) • Others? • Synonymy example • Linux, Ubuntu and Gnome
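
The measure on this slide can be sketched in a few lines of Python. The function name and the case-folding of terms are assumptions made for illustration; the slide only gives the ratio |q ∩ T(p)| / |T(p)|.

```python
def simple_tag_similarity(query_terms, url_tags):
    """Return |q ∩ T(p)| / |T(p)| for one query and one URL's tag set."""
    # Assumption: terms and tags are compared case-insensitively.
    q = {term.lower() for term in query_terms}
    tags = {tag.lower() for tag in url_tags}
    if not tags:
        return 0.0  # an untagged URL matches nothing
    return len(q & tags) / len(tags)


# The synonymy problem: a page tagged only "ubuntu" and "gnome"
# scores zero for the query "linux", although it is clearly related.
print(simple_tag_similarity(["linux"], ["linux", "ubuntu"]))
print(simple_tag_similarity(["linux"], ["ubuntu", "gnome"]))
```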

  6. Similarity Ranking, Take II • Use tags with “similar” meaning to enrich query. • Create 3 matrices • M_TP, tag-URL count matrix • S_T, tag-tag similarity matrix • S_P, URL-URL similarity matrix

  7. Similarity Ranking, Take II • Iterate • Similarly update S_P, until convergence. • Then, similarity between a query q and a URL p is
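
The update formulas on this slide did not survive in the transcript, so the following is only a generic SimRank-style sketch of the idea, not necessarily the authors' exact update: two tags are similar if they annotate similar URLs, and two URLs are similar if they carry similar tags. The function names, the `decay` factor, the fixed iteration count (instead of a convergence test), the decision to ignore the counts in M_TP, and the summed query-URL score are all assumptions.

```python
def _update(items, neighbours, other_sim, decay):
    """One SimRank-style step: average similarity over pairs of neighbours."""
    new_sim = {}
    for a in items:
        for b in items:
            if a == b:
                new_sim[a, b] = 1.0
                continue
            pairs = [(x, y) for x in neighbours[a] for y in neighbours[b]]
            total = sum(other_sim[x, y] for x, y in pairs)
            new_sim[a, b] = decay * total / len(pairs) if pairs else 0.0
    return new_sim


def social_simrank_sketch(m_tp, decay=0.8, iterations=10):
    """m_tp maps tag -> {url: count}. Returns (S_T, S_P) as dicts keyed by pairs."""
    tags = sorted(m_tp)
    urls = sorted({u for row in m_tp.values() for u in row})
    urls_of = {t: sorted(m_tp[t]) for t in tags}
    tags_of = {u: [t for t in tags if u in m_tp[t]] for u in urls}
    # Start from identity: every tag (URL) is similar only to itself.
    s_t = {(a, b): float(a == b) for a in tags for b in tags}
    s_p = {(a, b): float(a == b) for a in urls for b in urls}
    for _ in range(iterations):
        # Both updates read the previous iteration's values.
        s_t, s_p = (_update(tags, urls_of, s_p, decay),
                    _update(urls, tags_of, s_t, decay))
    return s_t, s_p


def query_url_similarity(query_terms, url_tags, s_t):
    """Assumed scoring: sum tag-tag similarity over all (query term, URL tag) pairs."""
    return sum(s_t.get((q, t), 0.0) for q in query_terms for t in url_tags)


m_tp = {
    "linux": {"u1": 1, "u2": 1},
    "ubuntu": {"u1": 1, "u2": 1},
    "music": {"u3": 1},
}
s_t, s_p = social_simrank_sketch(m_tp)
# A page tagged only "linux" now gets a non-zero score for the query "ubuntu".
print(query_url_similarity(["ubuntu"], ["linux"], s_t))
```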

  8. Social PageRank • “Popular web pages are tagged by many up-to-date users, using hot tags”. • Transfer popularity between entities. • Define matrices M_PU, M_UT, M_TP. • Iterate
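
The iteration itself is missing from the transcript. As a rough illustration of "transferring popularity between entities", the sketch below pushes scores around a single page → user → tag → page cycle and renormalizes each round. This is a simplified assumption, not the authors' formula; the function name, the input format of (user, tag, page) triples, and the way the three count matrices are derived from those triples are all hypothetical.

```python
from collections import Counter


def social_pagerank_sketch(annotations, iterations=20):
    """annotations: iterable of (user, tag, page) triples. Returns {page: score}."""
    triples = set(annotations)
    # Assumed count matrices, stored sparsely as Counters keyed by pairs.
    m_pu = Counter((p, u) for u, t, p in triples)  # page-user counts
    m_ut = Counter((u, t) for u, t, p in triples)  # user-tag counts
    m_tp = Counter((t, p) for u, t, p in triples)  # tag-page counts
    pages = sorted({p for _, _, p in triples})
    if not pages:
        return {}
    page_score = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        user_score = Counter()
        for (p, u), count in m_pu.items():
            user_score[u] += count * page_score[p]
        tag_score = Counter()
        for (u, t), count in m_ut.items():
            tag_score[t] += count * user_score[u]
        new_page = Counter()
        for (t, p), count in m_tp.items():
            new_page[p] += count * tag_score[t]
        total = sum(new_page.values())
        page_score = {p: new_page[p] / total for p in pages}  # scores sum to 1
    return page_score


annotations = [
    ("alice", "linux", "p1"),
    ("bob", "linux", "p1"),
    ("carol", "linux", "p1"),
    ("alice", "music", "p2"),
]
scores = social_pagerank_sketch(annotations)
print(sorted(scores, key=scores.get, reverse=True))
```

In this toy data the page annotated by three users with a shared tag ends up ranked above the page annotated once.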

  9. Putting It All Together • Train a ranking function (RankSVM) using the following features • BM25 similarity between query and url content • Simple query-url tags similarity measure • Complex query-url tags similarity measure • PageRank • Social PageRank • Results • Precision, NDCG at k • Small improvement over BM25, up to 25% for NDCG and synthetic queries
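
For reference, the two evaluation measures named on this slide can be computed as follows. The slide does not say which NDCG variant was used; the exponential gain (2^rel − 1) with a log2 discount shown here is one common formulation and is an assumption, as are the function names.

```python
import math


def precision_at_k(relevances, k):
    """Fraction of the top-k results that are relevant (relevance > 0); k >= 1."""
    return sum(1 for r in relevances[:k] if r > 0) / k


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k graded relevance labels."""
    return sum((2 ** r - 1) / math.log2(i + 2)
               for i, r in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels of a ranked result list, best grade = 3.
ranked = [2, 3, 0, 1]
print(precision_at_k(ranked, 3))
print(ndcg_at_k(ranked, 3))
```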

  10. Part II: Browsing

  11. Tag Assisted Browsing • Currently two methods for tag-driven browsing • Keyword search • Clouds of popular tags • We would like to support • Semantic browsing: also present URLs annotated with similar tags • Hierarchical browsing: browse in a top-down fashion

  12. Semantic Browsing • Define similarity between tags: • Synonymic tags: similarity above a threshold. • The synonymic tags and the tag itself define its semantic concept. • Given that the user has selected L tags, which define semantic concepts S_c = {C_1, …, C_L}, related URLs are:
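
Neither the tag similarity measure nor the formula for related URLs is preserved in the transcript. The sketch below illustrates the idea under stated assumptions: tag similarities are given as a precomputed dictionary, the threshold value is arbitrary, and a URL counts as related when it carries at least one tag from every selected concept. That last rule and all names are hypothetical.

```python
def semantic_concept(tag, tag_similarity, threshold=0.5):
    """A tag plus its synonymic tags (similarity above the threshold)."""
    synonyms = {
        other
        for (first, other), sim in tag_similarity.items()
        if first == tag and other != tag and sim > threshold
    }
    return {tag} | synonyms


def related_urls(selected_tags, url_tags, tag_similarity, threshold=0.5):
    """URLs that match every selected concept through at least one tag (assumed rule)."""
    concepts = [semantic_concept(t, tag_similarity, threshold)
                for t in selected_tags]
    return sorted(
        url
        for url, tags in url_tags.items()
        if all(concept & set(tags) for concept in concepts)
    )


tag_similarity = {
    ("linux", "ubuntu"): 0.7,
    ("ubuntu", "linux"): 0.7,
    ("linux", "music"): 0.1,
}
url_tags = {
    "u1": {"ubuntu"},
    "u2": {"music"},
    "u3": {"linux", "howto"},
}
# Selecting "linux" also surfaces the page tagged only "ubuntu".
print(related_urls(["linux"], url_tags, tag_similarity))
```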

  13. Hierarchical Browsing • Observations • No neat tree structure • Multiple ways to reach a resource • URLs associated with different categories • Dynamic structure: leaves can become inner nodes

  14. Hierarchical Browsing • Generating sub-tags • Train a classifier to identify which of the tags in the semantic concept are sub-tags • Features used: ratio of tag counts, intersection size, etc. • Clustering sub-tags • Rank tags based on a complex formula • Greedy clustering technique
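
As a small illustration of the two features named on this slide, the sketch below computes them for a (parent tag, candidate sub-tag) pair. It assumes that a tag's count is the number of distinct URLs it annotates and that the ratio is candidate over parent; the slide specifies neither, nor the classifier or the remaining features, so those are left out.

```python
def subtag_features(parent, candidate, urls_by_tag):
    """Features for deciding whether `candidate` is a sub-tag of `parent`."""
    parent_urls = urls_by_tag.get(parent, set())
    candidate_urls = urls_by_tag.get(candidate, set())
    return {
        # Assumed definition: candidate URL count divided by parent URL count.
        "count_ratio": (len(candidate_urls) / len(parent_urls)
                        if parent_urls else 0.0),
        # Number of URLs annotated with both tags.
        "intersection_size": len(parent_urls & candidate_urls),
    }


urls_by_tag = {
    "linux": {"u1", "u2", "u3", "u4"},
    "ubuntu": {"u1", "u2"},
}
print(subtag_features("linux", "ubuntu", urls_by_tag))
```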
