Social Bookmarking and Collaborative Filtering Christopher G. Wagner
What is Social Bookmarking? • Bookmark storage • Online storage instead of locally in a browser • No folders • Items can belong to more than one “folder” • Finding others with similar interests • Using the interests of others to locate more interesting sites
Views of Social Bookmarks • View personal bookmarks and tags • View all items with a particular tag(s) • New way of searching • View tags of another user • Create private and public groups for sharing • View ratings of bookmarks
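The first three views above can be sketched with two small indexes. This is a minimal illustration only: the `BookmarkStore` class, its method names, and the example URLs are hypothetical and do not reflect the API of del.icio.us or any other service.

```python
from collections import defaultdict


class BookmarkStore:
    """Toy in-memory bookmark store (hypothetical, for illustration only)."""

    def __init__(self):
        self.by_tag = defaultdict(set)    # tag -> set of URLs carrying it
        self.by_user = defaultdict(dict)  # user -> {url: set of tags}

    def add(self, user, url, tags):
        # One bookmark can carry many tags, i.e. live in many "folders".
        self.by_user[user].setdefault(url, set()).update(tags)
        for tag in tags:
            self.by_tag[tag].add(url)

    def items_with_tags(self, *tags):
        """Return the URLs that carry every one of the given tags."""
        if not tags:
            return set()
        return set.intersection(*(self.by_tag.get(t, set()) for t in tags))

    def tags_of(self, user):
        """Return all tags a user has applied to any bookmark."""
        result = set()
        for tags in self.by_user.get(user, {}).values():
            result |= tags
        return result


store = BookmarkStore()
store.add("alice", "http://a.example", ["python", "cf"])
store.add("bob", "http://b.example", ["python"])
print(store.items_with_tags("python", "cf"))
print(store.tags_of("alice"))
```

Keeping a tag-to-items index alongside the per-user data is what makes "view all items with a particular tag" a lookup rather than a scan.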
Social Bookmarking Projects • Del.icio.us • Furl.net • Flickr.com • Simpy.com • Gmail.com • Clusty.com • Stumbleupon.com • IBM’s dogear
What is Collaborative Filtering? • “Collaborative filtering (CF) is the method of making automatic predictions (filtering) about the interests of a user by collecting taste information from many users (collaborating).” -Wikipedia (http://en.wikipedia.org/wiki/Collaborative_filtering) • Take advantage of users’ input and behavior to make recommendations. • “System for helping people find relevant content” -Rashmi Sinha (http://www.rashmisinha.com)
Traditional Collaborative Filtering • Each user is represented by an N-dimensional vector, where N is the number of items • Elements of the vector can be ratings, indicators of purchase, etc. • Components are typically multiplied by the inverse frequency (the inverse of the number of users who purchased or rated the item), so that less well-known items carry more weight • Use an algorithm to measure the similarity of vectors, e.g. cosine similarity
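A minimal sketch of the user-vector approach, assuming binary purchase indicators and an inverse-frequency weight of 1 / (number of users holding the item). The user names and the `purchases` data are made up for illustration.

```python
import math


def inverse_frequency(user_vectors):
    """Weight per item: 1 / (number of users with a nonzero entry)."""
    n_items = len(next(iter(user_vectors.values())))
    counts = [sum(1 for v in user_vectors.values() if v[i]) for i in range(n_items)]
    return [1.0 / c if c else 0.0 for c in counts]


def weighted(vector, weights):
    """Scale each component of a user vector by its item weight."""
    return [x * w for x, w in zip(vector, weights)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical data: rows are users, columns are items (1 = purchased).
purchases = {
    "ann": [1, 1, 0, 1],
    "bob": [1, 0, 0, 1],
    "cat": [1, 0, 1, 0],
}
weights = inverse_frequency(purchases)
vectors = {user: weighted(v, weights) for user, v in purchases.items()}

sim_ann_bob = cosine(vectors["ann"], vectors["bob"])
sim_ann_cat = cosine(vectors["ann"], vectors["cat"])
print(sim_ann_bob > sim_ann_cat)
```

Item 0 is held by everyone, so its weight is the smallest and it contributes little to any similarity score; the rarer shared item 3 is what makes ann and bob look alike.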
Problems • M customers, N items • Generating recommendations is O(MN) in the worst case • Typically closer to O(M+N), since most customer vectors are sparse • Still problematic when M, N ~ 10^6
Cluster Models • Treat finding similar customers as a classification problem • Create clusters of customers • Assign user to “nearest” cluster • Base recommendations on user’s cluster
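The assignment and recommendation steps might look like the sketch below. It assumes the clusters already exist and are summarized by centroid vectors, and it uses cosine similarity as the notion of "nearest"; the centroid values, cluster names, and `top_n` parameter are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_cluster(user, centroids):
    """Name of the cluster whose centroid is most similar to the user vector."""
    return max(centroids, key=lambda name: cosine(user, centroids[name]))


def recommend(user, centroid, top_n=2):
    """Item indexes the user lacks, ranked by their weight in the centroid."""
    candidates = [
        (score, i) for i, score in enumerate(centroid) if not user[i] and score > 0
    ]
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in candidates[:top_n]]


# Hypothetical centroids: fraction of each cluster's members who own each item.
centroids = {
    "sci_fi": [0.9, 0.8, 0.1, 0.0],
    "cooking": [0.0, 0.1, 0.9, 0.7],
}
user = [1, 0, 0, 0]
cluster = nearest_cluster(user, centroids)
print(cluster, recommend(user, centroids[cluster]))
```

Only the handful of centroids is compared at request time, which is why this scales better than comparing against every customer, at the cost of coarser recommendations.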
Search Based Methods • Construct searches based on keywords from user’s existing items • Not practical if user has many items • Recommendations tend to be poor
Types of Collaborative Filtering • Active: sending pointers to a resource; user ratings • Passive: observing user behavior • Item-based: items become the focus, not users
Active Collaborative Filtering • Uses a peer-to-peer approach • Users want to actively share information, recommendations, evaluations, ratings, etc. • Usually, information is from a user who has direct experience with the product • Biased opinions • Less data available
Netflix Prize • October 2, 2006 - October 2, 2011 • Improve their recommendation system by at least 10% over the current method • $1M Grand Prize • $50k Yearly Prizes
Passive Collaborative Filtering • Monitor user’s activity • Purchasing item • Repeated use of an item • Number of times queried • Makes use of implicit filters • Requires nothing additional from users • Doesn’t capture user’s evaluation
Google’s Sponsored Links • www.AreYouASlackerMom.com • www.royalsaharajasper.com • Related to Pi Mu Epsilon • “Will pay stipend to Grad” • “Cheap Faculty Flights” • “Greek Ringtone”
Item-to-Item Collaborative Filtering • Focus is on finding similar items, not similar customers • Originally proposed by Vucetic and Obradovic in 2000 • Matches user’s items to similar items to create recommendations • Association Rule Mining
Amazon Slide • Similar to impulse items in checkout line • Tailored to each user
Amazon’s Algorithm
For each item in product catalog, I1
    For each customer C who purchased I1
        For each item I2 purchased by customer C
            Record that a customer purchased I1 and I2
    For each item I2
        Compute the similarity between I1 and I2
• Only items purchased by a common customer are compared, not all pairs of items
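The loop structure above translates fairly directly into Python. This sketch assumes purchase data as a mapping from customer to a set of items and uses cosine similarity over binary purchase vectors; the similarity measure, the function name, and the sample data are assumptions for illustration, not Amazon's production code.

```python
import math
from collections import defaultdict


def build_similar_items(purchases):
    """purchases: {customer: set of items}. Returns {item: {other_item: similarity}}."""
    # Invert the data so each item knows who bought it.
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    similar = {}
    for i1, customers in buyers.items():
        # Record co-purchases: only items sharing a customer with i1 are touched.
        co_counts = defaultdict(int)
        for customer in customers:
            for i2 in purchases[customer]:
                if i2 != i1:
                    co_counts[i2] += 1
        # Cosine of two binary vectors: shared buyers / sqrt(|buyers1| * |buyers2|).
        similar[i1] = {
            i2: shared / math.sqrt(len(customers) * len(buyers[i2]))
            for i2, shared in co_counts.items()
        }
    return similar


# Hypothetical purchase history.
purchases = {
    "c1": {"book", "lamp"},
    "c2": {"book", "lamp", "pen"},
    "c3": {"pen"},
}
table = build_similar_items(purchases)
print(table["book"])
```

Pairs of items with no customer in common never appear in `co_counts`, which is the point of the last bullet: the work is proportional to actual co-purchases rather than to all item pairs.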
Run Time of Algorithm • Worst case O(N^2 M) • In practice, more like O(NM) • Runs offline, so it does not affect the customer • For each customer, you only have to aggregate items similar to their purchases and make recommendations, which is fast
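The online step can be sketched as a lookup over a similar-items table computed offline. The table contents below are invented, and summing similarity scores is one plausible way to aggregate, chosen here as an assumption.

```python
from collections import defaultdict

# Hypothetical table built offline: item -> {similar item: similarity score}.
SIMILAR = {
    "book": {"lamp": 1.0, "pen": 0.5},
    "lamp": {"book": 1.0, "pen": 0.5, "desk": 0.4},
    "pen": {"book": 0.5, "lamp": 0.5},
}


def recommend(owned, similar, top_n=3):
    """Rank items the customer does not own by summed similarity to owned items."""
    scores = defaultdict(float)
    for item in owned:
        for other, score in similar.get(item, {}).items():
            if other not in owned:
                scores[other] += score
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


print(recommend({"book", "lamp"}, SIMILAR))
```

The cost depends only on how many items the customer owns and how many neighbors each item has, not on the total number of customers or catalog size.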
Collaborative Filtering With Tags • Requiring user input is usually a barrier, but not so with tags • User’s bookmarks reveal information about their interests, which is useful for finding others with similar interests • Applications to corporate repositories of information (IBM’s dogear) • Both active (tags) and passive (logs) filtering
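One simple way to find users with similar interests from their tags is to compare tag sets. The sketch below uses Jaccard similarity, which is a choice made here for illustration and is not taken from the references; the user names and tags are made up.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_users(target, user_tags, top_n=2):
    """Other users ranked by tag overlap with the target, dropping zero overlap."""
    scored = [
        (jaccard(user_tags[target], tags), user)
        for user, tags in user_tags.items()
        if user != target
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user for score, user in scored[:top_n] if score > 0]


# Hypothetical tag sets collected from each user's bookmarks.
user_tags = {
    "alice": {"python", "ml", "cf"},
    "bob": {"python", "cf", "web"},
    "carol": {"cooking"},
}
print(similar_users("alice", user_tags))
```

Bookmarks saved by the users returned here, but not yet by the target, are natural recommendation candidates.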
References • G. Linden, B. Smith, and J. York, “Amazon.com Recommendations: Item-to-Item Collaborative Filtering,” IEEE Internet Computing, 2003, pp. 76-80. • R. Sinha, “Collaborative Filtering strikes back (this time with tags),” http://www.rashmisinha.com/archives/05_10/tags-collaborative-filtering.html. • S. Vucetic and Z. Obradovic, “A Regression-Based Approach for Scaling-Up Personalized Recommender Systems in E-Commerce,” Workshop on Web Mining for E-Commerce, at the 6th ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining (KDD), Boston, MA, 2000. • R. Wash and E. Rader, “Collaborative Filtering with del.icio.us,” working paper. • R. Wash and E. Rader, “Incentives for Contribution in del.icio.us: The Role of Tagging in Information Discovery,” working paper. • Wikipedia, “Collaborative Filtering,” http://en.wikipedia.org/wiki/Collaborative_filtering.