Amazon.com Recommendation Item-to-Item Collaborative Filtering

Amazon.com Recommendation Item-to-Item Collaborative Filtering Authors: Greg Linden, Brent Smith, and Jeremy York Origin: Jan/Feb 2003 Published by the IEEE Computer Society Presented By: Ankita Khosla (ak12x) Date: April 10, 2014

Outline • Introduction • Problems • Recommendation Algorithms • Comparison • Conclusion

Introduction • Recommendation algorithms are best known for their use on e-commerce Web sites • Applications draw on different inputs to generate recommendations: items that customers purchase, ratings, items viewed, demographic data, subject interests, and favorite artists • Amazon.com uses recommendation algorithms to personalize the online store for each customer

Problems/Challenges • Many applications require the result set to be returned in real time, while still producing high-quality recommendations • New customers typically have extremely limited information, based on only a few purchases or ratings • Customer data is volatile: the algorithm must respond immediately to new information

Three common approaches to solving the problem • Traditional collaborative filtering • Cluster models • Search-based methods Amazon.com's approach • Item-to-item CF algorithm: for each of the user's purchased and rated items, it attempts to find similar items and then aggregates them into a recommendation list

Traditional Collaborative Filtering • Represents a customer as an N-dimensional vector of items, where N is the number of distinct catalog items • Vector components are positive for purchased or positively rated items and negative for negatively rated items • The algorithm generates recommendations based on a few customers who are most similar to the user • The similarity between two customers can be calculated using the cosine measure, the cosine of the angle between their two vectors
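To make the cosine measure concrete, here is a minimal sketch in Python. The three customers, the four-item catalog, and the +1 / -1 / 0 encoding are made-up illustrations, not data from the paper.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical catalog of N = 4 items.
# +1 = purchased or rated positively, -1 = rated negatively, 0 = no data.
alice = [1, 1, 0, -1]
bob = [1, 1, 0, 0]
carol = [-1, 0, 1, 1]

sim_alice_bob = cosine_similarity(alice, bob)      # similar tastes, close to 1
sim_alice_carol = cosine_similarity(alice, carol)  # opposite tastes, negative
```

In this toy data Bob is the customer most similar to Alice, so a traditional collaborative filter would draw Alice's recommendations from Bob's items.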

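Traditional Collaborative Filtering Disadvantage • Computationally expensive: O(MN) in the worst case, where M is the number of customers and N is the number of catalog items; in practice closer to O(M + N), because most customer vectors are very sparse • Scaling issues can be partially addressed by reducing the data size (sampling customers or discarding items), but this lowers recommendation quality • If the algorithm examines only a small customer sample, the selected customers will be less similar to the user • If it discards the most popular or unpopular items, those items will never appear as recommendations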

Cluster Models • Goal: divide the customer base into many segments and assign the user to the segment containing the most similar customers • The algorithm then uses the purchases and ratings of the customers in that segment to generate recommendations

Cluster Models • Advantage: better online scalability and performance than traditional collaborative filtering, because the user is compared with a controlled number of segments rather than the entire customer base; the complex and expensive clustering computation is run offline • Disadvantage: recommendation quality is low, because the customers grouped into a segment are not necessarily the customers most similar to the user
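The online step of a cluster model can be sketched as follows. This is only an illustration: it assumes the offline clustering has already produced one summary vector per segment, and the segment names, summary values, and the use of cosine similarity for the comparison are all hypothetical choices.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical offline output: one summary vector per segment over a 4-item catalog.
segment_summaries = {
    "sci-fi readers": [0.9, 0.8, 0.1, 0.0],
    "cooking fans": [0.0, 0.1, 0.9, 0.7],
}

def assign_segment(user_vector, summaries):
    """Online step: compare the user with each segment summary, not with every customer."""
    return max(summaries, key=lambda name: cosine(user_vector, summaries[name]))

user = [1, 0, 0, 0]  # the user bought only the first catalog item
segment = assign_segment(user, segment_summaries)
```

The online cost depends on the number of segments rather than the number of customers, which is where the scalability comes from. The price is that every customer in the chosen segment is treated as equally similar to the user.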

Search- or Content-Based Methods • Given the user's purchased and rated items, the algorithm constructs a search query to find other popular items • For example, items by the same author, artist, or director, or with similar keywords

Search- or Content-Based Methods • If the user has few purchases or ratings, search-based recommendation algorithms scale and perform well • For users with thousands of purchases, it is impractical to base a query on all the items; the algorithm must then use a subset of the data, which further reduces recommendation quality

Search- or Content-Based Methods Disadvantage • Too general: e.g., best-selling drama DVD titles • Too narrow: e.g., all the books by the same author • Recommendations should help a customer find and discover new, relevant, and interesting items; popular items by the same author or in the same subject category fail to achieve this goal

Item-to-Item Collaborative Filtering • Rather than matching the user to similar customers, this algorithm matches each of the user's purchased and rated items to similar items and then combines those items into a recommendation list • It builds a similar-items table by finding items that customers tend to purchase together

Amazon.com

Item-to-Item CF Algorithm • An iterative algorithm builds the similar-items table by calculating the similarity between a single product and all related products:
For each item in the product catalog, I1
    For each customer C who purchased I1
        For each item I2 purchased by customer C
            Record that a customer purchased I1 and I2
    For each item I2
        Compute the similarity between I1 and I2
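The loop on this slide can be sketched in Python as follows. The purchase data is made up, and the sketch assumes purchases are the only signal (no ratings), so each item vector is binary and the cosine between two items reduces to the number of customers who bought both, divided by the square root of the product of the two items' buyer counts.

```python
import math
from collections import defaultdict

# Hypothetical purchase history: customer -> set of purchased items.
purchases = {
    "c1": {"book_a", "book_b"},
    "c2": {"book_a", "book_b", "dvd_x"},
    "c3": {"book_a", "dvd_x"},
    "c4": {"dvd_x"},
}

def build_similar_items(purchases):
    """Offline step: return {item: {other_item: cosine similarity}}."""
    # Invert the history so we can list the customers who bought each item.
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    table = {}
    for i1 in buyers:                           # for each item in the catalog, I1
        bought_together = defaultdict(int)
        for customer in buyers[i1]:             # for each customer C who purchased I1
            for i2 in purchases[customer]:      # for each item I2 purchased by C
                if i2 != i1:
                    bought_together[i2] += 1    # record that a customer purchased I1 and I2
        table[i1] = {                           # for each I2, compute the similarity
            i2: count / math.sqrt(len(buyers[i1]) * len(buyers[i2]))
            for i2, count in bought_together.items()
        }
    return table

similar_items = build_similar_items(purchases)
```

Only item pairs that share at least one customer are ever visited, so the table stays sparse even when the catalog is large.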

Item-to-Item CF Algorithm • The similarity between two items can be computed using the cosine measure • Each vector corresponds to an item rather than a customer, and the vector's M dimensions correspond to the customers who have purchased that item • Given a similar-items table, the algorithm finds items similar to each of the user's purchases and ratings, aggregates those items, and then recommends the most popular or correlated items
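The online lookup-and-aggregate step might look like the sketch below. The table is hand-written for illustration, and scoring a candidate by summing its similarity to each of the user's items is an assumption; the slides only say that the most popular or correlated items are recommended.

```python
from collections import defaultdict

# Hypothetical similar-items table, as an offline job might produce it.
similar_items = {
    "book_a": {"book_b": 0.82, "dvd_x": 0.67},
    "book_b": {"book_a": 0.82, "dvd_x": 0.41},
    "dvd_x": {"book_a": 0.67, "book_b": 0.41, "cd_y": 0.50},
}

def recommend(user_items, similar_items, top_n=3):
    """Online step: look up neighbours of each of the user's items and aggregate them."""
    scores = defaultdict(float)
    for item in user_items:
        for candidate, similarity in similar_items.get(item, {}).items():
            if candidate not in user_items:      # never recommend what the user already has
                scores[candidate] += similarity  # assumed aggregation: sum of similarities
    # Highest score first; ties broken alphabetically so the output is deterministic.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [candidate for candidate, _ in ranked[:top_n]]

recs_for_reader = recommend({"book_a", "book_b"}, similar_items)
recs_for_viewer = recommend({"dvd_x"}, similar_items)
```

The work done here depends only on how many items the user has and how many neighbours each item keeps, not on the size of the catalog or the customer base.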

Scalability: A Comparison • Traditional CF: does little or no offline computation, and its online computation scales with the number of customers and catalog items; impractical on large data sets • Cluster models: perform much of the computation offline, but recommendation quality is relatively poor • Search-based models: build indexes offline but fail to provide recommendations with interesting, targeted titles; scale poorly for customers with numerous purchases and ratings

Scalability: A Comparison • Item-to-Item CF: • Creates the similar-items table offline • The online component looks up similar items for the user's purchases and ratings • Scales independently of the catalog size and the number of customers; it depends only on how many items the user has purchased or rated • Fast even for extremely large data sets • Recommendation quality is excellent • Performs well with limited user data

Conclusion • Unlike the other algorithms, item-to-item CF is able to meet the challenges of large retailers like Amazon.com • It scales to very large customer bases and product catalogs • It requires only sub-second processing time to generate online recommendations • It is able to react immediately to changes in a user's data • It makes compelling recommendations for all users regardless of the number of purchases and ratings

References • Linden, G.; Smith, B.; York, J. Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Computing, 2003. http://ieeexplore.ieee.org.proxy.lib.fsu.edu/stamp/stamp.jsp?tp=&arnumber=1167344

Thank You Q & A
