Amazon.com Recommendation Item-to-Item Collaborative Filtering

Authors: Greg Linden, Brent Smith, and Jeremy York. Origin: Jan/Feb 2003, published by the IEEE Computer Society. Presented by: Ankita Khosla (ak12x). Date: April 10, 2014.


Presentation Transcript


  1. Amazon.com Recommendation Item-to-Item Collaborative Filtering • Authors: Greg Linden, Brent Smith, and Jeremy York • Origin: Jan/Feb 2003, published by the IEEE Computer Society • Presented by: Ankita Khosla (ak12x) • Date: April 10, 2014

  2. Outline • Introduction • Problems • Recommendation Algorithms • Comparison • Conclusion

  3. Introduction • Recommendation algorithms are best known for their use on e-commerce Web sites • Many applications use different attributes to generate recommendations: items that customers purchase, ratings, items viewed, demographic data, subject interests, and favorite artists • Amazon.com uses recommendation algorithms to personalize the online store for each customer

  4. Problems/Challenges • Many applications require the result set to be returned in real time, while still producing high-quality recommendations • New customers typically have extremely limited information, based on only a few purchases or ratings • Customer data is volatile: the algorithm must respond immediately to new information

  5. Three common approaches to solving the problem • Traditional collaborative filtering • Cluster models • Search-based methods • Amazon.com’s approach, the item-to-item CF algorithm: for each of the user’s purchased and rated items, it attempts to find similar items and then aggregates them

  6. Traditional Collaborative Filtering • Represents a customer as an N-dimensional vector of items, where N is the number of distinct catalog items • Vector components are positive for purchased or positively rated items and negative for negatively rated items • The algorithm generates recommendations based on a few customers who are most similar to the user • Similarity between two customers can be calculated using the cosine measure, i.e. the cosine of the angle between their vectors
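To make the cosine measure concrete, here is a minimal sketch that compares customers stored as sparse vectors. The dictionary representation, the function name `cosine_similarity`, and the sample customers are illustrative assumptions, not part of the original slides.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as {item_id: value}."""
    # Only items present in both vectors contribute to the dot product.
    dot = sum(value * b[item] for item, value in a.items() if item in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical customers: +1 = purchased or rated positively, -1 = rated negatively.
alice = {"book_a": 1, "book_b": 1, "dvd_c": -1}
bob = {"book_a": 1, "dvd_c": -1, "cd_d": 1}
carol = {"cd_d": 1}

print(cosine_similarity(alice, bob))    # two shared opinions out of three each
print(cosine_similarity(alice, carol))  # no items in common
```

Storing only the non-zero components keeps the cost of one comparison proportional to the number of items a customer has actually touched rather than to the catalog size N.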

  7. Traditional Collaborative Filtering Disadvantage • Computationally expensive: O(MN) in the worst case, where M is the number of customers and N is the number of catalog items, though closer to O(M+N) in practice because most customer vectors are sparse • Scaling issues can be partially addressed by reducing the data size (sampling customers or discarding items), at the cost of recommendation quality • If the algorithm examines only a small customer sample, the selected customers will be less similar to the user • If it discards the most popular or unpopular items, they will never appear as recommendations

  8. Cluster Models • Goal: divide the customer base into many segments and assign the user to the segment containing the most similar customers • The algorithm then uses the purchases and ratings of the customers in that segment to generate recommendations
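The two steps above can be sketched as follows. This is only an illustration under stated assumptions: the segments are taken as given (the offline clustering itself is not shown), each segment is summarized by the average of its customers' vectors, and all names (`centroid`, `assign_segment`, `recommend_from_segment`) and data are hypothetical.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {item_id: value}."""
    dot = sum(value * b[item] for item, value in a.items() if item in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def centroid(customers):
    """Average the sparse vectors of a segment's customers (an assumed summary)."""
    total = defaultdict(float)
    for vec in customers:
        for item, value in vec.items():
            total[item] += value
    return {item: value / len(customers) for item, value in total.items()}

def assign_segment(user, segments):
    """Return the name of the segment whose centroid is most similar to the user."""
    return max(segments, key=lambda name: cosine(user, centroid(segments[name])))

def recommend_from_segment(user, customers, top_n=3):
    """Rank items the segment's customers liked that the user does not have yet."""
    counts = defaultdict(int)
    for vec in customers:
        for item, value in vec.items():
            if value > 0 and item not in user:
                counts[item] += 1
    return sorted(counts, key=lambda item: (-counts[item], item))[:top_n]

# Hypothetical, pre-computed segments.
segments = {
    "readers": [{"book_a": 1, "book_b": 1}, {"book_a": 1, "book_c": 1}],
    "listeners": [{"cd_x": 1, "cd_y": 1}],
}
user = {"book_a": 1}
name = assign_segment(user, segments)
print(name, recommend_from_segment(user, segments[name]))
```

The online work is one comparison per segment rather than one per customer, which is where the scalability benefit comes from.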

  9. Cluster Models • Advantage: better online scalability and performance than traditional collaborative filtering, because the user is compared with a controlled number of segments rather than the entire customer base; the complex and expensive clustering computation is run offline • Disadvantage: recommendation quality is low, because the customers in a segment are not necessarily the ones most similar to the user

  10. Search- or Content-Based Methods • Given the user’s purchased and rated items, the algorithm constructs a search query to find other popular items • For example, items by the same author, artist, or director, or with similar keywords

  11. Search- or Content-Based Methods • If the user has few purchases or ratings, search-based recommendation algorithms scale and perform well • For users with thousands of purchases, it is impractical to base a query on all the items; in such cases the algorithm must use a subset of the data, which reduces recommendation quality

  12. Search- or Content-Based Methods Disadvantage • Too general, e.g. best-selling drama DVD titles • Too narrow, e.g. all the books by the same author • Recommendations should help a customer find and discover new, relevant, and interesting items; popular items by the same author or in the same subject category fail to achieve this goal

  13. Item-to-Item Collaborative Filtering • Rather than matching the user to similar customers, this algorithm matches each of the user’s purchased and rated items to similar items and then combines those items into a recommendation list • It builds a similar-items table by finding items that customers tend to purchase together

  14. Amazon.com

  15. Amazon.com

  16. Item-to-Item CF Algorithm • An iterative algorithm that builds the similar-items table by calculating the similarity between a single product and all related products:
      For each item in product catalog, I1
        For each customer C who purchased I1
          For each item I2 purchased by customer C
            Record that a customer purchased I1 and I2
        For each item I2
          Compute the similarity between I1 and I2
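The loop above might be written as the following sketch. It assumes purchase data is available as a mapping from customer to the set of items bought, and it scores each pair with the cosine measure over binary purchase vectors (co-purchasers divided by the square root of the product of the two items' buyer counts). The function name `build_similar_items` and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def build_similar_items(purchases):
    """purchases: {customer_id: set of item_ids}.

    Returns {item: [(other_item, similarity), ...]} sorted by descending similarity.
    """
    # Invert the data so each item maps to the customers who purchased it.
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    table = {}
    for i1, i1_buyers in buyers.items():        # for each item I1 in the catalog
        co_counts = defaultdict(int)
        for customer in i1_buyers:              # for each customer C who purchased I1
            for i2 in purchases[customer]:      # for each item I2 purchased by C
                if i2 != i1:
                    co_counts[i2] += 1          # record that a customer bought I1 and I2
        sims = []
        for i2, both in co_counts.items():      # for each co-purchased item I2
            # Cosine measure for binary purchase vectors.
            sim = both / math.sqrt(len(i1_buyers) * len(buyers[i2]))
            sims.append((i2, sim))
        sims.sort(key=lambda pair: (-pair[1], pair[0]))
        table[i1] = sims
    return table

# Hypothetical purchase history.
purchases = {
    "c1": {"book", "lamp"},
    "c2": {"book", "lamp", "pen"},
    "c3": {"book", "pen"},
    "c4": {"mug"},
}
table = build_similar_items(purchases)
print(table["lamp"])
```

Only item pairs that share at least one customer are ever scored, so items with no common buyers (such as "mug" here) simply end up with an empty list.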

  17. Item-to-Item CF Algorithm • The similarity between two items can be computed with the cosine measure • Each vector corresponds to an item rather than a customer, and the vector’s M dimensions correspond to the customers who have purchased that item • Given a similar-items table, the algorithm finds items similar to each of the user’s purchases and ratings, aggregates those items, and then recommends the most popular or correlated items
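A minimal sketch of this online lookup step follows. The slides do not specify how the similar items are aggregated, so summing the similarity scores is an assumption made here for illustration; the function name `recommend` and the hand-written table are hypothetical as well.

```python
from collections import defaultdict

def recommend(user_items, similar_items, top_n=3):
    """Aggregate the similar-items lists of everything the user already has."""
    scores = defaultdict(float)
    for item in user_items:
        for other, sim in similar_items.get(item, []):
            if other not in user_items:   # never recommend what the user already owns
                scores[other] += sim      # assumed aggregation: sum of similarities
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical similar-items table, as if precomputed offline.
similar_items = {
    "book": [("lamp", 0.8), ("pen", 0.6)],
    "lamp": [("book", 0.8), ("pen", 0.5), ("desk", 0.4)],
    "pen": [("book", 0.6), ("lamp", 0.5)],
}
print(recommend({"book", "lamp"}, similar_items))
```

The work done here depends only on how many items the user has purchased or rated, not on the size of the catalog or the customer base.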

  18. Scalability: A Comparison • Traditional CF: does very little offline computation, and its online computation scales with the number of customers and items; impractical on large data sets • Cluster models: perform much of the computation offline, but recommendation quality is relatively poor • Search-based models: build indexes offline but fail to provide recommendations with interesting, targeted titles; scale poorly for customers with numerous purchases and ratings

  19. Scalability: A Comparison • Item-to-Item CF: • Creates the similar-items table offline • The online component looks up similar items for the user’s purchases and ratings • Scales independently of the catalog size or the number of customers • Fast for extremely large data sets • Quality is excellent • Performs well with limited user data

  20. Conclusion • Unlike other algorithms, item-to-item CF is able to meet the challenges of large retailers like Amazon.com • It is scalable over very large customer bases and product catalogs • Requires only sub-second processing time to generate online recommendations • It is able to react immediately to changes in a user’s data • Makes compelling recommendations for all users regardless of the number of purchases and ratings

  21. References • Linden, G., Smith, B., and York, J. “Amazon.com Recommendations: Item-to-Item Collaborative Filtering.” IEEE Internet Computing, Jan/Feb 2003. http://ieeexplore.ieee.org.proxy.lib.fsu.edu/stamp/stamp.jsp?tp=&arnumber=1167344

  22. Thank You • Q & A