
Prastava

Prastava: an open source, pure-Ruby generic recommendation system. Submitted by: Himanshu Gahlot, MNNIT, Allahabad, India / WING, NUS, Singapore; Tarun Kumar, IIIT, Allahabad, India / WING, NUS, Singapore. Project Guide: Prof. Min Yen Kan, Assistant Professor, NUS, Singapore.


Presentation Transcript


  1. Prastava: An open source, pure-Ruby generic recommendation system

  2. Submitted by : • Himanshu Gahlot, MNNIT, Allahabad, India / WING, NUS, Singapore • Tarun Kumar, IIIT, Allahabad, India / WING, NUS, Singapore Project Guide : Prof. Min Yen Kan, Assistant Professor, NUS, Singapore

  3. Motivation • Many websites use recommendation systems built in other languages. • Why Ruby?

  4. Methods Used • Collaborative Filtering • Content Based Filtering

  5. Collaborative Filtering (CF) • The most commonly adopted technique for building academic and commercial [1] recommender systems. • Makes recommendations based on the ratings that users have assigned to items. • Two types : - User-based collaborative filtering - Item-based collaborative filtering

  6. The CF Engine

  7. User Based Collaborative Filtering • The input is an item x user matrix holding the rating that each user has given to each item.
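The slide only states what the input looks like, so the following is a minimal Ruby sketch of one common way such a matrix can be used, not a description of Prastava's actual engine. It assumes ratings are stored as a nested hash (user => item => rating), measures user-to-user similarity with cosine similarity over co-rated items, and predicts a missing rating as a similarity-weighted average. The names `sample_ratings`, `user_similarity`, and `predict_rating`, and the data itself, are hypothetical.

```ruby
# Hypothetical sample data: ratings[user][item] = rating.
# A missing key means the user has not rated that item.
def sample_ratings
  {
    "alice" => { "item1" => 5, "item2" => 3, "item3" => 4 },
    "bob"   => { "item1" => 4, "item2" => 2, "item3" => 5, "item4" => 4 },
    "carol" => { "item1" => 1, "item2" => 5, "item4" => 2 }
  }
end

# Cosine similarity between two users, restricted to items both have rated.
def user_similarity(ratings, u, v)
  common = ratings[u].keys & ratings[v].keys
  return 0.0 if common.empty?

  dot    = common.sum { |i| ratings[u][i] * ratings[v][i] }
  norm_u = Math.sqrt(common.sum { |i| ratings[u][i]**2 })
  norm_v = Math.sqrt(common.sum { |i| ratings[v][i]**2 })
  return 0.0 if norm_u.zero? || norm_v.zero?

  dot / (norm_u * norm_v)
end

# Predict a rating as the similarity-weighted mean of the ratings that
# other users gave to the item. Returns nil when nobody similar rated it.
def predict_rating(ratings, user, item)
  weighted_sum = 0.0
  weight_total = 0.0

  ratings.each do |other, their_ratings|
    next if other == user || !their_ratings.key?(item)

    sim = user_similarity(ratings, user, other)
    next unless sim.positive?

    weighted_sum += sim * their_ratings[item]
    weight_total += sim
  end

  weight_total.zero? ? nil : weighted_sum / weight_total
end
```

Calling `predict_rating(sample_ratings, "alice", "item4")` estimates how alice would rate an item she has not seen; because her ratings resemble bob's more than carol's, the estimate lands closer to bob's rating of 4 than to carol's rating of 2.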

  8. An example rating matrix in which four users have rated six seasons :

  9. Item Based Collaborative Filtering

  10. An example of item based CF

  11. Algorithms for similarity measure • Cosine Similarity • Pearson Correlation

  12. Cosine Similarity • The cosine similarity between two vectors is defined as $\cos\theta = \frac{A \cdot B}{\|A\|\,\|B\|}$, where A and B are the two rating vectors.
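As a minimal Ruby sketch of this measure: the dot product of the two vectors divided by the product of their magnitudes. The method name `cosine_similarity` is hypothetical, and returning 0.0 for a zero-magnitude vector is an assumed convention (the quantity is mathematically undefined in that case).

```ruby
# Cosine similarity between two equal-length numeric vectors.
# Returns 0.0 when either vector has zero magnitude (assumed convention).
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |y| y * y })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end
```

Vectors pointing in the same direction, such as `[1, 2, 3]` and `[2, 4, 6]`, score approximately 1, while orthogonal vectors such as `[1, 0]` and `[0, 1]` score 0.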

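  13. Pearson Correlation • A correlation is a number between -1 and +1 that measures the degree of association between two variables (call them X and Y). • The Pearson correlation between two vectors X and Y with elements $x_i$ and $y_i$ is $r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$, where $\bar{x}$ and $\bar{y}$ are the means of X and Y.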
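A minimal Ruby sketch of the Pearson correlation, written in its mean-centered form: subtract each vector's mean, then divide the sum of products of the deviations by the product of their magnitudes. The method name `pearson_correlation` is hypothetical, and returning 0.0 when a vector has no variance is an assumed convention (the correlation is undefined in that case).

```ruby
# Pearson correlation between two equal-length, non-empty numeric vectors.
# Returns 0.0 when either vector is constant (assumed convention).
def pearson_correlation(x, y)
  unless x.length == y.length && !x.empty?
    raise ArgumentError, "vectors must be non-empty and of equal length"
  end

  n      = x.length.to_f
  mean_x = x.sum / n
  mean_y = y.sum / n

  # Deviations of each element from its vector's mean.
  dx = x.map { |v| v - mean_x }
  dy = y.map { |v| v - mean_y }

  numerator   = dx.zip(dy).sum { |a, b| a * b }
  denominator = Math.sqrt(dx.sum { |a| a * a }) * Math.sqrt(dy.sum { |b| b * b })
  return 0.0 if denominator.zero?

  numerator / denominator
end
```

Because the means are subtracted first, adding a constant to every element does not change the result: `[1, 2, 3]` correlates perfectly with `[11, 12, 13]`, whereas the cosine similarity of those two vectors is below 1.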

  14. Difference between Cosine Similarity and Pearson Correlation

  15. Content Based Filtering • When the content of items is available, selecting one item lets us recommend others with similar content. • The Term Frequency – Inverse Document Frequency (TF-IDF) algorithm is used to find the similarity between documents. • TF = (Freq. of the term in a doc.)/(Total sum of freq. of all terms in the same doc.) • IDF = log [(Total no. of docs.)/(Total no. of docs. which contain this term + 1)] • TFIDF = TF * IDF
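The three formulas above can be sketched in Ruby as follows. This is an illustrative sketch, not Prastava's code: it assumes each document is already tokenized into an array of term strings, and it uses the natural logarithm because the slide does not name a base. The method names are hypothetical.

```ruby
# TF: frequency of each term divided by the total number of terms in the document.
def term_frequencies(doc)
  total  = doc.length.to_f
  counts = doc.each_with_object(Hash.new(0)) { |term, h| h[term] += 1 }
  counts.transform_values { |count| count / total }
end

# IDF: log(total docs / (docs containing the term + 1)), as on the slide.
# Natural log is an assumption; the slide does not specify the base.
def inverse_document_frequencies(docs)
  doc_counts = Hash.new(0)
  docs.each { |doc| doc.uniq.each { |term| doc_counts[term] += 1 } }

  n = docs.length.to_f
  doc_counts.transform_values { |df| Math.log(n / (df + 1)) }
end

# TFIDF = TF * IDF, returned as one hash (term => weight) per document.
def tfidf_vectors(docs)
  idf = inverse_document_frequencies(docs)
  docs.map do |doc|
    term_frequencies(doc).map { |term, tf| [term, tf * idf[term]] }.to_h
  end
end
```

One consequence of the "+ 1" in the denominator is worth noting: a term that appears in every document gets a slightly negative IDF, and a term that appears in all but one document gets an IDF of zero, so the weights are most meaningful on collections larger than a handful of documents.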

  16. Method for calculating similarity between documents • All stop words are first removed. • The remaining terms are then reduced to their root form. • TFIDF values are calculated for each unique term and stored in a vector. • Such vectors are produced for all documents. • The similarity between two documents can then be calculated as the cosine similarity between their vectors.
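The steps above can be sketched end to end in Ruby. This is a hedged illustration under several assumptions, not the project's implementation: the stop-word list is a tiny hypothetical sample, `naive_stem` is a crude suffix stripper standing in for a real stemmer (for example the Porter algorithm), the logarithm is natural, and all method names are invented for the example.

```ruby
# Tiny illustrative stop-word list (an assumption; real lists are much longer).
def default_stop_words
  %w[a an and are in is of on the to]
end

# Crude suffix stripping, a stand-in for a real stemmer such as Porter's.
def naive_stem(word)
  word.sub(/(ing|ed|s)\z/, "")
end

# Steps 1 and 2: lowercase, tokenize, drop stop words, reduce to root form.
def preprocess(text, stop_words = default_stop_words)
  text.downcase.scan(/[a-z]+/)
      .reject { |word| stop_words.include?(word) }
      .map { |word| naive_stem(word) }
end

# Steps 3 and 4: one sparse TF-IDF vector (term => weight) per document,
# using TF = count / doc length and IDF = log(N / (df + 1)).
def tfidf_vectors(docs)
  n = docs.length.to_f
  doc_freq = Hash.new(0)
  docs.each { |doc| doc.uniq.each { |term| doc_freq[term] += 1 } }

  docs.map do |doc|
    counts = doc.each_with_object(Hash.new(0)) { |term, h| h[term] += 1 }
    counts.map do |term, count|
      [term, (count / doc.length.to_f) * Math.log(n / (doc_freq[term] + 1))]
    end.to_h
  end
end

# Step 5: cosine similarity between two sparse vectors.
# Terms missing from a vector count as weight 0.
def sparse_cosine(u, v)
  dot    = u.sum { |term, weight| weight * v.fetch(term, 0.0) }
  norm_u = Math.sqrt(u.values.sum { |w| w * w })
  norm_v = Math.sqrt(v.values.sum { |w| w * w })
  return 0.0 if norm_u.zero? || norm_v.zero?

  dot / (norm_u * norm_v)
end

# Pairwise similarity matrix for a list of raw text documents.
def similarity_matrix(texts)
  vectors = tfidf_vectors(texts.map { |text| preprocess(text) })
  vectors.map { |u| vectors.map { |v| sparse_cosine(u, v) } }
end
```

Given a few short texts, `similarity_matrix(texts)[i][j]` is higher for documents that share stemmed terms and 0.0 for documents with no terms in common. Storing vectors as hashes keeps only the terms that actually occur in each document, which avoids building a dense vector over the whole vocabulary.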

  17. Screen Shot • screenshots of working code….

  18. CVS on

  19. Thank You
