
The Summary of My Work In Graduate Grade One



Presentation Transcript


  1. The Summary of My Work In Graduate Grade One Reporter: Yuanshuai Sun E-mail: sunyuan_2008@yahoo.cn

  2. Content
     1) Recommender System
     2) KNN Algorithm—CF
     3) Matrix Factorization
     4) MF on Hadoop
     5) Thesis Framework

  3. Recommender System 1 A recommender system is a system that can recommend something you may be interested in but have not yet tried. For example, if you have bought a book about machine learning, the system would give a recommendation list including books about data mining and pattern recognition, and even some programming books.

  4. Recommender System 1

  5. Recommender System 1 But how does it get the recommendation list? Having bought "Machine Learning", a user might see:
     1. Nuclear Pattern Recognition Method and Its Application
     2. Introduction to Robotics
     3. Data Mining
     4. Beauty of Programming
     5. Artificial Intelligence

  6. Recommender System 1 There are many ways by which we can get the list. Recommender systems are usually classified into the following categories, based on how recommendations are made: 1. Content-based recommendations: The user will be recommended items similar to the ones the user preferred in the past;

  7. Recommender System 1 2. Collaborative recommendations: The user will be recommended items that people with similar tastes and preferences liked in the past. (Figure: the co-rated items identify the Top-1 most similar user; that user's favorite item, which the target user has not bought, is recommended to the target user.)

  8. Recommender System 1 3. Hybrid approaches: These methods combine collaborative and content-based methods, which can help to avoid certain limitations of each. The different ways to combine collaborative and content-based methods into a hybrid recommender system can be classified as follows:
     1) implementing collaborative and content-based methods separately and combining their predictions;
     2) incorporating some content-based characteristics into a collaborative approach;
     3) incorporating some collaborative characteristics into a content-based approach;
     4) constructing a general unifying model that incorporates both content-based and collaborative characteristics.

  9. KNN Algorithm—CF 2 KDD CUP 2011 website: http://kddcup.yahoo.com/index.php Recommending music items based on the Yahoo! Music dataset. The dataset is split into two subsets:
     - Train data: in the file trainIdx2.txt
     - Test data: in the file testIdx2.txt
     In each subset, user rating data is grouped by user. The first line for a user is formatted as: <UserId>|<#UserRatings>\n. Each of the next <#UserRatings> lines describes a single rating by <UserId>, with the format: <ItemId>\t<Score>\n. The scores are integers lying between 0 and 100, and are withheld from the test set. All user ids and item ids are consecutive integers, both starting at zero.
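The per-user layout above can be read with a short parser. This is a hypothetical sketch (the function name and sample lines are not from the competition code), assuming exactly the format described: a `<UserId>|<#UserRatings>` header followed by that many `<ItemId>\t<Score>` lines.

```python
# Hypothetical parser for the rating-file layout described above:
# "<UserId>|<#UserRatings>" header, then <#UserRatings> lines of "<ItemId>\t<Score>".
def parse_ratings(lines):
    """Return {user_id: {item_id: score}} from an iterator of text lines."""
    ratings = {}
    it = iter(lines)
    for header in it:
        user_id, n_ratings = header.rstrip("\n").split("|")
        user = ratings.setdefault(int(user_id), {})
        for _ in range(int(n_ratings)):
            item_id, score = next(it).rstrip("\n").split("\t")
            user[int(item_id)] = int(score)
    return ratings

# Toy input in the stated format (made-up ids and scores):
sample = ["0|2\n", "10\t90\n", "11\t50\n", "1|1\n", "10\t0\n"]
print(parse_ratings(sample))  # {0: {10: 90, 11: 50}, 1: {10: 0}}
```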

  10. KNN Algorithm—CF 2 KNN is the algorithm I used when participating in KDD CUP 2011 with my advisor, Mrs. Lin; KNN belongs to collaborative recommendation. (Figure: the co-rated items identify the Top-1 most similar user; that user's favorite song, which the target user has not heard, is recommended to the target user.)
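The Top-1 neighbour idea on this slide can be sketched in a few lines. This is a minimal illustration, not the competition code: the ratings dictionary, user names, and the choice of cosine similarity over co-rated items are all illustrative assumptions.

```python
# Minimal sketch of user-based KNN CF: find the most similar user via
# co-rated items, then recommend that user's favourite unseen item.
import math

def cosine(rx, ry):
    shared = set(rx) & set(ry)                     # co-rated items S_xy
    if not shared:
        return 0.0
    num = sum(rx[s] * ry[s] for s in shared)
    den = (math.sqrt(sum(rx[s] ** 2 for s in shared))
           * math.sqrt(sum(ry[s] ** 2 for s in shared)))
    return num / den if den else 0.0

def recommend(target, ratings):
    others = [u for u in ratings if u != target]
    nearest = max(others, key=lambda u: cosine(ratings[target], ratings[u]))
    # The neighbour's highest-rated item that the target has not rated:
    candidates = {i: s for i, s in ratings[nearest].items()
                  if i not in ratings[target]}
    return max(candidates, key=candidates.get)

ratings = {"u1": {"a": 5, "b": 3},
           "u2": {"a": 5, "b": 3, "c": 4},
           "u3": {"a": 1, "c": 2}}
print(recommend("u1", ratings))  # prints c
```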

  11. KNN Algorithm—CF 2 (Figure: the user-item rating matrix, users as rows and items as columns.)

  12. KNN Algorithm—CF 2 Two similarity measures between users: 1. Cosine distance; 2. Pearson correlation coefficient. In both, $S_{xy}$ is the set of all items co-rated by both users $x$ and $y$.

  13. KNN Algorithm—CF 2 1. Cosine distance: $\mathrm{sim}(x,y)=\frac{\sum_{s\in S_{xy}} r_{x,s}\, r_{y,s}}{\sqrt{\sum_{s\in S_{xy}} r_{x,s}^2}\,\sqrt{\sum_{s\in S_{xy}} r_{y,s}^2}}$, where $r_{x,s}$ is user $x$'s rating of item $s$ and $S_{xy}$ is the co-rated item set.

  14. KNN Algorithm—CF 2 2. Pearson correlation coefficient: $\mathrm{sim}(x,y)=\frac{\sum_{s\in S_{xy}} (r_{x,s}-\bar r_x)(r_{y,s}-\bar r_y)}{\sqrt{\sum_{s\in S_{xy}} (r_{x,s}-\bar r_x)^2}\,\sqrt{\sum_{s\in S_{xy}} (r_{y,s}-\bar r_y)^2}}$, where $\bar r_x = \frac{1}{|S_{xy}|}\sum_{s\in S_{xy}} r_{x,s}$ and likewise $\bar r_y$.
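The two measures can behave very differently on the same pair of users, which a small sketch makes concrete. The rating dictionaries below are made-up; both functions restrict their sums to the co-rated set $S_{xy}$ as defined above.

```python
# Sketch of the two slide formulas over the co-rated item set S_xy.
import math

def cosine(rx, ry):
    s = set(rx) & set(ry)
    num = sum(rx[i] * ry[i] for i in s)
    den = (math.sqrt(sum(rx[i] ** 2 for i in s))
           * math.sqrt(sum(ry[i] ** 2 for i in s)))
    return num / den if den else 0.0

def pearson(rx, ry):
    s = set(rx) & set(ry)
    mx = sum(rx[i] for i in s) / len(s)      # mean rating of x over S_xy
    my = sum(ry[i] for i in s) / len(s)      # mean rating of y over S_xy
    num = sum((rx[i] - mx) * (ry[i] - my) for i in s)
    den = (math.sqrt(sum((rx[i] - mx) ** 2 for i in s))
           * math.sqrt(sum((ry[i] - my) ** 2 for i in s)))
    return num / den if den else 0.0

rx, ry = {"a": 1, "b": 2, "c": 3}, {"a": 3, "b": 1, "c": 2}
print(round(cosine(rx, ry), 4))   # 0.7857: raw directions are close
print(round(pearson(rx, ry), 4))  # -0.5: after centering, tastes disagree
```

Cosine ignores each user's rating scale, while Pearson subtracts the per-user mean first, so a generous rater and a harsh rater can still come out strongly correlated.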

  15. KNN Algorithm—CF 2
     trackData.txt - Track information formatted as: <TrackId>|<AlbumId>|<ArtistId>|<Optional GenreId_1>|...|<Optional GenreId_k>\n
     albumData.txt - Album information formatted as: <AlbumId>|<ArtistId>|<Optional GenreId_1>|...|<Optional GenreId_k>\n
     artistData.txt - Artist listing formatted as: <ArtistId>\n
     genreData.txt - Genre listing formatted as: <GenreId>\n

  16. KNN Algorithm—CF 2

  17. KNN Algorithm—CF 2 1. The distance between a parent node and a child node in the item taxonomy, where H(·) is the information entropy. 2. The similarity between categories c1 and c2. (The formulas were shown as images on the slide.)

  18. KNN Algorithm—CF 2

  19. KNN Algorithm—CF 2

  20. Matrix Factorization 3 Users feature matrix X (rows u1, u2, u3) times items feature matrix Y (columns i1, i2, i3): the known ratings constrain the factors U, V, and the remaining products give the predictions.
     x11*y11 + x12*y12 = 1    x11*y21 + x12*y22 = 3    x11*y31 + x12*y32 = ?
     x21*y11 + x22*y12 = 2    x21*y21 + x22*y22 = ?    x21*y31 + x22*y32 = ?
     x31*y11 + x32*y12 = ?    x31*y21 + x32*y22 = 1    x31*y31 + x32*y32 = 3

  21. Matrix Factorization 3 Matrix factorization (abbr. MF), just as the name suggests, decomposes a large matrix into the product of several smaller matrices. Mathematically, we assume the target matrix $R \in \mathbb{R}^{m\times n}$ and the factor matrices $U \in \mathbb{R}^{m\times k}$ and $V \in \mathbb{R}^{n\times k}$, where $k \ll \min(m, n)$, so that $R \approx U V^{T}$.
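The shapes can be illustrated without any library: a rank-k model stores m*k + n*k numbers instead of m*n, and a prediction for a (user, item) pair is just a dot product of two k-vectors. The values below are arbitrary, purely to show the mechanics.

```python
# Tiny pure-Python sketch of R ≈ U V^T with k = 2: the prediction for
# (user i, item j) is the dot product of user row U[i] and item row V[j].
U = [[1.0, 0.5], [0.2, 1.0], [0.8, 0.3]]   # 3 users x k=2 features
V = [[0.6, 0.4], [0.1, 0.9]]               # 2 items x k=2 features

def predict(i, j):
    return sum(u * v for u, v in zip(U[i], V[j]))

print(round(predict(0, 0), 6))  # 1.0*0.6 + 0.5*0.4 = 0.8
```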

  22. Matrix Factorization 3 Kernel Function The kernel function decides how to compute the prediction matrix $\hat R$; that is, it is a function of the feature matrices U and V. We can express the prediction as $\hat r_{ij} = K(u_i, v_j)$.

  23. Matrix Factorization 3 Kernel Function For the kernel K one can use one of the following well-known kernels:
     $K(u, v) = \langle u, v\rangle$ ... linear
     $K(u, v) = (1 + \langle u, v\rangle)^d$ ... polynomial
     $K(u, v) = \exp(-\lVert u - v\rVert^2 / (2\sigma^2))$ ... RBF
     $K(u, v) = \phi(\langle u, v\rangle)$ with $\phi(z) = 1/(1+e^{-z})$ ... logistic
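The four kernels can be sketched directly from their formulas. The function names and the hyperparameter defaults (d, sigma) are illustrative choices, not fixed by the slides.

```python
# Sketch of the four kernels above, applied to two feature vectors.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def k_linear(u, v):
    return dot(u, v)

def k_poly(u, v, d=2):
    return (1 + dot(u, v)) ** d

def k_rbf(u, v, sigma=1.0):
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2 * sigma ** 2))

def k_logistic(u, v):
    return 1 / (1 + math.exp(-dot(u, v)))   # sigmoid of the dot product

u, v = [1.0, 0.0], [0.5, 0.5]
print(k_linear(u, v))  # 0.5
print(k_poly(u, v))    # (1 + 0.5)^2 = 2.25
```

The logistic kernel is handy when ratings are rescaled into a bounded interval, since the sigmoid keeps predictions in (0, 1).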

  24. Matrix Factorization 3 We quantify the quality of the approximation with the Euclidean distance, so we get the objective function $f(U,V)=\sum_{(i,j)\in\Omega}(r_{ij}-\hat r_{ij})^2$, where $\Omega$ is the set of observed ratings and $\hat r_{ij}=K(u_i,v_j)$ is the predicted value.

  25. Matrix Factorization 3 1. Alternating Descent Method This method only works when the loss function is the squared Euclidean distance. Fixing V, minimizing f over each row $u_i$ is an ordinary least-squares problem with a closed-form solution, $u_i = (V^T V)^{-1} V^T r_i$ (sums restricted to the observed entries of row i). The same holds for V.

  26. Matrix Factorization 3 2. Gradient Descent Method The update rule for U is $u_{if} \leftarrow u_{if} - \eta\,\partial f/\partial u_{if}$, where $\partial f/\partial u_{if} = -2\sum_{j:(i,j)\in\Omega}(r_{ij}-\hat r_{ij})\,v_{jf}$. The same holds for V.
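The stochastic variant of this update can be sketched in a few lines: visit one observed rating at a time and move $u_i$ and $v_j$ against the gradient of that single squared error. The toy ratings, learning rate, and factor count below are arbitrary choices for illustration.

```python
# Stochastic-gradient sketch of the update rule above (linear kernel).
import random

random.seed(0)
ratings = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0, (2, 1): 1.0}  # (user, item) -> score
m, n, k, lr = 3, 2, 2, 0.05

U = [[random.uniform(0, 1) for _ in range(k)] for _ in range(m)]
V = [[random.uniform(0, 1) for _ in range(k)] for _ in range(n)]

for epoch in range(200):
    for (i, j), r in ratings.items():
        err = r - sum(U[i][f] * V[j][f] for f in range(k))
        for f in range(k):
            # Simultaneous update using the old values of both factors.
            U[i][f], V[j][f] = (U[i][f] + lr * err * V[j][f],
                                V[j][f] + lr * err * U[i][f])

rmse = (sum((r - sum(U[i][f] * V[j][f] for f in range(k))) ** 2
            for (i, j), r in ratings.items()) / len(ratings)) ** 0.5
print(round(rmse, 3))  # small: the model fits the observed entries
```

A regularization term $\lambda(\lVert U\rVert^2 + \lVert V\rVert^2)$ is normally added to the loss so the factors do not overfit the few observed entries; it is omitted here to keep the update rule identical to the slide's.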

  27. Matrix Factorization 3 (Comparison: stochastic gradient algorithm vs. batch gradient algorithm.)

  28. Matrix Factorization 3 Online Algorithm: Online-Updating Regularized Kernel Matrix Factorization Models for Large-Scale Recommender Systems.

  29. MF on Hadoop 4 Loss Function We update the factor matrix V to reduce the objective function f with conventional gradient descent, $v_{jf} \leftarrow v_{jf} - \eta\,\partial f/\partial v_{jf}$, where $\partial f/\partial v_{jf} = -2\sum_{i:(i,j)\in\Omega}(r_{ij}-\hat r_{ij})\,u_{if}$; the same holds for the factor matrix U.
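The gradient sum above splits naturally across ratings, which is what makes it map/reduce-friendly. The sketch below is not the author's Hadoop code: it imitates the pattern in plain Python, with a mapper emitting one partial gradient of $v_j$ per observed rating (keyed by item j) and a reducer summing the partials before the update.

```python
# Plain-Python imitation of a map/reduce gradient step for V.
from collections import defaultdict

def mapper(record, U, V):
    (i, j), r = record
    err = r - sum(u * v for u, v in zip(U[i], V[j]))
    yield j, [-2 * err * u for u in U[i]]          # partial df/dv_j

def reducer(pairs):
    grads = defaultdict(lambda: None)
    for j, g in pairs:                              # sum partials per item key
        grads[j] = g if grads[j] is None else [a + b for a, b in zip(grads[j], g)]
    return grads

U = [[0.5, 0.5], [1.0, 0.0]]
V = [[0.2, 0.8]]
data = [((0, 0), 3.0), ((1, 0), 1.0)]               # two ratings of item 0

pairs = [p for rec in data for p in mapper(rec, U, V)]
grads = reducer(pairs)
lr = 0.1
V = [[v - lr * g for v, g in zip(V[j], grads[j])] for j in range(len(V))]
print(len(grads))  # 1: both partials were reduced under item key 0
```

On a real cluster the shuffle phase performs the grouping by key that `reducer` does here, and one full map/reduce pass corresponds to one batch gradient step.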

  30. MF on Hadoop 4

  31. MF on Hadoop 4

  32. MF on Hadoop 4

  33. MF on Hadoop 4 (Figure: block-wise matrix multiplication on Hadoop; the left and right matrices are split into blocks, the block products are summed, and the partial results are concatenated.)

  34. MF on Hadoop 4

  35. MF on Hadoop 4

  36. Thesis Framework 5
     • Introduction to recommender systems
     • My work on KNN
     • Matrix factorization in recommender systems
     • MF incremental updating using Hadoop

  37. Thanks for watching!
