HAROKOPIO UNIVERSITY of ATHENS, Department of Informatics & Telematics
A Scalable Solution for Personalized Recommendations in Large-scale Social Networks
Sardianos Christos, Varlamis Iraklis
Harokopio University of Athens, Dept. of Informatics & Telematics
{sardianos, varlamis}@hua.gr
PCI 2014, Athens October 2-4, 2014 18th Panhellenic Conference in Informatics
Role of Recommender Systems
• In many Web 2.0 applications, users can interact with the application through social activity.
• They can express their trust for another user or for another user's review.
• A recommender system is responsible for recommending items (e.g. products, articles etc.) to users, based on their previous activity.
• With existing techniques, this can be a difficult process in large social and bipartite graphs.
Structure of Recommender Systems
Recommender Systems Approaches
• There are many recommender system approaches, which can be broadly grouped into the following categories.*
* P. Melville, V. Sindhwani. "Recommender Systems", Encyclopedia of Machine Learning, Springer, 2010.
Limitations of Existing Approaches
• Social networks like Facebook & Twitter have over 1.5BN & 95M users respectively. Thus, a major limitation for Recommender Systems is scalability.
• Generating recommendations for users about whom the system has insufficient information (Cold-Start users) is a known issue of Recommender Systems.
Scientific Research Question-Definition
• Is it possible to achieve equally good recommendations by applying CF over subgraphs of the original graph?
• Is it possible to use these subgraphs to provide a solution for the Cold-Start problem?
• Proposed Solution: The creation of subgraphs based on social information content.
Proposed Approach & Tools
[Diagram labels: Social Graph, Partitioning, Subgraphs, Bipartite Graph, Collaborative Filtering (User-User, Item-Item, SVD), Recommendations]
• Partitioning using Metis from Karypis Lab*
• CF using the LensKit Recommender Toolkit (GroupLens Research**)
* http://glaros.dtc.umn.edu/gkhome/index.php
** http://lenskit.grouplens.org/
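The idea of running CF inside subgraphs of the social graph can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `partition_users` is a crude breadth-first stand-in and not the Metis algorithm, `predict` is a plain cosine-weighted user-user predictor and not LensKit's implementation, and the toy users, edges and ratings are made up.

```python
from collections import defaultdict, deque
from math import sqrt


def partition_users(users, social_edges, s):
    """Crude stand-in for a graph partitioner (NOT Metis): order users by a
    breadth-first traversal of the social graph, then cut that ordering into
    s nearly equal chunks so that connected users tend to stay together."""
    adj = defaultdict(set)
    for u, v in social_edges:
        adj[u].add(v)
        adj[v].add(u)
    order, seen = [], set()
    for start in users:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in sorted(adj[u]):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
    size = -(-len(order) // s)  # ceiling division
    return [order[i:i + size] for i in range(0, len(order), size)]


def cosine(r_u, r_v):
    """Cosine similarity between two users' rating dictionaries."""
    common = set(r_u) & set(r_v)
    if not common:
        return 0.0
    dot = sum(r_u[i] * r_v[i] for i in common)
    norm_u = sqrt(sum(x * x for x in r_u.values()))
    norm_v = sqrt(sum(x * x for x in r_v.values()))
    return dot / (norm_u * norm_v)


def predict(ratings, members, user, item, k=None):
    """User-user prediction that only looks at neighbours inside one subgraph.
    k limits the neighbourhood size; None means the whole neighbourhood."""
    own = ratings.get(user, {})
    neighbours = [(cosine(own, ratings[v]), ratings[v][item])
                  for v in members
                  if v != user and item in ratings.get(v, {})]
    neighbours = sorted((n for n in neighbours if n[0] > 0), reverse=True)
    if k is not None:
        neighbours = neighbours[:k]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return None  # no usable neighbour in this subgraph
    return sum(sim * r for sim, r in neighbours) / total


# Toy data (hypothetical): two pairs of socially connected users.
users = ["a", "b", "c", "d"]
social_edges = [("a", "b"), ("c", "d")]
ratings = {
    "a": {"i1": 5, "i2": 3},
    "b": {"i1": 4, "i2": 3, "i3": 4},
    "c": {"i1": 1, "i3": 2},
    "d": {"i1": 2, "i3": 1},
}
parts = partition_users(users, social_edges, s=2)
```

With this toy input, `predict(ratings, parts[0], "a", "i3")` only consults user `b`, the single member of `a`'s subgraph who rated `i3`. Restricting the neighbour search to one subgraph is what keeps the per-partition cost small.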
Description of the model functionality
[Figure: Bipartite Graph]
Evaluation Metrics
Dataset Characteristics Comparison
Experimental Procedure
Platform used for experiments
• Use of the Okeanos IaaS Cloud provided by the Greek Research and Technology Network (GRNET S.A.)
• Two Linux-based systems:
  • Ubuntu Desktop 64-bit
  • 2 CPUs, QEMU Virtual CPU v. 1.7.0
  • 2.1GHz CPU speed, 512KB cache
  • 6GB RAM
Experimental procedure implementation
• Model implementation in Java
• Evaluation process run through Groovy scripts
Evaluation of the Experimental Procedure
• Algorithms evaluated:
  • User-User
  • Item-Item
  • FunkSVD (SVD implementation)
• We performed a 5-fold Cross-Validation over the Training & Testing samples.
• The numbers of subgraphs examined were s = {1, 2, 4, 8, 16, 33, 65, 125, 250, 500, 1000}, using the whole neighborhood as k-nearest neighbors.
• For s = {4, 65, 1000} we examined the performance of the User-User algorithm for different Neighborhood-Size (knn) values, with k = {1, 3, 5, 10, 25, 50, 100, 500, 1000}.
• The number of features used for training by the FunkSVD algorithm was set to FeatureCount = 100.
• The list size for the Top-N nDCG metric was set to N = 5.
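The protocol above (5-fold cross-validation, Top-N nDCG with N = 5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the LensKit evaluator: the function names `k_fold_splits`, `dcg` and `ndcg_at_n` are hypothetical, the folds are built by a simple shuffle over rating records, and the discount is the common textbook form 1 / log2(rank + 1), which may differ from the discount the toolkit applies.

```python
import math
import random


def k_fold_splits(ratings, k=5, seed=0):
    """Yield (train, test) lists so that every rating record lands in
    exactly one test fold. Folding by shuffled records is an assumption."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, test


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains, start=1))


def ndcg_at_n(ranked_items, true_ratings, n=5):
    """ranked_items: the recommender's list, best first.
    true_ratings: held-out ratings of the same user (item -> rating).
    Items without a held-out rating contribute zero gain."""
    gains = [true_ratings.get(item, 0.0) for item in ranked_items[:n]]
    ideal = sorted(true_ratings.values(), reverse=True)[:n]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical (user, item, rating) records, used only to exercise the split.
data = [("u%d" % (i % 3), "i%d" % i, 1 + i % 5) for i in range(10)]
splits = list(k_fold_splits(data, k=5))

held_out = {"x": 5, "y": 3, "z": 1}
perfect = ndcg_at_n(["x", "y", "z"], held_out, n=5)
reversed_order = ndcg_at_n(["z", "y", "x"], held_out, n=5)
```

A list that ranks the held-out items in order of their true ratings scores 1, and any other ordering scores lower, which is why `reversed_order` comes out below `perfect`.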
Evaluation Findings
• Evaluation time drops rapidly as the number of subgraphs increases.
• For s > 16 (~7,530 users per subgraph), the Item-Item algorithm runs faster than User-User and SVD.
• Execution of the Item-Item & User-User algorithms over the full graph was impossible, while the SVD algorithm could not be executed for s < 4 (~30,123 users per subgraph), due to insufficient memory caused by the way the SVD algorithm works.
Evaluation Findings
• The SVD & Item-Item algorithms appear to have normalized gain, unlike User-User, which performs poorly due to the notably large number of items per subgraph.
• The Item-Item algorithm can predict similar items (based on the ratings), while SVD creates a smaller and denser item space. Better performance!
Evaluation Findings
• Results are comparable to those from Epinions.
• The User-User algorithm still doesn't perform well, but has more stable behavior.
• There is, however, a larger standard deviation in the performance of the User-User algorithm across subgraphs for the different values of s, unlike the Item-Item & SVD algorithms.
Conclusions
Is it possible to create a model that takes into account the social network of the users for creating personalized recommendations in large-scale social networks?
• The performance of the proposed model (CF in subgraphs) is comparable to that of the traditional techniques (CF in full graph).
• In sparse bipartite graphs, the performance of this model may be reduced.
• But, using algorithms such as SVD, we can provide a solution even in the case of sparse bipartite graphs.
• The proposed approach could be utilized to implement a distributed recommender system, minimizing the execution time and producing high quality recommendations.
Future Work
Thank you for your time.