This paper explores the use of the PageRank algorithm to identify key users in social networks. Key users, who have a large impact on the network, are identified based on their connectivity and communication activity. The weighted activity graph and users' centrality scores are used to determine key users. The algorithm is demonstrated and evaluated using a Facebook dataset. The paper concludes that identifying key users can help generate sustainable revenues in social networks.
PageRank: Identifying key users in social networks Student: Ivan Todorović, 3231/2014 Mentor: Prof. Dr Veljko Milutinović
Introduction • Social Networks – Connecting people • Sustainable revenues • Full advertising potential • Key Users • Novel PageRank 2/19
What is a Key User? • Large community • Affects a large number of persons • Unlikely to leave the OSN • Pays for Premium services 3/19
Users’ Connectivity in OSN • Structural characteristics of the network • Well-connected users • Social Graph • Centrality measures • Degree • Closeness • Betweenness 4/19
Users’ Communication Activity • Exchange of information • User interaction • Activity Graph • Strong/Weak connection 5/19
PageRank • An algorithm used by Google • PageRank is a link analysis algorithm • Outputs a probability distribution • Apply to any graph or network • Personalized PageRank is used by Twitter 6/19
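The link-analysis idea on this slide can be sketched as a short power iteration. This is a minimal illustration, not Google's implementation: the function name `pagerank`, the toy graph `toy_web`, and the uniform redistribution of rank from pages without outgoing links are assumptions made for the example.

```python
def pagerank(out_links, d=0.85, tol=1e-10, max_iter=200):
    """Power iteration for PageRank on a small directed graph.

    out_links maps each node to the list of nodes it links to.
    Returns a dict of scores that sum to 1 (a probability distribution).
    """
    nodes = sorted(set(out_links) | {v for vs in out_links.values() for v in vs})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # ASSUMPTION: rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links.get(v))
        new = {v: (1.0 - d) / n + d * dangling / n for v in nodes}
        for j in nodes:
            targets = out_links.get(j, [])
            for i in targets:
                # Node j passes an equal share of its rank to each page it links to.
                new[i] += d * rank[j] / len(targets)
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical three-page web: a links to b and c, b links to c, c links to a.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = pagerank(toy_web)
```

In this toy graph, page c collects links from both a and b, so it ends up with the largest score.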
Novel PageRank • Identify key users • First step • Derive a weighted activity graph • Second step • Determine users’ centrality scores 7/19
Weighted Activity Graph • Users who actually communicate • Graph Links • Informational and Normative influence 8/19
Weighted Activity Graph • Graph representation • Symmetric adjacency matrix • Weight of an undirected activity link • Cij – number of communication activities (i → j) • Cji – number of communication activities (j → i) • Activity Graph • n – Number of users 9/19
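A possible way to build the symmetric adjacency matrix from raw communication records is sketched below. The slide's weight formula did not survive extraction, so the sketch assumes the undirected weight is the sum Cij + Cji. The function name `activity_matrix` and the sample users and posts are hypothetical.

```python
from collections import Counter


def activity_matrix(users, posts):
    """Build a symmetric weighted adjacency matrix from directed posts.

    posts is an iterable of (sender, receiver) pairs.
    ASSUMPTION: the undirected link weight is w_ij = C_ij + C_ji.
    """
    index = {u: k for k, u in enumerate(users)}
    # C_ij: number of communication activities from i to j (self-posts ignored).
    counts = Counter((s, r) for s, r in posts if s != r)
    n = len(users)
    A = [[0] * n for _ in range(n)]
    for (s, r), c in counts.items():
        i, j = index[s], index[r]
        # Add each directed count to both cells so the matrix stays symmetric.
        A[i][j] += c
        A[j][i] += c
    return A


# Hypothetical sample data: three users and four wall posts.
users = ["ann", "bob", "cy"]
posts = [("ann", "bob"), ("ann", "bob"), ("bob", "ann"), ("bob", "cy")]
A = activity_matrix(users, posts)
```

Users who never communicate keep a zero entry, so only active links contribute to the graph.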
Users’ Centrality Scores • PageRank used by Google • N – Total number of web pages • Oj – Number of outgoing links from page j • Bi – Set of web pages pointing to web page i • d – damping factor (usually set to 0.85) • Novel PageRank • Fi – Set of users connected to i 10/19
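With the symbols defined on this slide, the standard PageRank recurrence is PR(i) = (1 − d)/N + d · Σ_{j ∈ Bi} PR(j)/Oj. The slide's formula for the novel variant is not recoverable from the text, so the sketch below assumes a weighted form in which user j passes rank to each connected user i in proportion to the link weight: PR(i) = (1 − d)/n + d · Σ_{j ∈ Fi} PR(j) · w_ij / Σ_{k ∈ Fj} w_jk. The function name `weighted_pagerank` and the example matrix are hypothetical.

```python
def weighted_pagerank(W, d=0.85, tol=1e-12, max_iter=500):
    """Weighted PageRank on a symmetric activity matrix W (list of lists).

    ASSUMPTION: user j distributes its rank over connected users in
    proportion to the activity link weights W[j][i].
    """
    n = len(W)
    strength = [sum(row) for row in W]  # total activity weight of each user
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # ASSUMPTION: users without any activity spread their rank uniformly.
        isolated = sum(rank[j] for j in range(n) if strength[j] == 0)
        new = [(1.0 - d) / n + d * isolated / n] * n
        for j in range(n):
            if strength[j] == 0:
                continue
            for i in range(n):
                if W[j][i]:
                    new[i] += d * rank[j] * W[j][i] / strength[j]
        delta = sum(abs(a - b) for a, b in zip(new, rank))
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical activity matrix for three users (weights = communication counts).
W = [[0, 3, 0],
     [3, 0, 1],
     [0, 1, 0]]
scores = weighted_pagerank(W)
```

In this example the middle user, who communicates with both others, receives the highest centrality score.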
Demonstration and Evaluation • Facebook dataset – New Orleans • Set of users (63,731) • Set of social links (817,090) • Communication activity • 832,277 wall posts • BFS Crawler 11/19
Pros and Cons • Great results • Complexity O(n²) • Social and Activity Graph • Offline contacts • Direction of posts/messages • Privacy risks 13/19
Conclusion • Potential to generate sustainable revenues • Easy to implement • Efficient 14/19
Improvements • Text Mining to detect influence • Scan user messages • Detect positive/negative user response • Use it to form directed activity graph 15/19
Improvements • Example dialogue • A: Hey, check this movie (…) • B: Well, I don’t like comedy movies (detected negative response) • A: Okay, maybe we could watch this one (…) • B: That trailer looks really good (influence confirmed, A → B) 16/19
Improvements • Distributed PageRank algorithm • Monte Carlo approximation • Perform K random walks in parallel • Walk to a random neighbor (probability 1 − ε) • Terminate in current node (probability ε) • After walk termination • Each node computes its PageRank value • Complexity O(log n / ε) 17/19
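The random-walk steps above can be simulated sequentially as a rough sketch. This is not a distributed implementation: the walks run one after another in a single process. It also assumes that each node estimates its PageRank from the number of walk visits it received, scaled by the termination probability over the total number of walks. The function name `monte_carlo_pagerank`, the parameter names, and the sample graph are hypothetical.

```python
import random


def monte_carlo_pagerank(neighbors, walks_per_node=2000, eps=0.15, seed=0):
    """Estimate PageRank by counting visits of terminating random walks.

    neighbors maps every node to a list of adjacent nodes.
    eps is the termination probability at each step.
    ASSUMPTION: estimate(v) = eps * visits(v) / (n * walks_per_node).
    """
    rng = random.Random(seed)
    nodes = list(neighbors)
    visits = dict.fromkeys(nodes, 0)
    for start in nodes:
        for _ in range(walks_per_node):
            current = start
            while True:
                visits[current] += 1
                # Terminate with probability eps, or when there is nowhere to go.
                if rng.random() < eps or not neighbors[current]:
                    break
                # Otherwise move to a uniformly chosen neighbor.
                current = rng.choice(neighbors[current])
    scale = eps / (len(nodes) * walks_per_node)
    return {v: visits[v] * scale for v in nodes}


# Hypothetical undirected path graph: 0 - 1 - 2.
graph = {0: [1], 1: [0, 2], 2: [1]}
estimate = monte_carlo_pagerank(graph)
```

The estimates are random, so they only approximate the exact scores. Increasing `walks_per_node` reduces the variance at the cost of more work.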
Literature • Antonio Caso, Silvia Rossi, “Users Ranking in Online Social Networks to Support POI Selection in Small Groups”, University of Naples • Wikipedia, “PageRank”, http://en.wikipedia.org/wiki/PageRank, December 2014 • Julia Heidemann, Mathias Klier, Florian Probst, “Identifying Key Users in Online Social Networks – PageRank Based Approach”, Research Paper, University of Augsburg, University of Innsbruck 18/19
Thank you for your attention Questions ? 19/19