CS 590 Term Project: Epidemic Model on Facebook. ChoungRyeol LEE, Shubham Agrawal, Ashwin Jiwane
Facebook (partial) Network Source: Facebook ego network, Stanford Network Analysis Project
Data Limitation and Processing • Accessing (and handling) the complete Facebook data is infeasible for us • The analysis is done on a partial dataset obtained from the Stanford Network Analysis Project • The original data is the directed ego-network (without ego) of 10 nodes, which we had to reconstruct, i.e. make it undirected and add the ego-edges
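The reconstruction step can be sketched as follows. This is a minimal illustration, not the project's actual preprocessing code: the function name `rebuild_ego_network` and the toy edge list are hypothetical, and it assumes every alter appears in at least one edge of the input.

```python
def rebuild_ego_network(ego, directed_edges):
    """Return an undirected adjacency dict that includes the ego node.

    ASSUMPTION: every alter shows up in at least one directed edge;
    isolated alters would have to be added separately.
    """
    adj = {ego: set()}
    for u, v in directed_edges:
        if u == v:
            continue  # ignore self-loops
        # Make the graph undirected: u -> v becomes the edge {u, v}.
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Add the ego-edges: the ego is connected to every alter.
    for alter in list(adj):
        if alter != ego:
            adj[alter].add(ego)
            adj[ego].add(alter)
    return adj


# Hypothetical toy input: ego 0 with alters 1, 2, 3.
network = rebuild_ego_network(0, [(1, 2), (2, 1), (2, 3)])
```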
What is an Ego Network? Source: Slides by Giorgos Cheliotis, National University of Singapore
Basic Analysis • Observations: The graph follows the “Small World Phenomenon”, as the average path length is only 3.776, but it is not a “Scale-Free” network, since its degree distribution does not follow a power law
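For reference, an average path length like the one quoted above can be computed with a breadth-first search from every node. The sketch below is one way to do it on an unweighted graph stored as an adjacency dict; the function name and the four-node example are illustrative only, and the slides do not say which tool produced the 3.776 figure.

```python
from collections import deque


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Illustrative path graph 0 - 1 - 2 - 3.
path_graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
apl = average_path_length(path_graph)
```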
Friendship Strength • In FB, possible ways to measure friendship: • Mutual friends • Common biography (location, education, etc.) • Mutual interests (pages, likes, etc.) • Common social groups • Due to the limitations of the data, we considered only Mutual Friends as the weighting measure
Cosine Similarity • Cosine similarity measures the normalized number of common friends • The basic principle is to take the cosine of the angle between two nodes' vectors (rows) from the adjacency matrix • In the study network: • Maximum cosine value = 0.961454 • Minimum cosine value = 0.003408
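For 0/1 adjacency rows, the dot product of two rows is the number of common friends and each row's norm is the square root of the node's degree, so the cosine reduces to common friends divided by the square root of the product of the two degrees. A minimal sketch (function name and toy graph are illustrative):

```python
import math


def cosine_similarity(adj, u, v):
    """Cosine of the angle between the adjacency rows of u and v."""
    common = len(adj[u] & adj[v])                    # dot product of 0/1 rows
    denom = math.sqrt(len(adj[u]) * len(adj[v]))     # product of row norms
    return common / denom if denom else 0.0


# Illustrative graph: node 0 is friends with 1, 2, 3; 1 and 2 are friends.
friends = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
weight_12 = cosine_similarity(friends, 1, 2)  # one common friend, degrees 2 and 2
```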
Epidemic Models • SI and SIR Model: • A susceptible node is infected by an infected neighbor with a certain probability • You repost/share from friends • SI Model: • Once a node is infected, it remains infected • Post remains active on the wall • SIR Model: • Once a node is infected, it remains infected for a certain time period • Post becomes inactive after a certain time period
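The two models can be sketched as a single discrete-time simulation. This is an illustration under stated assumptions, not the project's code: the function name `simulate`, the fixed `infectious_period` for SIR, and the stopping rules are all choices made here.

```python
import random


def simulate(adj, seed_node, p, infectious_period=None, rng=None, max_steps=1000):
    """Discrete-time SI (infectious_period=None) or SIR simulation.

    Returns (history, ever_infected): the number of currently infected
    nodes at each time step, and the set of nodes ever infected.
    ASSUMPTION: in SIR a node recovers after a fixed number of steps.
    """
    rng = rng or random.Random()
    infected = {seed_node: 0}  # node -> time step at which it was infected
    recovered = set()
    history = [1]
    for t in range(1, max_steps + 1):
        newly = set()
        for u in infected:
            for v in adj[u]:
                # Each infected node tries to infect each susceptible neighbor.
                if v not in infected and v not in recovered and rng.random() < p:
                    newly.add(v)
        for v in newly:
            infected[v] = t
        if infectious_period is not None:
            expired = [u for u, t0 in infected.items() if t - t0 >= infectious_period]
            for u in expired:
                del infected[u]
                recovered.add(u)
        history.append(len(infected))
        if infectious_period is None:
            # SI: stop once no infected node has a susceptible neighbor left.
            if all(v in infected for u in infected for v in adj[u]):
                break
        elif not infected:
            # SIR: stop once nobody is infected any more.
            break
    return history, recovered | set(infected)


# Illustrative run on the path graph 0 - 1 - 2 - 3 with certain transmission.
path_graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
si_history, si_reached = simulate(path_graph, 0, p=1.0)
sir_history, sir_reached = simulate(path_graph, 0, p=1.0, infectious_period=1)
```

With `p=1.0` the run is deterministic; for smaller `p`, pass `rng=random.Random(some_seed)` to make runs repeatable.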
Model Simulation • Simulated the epidemic model on the graph • Pre-infected a particular node • Compared the results for seed nodes chosen by different importance measures • Checked the time steps required for a complete cascade in the SI model • Checked the time steps required to reach a stable condition in the SIR model
Model Simulation • Model assumptions: • Probability of infection • Discrete time intervals • Assumed two scenarios for the infection probability: • Function of weight • Similar to ‘Top News’ posts • Independent of weight • Similar to ‘Most Recent’ posts
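The two probability scenarios could be expressed as below. The slides do not give the functional form used for the weighted case, so the linear scaling `base_p * weight` is purely an assumption for illustration, as are the function names and the `weights` dictionary keyed by edge.

```python
import random


def edge_probability(u, v, weights, base_p, weighted):
    """Infection probability along the edge (u, v)."""
    if not weighted:
        return base_p  # 'Most Recent' scenario: independent of weight
    w = weights.get((u, v), weights.get((v, u), 0.0))
    # ASSUMPTION: probability scales linearly with the edge weight.
    return min(1.0, base_p * w)  # 'Top News' scenario: function of weight


def spread_step(adj, weights, infected, base_p, weighted, rng):
    """Return the set of nodes newly infected in one time step."""
    newly = set()
    for u in infected:
        for v in adj[u]:
            if v not in infected:
                if rng.random() < edge_probability(u, v, weights, base_p, weighted):
                    newly.add(v)
    return newly


# Illustrative example: node 0 has a strong tie to 1 and a zero-weight tie to 2.
adj = {0: {1, 2}, 1: {0}, 2: {0}}
weights = {(0, 1): 1.0, (0, 2): 0.0}
new_weighted = spread_step(adj, weights, {0}, 1.0, True, random.Random(0))
new_unweighted = spread_step(adj, weights, {0}, 1.0, False, random.Random(0))
```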
SIR Interpretation • Unweighted graph (p=0.2): • Degree: steepest curve, infects fewer people • Eigenvector: steep curve, infects most people • PageRank: grows slowest, infects more people • Weighted graph (p=0.2): • Degree: steepest curve, infects fewer people • Eigenvector: grows slowest, infects most people • PageRank: grows slowly, infects more people, better than eigenvector due to weights
Currently working on... • Quarantine Strategy: • Choose the nodes to quarantine at a certain time interval such that they don’t affect others • Account blocked (reported as spam) • Vaccination Strategy: • Choose the nodes to vaccinate, i.e. make them safe from a certain virus, such that the epidemic doesn’t flow through them • Spam filter • Objective is to minimize the cost of prevention and/or precaution with the aim of ‘curing’ the epidemic
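As a rough illustration of the vaccination idea, the sketch below removes chosen nodes from the graph and compares how many nodes a seed can still reach (the worst-case outbreak when transmission is certain). Picking the highest-degree nodes is only one possible heuristic assumed here; the slides do not state how the nodes are to be chosen, and all function names are hypothetical.

```python
from collections import deque


def vaccinate(adj, vaccinated):
    """Return a copy of the graph with the vaccinated nodes removed."""
    blocked = set(vaccinated)
    return {u: {v for v in nbrs if v not in blocked}
            for u, nbrs in adj.items() if u not in blocked}


def reachable(adj, seed_node):
    """Nodes the epidemic could reach from seed_node if every contact infects."""
    seen = {seed_node}
    queue = deque([seed_node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def top_degree(adj, k, exclude=()):
    """ASSUMED heuristic: vaccinate the k highest-degree nodes."""
    candidates = [u for u in adj if u not in exclude]
    return sorted(candidates, key=lambda u: len(adj[u]), reverse=True)[:k]


# Illustrative graph in which node 1 is the hub.
graph = {0: {1}, 1: {0, 2, 3}, 2: {1}, 3: {1, 4}, 4: {3}}
chosen = top_degree(graph, 1, exclude={0})
before = len(reachable(graph, 0))
after = len(reachable(vaccinate(graph, chosen), 0))
```

The number of vaccinated nodes `k` stands in for the prevention cost, so the trade-off in the objective is between `k` and the remaining outbreak size.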
Communities in Facebook network • Held together by common interests and ideas shared by a large group of people on Facebook • Any one person may be part of many communities, which can overlap and be nested • Groups within social networks may correspond closely to social units or communities in reality
Review of Community Detection • Two methods for discovering groups in networks • Graph partitioning • Splits the graph into a pre-fixed number of parts by minimizing the number of “cut edges” • High computational load (NP-hard) • Community structure detection • Suitable for the structure of large-scale network data • Provides information on the topology of the network
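To make the “cut edge” objective concrete, the sketch below counts the edges crossing a two-way split and finds the best balanced bisection by trying every candidate. The exhaustive search is only feasible for tiny graphs, which is exactly the computational-load problem noted above; the function names and the example graph are illustrative.

```python
from itertools import combinations


def cut_size(edges, part_a):
    """Number of edges with exactly one endpoint in part_a."""
    return sum((u in part_a) != (v in part_a) for u, v in edges)


def best_bisection(nodes, edges):
    """Exhaustively search all balanced two-way splits (tiny graphs only)."""
    nodes = sorted(nodes)
    half = len(nodes) // 2
    best = None
    for subset in combinations(nodes, half):
        size = cut_size(edges, set(subset))
        if best is None or size < best[0]:
            best = (size, set(subset))
    return best


# Illustrative graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cut, part = best_bisection(range(6), edges)
```

The number of candidate splits grows combinatorially with the number of nodes, so practical partitioners rely on heuristics rather than this brute-force search.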