A Study of social influence in diffusion of innovation over Facebook. Shaomei Wu sw475@cornell.edu Information Science Cornell University Information Science Breakfast, Dec 5, 2008. Diffusion of Innovation.
A Study of social influence in diffusion of innovation over Facebook
Shaomei Wu, sw475@cornell.edu
Information Science, Cornell University
Information Science Breakfast, Dec 5, 2008
Diffusion of Innovation
“Diffusion is the process in which an innovation is communicated through certain channels over time among the members of a social system.” (Everett M. Rogers *)
• “innovation”: Friendship Quiz, a Facebook application
• “communicated”: invitations among Facebook friends
• “time”: September 25, 2008 to now
• “social system”: Facebook
* Rogers, Everett M. (2003). Diffusion of Innovations, 5th ed. New York, NY: Free Press, pp 5-6.
Basic Diffusion Models
Threshold Model ⇔ Cascade Model
Statistically Equivalent *
* David Kempe, Jon Kleinberg, Éva Tardos. Maximizing the Spread of Influence through a Social Network. KDD, 2003.
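Cascade Model
• Each recommendation will succeed with a certain probability.
[Figure: example network of individuals a through l. Legend: non-adopter, adopter, social link, recommendation. Recommendation edges are labeled with success probabilities p_ab, p_ac, p_ad, p_ae, p_af, p_ag, p_di, p_dj, p_gk, p_gl.]
Question: how to estimate p_uv?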
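To make the cascade model concrete, here is a minimal Python sketch of one simulation run. It assumes the standard independent-cascade reading of the slide: each new adopter gets a single chance to recommend to each neighbor, and the recommendation from u to v succeeds with probability p_uv. The function name `simulate_cascade` and every probability value are hypothetical; only the edge names come from the figure.

```python
import random
from collections import deque


def simulate_cascade(p, seeds, rng):
    """Run one independent-cascade simulation.

    p maps a directed pair (u, v) to the probability p_uv that a
    recommendation from u to v succeeds. Each adopter gets exactly one
    chance to recommend to each of its neighbors.
    """
    # Build an adjacency list from the edge-probability map.
    neighbors = {}
    for (u, v) in p:
        neighbors.setdefault(u, []).append(v)

    adopters = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()
        for v in neighbors.get(u, []):
            if v in adopters:
                continue
            # The recommendation u -> v succeeds with probability p_uv.
            if rng.random() < p[(u, v)]:
                adopters.add(v)
                frontier.append(v)
    return adopters


# Hypothetical probabilities (all 0.5) on the edges named in the figure.
p = {
    ("a", "b"): 0.5, ("a", "c"): 0.5, ("a", "d"): 0.5, ("a", "e"): 0.5,
    ("a", "f"): 0.5, ("a", "g"): 0.5, ("d", "i"): 0.5, ("d", "j"): 0.5,
    ("g", "k"): 0.5, ("g", "l"): 0.5,
}
adopters = simulate_cascade(p, ["a"], random.Random(0))
print(sorted(adopters))
```

Averaging the adopter count over many runs gives an estimate of the expected spread for a given set of p_uv values, which is why the quality of those estimates matters.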
Question: how to estimate p_uv?
• Current practice
  • Constant [1]
  • Based on ONLY network structure (e.g., in/out-degree) [2]
Do individuals and the social relationship among them matter?
[1] Jure Leskovec, Mary McGlohon, Christos Faloutsos, Natalie Glance, Matthew Hurst. Cascading Behavior in Large Blog Graphs. SDM 2007.
[2] Jure Leskovec, Lada Adamic, Bernardo Huberman. The Dynamics of Viral Marketing. ACM Conference on Electronic Commerce (EC) 2006.
Theories from Empirical Diffusion Research:
• Opinion leaders: those who have “greater exposure to mass media than their followers”, “are more cosmopolite”, “have greater social participation”, “have higher socioeconomic status”, and “are more innovative” [Rogers 2003, pp 316-318].
• The importance of heterophily between participants on certain attributes (i.e., education and socioeconomic status) in determining the efficiency of diffusion, despite the fact that “more effective communication occurs when two or more individuals are homophilous” [Rogers 2003, p 19].
This project is to…
• Model the p_uv values for the cascade model
• Identify the most influential factors in determining p_uv
• Predict the success of contagion
• Exploit Facebook data
  • A real-world, ongoing diffusion instance;
  • Rich and (most of the time) trustworthy profile information of individuals and their social connections/activities;
  • Precisely timestamped diffusion process, with a complete log of events.
Status
• Launched: Sep 25, 2008.
• Data used so far runs through Nov 25, 2008.
• 216 adopters
• 375 individuals
• 737 edges between 266 pairs of people
• 90 successful infections
• 178 failed infections
• Network Evolution (in the first month after release)
Predict the success of invitation with SVM
• A binary classifier:
  • each invitation is either successful or failed.
• Features
  • Individual features
  • Pair features (homophily/heterophily)
Individual Features
• # of events attended/invited
• # of photos tagged
• # of wall posts
• # of networks
• # of groups participated
• # of notes
• Religion
• Political View
• Gender
• Age
• Culture Background
• Relationship Status
• Work Info
• Education Info
Feature categories: Social Activeness, Innovativeness, Socioeconomics, Education
Pair-wise Features
• Age difference
• Same gender?
• Same political view?
• Same religion?
• Same culture background?
• # of same networks
• # of photos both tagged
• # of groups both participated
• # of events both attended
• Same education level?
• Same high school?
• Same college?
• Same workplace?
• Same current city?
Feature categories: Biological traits, Belief, Socioeconomics, Proximity
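As an illustration of how a few of the pair-wise features above could be derived from two profiles, here is a minimal Python sketch. The profile layout (plain dicts) and every field name are hypothetical, and treating a missing value as "not the same" is an assumption rather than something the slides specify.

```python
def pair_features(sender, receiver):
    """Compute a few pair-wise features for one invitation.

    Profiles are plain dicts; all field names here are hypothetical.
    """
    def same(field):
        a, b = sender.get(field), receiver.get(field)
        # Assumption: a missing value counts as "not the same".
        return int(a is not None and a == b)

    def common(field):
        # Number of items (networks, groups, ...) shared by both people.
        return len(set(sender.get(field, [])) & set(receiver.get(field, [])))

    return {
        "age_difference": abs(sender["age"] - receiver["age"]),
        "same_gender": same("gender"),
        "same_political_view": same("political_view"),
        "same_religion": same("religion"),
        "same_current_city": same("current_city"),
        "common_network_count": common("networks"),
        "common_group_count": common("groups"),
    }


# Two made-up profiles, for illustration only.
sender = {"age": 24, "gender": "female", "current_city": "Ithaca",
          "networks": ["Cornell"], "groups": ["g1", "g2", "g3"]}
receiver = {"age": 27, "gender": "male", "current_city": "Ithaca",
            "networks": ["Cornell", "NYC"], "groups": ["g2", "g3"]}
features = pair_features(sender, receiver)
print(features)
```

The yes/no questions become 0/1 indicators and the "# of" items become counts, so each invitation yields a fixed-length numeric vector.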
Training Data
• Each invitation is a training example for machine learning.
* All numerical features are normalized across examples.
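The slide does not say which normalization was used. As one possible reading, the sketch below applies min-max scaling to each numeric column across all examples; the choice of min-max (rather than, say, z-scores) and the function name `normalize_columns` are assumptions.

```python
def normalize_columns(rows):
    """Min-max scale each numeric column to [0, 1] across all examples."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column carries no information; map it to 0.0.
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Three made-up examples with three numeric features each.
rows = [[10, 2, 1], [30, 4, 1], [20, 8, 1]]
scaled = normalize_columns(rows)
print(scaled)
```

Scaling matters for a margin-based classifier such as an SVM, because otherwise a feature with a large numeric range (for example, a wall post count) can dominate features with small ranges.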
AdaBoost (with DecisionStump)
A popular way to do feature selection.
• Selected features
  • sender wall post count
  • sender group count
  • sender network count
  • receiver age
  • receiver group count
  • sender & receiver common group count
• Performance (10-fold cross-validation)
  • Accuracy: 83.6%
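Why boosting over decision stumps doubles as feature selection can be seen in a small pure-Python sketch: each round picks the single feature and threshold with the lowest weighted error, so the features that appear in the final ensemble form the selected subset. This is a generic illustration of the technique on made-up data, not the setup or toolkit used in the study; all function names are hypothetical.

```python
import math


def best_stump(X, y, w):
    """Find the single-feature threshold rule with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] > t else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best


def adaboost(X, y, rounds):
    """Train on labels in {+1, -1}; return (alpha, feature, threshold, sign) stumps."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, t, sign = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, t, sign))
        # Raise the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * (sign if row[j] > t else -sign))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Weighted vote of all stumps."""
    score = sum(a * (s if row[j] > t else -s) for a, j, t, s in model)
    return 1 if score >= 0 else -1


def selected_features(model):
    """Indices of features used by at least one stump."""
    return sorted({j for _, j, _, _ in model})


# Made-up data: feature 0 separates the classes, feature 1 is noise.
X = [[1, 5], [2, 1], [3, 4], [6, 2], [7, 5], [8, 1]]
y = [-1, -1, -1, 1, 1, 1]
model = adaboost(X, y, rounds=3)
print(selected_features(model))
```

On this toy data only the informative feature is ever chosen, which mirrors how the slide's short list of selected features would fall out of a much larger feature set.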
SVM performance • SVM-light (10-fold cross-validation)
Result
• SVM-light performance
  • 209 records split into 5 folds: 4 for training, 1 for testing.
  • Performance on the testing set:
    • Accuracy: 71.43% (30 correct, 12 incorrect, 42 total)
    • Precision/recall: 55.56%/38.46%
• Feature weights distribution
  Top weighted features:
  • 8, sender_events_invited
  • 4, sender_friend_count
  • 11, sender_gender
  • 35, receiver_is_It's Complicated
  • 5, sender_wall_post_count
  • 9, sender_note_count
  • 27, sender_is_In a Relationship
So, the story can be: when a sender who has been invited to a greater number of Facebook events, has more friends, has written more Facebook notes (blog entries), is female, has fewer wall posts, and is in a relationship tries to infect a person whose relationship status is “It’s Complicated”, the infection is more likely to happen than in other cases.
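The accuracy, precision, and recall reported for the 42 test examples can be cross-checked with a short Python sketch. The slide gives only the percentages and the 30/12/42 counts; the confusion-matrix counts below (5 true positives, 4 false positives, 8 false negatives, 25 true negatives) are inferred here as the combination consistent with all three reported figures, and are not stated in the source.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Counts inferred from the reported percentages, not given on the slide.
metrics = classification_metrics(tp=5, fp=4, fn=8, tn=25)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
# accuracy: 71.43%
# precision: 55.56%
# recall: 38.46%
```

Under these inferred counts, 29 of the 42 test invitations failed, so always predicting "failed" would already score about 69% accuracy; the precision and recall figures are therefore the more informative ones.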
Background
• Diffusion of Innovation
• Question:
  • How does it work in large online social networks?
  • What are the key factors in determining the success of infection?
  • Can we predict the propagation path?
Hypothesis
• Social influence depends on 5 dimensions of similarity:
  • geographical distance: current location (country/state/city), current school, current major, class year, current workplace, current courses enrolled;
  • background similarity: sex, sexual preference, dating interest, relationship interest, relationship status, birthday, political view, religious view, hometown address, previous school, previous workplace;
  • social similarity: number of mutual networks they belong to, number of mutual friends;
  • interest similarity: activities, favorite books, favorite music, favorite movies, favorite TV shows, favorite quotes;
  • social status distance: difference in numbers of friends, difference in wall post counts, difference in counts of messages sent and received, difference in counts of notes.
Project Description
• Objectives
  • Identify the key factors for social influence;
  • Predict the occurrence of adoption based on the key factors.
• Friendship Quiz
  • A Facebook application we developed;
  • Enables users to make quizzes and send them to their friends (take a peek!);
  • We track the spread of the application.
Highlights
• A real-world diffusion of innovation;
• Rich and (most of the time) trustworthy profile information of individuals and their social connections/activities;
• Precisely timestamped diffusion process, with a complete log of events;
• Ongoing diffusion process.