A glimpse on social influence and link prediction in OSNs. Workshop on Data Driven Dynamical Networks. Speaker: Luca Maria Aiello, PhD student, Università degli Studi di Torino, Computer Science Department, aiello@di.unito.it.
A glimpse on social influence and link prediction in OSNs • Workshop on Data Driven Dynamical Networks • Speaker: Luca Maria Aiello, PhD student • Università degli Studi di Torino • Computer Science Department • aiello@di.unito.it • Keywords: link creation, link prediction, homophily, social influence, aNobii
Acknowledgments • People: • Giancarlo Ruffo, Rossano Schifanella (Università degli Studi di Torino) • Alain Barrat, Ciro Cattuto (ISI Foundation) • Filippo Menczer (School of Informatics and Computing, Indiana University)
Dynamics leading to link creation • Food networks, collaboration networks, social media • Several theories from sociology: self-interest, mutual-interest, exchange, contagion (influence), balance, homophily, proximity • 2nd part: exploit the observations on these phenomena to predict future links Les Houches 2010 - Luca Maria Aiello, Università degli Studi di Torino
Outline • Dataset • Topical overlap • Homophily and influence • Link prediction • Conclusions
Social network for bookworms: data-driven analysis on anobii.com • Profile features: library and wishlist, groups, tags • Social network: directed, friendship + neighborhood • 6 snapshots, 15 days apart • Full giant connected component
Basic statistics [Figure: n_g(k_out), n_b(k_out) and n_w(k_out) as functions of k_out; both axes span 10^0 to 10^3] • Broad distributions • Positive correlations between connectivity and activity • Assortativity
Triadic closure • Classification of new links at time t+1 between nodes already present at time t (t ∈ {1,…,5}) [Figure: link categories Double closure, Closure, Bidirectional, Direct, Reciprocated; percentages shown: 75%, 20%, 30%, 25%, 10%] • Reciprocation is strong (exchange) • Users tend to choose “friends of their friends” as new friends (balance)
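A classification of this kind can be sketched in a few lines. This is only an illustration, assuming each snapshot is a set of directed (source, target) pairs; the function name `classify_new_links`, the toy graph, and the exact category definitions (including the fallback label "direct") are assumptions of this sketch, not the definitions used in the talk. "Double closure" is left out because the slide does not define it.

```python
from collections import defaultdict

def classify_new_links(edges_t, edges_t1):
    """Label each link created between two snapshots.

    edges_t, edges_t1: sets of directed (source, target) pairs.
    Only links between nodes already present at time t are considered.
    """
    nodes_t = {n for edge in edges_t for n in edge}
    successors = defaultdict(set)
    for u, v in edges_t:
        successors[u].add(v)

    new_links = {
        (u, v) for (u, v) in edges_t1 - edges_t
        if u in nodes_t and v in nodes_t
    }
    labels = {}
    for u, v in new_links:
        tags = set()
        if (v, u) in edges_t:
            tags.add("reciprocated")   # answers a link that existed at time t
        if (v, u) in new_links:
            tags.add("bidirectional")  # both directions appear in the same step
        if any(v in successors[w] for w in successors[u]):
            tags.add("closure")        # a path u -> w -> v existed at time t
        # Assumption: a link matching none of the above is called "direct".
        labels[(u, v)] = tags or {"direct"}
    return labels

# Hypothetical toy snapshots.
edges_t = {("a", "b"), ("b", "c"), ("d", "a")}
edges_t1 = edges_t | {("a", "c"), ("a", "d"), ("c", "d"), ("d", "c"), ("b", "d")}
labels = classify_new_links(edges_t, edges_t1)
```

Counting how often each tag occurs over all new links gives the kind of percentages reported on the slide.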
Profile similarity vs. social distance • Does similarity between user profiles depend on the social distance? • Topical overlap • Statistical correlation because of assortative biases? • Null model to discern real overlap from purely statistical effects: it contains no topical overlap other than that caused by statistical mixing patterns
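The comparison between observed overlap and a null model can be illustrated as follows. This is a minimal sketch, assuming an undirected graph stored as a dict of neighbor sets and profiles stored as item sets. The profile-shuffling null model used here is a simple stand-in chosen for the example, not necessarily the null model used in the talk, and all names (`similarity_by_distance`, `shuffled_profiles`) are hypothetical.

```python
import math
import random
from collections import defaultdict, deque

def cosine(a, b):
    """Cosine similarity between two item sets (binary vectors)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def distances_from(source, neighbors):
    """Breadth-first search distances from source on an undirected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def similarity_by_distance(neighbors, profiles):
    """Average profile similarity of node pairs, grouped by graph distance."""
    sums, counts = defaultdict(float), defaultdict(int)
    for u in neighbors:
        for v, d in distances_from(u, neighbors).items():
            if u < v:  # count each unordered pair once, skip u == v
                sums[d] += cosine(profiles[u], profiles[v])
                counts[d] += 1
    return {d: sums[d] / counts[d] for d in sorted(counts)}

def shuffled_profiles(profiles, rng):
    """Null model (assumption): reassign existing profiles to users at random."""
    users = list(profiles)
    permuted = users[:]
    rng.shuffle(permuted)
    return {u: profiles[p] for u, p in zip(users, permuted)}

# Hypothetical toy path graph a - b - c - d whose libraries drift along the path.
neighbors = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
profiles = {"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {3, 4, 5}, "d": {4, 5, 6}}

observed = similarity_by_distance(neighbors, profiles)
null = similarity_by_distance(
    neighbors, shuffled_profiles(profiles, random.Random(0))
)
```

In the toy graph the observed similarity decays with distance; comparing the `observed` curve with `null` averaged over many shuffles shows how much of that decay survives once the link between profile and position is destroyed.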
Geographical overlap • Null model test with random link rewiring • Country-level overlap due to language barriers • City-level overlap SocialCom 2010 - Luca Maria Aiello, Università degli Studi di Torino
Causality between similarity and link creation • What is the cause of topical overlap? • Topical overlap is observed for all profile features • Three possible explanations: • Homophily (people connect with similar people) • Social influence (social connection conveys similarity) • Mixture of the two • Explore the causality relationship between profile similarity and social linking
Similarity → link creation (homophily) • Average similarity of pairs forming new links between t and t+1 (t=4), compared with the average similarity of all pairs at distance 2 at time t • Pairs that are going to get connected show a substantially higher similarity
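The comparison described on this slide can be sketched as below. It is a minimal illustration, assuming an undirected graph at time t (dict of neighbor sets), set-valued profiles and cosine similarity; the helper names and the toy data are hypothetical.

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two item sets (binary vectors)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def pairs_at_distance_two(neighbors):
    """Unordered, non-adjacent pairs that share at least one neighbor."""
    pairs = set()
    for shared_neighbor_set in neighbors.values():
        for u, v in combinations(sorted(shared_neighbor_set), 2):
            if v not in neighbors[u]:
                pairs.add((u, v))
    return pairs

def mean_similarity(pairs, profiles):
    """Average cosine similarity over a collection of node pairs."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(cosine(profiles[u], profiles[v]) for u, v in pairs) / len(pairs)

# Hypothetical snapshot at time t: a star centered on b.
neighbors_t = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b"}, "d": {"b"}}
profiles_t = {"a": {1, 2, 3}, "b": {1, 5}, "c": {1, 2, 4}, "d": {7, 8, 9}}
new_links = {("a", "c")}  # pairs that connect between t and t+1

baseline = mean_similarity(pairs_at_distance_two(neighbors_t), profiles_t)
future = mean_similarity(new_links, profiles_t)
```

Here `future` exceeding `baseline` corresponds to the homophily signal: similarity is measured at time t, before the link exists.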
Link creation → similarity (influence) [Figure panels: Groups, Books] • Evolution of the similarity between pairs linking together at different times
Summary • Theories to explain link creation • Self-interest • Mutual-interest • Exchange: reciprocity in linking • Contagion: social influence • Balance: triangle closure • Homophily: for all profile features • Proximity: geographical and on the social graph • Can we exploit the observations on these phenomena to predict future links?
Link prediction • Snapshots at time t and t+1 • Predict links created between t and t+1 given the whole information at time t • Supervised learning approach to combine profile and structural features [Figure: learning set example]
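One way to assemble a learning set from two snapshots is sketched below. It assumes directed links stored as sets of (source, target) pairs, labels a pair 1 if it is unlinked at t and linked at t+1, and 0 if it is unlinked in both snapshots. The function name `build_learning_set`, the toy data and the negative-sampling scheme are assumptions of this sketch; the sampling procedure actually used in the talk is not specified on the slide.

```python
import random
from itertools import permutations

def build_learning_set(nodes_t, edges_t, edges_t1, rng, balanced=True):
    """Return a list of ((u, v), label) examples for link prediction.

    Positives: pairs of time-t nodes linked at t+1 but not at t.
    Negatives: ordered pairs of time-t nodes linked in neither snapshot.
    With balanced=True, negatives are down-sampled to match the positives.
    """
    positives = sorted(
        (u, v) for (u, v) in edges_t1 - edges_t
        if u in nodes_t and v in nodes_t
    )
    # Enumerating all ordered pairs is quadratic: fine for a toy graph only.
    negatives = sorted(
        pair for pair in permutations(sorted(nodes_t), 2)
        if pair not in edges_t and pair not in edges_t1
    )
    if balanced:
        negatives = rng.sample(negatives, min(len(positives), len(negatives)))
    return [(p, 1) for p in positives] + [(n, 0) for n in negatives]

# Hypothetical toy snapshots.
nodes_t = {"a", "b", "c", "d"}
edges_t = {("a", "b"), ("b", "c")}
edges_t1 = edges_t | {("a", "c"), ("c", "a")}

balanced_set = build_learning_set(nodes_t, edges_t, edges_t1, random.Random(0))
full_set = build_learning_set(
    nodes_t, edges_t, edges_t1, random.Random(0), balanced=False
)
```

Each example would then be turned into a feature vector computed only from information available at time t, so that the classifier never sees the future snapshot.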
Features • Structural: common neighbors, distance on graph, preferential attachment, resource allocation, local path • Profile: library (cosine), groups (cosine), groups (size), gender {0,1}, town {0,1}, age (|age1 – age2|), country {0,1}, vocabulary (cosine), wishlists (cosine), tagging behavior
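A few of these features can be computed as in the sketch below, which uses the standard textbook definitions: common neighbors |Γ(u) ∩ Γ(v)|, preferential attachment |Γ(u)|·|Γ(v)|, resource allocation Σ 1/|Γ(w)| over common neighbors w, and cosine similarity on item sets. For simplicity the graph is treated as undirected, which is an assumption of this example (the aNobii network is directed); the function `feature_vector` and the toy data are hypothetical.

```python
import math

def common_neighbors(u, v, nbrs):
    """Number of neighbors shared by u and v."""
    return len(nbrs[u] & nbrs[v])

def preferential_attachment(u, v, nbrs):
    """Product of the two degrees."""
    return len(nbrs[u]) * len(nbrs[v])

def resource_allocation(u, v, nbrs):
    """Sum of 1/degree over the common neighbors of u and v."""
    return sum(1.0 / len(nbrs[w]) for w in nbrs[u] & nbrs[v])

def cosine(a, b):
    """Cosine similarity between two item sets (binary vectors)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def feature_vector(u, v, nbrs, libraries, towns):
    """Combine structural and profile features for one candidate pair."""
    return {
        "common_neighbors": common_neighbors(u, v, nbrs),
        "preferential_attachment": preferential_attachment(u, v, nbrs),
        "resource_allocation": resource_allocation(u, v, nbrs),
        "library_cosine": cosine(libraries[u], libraries[v]),
        "same_town": int(towns[u] == towns[v]),
    }

# Hypothetical toy data.
nbrs = {
    "a": {"c", "d"}, "b": {"c", "d"},
    "c": {"a", "b"}, "d": {"a", "b", "e"}, "e": {"d"},
}
libraries = {"a": {1, 2}, "b": {2, 3}, "c": set(), "d": {4}, "e": {5}}
towns = {"a": "Torino", "b": "Torino", "c": "Roma", "d": "Milano", "e": "Roma"}

features = feature_vector("a", "b", nbrs, libraries, towns)
```

Resource allocation down-weights common neighbors with many connections, so a shared low-degree friend counts for more than a shared hub.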
Link prediction: preliminary results • Rotation forest, 10-fold cross-validation, balanced sets • Rotation forest, 10-fold cross-validation, unbalanced sets
Conclusions and future work • Theories on social network growth are verified • Causality between similarity and social connection • Effective link detection/prediction • Topical information seems to be predictive as well as structural information • RFC: • Link prediction sampling/evaluation procedure • New challenges in prediction
Workshop on Data Driven Dynamical Networks. Thank you for your attention! Speaker: Luca Maria Aiello, aiello@di.unito.it, www.di.unito.it/~aiello. Reference: L. M. Aiello, A. Barrat, C. Cattuto, G. Ruffo, R. Schifanella, "Link creation and profile alignment in the aNobii social network", in SocialCom'10: Proceedings of the 2nd IEEE International Conference on Social Computing, Minneapolis, MN, USA, August 2010.