Link Prediction and Recommendation across Multiple Heterogeneous Networks. Yuxiao Dong*$, Jie Tang$, Sen Wu$, Jilei Tian#, Nitesh V. Chawla*, Jinghai Rao#, Huanhuan Cao#. *University of Notre Dame, $Tsinghua University, #Nokia Research China.
Link Prediction and Recommendation across Multiple Heterogeneous Networks Yuxiao Dong*$, Jie Tang$, Sen Wu$, Jilei Tian#, Nitesh V. Chawla*, Jinghai Rao#, Huanhuan Cao# *University of Notre Dame, $Tsinghua University, #Nokia Research China. Yuxiao Dong, Jie Tang, Sen Wu, Jilei Tian, Nitesh V. Chawla, Jinghai Rao, Huanhuan Cao. Link Prediction and Recommendation across Heterogeneous Social Networks. In IEEE ICDM'12.
Introduction Link prediction and recommendation is ubiquitous …
Topic: Transfer Link Prediction Framework [Figure: framework diagram with panels labeled General Features, Source network, Target network, and Transfer Model]
Link Prediction and Recommendation What is link prediction? • G = (V, E): social network • v_s: a particular user • C: candidates for v_s • Y: candidates’ rank • Input: G, v_s, C • Output: f: (G, v_s, C) → Y
Link Prediction and Recommendation Basic Idea [Figure: example graph with unknown candidate link labels y12, y13, y14]
Ranking Factor Graph Model (RFG) [Figure: factor graph with latent variables, attribute factors, and social factors] Joint distribution: defined over attribute factors and social factors
Ranking Factor Graph Model • Model Initialization • Joint distribution: attribute factors and social factors • Exponential-linear functions to initialize factors • Attribute factor: • Social factor:
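As a hedged sketch of what exponential-linear factors generally look like in a factor graph model, the block below writes out one generic log-linear form. The symbols \alpha, \beta, \Phi, \Psi, Z and the clique notation Y_c are assumptions chosen for illustration, not the authors' notation; the exact feature functions used in RFG are given in the paper.

```latex
% Assumed notation: x_i = attributes of candidate v_i, y_i = its rank label,
% Y_c = labels of a group of related candidates, Z = normalization constant.
\begin{align}
  f(x_i, y_i) &= \exp\{\alpha^{\top} \Phi(x_i, y_i)\}
      && \text{(attribute factor)} \\
  g(Y_c) &= \exp\{\beta^{\top} \Psi(Y_c)\}
      && \text{(social factor)} \\
  P(Y \mid X, G) &= \frac{1}{Z} \prod_{i} f(x_i, y_i) \prod_{c} g(Y_c)
      && \text{(joint distribution)} \\
  \log P(Y \mid X, G) &= \sum_{i} \alpha^{\top} \Phi(x_i, y_i)
      + \sum_{c} \beta^{\top} \Psi(Y_c) - \log Z
\end{align}
```

Under this assumed form, learning amounts to choosing the weights \alpha and \beta that maximize the log-likelihood in the last line.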
Ranking Factor Graph Model • Objective function and model learning • RFG objective function: • Learning[1]: 1. Wenbin Tang, Honglei Zhuang, Jie Tang. Learning to infer social ties in large networks. In ECML/PKDD'11, pp 381-397.
Still Problems? • Challenges in the traditional link prediction problem • Unbalanced data: the number of potential candidates grows exponentially, as d(v_s)^{n-1}, as the number of hops n increases. • Little training data: obtaining sufficient training data is difficult.
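To make the candidate growth concrete, here is a minimal sketch that counts how many nodes are first reached at each hop from a source user. The function name `nodes_by_hop` and the toy tree (every node has three children) are hypothetical and are not taken from the slides.

```python
from collections import deque

def nodes_by_hop(adj, source, max_hops):
    """Return {hop: set of nodes first reached at exactly that hop}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the hop limit
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    layers = {h: set() for h in range(1, max_hops + 1)}
    for node, d in dist.items():
        if d >= 1:
            layers[d].add(node)
    return layers

# Hypothetical toy graph: a tree rooted at node 0, three children per node.
adj = {0: [1, 2, 3]}
next_id = 4
for parent in (1, 2, 3):
    adj[parent] = [0] + list(range(next_id, next_id + 3))
    for child in adj[parent][1:]:
        adj[child] = [parent]
    next_id += 3

layers = nodes_by_hop(adj, 0, 2)
candidate_counts = {hop: len(nodes) for hop, nodes in layers.items()}
```

In this toy tree the count triples with each extra hop, which is why positive examples become a small minority among candidates.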
Link Prediction Framework [Figure: framework diagram with panels labeled Features and Model]
Transfer Link Prediction Framework [Figure: framework diagram with panels labeled General Features, Source network, Target network, and Transfer Model]
General social factors • How do we create social connections? • What are the general factors driving people to make friends in the real world and form links in social networks? • Homophily • Social Balance • Preferential Triad Closure
General social factors Homophily/ Social Balance / Preferential Triad Closure • The principle of homophily suggests that users with similar characteristics tend to associate with each other. 1. The likelihood of two users creating a link increases when the number of their common neighbors increases in the four networks. 2. This effect of homophily is more pronounced when the number reaches 100, where the probabilities are all higher than 50% in the four networks.
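The homophily observation above can be checked on any network snapshot with a short script. This is a minimal sketch, assuming an undirected graph stored as a dict of neighbor sets plus a list of links that formed later; the function name and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def link_prob_by_common_neighbors(adj, new_links):
    """For each common-neighbor count k, return the fraction of currently
    unlinked node pairs with k common neighbors that appear in new_links."""
    formed = {frozenset(edge) for edge in new_links}
    totals = defaultdict(int)
    hits = defaultdict(int)
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, so not a candidate pair
        k = len(adj[u] & adj[v])
        totals[k] += 1
        if frozenset((u, v)) in formed:
            hits[k] += 1
    return {k: hits[k] / totals[k] for k in totals}

# Hypothetical snapshot and the links that formed afterwards.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"d"},
}
new_links = [("a", "b")]
probs = link_prob_by_common_neighbors(adj, new_links)
```

Plotting `probs` against k for each network would give a curve of the kind the slide describes.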
General social factors Homophily / Social Balance / Preferential Triad Closure • Social balance theory is based on the principles that “the friend of my friend is my friend” and “the enemy of my enemy is my friend”. Users are more likely (more than 80% likelihood) to establish balanced triangles of friendship in all four online networks.
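A minimal sketch of the classical signed-triad test for balance follows: a triangle is balanced when the product of its three edge signs is positive. This is the textbook definition, not necessarily the one the authors apply to unsigned networks; the function name and the toy signs are hypothetical.

```python
from itertools import combinations

def balanced_fraction(signs):
    """signs maps frozenset({u, v}) to +1 (friend) or -1 (enemy).
    Returns the share of closed triangles whose sign product is positive."""
    nodes = sorted({n for edge in signs for n in edge})
    balanced = total = 0
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if not all(e in signs for e in edges):
            continue  # not a closed triangle
        total += 1
        product = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
        if product > 0:
            balanced += 1
    return balanced / total if total else 0.0

# Hypothetical signed network on four users.
signs = {
    frozenset(("a", "b")): +1,
    frozenset(("b", "c")): +1,
    frozenset(("a", "c")): +1,
    frozenset(("a", "d")): -1,
    frozenset(("b", "d")): -1,
    frozenset(("c", "d")): +1,
}
fraction = balanced_fraction(signs)
```

Here triangles (a, b, c) and (a, b, d) are balanced while (a, c, d) and (b, c, d) are not, so half of the closed triangles are balanced.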
General social factors Homophily / Social Balance / Preferential Triad Closure • Preferential attachment and triadic closure are basic models of human social interaction and agency on a local scale. [Figure: example with two students, Lady Gaga, and Barack Obama; caption: Why?]
General social factors Homophily / Social Balance / Preferential Triad Closure The four networks share very similar distributions of closed-triad formation probability in all six cases, even though the networks themselves are very different. The enumeration is conditioned on whether X, Y, Z are opinion leaders (in the figure, green marks an opinion leader).
Transfer Ranking Factor Graph Model • RFG objective function: • TRFG objective function: attribute factor in target network; attribute factor in source network; general social factors across source and target networks (bridge source & target networks)
Experiment Setup Networks and Candidate Generation • Randomly select 2000 nodes as the source users [2] from the network. • For each source user, generate a candidate list. 1. Epinions, Slashdot, and Wikivote are available at http://snap.stanford.edu; Twitter is available at http://arnetminer.org/reciprocal 2. Lars Backstrom, Jure Leskovec. Supervised Random Walks: Predicting and Recommending Links in Social Networks. In WSDM’11.
Experiment Setup • Features
Experiment Setup • Baseline Predictors • Unsupervised methods • Common neighbors • Adamic/Adar • Jaccard Index • Preferential Attachment • Supervised methods • SVMRank (SVM-light) • Logistic Regression (Weka) • Ranking Factor Graph Model
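The four unsupervised baselines listed above have standard definitions, sketched below for an undirected graph stored as a dict of neighbor sets. The helper `rank_candidates` and the toy graph are hypothetical; the experiments in the slides may compute these scores differently (for example on directed graphs).

```python
import math

def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])

def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    """Shared neighbors weighted by 1 / log(degree); degree-1 nodes are
    skipped defensively because log(1) = 0."""
    return sum(1.0 / math.log(len(adj[z]))
               for z in adj[u] & adj[v] if len(adj[z]) > 1)

def preferential_attachment(adj, u, v):
    """Product of the two degrees."""
    return len(adj[u]) * len(adj[v])

def rank_candidates(adj, source, candidates, score):
    """Order candidates for a source user by descending score."""
    return sorted(candidates, key=lambda c: score(adj, source, c), reverse=True)

# Hypothetical toy graph.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"d"},
}
ranking = rank_candidates(adj, "a", ["e", "b"], common_neighbors)
```

Each scorer takes the same `(adj, u, v)` arguments, so any of them can be passed to `rank_candidates` to produce a candidate ranking comparable with the supervised methods.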
Results Non-transfer case • AUC • Precision @ 30
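For reference, the two reported metrics can be computed as in the sketch below. It assumes binary labels (1 for a true link, 0 otherwise) and uses the pairwise definition of AUC, with ties counted as half; the function names and example values are hypothetical, and Precision @ 30 corresponds to k = 30.

```python
def precision_at_k(ranked, positives, k):
    """Fraction of the top-k ranked candidates that are true links.
    Divides by k even if fewer than k candidates are available."""
    top = ranked[:k]
    return sum(1 for c in top if c in positives) / k

def auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores for four candidates of one source user.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
example_p_at_2 = precision_at_k(["b", "c", "d"], {"b", "d"}, 2)
```

The pairwise AUC is quadratic in the number of candidates, which is fine for a sketch; a rank-based computation would be preferable for large candidate lists.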
Results Transfer: one source network to one target network 1 source network
Results Transfer: multiple source networks to one target network 2 source networks 3 source networks 4 source networks
Summary • Study the novel problem of transfer link prediction across multiple heterogeneous networks • Discover general social theories on link formation across networks, including homophily, social balance, and preferential triadic closure • Propose a transfer ranking factor graph model to leverage the observed general factors, and demonstrate its effectiveness on five real-world networks
Thanks Yuxiao Dong, Jie Tang, Sen Wu, Jilei Tian, Nitesh V. Chawla, Jinghai Rao, Huanhuan Cao. Link Prediction and Recommendation across Heterogeneous Social networks. In IEEE ICDM'12.