Scalable Learning of Collective Behavior Based on Sparse Social Dimensions. Lei Tang, Huan Liu. CIKM '09. Speaker: Hsin-Lan, Wang. Date: 2010/02/01.
Outline • Introduction • Collective Behavior Learning • Social Dimensions • Algorithm • Edge-Centric View • K-means Variant • Experiment Setup • Experiment Results • Conclusions and Future Work
Introduction • Social media make it easy for people from all walks of life to connect with each other. • We study how networks in social media can help predict certain kinds of human behavior and individual preferences.
Introduction • In social media, the connections within the same network are not homogeneous. However, information about the type of each relation is rarely available in practice. • A framework based on social dimensions is proposed to address this heterogeneity.
Introduction • In the initial study, modularity maximization was used to extract social dimensions. • With a huge number of actors, the extracted dimensions cannot even be held in memory. • In this work, we propose an effective edge-centric approach to extract sparse social dimensions.
Collective Behavior Learning • When people are exposed to a social network environment, their behavior can be influenced by the behavior of their friends. • People are more likely to connect to others who share certain similarities with them.
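Collective Behavior Learning • $K$ class labels $\mathcal{Y} = \{c_1, \dots, c_K\}$ • Network $G = (V, E, Y)$, where $V$ is the vertex set, $E$ is the edge set, and $Y_i \subseteq \mathcal{Y}$ are the class labels of a vertex $v_i \in V$ • Given known values of $Y_i$ for some subset of vertices $V^L$, • how can we infer the values of $Y_i$ for the remaining vertices $V^U = V - V^L$?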
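The problem setup above can be made concrete with a toy instance. This is only an illustrative sketch: the vertex names, label names, and the variable names `known` and `unlabeled` are invented here and do not come from the slides.

```python
# Toy instance of the collective behavior learning problem.
# All vertex names and labels are made up for illustration.

# Undirected network: vertex set V and edge set E.
V = ["a", "b", "c", "d", "e"]
E = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]

# K = 3 possible class labels; a vertex may carry several of them.
LABELS = ["sports", "music", "politics"]

# Known label sets for the labeled subset of vertices.
known = {
    "a": {"sports"},
    "b": {"sports", "music"},
    "d": {"politics"},
}

# The task is to infer label sets for the remaining vertices.
unlabeled = [v for v in V if v not in known]
print(unlabeled)
```

Only the network structure and the labels in `known` are available as input; the labels of the vertices in `unlabeled` are what a model has to predict.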
Social Dimensions • To address the heterogeneity present in connections, we have proposed a framework (SocDim) for collective behavior learning. • SocDim consists of two steps: 1. social dimension extraction 2. discriminative learning
Social Dimensions • These social dimensions can be treated as features of actors. • Since the network is converted into features, a typical classifier such as a support vector machine can be employed.
Social Dimensions • Concerns about the scalability of SocDim with modularity maximization: • The social dimensions extracted by modularity maximization are dense. • It requires computing the top eigenvectors of a modularity matrix of size n × n. • The dynamic nature of networks calls for efficient updates of the model for collective behavior prediction.
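As a rough sketch of the discriminative learning step, the snippet below treats each actor's social dimensions as a feature vector and fits a linear classifier for one label. The slides name a support vector machine; to stay dependency-free, this example substitutes a simple perceptron, and the feature values, labels, and function names are all hypothetical.

```python
# Hypothetical social dimensions: one row per actor, one column per affiliation.
train_features = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]
# Hypothetical binary labels (+1 = has the behavior, -1 = does not).
train_labels = [1, 1, -1, -1]


def train_perceptron(features, labels, epochs=10):
    """Fit a linear classifier with the perceptron rule (a stand-in for an SVM)."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:  # misclassified (or on the boundary): update
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 for one actor's social-dimension vector."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


weights, bias = train_perceptron(train_features, train_labels)
# Predict for two unlabeled actors from their social dimensions alone.
predictions = [predict(weights, bias, x) for x in [[1, 0, 1], [0, 1, 1]]]
print(predictions)
```

With several labels, one such classifier would be trained per label; any linear classifier that accepts feature vectors could take the perceptron's place.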
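To illustrate why the n × n modularity matrix is a scalability concern, the sketch below builds it for a tiny graph, assuming the standard definition B = A - d dᵀ / (2m), with A the adjacency matrix, d the degree vector, and m the number of edges. The graph and the function name `modularity_matrix` are illustrative choices, not taken from the slides.

```python
def modularity_matrix(n, edges):
    """Dense modularity matrix B = A - d d^T / (2m) for an undirected graph."""
    adjacency = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        adjacency[u][v] = 1.0
        adjacency[v][u] = 1.0
    degree = [sum(row) for row in adjacency]
    two_m = sum(degree)  # each edge is counted twice
    return [
        [adjacency[i][j] - degree[i] * degree[j] / two_m for j in range(n)]
        for i in range(n)
    ]


# A path graph on 5 nodes: the adjacency matrix has only 8 nonzero entries.
n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
B = modularity_matrix(n, edges)

nonzero = sum(1 for row in B for value in row if value != 0.0)
print(nonzero, "of", n * n, "entries are nonzero")
```

Even though the graph is sparse, the degree-product term fills in every entry here, so storing B takes memory quadratic in the number of actors.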
Algorithm - Edge-Centric View • Treat each edge as one instance, and the two nodes that define the edge as its features.
Algorithm - Edge-Centric View • Based on the features of each edge, we can cluster the edges into two sets. • An actor is considered associated with an affiliation as long as any of the actor's connections is assigned to that affiliation.
Algorithm - Edge-Centric View • In summary, to extract social dimensions, we cluster the edges of a network, rather than its nodes, into disjoint sets. • Because an actor cannot have more affiliations than connections, the social dimensions based on edge-centric clustering are guaranteed to be sparse.
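A minimal sketch of the edge-centric view, assuming nodes are numbered from 0 and the graph is undirected. The toy edge list and the helper names `edge_instances` and `indicator_row` are made up for illustration.

```python
def edge_instances(edges):
    """Map each edge id to its sparse feature set: the two nodes that define it."""
    return {edge_id: {u, v} for edge_id, (u, v) in enumerate(edges)}


def indicator_row(nodes, num_nodes):
    """Expand a sparse feature set into a dense 0/1 row (for display only)."""
    return [1 if node in nodes else 0 for node in range(num_nodes)]


# Toy undirected graph on 5 nodes.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
instances = edge_instances(edges)

for edge_id, nodes in instances.items():
    print(edge_id, indicator_row(nodes, 5))
```

Every instance has exactly two nonzero features, so the edge-by-node data stays sparse however large the network becomes.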
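The affiliation rule can be sketched as follows: given a cluster id for every edge, an actor picks up each affiliation that at least one of the actor's edges belongs to. The edge list and the cluster assignment below are hypothetical inputs, and `social_dimensions` is an illustrative name.

```python
def social_dimensions(edges, edge_cluster):
    """Return node -> set of affiliations, given edge_cluster[i] for edges[i]."""
    dims = {}
    for (u, v), cluster in zip(edges, edge_cluster):
        dims.setdefault(u, set()).add(cluster)
        dims.setdefault(v, set()).add(cluster)
    return dims


edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
edge_cluster = [0, 0, 0, 1, 1]  # hypothetical partition of the edges into two sets

dims = social_dimensions(edges, edge_cluster)

# Degree of each node, to compare against its number of affiliations.
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

for node in sorted(dims):
    print(node, sorted(dims[node]), "degree:", degree[node])
```

Node 2 has edges in both sets and therefore belongs to both affiliations, while no node can end up with more affiliations than it has edges.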
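One way to obtain such a disjoint partition of edges is a k-means style loop over the sparse edge vectors. The sketch below is plain k-means with cosine similarity and hand-picked seed edges, chosen purely for illustration; it is not claimed to match the scalable variant used by the authors, and all function names are hypothetical.

```python
import math


def similarity(edge, centroid):
    """Cosine similarity between a two-node edge vector and a sparse centroid."""
    norm = math.sqrt(sum(w * w for w in centroid.values()))
    if norm == 0.0:
        return 0.0
    u, v = edge
    return (centroid.get(u, 0.0) + centroid.get(v, 0.0)) / (math.sqrt(2) * norm)


def recompute_centroids(edges, assignment, old_centroids):
    """Average the edge vectors in each cluster; keep the old centroid if empty."""
    centroids = []
    for cluster, old in enumerate(old_centroids):
        members = [e for e, c in zip(edges, assignment) if c == cluster]
        if not members:
            centroids.append(old)
            continue
        total = {}
        for u, v in members:
            total[u] = total.get(u, 0.0) + 1.0
            total[v] = total.get(v, 0.0) + 1.0
        centroids.append({node: w / len(members) for node, w in total.items()})
    return centroids


def cluster_edges(edges, seeds, max_iter=20):
    """Partition edges into len(seeds) disjoint sets; seeds are edge indices."""
    centroids = [{edges[s][0]: 1.0, edges[s][1]: 1.0} for s in seeds]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            max(range(len(centroids)), key=lambda c: similarity(e, centroids[c]))
            for e in edges
        ]
        if new_assignment == assignment:  # converged
            break
        assignment = new_assignment
        centroids = recompute_centroids(edges, assignment, centroids)
    return assignment


# Two triangles sharing node 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
assignment = cluster_edges(edges, seeds=[0, 5])
print(assignment)
```

Each edge lands in exactly one cluster, yet node 2 ends up with edges in both, which is how overlapping affiliations arise from a disjoint edge partition. For simplicity this sketch compares every edge with every centroid; a scalable implementation would need to avoid that.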
Experiment Results - Prediction Performance • Prediction performance on all the studied social media data sets is only around 20-30% in terms of the F1 measure. This is partly because: • the data contain a large number of labels • only the network information is used for prediction
Conclusions and Future Work • To address the scalability issue, we propose an edge-centric clustering scheme to extract social dimensions and a scalable k-means variant to handle edge clustering. • The model based on the sparse social dimensions shows prediction performance comparable to that of earlier approaches to extracting social dimensions.
Conclusions and Future Work • In reality, each edge can be associated with multiple affiliations, whereas our current model assumes only one dominant affiliation. • The proposed EdgeCluster model is sensitive to the number of social dimensions.
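For reference, the F1 measure in a multi-label setting can be computed as sketched below. Micro- and macro-averaging are two common conventions; which one the figures above refer to is not stated on this slide, so both are shown. The label sets are invented and the function names are illustrative.

```python
def label_counts(truth, predicted, label):
    """True positives, false positives, false negatives for one label."""
    tp = sum(1 for t, p in zip(truth, predicted) if label in t and label in p)
    fp = sum(1 for t, p in zip(truth, predicted) if label not in t and label in p)
    fn = sum(1 for t, p in zip(truth, predicted) if label in t and label not in p)
    return tp, fp, fn


def f1(tp, fp, fn):
    """F1 = 2PR / (P + R), written in terms of counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(truth, predicted, labels):
    """Average of the per-label F1 scores."""
    return sum(f1(*label_counts(truth, predicted, l)) for l in labels) / len(labels)


def micro_f1(truth, predicted, labels):
    """F1 over counts pooled across all labels."""
    per_label = [label_counts(truth, predicted, l) for l in labels]
    tp, fp, fn = (sum(column) for column in zip(*per_label))
    return f1(tp, fp, fn)


labels = ["x", "y", "z"]
truth = [{"x"}, {"x", "y"}, {"y"}, {"z"}]
predicted = [{"x"}, {"x"}, {"y", "z"}, {"x"}]

print(round(micro_f1(truth, predicted, labels), 3))
print(round(macro_f1(truth, predicted, labels), 3))
```

Macro-averaging weights rare labels as heavily as frequent ones, so with many labels a handful of poorly predicted rare labels can pull the score down noticeably.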