Integrating Social Media Data of Different Types for Community Detection
Jiliang Tang
Computer Science and Engineering, Arizona State University, USA
{Jiliang.Tang@asu.edu}

Motivation
• Link information in social media can be noisy and sparse.
• Additional sources can provide complementary information; e.g., bookmarking and tagging data indicate user interests, and the frequency of comments suggests the strength of ties.
• Since multiple sources provide more comprehensive information about users, integrating them should improve the performance of community detection.

Experimental Results
• Integrating multiple sources can significantly improve the performance of community detection.
• Different sources make uneven contributions to the improvement.
• The proposed integration framework outperforms other representative methods.
• Integrating more data sources does not necessarily bring better performance.

Our Framework
• Integrating multiple sources can be formulated as a joint optimization problem. [Equation not preserved in this transcript.]
• m is the number of sources, n is the number of nodes, and a parameter (symbol not preserved) controls the sparseness of each column of H.
• The joint optimization problem is equivalent to a second optimization problem. [Equations not preserved in this transcript.]

Future work
• Since different sources contribute unevenly to the performance, one way to distinguish the importance of each source is to assign it a weight.
• Exploring additional sources of social media data, such as incomplete user profiles and short, unconventional texts like tweets, is another promising direction.

Contributions
• Identifying the need for integrating multiple sources
• Proposing a novel framework for multi-source integration
• Relating our framework to NMF to show the generality of the framework
• Our framework can be interpreted as stacking the multiple sources
• Our framework is applicable to any problem that is suitable for NMF

This work was funded, in part, by ONR.
Data Mining and Machine Learning Lab
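
The stacking interpretation under Our Framework can be illustrated with a small sketch. The poster's exact objective is not preserved here, so everything below is an assumption rather than the author's method: each source is taken to be a nonnegative matrix with one column per node, the sources are stacked row-wise into a single matrix A, and A is factored as W H using standard multiplicative NMF updates. An L1 penalty on H (weight `lam`) stands in for the sparseness term, and each node is assigned to the factor with the largest entry in its column of H. The names `stacked_nmf`, `communities`, `links`, and `tags` are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    Yt = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def stacked_nmf(sources, k, lam=0.1, iters=300, seed=0):
    """Factor the row-wise stack of all sources as A ~ W H with W, H >= 0.

    sources: list of matrices, each with one column per node (assumed layout).
    lam: weight of an L1 penalty on H (assumed stand-in for the sparseness term).
    Returns (W, H), where H has k rows and one column per node.
    """
    A = [list(row) for src in sources for row in src]  # stack the sources
    m, n = len(A), len(A[0])
    rng = random.Random(seed)
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    eps = 1e-9
    for _ in range(iters):
        # Multiplicative update for H; lam in the denominator is the L1 term.
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + lam + eps)
              for j in range(n)] for i in range(k)]
        # Multiplicative update for W.
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(k)] for i in range(m)]
        # Rescale so each column of W sums to 1, moving the scale into H.
        # The product W H is unchanged by this step.
        for c in range(k):
            s = sum(W[r][c] for r in range(m)) + eps
            for r in range(m):
                W[r][c] /= s
            H[c] = [h * s for h in H[c]]
    return W, H

def communities(H):
    """Assign each node to the factor with the largest weight in its column."""
    return [max(range(len(H)), key=lambda r: H[r][j]) for j in range(len(H[0]))]

# Toy data (made up for illustration): 6 nodes, two groups, one noisy link 2-3.
links = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
tags = [  # rows are tags, columns are nodes
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]

W, H = stacked_nmf([links, tags], k=2)
labels = communities(H)
print(labels)
```

Because H is shared across all sources, the tagging rows and the link rows both constrain the same node memberships, which is one way to read the claim that integration helps when links alone are noisy. Weighting each source before stacking, as suggested under Future work, would amount to scaling its rows.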