
Community-based Greedy Algorithm for Mining Top-K Influential Nodes in Mobile Social Networks

Yu Wang¹, Gao Cong², Guojie Song¹, Kunqing Xie¹. ¹ Peking University, China; ² Nanyang Technological University, Singapore.


Presentation Transcript


  1. Community-based Greedy Algorithm for Mining Top-K Influential Nodes in Mobile Social Networks
     Yu Wang¹, Gao Cong², Guojie Song¹, Kunqing Xie¹
     ¹ Peking University, China; ² Nanyang Technological University, Singapore

  2. Problem and Background
     • Problem: given a mobile social network, mine a set S of top-K influential nodes such that its influence degree R(S) is maximized under the extended Independent Cascade information diffusion model.
     • Mobile social networks play an essential role in spreading information and influence by word of mouth.
     • The problem is NP-hard.
     • Running the greedy algorithm on a large network is computationally expensive.
     • Previous greedy algorithms take days to finish on a network of 723k nodes.
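Stated compactly, and assuming V denotes the node set of the network and R(S) the influence degree obtained when S is used as the seed set (neither symbol is spelled out on the slide), the problem reads:

```latex
S^{*} = \operatorname*{arg\,max}_{S \subseteq V,\ |S| = K} R(S)
```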

  3. Basic Idea of the Algorithm
     • Construct the network from CDR (call detail record) data
     • Community detection based on the diffusion model on the mobile social network (MSN)
     • Dynamic programming and a greedy algorithm on selected communities

  4. Step 1: Extracting the Mobile Social Network
     • Extract a mobile social network from CDR data and model it as a directed weighted graph
       • Node: a phone user
       • Edge: a directed edge u → v is established if there exists communication from u to v
       • Edge weight: the communication time
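The extraction step can be sketched in a few lines of Python. This is only an illustration: the record layout `(caller, callee, seconds)` and the choice to sum call time over repeated calls are assumptions, and `build_call_graph` is a hypothetical helper name, not something from the slides.

```python
from collections import defaultdict


def build_call_graph(cdr_records):
    """Aggregate (caller, callee, seconds) records into a directed weighted graph.

    Returns {caller: {callee: total_seconds}}. Summing the call time over
    repeated calls is an assumption made for this sketch.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for caller, callee, seconds in cdr_records:
        # One directed edge caller -> callee; its weight accumulates call time.
        graph[caller][callee] += seconds
    # Convert to plain dicts so missing keys are not created on lookup.
    return {u: dict(neighbors) for u, neighbors in graph.items()}


# Toy records (made up for illustration).
records = [("a", "b", 60), ("a", "b", 30), ("b", "a", 10), ("a", "c", 5)]
graph = build_call_graph(records)
print(graph)  # {'a': {'b': 90.0, 'c': 5.0}, 'b': {'a': 10.0}}
```

Users who only receive calls (such as `c` above) appear as edge targets but have no outgoing entry of their own.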

  5. Extended Independent Cascade Model
     • Two node states: active and inactive
     • Diffusion speed λ
     • When an active node v_i contacts an inactive node v_j, the inactive node becomes active with probability (rate) λ_ij.
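As a rough illustration of the activation rule, here is a sketch of a standard Independent Cascade simulation with a per-edge activation probability. It assumes each newly active node gets a single chance to activate each inactive neighbor; how the extended model derives λ_ij from call data is not given on the slide, so the probabilities below are simply supplied as inputs. `simulate_cascade` is a hypothetical name.

```python
import random


def simulate_cascade(graph, seeds, rng=None):
    """Run one cascade. graph is {u: {v: activation_probability}}.

    Returns the set of nodes that are active when the cascade stops.
    """
    if rng is None:
        rng = random.Random()
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, prob in graph.get(u, {}).items():
                # An inactive neighbor becomes active with probability prob.
                if v not in active and rng.random() < prob:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


# Deterministic toy example: probabilities of 1.0 always fire, 0.0 never do.
toy = {"a": {"b": 1.0, "c": 0.0}, "b": {"d": 1.0}}
reached = simulate_cascade(toy, ["a"], random.Random(0))
print(sorted(reached))  # ['a', 'b', 'd']
```

Averaging the size of the returned set over many runs gives a Monte Carlo estimate of the influence of a seed set, which is one reason greedy selection becomes expensive on large networks.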

  6. Extended Independent Cascade Model

  7. Step 2: Influential Model Based Community Detection Algorithm
     • Community Partition
       • Each node is assigned a unique community label from 1 to N
       • For each node, compute the set of its influenced neighbors using the Independent Cascade diffusion model
       • Iteratively propagate the labels through the network for a finite number of iterations
       • For each node v, the label of the community that the majority of its influenced neighbors belong to becomes the label of v
     • Community Combination
       • Condition: the difference between a node's influence degree in its community and its influence degree in the network is smaller than a threshold
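The partition phase resembles label propagation, and a minimal sketch is given below. It assumes the influenced-neighbor sets are already computed (for example from diffusion simulations) and that every neighbor also appears as a key. The in-place update order and the smallest-label tie-break are choices made for this sketch, not details from the slide, and the combination phase is not covered. `propagate_labels` is a hypothetical name.

```python
from collections import Counter


def propagate_labels(influenced, max_iters=20):
    """influenced: {node: set of its influenced neighbors}. Returns {node: label}."""
    nodes = sorted(influenced)
    # Each node starts with a unique community label from 1 to N.
    labels = {v: i for i, v in enumerate(nodes, start=1)}
    for _ in range(max_iters):
        changed = False
        for v in nodes:
            counts = Counter(labels[u] for u in influenced[v])
            if not counts:
                continue  # no influenced neighbors: keep the current label
            top = max(counts.values())
            # Majority label; ties go to the smallest label (assumption).
            new_label = min(lab for lab, c in counts.items() if c == top)
            if new_label != labels[v]:
                labels[v] = new_label
                changed = True
        if not changed:
            break
    return labels


# Two disconnected triangles should end up as two communities.
influenced = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"},
    "x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"},
}
labels = propagate_labels(influenced)
print(labels)
```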

  8. Step 3: Community-Based Greedy Algorithm
     • Choose communities to find the top-1 influential node
     • Figure labels: C1 with ΔR1=0.2, C2 with ΔR2=0.3, C3 with ΔR3=0.1
     • R[1,1]=max{R[0,1], R[3,0]+ΔR1}=0.2, s[1,1]=C1
     • R[2,1]=max{R[1,1], R[3,0]+ΔR2}=0.3, s[2,1]=C2
     • R[3,1]=max{R[2,1], R[3,0]+ΔR3}=0.3, s[3,1]=C2
     • So we mine the top-1 node in C2
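The three updates on this slide appear to be instances of one recurrence. Writing M for the number of communities (M = 3 here), m for the community index, and k for the number of nodes mined so far, a reading consistent with the numbers shown is given below. Taking R[0,k] and R[M,0] as 0 is an assumption rather than something the slide states.

```latex
R[m,k] = \max\bigl\{\, R[m-1,k],\; R[M,k-1] + \Delta R_m \,\bigr\},
\qquad
s[m,k] =
\begin{cases}
C_m & \text{if } R[M,k-1] + \Delta R_m > R[m-1,k],\\
s[m-1,k] & \text{otherwise.}
\end{cases}
```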

  9. Community-Based Greedy Algorithm
     • Choose communities to find the top-2 influential nodes
     • Figure labels: C1 with ΔR1=0.2, C2 with ΔR2=0.06, C3 with ΔR3=0.1
     • Note that ΔR2 is now 0.06, not 0.3: the top-1 node has already been mined from C2, so ΔR2 is the gain from the next node in that community.
     • R[1,2]=max{R[0,2], R[3,1]+ΔR1}=0.5, s[1,2]=C1
     • R[2,2]=max{R[1,2], R[3,1]+ΔR2}=0.5, s[2,2]=C1
     • R[3,2]=max{R[2,2], R[3,1]+ΔR3}=0.5, s[3,2]=C1
     • So we mine the second node in C1
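The two rounds of community selection can be replayed with a short Python sketch. The marginal gains ΔR are passed in as plain numbers taken from the slides; computing them inside each community is outside this sketch. `choose_community` is a hypothetical helper name, and starting each round from R[0,k] = 0 is an assumption.

```python
def choose_community(prev_total, gains):
    """One round of the table fill: R[m,k] = max(R[m-1,k], R[M,k-1] + gain_m).

    prev_total is R[M,k-1]; gains maps community name -> marginal gain.
    Returns (R[M,k], s[M,k]).
    """
    best_value = 0.0       # R[0,k], assumed to be 0
    best_community = None  # s[0,k]
    for name, gain in gains.items():
        candidate = prev_total + gain
        # Switch communities only on a strict improvement, as on the slides.
        if candidate > best_value:
            best_value, best_community = candidate, name
    return best_value, best_community


# Round 1: gains from the top-1 slide.
gains = {"C1": 0.2, "C2": 0.3, "C3": 0.1}
r1, s1 = choose_community(0.0, gains)

# Round 2: a node was mined in C2, so its marginal gain drops to 0.06.
gains["C2"] = 0.06
r2, s2 = choose_community(r1, gains)

print(s1, round(r1, 2))  # C2 0.3
print(s2, round(r2, 2))  # C1 0.5
```

Only the gain of the community that was just mined needs refreshing between rounds, which is why the other two values carry over unchanged.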

  10. Experiments
     • Data set
       • A mobile social network extracted from three months of CDR (call detail record) data for one city, from China Mobile
       • Number of nodes: 723,201
       • Average degree: 13.4

  11. Community Distribution
     • Largest community size: 95,690

  12. Experiments
     • Top-K node mining methods
       • MixedGreedy Algorithm
       • NewGreedy Algorithm
       • DegreeDiscount
       • Random Method
       • CGA
       • SPCGA
     • Parameter study
       • K, diffusion speed λ, data size

  13. Results • Influence degree and time vs K

  14. Results • Influence degree and time vs diffusion speed λ

  15. Results • Influence degree and time vs network size

  16. Summary
     • Handles large-scale networks (power-law degree distribution)
     • Improves the efficiency of existing algorithms by an order of magnitude with only a small loss in approximation precision
     • Can be combined with any existing algorithm to find influential nodes with respect to communities

  17. Related Work on Top-K Algorithms
     • None of them considers the community property
     • Typical Greedy Algorithm (Kempe et al., KDD 2003)
     • CELF Greedy Algorithm (Leskovec et al., KDD 2007)
     • An improved greedy algorithm (Kimura et al., AAAI 2007)
     • NewGreedy, MixedGreedy, and DegreeDiscount algorithms (Chen et al., KDD 2009)
     • MIA algorithm (Chen et al., KDD 2010)

  18. Thank You !
