1 / 28

Bursty Subgraphs in Social Network

Bursty Subgraphs in Social Network. Tam Siu Lun , Calvin. Subgraphs. Users as the nodes Connections edges Become “friends” or follow each others Like, share, retweet. Graph Models. Action graph Detailed Model of all activities (like, share, become fd , etc ) Holistic graph

alaqua
Download Presentation

Bursty Subgraphs in Social Network

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. BurstySubgraphs in Social Network Tam SiuLun, Calvin

  2. Subgraphs • Users as the nodes • Connections edges • Become “friends” or follow each others • Like, share, retweet

  3. Graph Models • Action graph • Detailed Model of all activities (like, share, become fd, etc) • Holistic graph • Consolidate all actions generated by users • An aggregate view on each user

  4. Action graph • Describe topics by a set of keywords Sk • Contact keyword -> relevant • Each nodes (action) is associated with timestamp and user • Link exists between actions • T(a)

  5. Holistic graph • Each nodes reprseents a user • Link -> socially connected • Weight wu = (ru, du) • ru : faction of actions of u relevant to a given topic • du : total number of actions of u

  6. Intrinsic Burst Model • Based on the graphs • Assign burst state sv (busty B or non-busty N) • Actions are emitted at a higher rate at state • For action graph • The time gap x for each vertex • For holistic graph

  7. Intrinsic Burst Model • Target: To maximize • The problem can be simplified to

  8. Social Burst Model • Fuzzy burst state Su = SuB / SuN • Su > 1 => Bursty • Objective: To minimize

  9. Intrinsic Burst Model • Reduce the problem to Min Cut problem

  10. Experimental Result • Apply DIBA and SODA with parameters α and λ • 30 Days of Twitter fire hose • 300~400m tweets per day • 25TB on disk • 5 Most popular topics in June • (1) “Colorado high park fire” • (2) “the grand prix race” • (3) “French Open 2012 tennis final match between RafalNadal and Novak Djokovic” • (4) “Euro 2012” • (5) “Mexico Presidential TV debate on June 10”

  11. Evaluation • Apply action/holistic graph to the 5 hot topics • Action Graph • #Vertex (tweets) varies from 10k to 1.1m • Holistic Graph • #users > 27.5m • More than 1.5b edges

  12. Results • Action graph • Identify users with what actions at what times are bursty • Holistic graph • Identify busty user

  13. Advanced Graph Mining for Community Evaluation in Social Networks and the Web Tam SiuLun, Calvin

  14. Objective • Identify the modules and, possibly, their hierarchical organization, by only using the information encoded in the graph topology. • Different metrics/ measurements /methods are used • Hub/authorities • Modularity • Density/Diameter/Link distribution etc.... • Centrality/Betweenness • Clustering coefficient • Structural cohesion

  15. Community • Depends heavily on the application domain and the properties of the graph under consideration • A community corresponds to a group of nodes with more intra- cluster edges than inter-clusters edges

  16. Community Detection • Specify a quality measure (evaluation measure, objective function) that quantifies the desired properties of communities • Apply algorithmic techniques to assign the nodes of graph into communities, optimizing the objective function • They mostly consider that communities are set of nodes with many edges between them and few connections with nodes of different communities

  17. Community Evaluation Measures • Evaluation based on internal connectivity • Evaluation based on external connectivity • Evaluation based on internal and external connectivity • Evaluation based on network model

  18. Evaluation on Internal Connectivity • Internal density • Edges inside f(S) = ms • Average degree f(S) = 2ms / ns

  19. Evaluation on External Connectivity • Expansion f(S) = cs / ns • Cut ratio • Conductance

  20. Evaluation based on both • Maximum out degree fraction • Average out degree fraction

  21. Evaluation based on network model • Modularity

  22. Graph Clustering algorithm • Hierarchical methods • Modularity Based Methods

  23. Hierarchical graph clustering • Clusters form hierarchies • Need for a cluster similarity measure • Single linkage clustering vs. complete linkage • Agglomerative algorithms, clusters are iteratively merged if their similarity is sufficiently high • Hierarchical clustering does not require a preliminary knowledge on the number and size of the clusters • E.g. Dendrogram

  24. Modularity based method • Initially introduced as a measure for assessing the strength of Communities • Q = (fraction of edges within communities) –(expected number of edges within communities)

  25. Modularity based method

  26. Modularity Based Method • Larger modularity Q indicates better communities (more than random intra-cluster density) • The community structure would be better if the number of internal edges exceed the expected number • Modularity value is always smaller than 1 • It can also take negative values • Partitions with large negative modularityExistence of subgraphs with small internal number of edges and large number of inter-community edges

  27. Newman-Girvan algorithm • A modularity based algorithm • A divisive algorithm (detect and remove edges that connect vertices of different communities) • Idea: try to identify the edges of the graph that are most between other verticesresponsible for connecting many node pairs • Select and remove edges based to the value of betweenness centrality • Betweennesscentrality: number of shortest paths between every pair of nodes, that pass through an edge

  28. Newman-Girvan algorithm • Compute betweenness centrality for all edges in the graph • Find and remove the edge with the highest score • Recalculate betweenness centrality score for the remaining edges • Goto step 2

More Related