This paper presents a viewpoint-based approach for analyzing interaction graphs, allowing for the extraction of local neighborhoods for nodes and quantification of their relationships. The approach incorporates different activation functions to capture topological, semantic, and domain-specific attributes. The paper also discusses the evolution of these neighborhoods over time and introduces behavioral measures for sociability, stability, impact, and popularity.
A Viewpoint-based Approach for Interaction Graph Analysis • Sitaram Asur and Srinivasan Parthasarathy • Department of Computer Science and Engineering, Ohio State University, Columbus, Ohio
Motivation • Massive social network datasets give you both more and less [Kleinberg, 2007]: • More: can observe global phenomena that are genuine, but literally invisible at smaller scales. • Less: don’t really know what any one node or link means. Easy to measure things; hard to pose nuanced questions. • Community-based analysis is useful but limited • All nodes in a cluster are generally treated the same • Problem: to extract the local neighborhood of interest for a node • To use structure and topology to quantify local relationships • To observe the effect of changes in the graph from the viewpoint of a given node or set of nodes • Potentially useful in search, recommendation and advertising
Viewpoint Neighborhood (VPN) • VPN(S): the graph rooted at source node S containing only nodes with some degree of importance to S and their interconnections. • But how to measure importance? • Initial solution: use distance from the source • Depth-limited VPN for a node • Subgraph representing the set of nodes reachable at a distance ≤ k from the node and the interactions among them • Can be constructed using depth-limited search (DLS) from the source node • But is this enough?
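The depth-limited VPN described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes an undirected graph stored as an adjacency dictionary, and it uses breadth-first search with a depth cutoff (rather than a recursive DLS) so that each recorded distance is a shortest-path distance. The function name `depth_limited_vpn` is hypothetical.

```python
from collections import deque

def depth_limited_vpn(adj, source, k):
    """Return (nodes, edges) of the subgraph within distance <= k of source.

    adj: dict mapping each node to an iterable of its neighbors (assumed format).
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand past the depth limit
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    nodes = set(dist)
    # Keep every interaction whose two endpoints are both inside the VPN.
    edges = {frozenset((u, v)) for u in nodes for v in adj.get(u, ()) if v in nodes}
    return nodes, edges
```

Every node at the same distance is treated identically here, which is the limitation the next slide raises.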
Viewpoint Neighborhoods • Problems with Depth-limited VPN • All nodes the same distance away are treated the same • Hub nodes need to be differentiated • Criteria for constructing a VPN • Inverse Distance Weighting: • Involvement of a node in a VPN is inversely proportional to its distance from the source node • Intuition: a node is more affected by closer events • Link Structure: • Local topological information is important • Well-connected nodes in the VPN are more important to the source node • Hub Nodes: • Hub nodes can bloat neighborhoods by bringing in many unimportant nodes • Need to expand hub nodes with low probability
Activation Spread Model • Source node begins activation with a budget M • It distributes M among its immediate neighbors, activating them • Each node retains some amount, activates its neighbors and continues the distribution • Distribution is handled by an Activation Function • Each node is activated at most once • If a node is touched more than once, it retains the amount it receives • A threshold is used to hasten convergence • Activated nodes form the VPN of the source node • The value held by each node represents its commitment value for the VPN • Related to the heat diffusion model for graphs
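The spreading process above can be sketched as a queue-driven loop. The slides do not spell out several details, so the following are assumptions made only for illustration: the source splits its budget equally among its neighbors, nodes are processed in first-in-first-out order, a node forwards to every neighbor except the one that activated it, and an amount below the threshold (or arriving at a node with nowhere to forward) is kept in full. The `retain` callback stands in for the activation function; both it and `activation_spread` are hypothetical names.

```python
from collections import deque

def activation_spread(adj, source, budget, retain, threshold=1e-3):
    """Return {node: commitment value} for the VPN of `source`.

    adj: dict node -> iterable of neighbors (assumed format).
    retain: function node -> fraction of the received amount the node keeps.
    """
    commitment = {source: 0.0}
    activated = {source}
    neighbors = list(adj.get(source, ()))
    if not neighbors:
        commitment[source] = budget  # nothing to distribute to
        return commitment
    queue = deque((v, source, budget / len(neighbors)) for v in neighbors)
    while queue:
        node, sender, amount = queue.popleft()
        if node in activated:
            commitment[node] += amount  # touched again: keep it, do not re-spread
            continue
        activated.add(node)
        targets = [v for v in adj.get(node, ()) if v != sender]
        if amount < threshold or not targets:
            commitment[node] = amount  # stop spreading here
            continue
        kept = retain(node) * amount
        commitment[node] = kept
        share = (amount - kept) / len(targets)  # equal split (assumption)
        queue.extend((v, node, share) for v in targets)
    return commitment
```

Under these assumptions the commitment values always sum to the initial budget, which gives a quick sanity check on any activation function plugged in through `retain`.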
Activation Functions • Inverse-degree Activation • Down-weights nodes with high degrees • Each node x retains 1/degree(x) of the amount received • Rest distributed equally among its descendants • Strong emphasis on hubs • Weaker emphasis on link structure • Betweenness-based Activation • Compute local betweenness values for nodes within VPN • Consider shortest paths between source node and members of the VPN • Ratio of betweenness values used to distribute • Strong emphasis on link structure • Can be made to handle hubs by using inverse-degree to construct basic VPN first • [Figure: example splits of the budget M under Inverse Degree Activation and Betweenness-based Activation]
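The inverse-degree rule is simple enough to work through numerically. The sketch below assumes that a node's "descendants" are all of its neighbors except the one it received the amount from, so a node of degree d has d - 1 descendants; that reading is an assumption, and `inverse_degree_split` is a hypothetical helper name.

```python
from fractions import Fraction

def inverse_degree_split(amount, degree):
    """Split `amount` at a node of the given degree.

    Returns (kept, per_descendant). Assumes descendants = degree - 1,
    i.e. every neighbor except the sender.
    """
    kept = amount * Fraction(1, degree)
    descendants = degree - 1
    if descendants == 0:
        return amount, Fraction(0)  # a degree-1 node keeps everything
    return kept, (amount - kept) / descendants
```

Under this assumption a node of degree d keeps amount/d and passes amount/d to each descendant, so a hub keeps little for itself and also hands only a small share to each node it brings in.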
Activation Functions • Semantic Activation • Use semantic features from content to extract neighborhoods • Semantic similarity w.r.t. the source is used to decide distribution ratios • Eliminates noise and irrelevant nodes • Useful in personalized and keyword search applications • In practice, a combination of different activation functions can be employed • Domain-specific features can be included
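One way to turn semantic similarity into distribution ratios is sketched below. The slides do not say which similarity measure is used, so this example assumes cosine similarity over bag-of-words term counts, with each neighbor's share proportional to its similarity to the source. The names `cosine` and `semantic_shares` are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def semantic_shares(source_text, neighbor_texts, amount):
    """Split `amount` among neighbors in proportion to similarity to the source.

    neighbor_texts: dict neighbor -> text content (assumed input format).
    """
    src = Counter(source_text.lower().split())
    sims = {n: cosine(src, Counter(t.lower().split()))
            for n, t in neighbor_texts.items()}
    total = sum(sims.values())
    if total == 0:
        return {n: 0.0 for n in sims}  # nothing is relevant to the source
    return {n: amount * s / total for n, s in sims.items()}
```

Neighbors with no terms in common with the source receive a zero share, which is how irrelevant nodes drop out of the neighborhood in this sketch.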
Neighborhood Sizes - Wikipedia • Global increase (23x) in number of nodes does not affect size of local neighborhoods too much!
Temporal Analysis for VPNs • Characterize evolution of Viewpoint neighborhoods over time • Critical Events • Grow • Shrink • Continue • Mutate • Attraction • Repulsion • [Figure: example VPNs rooted at source node S] • DBLP: grow/shrink ratio ~1, low continue, high mutate, attract/repel ratio ~1 • Wikipedia: grow/shrink ratio >>1, high continue events, attract >> repel
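The slides name the critical events without defining them, so the sketch below uses assumed definitions purely for illustration: Continue means the node set is unchanged between two snapshots, Grow and Shrink compare node counts, and Mutate fires when the fraction of changed nodes (symmetric difference over union) exceeds a threshold. Attraction and Repulsion are left out because no reasonable definition can be inferred from the slide. `classify_events` and `mutate_threshold` are hypothetical names.

```python
def classify_events(prev_nodes, curr_nodes, mutate_threshold=0.5):
    """Label the change between two VPN snapshots of the same source node.

    All event definitions here are assumptions, not the authors' definitions.
    """
    prev_nodes, curr_nodes = set(prev_nodes), set(curr_nodes)
    events = set()
    if curr_nodes == prev_nodes:
        events.add("continue")
    elif len(curr_nodes) > len(prev_nodes):
        events.add("grow")
    elif len(curr_nodes) < len(prev_nodes):
        events.add("shrink")
    union = prev_nodes | curr_nodes
    if union:
        changed = len(prev_nodes ^ curr_nodes) / len(union)
        if changed > mutate_threshold:
            events.add("mutate")  # membership changed substantially
    return events
```

Counting such labels over consecutive snapshots yields ratios of the kind reported for DBLP and Wikipedia, for example the number of grow events divided by the number of shrink events.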
Behavioral Measures • Incremental behavioral measures composed from events • Stability, Sociability, Impact, Popularity
Conclusions • Viewpoint Neighborhoods • To identify a neighborhood of interest for a node and quantify local relationships within it • General activation spread model with different activation functions capturing topological, semantic and domain-specific attributes • Extension to find the joint VPN of a group of nodes • Evolutionary analysis to identify changes to VPNs over time • Critical events to define behavior of neighborhoods • Behavioral measures for sociability, stability, impact and popularity • Pattern mining over VPNs • Core Subgraphs to identify core influential structures w.r.t. certain nodes • Transformation Subgraphs to measure the effect of changes in the graph on specific viewpoint neighborhoods
Acknowledgements • Grants: • NSF: CAREER-IIS-0347662 • NSF SGER Grant IIS-0742999 • DOE: DE-FG02-04ER2561